How to set up zero-budget weblog monitoring (Part 1)

Stephanie walked into the lounge for the official bloggers of LeWeb 10 and asked, "Do we have a Yahoo Pipes expert here?" I replied, "What do you wanna do?"

She needed to set up a site showing all posts from us official bloggers (several dozen of us) that were tagged with 'leweb' or 'leweb10'. Her Twitter network had suggested doing this with Yahoo Pipes.

So we did. We wanted:
- only a defined set of blogs as a source
- only blog posts that were tagged (some of us posted other stuff in between, or wrote for team blogs where other people produced non-LeWeb content, too)

This is how we set it up:

1) First we went to the list of blogs (which was in a Google Doc) and had two people work on subscribing to them in Google Reader and adding them to 'labels'. We used two different accounts and two labels so we could work faster. Later this was unified into one list to get a better overview of who was 'in' and who was maybe missing. [Theoretically you could add 40+ blogs one by one to the Yahoo Pipe, but you would soon lose track of what you had added and what you had not. Also, Google Reader detects the RSS feeds automatically and has a bookmarklet for subscribing.]

2) The collectors published these labels (Settings/Labels) and sent the respective page URLs to the "Pipe Master" (me ;) ).

3) The resulting Pipe was this.

It has some duplication and redundancy built in intentionally.

This is what it does:
a) it collects the RSS (or rather: Atom) feeds of the two published Google Reader labels

b) it points to the respective Google Reader pages and has Pipes itself look for RSS
[this is done because Pipes said the RSS feeds were faulty, so I wanted to give it the sources both explicitly and implicitly so that a maximum number of posts would be caught]

c) it does a 'union' operation so that the resulting combined feed can be processed with a single 'pipe'

d) it looks for both leweb10 and leweb

e) in the title, in the body AND in the tags of the post
[this is to give leeway for some tag indiscipline and to make sure that even untagged posts might be caught]

f) it does a 'unique' operation based on the URL of the respective article, so that each article shows up only once in the result

g) it creates an output as RSS
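
For anyone who thinks better in code than in pipe modules, steps c) to f) can be sketched roughly like this in Python. This is only an illustration, not what Yahoo Pipes does internally: the item layout (plain dicts with 'title', 'body', 'tags' and 'url') and all function names are my own assumptions, and fetching the feeds and writing the RSS output are left out.

```python
# Hypothetical sketch of the union / filter / unique steps.
# Items are assumed to be plain dicts: {"title", "body", "tags", "url"}.

KEYWORDS = ("leweb10", "leweb")  # the two tags we looked for


def union(*feeds):
    """Step c): combine several feeds into one list of items."""
    return [item for feed in feeds for item in feed]


def matches(item, keywords=KEYWORDS):
    """Steps d) and e): keyword in title, body OR tags (case-insensitive)."""
    haystack = " ".join(
        [item.get("title", ""), item.get("body", ""), " ".join(item.get("tags", []))]
    ).lower()
    return any(keyword in haystack for keyword in keywords)


def unique_by_url(items):
    """Step f): keep only the first item seen for each article URL."""
    seen = set()
    result = []
    for item in items:
        if item["url"] not in seen:
            seen.add(item["url"])
            result.append(item)
    return result


def run_pipe(*feeds):
    """Union the feeds, keep matching items, then drop duplicates."""
    return unique_by_url([item for item in union(*feeds) if matches(item)])
```

Since 'leweb' is contained in 'leweb10', a plain substring match on 'leweb' alone would catch both; the sketch keeps both keywords only to mirror the pipe.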

The output is also visible as a page or available as a widget.

The Result:

Pipes: LeWeb Paris 10 Official Bloggers. (This is the Widget view in a page, it will adjust to width...).

Later another issue showed up: The pipe would only show the last few items.

This can be helped by involving Google Reader again, because it keeps a feed's items indefinitely (but only from the point at which the feed was subscribed to/indexed, I think).

So what I did:
- I subscribed to the RSS feed from the results page
- added it to a label
- and published that label
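
The idea behind this buffering trick is simply "remember every item you have ever seen, even after it drops out of the feed". Here is a minimal sketch of that idea in Python, using a local JSON file in place of Google Reader. To be clear, this is a swapped-in technique and not what Reader does internally; the function name, the file format and the item layout (dicts with a 'url' key) are my own assumptions.

```python
import json
from pathlib import Path


def merge_into_archive(archive_path, fresh_items):
    """Append items not yet archived (by URL) and save the archive.

    Hypothetical helper: 'fresh_items' is whatever the pipe currently
    returns, as a list of dicts that each have a 'url' key.
    """
    path = Path(archive_path)
    # Load what we collected on earlier runs, if anything.
    archive = json.loads(path.read_text()) if path.exists() else []
    known = {item["url"] for item in archive}
    for item in fresh_items:
        if item["url"] not in known:
            archive.append(item)
            known.add(item["url"])
    path.write_text(json.dumps(archive, indent=2))
    return archive
```

Run regularly (for example from a cron job), the archive keeps growing even though the pipe itself only shows the last few items. Like the Reader approach, it only knows about posts from the first run onwards.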

The final result (which will only have the posts found since this morning).

Any suggestions on how to improve this for number of posts caught?


Do you need screenshots with that? I assume anyone building something like this would be able to do without ;)

