Thursday, May 23, 2013

Getting RSS from a site that doesn't offer RSS using Dapper and Yahoo Pipes

7:36 AM

I was asked if it was possible to get RSS from a site that doesn't offer RSS.

One site whose content I was interested in was "Community & Networks Connection" - it aggregates lots of "community and collaboration software" news.

Although the site offers RSS feeds, the articles in them are chunked into daily digests, so you have to click through to the site to read anything and no individual headline ever catches your eye.


Of course, it would be possible to screen-scrape the data from the site and republish it as RSS, perhaps using a scripting language or the excellent ScraperWiki tool, but I really wanted something that anyone could use... in seconds.
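For comparison, the scripting route might look something like the sketch below. It is only an illustration, not what I actually used: the `headline` class name, the sample HTML and the function names are all made up, and a real page would need its own selectors plus a step to download the HTML. It uses nothing but the Python standard library.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class HeadlineParser(HTMLParser):
    """Collects (title, link) pairs from <a class="headline"> tags.

    The "headline" class is a hypothetical marker; a real site will differ.
    """

    def __init__(self):
        super().__init__()
        self.items = []
        self._href = None   # href of the headline link we are inside, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "headline":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.items.append(("".join(self._text).strip(), self._href))
            self._href = None


def scrape_headlines(html):
    """Return a list of (title, link) tuples found in the HTML string."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.items


def build_rss(channel_title, channel_link, items):
    """Serialise (title, link) tuples as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "Scraped headlines"
    for title, link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
    return ET.tostring(rss, encoding="unicode")


# Made-up page fragment standing in for the downloaded HTML.
SAMPLE_HTML = """
<div><a class="headline" href="http://example.com/a">First story</a></div>
<div><a class="nav" href="/about">About</a></div>
<div><a class="headline" href="http://example.com/b">Second <b>story</b></a></div>
"""

items = scrape_headlines(SAMPLE_HTML)
feed_xml = build_rss("Example feed", "http://example.com/", items)
```

Even this small version needs hosting and maintenance whenever the page layout changes, which is exactly what I wanted to avoid.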


Dapper To The Rescue

I began by visiting Dapper, a tool that lets you point and click to select which bits of a page you want to scrape. My first step was to click on the images of the news articles at the top of the page.



After a little fiddling, you can choose whether you want that data as RSS, as CSV, or even as a Google Map. (It really does take some fiddling and pruning to work out what to do here, but it does work once you've got your head around it. Dapper is an astonishingly wonderful tool; I've never seen anything else that does what it does with such elegance.)

I could then have added my new RSS feed to my RSS reader, but I actually made another Dapp to pick up the articles lower down the page. That left me with two RSS feeds, when I really wanted just one.

One of the "dapps" I created is here:
http://open.dapper.net/dapp-howto-use.php?dappName=CommunitiesandNetworkConnectionDapperVersion2



Yahoo Pipes To The Rescue

Yahoo Pipes is a wonderful visual tool for "piping" together different information sources and republishing the result. The pipe I created takes the two RSS feeds from Dapper (at the top of the pipe), joins them together (Union), strips out any duplicates (Unique) and lastly filters out any junk posts.
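If you think in code rather than boxes and wires, the same three steps can be sketched in a few lines of Python. This is a rough analogy, not how Yahoo Pipes works internally: the function names, the sample items, de-duplicating on the `link` field and the word-list junk rule are all assumptions made for the sake of illustration.

```python
def union(*feeds):
    """Concatenate several feeds (lists of item dicts), keeping their order."""
    merged = []
    for feed in feeds:
        merged.extend(feed)
    return merged


def unique(items, key="link"):
    """Keep only the first item seen for each value of `key`."""
    seen = set()
    result = []
    for item in items:
        value = item[key]
        if value not in seen:
            seen.add(value)
            result.append(item)
    return result


def drop_junk(items, blocked_words):
    """Remove items whose title contains any blocked word (case-insensitive)."""
    blocked = [word.lower() for word in blocked_words]
    return [
        item for item in items
        if not any(word in item["title"].lower() for word in blocked)
    ]


# Made-up stand-ins for the two Dapper feeds.
feed_top = [
    {"title": "Wiki adoption tips", "link": "http://example.com/1"},
    {"title": "Forum moderation", "link": "http://example.com/2"},
]
feed_lower = [
    {"title": "Forum moderation", "link": "http://example.com/2"},
    {"title": "Sponsored: buy now", "link": "http://example.com/3"},
]

# Union, then Unique, then the junk filter, in the same order as the pipe.
merged = drop_junk(unique(union(feed_top, feed_lower)), ["sponsored"])
```

The order matters a little: removing duplicates before filtering means the junk rule only has to look at each article once.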



The RSS feed that Yahoo Pipes creates is here:
http://pipes.yahoo.com/pipes/pipe.run?_id=10c40fa02b113c58042af74deead0c1a&_render=rss

And it looks a bit like this:


After a few minutes of configuration with point-and-click tools, I can now keep up with the news from the site in my news reader.


© 2013 Klick Dev. All rights reserved.
