Merging RSS and Atom feeds from various sources
I couldn't find any tool out there that merges entries from several sources in a smart way, by trying to detect duplicates.
I wrote a little script, extending Mark Pilgrim's feedparser that we use in CPSRSS, to merge several sources, using the difflib module and the RSS rendering we have in CPSBlog.
It calculates the diff ratio on the title and content of each entry to decide whether
it's the same entry. When the ratio is <= 0.2, it's the same entry (hopefully :) )
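The comparison step could look roughly like the sketch below. This is not the original script, just a minimal illustration under a few assumptions: entries are plain dicts with "title" and "content" keys (feedparser is left out to keep it self-contained), the "diff ratio" is taken to be 1 minus difflib's similarity ratio (so 0.0 means identical), and both the title and the content have to pass the 0.2 threshold. The function names are made up for the example.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.2  # value quoted in the post


def diff_ratio(a, b):
    """Distance in [0, 1]; 0.0 means the two strings are identical.

    Assumption: the post's "diff ratio" is 1 - SequenceMatcher.ratio().
    """
    return 1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio()


def is_same_entry(first, second, threshold=THRESHOLD):
    """Assumption: title AND content must both be within the threshold."""
    return (
        diff_ratio(first["title"], second["title"]) <= threshold
        and diff_ratio(first["content"], second["content"]) <= threshold
    )


def merge_entries(*sources):
    """Merge several entry lists, keeping the first copy of each duplicate."""
    merged = []
    for entries in sources:
        for entry in entries:
            if not any(is_same_entry(entry, kept) for kept in merged):
                merged.append(entry)
    return merged


planet = [
    {"title": "Python 2.4 released",
     "content": "The Python 2.4 release is now available for download."},
    {"title": "Zope sprint report",
     "content": "Notes from last week's sprint in Paris."},
]
buzz = [
    {"title": "Python 2.4 released!",
     "content": "The Python 2.4 release is now available for download"},
]
merged = merge_entries(planet, buzz)
```

Each new entry is compared against everything already kept, so this is quadratic in the number of entries, which should be fine for a handful of feeds.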
Here's an example run on these feeds:
- http://www.planetpython.org/rss20.xml
- http://www.artima.com/buzz/feeds/python.rss
- http://blogs.nuxeo.com/sections/aggregators/all_posts/exportrss
- http://aspn.activestate.com/ASPN/Cookbook/Python/index_rss
The result is here
(It's a one-shot XML file, generated today, so it's not a real feed;
it is still readable by any client, though.)
Now I've been told that this is pretty useless, and that I would do better to clean up my feeds and do more interesting stuff in my spare time.
But I can't help it: every time I see a feed related to Python, I just add it
to my client :'). So for an unorganized person like me, a personal CPSRSS website with this merging capability, where I can drop tons of feeds, would be perfect.
(Post originally written by Tarek Ziadé on the old Nuxeo blogs.)