Most RSS feeds these days seem to be short snippets. Is there a programmatic way to get a “full” copy of an article from the feed?
I’d even be happy with all the random crap on the webpage showing up in my reader - I just don’t want to have to click through everything.
I’m trying to use DEVONthink for my feeds, so various options built into various RSS clients probably aren’t viable. I’d want something to basically convert a whole feed.
On the Mac I would suggest that you parse the source URLs from the RSS feed and then retrieve the content.
You could potentially use grep to parse out URLs and wget to grab pages through some shell scripting, or you could write something in Python or the like to do this, which would probably give you much greater control over the final format of any content downloaded.
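To give a rough idea of what the Python route might look like, here is a minimal sketch using only the standard library. It assumes a plain RSS 2.0 feed where each item has a link element (Atom feeds are laid out differently and are not handled). The function names extract_links, fetch and expand_feed are made up for illustration, and the feed URL in the usage note is a placeholder. It pulls the article URLs out of the feed, downloads each page, and writes the full page HTML into each item's description, so the result is a converted copy of the whole feed.

```python
import urllib.request
import xml.etree.ElementTree as ET


def extract_links(rss_xml):
    """Return the <link> URL of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link:
            links.append(link.strip())
    return links


def fetch(url, timeout=30):
    """Download a URL and return its body as text (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def expand_feed(rss_xml, fetch_page=fetch):
    """Return a copy of the feed with each <description> set to the full page.

    fetch_page is any callable that maps a URL to page text; it defaults to
    fetch() above, and can be swapped out (e.g. for caching or testing).
    """
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        link = item.findtext("link")
        if not link:
            continue  # nothing to download for this item
        description = item.find("description")
        if description is None:
            description = ET.SubElement(item, "description")
        # ElementTree escapes the HTML when serializing, which is how
        # RSS readers expect markup inside <description> to be encoded.
        description.text = fetch_page(link.strip())
    return ET.tostring(root, encoding="unicode")
```

Something like print(expand_feed(fetch("https://example.com/feed.xml"))) would then print the converted feed, which you could save to a file and point your reader at. This dumps the entire page, navigation and all, into each item; trimming it down to just the article body would need an extra extraction step that is not shown here.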
If you have a preferred RSS reader, and it is scriptable through AppleScript (or at worst, via key presses and clicks with Keyboard Maestro), you might be able to automate that to run through the feeds, load the full page source, and export that as a PDF. Those apps are designed around grabbing the articles based on the feed, not just the excerpts, so it should absolutely be possible as long as you have such an app that is a good Mac citizen.
Hope that helps,
For what it’s worth I managed to very quickly find resources that explain elements of the command line/Python approach. They might help if you do want to investigate those approaches further.
This post has some command line examples around processing RSS feeds.
This post has a nice worked example of a shell script to parse an RSS feed.
This post gives some starting points for a Python-based library for RSS.
This post has another worked example of working with an RSS feed in Python.
If you don’t want to “roll your own”, there are several free tools out there you can find by searching for Full Text RSS.
Most of these are someone’s pet project that they threw together with some PHP and put up on a website for free. Every now and again some of those will die off, but new ones crop up fairly often too. Of course there are some commercial options, too. Haven’t used any of those, so I can’t say how they work.
(When I worked at TUAW, our RSS feed was truncated by decree of our AOL overlords. I used to have a TextExpander snippet of the URL to use with one of the free “get full-text RSS feeds” services that I would send to anyone who wrote in or posted on Twitter complaining about it.)