I mentioned in my last post that I wrote a module that processed my RSS feeds. After further testing, this module turned out to be quite unreliable. It worked for my feed and a few others but was not robust enough to handle the varying ways in which people represent published dates in their blog RSS feeds. Because dates can be formatted in many different ways, I would have to spend a lot of time gathering different timestamps and writing code that supported them. This was far from ideal. Also, I found that some feeds were marked up using Atom rather than RSS, so they might not have worked with my code.
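To give a rough idea of the date problem, here is a small sketch that handles just two common shapes: the RFC 822 style dates often seen in RSS feeds and the ISO 8601 style dates used by Atom. The sample strings and the `parse_feed_date` helper are made up for illustration and are not the code from my original module. Real feeds vary well beyond these two cases, which is exactly the issue.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime

# Hypothetical sample values, not taken from any real feed.
rss_date = "Mon, 06 Sep 2021 10:30:00 +0000"   # RFC 822 style, common in RSS
atom_date = "2021-09-06T10:30:00+00:00"        # ISO 8601 style, used by Atom


def parse_feed_date(value):
    """Try the RFC 822 format first, then fall back to ISO 8601."""
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError here, newer ones ValueError.
        return datetime.fromisoformat(value)


print(parse_feed_date(rss_date))
print(parse_feed_date(atom_date))
```

Every extra format a feed throws at this function would need another fallback, and that list of special cases is what I did not want to maintain.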
In short, writing an RSS parser capable of adapting to different feeds is not a job that I wanted to do myself. That’s why I just set up a new account on Feedly, an RSS reader with an API. I have used Feedly in the past but I did not stick with it. This time, even if I don’t use the Feedly web interface, I will still be using their API. The Feedly API meets my need of seeing all the latest posts published by the people whose RSS feeds I follow. Feedly processes timestamps and converts them into UNIX time, so I can compare published dates without worrying about how each feed formats them.
Knowing when a blog post was published is incredibly important because I only want to print the titles of articles published in the last day. Without this logic, I believe I would end up printing out at least 10 headlines for each feed I follow because feeds usually contain data on at least the 10 most recently published posts. This would consume a lot of paper and is completely impractical for me. I have a Python if statement that checks if a post was published in the last day and, if so, will execute code that will print the title and URL of the article in my daily update.
The logic for the new RSS module is as follows:
- Print a message informing me that the next section of the daily update will show my RSS feeds.
- Get articles in my main feed from Feedly.
- Iterate over each article and check if it was published in the last day.
- If an article was published in the last day, the title and URL of the article should be printed to the console.
- Otherwise, nothing should happen.
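The steps above can be sketched roughly as follows. This is a minimal illustration rather than my actual module. It assumes the articles have already been fetched from Feedly and turned into dictionaries with `title`, `url` and `published` keys, and that `published` is UNIX time in milliseconds. Those key names, and the `recent_articles` and `print_rss_section` functions, are hypothetical, and the check here is "within the last 24 hours".

```python
import time

DAY_MS = 24 * 60 * 60 * 1000  # one day in milliseconds


def recent_articles(items, now_ms=None):
    """Return (title, url) pairs for items published in the last 24 hours.

    Assumes each item is a dict whose "published" value is UNIX time in
    milliseconds. Adjust DAY_MS if your timestamps are in seconds.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    cutoff = now_ms - DAY_MS
    recent = []
    for item in items:
        published = item.get("published")
        # Items without a timestamp, or older than the cutoff, are skipped.
        if published is not None and published >= cutoff:
            recent.append((item.get("title", "(untitled)"), item.get("url", "")))
    return recent


def print_rss_section(items, now_ms=None):
    """Print the section header, then one line per recent article."""
    print("RSS feeds")
    for title, url in recent_articles(items, now_ms):
        print(f"{title}: {url}")


if __name__ == "__main__":
    now = int(time.time() * 1000)
    sample = [
        {"title": "New post", "url": "https://example.com/new", "published": now - 3_600_000},
        {"title": "Old post", "url": "https://example.com/old", "published": now - 2 * DAY_MS},
    ]
    print_rss_section(sample)
```

Passing `now_ms` in explicitly makes the date check easy to test without waiting for the clock to move.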
This post is titled Part II.5 because it addresses more of a technical hurdle than anything else. But I did learn a valuable lesson: RSS feeds are not entirely consistent, and leaving feed parsing to the pros is probably the best choice given my skills and my use case of reading feeds for this program.
Also posted on IndieNews.