Retweet, you robot!
The idea
There are a lot of people in the Ruby/Rails community that I’d like to follow. However, in order to read their interesting tweets I’d also have to agree to read all the other things they tweet, and then I’d do nothing but read tweets all day.
Basically, I want to read this:
Thanks @steveklabnik for reminding me about this article. Every programmer should read it: kalzumeus.com/2010/06/17/fal…
— Aaron Patterson (@tenderlove) August 31, 2012
But not this:
Just saw a poo that looked like a starfish.
— Aaron Patterson (@tenderlove) October 15, 2012
(sorry, Aaron…)
What I need is someone who would follow all these people, read all their tweets and retweet only what seems important. This bot is my attempt at creating such a filter.
How it works
The basic idea was that the best tweets get retweeted a lot, so I made the bot select tweets with a high number of retweets. Adding favorites improved things further, because a lot of tweets get many favorites but not many retweets (especially useful but not funny tweets from @ruby_news or @rubyflow; the funny ones get retweeted the most). I ignored retweets of tweets by people outside the list, because almost all of them were off topic.
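To make the selection step concrete, here is a minimal sketch of scoring by retweets plus favorites and skipping retweets of outsiders. The `Tweet` struct, its field names and the `FOLLOWED` list are assumptions made for illustration; the bot's real code may be organized quite differently.

```ruby
# Hypothetical tweet record; the field names are assumptions, not the bot's real schema.
Tweet = Struct.new(:author, :text, :retweets, :favorites, :retweeted_from,
                   keyword_init: true)

# Hypothetical list of accounts the bot follows.
FOLLOWED = %w[tenderlove dhh spastorino ruby_news rubyflow].freeze

# Interest score: retweets and favorites simply added together.
def score(tweet)
  tweet.retweets + tweet.favorites
end

# A tweet is a candidate unless it is a retweet of someone outside the list.
def candidate?(tweet, followed = FOLLOWED)
  tweet.retweeted_from.nil? || followed.include?(tweet.retweeted_from)
end

# Highest-scoring candidates first.
def top_tweets(tweets, limit: 10)
  tweets.select { |t| candidate?(t) }.sort_by { |t| -score(t) }.first(limit)
end
```

Adding the two counts with equal weight is the simplest choice; a weighted sum would work the same way if favorites turned out to matter more or less than retweets.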
At that point most of the interesting tweets were marked to be retweeted, but most of the top tweets were still not relevant: funny tweets about random things, tweets about politics, current news, Apple, Microsoft, startups, religion, etc. So I added a keyword whitelist: I went through the top tweets and prepared a list of keywords that would only match the tweets I’d like to see retweeted.
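A keyword whitelist like this can be sketched as a single case-insensitive regular expression. The keywords below are made up for the example (the real list lives in the bot's source), and matching on whole words is an assumption about how the matching is done.

```ruby
# Hypothetical whitelist; the bot's actual keyword list is not reproduced here.
KEYWORDS = %w[ruby rails gem rest activerecord].freeze

# Match any keyword as a whole word, ignoring case.
KEYWORD_PATTERN = /\b(?:#{Regexp.union(KEYWORDS).source})\b/i

# True when the tweet text mentions at least one whitelisted keyword.
def on_topic?(text)
  KEYWORD_PATTERN.match?(text)
end
```

Whole-word matching keeps "gemstone" from counting as "gem", but it cannot tell a word's meanings apart: a tweet about getting some rest still matches the keyword "rest".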
I’ve also made the minimum number of retweets + favorites depend on the author: those with a high number of followers get many more retweets on average, so a post with 30 retweets by @spastorino (3871 followers) will usually be more interesting than a post with 30 retweets by @dhh (72141 followers).
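One way to express a per-author threshold is to let it grow with the follower count. The square-root formula and the constants below are purely illustrative assumptions, not the formula the bot uses; they are only tuned so that the example from the paragraph above comes out as described.

```ruby
# Hypothetical constants, chosen only for this illustration.
BASE_THRESHOLD = 5.0
FOLLOWER_SCALE = 0.2

# Minimum retweets + favorites a tweet needs, growing with the author's audience.
# The square root is an assumed shape: it raises the bar for popular accounts
# without making it unreachable.
def min_score(followers)
  BASE_THRESHOLD + FOLLOWER_SCALE * Math.sqrt(followers)
end

# True when a tweet's combined score clears its author's threshold.
def interesting?(score, followers)
  score >= min_score(followers)
end

interesting?(30, 3_871)   # @spastorino's follower count from the text
interesting?(30, 72_141)  # @dhh's follower count from the text
```

With these made-up constants a score of 30 clears the threshold for an author with 3871 followers but not for one with 72141.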
The end result is that even though some good tweets are ignored and some off topic tweets get retweeted (e.g. this tweet by Aaron Patterson got through because the bot thought that the word “rest” was about REST), the filter works surprisingly well in most cases. It should retweet about 4 tweets per day on average, which sounds like an acceptable number. I’ll be checking the results from time to time and tweaking the keyword list and the algorithm to make sure the bot makes the right choices.
Check out the sample below and follow @rails_bot if you like it. If you’d like to learn more about how it works (and maybe help me improve it), see the source code on GitHub.
Enjoy your cleaner Twitter feed!