I had planned to write this post a few months ago, but I unexpectedly became a research assistant and got too busy to find the time. Last semester, I took the “Machine Learning for Multimedia Informatics” course, where we were free to choose our own topic and develop a machine learning application as the course project. Before taking the course, I had tried gathering and using some data from Tinder, so I wanted to build something on top of that. I was later steered towards other topics, though, and chose to create something related to Twitter.
The basic idea is to predict the popularity of a given tweet before it is even tweeted. In the literature, popularity is mostly measured by retweet count, but most people do not get retweeted much, and that measure does not apply to private accounts. Since I wanted to create something that could be useful for any Twitter user, I decided to measure the number of favorites (likes) instead. I believe this is a more stable metric: people are reluctant to retweet an ordinary user but are more generous with favorites, perhaps because favorites, unlike retweets, do not appear alongside their own tweets in their main profile feed.