Using ScraperWiki, Plotly And R to Analyze Twitter Follower Data

ScraperWiki is a powerful tool for scraping sites and analyzing data. The platform is quite robust, and a growing number of developers have started using it, as evidenced by the number of GitHub repos. One of the built-in scrapers lets you scrape your Twitter account for follower information. I decided this would be a good way to cut my teeth with ScraperWiki, and it gave me an excuse to look at my Twitter followers in depth.

Scraping

After signing in to your ScraperWiki account, getting your Twitter followers is as straightforward as entering your Twitter handle. From here, ScraperWiki has some built-in tools to visualize the data, but I chose to export the CSV and work with it in R.
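For anyone following along, here is a minimal sketch of loading such an export into R. The column names (screen_name, statuses_count, location, description) are assumptions for illustration, not ScraperWiki's documented schema, and the sketch writes a tiny stand-in file so it runs on its own. Point read.csv at your real export instead.

```r
# Write a tiny stand-in CSV so the example is self-contained.
# The column names below are assumed, not taken from a real export.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "screen_name,statuses_count,location,description",
  "alice,120,\"Salt Lake City, UT\",Data nerd",
  "bob,4500,SLC,Runner and coder",
  "carol,15,,"
), csv_path)

# Keep text columns as character vectors rather than factors,
# which makes later string cleanup simpler.
followers <- read.csv(csv_path, stringsAsFactors = FALSE)

# Quick look at the column types and the first few values.
str(followers)
```

Checking the output of str() first is a cheap way to confirm that counts were read as numbers and that free-text fields containing commas were parsed correctly.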

Tweet Distribution

I wanted to see how much my followers tweet. It turns out that few of them are heavy tweeters, which isn’t surprising: with 500+ followers, I wouldn’t expect most of them to be active.
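A rough sketch of how a distribution like this can be summarized in base R follows. The tweet counts are made-up numbers standing in for a per-follower tweet count column (called statuses_count here as an assumption), and the bucket boundaries are an arbitrary choice, not necessarily what produced the original chart.

```r
# Made-up tweet counts standing in for an assumed statuses_count column.
tweet_counts <- c(0, 3, 15, 42, 120, 380, 950, 4500, 12000, 61000)

# Bucket followers by order of magnitude so that a handful of very
# heavy tweeters does not flatten the rest of the plot.
breaks <- c(0, 10, 100, 1000, 10000, Inf)
labels <- c("<10", "10-99", "100-999", "1k-9.9k", "10k+")
buckets <- cut(tweet_counts, breaks = breaks, labels = labels, right = FALSE)

# Count followers per bucket and draw a simple bar chart.
bucket_counts <- table(buckets)
print(bucket_counts)

barplot(bucket_counts,
        xlab = "Tweets posted", ylab = "Followers",
        main = "Followers by tweet volume")
```

Tweet counts tend to be heavily skewed, so fixed-width histogram bins usually put almost everyone in the first bar. Bucketing by magnitude is one way around that.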

Locations

Now this metric isn’t perfect: it’s just the free text people can include in their profile. I treated each location as a string and then counted the most frequent ones. Utah (where I live now) accounts for the largest share of my followers. The different stylized ways of writing Salt Lake City certainly come out in this graph as well.
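The string-counting approach described above can be sketched in a few lines of base R. The locations below are invented examples, and the light cleanup (trimming whitespace and lowercasing) is an assumption added for illustration. It is not necessarily what was done for the original graph.

```r
# Made-up profile locations standing in for an assumed location column.
locations <- c("Salt Lake City, UT", "SLC", "salt lake city", "Provo, Utah",
               "", "New York", "SLC ", "Salt Lake City, UT")

# Light normalization: trim whitespace, lowercase, and drop empty entries.
cleaned <- tolower(trimws(locations))
cleaned <- cleaned[cleaned != ""]

# Count each distinct string and sort from most to least frequent.
location_counts <- sort(table(cleaned), decreasing = TRUE)
print(head(location_counts, 10))
```

Even after this cleanup, "slc", "salt lake city", and "salt lake city, ut" are counted as separate places, which is exactly the kind of variation that shows up when locations are treated as plain strings.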

Future

The profile description text is among the most interesting data pulled from this scrape, and I will do a future post on how to fit a Latent Dirichlet Allocation model to it in order to pull out topics. It’ll also be interesting to look at when my followers signed up for Twitter.

Code:

Jowanza Joseph | March 15, 2014 | Data, R, Visualization, Web
