Site icon R-bloggers

Tweets vs. Likes: What gets shared on Twitter vs. Facebook?

[This article was first published on Edwin Chen's Blog » r, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

It always strikes me as curious that some posts get a lot of love on Twitter, while others get many more shares on Facebook:

What accounts for this difference? Some of it is surely site-dependent: maybe one blogger has a Facebook page but not a Twitter account, while another has these roles reversed. But even on sites maintained by a single author, tweet-to-likes ratios can vary widely from post to post.

So what kinds of articles tend to be more popular on Twitter, and which spread more easily on Facebook? To take a stab at an answer, I scraped data from a couple of websites over the weekend.

tl;dr Twitter is still for the techies: articles where the number of tweets greatly outnumber FB likes tend to revolve around software companies and programming. Facebook, on the other hand, appeals to everyone else: not only to the masses, but to non-software technical folks in general as well.

FlowingData

The first site I looked at was Nathan Yau’s popular FlowingData website on data visualization. To see which articles are more popular on Facebook and which are more popular on Twitter, let’s sort all the FlowingData articles by their # tweets / # likes ratio.

Here are the 10 posts with the lowest tweets-to-likes ratio (i.e., the posts that were especially popular with Facebook users):

And here are the 10 posts with the highest tweets-to-like ratio (i.e., the posts especially popular with Twitter users):

Notice any differences between the two?

In fact, if we compare articles with pictures to those without, we see that 32% of articles with at least one picture have more shares on Facebook than on Twitter, compared to only 4% of articles with no picture at all.

We can also break down the percentage of articles with more Facebook shares by category.

(I filtered the categories so that each category in the plot above contains at least 5 articles.)

What do we find?

New Scientist

When looking at FlowingData, we saw that compared to Facebook users, Twitter users are much bigger on sharing technical articles. But is this true for technical articles in general, or only for programming-related posts? (In my experience with Twitter, I haven’t seen many people from math and the non-computer sciences.)

To answer, I took articles from the Physics & Math and Technology sections of New Scientist, and

Here’s a graph of the result:

Notice that:

So this suggests that Twitter really is for computer technology in particular, not technical matters in general (though it would be nice to look at areas other than physics and math as well).

XKCD

Finally, let’s take a look at which XKCD comics are especially popular on Facebook vs. Twitter.

Here are the 10 comics with the highest likes-to-tweets ratio (i.e., the comics especially popular on Facebook):

Here are the 10 comics with the highest tweets-to-likes ratio (i.e., the comics especially popular on Twitter):

Note that the XKCD comics popular on Facebook have more of a layman flavor, while the XKCD comics popular on Twitter are much more programming-related:

What’s Next?

The three websites I looked at are all fairly tech-oriented, so it would be nice to gather data from other kinds of websites as well.

And now that we have an idea how Twitter and Facebook compare, the next burning question is surely what do people share on Google+?!

Addendum

Let’s consider the following thought experiment. Suppose you come across the most unpopular article ever written. What will its FB vs. Twitter shares look like? Although no real person will ever share this article, I think Twitter has many more spambots (who tweet out any and every link) than FB does, so maybe unpopular articles will have more tweets than likes by default. Conversely, suppose you come across the most popular article ever written, which everybody wants to share. Then since FB has many more users than Twitter does, maybe popular articles will tend to have more likes than tweets anyways.

Thus, in order to find out which types of articles are especially popular on FB vs. Twitter, instead of looking at tweets-to-likes ratios directly, we could try to remove this baseline popularity effect. (Taking ratios instead of raw number of tweets or raw number of likes is one kind of normalization; this is another.)

So does this scenario (or something similar to it) actually play out in practice?

Here I’ve plotted the overall popularity of a post (the total number of shares it received on either Twitter or FB) against the percentage of shares on Facebook alone, and we can see that as a post’s popularity grows, more and more shares do indeed tend to come from Facebook rather than Twitter.

Also, see the posts at the lower end of the popularity scale that are only getting shares on Twitter? Let’s take a look at the five most unpopular of these:

Notice that they’re all shoutouts to FlowingData’s sponsors! There’s pretty much no reason any real person would share these on Twitter or Facebook, and indeed, checking Twitter to see who actually tweeted out these links, we see that the tweeters are bots:

Now let’s switch to a slightly different view of the above scenario, where I plot number of tweets against number of likes:

We see that as popularity on Twitter increases, so too does popularity on Facebook — but at a slightly faster rate. (The form of the blue line plotted is roughly .)

So instead of looking at the ratios above, to figure out which articles are popular on FB vs. Twitter, we could look at the residuals of the above plot. Posts with large positive residuals would be posts that are especially popular on FB, and posts with negative residuals would be posts that are especially popular on Twitter.

In practice, however, there wasn’t much difference between looking at residuals vs. ratios directly when using the datasets I had, so to keep things simple in the main discussion above, I stuck to ratios alone. Still, it’s another option which might be useful when looking at different questions or different sources of data, so just for completeness, here’s what the FlowingData results look like if we use residuals instead.

The 10 articles with the highest residuals (i.e., the articles most popular on Facebook):

The 10 articles with the lowest residuals (i.e., the articles most popular on Twitter):

Here’s a density plot of article residuals, split by whether the article has a visualization or not (residuals of picture-free articles are clearly shifted towards the negative end):

Here are the mean residuals per category (again, we see that the miscellaneous, data underload, data art, and infographics categories tend to be more popular on Facebook, while the data sources, software, online applications, and news categories tend to be more popular on Twitter):

And that’s it! In the spirit of these findings, I hope this article gets liked a little and tweeted lots and lots.


To leave a comment for the author, please follow the link and comment on their blog: Edwin Chen's Blog » r.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.