Retweet Network Analysis in Cryptocurrencies

This article was first published on R – Predictive Hacks and kindly contributed to R-bloggers.

In this tutorial, we will run a simple network analysis on Twitter data, focusing on retweets that contain the hashtag “#Crypto”. We assume that you are familiar with social network analysis (SNA) and that you know how to get Twitter data using R or Python. We will work in R.

Get the Tweets

The first thing we need to do is get the tweets that contain the hashtag “#Crypto”. Let’s load the libraries and get authorized.

#########################################
################# Crypto Network Analysis
#########################################
library(rtweet)
library(tidyverse)
library(igraph)


## store api keys (these are fake example values; replace with your own keys)
api_key <- "e2doc7YCE7QXOTJEMZjuto1u3"
api_secret_key <- "kEXdrury86McHUouQMy4SdEyFzgNBcJvoAjD1NJ3xL81RLuLgu"
access_token <- "313313903-fauYDPDt3PT6ACsaB8H1hIJsUygwLKbO2BNFYQyd"
access_token_secret <- "4qK7jeN9PX8zXwek1GyeGVOEUlEzbZU8d4OV0YifWBKuz"

## authenticate with the app's keys and access tokens
token <- create_token(
  app = "gpipis_test_01",
  consumer_key = api_key,
  consumer_secret = api_secret_key,
  access_token = access_token,
  access_secret = access_token_secret)

get_token()
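Hard-coding keys in a script makes them easy to leak. As a minimal sketch, the keys could instead be read from environment variables. The helper name `read_twitter_keys()` and the variable names (`TWITTER_API_KEY` and so on) are assumptions made for this example; rtweet does not require them.

```r
# Hypothetical helper: read the four credentials from environment variables.
# The variable names are assumptions chosen for this sketch.
read_twitter_keys <- function() {
  vars <- c(
    api_key             = "TWITTER_API_KEY",
    api_secret_key      = "TWITTER_API_SECRET_KEY",
    access_token        = "TWITTER_ACCESS_TOKEN",
    access_token_secret = "TWITTER_ACCESS_TOKEN_SECRET"
  )

  # unset = NA lets us tell "not set" apart from a real value
  keys <- Sys.getenv(vars, unset = NA_character_)
  names(keys) <- names(vars)

  # fail early and name the variables that are missing or empty
  missing_vars <- vars[is.na(keys) | keys == ""]
  if (length(missing_vars) > 0) {
    stop("Missing environment variables: ",
         paste(missing_vars, collapse = ", "))
  }

  as.list(keys)
}
```

The returned list can then feed the call above, for example `consumer_key = keys$api_key` after `keys <- read_twitter_keys()`.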
 

Now, let’s request up to 50,000 tweets that contain the hashtag “#Crypto”. We then keep two columns: screen_name, the user who retweeted, and retweet_screen_name, the user whose tweet was retweeted. Each row is therefore a directed edge from the retweeter (the source) to the original author (the target).

# search for tweets containing the hashtag #Crypto
rt_all<-search_tweets("#Crypto", n=50000, include_rts = TRUE, retryonratelimit=TRUE)

# keep only the screen_name and the retweet_screen_name
rt<-rt_all[, c("screen_name", "retweet_screen_name")]

# remove the NA rows and duplicates
rt<-rt%>%na.omit()%>%distinct_all()
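To see what the cleaning step does, here is a small sketch on made-up rows with the same two columns. The user names are invented for illustration and do not come from the collected data.

```r
library(dplyr)

# Made-up rows in the same shape as the two columns kept from rt_all
toy <- data.frame(
  screen_name         = c("alice", "bob", "bob", "carol", "dave"),
  retweet_screen_name = c("dave", "dave", "dave", NA, "alice"),
  stringsAsFactors = FALSE
)

toy_edges <- toy %>%
  na.omit() %>%     # drops carol's row: an original tweet, not a retweet
  distinct_all()    # collapses bob's two retweets of dave into one edge

nrow(toy_edges)     # 3 edges remain: alice->dave, bob->dave, dave->alice
```

Because duplicates are collapsed, each edge records that one user retweeted another at least once, not how many times.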
 

Before building the network, it helps to get an idea of the data we obtained. We can start by looking at the ten tweets that got the most likes.

View(rt_all%>%arrange(desc(favorite_count))%>%head(10))
 

Retweet Network Analysis

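Our goal is to find the influencers, that is, the users whose tweets are retweeted by many other users. In network analysis, this is measured by “in-degree centrality”: the number of edges pointing to a node. Since duplicate rows were removed, a user’s in-degree here equals the number of distinct accounts that retweeted them. The data frame of retweeters (screen_name) and retweeted users (retweet_screen_name) looks like this: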

head(rt)
 

# A tibble: 6 x 2
  screen_name     retweet_screen_name
  <chr>           <chr>              
1 Aughauztyne     Cryptoconflict_    
2 JavadHo49442036 Raiinmakerapp      
3 maubigwin_dong  kakanftworld       
4 tanvirtarek5    ThePulseLorian     
5 merGeyi93956765 KCC_Enthusiast     
6 merGeyi93956765 Texan_Shinja

Using the igraph library, we create a directed network in which each edge points from the retweeting user to the retweeted user, and then we extract the users with the highest in-degree centrality.

# create the directed network graph
crypto_network <- graph_from_edgelist(el = as.matrix(rt), directed = TRUE)


# get the in-degree i.e. users who are re-tweeted
in_degree<-degree(crypto_network, mode=c("in"))

# get the top 10 users in terms of in degree
as.data.frame(in_degree)%>%arrange(desc(in_degree))%>%head(10)
 

We get:

               in_degree
SombraNetwork        382
kakanftworld         337
ThePulseLorian       289
revolut20            247
DeeColinok           195
CryptoTownEU         188
Trush_io             181
cryptoo_kingg        113
kucoincom            105
dio_ianakiara         97
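The edge direction is easy to get backwards, so it can be worth checking on a tiny graph first. The sketch below uses an invented three-user edge list, where column 1 retweets column 2, as in the rt data frame.

```r
library(igraph)

# Made-up edge list: the user in column 1 retweeted the user in column 2
toy_edges <- matrix(
  c("alice", "dave",
    "bob",   "dave",
    "dave",  "alice"),
  ncol = 2, byrow = TRUE
)

toy_graph <- graph_from_edgelist(toy_edges, directed = TRUE)

# in-degree = number of incoming edges = number of accounts that retweeted you
toy_in <- degree(toy_graph, mode = "in")

toy_in[["dave"]]   # 2 (retweeted by alice and bob)
toy_in[["alice"]]  # 1 (retweeted by dave)
toy_in[["bob"]]    # 0 (nobody retweeted bob)
```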

We can have a look at the profiles of these “key” users.

Apart from in-degree centrality, there is also out-degree centrality, which counts the edges leaving a node, here the number of distinct accounts that a user retweeted. Let’s see the accounts with the highest out-degree centrality.

# get the out-degree i.e. users who re-tweet
out_degree<-degree(crypto_network, mode=c("out"))

# get the top 10 users in terms of out degree
as.data.frame(out_degree)%>%arrange(desc(out_degree))%>%head(10)
 

And we get:

                out_degree
Jisan909                13
yamada42663496           9
HERO_SAMURAI1            8
Realist481               8
smartsandal              7
Okami_mxsamurai          7
Fabriciosx               7
E__dollar                7
roseannb17               7
Shido_samuraii           6
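To compare the two measures side by side, both can be collected in one data frame. This is a sketch only: `degree_table()` is a hypothetical helper written for this example, shown on an invented graph rather than on crypto_network.

```r
library(igraph)

# Hypothetical helper: one row per account with both degree measures
degree_table <- function(g) {
  data.frame(
    user       = V(g)$name,
    in_degree  = unname(degree(g, mode = "in")),
    out_degree = unname(degree(g, mode = "out")),
    stringsAsFactors = FALSE
  )
}

# Made-up graph: two fans retweet "hub", and fan1 also retweets "other"
demo_graph <- graph_from_edgelist(
  matrix(c("fan1", "hub",
           "fan2", "hub",
           "fan1", "other"),
         ncol = 2, byrow = TRUE),
  directed = TRUE
)

demo_degrees <- degree_table(demo_graph)
demo_degrees[order(-demo_degrees$in_degree), ]
```

In this toy graph, "hub" is retweeted twice but retweets nobody, while "fan1" retweets two accounts and is retweeted by none. The same call, `degree_table(crypto_network)`, should work on the real network since it has named vertices.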

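Finally, we can compute the betweenness centrality, which measures a user’s influence over the flow of information in the network: it counts how many shortest paths between other pairs of users pass through that user. Users with high betweenness act as bridges between otherwise separate parts of the network.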

# betweenness centrality
bt_degree<-betweenness(crypto_network, directed = TRUE)
as.data.frame(bt_degree)%>%arrange(desc(bt_degree))%>%head(10)
 

And we get:

                bt_degree
Shido_samuraii       30.5
Emmy_prime_          14.0
miticomansamur1      12.0
ultracig             11.0
Okami_mxsamurai       9.5
Sohylasamurai2        6.0
Nanceebaybehh         6.0
BobEatsBacon1         5.0
VanZoy1               5.0
_borutoKING           4.0
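A tiny example may make the measure more concrete. In the invented chain below, "a" retweets "b" and "b" retweets "c", so the only shortest path between two other users that passes through a third one is a -> b -> c.

```r
library(igraph)

# Made-up chain: a retweets b, b retweets c
chain <- graph_from_edgelist(
  matrix(c("a", "b",
           "b", "c"),
         ncol = 2, byrow = TRUE),
  directed = TRUE
)

chain_bt <- betweenness(chain, directed = TRUE)

chain_bt[["b"]]  # 1 (b sits on the single shortest path from a to c)
chain_bt[["a"]]  # 0 (a is only ever an endpoint)
chain_bt[["c"]]  # 0 (c is only ever an endpoint)
```

The small values in the table above are consistent with this picture: a user needs both incoming and outgoing edges to score at all, and few accounts in a retweet network both retweet others and get retweeted.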

Final Thoughts

We know that asset prices are affected by the available information. As the market adage goes, “buy the rumor, sell the news”. Since many investors analyze trends on social media such as Twitter, retweet network analysis can help detect the influencers of the network. It is a good idea to keep monitoring them, since they can impact the market to some extent.
