Site icon R-bloggers

Attach Your R Code To Charts You Tweet For Reproducible R Tweets!

[This article was first published on R – rud.is, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I caught this tweet by Terence Eden about using Twitter image alt-text to “PGP sign” tweet and my mind immediately went to “how can I abuse this for covert communications, malicious command-and-control, and embedding R code in tweets?”.

When you paste or upload an image to tweet (web interface, at least) you have an opportunity to add “alt” text which is — in theory — supposed to help communicate the content of the image to folks using assistive technology. Terence figured out the alt-text limit on Twitter is large (~1K) which is plenty of room for useful R code.

I poked around for something to use as an example and settled on using data from COVID Stimulus Watch. The following makes the chart in this tweet — https://twitter.com/hrbrmstr/status/1261641887603179520.

I’m not posting the chart here b/c it’s nothing special, but the code for it is below.

library(hrbrthemes);

x <- read.csv("https://data.covidstimuluswatch.org/prog.php?&detail=export_csv")[,3:5];

x[,3] <- as.numeric(gsub("[$,]","",x[,3]));
x <- x[(x[,1]>20200400)&x[,3]>0,];
x[,1] <- as.Date(as.character(x[,1]),"%Y%m%d");

ggplot(x, aes(Award.Date, Grant.Amount, fill=Award.Type)) +
  geom_col() +
  scale_y_comma(
    labels = c("$0", "$5bn", "$10bn", "$15bn")
  ) +
  labs(
    title = "COVID Stimulus Watch: Grants",
    caption = "Source: https://data.covidstimuluswatch.org/prog.php?detail=opening"
  ) +
  theme_ipsum_es(grid="XY")

Semicolons are necessary b/c newlines are going to get stripped when we paste that code block into the alt-text entry box.

We can read that code back into R with some help from read_html() & {styler}:

library(rtweet)
library(rvest)
library(stringi)
library(magrittr)

pg <- read_html("https://twitter.com/hrbrmstr/status/1261641887603179520")

html_nodes(pg, "img") %>% 
  html_attr("alt") %>% 
  keep(stri_detect_fixed, "library") %>% 
  styler::style_text()

library(hrbrthemes)
x <- read.csv("https://data.covidstimuluswatch.org/prog.php?&detail=export_csv")[, 3:5]
x[, 3] <- as.numeric(gsub("[$,]", "", x[, 3]))
x <- x[(x[, 1] > 20200400) & x[, 3] > 0, ]
x[, 1] <- as.Date(as.character(x[, 1]), "%Y%m%d")
ggplot(x, aes(Award.Date, Grant.Amount, fill = Award.Type)) +
  geom_col() +
  scale_y_comma(
    labels = c("$0", "$5bn", "$10bn", "$15bn")
  ) +
  labs(
    title = "COVID Stimulus Watch: Grants",
    caption = "Source: https://data.covidstimuluswatch.org/prog.php?detail=opening"
  ) +
  theme_ipsum_es(grid = "XY")

Twitter’s API does not seem to return alt-text:

rtweet::lookup_statuses("1261641887603179520") %>% 
  jsonlite::toJSON(pretty=TRUE)
## [
##   {
##     "user_id": "5685812",
##     "status_id": "1261641887603179520",
##     "created_at": "2020-05-16 12:57:20",
##     "screen_name": "hrbrmstr",
##     "text": "Twitter's img alt-text limit is YUGE! So, we can abuse it for semi-covert comms channels, C2, or for \"embedding\" the code ## that makes this chart!\n\nUse `read_html()` on URL of this tweet; find 'img' nodes w/html_nodes(); extract 'alt' attr text w/## html_attr(). #rstats \n\nh/t @edent https://t.co/v5Ut8TzlRO",
##     "source": "Twitter Web App",
##     "display_text_width": 278,
##     "is_quote": false,
##     "is_retweet": false,
##     "favorite_count": 8,
##     "retweet_count": 2,
##     "hashtags": ["rstats"],
##     "symbols": [null],
##     "urls_url": [null],
##     "urls_t.co": [null],
##     "urls_expanded_url": [null],
##     "media_url": ["http://pbs.twimg.com/media/EYI_W-xWsAAZFeP.png"],
##     "media_t.co": ["https://t.co/v5Ut8TzlRO"],
##     "media_expanded_url": ["https://twitter.com/hrbrmstr/status/1261641887603179520/photo/1"],
##     "media_type": ["photo"],
##     "ext_media_url": ["http://pbs.twimg.com/media/EYI_W-xWsAAZFeP.png"],
##     "ext_media_t.co": ["https://t.co/v5Ut8TzlRO"],
##     "ext_media_expanded_url": ["https://twitter.com/hrbrmstr/status/1261641887603179520/photo/1"],
##     "mentions_user_id": ["14054507"],
##     "mentions_screen_name": ["edent"],
##     "lang": "en",
##     "geo_coords": ["NA", "NA"],
##     "coords_coords": ["NA", "NA"],
##     "bbox_coords": ["NA", "NA", "NA", "NA", "NA", "NA", "NA", "NA"],
##     "status_url": "https://twitter.com/hrbrmstr/status/1261641887603179520",
##     "name": "boB • Everywhere is Baltimore • Rudis",
##     "location": "Doors & Corners",
##     "description": "Don't look at me…I do what he does—just slower. 🇷 #rstats avuncular • pampa • #tired • 👨‍🍳 • ✝ • Prìomh ## Neach-saidheans Dàta @ @rapid7",
##     "url": "https://t.co/RgY1wHjoqM",
##     "protected": false,
##     "followers_count": 11886,
##     "friends_count": 458,
##     "listed_count": 667,
##     "statuses_count": 84655,
##     "favourites_count": 15140,
##     "account_created_at": "2007-05-01 14:04:24",
##     "verified": true,
##     "profile_url": "https://t.co/RgY1wHjoqM",
##     "profile_expanded_url": "https://rud.is/b",
##     "profile_banner_url": "https://pbs.twimg.com/profile_banners/5685812/1398248552",
##     "profile_background_url": "http://abs.twimg.com/images/themes/theme15/bg.png",
##     "profile_image_url": "http://pbs.twimg.com/profile_images/824974380803334144/Vpmh_s3x_normal.jpg"
##   }
## ]

but I still need to poke over at the API docs to figure out if there is a way to get it more programmatically.

If we want to be incredibly irresponsible and daft (like a recently semi-shuttered R package installation service) we can throw caution to the wind and just plot it outright:

library(rtweet)
library(rvest)
library(stringi)
library(magrittr)

pg <- read_html("https://twitter.com/hrbrmstr/status/1261641887603179520")

html_nodes(pg, "img") %>% 
  html_attr("alt") %>% 
  keep(stri_detect_fixed, "library") %>% 
  textConnection() %>% 
  source() %>% # THIS IS DANGEROUS DO NOT TRY THIS AT HOME
  print()

Seriously, though, don’t do that. Lots of bad things can happen when you source() from the internet.

FIN

Now, you can make your Twitter charts reproducible on-platform (until Twitter does something to thwart this new communication and file-sharing channel).

Since twitter status URLs are just GET requests, orgs should consider running the content of those URLs through alt-text extractors just in case there’s some funny business going on across user endpoints.

To leave a comment for the author, please follow the link and comment on their blog: R – rud.is.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.