Wordless Song: Benchmarking Wordle performance using R

quantixed

2 months ago

[This article was first published on Rstats – quantixed, and kindly contributed to R-bloggers]. (You can report an issue about the content on this page here)

A quick post about a puzzle called Wordle that is currently taking over the internet. It’s a Mastermind-like game where the object is to guess an unknown 5-letter word in six tries or fewer.

Puzzlers are encouraged to share their results after completing a puzzle. Here is an example for puzzle 192.

So how do you know if your performance on today’s puzzle was any good? Why not benchmark your effort against the crowd?

Using rtweet to find the performance of the crowd

We can use rtweet to harvest the most recent tweets about Wordle and, for today’s puzzle, extract the number of tries each successful player needed.

library(rtweet)
library(httpuv)
library(ggplot2)

# create token named "twitter_token"
# see previous posts for how to complete this step
twitter_token <- create_token(
  app = appname,
  consumer_key = key,
  consumer_secret = secret)

# get up to 18,000 recent tweets mentioning wordle (retweets excluded)
dl <- search_tweets("wordle", n = 18000, include_rts = FALSE)
# keep tweets that start with today's result, e.g. "Wordle 200 3/6"
wordle_200 <- subset(dl, grepl("^Wordle 200 [1-6X]/6", dl$text))
# the score is the 12th character; failed attempts ("X") become NA
results <- suppressWarnings(as.numeric(substr(wordle_200$text, 12, 12)))
# drop the failed attempts so only successful scores (1-6) remain
results <- results[!is.na(results)]
df <- data.frame(success = results)

p <- ggplot(df, aes(x=success)) +
  geom_histogram(binwidth = 1, colour="black", fill="grey") +
  scale_x_continuous(breaks = seq(1,6,1)) +
  labs(title="Wordle 200", x="Success", y = "Tweets")
p
ggsave("Output/Plots/wordle_200.png",width = 900, height = 600, units = "px")

This gives us the following plot.

Very few people guessed today’s puzzle in 1-2 tries. Most took 3 or 4.

So we can use a plot like this to understand whether people generally found the puzzle easy or hard, and how our own effort compared.
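To put a number on that comparison, here is a minimal sketch. The scores below are made up for illustration and stand in for the results vector built from the tweets; my_score and benchmark_score() are hypothetical names, not part of the code above.

```r
# Made-up scores standing in for the `results` vector harvested from Twitter
results <- c(2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6)
my_score <- 3  # assumed: the number of tries you needed today

# Share of tweeted scores that beat, match, or trail a given score
benchmark_score <- function(score, crowd) {
  c(better = mean(crowd < score),   # solved in fewer tries than you
    same   = mean(crowd == score),  # solved in the same number of tries
    worse  = mean(crowd > score))   # needed more tries than you
}

bench <- benchmark_score(my_score, results)
round(bench, 2)
```

The three shares always sum to 1, so a small "better" share means few tweeters solved the puzzle faster than you did.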

There’s more that can be done with the data, since the locations of correct-letter guesses are also revealed in the downloaded tweets. Parsing them should be possible…
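As a rough sketch of how that parsing might go, the function below turns the emoji grid in a tweet into a matrix of codes. It assumes the grid uses the standard green, yellow, black and white square emoji with one guess per line. parse_grid() and the example tweet are invented for illustration and have not been run against real downloads.

```r
# Unicode code points of the squares used in shared Wordle grids
squares <- c(green = 0x1F7E9, yellow = 0x1F7E8, black = 0x2B1B, white = 0x2B1C)

# Return one row per guess and one column per letter position:
# 2 = green (right letter, right place), 1 = yellow, 0 = black or white
parse_grid <- function(text) {
  rows <- strsplit(text, "\n", fixed = TRUE)[[1]]
  guesses <- lapply(rows, function(row) {
    cp <- utf8ToInt(row)
    cp[cp %in% squares]  # keep only the square emoji on this line
  })
  guesses <- guesses[lengths(guesses) == 5]  # a guess has exactly 5 squares
  if (length(guesses) == 0) return(matrix(integer(0), ncol = 5))
  grid <- do.call(rbind, guesses)
  score <- matrix(0L, nrow(grid), 5)
  score[grid == squares[["yellow"]]] <- 1L
  score[grid == squares[["green"]]] <- 2L
  score
}

# A made-up tweet in the shared-result layout
tweet <- paste0("Wordle 200 3/6\n\n",
                "\u2B1B\U0001F7E8\u2B1B\u2B1B\u2B1B\n",
                "\U0001F7E9\U0001F7E9\u2B1B\U0001F7E8\u2B1B\n",
                "\U0001F7E9\U0001F7E9\U0001F7E9\U0001F7E9\U0001F7E9")
grid <- parse_grid(tweet)
colSums(grid == 2)  # how many guesses had each position correct
```

Applying this over wordle_200$text with lapply() would give one matrix per tweet, which could then be summarised by guess number or by letter position.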

Caveats

OK, so people are more likely to post their good scores than their bad ones, and there is a wealth of other confounders. Nonetheless, I couldn’t resist giving this a quick try.
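One of those confounders is visible in the code itself: failed attempts are shared as X/6, and the clean-up step discards them. A minimal sketch of an alternative extraction that keeps them is below. extract_score() is a hypothetical helper and the example texts are invented. Matching on a pattern rather than a fixed character position is a design choice of this sketch, not what the code above does.

```r
# Pull the score ("1" to "6", or "X" for a failed attempt) from tweet text.
# Tweets that do not start with a result for this puzzle give NA.
extract_score <- function(text, puzzle = 200) {
  pattern <- sprintf("^Wordle %d ([1-6X])/6", puzzle)
  m <- regmatches(text, regexec(pattern, text))
  vapply(m, function(x) if (length(x) == 2) x[2] else NA_character_,
         character(1), USE.NAMES = FALSE)
}

# Invented tweet texts for illustration
texts <- c("Wordle 200 3/6\n\n(grid here)",
           "Wordle 200 X/6",
           "Wordle 199 4/6",
           "I love wordle")
scores <- extract_score(texts)

# Tabulate with failures as their own category
table(factor(scores, levels = c(1:6, "X")))
```

Counting the X category shows how many failures people were willing to post, although it says nothing about those who stayed quiet.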

—

The post title comes from “Wordless Song” by Gorky’s Zygotic Mynci from their “Barafundle” album released in 1997. Hey, it features the letters WORDLE.
