
Wingspan Data Analysis

[This article was first published on R Archives - Dan Oehm | Gradient Descending, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Wingspan is a great game, even though I’ve only played it a few times. The mechanics are great, there are lots of bird variations, and there are a bunch of different strategies to try. There are 170 birds, and I’ve probably only seen 30 of them. So, true to form, I’ve dabbled in a bit of data analysis to get a view of all the different types of cards in the game.

Open source wins again, since the {wingspan} R package exists. It contains the details of each bird in the core, European, Oceania, and swift start sets. I’ll only be using the core set for this analysis since that’s the only one I’m semi-familiar with.

What’s the most common food type?

There are five food types: invertebrate (let’s be honest, it’s grub, and I’ll choose 1 syllable over 4 any day), seed, fruit, fish, and rat (it’s rat). Grubs are definitely more common as a food cost, but how much more?

Seeds and grubs are 2.5-3x more common as a food cost than the other three food types when summed across all 170 cards. If you’re not looking for specific food types, choosing grubs and seeds from the bird feeder will give you more options. If you’ve played it a few times, this becomes obvious pretty quickly.
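A minimal sketch of how totals like these could be computed. It assumes one numeric column per food type, named as in the model formula further down; the rows are made-up stand-ins, not real cards.

```r
# Toy stand-in for the bird table: one row per card, one column per food type.
# The values are invented for illustration, not taken from the {wingspan} data.
toy <- data.frame(
  invertebrate = c(1, 2, 0, 1),
  seed         = c(1, 0, 2, 1),
  fruit        = c(0, 0, 1, 0),
  fish         = c(0, 1, 0, 0),
  rodent       = c(0, 0, 0, 1)
)

# Total number of each food token required across all cards
food_totals <- colSums(toy)

# Express each total relative to the rarest food type
food_ratio <- food_totals / min(food_totals)

sort(food_totals, decreasing = TRUE)
```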

What is the average egg capacity?

The average egg capacity is 2.85, although the distribution of egg capacity and the relationship with victory points is more interesting.

This is useful to know in terms of the odds of picking up valuable cards from the tray or deck. Of course, some of the lower value cards will have great activations, but at the end game, you’ll be looking for the big ones.
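As a rough sketch, the average, the distribution, and the link to victory points can all be pulled out with base R. The column names `egg_capacity` and `vp` follow the model code further down; the values here are invented.

```r
# Toy data (invented values): egg capacity and victory points per card
toy <- data.frame(
  egg_capacity = c(1, 2, 2, 3, 4, 5),
  vp           = c(7, 5, 4, 4, 3, 1)
)

# Average egg capacity across cards
mean_eggs <- mean(toy$egg_capacity)

# How many cards sit at each egg capacity
egg_dist <- table(toy$egg_capacity)

# Mean victory points at each egg capacity
vp_by_eggs <- tapply(toy$vp, toy$egg_capacity, mean)

mean_eggs
egg_dist
vp_by_eggs
```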

What is the habitat distribution?

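There are almost equal numbers of birds across the habitats: 83 birds each in the forest and grassland and 85 in the wetland. Birds that can live in more than one habitat are counted in each, which is why the counts add up to more than 170.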

The breakdown is mildly interesting:
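A minimal sketch of how a breakdown by habitat combination could be tabulated. It assumes one TRUE/FALSE flag per habitat (`forest`, `grassland`, `wetland`, as in the model code further down); the rows are invented.

```r
# Toy data (invented): TRUE/FALSE habitat flags, one row per card
toy <- data.frame(
  forest    = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  grassland = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  wetland   = c(FALSE, FALSE, FALSE, TRUE, TRUE)
)

# Birds per habitat; a bird living in two habitats is counted in both
habitat_totals <- colSums(toy)

# Label each card with the combination of habitats it can live in
habitats <- names(toy)
combo <- apply(toy, 1, function(flags) paste(habitats[flags], collapse = " + "))

# Number of cards per habitat combination
combo_counts <- table(combo)

habitat_totals
combo_counts
```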

What is the most common power?

Flocking cards (or tuck cards) are the most common power other than ‘Other’, which tends to include drawing more bonus cards or moving the bird to another habitat.

There are only 6 birds without powers, which are all high VP birds. I was surprised that cards with egg laying, card drawing, or food from the supply powers account for only 11% of the cards each.
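Shares like these are just a proportion table over the power category. A small sketch, with invented cards and with missing powers relabelled as "No power" in the same spirit as the data setup further down:

```r
# Toy data (invented): one power category per card, NA for cards with no power
power_category <- c("Flocking", "Flocking", "Egg-laying", "Card-drawing",
                    "Other", "Other", "Other", NA)

# Treat missing powers as their own category
power_category[is.na(power_category)] <- "No power"

# Share of cards in each category, as a percentage, largest first
power_share <- sort(100 * prop.table(table(power_category)), decreasing = TRUE)

round(power_share, 1)
```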

Predicting victory points

I expect victory points to correlate with egg capacity, food cost, activation power, and habitat. Fitting a model to predict the number of victory points allows us to see which cards have a good bang for buck.

Or, what I actually expect is that cards with fewer victory points than expected have strong activation powers to compensate. However, I am making an assumption here that there has been a lot of play testing and that the cards have been adjusted to be balanced.

Data setup

A couple of things to note regarding the data setup:

Fitting the model

I’ve fit a linear model (a Gaussian GLM, fitted here with lm) with victory points as the response and the food cost, egg capacity, habitat, and power category as predictors. I’ve removed the intercept from the model formula because it makes interpreting the coefficients easier.

library(wingspan)
library(tidyverse)

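# Core set only, with the food_cost_div filter applied
df <- birds |>
  rename(vp = victory_points) |>
  filter(
    set == "core",
    !food_cost_div
  ) |>
  # Birds without a power get their own category
  mutate(power_category = replace_na(power_category, "No power")) |>
  # Habitat flags as numerics so each gets a single coefficient
  mutate(across(c(forest, grassland, wetland), as.numeric))

# Linear model with the intercept removed (- 1)
mod <- lm(vp ~ egg_capacity + invertebrate + seed + fruit + fish + rodent + any_food +
  forest + grassland + wetland + power_category - 1, data = df)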

summary(mod)

Call:
lm(formula = vp ~ egg_capacity + invertebrate + seed + fruit + 
    fish + rodent + any_food + forest + grassland + wetland + 
    power_category - 1, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.04909 -0.54122 -0.03641  0.43533  2.15785 

Coefficients:
                                   Estimate Std. Error t value Pr(>|t|)    
egg_capacity                       -0.42837    0.06754  -6.342 4.15e-09 ***
invertebrate                        1.57877    0.14491  10.895  < 2e-16 ***
seed                                1.53325    0.13819  11.096  < 2e-16 ***
fruit                               1.89461    0.17910  10.578  < 2e-16 ***
fish                                1.94935    0.19816   9.837  < 2e-16 ***
rodent                              1.88318    0.18670  10.086  < 2e-16 ***
any_food                            1.26317    0.17042   7.412 1.92e-11 ***
forest                             -0.56666    0.18740  -3.024  0.00305 ** 
grassland                          -0.39201    0.17402  -2.253  0.02610 *  
wetland                             0.03936    0.18564   0.212  0.83243    
power_categoryNo power              5.94229    0.54650  10.873  < 2e-16 ***
power_categoryCaching Food          2.09541    0.51713   4.052 9.05e-05 ***
power_categoryEgg-laying            1.64914    0.41463   3.977  0.00012 ***
power_categoryCard-drawing          2.80260    0.40526   6.915 2.43e-10 ***
power_categoryFlocking              1.83560    0.38535   4.763 5.38e-06 ***
power_categoryFood from Supply      2.96775    0.40531   7.322 3.05e-11 ***
power_categoryHunting/Fishing       3.10179    0.42053   7.376 2.31e-11 ***
power_categoryFood from Birdfeeder  3.09215    0.41705   7.414 1.90e-11 ***
power_categoryOther                 2.15066    0.40827   5.268 6.16e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8583 on 120 degrees of freedom
Multiple R-squared:  0.973,	Adjusted R-squared:  0.9687 
F-statistic: 227.6 on 19 and 120 DF,  p-value: < 2.2e-16

This is pretty neat; almost all are significant predictors of victory points. The takeaways:

I also fit the power colour into the model, but it wasn’t a significant predictor. That surprised me since I would expect brown and pink powers to have fewer VPs than white. You can see where they place in the residual plot below.
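On the choice to drop the intercept: with `- 1` in the formula, the factor gets one coefficient per level instead of differences from a reference level. A minimal sketch on made-up data (the category labels and values are invented):

```r
# Toy data (invented): victory points for cards in three power categories
toy <- data.frame(
  vp    = c(2, 4, 5, 7, 8, 10),
  power = factor(c("A", "A", "B", "B", "C", "C"))
)

# With an intercept: level A is the baseline, B and C are differences from A
with_int <- coef(lm(vp ~ power, data = toy))

# Without an intercept: each level gets its own coefficient (here, its mean)
no_int <- coef(lm(vp ~ power - 1, data = toy))

with_int
no_int
```

In the full model the category coefficients are not simple means, because the other predictors are also in play, but each one can still be read as that category’s own baseline rather than as a contrast.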

Residual plot

By plotting the victory points against the residuals, we can see if the number of victory points is higher or lower than expected. Those above the line have more VPs than expected, and those below the line have fewer VPs than expected. I chose the grub to score birds with an ‘or’ condition in their food cost.
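A minimal sketch of a plot like this in base R, assuming the residual is defined as actual VPs minus model-estimated VPs. The data and the two-predictor model are invented stand-ins for the real thing.

```r
# Toy data (invented values), standing in for the core-set bird table
toy <- data.frame(
  vp           = c(1, 3, 4, 5, 6, 8),
  egg_capacity = c(5, 4, 3, 3, 2, 1),
  seed         = c(1, 1, 2, 1, 3, 3)
)

# A small linear model; the real one has many more predictors
fit <- lm(vp ~ egg_capacity + seed, data = toy)

# Residual = actual VPs minus model-estimated VPs
toy$res <- residuals(fit)

# Victory points on the x-axis, residuals on the y-axis, reference line at 0.
# Points above the line have more VPs than the model expects.
plot(toy$vp, toy$res, xlab = "Victory points", ylab = "Residual")
abline(h = 0, lty = 2)
```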

By inspection, the birds with weaker powers, or with activation powers that benefit all players, are typically above the line. They need more VPs to be worth playing. Those below the line typically have some pretty sweet powers given the cost.

“The Power 4” as they are colloquially known, are the:

All four are well below the line, which is some evidence to support my theory that birds with fewer VPs than expected have strong activation powers. The Common Raven is a bit higher, suggesting it’s the pick of the four. This is cool because it potentially allows you to identify other strong cards you’ve overlooked.

This isn’t always the case, though. For example, the bird with 5 VPs at the bottom of the column, below the Common Raven, is the Indigo Bunting. It costs a grub, a seed, and a fruit. Its power is to gain a grub or a fruit from the birdfeeder. Not as good as discarding 1 egg to gain 2 whatevers, or even gaining a single grub from the supply. In this case, I’d say it either needs to be cheaper or have another VP, or both. Probably not worth paying the cost in my opinion.

The bird at the bottom of the 4 VP column is the Brown Pelican. It costs 2 fish; when played, you get 3 fish from the supply. That’s it. In my opinion, it needs more VPs or better activation.

The Northern Bobwhite is a great card: 5 victory points, estimated VPs of 3.3 (good bang for the buck), an egg capacity of 6, and an activation power to lay an egg on the card. It’s a great card at any stage of the game.
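An estimate like this is just the sum of the relevant coefficients from the model summary. As a worked sketch, here is the arithmetic for a hypothetical card (not a real bird, and not the Northern Bobwhite), using the printed coefficients; every attribute not listed is zero and drops out.

```r
# Coefficients copied from the model summary above
coefs <- c(
  egg_capacity = -0.42837,
  invertebrate =  1.57877,
  seed         =  1.53325,
  forest       = -0.56666,
  flocking     =  1.83560   # power_categoryFlocking
)

# A hypothetical card (not a real bird): 2 eggs, costs 1 invertebrate + 1 seed,
# lives in the forest only, and has a flocking power
card <- c(egg_capacity = 2, invertebrate = 1, seed = 1, forest = 1, flocking = 1)

# Estimated VPs are the sum of coefficient * value over the card's attributes
est_vp <- sum(coefs * card[names(coefs)])

# Residual if the card were printed with 3 VPs (an assumed value)
res <- 3 - est_vp

est_vp
res
```

The hypothetical card comes out at about 3.5 estimated VPs, so with 3 printed VPs its residual would be about -0.5, which would put it below the line.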

This analysis doesn’t dictate the clear best and worst cards, but I have found it useful to determine whether a card is a great bang for your buck or a bit expensive.

Every bird in the charts above is also in the look-up table below. ‘Est. VPs’ is the model estimated VPs, and ‘res’ is the residual (VPs – Est. VPs). It’s a great look-up table to compare birds.
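A table with those two columns can be built straight from a fitted model. A minimal sketch with invented birds and a one-predictor model standing in for the real one (the column names `est_vp` and `res` are my own labels):

```r
# Toy data (invented names and values), standing in for the bird table
toy <- data.frame(
  name         = c("Bird A", "Bird B", "Bird C", "Bird D", "Bird E"),
  vp           = c(2, 3, 5, 5, 8),
  egg_capacity = c(5, 4, 4, 2, 1)
)

fit <- lm(vp ~ egg_capacity, data = toy)

# 'Est. VPs' is the fitted value; 'res' is VPs minus Est. VPs
lookup <- data.frame(
  name   = toy$name,
  vp     = toy$vp,
  est_vp = round(unname(fitted(fit)), 2),
  res    = round(unname(residuals(fit)), 2)
)

# Most negative residuals first: fewest VPs relative to the estimate
lookup <- lookup[order(lookup$res), ]
lookup
```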

Follow the link to view the table in a new window.

Final thoughts

There are a few interesting things that have come out of the analysis, particularly with the model. The predicted VPs and the residual plot are useful for critically assessing each card and whether it’s worth the cost. I’ve already referred to the table way more than I expected.

It would be amazing to have data on game stats, such as the final boards, which birds were played, what turn each bird was played, the final VPs, who won, etc. That would uncover some pretty cool stuff, I reckon. If you know of such a dataset, let me know!

Anyway, happy bird watching!

The post Wingspan Data Analysis appeared first on Dan Oehm | Gradient Descending.
