Survivor Confessionals Data: Dataset showcase for {survivoR}

[This article was first published on R Archives - Dan Oehm | Gradient Descending, and kindly contributed to R-bloggers].

Confessionals loosely represent a player’s screen time where they talk strategy and replay events. It is an imperfect measure but can indicate success in the game. It’s often used to show balance or imbalance in the editing.

This is a high-level summary of confessionals, a showcase of the dataset, and an analysis of the edit for key demographics. All code is available on GitHub, and the package data is linked as well.

TL;DR

In summary:

Full confessional tables and list of castaways

For a detailed view and a full list of confessional tables, follow the links below. They are regularly maintained throughout seasons that are in progress and updated when new data is received.

Castaway confessionals: Demographic analysis

Over the 42 seasons of US Survivor, there have been 626 castaways which include 103 returning players. Of those that have played more than once

This excludes the time Russell Hantz and Sandra played on Survivor Australia.

In total there have been

There are more confessionals for men than women in absolute terms. It is interesting that men and women of colour have approximately the same number of confessionals overall, although there are 26 fewer men. This is likely because, historically, women were targeted earlier in the game than men and had less opportunity to pick up confessionals.

Total confessionals and edit percentage

Boston Rob has the most confessionals of anyone who has played the game, at 246. That is not surprising given he has played 5 times. This excludes his appearance in season 39, Island of the Idols: he wasn’t participating in the game, so it is not appropriate to include those confessionals in the count, contrary to what you may find elsewhere.

Edit percentage methodology

The edit percentage is the percentage of confessionals the castaway received above or below the amount they would have received under an equal distribution. This factors in the length of time the castaway spent in the game. For example, finalists are expected to receive more confessionals than castaways booted in earlier episodes, simply because they have more chances to receive one. It also factors in the number of castaways and the length of the episode, e.g. double episodes. A value of +50% means they received 50% more confessionals than expected (over-edit).

At the other end, castaways that were booted early, e.g. before episode 5, have only 1-4 episodes in which to receive a confessional. First boots often get a more favourable edit in the first episode to give them some screen time. In these cases, first boots can appear grossly over-edited when in reality they just haven’t had the chance to have a normal edit.

For example, Zach was the first boot from S42 and received 7 confessionals. The average for the episode is 3.4 per person meaning his edit percentage would be +105%, the most over-edited castaway from S42.
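The calculation described above can be sketched in a few lines. This is only one plausible reading of the methodology, not the package's actual code: it assumes the expected count is the sum of each episode's equal share (total confessionals divided by castaways still in the game), and the function names and the three-episode example are made up for illustration.

```python
def expected_confessionals(episodes):
    """Sum the per-episode equal share over the episodes a castaway was in.

    `episodes` is a list of (total_confessionals, n_castaways) pairs, one per
    episode the castaway was still in the game. A double episode simply has a
    larger total, so it contributes a larger share. (Assumed structure.)
    """
    return sum(total / n for total, n in episodes)


def edit_percentage(observed, expected):
    """Percentage above (+) or below (-) the equal-distribution expectation."""
    return (observed / expected - 1) * 100


# Zach (S42): 7 confessionals against an episode average of 3.4 per person.
zach = edit_percentage(7, 3.4)

# Hypothetical three-episode castaway; totals and cast sizes are invented.
demo_expected = expected_confessionals([(54, 18), (51, 17), (48, 16)])
demo = edit_percentage(12, demo_expected)

print(f"Zach: {zach:+.1f}%  demo: {demo:+.1f}%")
```

With the rounded average of 3.4, Zach's figure lands within a point of the +105% quoted above; the small gap is presumably down to the average being rounded for display.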

This is simply a small-data issue. So that early boots don’t dominate the list and mask the more interesting cases, I have model-adjusted them to be more in line with expectation. A few early boots still rank highly, e.g. Jacob Derwin from S36 at +131%, but according to the data he received 15 confessionals, a genuinely high amount for his time in the game. The first episode was a double episode, but this is factored into the calculation.
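The article does not say which model is used for the adjustment, so the following is not the author's method. It is a simple pseudo-count shrinkage, swapped in only to illustrate how an adjustment can pull small-sample edits toward expectation while leaving long-running castaways largely alone. The function name and `prior_strength` value are hypothetical.

```python
def shrunk_edit_percentage(observed, expected, prior_strength=10.0):
    """Edit percentage with the observed/expected ratio pulled toward 1.

    Adding the same pseudo-count to both counts acts like `prior_strength`
    extra confessionals received exactly at the expected rate.
    `prior_strength` is a made-up tuning value, not taken from the article.
    """
    ratio = (observed + prior_strength) / (expected + prior_strength)
    return (ratio - 1) * 100


# Same raw ratio, very different amounts of data behind it.
raw = (7 / 3.4 - 1) * 100                     # unadjusted edit percentage
early = shrunk_edit_percentage(7, 3.4)        # first boot, one episode
late = shrunk_edit_percentage(70, 34.0)       # hypothetical long-running castaway

print(f"raw: {raw:+.1f}%  early boot: {early:+.1f}%  late: {late:+.1f}%")
```

Both castaways have the same raw ratio, but the first boot is shrunk much further because there is far less data behind the number.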

Top 40 edits

The top 40 castaways ranked by total number of confessionals, and the top 40 most favourable edits across the seasons.

The full list of castaways can be found here.

Difference in demographics

While it may appear there is a difference between genders, is it a measurable difference?

Assuming independence between castaways, we estimate the mean edit percentage for each demographic with a Bayesian regression model fitted using {brms}. The results suggest there is a definite difference between genders but no measurable difference for race.

Results:

The bands show the full posterior distribution of the mean. There is little overlap with the 0% line, suggesting there is a genuine effect.
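To make the "overlap with the 0% line" reading concrete, here is a minimal sketch of a posterior for a group mean and a check of whether its 95% interval excludes zero. It is not the {brms} model used in the analysis: it swaps in a normal-normal conjugate update with the sampling standard deviation treated as known, and both the prior and the data values are invented for illustration.

```python
from statistics import NormalDist, mean, stdev


def posterior_mean(values, prior_mean=0.0, prior_sd=50.0):
    """Normal-normal conjugate posterior for a group mean.

    Treats the sample standard deviation as the known sampling sd, which is
    a simplification; a full model would estimate it as well.
    The prior (centred on 0% with sd 50) is an assumption.
    """
    n = len(values)
    sd = stdev(values)
    prior_prec = 1 / prior_sd**2          # precision of the prior
    data_prec = n / sd**2                 # precision contributed by the data
    post_var = 1 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * mean(values))
    return NormalDist(post_mean, post_var**0.5)


# Made-up edit percentages for one hypothetical demographic group.
group = [-12.0, -30.0, 5.0, -18.0, -25.0, -8.0, -15.0, -22.0]

post = posterior_mean(group)
lower, upper = post.inv_cdf(0.025), post.inv_cdf(0.975)
excludes_zero = upper < 0 or lower > 0

print(f"95% interval: ({lower:.1f}, {upper:.1f}), excludes 0: {excludes_zero}")
```

When the interval sits entirely on one side of 0%, as it does for this invented group, the data are hard to reconcile with an average edit at expectation.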

There could be a few reasons for women, on average, getting an under-edit, and I don’t believe it’s definitively due to production choosing to edit men over women. It could be due to data collection, or possibly because historically women have been more likely to be voted out of the game early. While the index accounts for this to a degree, there is still less chance for them to receive a more favourable edit.

Closing thoughts: The issues with confessionals

We have seen that there appears to be a genuine difference between edits for men and women. However, it needs to be said that confessionals are the worst possible measure for quantifying screen time.

They are inherently subjective and prone to clerical error. Sure, there are guidelines for counting, but ultimately it’s a human following the logic: there are situations where it isn’t clear, and humans make errors.

A confessional that lasts 2 seconds has the same intrinsic value as a confessional that lasts 30. A confessional where a castaway is replaying the events of finding a hidden immunity idol could count for 3 or 4 if there’s a 10-second gap but is effectively a single confessional. An edit where it switches from one player to another may count for a few but is essentially just breaking up 1. Confessionals from voice-overs could be misinterpreted. It’s not exactly a standardised measure.

The best thing we can do is consolidate and average across multiple sources and have contributors of the package independently count confessionals. Even so, we should be cautious about making grand conclusions about the editing choices of production, unless there is strong evidence.

With that in mind, there are some genuinely interesting trends and stats to be found in the counts, but they should be interpreted with care.

A more appropriate measure would be the total minutes of screen time, but that’s far more challenging to pull together. So, we’ll have to make do with confessionals for now.

Putting this together

All charts are combined into a final infographic.

All code used for this analysis can be found on Github as well as dark versions of the charts.

Comments welcomed!
