
Omega Results and the 2021 Olympic Trials

[This article was first published on Welcome to Swimming + Data Science on Swimming + Data Science, and kindly contributed to R-bloggers].
    Omega Timing is the official timekeeper for the Olympic Games, including the US Olympic Trials. They don’t time very many other events, which is why SwimmeR hasn’t supported Omega-style results. Until now, that is. Omega results can be read into R with SwimmeR versions >= 0.10.2, presently available as a development version from GitHub. We’ll read some Omega results in, and then run a quick set of tests on athlete reaction times.

    devtools::install_github("gpilgrim2670/SwimmeR", build_vignettes = TRUE)

    The 2020 US Trials are being held in 2021, in two parts. Wave I was held June 4th to 7th, and Wave II is currently being held June 13th to 20th. Omega has published the entire Wave I results, but to avoid any potential broken links down the road I’m also hosting a copy on GitHub, which is the file read in below.

    Let’s get set up and take a look.

    library(SwimmeR)
    library(dplyr)
    library(stringr)
    library(ggplot2)
    library(flextable)
    
    flextable_style <- function(x) {
      x %>%
        flextable() %>%
        bold(part = "header") %>% # bolds header
        bg(bg = "#D3D3D3", part = "header") %>%  # puts gray background behind the header row
        autofit()
    }



    US Trials Wave I – Getting Omega Results

    The process of reading in Omega results with SwimmeR is exactly the same as reading in Hy-Tek or S.A.M.M.S. results. Here’s the entire set of results from Wave I.

    file <-
      "https://github.com/gpilgrim2670/Pilgrim_Data/raw/master/Omega/Omega_OT_Wave1_FullResults_2021.pdf"
    
    Wave_I <- file %>%
      read_results() %>%
      swim_parse(splits = TRUE)

    Here are the top three finishers in the Women’s 100 Fly Final. The usual information is present: Place, Name, Team, Finals_Time (Omega results don’t include prelims times…), and various Splits columns. Also present is a Reaction_Time column, which will be the focus of a little demonstration later on.

    Wave_I %>%
      filter(Event == "6 JUN 2021 - 7:37 PM Women's 100m Butterfly Final") %>%
      head(3) %>%
      select(where( ~ !all(is.na(.)))) %>% # remove splits columns that aren't relevant to this race (Split_150 etc.)
      select(-DQ,
             -Exhibition,
             "Reaction" = "Reaction_Time",
             "Finals" = "Finals_Time") %>%
      flextable_style()

    Place  Lane  Name             Team  Reaction  Finals   Event                                              Split_50  Split_100
    1      6     LU Sydney        PLS   0.64      1:00.38  6 JUN 2021 – 7:37 PM Women’s 100m Butterfly Final  28.54     31.84
    2      4     SMITHWICK Heidi  JDST  0.68      1:00.56  6 JUN 2021 – 7:37 PM Women’s 100m Butterfly Final  28.08     32.48
    3      5     VANNOTE Ellie    UNC   0.69      1:00.60  6 JUN 2021 – 7:37 PM Women’s 100m Butterfly Final  28.33     32.27
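
    As a quick sanity check on the parsed columns, the two splits for each swimmer should add up to the finals time. Here’s a minimal sketch using the three rows above typed in by hand, so it runs without downloading anything. The to_seconds() helper is a made-up name for this example, not part of SwimmeR (which has its own sec_format() for this kind of conversion).

```r
# Hypothetical helper (not from SwimmeR): convert "m:ss.hh" or "ss.hh" strings to seconds
to_seconds <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE), function(parts) {
    parts <- as.numeric(parts)
    if (length(parts) == 2) parts[1] * 60 + parts[2] else parts[1]
  }, numeric(1))
}

# The three 100 Fly rows shown above, entered by hand
fly <- data.frame(
  Name      = c("LU Sydney", "SMITHWICK Heidi", "VANNOTE Ellie"),
  Finals    = c("1:00.38", "1:00.56", "1:00.60"),
  Split_50  = c("28.54", "28.08", "28.33"),
  Split_100 = c("31.84", "32.48", "32.27"),
  stringsAsFactors = FALSE
)

fly$Finals_sec <- to_seconds(fly$Finals)
fly$Split_sum  <- to_seconds(fly$Split_50) + to_seconds(fly$Split_100)

# Compare with a tolerance, since these are floating point sums
fly$Consistent <- abs(fly$Finals_sec - fly$Split_sum) < 0.005

fly[, c("Name", "Finals_sec", "Split_sum", "Consistent")]
```

    All three rows come back consistent, which is a decent hint that the splits and finals times were parsed into the right columns.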



    US Trials Wave II

    Wave II of the US Trials is where the actual Olympic Team is being selected. It’s still underway as of this writing, so there isn’t yet a single document containing all the results. Individual result documents are being posted as each event is completed, however. Here’s the Women’s 100 Breaststroke final, featuring Lilly King.

    file <-
      "https://github.com/gpilgrim2670/Pilgrim_Data/raw/master/Omega/Omega_OT_Wave2_W100Br_Finals_2021.pdf"
    
    W100Br <- file %>%
      read_results() %>%
      swim_parse(splits = TRUE)
    
    W100Br %>%
      select(-DQ,
             -Exhibition,
             "Reaction" = "Reaction_Time", 
             "Finals" = "Finals_Time") %>%
      flextable_style()

    Place  Lane  Name            Team   Reaction  Finals   Event                               Split_50  Split_100
    1      4     KING Lilly      ISC    0.65      1:04.79  PM Women’s 100m Breaststroke Final  30.34     34.45
    2      3     JACOBY Lydia    STSC   0.63      1:05.28  PM Women’s 100m Breaststroke Final  30.94     34.34
    3      5     LAZOR Annie     MVN    0.66      1:05.60  PM Women’s 100m Breaststroke Final  30.82     34.78
    4      6     GALAT Bethany   AGS    0.53      1:05.75  PM Women’s 100m Breaststroke Final  30.69     35.06
    5      0     DOBLER Kaitlyn  TDPS   0.65      1:06.29  PM Women’s 100m Breaststroke Final  30.83     35.46
    6      2     SUMRALL Micah   GAME   0.71      1:06.84  PM Women’s 100m Breaststroke Final  31.83     35.01
    7      7     HANNIS Molly    TNAQ   0.70      1:07.26  PM Women’s 100m Breaststroke Final  31.29     35.97
    8      1     ESCOBEDO Emily  COND   0.68      1:07.31  PM Women’s 100m Breaststroke Final  31.91     35.40
    9      8     TUCKER Miranda  UN-MI  0.68      1:07.44  PM Women’s 100m Breaststroke Final  31.73     35.71



    Australian Trials

    Also underway are the Australian Trials. Like the US Trials, they can be read into R using SwimmeR versions >= 0.10.2. For the very curious, these are Hy-Tek results, not Omega. We at Swimming + Data Science have scraped entire Hy-Tek live results pages before, and the same general principles can be applied to collect all Australian Trials results. Here’s just the Men’s 100 Backstroke Final.

    file <-
      "http://liveresults.swimming.org.au/SAL/2021TRIALS/210612F015.htm"
    
    M100Bk <- file %>%
      read_results() %>%
      swim_parse(splits = TRUE)
    
    M100Bk %>%
      select(-DQ,
             -Exhibition,
             -Points,
             "Prelims" = "Prelims_Time",
             "Finals" = "Finals_Time") %>%
      flextable_style()

    Place  Name             Age  Team   Prelims  Finals  Event                         Split_50  Split_100
    1      LARKIN, MITCH    27   STPET  53.04    53.40   Male 100 LC Metre Backstroke  25.86     27.54
    2      COOPER, ISAAC    17   RACKL  53.79    53.49   Male 100 LC Metre Backstroke  25.94     27.55
    3      HOLLARD, TRISTA  24   STHPT  54.56    54.00   Male 100 LC Metre Backstroke  26.73     27.27
    4      WOODWARD, BRADL  22   MING   54.47    54.13   Male 100 LC Metre Backstroke  26.19     27.94
    5      YANG, WILLIAM    22   LNSC   54.75    54.56   Male 100 LC Metre Backstroke  25.98     28.58
    6      MAHONEY, TRAVIS  30   MARI   55.03    55.02   Male 100 LC Metre Backstroke  26.78     28.24
    7      VAN KOOL, KAI    19   GUSC   54.68    55.13   Male 100 LC Metre Backstroke  26.38     28.75
    8      HARTWELL, TY     20   CHAND  55.00    55.23   Male 100 LC Metre Backstroke  26.66     28.57
    9      TYSOE, CAMERON   24   GIND   55.05    54.84   Male 100 LC Metre Backstroke  26.46     28.38
    10     MILLS, PETER     24   MBAY   55.04    55.30   Male 100 LC Metre Backstroke  26.74     28.56
    11     SWINBURN, STUAR  19   UNSW   55.80    55.65   Male 100 LC Metre Backstroke  27.00     28.65
    12     BAYLISS, JAMES   17   NCOLL  56.06    55.91   Male 100 LC Metre Backstroke  26.68     29.23
    13     BOOTH, SHAYE     20   MING   56.33    55.99   Male 100 LC Metre Backstroke  27.21     28.78
    14     DAFF, CONOR      18   MBAY   56.25    56.08   Male 100 LC Metre Backstroke  26.93     29.15
    15     FOOTE, NATHAN    20   STAND  56.19    56.33   Male 100 LC Metre Backstroke  27.59     28.74
    16     CORNWELL, JYE    24   YERPK  56.03    56.43   Male 100 LC Metre Backstroke  27.17     29.26
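
    Collecting a whole meet from per-event documents, like the Wave II and Australian results, comes down to parsing each source and stacking whatever comes back. Below is a rough sketch of that pattern in base R. collect_results() and fake_parse() are hypothetical names invented for this example; in real use parse_fun would presumably wrap read_results() and swim_parse(), and dplyr::bind_rows() could replace the manual column padding.

```r
# Hypothetical helper: parse many sources and stack the results.
# parse_fun is any function that takes one source and returns a data frame.
collect_results <- function(sources, parse_fun) {
  parsed <- lapply(sources, function(src) {
    # skip sources that fail to parse rather than stopping the whole run
    tryCatch(parse_fun(src), error = function(e) NULL)
  })
  parsed <- Filter(Negate(is.null), parsed)
  if (length(parsed) == 0) return(data.frame())

  # events of different lengths have different split columns, so pad with NA
  all_cols <- unique(unlist(lapply(parsed, names)))
  padded <- lapply(parsed, function(df) {
    for (col in setdiff(all_cols, names(df))) {
      df[[col]] <- rep(NA, nrow(df))
    }
    df[all_cols]
  })

  do.call(rbind, padded)
}

# Stand-in parser so the example runs offline (made-up data, not real results)
fake_parse <- function(src) {
  if (src == "bad") stop("cannot parse")
  if (src == "a") {
    data.frame(Name = "X", Split_50 = "28.5", stringsAsFactors = FALSE)
  } else {
    data.frame(Name = "Y", Split_50 = "30.1", Split_100 = "33.0",
               stringsAsFactors = FALSE)
  }
}

out <- collect_results(c("a", "bad", "b"), fake_parse)
out
```

    The unparseable source is dropped, and the shorter event gets an NA in the split column it doesn’t have.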



    US Trials Wave I Reaction Time Demo

    Let’s see if there’s a difference between the reaction times of sprinters, mid-distance swimmers and distance swimmers in the US Trials Wave I results. We’ll define anyone who swims the 800 or 1500m as a distance swimmer, anyone else who swims a 50 or 100m event as a sprinter, and everyone else as mid-distance. The distance check comes first, so an athlete entered in both the 100 and the 800 counts as a distance swimmer.

    For this analysis we’ll need the Lane, Name, Team, Reaction_Time and Event columns. The other columns won’t be needed, so I’ll remove them.

    We can pull distances out of the event names. Note however from the 100 Fly results above that the event names contain more information than we’re perhaps used to seeing. Let’s clean that up.

    Wave_I_Clean <- Wave_I %>%
      select(Lane, Name, Team, Reaction_Time, Event) %>% # select only columns of interest
      mutate(Event = str_remove(Event, ".*(?=(Men)|(Women))")) %>% # remove everything in event names before Men or Women
      mutate(Reaction_Time = as.numeric(Reaction_Time)) # change type of Reaction_Time column
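
    To see what that regular expression does on its own, here’s a small standalone sketch. It uses base R’s sub() with perl = TRUE in place of str_remove(), purely so it runs without loading stringr. The second event string is made up for illustration, and the distance extraction is an extra step that isn’t needed for the analysis in this post.

```r
events <- c(
  "6 JUN 2021 - 7:37 PM Women's 100m Butterfly Final",
  "PM Men's 1500m Freestyle Final" # made-up event string for illustration
)

# Same pattern as above: drop everything before "Men" or "Women".
# The (?=...) lookahead checks for the word without consuming it,
# and perl = TRUE is what enables lookaheads in base R.
cleaned <- sub(".*(?=(Men)|(Women))", "", events, perl = TRUE)
cleaned

# Pull out the run of digits that sits directly before "m"
distance <- as.numeric(sub("^[^0-9]*([0-9]+)m.*$", "\\1", cleaned))
distance
```

    The cleaned names come back as "Women's 100m Butterfly Final" and "Men's 1500m Freestyle Final", with distances 100 and 1500.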

    Now we can classify swimmers by type.

    Wave_I_Clean <- Wave_I_Clean %>%
      group_by(Name) %>% # determining type by athlete
      mutate(Type = case_when(
        # encode athlete types based on events swum; the first matching condition wins
        any(str_detect(Event, "(1500m)|(800m)"), na.rm = TRUE) ~ "Distance",
        any(str_detect(Event, "(100m)|(50m)"), na.rm = TRUE) ~ "Sprint",
        TRUE ~ "Mid"
      )) %>%
      ungroup() %>% # drop the grouping so later steps treat the data as a whole
      mutate(Type = factor(Type, levels = c("Sprint", "Mid", "Distance"))) # fix the level order for ggplot later
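
    The order of the conditions matters here, because an athlete can match more than one of them. Here’s a toy sketch of the same logic in base R, using three invented swimmers (A, B and C are not real athletes) and a hypothetical classify() helper, to show how a swimmer entered in both a 100 and an 800 gets labeled.

```r
# Invented entries: swimmer A swims both a sprint and a distance event
entries <- data.frame(
  Name  = c("A", "A", "B", "C", "C"),
  Event = c("Women's 100m Freestyle Final",
            "Women's 800m Freestyle Final",
            "Men's 50m Freestyle Final",
            "Men's 200m Butterfly Final",
            "Men's 400m Freestyle Final"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label one athlete from all of their events.
# Conditions are checked in order, so "Distance" takes priority over "Sprint".
classify <- function(athlete_events) {
  if (any(grepl("1500m|800m", athlete_events))) {
    "Distance"
  } else if (any(grepl("100m|50m", athlete_events))) {
    "Sprint"
  } else {
    "Mid"
  }
}

# One label per athlete, then mapped back onto every row for that athlete
types <- vapply(split(entries$Event, entries$Name), classify, character(1))
entries$Type <- factor(types[entries$Name],
                       levels = c("Sprint", "Mid", "Distance"))
entries
```

    Swimmer A ends up as "Distance" on both rows, B as "Sprint" and C as "Mid".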

    Let’s look at the distribution of reaction times by swimmer type.

    Wave_I_Clean %>%
      ggplot(aes(x = Type, y = Reaction_Time, fill = Type)) +
      geom_violin() +
      theme_bw() +
      labs(y = "Reaction Time (s)",
           title = "Reaction Times by Swimmer Type")

    There is a noticeable shift towards slower reaction times for distance swimmers compared to sprint and mid-distance, but is it significant? We can use an ANOVA test to find out. The test produces a p value, which we then compare against a chosen significance level.

    reaction_anova <- aov(Reaction_Time ~ Type, data = Wave_I_Clean) # calculate anova
    reaction_anova_summary <- summary(reaction_anova) # save summary anova object
    reaction_anova_summary # view anova results
    ##               Df Sum Sq Mean Sq F value Pr(>F)    
    ## Type           2  0.479 0.23930   74.65 <2e-16 ***
    ## Residuals   1270  4.071 0.00321                   
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    The p value is very low, at 2.2336931 × 10^-31. We can conclude that there are significant differences between the groups at a significance level of 0.001 or stricter. That means the likelihood of this level of difference between the three groups appearing as the result of random variation in populations that are actually identical is less than 0.1%. The ANOVA test doesn’t tell us which group(s) have the significant differences though. For that we can use a Tukey HSD test.

    reaction_Tukey <- TukeyHSD(reaction_anova) # calculate Tukey HSD
    reaction_Tukey # view results
    ##   Tukey multiple comparisons of means
    ##     95% family-wise confidence level
    ## 
    ## Fit: aov(formula = Reaction_Time ~ Type, data = Wave_I_Clean)
    ## 
    ## $Type
    ##                       diff        lwr        upr p adj
    ## Mid-Sprint      0.02606628 0.01762142 0.03451114     0
    ## Distance-Sprint 0.07474784 0.05854508 0.09095060     0
    ## Distance-Mid    0.04868156 0.03158292 0.06578019     0

    The adjusted p values are all approximately zero. We can see what they actually are by pulling them out of the reaction_Tukey object.

    reaction_Tukey$Type[,"p adj"] # view actual adjusted p values
    ##      Mid-Sprint Distance-Sprint    Distance-Mid 
    ##    1.634137e-12    0.000000e+00    1.058689e-10

    All are very low, so every pairwise difference is significant at the 0.001 level. Sprinters really do have faster reaction times than mid-distance swimmers (by about 0.03 s on average), who are in turn faster than distance swimmers (by about 0.05 s).
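
    The printed ANOVA summary only shows the p value as <2e-16, so quoting an exact figure means digging into the summary object. Here’s one way that might be done, sketched on simulated data so it runs on its own. The group means and spread below are invented for illustration and are NOT the Trials results.

```r
set.seed(42) # simulated data, not the Trials results

sim <- data.frame(
  Type = factor(rep(c("Sprint", "Mid", "Distance"), each = 100),
                levels = c("Sprint", "Mid", "Distance")),
  Reaction_Time = c(rnorm(100, mean = 0.66, sd = 0.05),
                    rnorm(100, mean = 0.69, sd = 0.05),
                    rnorm(100, mean = 0.74, sd = 0.05))
)

fit <- aov(Reaction_Time ~ Type, data = sim)

# summary() returns a list; its first element holds the ANOVA table
anova_table <- summary(fit)[[1]]

# row 1 is the Type term, row 2 is Residuals (which has no p value)
p_value <- anova_table[["Pr(>F)"]][1]
p_value

# TukeyHSD() returns a list with one matrix per factor,
# with one row per pair of groups
tukey <- TukeyHSD(fit)$Type
tukey[, "p adj"]
```

    The same two extractions applied to reaction_anova and reaction_Tukey would give the exact values quoted in the text.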


    Reaction Times By Lane

    Just for giggles let’s also look by lane. When I was swimming there was always this rumor going around that swimmers in the outside lane nearest the starting device would have an advantage, because the light/sound from the device would reach them before it reached athletes further from the device. It never made much sense, since faster swimmers were deliberately seeded into inner lanes and they usually won. Nowadays each block is equipped with an LED light bar and a sounding device so everything should be equal (if it ever wasn’t).

    Wave_I_Clean %>%
      filter(Lane != "0") %>% 
      ggplot(aes(x = Lane, y = Reaction_Time, fill = Lane)) +
      geom_violin() +
      theme_bw() +
      labs(y = "Reaction Time (s)",
           title = "Reaction Times by Lane")


    That looks about even to me. Let’s see what the testing has to say.

    reaction_anova <- aov(Reaction_Time ~ Lane, data = Wave_I_Clean) # calculate anova
    reaction_anova_summary <- summary(reaction_anova) # save summary anova object
    reaction_anova_summary # view anova results
    ##               Df Sum Sq  Mean Sq F value Pr(>F)
    ## Lane           8  0.025 0.003144   0.878  0.534
    ## Residuals   1264  4.525 0.003580

    Here the p value is 0.5341483, which is larger than any significance level we’d care to use. There is no significant difference in reaction time by lane.
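
    For what it’s worth, the old rumor can be put into rough numbers. The sketch below is back-of-the-envelope arithmetic, not a measurement. It assumes a single sound source beside lane 1, 2.5 m wide lanes, and sound travelling at about 343 m/s in air, and all three of those are my assumptions rather than anything in the results.

```r
lane_width_m   <- 2.5 # assumed lane width, in meters
speed_of_sound <- 343 # assumed speed of sound in air, in m/s (roughly 20 C)
lanes <- 1:8

# assume one starting device positioned beside lane 1
distance_m <- (lanes - 1) * lane_width_m

# extra time for the start signal to reach each lane, in seconds
delay_s <- distance_m / speed_of_sound

data.frame(Lane = lanes, Delay_s = round(delay_s, 3))
```

    Under those assumptions the swimmer in lane 8 would hear the start about 0.05 s after the swimmer in lane 1, which is in the same range as the differences between swimmer types found earlier. So with a single speaker the rumor wasn’t physically absurd, but with a sounding device on every block the delay goes away, which fits the flat result above.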



    In Closing

    I hope you’re enjoying the various Olympic Trials meets, all the more so now that SwimmeR makes it easy to import their results into R. Join us next time here at Swimming + Data Science where we’ll take a look at something else swimming-centric.
