Making the Most of Mobility
I became aware of the Community Mobility Reports data courtesy of this tweet from Mike Schussler:
Next time someone tells you that people are misbehaving you can tell the the data says no they are listening doing as they are told. Data from our mobile phones shows we staying away from retail, restaurants, transit parks etc, We have spent far more time at home 1 pic.twitter.com/bwSQKMPa0l
— mike schussler (@mikeschussler) January 10, 2021
The data are available for download here. There’s a global CSV file (data for all regions) as well as individual CSV files (one for each region, packaged in a ZIP archive). There are also automated PDF reports for each region per day. For example, this is the report for South Africa on 28 March 2021.
Load the Data
I’m focusing on the data for South Africa (ZA) and using both years of available data.
COUNTRY <- "ZA"
YEARS <- c(2020, 2021)
Use purrr::map_df() to iterate over each year, load the corresponding CSV and concatenate the results into a single data frame. A few of the columns are empty, so apply janitor::remove_empty() to drop those.
library(tidyverse)
library(glue)
library(janitor)

# FOLDER_REGIONAL_CSV is assumed to be defined elsewhere as the folder
# holding the unzipped regional CSV files.
mobility <- map_df(
  YEARS,
  function(year) {
    filename <- glue("{year}_{COUNTRY}_Region_Mobility_Report.csv")
    filepath <- file.path(FOLDER_REGIONAL_CSV, filename)
    read_csv(filepath)
  }
) %>%
  # Remove any empty columns.
  remove_empty(which = "cols") %>%
  # Remove columns with just one value.
  remove_constant() %>%
  # Rename specific columns.
  rename(
    region = sub_region_1,
    region_iso = iso_3166_2_code
  ) %>%
  # Strip "_percent_change_from_baseline" from column names.
  rename_with(
    ~ str_replace(.x, "_percent_change_from_baseline$", "")
  )
How much data?
dim(mobility)
[1] 4080   10
There are 4080 records, each with 10 columns. The data span the period from 15 February 2020 to 28 March 2021.
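As a quick sanity check, the record count is consistent with one row per place per day. This is a minimal sketch in base R. It assumes (I have not verified this against the data documentation) that there are 10 distinct places and that each has exactly one row for every day in the period.

```r
# Start and end dates taken from the text above.
first_day <- as.Date("2020-02-15")
last_day <- as.Date("2021-03-28")

# Number of calendar days in the period, counting both end points.
n_days <- as.integer(last_day - first_day) + 1L

# Assumption: 10 places (nine provinces plus the country as a whole),
# with one row per place per day.
n_places <- 10L

n_days * n_places
```

That works out to 408 days and 4080 rows, which matches the first element of dim(mobility).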
What are the (revised) column names?
names(mobility)
 [1] "region"                "region_iso"            "place_id"
 [4] "date"                  "retail_and_recreation" "grocery_and_pharmacy"
 [7] "parks"                 "transit_stations"      "workplaces"
[10] "residential"
What are the unique place identifiers?
mobility %>% select(region, region_iso, place_id) %>% unique()
# A tibble: 10 x 3
   region        region_iso place_id
   <chr>         <chr>      <chr>
 1 <NA>          <NA>       ChIJURLu2YmmNBwRoOikHwxjXeg
 2 Eastern Cape  ZA-EC      ChIJu5znKjRWYh4RkqxyqdKUajo
 3 Free State    ZA-FS      ChIJGRTWM2HFjx4RRwqiTVWK9e0
 4 Gauteng       ZA-GT      ChIJn3cRVJUSlR4R4jhUy8fnnm0
 5 KwaZulu-Natal ZA-NL      ChIJVQ7iWQ4Q8R4Rjdnka6d4YYI
 6 Limpopo       ZA-LP      ChIJwTDNNhTJxh4RStzIZh49iWI
 7 Mpumalanga    ZA-MP      ChIJPSAvTvpg6h4RhGvk9A3foGQ
 8 North West    ZA-NW      ChIJ612A6EIKmB4R_5BkMf6qLUc
 9 Northern Cape ZA-NC      ChIJbUtwf_UhJBwRkEyPkNb4AAM
10 Western Cape  ZA-WC      ChIJ841peohdzB0Ri6I2IY95juk
So there’s data for each province as well as for the country as a whole. We’ll confine our attention to the country as a whole.
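One way to isolate the country-level records is to keep only the rows where the region is missing. The sketch below uses a small made-up data frame (mobility_toy, with invented values) as a stand-in for the real one, and the name national is my own choice.

```r
library(dplyr)

# Toy stand-in for the mobility data frame (values are made up).
mobility_toy <- tibble(
  region = c(NA, NA, "Gauteng", "Gauteng"),
  date = as.Date(c("2020-03-26", "2020-03-27", "2020-03-26", "2020-03-27")),
  workplaces = c(-5, -70, -4, -72)
)

# Country-level rows are the ones where no province is given.
national <- mobility_toy %>%
  filter(is.na(region)) %>%
  select(-region)

national
```

Applied to the real data, the same filter should leave one row per day for the country as a whole.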
Visualise the Data
We’ve got daily observations of various mobility metrics for each of the provinces as well as the country as a whole. Making sense of this is going to require pictures!
Work & Home
Below are two plots showing the percentage change in mobility (relative to baseline) for workplaces and residential areas. Superimposed solid vertical lines indicate the onset of each lockdown level, starting with Level 5 (L5) on 27 March 2020. On that date there was a precipitous drop in the number of people moving to their workplaces and a simultaneous increase in the number of people staying at home. Dashed vertical lines indicate public holidays, which also appear to have a significant effect on mobility.
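A plot along these lines can be sketched with ggplot2. This is not necessarily the exact code behind the figures. It runs on made-up values (national_toy), includes only the Level 5 date given above, and marks two public holidays (Freedom Day and Workers' Day) that happen to fall inside the toy window. All object names are my own.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy stand-in for the country-level data (values are made up).
dates <- seq(as.Date("2020-03-16"), as.Date("2020-05-10"), by = "day")
locked <- dates >= as.Date("2020-03-27")
national_toy <- tibble(
  date = dates,
  workplaces = ifelse(locked, -60, 0),
  residential = ifelse(locked, 25, 0)
)

# Onset of Level 5 (from the text). Other levels would be extra rows here.
lockdown <- tibble(date = as.Date("2020-03-27"), label = "L5")

# Public holidays inside the toy window.
holidays <- tibble(date = as.Date(c("2020-04-27", "2020-05-01")))

plot_mobility <- national_toy %>%
  # One row per date and metric, so that each metric gets its own panel.
  pivot_longer(c(workplaces, residential), names_to = "metric", values_to = "change") %>%
  ggplot(aes(x = date, y = change)) +
  geom_line() +
  # Solid lines for lockdown levels, dashed lines for public holidays.
  geom_vline(data = lockdown, aes(xintercept = date)) +
  geom_vline(data = holidays, aes(xintercept = date), linetype = "dashed") +
  geom_text(
    data = lockdown,
    aes(x = date, y = Inf, label = label),
    inherit.aes = FALSE, vjust = 1.5, hjust = -0.2
  ) +
  facet_wrap(~ metric, ncol = 1, scales = "free_y") +
  labs(x = NULL, y = "% change from baseline")

plot_mobility
```

Because the lockdown and holiday data frames have no metric column, the vertical lines are drawn in both panels.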
There’s a clear weekly variation in these data, indicating that, despite the lockdown, people’s behaviour is different on weekends and during the week.
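The weekly pattern can be quantified by comparing weekday and weekend averages. Here is a minimal sketch on made-up numbers (two weeks starting on Monday 6 April 2020). The column and object names are illustrative.

```r
library(dplyr)

# Toy stand-in: two weeks starting on a Monday (values are made up).
national_toy <- tibble(
  date = seq(as.Date("2020-04-06"), by = "day", length.out = 14),
  workplaces = rep(c(-70, -70, -70, -70, -70, -50, -40), times = 2)
)

weekly <- national_toy %>%
  # "%u" gives the ISO weekday number as text: "1" (Monday) to "7" (Sunday).
  mutate(
    day_type = if_else(format(date, "%u") %in% c("6", "7"), "weekend", "weekday")
  ) %>%
  group_by(day_type) %>%
  summarise(mean_workplaces = mean(workplaces))

weekly
```

Using the numeric weekday rather than weekdays() keeps the result independent of the system locale.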
Shopping & Recreation
What about shopping and recreation habits?
The data indicate that following the initial lockdown there was a substantial reduction in visits to supermarkets and pharmacies, but that this has largely recovered.
Including restaurants, cafés, shopping centres, libraries, cinemas and other recreational venues paints a different picture. The impact on the entertainment industry due to restrictions on the sale of alcohol has no doubt played a role in this.
Interestingly, it seems that public holidays do not have a major effect on people going shopping or visiting recreational venues.
Out & About
Public spaces, like beaches, parks and gardens, have also been impacted. There’s some interesting variation here which I don’t fully understand right now. For instance, why was there less use of public spaces following the transition to Level 1 in September 2020?
Public holidays cause major spikes in the use of public spaces, at least under lockdown levels 3, 2 and 1.
Finally, transport hubs, which include train and bus stations as well as airports, were practically empty following the initial lockdown. However, they gradually became busier during the course of 2020. Travel activity dropped again after Christmas 2020.
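A gradual trend like this is easier to see once the weekly cycle is averaged out. The sketch below applies a centred 7-day moving average to made-up values (transit_toy). It is one possible approach and was not used for the plots in this post.

```r
library(dplyr)

# Toy stand-in: a slow recovery plus a weekly cycle (values are made up).
transit_toy <- tibble(
  date = seq(as.Date("2020-04-06"), by = "day", length.out = 28),
  transit_stations = -80 + 0.5 * (0:27) + rep(c(0, 0, 0, 0, 0, 7, 7), times = 4)
)

smoothed <- transit_toy %>%
  mutate(
    # Centred 7-day moving average; the first and last three days are NA.
    # stats::filter() is named explicitly because dplyr masks filter().
    transit_7day = as.numeric(
      stats::filter(transit_stations, rep(1 / 7, 7), sides = 2)
    )
  )

smoothed
```

A window of exactly seven days is used so that every average covers one full week, which cancels the weekday versus weekend difference.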
This is a very rich data set with lots of opportunities for interesting analyses. Time permitting I’ll be back to look at it again.