UK Energy company acquisitions – a great use case for Sankey plots
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Like many others in the ‘United’ Kingdom, I’ve found my energy supplier has gone bust, due to ongoing price hikes, and I’ve been moved over to another one, whom ordinarily I wouldn’t have signed up with.
Fantastic.
How many folk have been affected by this sort of thing since the energy market prices went haywire in 2021?
Here’s what I’m using:
library(readxl) library(dplyr) library(tidyr) library(networkD3) library(htmlwidgets)
Data duly pinched from the internet and bunged into excel for no reason other than to preserve the dates, rather than face the horror of trying to convert them all later (NB – lubridate and anytime make this easy, I’m just being lazy.)
I have since converted it to csv (with data.table’s ‘fread’, of course) and it can currently be found here if you want to try replicating this
Here’s how the data looks:
head(data,10) ## # A tibble: 10 x 4 ## Date `Failed_supplier` Customers`Acquired_b~ ## <dttm> <chr> <dbl> <chr> ## 1 2022-01-01 00:00:00 Together Energy Retail Ltd. 176000 British Gas ## 2 2021-12-01 00:00:00 Zog Energy 11700 EDF ## 3 2021-11-01 00:00:00 Entice Energy Supply Ltd 5400 Scottish Power ## 4 2021-11-01 00:00:00 Orbit Energy 65000 Scottish Power ## 5 2021-11-01 00:00:00 Bulb 1700000 *In special admini~ ## 6 2021-11-01 00:00:00 Neon Reef Ltd. 30000 British Gas ## 7 2021-11-01 00:00:00 Social Energy Supply Ltd. 5500 British Gas ## 8 2021-11-01 00:00:00 CNG Energy Ltd. 41000 Pozitive Energy ## 9 2021-11-01 00:00:00 Omni Energy Ltd. 6000 Utilita ## 10 2021-11-01 00:00:00 MA Energy Ltd. 300 SmartestEnergy Bus~
I don’t like the column headers, and I want to restrict my final plot to just plotting by year rather than individual date, so I create a ‘year_acquired’ column:
data2 <- data %>% rename(failed = `Failed_supplier`, acquired = `Acquired_by`) %>% mutate(year_acquired = lubridate::year(Date)) %>% relocate(year_acquired, .before = 'Date') %>% select(-Date)
I want a sankey plot, so I need to get my data into shape , and while I’m at it, I’ll filter it for 2021 and onwards:
links <- data2 %>% filter(year_acquired >= 2021) %>% mutate(row = row_number()) %>% gather('column', 'source', -row) %>% mutate(column = match(column, names(data2))) %>% group_by(row) %>% arrange(column) %>% mutate(target = lead(source)) %>% ungroup() %>% filter(!is.na(target))
Network plots are all about links and nodes. So here’s some more jiggery-pokery with links and nodes. It’s all dplyr, and as that’s so easy to understand, I won’t delve into this too much
links <- links %>% mutate(source = paste0(source, '_', column)) %>% mutate(target = paste0(target, '_', column + 1)) %>% select(source, target) nodes <- data.frame(name = unique(c(links$source, links$target))) links$source <- match(links$source, nodes$name) - 1 links$target <- match(links$target, nodes$name) - 1 links$value <- 1 nodes$name <- sub('_[0-9]+$', '', nodes$name)
Wohoo. Time for a Sankey plot. No idea how to get an interactive plot on here though, unfortunately, so a boring old static image will have to do.
Sorry about that.
But not as sorry as I will be, when I get my winter gas bill.
sankeyNetwork(Links = links, Nodes = nodes, Source = 'source', Target = 'target', Value = 'value', NodeID = 'name')
# assign plot to 'p', then save #saveNetwork(p, file= "sankeyEnergy.html", selfcontained = TRUE)
Looks like some big players are doing well in the acquisition stakes, as the smaller (cheaper, with good customer service) suppliers get squeezed out.
Probably a metaphor for something, who knows?
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.