UK Energy company acquisitions – a great use case for Sankey plots

[This article was first published on HighlandR, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Like many others in the ‘United’ Kingdom, I’ve found my energy supplier has gone bust, due to ongoing price hikes, and I’ve been moved over to another one, whom ordinarily I wouldn’t have signed up with.

Fantastic.

not really
Because I’m not always sure you get the sarcasm

How many folk have been affected by this sort of thing since the energy market prices went haywire in 2021?

Here’s what I’m using:

library(data.table)
library(dplyr)
library(tidyr)
library(networkD3)
library(htmlwidgets)

I found some data relating to the closures of energy companies in the UK and this has been saved here

Here’s how the data looks:

data <- data.table::fread('https://raw.githubusercontent.com/johnmackintosh/energy-closures/main/rawdata.csv')
head(data,10)

## # A tibble: 10 x 4
##    Date                `Failed_supplier`           Customers`Acquired_b~
##    <dttm>              <chr>                           <dbl> <chr>              
##  1 2022-01-01 00:00:00 Together Energy Retail Ltd.    176000 British Gas        
##  2 2021-12-01 00:00:00 Zog Energy                      11700 EDF                
##  3 2021-11-01 00:00:00 Entice Energy Supply Ltd         5400 Scottish Power     
##  4 2021-11-01 00:00:00 Orbit Energy                    65000 Scottish Power     
##  5 2021-11-01 00:00:00 Bulb                          1700000 *In special admini~
##  6 2021-11-01 00:00:00 Neon Reef Ltd.                  30000 British Gas        
##  7 2021-11-01 00:00:00 Social Energy Supply Ltd.        5500 British Gas        
##  8 2021-11-01 00:00:00 CNG Energy Ltd.                 41000 Pozitive Energy    
##  9 2021-11-01 00:00:00 Omni Energy Ltd.                 6000 Utilita            
## 10 2021-11-01 00:00:00 MA Energy Ltd.                    300 SmartestEnergy Bus~

I don’t like the column headers, and I want to restrict my final plot to just plotting by year rather than individual date, so I create a ‘year_acquired’ column:

data2 <- data %>% 
  rename(failed = `Failed_supplier`,
         acquired =  `Acquired_by`) %>% 
  mutate(year_acquired = lubridate::year(Date)) %>% 
  relocate(year_acquired, .before = 'Date') %>% 
    select(-Date)

I want a sankey plot, so I need to get my data into shape , and while I’m at it, I’ll filter it for 2021 and onwards:

links <-
  data2 %>%
  filter(year_acquired >= 2021) %>% 
  mutate(row = row_number()) %>%
  gather('column', 'source', -row) %>%
  mutate(column = match(column, names(data2))) %>%
  group_by(row) %>%
  arrange(column) %>%
  mutate(target = lead(source)) %>%
  ungroup() %>%
  filter(!is.na(target))

Network plots are all about links and nodes. So here’s some more jiggery-pokery with links and nodes. It’s all dplyr, and as that’s so easy to understand, I won’t delve into this too much

links <-
  links %>%
  mutate(source = paste0(source, '_', column)) %>%
  mutate(target = paste0(target, '_', column + 1)) %>%
  select(source, target)

nodes <- data.frame(name = unique(c(links$source, links$target)))

links$source <- match(links$source, nodes$name) - 1
links$target <- match(links$target, nodes$name) - 1
links$value <- 1

nodes$name <- sub('_[0-9]+$', '', nodes$name)

Time for the Sankey plot.

It’s interactive! No mean feat on Jekyll - I don’t want to tell you how much time this took to get working.

sankeyNetwork(Links = links,
              Nodes = nodes,
              Source = 'source',
              Target = 'target',
              Value = 'value',
              NodeID = 'name')

Looks like some big players are doing well in the acquisition stakes, as the smaller (cheaper, with good customer service) suppliers get squeezed out.

To leave a comment for the author, please follow the link and comment on their blog: HighlandR.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)