PowerQuery Puzzle solved with R

[This article was first published on Numbers around us - Medium, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# 137–138

Puzzles:

PQ_137: content file
PQ_138: content file

PQ_137

Our goal today is to find out some data properties in table, but unfortunatelly they are weirdly nested in cells. It doesn’t look scary because we already know that many structured are nested like JSONs or XMLs. So today we have to unfold bowels of one table and join it with the second one to get proper data. Let’s do it.

Load libraries and data

library(tidyverse)
library(readxl)

T1 = read_excel("PQ_Challenge_137.xlsx", range = "A1:B5")
T2 = read_excel("PQ_Challenge_137.xlsx", range = "A9:B21")

test = read_excel("PQ_Challenge_137.xlsx", range = "E1:H9")

Data transformation

T1_1 = T1 %>%
  separate_rows(Company, sep = ";|,") %>%
  mutate(Company = str_remove_all(Company, "[:space:]")) %>%
  separate(Company, into = c("ID","Company"),sep = ":") %>%
  mutate(ID = as.numeric(ID)) 

result = T1_1 %>%
  left_join(T2, by = "Company") %>%
  arrange(Group, Company)

Validation

identical(result, test)
#> [1] TRUE

PQ_138

We have something like tape of letters with corresponding numbers. I don’t know what is the sense and purpose of it, but I know what we need to achieve. We are gonna take letters and cutting it by 10 make them columns. Whats weirder after transposition we need letters to be by letters and number by numbers. Some cyphers are really interesting.
Lets tranform this tables as our host wanted.

Load libraries and data

library(tidyverse)
library(readxl)

input = read_excel("PQ_Challenge_138.xlsx", range = "A1:F9")
test  = read_excel("PQ_Challenge_138.xlsx", range = "H1:K10")

Data transformation

result <- input %>%
  group_by(group_id = (row_number() - 1) %/% 2) %>%
  group_map(~ .x) %>%
  set_names(seq_along(.))

a1 = result %>%
  map(., ~ as_tibble(t(.))) %>%
  bind_rows() %>%
  drop_na() %>%
  add_row(V1 = NA_character_, V2 = NA_character_) 

a2_L = matrix(a1$V1, ncol=2, byrow = TRUE)
a2_D = matrix(a1$V2, ncol=2, byrow = TRUE)

a2 = bind_cols(a2_L, a2_D) %>%
  as_tibble() %>%
  rename(Group1 = ...1, Group2 = ...2, Value1 = ...3, Value2 = ...4) %>%
  mutate(Value1 = as.numeric(Value1),
         Value2 = as.numeric(Value2))

Validation

identical(a2, test)  
#> [1] TRUE

We’ve seen some nice puzzles considering time changes and it shows little bit out of the box thinking. Feel free to ask, like and share.
Lets be in touch for next article about the functions.


PowerQuery Puzzle solved with R was originally published in Numbers around us on Medium, where people are continuing the conversation by highlighting and responding to this story.

To leave a comment for the author, please follow the link and comment on their blog: Numbers around us - Medium.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)