PowerQuery Puzzle solved with R
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
# 137–138
Puzzles:
PQ_137: content file
PQ_138: content file
PQ_137
Our goal today is to find out some data properties in table, but unfortunatelly they are weirdly nested in cells. It doesn’t look scary because we already know that many structured are nested like JSONs or XMLs. So today we have to unfold bowels of one table and join it with the second one to get proper data. Let’s do it.
Load libraries and data
library(tidyverse) library(readxl) T1 = read_excel("PQ_Challenge_137.xlsx", range = "A1:B5") T2 = read_excel("PQ_Challenge_137.xlsx", range = "A9:B21") test = read_excel("PQ_Challenge_137.xlsx", range = "E1:H9")
Data transformation
T1_1 = T1 %>% separate_rows(Company, sep = ";|,") %>% mutate(Company = str_remove_all(Company, "[:space:]")) %>% separate(Company, into = c("ID","Company"),sep = ":") %>% mutate(ID = as.numeric(ID)) result = T1_1 %>% left_join(T2, by = "Company") %>% arrange(Group, Company)
Validation
identical(result, test) #> [1] TRUE
PQ_138
We have something like tape of letters with corresponding numbers. I don’t know what is the sense and purpose of it, but I know what we need to achieve. We are gonna take letters and cutting it by 10 make them columns. Whats weirder after transposition we need letters to be by letters and number by numbers. Some cyphers are really interesting.
Lets tranform this tables as our host wanted.
Load libraries and data
library(tidyverse) library(readxl) input = read_excel("PQ_Challenge_138.xlsx", range = "A1:F9") test = read_excel("PQ_Challenge_138.xlsx", range = "H1:K10")
Data transformation
result <- input %>% group_by(group_id = (row_number() - 1) %/% 2) %>% group_map(~ .x) %>% set_names(seq_along(.)) a1 = result %>% map(., ~ as_tibble(t(.))) %>% bind_rows() %>% drop_na() %>% add_row(V1 = NA_character_, V2 = NA_character_) a2_L = matrix(a1$V1, ncol=2, byrow = TRUE) a2_D = matrix(a1$V2, ncol=2, byrow = TRUE) a2 = bind_cols(a2_L, a2_D) %>% as_tibble() %>% rename(Group1 = ...1, Group2 = ...2, Value1 = ...3, Value2 = ...4) %>% mutate(Value1 = as.numeric(Value1), Value2 = as.numeric(Value2))
Validation
identical(a2, test) #> [1] TRUE
We’ve seen some nice puzzles considering time changes and it shows little bit out of the box thinking. Feel free to ask, like and share.
Lets be in touch for next article about the functions.
PowerQuery Puzzle solved with R was originally published in Numbers around us on Medium, where people are continuing the conversation by highlighting and responding to this story.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.