R Solution for Excel Puzzles
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Puzzles no. 474–478
Puzzles
Author: ExcelBI
All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my Github. Enjoy.
Puzzle #478
Today we get some numbers and our task was to detect which one of them are wavy. What does it mean? Digits in those numbers are going up and down, one after another. But how to do it nice and with easy readable code. I choose to use very interesting technique. I did two things. First I checked differences between each pair of consecutive digits, and then because I wouldn’t like to write condition for each possible result, I used function sign(), which gives me one of only 3 values -1 for negatives, 0 for equals, and 1 for positive difference. So all what I have to check if every difference is either +2 or -2 from previous. Check it.
Loading libraries and data
library(tidyverse) library(readxl) input = read_excel("Excel/474 Wavy Numbers.xlsx", range = "A1:A10") test = read_excel("Excel/474 Wavy Numbers.xlsx", range = "B1:B6")
Transformation
is_wavy <- function(number) { digits <- str_split(as.character(number), "")[[1]] %>% as.numeric() differences <- diff(digits) signs <- sign(differences) if (length(signs) < 2) { return(FALSE) } all(abs(diff(signs)) == 2) } result = input %>% mutate(wavy = map_lgl(Numbers, is_wavy)) %>% filter(wavy) %>% select(`Answer Expected` = Numbers)
Validation
identical(result, test) # [1] TRUE
Puzzle #475
Today we can feel like official guest of some ceremony, because we need to stand in line and cut ribbon with veeeery long word in specific, given places. Is it easy, yes. Is it tricky, oh yes. Look to find out.
Loading libraries and data
library(tidyverse) library(readxl) input = read_excel("Excel/475 Split by Positions.xlsx", range = "A2:B12") test = read_excel("Excel/475 Split by Positions.xlsx", range = "C2:H12")
Transformation
split_string_by_pos <- function(string, positions_str) { positions <- str_split(positions_str, "\\s*,\\s*") %>% unlist() %>% as.numeric() starts <- c(1, positions) ends <- c(positions - 1, nchar(string)) map2(starts, ends, ~ substr(string, .x, .y)) } result = input %>% mutate(split = map2(Names, Position, split_string_by_pos)) %>% unnest_wider(split, names_sep = "_") %>% select(Text1 = split_1, Text2 = split_2, Text3 = split_3, Text4 = split_4, Text5 = split_5, Text6 = split_6)
Validation
identical(result, test) # [1] TRUE
Puzzle #476
Today’s is pretty decent data manipulation task. We have stores with sales, but also departments that we need to assign with sales proportionally. Nice, short and simple job. Look closer.
Loading libraries and data
library(tidyverse) library(readxl) input1 = read_excel("Excel/476 Assigning Sales.xlsx", range = "A2:B5") input2 = read_excel("Excel/476 Assigning Sales.xlsx", range = "D2:E11") test = read_excel("Excel/476 Assigning Sales.xlsx", range = "G2:I11")
Transformation
result = input1 %>% left_join(input2, by = "Store") %>% mutate(n = n(), .by = Store) %>% mutate(Sales = Sales / n) %>% select(Store, Branch, Sales)
Validation
identical(result, test) # [1] TRUE
Puzzle #477
So, some challenges are really easy, when you read it, but worse when you start to do something. We have list of people with some values, and we need them to stand in rows, but every consecutive row has to had one person more. O, it was tricky.
Loadingzlibraries and data
library(readxl) library(tidyverse) input = read_excel("Excel/477 Records Split and Alignment.xlsx", range = "A2:B15") test = read_excel("Excel/477 Records Split and Alignment.xlsx", range = "D2:M6") names(test) = gsub("\\.+\\d+", "", names(test))
Transformation
nr = nrow(input) seq = 1 i = 1 while(sum(seq) <= nr){ seq = c(seq, i) i = i + 1 } seq = seq[-1] slice_dataframe <- function(df, seq) { indices <- map2(c(0, cumsum(seq)[-length(seq)]), cumsum(seq), ~(.x + 1):.y) map(indices, ~df[.x, ]) } indexed_input = slice_dataframe(input, seq) pad_and_bind_dataframes <- function(dfs) { max_length <- max(map_int(dfs, nrow)) pad_df <- function(df, length) { if (nrow(df) < length) { additional_rows <- tibble(x = rep(NA, length - nrow(df))) df <- bind_rows(df, additional_rows) } df } padded_dfs <- map(dfs, pad_df, length = max_length) bound_df <- bind_cols(padded_dfs) %>% select(-starts_with("x")) bound_df <- bound_df %>% filter_all(any_vars(!is.na(.))) bound_df } result = pad_and_bind_dataframes(indexed_input) names(result) = gsub("\\.+\\d+", "", names(result))
Validation
all.equal(result, test) # [1] TRUE
Puzzle #478
And again we have two reports with different stores and we need to merge them. If you know dplyr, this task is a piece of cake. Look up for solution.
Loading data and libraries
library(tidyverse) library(readxl) path = "Excel/478 Merge Tables.xlsx" input1 = read_excel(path, range = "A2:C9") input2 = read_excel(path, range = "E2:H10") test = read_excel(path, range = "J2:M14")
Transformation
result = input1 %>% full_join(input2, by = c("Org", "Year")) %>% arrange(Org, Year) %>% mutate(Sales = map2_dbl(Sales.x, Sales.y, ~ sum(c(.x, .y), na.rm = TRUE))) %>% select(Org, Year, Prime, Sales)
Validation
identical(result, test) #> [1] TRUE
Feel free to comment, share and contact me with advices, questions and your ideas how to improve anything. Contact me on Linkedin if you wish as well.
PS. Couple weeks ago, I started uploading on Github not only R, but also in Python. Come and check it.
R Solution for Excel Puzzles was originally published in Numbers around us on Medium, where people are continuing the conversation by highlighting and responding to this story.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.