
Has your knowledge stopped updating?

[This article was first published on R – What You're Doing Is Rather Desperate, and kindly contributed to R-bloggers.]

Some years ago I read an article – I forget where – describing how our general knowledge often becomes frozen in time. Asked to name the tallest building in the world, you confidently proclaim “the Sears Tower!”, because for most of your childhood that was the case – never mind that the record was surpassed long ago and it isn’t even called the Sears Tower anymore. From memory, the example in the article was of a middle-aged speaker who constantly referred to a figure of 4 billion for the human population – again, because that’s what he learned in school and had never mentally updated.

Is this the case with programming too? Oh yes – as I learned today when performing the simplest of tasks: reading CSV files using R.

Here’s the scenario: given a directory containing CSV files with the same columns, read them into a single data frame with an additional column containing the file name.

We start with list.files() of course, something along the lines of:

csv_files <- list.files(path = "path/to/the/folder", pattern = "\\.csv$", full.names = TRUE)
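
A side note on the pattern argument: list.files() treats it as a regular expression, not a shell-style glob, so an unanchored pattern such as ".csv" can match more than intended. Here is a minimal sketch of the difference, using grepl() on some made-up file names (they are hypothetical, for illustration only):

```r
# Hypothetical file names, for illustration only
candidates <- c("sales.csv", "sales.csv.bak", "mycsv_notes.txt")

# Unanchored: "." matches any character and the match can occur anywhere
loose <- grepl(".csv", candidates)

# Escaped dot, anchored to the end of the file name
strict <- grepl("\\.csv$", candidates)

print(data.frame(candidates, loose, strict))
```

In a folder that really does contain only CSV files, both patterns return the same thing; the stricter one just guards against stray backups and similarly named files.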

My frozen, outdated knowledge tells me that the next steps are: (1) use lapply() to read the CSV files into a list of data frames, (2) use the vector of file names as names for the list and (3) use dplyr::bind_rows() to create a single data frame and add the column of file names, here named “path”.

library(dplyr)
library(readr)

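# (1) read each file into its own data frame
csv_data <- lapply(csv_files, read_csv)
# (2) name the list elements after their files
names(csv_data) <- csv_files
# (3) stack the data frames; .id turns the list names into a "path" column
csv_data <- bind_rows(csv_data, .id = "path")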

I’ve used readr::read_csv() for years. Only today did I learn that not only can it read multiple files given a vector of file names, but it can also add a column for those file names. All in one line.

csv_data <- read_csv(csv_files, id = "path")
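
If you want to convince yourself that the two approaches agree, here is a small self-contained sketch. It assumes readr 2.0.0 or later (to my knowledge, the release where reading a vector of files and the id argument arrived). The temporary folder, the file names first.csv and second.csv, and their contents are all made up for the demonstration.

```r
library(readr)
library(dplyr)

# Two small, made-up CSV files in a fresh temporary directory
tmp_dir <- tempfile("csv_demo")
dir.create(tmp_dir)
write_csv(data.frame(x = 1:2, y = c("a", "b")), file.path(tmp_dir, "first.csv"))
write_csv(data.frame(x = 3:4, y = c("c", "d")), file.path(tmp_dir, "second.csv"))

csv_files <- list.files(path = tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# The old way: read into a list, name it, bind the rows
old_way <- lapply(csv_files, read_csv, show_col_types = FALSE)
names(old_way) <- csv_files
old_way <- bind_rows(old_way, .id = "path")

# The new way: one call, with the file name stored in "path"
new_way <- read_csv(csv_files, id = "path", show_col_types = FALSE)

# Compare the parts we care about: row count, source file and data
nrow(old_way) == nrow(new_way)
identical(basename(old_way$path), basename(new_way$path))
identical(old_way$x, new_way$x)
```

The show_col_types = FALSE argument is only there to silence the column specification messages; it is not needed for the one-liner to work.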

Why did I not know this? I guess because I had a solution that worked, and I’d never bothered to go back and see if something better had been invented since I learned my solution.

How can we unlearn our frozen, outdated knowledge and update our skills? Right now my answer is “once in a while take the time to read the help page when you use a function, even if it’s one you use all the time, in case it’s been updated with something new and useful.”
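
Alongside reading the help page, a quick check from the console can show whether your installed version of a function has an argument you have only just heard about. This is a rough sketch, not a recommended workflow, using read_csv() and its id argument as the example:

```r
library(readr)

# Which version of the package is installed?
packageVersion("readr")

# Does the installed read_csv() have an argument named "id"?
has_id <- "id" %in% names(formals(read_csv))
has_id

# The package's release notes are where new arguments tend to be announced;
# uncomment to browse them
# news(package = "readr")
```

If has_id comes back FALSE, the installed readr most likely predates the feature and updating the package is the first step.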

Any better ideas?

