Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
A couple of weeks ago I stumbled across a feature in R that I had never heard of before: the functions save() and load(), and the R file type .rda.
The .rda files allow a user to save their R data structures such as vectors, matrices, and data frames. The file is automatically compressed, with user options for additional compression. Let’s take a look.
First, we will grab one of the built-in R datasets. We can view these by calling data(). Let’s use the “Orange” dataset.
# get the Orange data
Orange
   Tree  age circumference
1     1  118            30
2     1  484            58
3     1  664            87
4     1 1004           115
5     1 1231           120
6     1 1372           142
7     1 1582           145
8     2  118            33
9     2  484            69
10    2  664           111
11    2 1004           156
12    2 1231           172
13    2 1372           203
14    2 1582           203
15    3  118            30
16    3  484            51
17    3  664            75
18    3 1004           108
19    3 1231           115
20    3 1372           139
21    3 1582           140
22    4  118            32
23    4  484            62
24    4  664           112
25    4 1004           167
26    4 1231           179
27    4 1372           209
28    4 1582           214
29    5  118            30
30    5  484            49
31    5  664            81
32    5 1004           125
33    5 1231           142
34    5 1372           174
35    5 1582           177
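If you want to browse the built-in datasets from code rather than in the viewer that data() opens, here is a small sketch. It assumes the standard "datasets" package that ships with R; the variable name items is just for illustration.

```r
# List the datasets that ship with R's "datasets" package.
# data() returns an object whose $results matrix has "Item" and "Title" columns.
items <- data(package = "datasets")$results[, c("Item", "Title")]
head(items)

# Take a quick look at the structure of the Orange dataset
str(Orange)
```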
Next, let’s save each column individually as vectors.
# save the Orange data as vectors
count <- Orange$Tree
age <- Orange$age
circumference <- Orange$circumference
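One thing worth checking before saving: the columns do not all have the same type. The sketch below inspects the three vectors with base R functions. Note that Tree is stored as a factor, so count holds tree labels rather than numeric counts.

```r
# Extract the three columns of the built-in Orange dataset
count <- Orange$Tree
age <- Orange$age
circumference <- Orange$circumference

# Check what kind of object each vector is
class(count)           # Tree is a factor (tree labels), not a number
class(age)
class(circumference)

# Each vector has one value per row of Orange
length(count) == nrow(Orange)
```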
Now if we look at our variables in the RStudio environment, we can see count, age, and circumference saved there.
Next, let’s set our R working directory, so the .rda file will save in the correct location. First we’ll use getwd() to find our current working directory, then we’ll adjust it (if needed) using setwd(). I set my working directory to a folder on the D drive.
#get and set working directory
getwd()
[1] "D:/Users"
setwd("D:/r-temp")
getwd()
[1] "D:/r-temp"
Finally, let’s use the save() command to save our 3 vectors to an .rda file. The “file” argument gives the name of the new .rda file, which is written to the working directory.
#save to rda file
save(count, age, circumference, file = "mydata.rda")
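If you would rather not change the working directory, the file argument also accepts a full path. Here is a minimal sketch of that variation. It uses tempdir() purely as a stand-in folder, and rda_path is a name I made up for the example, so substitute your own location.

```r
# Recreate the three vectors so this example runs on its own
count <- Orange$Tree
age <- Orange$age
circumference <- Orange$circumference

# Build a full path; tempdir() is only a placeholder for your own folder
rda_path <- file.path(tempdir(), "mydata.rda")

# Save to that path without touching the working directory
save(count, age, circumference, file = rda_path)

# Confirm the file was written
file.exists(rda_path)
```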
Next we will remove our R environment variables using the command rm().
#remove variables
rm(age, circumference, count)
Now we can see that we no longer have saved variables in our R workspace.
Now, we can check that our .rda file (mydata.rda) does in fact store our data by using the load() command.
Note: If we had not properly set our working directory, then we would have needed to provide a full path to the .rda file. For example, “C:/Users/Documents/R files/mydata.rda” rather than just “mydata.rda”.
#load the rda file
load(file = "mydata.rda")
Great, now we can see that our variables are back in the R environment for use once more.
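One caveat: load() restores objects under their original names and will overwrite anything in your workspace that shares those names. A hedged sketch of one way around that is to load into a separate environment and capture the names that load() returns. The file name mydata_env.rda and the variable names restored and loaded_names are made up for this example, and tempdir() again stands in for a real folder.

```r
# Save two vectors to a temporary file so the example is self-contained
age <- Orange$age
circumference <- Orange$circumference
rda_path <- file.path(tempdir(), "mydata_env.rda")  # hypothetical file name
save(age, circumference, file = rda_path)

# Load into a fresh environment instead of the global workspace
restored <- new.env()
loaded_names <- load(rda_path, envir = restored)

# load() returns the names of the objects it restored
print(loaded_names)

# The restored copy matches the original data
identical(restored$age, Orange$age)
```

Loading this way lets you inspect what the file contains before deciding whether to copy anything into your workspace.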
Saving and loading data in R can be very useful when you’re working with large datasets that you want to clear from memory but keep for later. It can also be useful for long, complex R workflows and scripts. You can control the compression of the file using the ‘compress’ and ‘compression_level’ arguments of save().
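To get a feel for the compression arguments, here is a rough sketch that saves the same vector three ways and compares file sizes. The repeated vector x is artificial data I built just to make the difference visible, and the exact sizes will vary by system, so treat it as an illustration rather than a benchmark.

```r
# Build a larger, highly repetitive vector so compression has something to do
x <- rep(Orange$circumference, times = 10000)

# Temporary files for each variant
f_none <- tempfile(fileext = ".rda")
f_gzip <- tempfile(fileext = ".rda")
f_xz   <- tempfile(fileext = ".rda")

# Same data, three compression settings
save(x, file = f_none, compress = FALSE)
save(x, file = f_gzip, compress = "gzip", compression_level = 6)
save(x, file = f_xz,   compress = "xz",   compression_level = 9)

# Compare the resulting file sizes in bytes
sizes <- c(none = file.size(f_none),
           gzip = file.size(f_gzip),
           xz   = file.size(f_xz))
print(sizes)
```

Stronger compression usually costs more time when saving, so the right setting depends on whether disk space or speed matters more to you.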
That’s all for now!