
R Read and Write CSV Files

[This article was first published on R feed, and kindly contributed to R-bloggers.]

A CSV (Comma-Separated Values) file is a plain text file that uses commas to separate values.

R has built-in functions that make it easy to read and write CSV files.


Sample CSV File

To demonstrate how we read CSV files in R, let's suppose we have a CSV file named airtravel.csv with the following data:

Month,  1958,   1959,   1960
JAN,    340,    360,    417
FEB,    318,    342,    391
MAR,    362,    406,    419
APR,    348,    396,    461
MAY,    363,    420,    472
JUN,    435,    472,    535
JUL,    491,    548,    622
AUG,    505,    559,    606
SEP,    404,    463,    508
OCT,    359,    407,    461
NOV,    310,    362,    390
DEC,    337,    405,    432

The CSV file above contains sample data on monthly air travel, in thousands of passengers, for 1958-1960.

Now, let's try to read data from this CSV File using R's built-in functions.
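If we don't have the file yet, one way to create it is sketched below. This is only an illustration: it writes the sample data shown above (without the extra padding) to airtravel.csv in the current working directory using writeLines(), and it overwrites any existing file with that name. The variable name csv_lines is our own choice.

```r
# The sample data, one string per line of the CSV file
csv_lines <- c(
  "Month,1958,1959,1960",
  "JAN,340,360,417",
  "FEB,318,342,391",
  "MAR,362,406,419",
  "APR,348,396,461",
  "MAY,363,420,472",
  "JUN,435,472,535",
  "JUL,491,548,622",
  "AUG,505,559,606",
  "SEP,404,463,508",
  "OCT,359,407,461",
  "NOV,310,362,390",
  "DEC,337,405,432"
)

# Write the lines to airtravel.csv in the current working directory
# (this overwrites an existing file with the same name)
writeLines(csv_lines, "airtravel.csv")

# Check that the file now exists
print(file.exists("airtravel.csv"))
```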


Read a CSV File in R

In R, we use the read.csv() function to read a CSV file available in our current directory. For example,

# read airtravel.csv file from our current directory
read_data <- read.csv("airtravel.csv")

# display the resulting data frame
print(read_data)

Output

   Month X1958 X1959 X1960
1    JAN   340   360   417
2    FEB   318   342   391
3    MAR   362   406   419
4    APR   348   396   461
5    MAY   363   420   472
6    JUN   435   472   535
7    JUL   491   548   622
8    AUG   505   559   606
9    SEP   404   463   508
10   OCT   359   407   461
11   NOV   310   362   390
12   DEC   337   405   432

In the above example, we have read the airtravel.csv file that is available in our current directory. Notice the code,

read_data <- read.csv("airtravel.csv")

Here, read.csv() reads the CSV file airtravel.csv and creates a data frame, which is stored in the read_data variable.

Notice that the year columns appear as X1958, X1959, and X1960. Column headers that start with a digit are not valid R variable names, so read.csv() prefixes them with X by default.

Finally, the data frame is displayed using print().

Note: If the file is in some other location, we have to specify the path along with the file name as: read.csv("D:/folder1/airtravel.csv").
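The column names deserve a closer look, because the headers in airtravel.csv start with digits. The sketch below compares the default behaviour of read.csv() with check.names = FALSE, which keeps the headers as written in the file. To keep it self-contained, it reads a shortened copy of the sample data through the text argument instead of a file; the variable names csv_text, default_names, and raw_names are our own.

```r
# A shortened copy of the sample data, supplied inline via the text argument
csv_text <- "Month,1958,1959,1960
JAN,340,360,417
FEB,318,342,391"

# Default behaviour: headers are converted to syntactically valid names
default_names <- read.csv(text = csv_text)
print(names(default_names))

# check.names = FALSE keeps the headers exactly as they appear in the file
raw_names <- read.csv(text = csv_text, check.names = FALSE)
print(names(raw_names))

# Names that start with a digit need backticks when used with $
print(raw_names$`1958`)
```

With the default settings we would write read_data$X1958; with check.names = FALSE we would write read_data$`1958` instead.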


Number of Rows and Columns of CSV File in R

We use the ncol() and nrow() functions to get the total number of columns and rows present in the CSV file in R. For example,

# read airtravel.csv file from our directory
read_data <- read.csv("airtravel.csv")

# print total number of columns
cat("Total Columns:", ncol(read_data), "\n")

# print total number of rows
cat("Total Rows:", nrow(read_data), "\n")

Output

Total Columns: 4
Total Rows: 12 

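In the above example, we have used the ncol() and nrow() functions to find the total number of columns and rows in the airtravel.csv file.

Here, ncol(read_data) returns the number of columns (4) and nrow(read_data) returns the number of rows (12) of the read_data data frame.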
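Both counts can also be obtained in a single call with dim(), which returns the number of rows followed by the number of columns. The sketch below is a minimal illustration; it reads the sample data from an inline string through the text argument so that it runs without the file, and the names csv_text and shape are our own.

```r
# The sample data, supplied inline so the example runs on its own
csv_text <- "Month,1958,1959,1960
JAN,340,360,417
FEB,318,342,391
MAR,362,406,419
APR,348,396,461
MAY,363,420,472
JUN,435,472,535
JUL,491,548,622
AUG,505,559,606
SEP,404,463,508
OCT,359,407,461
NOV,310,362,390
DEC,337,405,432"

read_data <- read.csv(text = csv_text)

# dim() returns a vector: number of rows first, then number of columns
shape <- dim(read_data)

cat("Total Rows:", shape[1], "\n")
cat("Total Columns:", shape[2], "\n")
```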


Using min() and max() With CSV Files

In R, we can also find the minimum and maximum values in a certain column of a CSV file using the min() and max() functions. For example,

# read airtravel.csv file from our directory
read_data <- read.csv("airtravel.csv")

# return minimum value of the 1960 column (read in as X1960)
min_data <- min(read_data$X1960)
print(min_data)

# return maximum value of the 1958 column (read in as X1958)
max_data <- max(read_data$X1958)
print(max_data)

Output

[1] 390
[1] 505

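Here, we have used the min() and max() functions to find the minimum value of the 1960 column and the maximum value of the 1958 column of the airtravel.csv file, respectively. The columns are accessed as X1960 and X1958 because read.csv() prefixes column names that start with a digit with X.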
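A few related base R functions are handy alongside min() and max(). The sketch below, which reads the sample data from an inline string so that it runs without the file, uses range() to get both values at once, which.max() to find the row holding the maximum, and sapply() to compute the maximum of every year column. The variable names are our own.

```r
# The sample data, supplied inline so the example runs on its own
csv_text <- "Month,1958,1959,1960
JAN,340,360,417
FEB,318,342,391
MAR,362,406,419
APR,348,396,461
MAY,363,420,472
JUN,435,472,535
JUL,491,548,622
AUG,505,559,606
SEP,404,463,508
OCT,359,407,461
NOV,310,362,390
DEC,337,405,432"

read_data <- read.csv(text = csv_text)

# range() returns the minimum and the maximum in one vector
year_range <- range(read_data$X1960)
print(year_range)

# which.max() gives the row index of the maximum; use it to look up the month
peak_month <- as.character(read_data$Month[which.max(read_data$X1960)])
print(peak_month)

# maximum of every column except the first (Month)
yearly_max <- sapply(read_data[-1], max)
print(yearly_max)
```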


Subset of a CSV File in R

In R, we use the subset() function to return all the rows from a CSV file that satisfy a specified condition. For example,

# read airtravel.csv file from our directory
read_data <- read.csv("airtravel.csv")

# return subset of csv where the number of air
# travelers in 1958 is greater than 400
sub_data <- subset(read_data, X1958 > 400)

print(sub_data)

Output

  Month X1958 X1959 X1960
6   JUN   435   472   535
7   JUL   491   548   622
8   AUG   505   559   606
9   SEP   404   463   508

In the above example, we have specified a condition inside the subset() function to extract data from a CSV file.

subset(read_data, X1958 > 400)

Here, subset() creates a subset of the airtravel.csv data in which the X1958 column (the 1958 column of the file) is greater than 400, and we store the result in the sub_data data frame. The condition must use the column name X1958; a bare 1958 would be treated as the number 1958, which is always greater than 400, so every row would be returned.

Since the 1958 column has values greater than 400 only in the 6th, 7th, 8th, and 9th rows, only these rows are displayed.
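Conditions can also be combined, and subset() accepts a select argument to keep only some columns. The sketch below is one possible illustration: it reads the sample data from an inline string so that it runs without the file, keeps the months with more than 400 (thousand) passengers in 1958 and more than 600 in 1960, and then repeats the same selection with bracket indexing. The names sub_data and same_data are our own.

```r
# The sample data, supplied inline so the example runs on its own
csv_text <- "Month,1958,1959,1960
JAN,340,360,417
FEB,318,342,391
MAR,362,406,419
APR,348,396,461
MAY,363,420,472
JUN,435,472,535
JUL,491,548,622
AUG,505,559,606
SEP,404,463,508
OCT,359,407,461
NOV,310,362,390
DEC,337,405,432"

read_data <- read.csv(text = csv_text)

# combine two conditions with & and keep only three of the columns
sub_data <- subset(read_data,
                   X1958 > 400 & X1960 > 600,
                   select = c(Month, X1958, X1960))
print(sub_data)

# the same rows and columns selected with bracket indexing
keep_rows <- read_data$X1958 > 400 & read_data$X1960 > 600
same_data <- read_data[keep_rows, c("Month", "X1958", "X1960")]
print(same_data)
```

Only JUL and AUG satisfy both conditions, so both approaches return two rows.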


Write Into CSV File in R

In R, we use the write.csv() function to write into a CSV file. We pass the data in the form of a data frame. For example,

# Create a data frame
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE))

# write dataframe1 to the file1.csv file
write.csv(dataframe1, "file1.csv")

In the above example, we have used the write.csv() function to export a data frame named dataframe1 to a CSV file. Notice the arguments passed inside write.csv(),

write.csv(dataframe1, "file1.csv")

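Here, dataframe1 is the data frame we want to export and "file1.csv" is the name of the file to write to.

Finally, the file1.csv file would look like this in our directory:

"","Name","Age","Vote"
"1","Juan",22,TRUE
"2","Alcaraz",15,FALSE
"3","Simantha",19,TRUE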

If we pass quote = FALSE to write.csv() as:

write.csv(dataframe1, "file1.csv",
  quote = FALSE
)

Our file1.csv would look like this:

,Name,Age,Vote
1,Juan,22,TRUE
2,Alcaraz,15,FALSE
3,Simantha,19,TRUE

All the values that were wrapped in double quotes " " are now written without them.
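By default, write.csv() also writes the row names of the data frame as an extra first column. If we don't want that column, we can pass row.names = FALSE. The sketch below shows this and reads the file back with read.csv() to check the result. It writes to a temporary file created with tempfile() rather than to file1.csv, and the names out_file and round_trip are our own.

```r
# Create a data frame
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE)
)

# write to a temporary file so nothing in the working directory is overwritten
out_file <- tempfile(fileext = ".csv")

# row.names = FALSE leaves out the column of row names
write.csv(dataframe1, out_file, row.names = FALSE)

# show the raw lines of the file
print(readLines(out_file))

# read the file back into a data frame
round_trip <- read.csv(out_file)
print(round_trip)
```

The header line of the file is now "Name","Age","Vote", and the data frame read back has the same three columns as dataframe1.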
