Import CSV Files into R Step-by-Step Guide

This article was first published on Methods – finnstats and kindly contributed to R-bloggers.

The contents of a CSV file are stored in a tabular layout of rows and columns. A delimiter, usually a comma, separates the column values in each row.
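
As a small illustration of the delimiter idea, here is a minimal sketch that parses two inline strings with base R, one comma-separated and one semicolon-separated. The column names and values are made up for the example and are not taken from data.csv.

```r
# Hypothetical inline data; the `text` argument parses a string
# instead of a file on disk.
comma_text <- "Product,Score\nA,1\nB,2"
semi_text  <- "Product;Score\nA;1\nB;2"

# read.csv assumes a comma delimiter by default.
comma_df <- read.csv(text = comma_text)

# read.table lets the delimiter be set explicitly through `sep`.
semi_df <- read.table(text = semi_text, sep = ";", header = TRUE)

# Both calls should produce the same 2 x 2 data frame.
print(comma_df)
print(all.equal(comma_df, semi_df))
```

Only the delimiter differs between the two inputs, so the parsed result should not depend on which one was used.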

CSV files can be imported into the working environment using either built-in functions or functions from external packages.

Assume we have a data.csv CSV file saved in the following location:

D:\RStudio\Binning\data.csv

This CSV file can be imported into R in one of three ways:

1. Use read.csv from base R (the slowest method, but it works fine for smaller datasets)

To load a .csv file into the current session and work with it, use the read.csv() function in base R.

The result is returned as a data frame, with rows numbered by integers starting at 1.

data1 <- read.csv("D:\\RStudio\\Binning\\data.csv", header=TRUE, stringsAsFactors=FALSE)

2. Use the readr package’s read_csv function (roughly 2-3x faster than read.csv)

The readr package is designed to read large flat files into the working environment quickly and efficiently.

library(readr)
data2 <- read_csv("D:\\RStudio\\Binning\\data.csv")

3. Use the data.table package’s fread function (roughly 2-3 times faster than read_csv)

library(data.table)
data3 <- fread("D:\\RStudio\\Binning\\data.csv")

This tutorial demonstrates how to import a CSV file into R using each of these approaches.
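
The speed differences quoted above depend on the machine and the file, so it can be worth timing the three readers yourself. The following is a rough sketch under stated assumptions: the 100,000-row data set, its column names, and the temporary file are invented for the comparison, and the readr and data.table timings only run if those packages are installed.

```r
# Build a hypothetical 100,000-row data set and write it to a temporary file.
set.seed(1)
n <- 1e5
demo <- data.frame(
  id    = seq_len(n),
  group = sample(c("A", "B", "C"), n, replace = TRUE),
  value = rnorm(n)
)
path <- tempfile(fileext = ".csv")
write.csv(demo, path, row.names = FALSE)

# Time each reader once; elapsed seconds vary by machine and file.
timings <- c(base = system.time(read.csv(path))[["elapsed"]])

if (requireNamespace("readr", quietly = TRUE)) {
  timings["readr"] <- system.time(
    # col_types = cols() lets readr guess types without printing a message.
    readr::read_csv(path, col_types = readr::cols())
  )[["elapsed"]]
}
if (requireNamespace("data.table", quietly = TRUE)) {
  timings["data.table"] <- system.time(data.table::fread(path))[["elapsed"]]
}

print(timings)
unlink(path)
```

A single run like this is noisy. Repeating it a few times, or on a file closer to your real data, gives a more reliable picture.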

Approach 1: read.csv

If your CSV file is small enough, you may simply use Base R’s read.csv function to import it.

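To prevent R from converting character columns into factors, set stringsAsFactors=FALSE when using this function. Since R 4.0.0 this is already the default, but setting it explicitly keeps the behaviour the same on older versions.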

The following code imports the CSV file from the location above into R using read.csv:

data1 <- read.csv("D:\\RStudio\\Binning\\data.csv", header=TRUE, stringsAsFactors=FALSE)
head(data1)
  Product      WHC_SLP      DHC_VOL      DHC_GLS
1       A NotPreferred NotPreferred    Preferred
2       A    Preferred    Preferred NotPreferred
3       A NotPreferred    Preferred    Preferred
4       A    Preferred NotPreferred NotPreferred
5       A NoPreference    Preferred NotPreferred
6       B NoPreference NotPreferred    Preferred

Let’s view the structure of the data

str(data1)
'data.frame':      11 obs. of  4 variables:
 $ Product: chr  "A" "A" "A" "A" ...
 $ WHC_SLP: chr  "NotPreferred" "Preferred" "NotPreferred" "Preferred" ...
 $ DHC_VOL: chr  "NotPreferred" "Preferred" "Preferred" "NotPreferred" ...
 $ DHC_GLS: chr  "Preferred" "NotPreferred" "Preferred" "NotPreferred" ...
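
To see what stringsAsFactors actually changes, here is a minimal sketch that reads the same file both ways. It assumes a small three-row file written to a temporary location; the rows are illustrative and only borrow two column names from the example above.

```r
# Hypothetical three-row file written to a temporary location.
path <- tempfile(fileext = ".csv")
writeLines(c("Product,WHC_SLP",
             "A,Preferred",
             "A,NotPreferred",
             "B,NoPreference"), path)

as_text    <- read.csv(path, header = TRUE, stringsAsFactors = FALSE)
as_factors <- read.csv(path, header = TRUE, stringsAsFactors = TRUE)

# Column classes under each setting.
print(sapply(as_text, class))
print(sapply(as_factors, class))

# With stringsAsFactors = TRUE each text column carries a set of levels.
print(levels(as_factors$WHC_SLP))
unlink(path)
```

Keeping the columns as character and converting selected ones with factor() afterwards is often easier to reason about than converting everything on import.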

Approach 2: read_csv

If you’re working with larger files, you can use the read_csv function from the readr package.

library(readr)

Now we can import the data set

data2 <- read_csv("D:\\RStudio\\Binning\\data.csv")
head(data2)
  Product WHC_SLP      DHC_VOL      DHC_GLS    
  <chr>   <chr>        <chr>        <chr>      
1 A       NotPreferred NotPreferred Preferred  
2 A       Preferred    Preferred    NotPreferred
3 A       NotPreferred Preferred    Preferred  
4 A       Preferred    NotPreferred NotPreferred
5 A       NoPreference Preferred    NotPreferred
6 B       NoPreference NotPreferred Preferred

Let’s view the structure of the data

str(data2)
spec_tbl_df [11 x 4] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ Product: chr [1:11] "A" "A" "A" "A" ...
 $ WHC_SLP: chr [1:11] "NotPreferred" "Preferred" "NotPreferred" "Preferred" ...
 $ DHC_VOL: chr [1:11] "NotPreferred" "Preferred" "Preferred" "NotPreferred" ...
 $ DHC_GLS: chr [1:11] "Preferred" "NotPreferred" "Preferred" "NotPreferred" ...
 - attr(*, "spec")=
  .. cols(
  ..   Product = col_character(),
  ..   WHC_SLP = col_character(),
  ..   DHC_VOL = col_character(),
  ..   DHC_GLS = col_character()
  .. )
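
The "spec" attribute above records the column types that read_csv guessed. If you would rather declare them yourself, the col_types argument accepts the same cols() specification. The sketch below assumes a two-row temporary file that mirrors the column names of data.csv, and it treats Product as a factor with levels "A" and "B" purely for illustration.

```r
library(readr)

# Hypothetical file mirroring the column names used in this post.
path <- tempfile(fileext = ".csv")
writeLines(c("Product,WHC_SLP,DHC_VOL,DHC_GLS",
             "A,NotPreferred,NotPreferred,Preferred",
             "B,NoPreference,NotPreferred,Preferred"), path)

# Declare Product as a factor with known levels; read the rest as character.
data2_typed <- read_csv(
  path,
  col_types = cols(
    Product  = col_factor(levels = c("A", "B")),
    .default = col_character()
  )
)

str(data2_typed$Product)
print(sapply(data2_typed, class))
```

Declaring the types up front skips the guessing step and makes the import fail loudly if a column does not match what you expected.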

Approach 3: fread

If your CSV file is very large, the fread function from the data.table package is typically the fastest of these three ways to import it into R.

Load the data.table package and import the data set

library(data.table)
data3 <- fread("D:\\RStudio\\Binning\\data.csv")
head(data3)
   Product      WHC_SLP      DHC_VOL      DHC_GLS
1:       A NotPreferred NotPreferred    Preferred
2:       A    Preferred    Preferred NotPreferred
3:       A NotPreferred    Preferred    Preferred
4:       A    Preferred NotPreferred NotPreferred
5:       A NoPreference    Preferred NotPreferred
6:       B NoPreference NotPreferred    Preferred

Now let’s view the structure of data3

str(data3)
Classes ‘data.table’ and 'data.frame':       11 obs. of  4 variables:
 $ Product: chr  "A" "A" "A" "A" ...
 $ WHC_SLP: chr  "NotPreferred" "Preferred" "NotPreferred" "Preferred" ...
 $ DHC_VOL: chr  "NotPreferred" "Preferred" "Preferred" "NotPreferred" ...
 $ DHC_GLS: chr  "Preferred" "NotPreferred" "Preferred" "NotPreferred" ...
 - attr(*, ".internal.selfref")=<externalptr>
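
With very large files it often helps to read only the columns and rows you need. The sketch below uses fread's select and nrows arguments on a small temporary file. The file contents are made up and only reuse the column names from this post. It also shows that the result is both a data.table and a data.frame, as the str() output above indicates.

```r
library(data.table)

# Hypothetical file mirroring the column names used in this post.
path <- tempfile(fileext = ".csv")
writeLines(c("Product,WHC_SLP,DHC_VOL,DHC_GLS",
             "A,NotPreferred,NotPreferred,Preferred",
             "A,Preferred,Preferred,NotPreferred",
             "B,NoPreference,NotPreferred,Preferred"), path)

# Read only two columns and only the first two data rows.
data3_subset <- fread(path, header = TRUE,
                      select = c("Product", "DHC_GLS"), nrows = 2)
print(data3_subset)

# A data.table is also a data.frame, so data.frame code accepts it.
print(is.data.frame(data3_subset))

# setDF() converts it to a plain data.frame in place, if that is preferred.
setDF(data3_subset)
print(class(data3_subset))
unlink(path)
```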

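To avoid the following common error, each example uses double backslashes (\\) in the file path. A single backslash starts an escape sequence in an R string, so a path that begins with C:\U is rejected: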

Error: '\U' used without hex digits in character string starting ""C:\U"
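
Doubling the backslashes is not the only option. As a short sketch, the same Windows path can be written with forward slashes, assembled with file.path(), or typed as a raw string (raw strings need R 4.0.0 or later). The path is the one used throughout this post; nothing is read from disk here.

```r
# Ways to write the same Windows path without triggering the escape error.
escaped <- "D:\\RStudio\\Binning\\data.csv"   # doubled backslashes
forward <- "D:/RStudio/Binning/data.csv"      # forward slashes
built   <- file.path("D:", "RStudio", "Binning", "data.csv")

# file.path() joins its arguments with "/", matching the forward-slash form.
print(identical(forward, built))

# Raw strings (R >= 4.0.0) keep single backslashes exactly as typed.
raw_path <- r"(D:\RStudio\Binning\data.csv)"
print(identical(raw_path, escaped))
```

Any of these strings can be passed to read.csv, read_csv, or fread in place of the double-backslash version.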
