R Data Frame
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
A data frame is a two-dimensional data structure which can store data in tabular format.
Data frames have rows and columns. Each column is a vector, and different columns can hold different data types.
Before we learn about data frames, make sure you know about R vectors.
Create a Data Frame in R
In R, we use the data.frame() function to create a data frame.
The syntax of the data.frame() function is

    dataframe1 <- data.frame(
      first_col = c(val1, val2, ...),
      second_col = c(val1, val2, ...),
      ...
    )

Here,
first_col - a vector with values val1, val2, ... of the same data type
second_col - another vector with values val1, val2, ... of the same data type, and so on
Let's see an example,

    # Create a data frame
    dataframe1 <- data.frame(
      Name = c("Juan", "Alcaraz", "Simantha"),
      Age = c(22, 15, 19),
      Vote = c(TRUE, FALSE, TRUE)
    )

    print(dataframe1)

Output

          Name Age  Vote
    1     Juan  22  TRUE
    2  Alcaraz  15 FALSE
    3 Simantha  19  TRUE

In the above example, we have used the data.frame() function to create a data frame named dataframe1. Notice the arguments passed inside data.frame(),

    data.frame(
      Name = c("Juan", "Alcaraz", "Simantha"),
      Age = c(22, 15, 19),
      Vote = c(TRUE, FALSE, TRUE)
    )

Here, Name, Age, and Vote are the column names for vectors of character (string), numeric, and logical (Boolean) type, respectively.
Finally, the data is printed in tabular format.
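To check which type R assigned to each column, one option is to apply class() to every column with sapply(). The following is a minimal sketch that reuses the example data. The variable name column_types is introduced here for illustration, and stringsAsFactors = FALSE is passed explicitly because R versions before 4.0.0 convert character columns to factors by default.

```r
# Same example data as above; stringsAsFactors = FALSE keeps Name as character
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# class() of each column, returned as a named character vector
column_types <- sapply(dataframe1, class)
print(column_types)
```

Calling str(dataframe1) gives a similar summary along with the first few values of each column.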
Access Data Frame Columns
There are different ways to extract columns from a data frame. We can use [ ], [[ ]], or $ to access a specific column of a data frame in R. For example,

    # Create a data frame
    dataframe1 <- data.frame(
      Name = c("Juan", "Alcaraz", "Simantha"),
      Age = c(22, 15, 19),
      Vote = c(TRUE, FALSE, TRUE)
    )

    # pass index number inside [ ]
    print(dataframe1[1])

    # pass column name inside [[ ]]
    print(dataframe1[["Name"]])

    # use $ operator and column name
    print(dataframe1$Name)

Output

          Name
    1     Juan
    2  Alcaraz
    3 Simantha
    [1] "Juan"     "Alcaraz"  "Simantha"
    [1] "Juan"     "Alcaraz"  "Simantha"

In the above example, we have created a data frame named dataframe1 with three columns: Name, Age, and Vote.
Here, we have used different operators to access the Name column of dataframe1.
Accessing a column with [[ ]] or $ gives the same result: both return the column as a vector. The [ ] operator behaves differently: with a single index, it returns a data frame that contains the selected column.
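The difference between the three operators can be checked directly with is.data.frame() and identical(). This is a small sketch using the same example data; the variable names single_bracket, double_bracket, and dollar are chosen here for illustration.

```r
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE)
)

# [ ] keeps the data frame structure (a one-column data frame)
single_bracket <- dataframe1["Age"]
print(is.data.frame(single_bracket))

# [[ ]] and $ both return the column itself as a vector
double_bracket <- dataframe1[["Age"]]
dollar <- dataframe1$Age
print(is.data.frame(double_bracket))
print(identical(double_bracket, dollar))
```

This matters when passing a column to a function that expects a vector, such as mean(): mean(dataframe1$Age) works, while mean(dataframe1["Age"]) does not return the average.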
Combine Data Frames
In R, we use the rbind() and the cbind() functions to combine two data frames.
rbind() - combines two data frames vertically
cbind() - combines two data frames horizontally
Combine Vertically Using rbind()
If we want to combine two data frames vertically, the column names of the two data frames must be the same. For example,

    # create a data frame
    dataframe1 <- data.frame(
      Name = c("Juan", "Alcaraz"),
      Age = c(22, 15)
    )

    # create another data frame
    dataframe2 <- data.frame(
      Name = c("Yiruma", "Bach"),
      Age = c(46, 89)
    )

    # combine two data frames vertically
    updated <- rbind(dataframe1, dataframe2)
    print(updated)

Output

         Name Age
    1    Juan  22
    2 Alcaraz  15
    3  Yiruma  46
    4    Bach  89

Here, we have used the rbind() function to combine the two data frames, dataframe1 and dataframe2, vertically.
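To see what happens when the column names differ, the sketch below tries to combine dataframe1 with a hypothetical dataframe3 whose second column is named Years instead of Age. The call is wrapped in tryCatch() so that the error message is captured and printed rather than stopping the script; dataframe3 and result are names assumed for this illustration.

```r
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz"),
  Age = c(22, 15)
)

# hypothetical data frame: second column is called Years, not Age
dataframe3 <- data.frame(
  Name = c("Yiruma", "Bach"),
  Years = c(46, 89)
)

# tryCatch() returns the error message instead of halting the script
result <- tryCatch(
  rbind(dataframe1, dataframe3),
  error = function(e) conditionMessage(e)
)
print(result)
```

One way to make this combine succeed is to rename the column first, for example with names(dataframe3) <- names(dataframe1), provided the columns really hold the same kind of data.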
Combine Horizontally Using cbind()
The cbind() function combines two or more data frames horizontally. For example,

    # create a data frame
    dataframe1 <- data.frame(
      Name = c("Juan", "Alcaraz"),
      Age = c(22, 15)
    )

    # create another data frame
    dataframe2 <- data.frame(
      Hobby = c("Tennis", "Piano")
    )

    # combine two data frames horizontally
    updated <- cbind(dataframe1, dataframe2)
    print(updated)

Output

         Name Age  Hobby
    1    Juan  22 Tennis
    2 Alcaraz  15  Piano

Here, we have used cbind() to combine two data frames horizontally.
Note: The data frames being combined with cbind() should have the same number of rows; otherwise, we can get an error such as: arguments imply differing number of rows.
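As a rough illustration of this note, the sketch below compares the row counts with nrow() before combining, then attempts cbind() on data frames with 2 and 3 rows. The names dataframe3, rows_match, and result are assumptions made for this example, and tryCatch() is used only to capture the error message.

```r
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz"),
  Age = c(22, 15)
)

# hypothetical data frame with 3 rows instead of 2
dataframe3 <- data.frame(
  Hobby = c("Tennis", "Piano", "Chess")
)

# compare row counts before combining
rows_match <- nrow(dataframe1) == nrow(dataframe3)
print(rows_match)

# cbind() fails here; tryCatch() returns the error message as a string
result <- tryCatch(
  cbind(dataframe1, dataframe3),
  error = function(e) conditionMessage(e)
)
print(result)
```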
Length of a Data Frame in R
In R, we use the length() function to find the number of columns in a data frame. For example,

    # Create a data frame
    dataframe1 <- data.frame(
      Name = c("Juan", "Alcaraz", "Simantha"),
      Age = c(22, 15, 19),
      Vote = c(TRUE, FALSE, TRUE)
    )

    cat("Total Elements:", length(dataframe1))

Output

    Total Elements: 3

Here, we have used length() to find the total number of columns in dataframe1. Since there are 3 columns, the length() function returns 3.
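Because length() counts columns rather than rows, it can help to compare it with ncol(), nrow(), and dim(). The sketch below uses a hypothetical data frame with 4 rows and 2 columns so that the two counts are easy to tell apart; the variable names are chosen for illustration.

```r
# hypothetical data frame with 4 rows and 2 columns
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha", "Yiruma"),
  Age = c(22, 15, 19, 46)
)

n_columns <- ncol(dataframe1)   # number of columns, same as length()
n_rows <- nrow(dataframe1)      # number of rows
dimensions <- dim(dataframe1)   # rows first, then columns

cat("Columns:", n_columns, "\n")
cat("Rows:", n_rows, "\n")
cat("length():", length(dataframe1), "\n")
```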