Create and convert tibbles
Tibbles are a modern reimagining of data frames and share much with their ancestors. The most visible difference is how their contents are printed to the console. Tibbles are part of the tidyverse and are valued for behaving more consistently than data frames.
- Learn the difference between data frames and tibbles
- Create tibbles from vectors
- Convert data frames into tibbles
tibble(___ = ___, ___ = ___, ...)
as_tibble(___)
Introduction to Tibbles
A modern reimagining of the data frame
Tibbles are in many ways similar to data frames. In fact, they inherit from data frames, which means that functions and features available for data frames generally also work for tibbles. Therefore, when we speak of data frames we also mean tibbles.
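The inheritance relationship can be checked directly in the console. The following is a minimal sketch; the small tibble `people` and its columns are made up for illustration and are not part of the course material.

```r
library(tibble)

# Hypothetical example data, not from the course
people <- tibble(id = 1:3, name = c("A", "B", "C"))

# A tibble carries the data frame class in addition to its own classes
class(people)

# Both checks confirm that a tibble counts as a data frame
inherits(people, "data.frame")
is.data.frame(people)

# Functions written for data frames therefore accept tibbles as well
nrow(people)
summary(people)
```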
In addition to everything a data frame has to offer, tibbles behave more consistently and are often easier to use. Most importantly, when a tibble is printed to the console, it shows only the first 10 rows and only as many columns as fit on screen. By contrast, printing a large data frame can fill the entire console with values, which quickly becomes confusing. Let’s take a look at the gapminder dataset from the gapminder package:
gapminder
# A tibble: 1,704 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows
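The number of printed rows and columns can be adjusted when the default is not what you need. This is a minimal sketch that uses the built-in `iris` data frame as a stand-in, so that it does not depend on the gapminder package; the object name `iris_tbl` is chosen here for illustration. The `n` and `width` arguments belong to the print method for tibbles.

```r
library(tibble)

# Stand-in data: the built-in iris data frame converted to a tibble
iris_tbl <- as_tibble(iris)

# Show only the first 5 rows instead of the default 10
print(iris_tbl, n = 5)

# Show all columns, even if they do not fit the console width
print(iris_tbl, n = 5, width = Inf)

# A plain data frame prints every row, so head() is used here
# to keep the output short
head(as.data.frame(iris_tbl), 5)
```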
The top line immediately tells us that the gapminder dataset is a tibble with 1,704 rows and 6 columns. The second line shows the column names, with their data types directly below.
For example, the column country has the type <fct> (short for “factor”), year is an integer <int>, and life expectancy lifeExp is a <dbl>, a decimal number.
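The abbreviations map onto ordinary R classes. Below is a minimal sketch with a small made-up tibble (the values are illustrative only and not taken from gapminder). It assumes the pillar package, which is installed as a dependency of tibble; its `type_sum()` function returns the abbreviation used in the printed header.

```r
library(tibble)

# Hypothetical mini table with the same column types as in gapminder
mini <- tibble(
  country = factor(c("A", "B")),
  year    = c(1952L, 1957L),
  lifeExp = c(30.5, 40.5)
)

# Underlying R class of each column
vapply(mini, function(col) class(col)[1], character(1))

# Abbreviation shown below each column name when the tibble is printed
vapply(mini, pillar::type_sum, character(1))
```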
Quiz: Tibbles versus Data Frames
Which answers about data frames and tibbles are correct?
- The printed output to the console is the same for tibbles and data frames.
- All functions defined for data frames also work on tibbles.
- Tibbles also show the data types in the console output.
- To use tibble objects the tibbles package needs to be loaded.
- The table dimensions are not shown in the console output for tibbles.
Creating Tibbles
tibble(___ = ___, ___ = ___, ...)
as_tibble(___)
Tibbles are created in much the same way as data frames. We can use the tibble() function from the tibble package to create a new tabular object.
For example, a tibble with three columns describing four different people can be created like this:
library(tibble)
tibble(
  id = c(1, 2, 3, 4),
  name = c("Louisa", "Jonathan", "Luigi", "Rachel"),
  female = c(TRUE, FALSE, FALSE, TRUE)
)
# A tibble: 4 x 3
     id name     female
  <dbl> <chr>    <lgl>
1     1 Louisa   TRUE
2     2 Jonathan FALSE
3     3 Luigi    FALSE
4     4 Rachel   TRUE
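To make the parallel with data frames concrete, the sketch below builds the same table once with base R's data.frame() and once with tibble(). It reuses the values from the example above; the object names `people_df` and `people_tbl` are chosen here for illustration.

```r
library(tibble)

# Base R data frame
people_df <- data.frame(
  id = c(1, 2, 3, 4),
  name = c("Louisa", "Jonathan", "Luigi", "Rachel"),
  female = c(TRUE, FALSE, FALSE, TRUE)
)

# The same arguments passed to tibble()
people_tbl <- tibble(
  id = c(1, 2, 3, 4),
  name = c("Louisa", "Jonathan", "Luigi", "Rachel"),
  female = c(TRUE, FALSE, FALSE, TRUE)
)

# Both objects have the same shape and column names
dim(people_df)
dim(people_tbl)
names(people_tbl)

# The tibble is still a data frame underneath
is.data.frame(people_tbl)
```

The two objects differ mainly in how they print: `people_tbl` shows its dimensions and column types, while `people_df` prints only the values.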
Converting data frames to Tibbles
If you prefer tibbles for their additional features, you can convert an existing data frame into a tibble with the as_tibble() function.
For example, the Davis data frame from the carData package can be converted to a tibble like so:
library(carData)  # provides the Davis data frame
as_tibble(Davis)
# A tibble: 200 x 5
   sex   weight height repwt repht
   <fct>  <int>  <int> <int> <int>
 1 M         77    182    77   180
 2 F         58    161    51   159
 3 F         53    161    54   158
 4 M         68    177    70   175
 5 F         59    157    59   155
 6 M         76    170    76   165
 7 M         76    167    77   165
 8 M         69    186    73   180
 9 M         71    178    71   175
10 M         65    171    64   170
# … with 190 more rows
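One detail worth knowing when converting: tibbles do not keep row names. The following is a minimal sketch using the built-in `mtcars` data frame (an example chosen here, not taken from the course). The `rownames` argument of as_tibble() moves the row names into a regular column; the column name `model` is an arbitrary choice.

```r
library(tibble)

# Plain conversion: the car names stored as row names are dropped
cars_tbl <- as_tibble(mtcars)

# Keep the row names by moving them into a column called "model"
cars_named <- as_tibble(mtcars, rownames = "model")
names(cars_named)[1]

# Convert back to a plain data frame if needed
cars_df <- as.data.frame(cars_tbl)
class(cars_df)
```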
Exercise: Convert data frame to Tibble
  speed dist
1     4    2
2     4   10
3     7    4
 [ reached 'max' / getOption("max.print") -- omitted 47 rows ]
The data frame cars reports the speed of cars and the distances they took to stop. To get a nicer printed output in the console, use the as_tibble() function to create a tibble object from it.
Create and convert tibbles is an excerpt from the course Introduction to R, which is available for free at quantargo.com