Site icon R-bloggers

A Comprehensive Introduction to Handling Date & Time in R

[This article was first published on Rsquared Academy Blog - Explore Discover Learn, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In this tutorial, we will learn to handle date & time in R. We will start off by learning how to get current date & time before moving on to understand how R handles date/time internally and the different classes such as Date & POSIXct/lt. We will spend some time exploring time zones, daylight savings and ISO 8001 standard for representing date/time. We will look at all the weird formats in which date/time come in real world and learn to parse them using conversion specifications. After this, we will also learn how to handle date/time columns while reading external data into R. We will learn to extract and update different date/time components such as year, month, day, hour, minute etc., create sequence of dates in different ways and explore intervals, durations and period. We will end the tutorial by learning how to round/rollback dates. Throughout the tutorial, we will also work through a case study to better understand the concepts we learn. Happy learning!

Table of Contents

Resources

Below are the links to all the resources related to this tutorial:

Introduction

Date

Let us begin by looking at the current date and time. Sys.Date() and today() will return the current date.

Sys.Date()
## [1] "2020-06-10"
lubridate::today()
## [1] "2020-06-10"

Time

Sys.time() and now() return the date, time and the timezone. In now(), we can specify the timezone using the tzone argument.

Sys.time()
## [1] "2020-06-10 22:47:50 IST"
lubridate::now()
## [1] "2020-06-10 22:47:50 IST"
lubridate::now(tzone = "UTC")
## [1] "2020-06-10 17:17:50 UTC"

AM or PM?

am() and pm() allow us to check whether date/time occur in the AM or PM? They return a logical value i.e. TRUE or FALSE

lubridate::am(now())
## [1] FALSE
lubridate::pm(now())
## [1] TRUE

Leap Year

We can also check if the current year is a leap year using leap_year().

Sys.Date()
## [1] "2020-06-10"
lubridate::leap_year(Sys.Date())
## [1] TRUE

Summary

Function Description
Sys.Date() Current Date
lubridate::today() Current Date
Sys.time() Current Time
lubridate::now() Current Time
lubridate::am() Whether time occurs in am?
lubridate::pm() Whether time occurs in pm?
lubridate::leap_year() Check if the year is a leap year?

Your Turn

  • get current date
  • get current time
  • check whether the time occurs in am or pm?
  • check whether the following years were leap years
    • 2018
    • 2016

Case Study

Throughout the tutorial, we will work on a case study related to transactions of an imaginary trading company. The data set includes information about invoice and payment dates.

Data

transact <- readr::read_csv('https://raw.githubusercontent.com/rsquaredacademy/datasets/master/transact.csv')
## # A tibble: 2,466 x 3
##    Invoice    Due        Payment   
##    <date>     <date>     <date>    
##  1 2013-01-02 2013-02-01 2013-01-15
##  2 2013-01-26 2013-02-25 2013-03-03
##  3 2013-07-03 2013-08-02 2013-07-08
##  4 2013-02-10 2013-03-12 2013-03-17
##  5 2012-10-25 2012-11-24 2012-11-28
##  6 2012-01-27 2012-02-26 2012-02-22
##  7 2013-08-13 2013-09-12 2013-09-09
##  8 2012-12-16 2013-01-15 2013-01-12
##  9 2012-05-14 2012-06-13 2012-07-01
## 10 2013-07-01 2013-07-31 2013-07-26
## # ... with 2,456 more rows

We will explore more about reading data sets with date/time columns after learning how to parse date/time. We have shared the code for reading the data sets used in the practice questions both in the Learning Management System as well as in our GitHub repo.

Data Dictionary

The data set has 3 columns. All the dates are in the format (yyyy-mm-dd).

Column Description
Invoice Invoice Date
Due Due Date
Payment Payment Date

In the case study, we will try to answer a few questions we have about the transact data.

  • extract date, month and year from Due
  • compute the number of days to settle invoice
  • compute days over due
  • check if due year is a leap year
  • check when due day in february is 29, whether it is a leap year
  • how many invoices were settled within due date
  • how many invoices are due in each quarter

Date & Time Classes

In this section, we will look at two things. First, how to create date/time data in R, and second, how to convert other data types to date/time. Let us begin by creating the release date of R 3.6.2.

release_date <- 2019-12-12
release_date
## [1] 1995

Okay! Why do we see 1995 when we call the date? What is happening here? Let us quickly check the data type of release_date.

class(release_date)
## [1] "numeric"

The data type is numeric i.e. R has subtracted 12 twice from 2019 to return 1995. Clearly, the above method is not the right way to store date/time. Let us see if we can get some hints from the built-in R functions we used in the previous section. If you observe the output, all of them returned date/time wrapped in quotes. Hmmm… let us wrap our date in quotes and see what happens.

release_date <- "2019-12-12"
release_date
## [1] "2019-12-12"

Alright, now R does not do any arithmetic and returns the date as we specified. Great! Is this the right format to store date/time then? No. Why? What is the problem if date/time is saved as character/string? The problem is the nature or type of operations done on date/time are different when compared to string/character, number or logical values.

  • how do we add/subtract dates?
  • how do we extract components such as year, month, day etc.

To answer the above questions, we will first check the data type of Sys.Date() and now().

class(Sys.Date())
## [1] "Date"
class(lubridate::now())
## [1] "POSIXct" "POSIXt"
class(release_date)
## [1] "character"

As you can see from the above output, there are 3 different classes for storing date/time in R

  • Date
  • POSIXct
  • POSIXlt

Let us explore each of the above classes one by one.

Date

Introduction

The Date class represents calendar dates. Let us go back to Sys.Date(). If you check the class of Sys.Date(), it is Date. Internally, this date is a number i.e. an integer. The unclass() function will show how dates are stored internally.

Sys.Date()
## [1] "2020-06-10"
unclass(Sys.Date())
## [1] 18423

What does this integer represent? Why has R stored the date as an integer? In R, dates are represented as the number of days since 1970-01-01. All the dates in R are internally stored in this way. Before we explore this concept further, let us learn to create Date objects in R. We will continue to use the release date of R 3.6.2, 2019-12-12.

Until now, we have stored the above date as character/string but now we will use as.Date() to save it as a Date object. as.Date() is the easiest and simplest way to create dates in R.

release_date <- as.Date("2019-12-12")
release_date
## [1] "2019-12-12"

The as_date() function from the lubridate package is similar to as.Date().

release_date <- lubridate::as_date("2019-12-12")
release_date
## [1] "2019-12-12"

If you look at the difference between release_date and 1970-01-01, it will be the same as unclass(release_date).

release_date - as.Date("1970-01-01")
## Time difference of 18242 days
unclass(release_date)
## [1] 18242

Let us come back to 1970-01-01 i.e. the origin for dates in R.

lubridate::origin
## [1] "1970-01-01 UTC"

From the previous examples, we know that dates are internally stored as number of days since 1970-01-01. How about dates older than the origin? How are they stored? Let us look at that briefly.

unclass(as.Date("1963-08-28"))
## [1] -2318

Dates older than the origin are stored as negative integers. For those who are not aware, Martin Luther King, Jr. delivered his famous I Have a Dream speech on 1963-08-28. Let us move on and learn how to convert numbers into dates.

Convert Numeric

The as.Date() function can be used to convert any of the following to a Date object

  • character/string
  • number
  • factor (categorical/qualitative)

We have explored how to convert strings to date. How about converting numbers to date? Sure, we can create date from numbers by specifying the origin and number of days since it.

as.Date(18242, origin = "1970-01-01")
## [1] "2019-12-12"

The origin can be changed to another date (while changing the number as well.)

as.Date(7285, origin = "2000-01-01")
## [1] "2019-12-12"

ISO 8601

If you have carefully observed, the format in which we have been specifying the dates as well as of those returned by functions such as Sys.Date() or Sys.time() is the same i.e. YYYY-MM-DD. It includes

  • the year including the century
  • the month
  • the date

The month and date separated by -. This default format used in R is the ISO 8601 standard for date/time. ISO 8601 is the internationally accepted way to represent dates and times and uses the 24 hour clock system. Let us create the release date using another function ISOdate().

ISOdate(year  = 2019,
        month = 12,
        day   = 12,
        hour  = 8,
        min   = 5, 
        sec   = 3,
        tz    = "UTC")
## [1] "2019-12-12 08:05:03 UTC"

We will look at all the different weird ways in which date/time are specified in the real world in the Date & Time Formats section. For the time being, let us continue exploring date/time classes in R. The next class we are going to look at is POSIXct/POSIXlt.

POSIX

You might be wondering what is this POSIX thing? POSIX stands for Portable Operating System Interface. It is a family of standards specified for maintaining compatibility between different operating systems. Before we learn to create POSIX objects, let us look at now() from lubridate.

class(lubridate::now())
## [1] "POSIXct" "POSIXt"

now() returns current date/time as a POSIXct object. Let us look at its internal representation using unclass()

unclass(lubridate::now())
## [1] 1591809475
## attr(,"tzone")
## [1] ""

The output you see is the number of seconds since January 1, 1970.

POSIXct

POSIXct represents the number of seconds since the beginning of 1970 (UTC) and ct stands for calendar time. To store date/time as POSIXct objects, use as.POSIXct(). Let us now store the release date of R 3.6.2 as POSIXct as shown below

release_date <- as.POSIXct("2019-12-12 08:05:03")
class(release_date)
## [1] "POSIXct" "POSIXt"
unclass(release_date) 
## [1] 1576118103
## attr(,"tzone")
## [1] ""

POSIXlt

POSIXlt represents the following information in a list

  • seconds
  • minutes
  • hour
  • day of the month
  • month
  • year
  • day of week
  • day of year
  • daylight saving time flag
  • time zone
  • offset in seconds from GMT

The lt in POSIXlt stands for local time. Use as.POSIXlt() to store date/time as POSIXlt objects. Let us store the release date as a POSIXlt object as shown below

release_date <- as.POSIXlt("2019-12-12 08:05:03")
release_date
## [1] "2019-12-12 08:05:03 IST"

As we said earlier, POSIXlt stores date/time components in a list and these can be extracted. Let us look at the date/time components returned by POSIXlt using unclass().

release_date <- as.POSIXlt("2019-12-12 08:05:03")
unclass(release_date)
## $sec
## [1] 3
## 
## $min
## [1] 5
## 
## $hour
## [1] 8
## 
## $mday
## [1] 12
## 
## $mon
## [1] 11
## 
## $year
## [1] 119
## 
## $wday
## [1] 4
## 
## $yday
## [1] 345
## 
## $isdst
## [1] 0
## 
## $zone
## [1] "IST"
## 
## $gmtoff
## [1] NA

Use unlist() if you want the components returned as a vector.

release_date <- as.POSIXlt("2019-12-12 08:05:03")
unlist(release_date)
##    sec    min   hour   mday    mon   year   wday   yday  isdst   zone gmtoff 
##    "3"    "5"    "8"   "12"   "11"  "119"    "4"  "345"    "0"  "IST"     NA

To extract specific components, use $.

release_date <- as.POSIXlt("2019-12-12 08:05:03")
release_date$hour
## [1] 8
release_date$mon
## [1] 11
release_date$zone
## [1] "IST"

Now, let us look at the components returned by POSIXlt. Some of them are intuitive

Component Description
sec Second
min Minute
hour Hour of the day
mon Month of the year (0-11
zone Timezone
wday Day of week
mday Day of month
year Years since 1900
yday Day of year
isdst Daylight saving flag
gmtoff Offset is seconds from GMT

Great! We will end this section with a few tips/suggestions on when to use Date or POSIXct/POSIXlt.

  • use Date when there is no time component
  • use POSIX when dealing with time and timezones
  • use POSIXlt when you want to access/extract the different components

Your Turn

R 1.0.0 was released on 2000-02-29 08:55:23 UTC. Save it as

  • Date using character
  • Date using origin and number
  • POSIXct
  • POSIXlt and extract
    • month day
    • day of year
    • month
    • zone
  • ISODate

Date Arithmetic

Time to do some arithmetic with the dates. Let us calculate the length of a course you have enrolled for (Become a Rock Star Data Scientist in 10 Days) by subtracting the course start date from the course end date.



course_start    <- as_date('2017-04-12')
course_end      <- as_date('2017-04-21')
course_duration <- course_end - course_start
course_duration
## Time difference of 9 days

Shift Date

Time to shift the course dates. We can shift a date by days, weeks or months. Let us shift the course start date by:

  • 2 days
  • 3 weeks
  • 1 year



course_start + days(2)
## [1] "2017-04-14"
course_start + weeks(3)
## [1] "2017-05-03"
course_start + years(1)
## [1] "2018-04-12"

Case Study

Compute days to settle invoice

Let us estimate the number of days to settle the invoice by subtracting the date of invoice from the date of payment.

transact %>%
  mutate(
    days_to_pay = Payment - Invoice
  )
## # A tibble: 2,466 x 4
##    Invoice    Due        Payment    days_to_pay
##    <date>     <date>     <date>     <drtn>     
##  1 2013-01-02 2013-02-01 2013-01-15 13 days    
##  2 2013-01-26 2013-02-25 2013-03-03 36 days    
##  3 2013-07-03 2013-08-02 2013-07-08  5 days    
##  4 2013-02-10 2013-03-12 2013-03-17 35 days    
##  5 2012-10-25 2012-11-24 2012-11-28 34 days    
##  6 2012-01-27 2012-02-26 2012-02-22 26 days    
##  7 2013-08-13 2013-09-12 2013-09-09 27 days    
##  8 2012-12-16 2013-01-15 2013-01-12 27 days    
##  9 2012-05-14 2012-06-13 2012-07-01 48 days    
## 10 2013-07-01 2013-07-31 2013-07-26 25 days    
## # ... with 2,456 more rows

Compute days over due

How many of the invoices were settled post the due date? We can find this by:

  • subtracting the due date from the payment date
  • counting the number of rows where delay > 0
transact %>%
  mutate(
    delay = Payment - Due
  ) %>%
  filter(delay > 0) %>% 
  count(delay)
## # A tibble: 36 x 2
##    delay       n
##  * <drtn>  <int>
##  1  1 days    61
##  2  2 days    65
##  3  3 days    51
##  4  4 days    62
##  5  5 days    69
##  6  6 days    56
##  7  7 days    55
##  8  8 days    49
##  9  9 days    38
## 10 10 days    33
## # ... with 26 more rows

Your Turn

  • compute the length of a vacation which begins on 2020-04-19 and ends on 2020-04-25
  • recompute the length of the vacation after shifting the vacation start and end date by 10 days and 2 weeks
  • compute the days to settle invoice and days overdue from the receivables.csv data set
  • compute the length of employment (only for those employees who have been terminated) from the hr-data.csv data set (use date of hire and termination)

Time Zones & Daylight Savings

Introduction

In the previous section, POSIXlt stored date/time components as a list. Among the different components it returned were

  • gmtoff
  • zone

gmtoff is offset in seconds from GMT i.e. difference in hours and minutes from UTC. Wait.. What do UTC and GMT stand for?

  • Coordinated Universal Time (UTC)
  • Greenwich Meridian Time (GMT)

Since we are talking about UTC, GMT etc., let us spend a little time on understanding the basics of time zones and daylight savings.

Time Zones

Timezones exist because different parts of the Earth receive sun light at different times. If there was a single timezone, noon or morning would mean different things in different parts of the world. The timezones are based on Earth’s rotation. The Earth moves ~15 degrees every 60 minutes i.e. 360 degrees in 24 hours. The planet is divided into 24 timezones, each 15 degrees of longitude width.

Now, you have heard of Greenwich Meridian Time (GMT) right? We just saw GMT off set in POSIXlt and you would have come across it in other time formats as well. For example, India timezone is given as GMT +5:30. Let us explore GMT in a little more detail. Greenwich is a suburb of London and the time at Greenwich is Greenwich Mean Time. As you move West from Greenwich, every 15 degree section is one hour earlier than GMT and every 15 degree section to the East is an hour later.

Alright! What is UTC then? Coordinated Universal Time (UTC) , on the other hand, is the time standard commonly used across the world. Even though they share the same current time, GMT is a timezone while UTC is a time standard.

So how do we check the timezone in R? When you run Sys.timezone(), you should be able to see the timezone you are in.

Sys.timezone()
## [1] "Asia/Calcutta"

If you do not see the timezone, use Sys.getenv() to get the value of the TZ environment variable.

Sys.getenv("TZ")
## [1] ""

If nothing is returned, it means we have to set the timezone. Use Sys.setenv() to set the timezone as shown below. The author resides in India and hence the timezone is set to Asia/Calcutta. You need to set the timezone in which you reside or work.

Sys.setenv(TZ = "Asia/Calcutta")

Another way to get the timezone is through tz() from the lubridate package.

lubridate::tz(Sys.time())
## [1] ""

If you want to view the time in a different timezone, use with_tz(). Let us look at the current time in UTC instead of Indian Standard Time.

lubridate::with_tz(Sys.time(), "UTC")
## [1] "2020-06-10 17:17:58 UTC"

Daylight Savings

Daylight savings also known as

  • daylight saving time
  • daylight savings time
  • daylight time
  • summer time

is the practice of advancing clocks during summer months so that darkness falls later each day according to the clock. In other words

  • advance clock by one hour in spring (spring forward)
  • retard clocks by one hour in autumn (fall back)

In R, the dst() function is an indicator for daylight savings. It returns TRUE if daylight saving is in force, FALSE if not and NA if unknown.

Sys.Date()
## [1] "2020-06-10"
dst(Sys.Date()) 
## [1] FALSE

Your Turn

  • check the timezone you live in
  • check if daylight savings in on
  • check the current time in UTC or a different time zone

Date & Time Formats

After the timezones and daylight savings detour, let us get back on path and explore another important aspect, date & time formats. Although it is a good practice to adher to ISO 8601 format, not all date/time data will comply with it. In real world, date/time data may come in all types of weird formats. Below is a sample

Format
December 12, 2019
12th Dec, 2019
Dec 12th, 19
12-Dec-19
2019 December
12.12.19

When the data is not in the default ISO 8601 format, we need to explicitly specify the format in R. We do this using conversion specifications. A conversion specification is introduced by %, usually followed by a single letter or O or E and then a single letter.

Conversion Specifications

Specification Description Example
%d Day of the month (decimal number) 12
%m Month (decimal number) 12
%b Month (abbreviated) Dec
%B Month (full name) December
%y Year (2 digit) 19
%Y Year (4 digit) 2019
%H Hour 8
%M Minute 5
%S Second 3

Time to work through a few examples. Let us say you are dealing with dates in the format 19/12/12. In this format, the year comes first followed by month and the date; each separated by a slash (/). The year consists of only 2 digits i.e. it does not include the century. Let us now map each component of the date to the conversion specification table shown at the beginning.

Date Specification
19 %y
12 %m
12 %d

Using the format argument, we will specify the conversion specification as a character vector i.e. enclosed in quotes.

as.Date("19/12/12", format = "%y/%m/%d")
## [1] "2019-12-12"

Another way in which the release data can be written is 2019-Dec-12. We still have the year followed by the month and the date but there are a few changes here:

  • the components are separated by a - instead of /
  • year has 4 digits i.e. includes the century
  • the month is specified using abbreviation instead of digits

Let us map the components to the format table:

Date Specification
2019 %Y
Dec %b
12 %d

Let us specify the format for the date using the above mapping.

as.Date("2019-Dec-12", format = "%Y-%b-%d")
## [1] "2019-12-12"

In both the above examples, we have not dealt with time components. Let us include the release time of R 3.6.2 in the next one i.e.  19/12/12 08:05:03.

Date Specification
19 %y
12 %m
12 %d
08 %H
05 %M
03 %S

Since we are dealing with time, we will use as.POSIXct() instead of as.Date().

as.POSIXct("19/12/12 08:05:03", tz = "UTC", format = "%y/%m/%d %H:%M:%S")
## [1] "2019-12-12 08:05:03 UTC"

In the below table, we look at some of the most widely used conversion specifications. You can learn more about these specifications by running ?strptime or help(strptime).

Specification Description
%a Abbreviated weekday
%A Full weekday
%C Century (00-99)
%D Same as %m/%d/%y
%e Day of month [1 – 31]
%F Same as %Y-%m-%d
%h Same as %b
%I Hours as decimal [01 – 12]
%j Day of year [001 – 366]
%R Same as %H:%M
%t Tab
%T Same as %H:%M:%S
%u Weekday 1 – 7
%U Week of year [00 – 53]
%V Week of year [01 – 53]
%w Weekday 0 – 6
%W Week of year [00 – 53]

We have included a lot of practice questions for you to explore the different date/time formats. The solutions are available in the Learning Management System as well as in our GitHub repo. Try them and let us know if you have any doubts.

Guess Format

guess_formats() from lubridate is a very useful function. It will guess the date/time format if you specify the order in which year, month, date, hour, minute and second appear.

release_date_formats <- c("December 12th 2019",
                        "Dec 12th 19",
                        "dec 12 2019")

guess_formats(release_date_formats, 
              orders = "mdy", 
              print_matches = TRUE)
##                           Omdy          mdy         
## [1,] "December 12th 2019" "%Om %dth %Y" "%B %dth %Y"
## [2,] "Dec 12th 19"        "%Om %dth %y" "%b %dth %y"
## [3,] "dec 12 2019"        "%Om %d %Y"   "%b %d %Y"
##          Omdy          Omdy          Omdy           mdy           mdy 
## "%Om %dth %Y" "%Om %dth %y"   "%Om %d %Y"  "%B %dth %Y"  "%b %dth %y" 
##           mdy 
##    "%b %d %Y"

Your Turn

Below, we have specified July 5th, 2019 in different ways. Create the date using as.Date() while specifying the correct format for each of them.

  • 05.07.19
  • 5-July 2019
  • July 5th, 2019
  • July 05, 2019
  • 2019-July- 05
  • 05/07/2019
  • 07/05/2019
  • 7/5/2019
  • 07/5/19
  • 2019-07-05

Parse Date & Time

While creating date-time objects, we specified different formats using the conversion specification but most often you will not create date/time and instead deal with data that comes your way from a database or API or colleague/collaborator. In such cases, we need to be able to parse date/time from the data provided to us. In this section, we will focus on parsing date/time from character data. Both base R and the lubridate package offer functions to parse date and time and we will explore a few of them in this section. We will initially use functions from base R and later on explore those from lubridate which will give us an opportunity to compare and contrast. It will also allow us to choose the functions based on the data we are dealing with.

strptime() will convert character data to POSIXlt. You will use this when converting from character data to date/time. On the other hand, if you want to convert date/time to character data, use any of the following:

  • strftime()
  • format()
  • as.character()

The above functions will convert POSIXct/POSIXlt to character. Let us start with a simple example. The data we have been supplied has date/time as character data and in the format YYYYMMDD i.e. nothing separates the year, month and date from each other. We will use strptime() to convert this to an object of class POSIXlt.

rel_date <- strptime("20191212", format = "%Y%m%d")
class(rel_date)
## [1] "POSIXlt" "POSIXt"

If you have a basic knowledge of conversion specifications, you can use strptime() to convert character data to POSIXlt. Let us quickly explore the functions to convert date/time to character data before moving on to the functions from lubridate.

rel_date_strf <- strftime(rel_date)
class(rel_date_strf)
## [1] "character"
rel_date_format <- format(rel_date)
class(rel_date_format)
## [1] "character"
rel_date_char <- as.character(rel_date)
class(rel_date_char)
## [1] "character"

As you can see, all the 3 functions converted date/time to character. Time to move on and explore the lubridate package. We will start with an example in which the release date is formatted in 3 different ways but they have one thing in common i.e. the order in which the components appear. In all the 3 formats, the year is followed by the month and then the date.

To parse the release date, we will use parse_date_time() from lubridate which parses the input into POSIXct objects.

release_date <- c("19-12-12", "20191212", "19-12 12")
parse_date_time(release_date, "ymd")
## [1] "2019-12-12 UTC" "2019-12-12 UTC" "2019-12-12 UTC"
parse_date_time(release_date, "y m d")
## [1] "2019-12-12 UTC" "2019-12-12 UTC" "2019-12-12 UTC"
parse_date_time(release_date, "%y%m%d")
## [1] "2019-12-12 UTC" "2019-12-12 UTC" "2019-12-12 UTC"

Try to use strptime() in the above example and see what happens. Now, let us look at another data set.

release_date <- c("19-07-05", "2019-07-05", "05-07-2019")

What happens in the below case? The same date appears in multiple formats. How do we parse them? parse_date_time() allows us to specify mutiple date-time formats. Let us first map the dates to their formats.

Date Specification
19-07-05 ymd
2019-07-05 ymd
05-07-2019 dmy

The above specifications can be supplied as a character vector.

parse_date_time(release_date, c("ymd", "ymd", "dmy"))
## [1] "2019-07-05 UTC" "2019-07-05 UTC" "2019-07-05 UTC"

Great! We have used both strptime() and parse_date_time() now. Can you tell what differentiates parse_date_time() when compared to strptime()? We summarize it in the points below:

  • no need to include % prefix or separator
  • specify several date/time formats

There are other helper functions that can be used to

  • parse dates with only year, month, day components
  • parse dates with year, month, day, hour, minute, seconds components
  • parse dates with only hour, minute, second components

and are explored in the below examples.

# year/month/date
ymd("2019-12-12")
## [1] "2019-12-12"
# year/month/date
ymd("19/12/12")
## [1] "2019-12-12"
# date/month/year
dmy(121219)
## [1] "2019-12-12"
# year/month/date/hour/minute/second
ymd_hms(191212080503)
## [1] "2019-12-12 08:05:03 UTC"
# hour/minute/second
hms("8, 5, 3")
## [1] "8H 5M 3S"
# hour/minute/second
hms("08:05:03")
## [1] "8H 5M 3S"
# minute/second
ms("5,3")
## [1] "5M 3S"
# hour/minute
hm("8, 5")
## [1] "8H 5M 0S"

Note, in a couple of cases where the components are not separated by /, - or space, we have not enclosed the values in quotes.

Your Turn

Below, we have specified July 5th, 2019 in different ways. Parse the dates using strptime() or parse_date_time() or any other helper function.

  • July-05-19
  • JUL-05-19
  • 05.07.19
  • 5-July 2019
  • July 5th, 2019
  • July 05, 2019
  • 2019-July- 05
  • 05/07/2019
  • 07/05/2019
  • 7/5/2019
  • 07/5/19
  • 2019-07-05

Date & Time Components

In the second section, we discussed the downside of saving date/time as character/string in R. One of the points we discussed was that we can’t extract components such as year, month, day etc. In this section, we will learn to extract date/time components such as

  • year
  • month
  • date
  • week
  • day
  • quarter
  • semester
  • hour
  • minute
  • second
  • timezone

The below table outlines the functions we will explore in the first part of this section.

Function Description
year() Get year
month() Get month (number)
month(label = TRUE) Get month (abbreviated name)
month(abbr = FALSE) Get month (full name)
months() Get month
week() Get week

Year

release_date <- ymd_hms("2019-12-12 08:05:03")
year(release_date) 
## [1] 2019

Month

month() will return the month as a number i.e. 12 for December.

month(release_date)
## [1] 12

Instead, if you want the name of the month , use the label argument and set it to TRUE. Now it returns Dec instead of 12.

month(release_date, label = TRUE)
## [1] Dec
## 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec

But this is the abbreviated name and not the full name. How do we get the full name of the month? Set the abbr argument to FALSE.

month(release_date, label = TRUE, abbr  = FALSE)
## [1] December
## 12 Levels: January < February < March < April < May < June < ... < December

Ah! now we can see the full name of the month. months() from base R will return the full name of the month by default. If you want the abbreviated name, use the abbreviate argument and set it to TRUE.

months(release_date)
## [1] "December"

Week

week() returns the number of complete 7 day periods between the date and 1st January plus one.

week(release_date)
## [1] 50

Day

Use day() to extract the date component. There are other variations such as

Function Description
day Get day
mday() Day of the month
wday() Day of the week
qday() Day of quarter
yday() Day of year
weekdays() Day of week
days_in_month() Days in the month
day(release_date)
## [1] 12
mday(release_date)                 
## [1] 12
qday(release_date)                 
## [1] 73
yday(release_date)                 
## [1] 346

wday can return

  • a number
  • abbreviation of the weekday
  • full name of the weekday
wday(release_date)  
## [1] 5
wday(release_date, label = TRUE)
## [1] Thu
## Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat
wday(release_date, label = TRUE, abbr  = FALSE)  
## [1] Thursday
## 7 Levels: Sunday < Monday < Tuesday < Wednesday < Thursday < ... < Saturday

weekdays() from base R also returns the day of the week (the name and not the number). If you want the abbreviated name, use the abbreviate argument.

weekdays(release_date)
## [1] "Thursday"
weekdays(release_date, abbreviate = TRUE)
## [1] "Thu"

Days in Month

If you want to know the number of days in the month, use days_in_month(). In our example, the month is December and it has 31 days.

days_in_month(release_date)
## Dec 
##  31

Hour, Minute & Seconds

Function Description
hour() Get hour
minute() Get minute
second() Get second
seconds() Number of seconds since 1970-01-01

So far we have been looking at date components. Now, let us look at time components.

hour(release_date)
## [1] 8
minute(release_date)
## [1] 5
second(release_date)
## [1] 3

seconds() returns the number of seconds since 1970-01-01.

seconds(release_date)
## [1] "1576137903S"

Quarter & Semester

quarter() will return the quarter from the date. December is in the 4th quarter and hence it returns 4.

quarter(release_date)
## [1] 4

If you want the year along with the quarter, set the with_year argument to TRUE.

quarter(release_date, with_year = TRUE)
## [1] 2019.4

In India, the fiscal starts in April and December falls in the 3rd quarter. How can we accommodate this change? The fiscal_start argument allows us to set the month in which the fiscal begins. We will set it to 4 for April. Now it returns 3 instead of 4.

quarter(release_date, fiscal_start = 4)    
## [1] 3

quarters() from base R also returns the quarter.

quarters(release_date)
## [1] "Q4"
Function Description
quarter() Get quarter
quarter(with_year = TRUE) Quarter with year
quarter(fiscal_start = 4) Fiscal starts in April
quarters() Get quarter
semester() Get semester

Case Study

Extract Date, Month & Year from Due Date

Let us now extract the date, month and year from the Due column.

transact %>%
  mutate(
    due_day   = day(Due),
    due_month = month(Due),
    due_year  = year(Due)
  )
## # A tibble: 2,466 x 6
##    Invoice    Due        Payment    due_day due_month due_year
##    <date>     <date>     <date>       <int>     <dbl>    <dbl>
##  1 2013-01-02 2013-02-01 2013-01-15       1         2     2013
##  2 2013-01-26 2013-02-25 2013-03-03      25         2     2013
##  3 2013-07-03 2013-08-02 2013-07-08       2         8     2013
##  4 2013-02-10 2013-03-12 2013-03-17      12         3     2013
##  5 2012-10-25 2012-11-24 2012-11-28      24        11     2012
##  6 2012-01-27 2012-02-26 2012-02-22      26         2     2012
##  7 2013-08-13 2013-09-12 2013-09-09      12         9     2013
##  8 2012-12-16 2013-01-15 2013-01-12      15         1     2013
##  9 2012-05-14 2012-06-13 2012-07-01      13         6     2012
## 10 2013-07-01 2013-07-31 2013-07-26      31         7     2013
## # ... with 2,456 more rows

Data Sanitization

Let us do some data sanitization. If the due day happens to be February 29, let us ensure that the due year is a leap year. Below are the steps to check if the due year is a leap year:

  • we will extract the following from the due date:
    • day
    • month
    • year
  • we will then create a new column is_leap which will have be set to TRUE if the year is a leap year else it will be set to FALSE
  • filter all the payments due on 29th Feb
  • select the following columns:
    • Due
    • is_leap
transact %>%
  mutate(
    due_day   = day(Due),
    due_month = month(Due),
    due_year  = year(Due),
    is_leap   = leap_year(due_year)
  ) %>%
  filter(due_month == 2 & due_day == 29) %>%
  select(Due, is_leap) 
## # A tibble: 4 x 2
##   Due        is_leap
##   <date>     <lgl>  
## 1 2012-02-29 TRUE   
## 2 2012-02-29 TRUE   
## 3 2012-02-29 TRUE   
## 4 2012-02-29 TRUE

Invoices Distribution by Quarter

Let us count the invoices due for each quarter.

transact %>%
  mutate(
    quarter_due = quarter(Due)
  ) %>%
  count(quarter_due)
## # A tibble: 4 x 2
##   quarter_due     n
## *       <int> <int>
## 1           1   521
## 2           2   661
## 3           3   618
## 4           4   666

Your Turn

Get the R release dates using r_versions() from the rversions package and tabulate the following

  • year
  • month with label
  • weekday with label
  • hour
  • and quarter

Create, Update & Verify

In the second section, we learnt to create date-time objects using as.Date(), as.POSIXct() etc. In this section, we will explore a few other functions that will allow us to do the same

  • make_date()
  • make_datetime()

Create

To create date without time components, use make_date() and specify the following:

  • year
  • month
  • date

We need to specify all the components in numbers i.e. we cannot use Dec or December for the month. It has to be 12.

make_date(year  = 2019,
          month = 12,
          day   = 12)
## [1] "2019-12-12"

When you need to include time components, use make_datetime().

make_datetime(year  = 2019,
              month = 12,
              day   = 12,
              hour  = 08,
              min   = 05,
              sec   = 03,
              tz    = "UTC")
## [1] "2019-12-12 08:05:03 UTC"

Update

Let us look at another scenario. You have a date-time object and want to change one of its components i.e. any of the following

  • year
  • month
  • date

Instead of creating another date-time object, you can change any of the components using update(). In the below example, we will start with the date of release of R version 3.6.1 and using update(), we will change it to 2019-12-12.

prev_release <- ymd("2019-07-05")
prev_release %>% 
  update(year  = 2019,
         month = 12,
         mday  = 12)
## [1] "2019-12-12"

Date Sequence

So far we have created a single date-time instance. How about creating a sequence of dates? We can do that using seq.Date(). We need to specify the from date as the bare minimum input. If the end date is not specified, it will create the sequence uptil the current date.

The interval of the sequence can be specified in any of the following units:

  • day
  • week
  • month
  • quarter
  • year

We can add the following to the interval units

  • integer
  • + / - (increment or decrement)

Using the integer, we can specify multiples of the units mentioned and using the sign, we can specify whether to increment or decrement.

The below table displays the main arguments used in seq.Date():

Function Description
from Starting date of the sequence
by End date of the sequence
to Date increment of the sequence
length.out Length of the sequence
along.with Use length of this value as length of sequence

In the first example, we will create a sequence of dates from 2010-01-01 to 2019-12-31. The unit of increment should be a year i.e. the difference between the dates in the sequence should be 1 year, specified using the by argument.

seq.Date(from = as.Date("2010-01-01"), to = as.Date("2019-12-31"), by = "year")
##  [1] "2010-01-01" "2011-01-01" "2012-01-01" "2013-01-01" "2014-01-01"
##  [6] "2015-01-01" "2016-01-01" "2017-01-01" "2018-01-01" "2019-01-01"

In the next example, we change the unit of increment to a quarter i.e. the difference between the dates in the sequence should be a quarter or 3 months.

seq.Date(from = as.Date("2009-12-12"), to = as.Date("2019-12-12"), by = "quarter")
##  [1] "2009-12-12" "2010-03-12" "2010-06-12" "2010-09-12" "2010-12-12"
##  [6] "2011-03-12" "2011-06-12" "2011-09-12" "2011-12-12" "2012-03-12"
## [11] "2012-06-12" "2012-09-12" "2012-12-12" "2013-03-12" "2013-06-12"
## [16] "2013-09-12" "2013-12-12" "2014-03-12" "2014-06-12" "2014-09-12"
## [21] "2014-12-12" "2015-03-12" "2015-06-12" "2015-09-12" "2015-12-12"
## [26] "2016-03-12" "2016-06-12" "2016-09-12" "2016-12-12" "2017-03-12"
## [31] "2017-06-12" "2017-09-12" "2017-12-12" "2018-03-12" "2018-06-12"
## [36] "2018-09-12" "2018-12-12" "2019-03-12" "2019-06-12" "2019-09-12"
## [41] "2019-12-12"

We will now create a sequence of dates but instead of specifying the unit of increment, we specify the number of dates in the sequence i.e. the length of the sequence. We do this using the length.out argument which specifies the desired length of the sequence. We want the sequence to have 10 dates including the start and end date, and hence we supply the value 10 for the length.out argument.

seq.Date(from = as.Date("2010-01-01"), to = as.Date("2019-12-31"), length.out = 10)
##  [1] "2010-01-01" "2011-02-10" "2012-03-22" "2013-05-02" "2014-06-11"
##  [6] "2015-07-22" "2016-08-31" "2017-10-10" "2018-11-20" "2019-12-31"

In all of the previous examples, we have specified both the start and the end date. Let us look at a few examples where we create a sequence of dates where we only specify the start date. In the below example, we want to create a sequence of dates starting from 2010-01-01. The unit of increment should be 1 year i.e. the difference between the dates in the sequence should be 1 year and the length of the sequence should be 10 i.e. the number of dates including the start date should be 10.

seq.Date(from = as.Date("2010-01-01"), by = "year", length.out = 10)
##  [1] "2010-01-01" "2011-01-01" "2012-01-01" "2013-01-01" "2014-01-01"
##  [6] "2015-01-01" "2016-01-01" "2017-01-01" "2018-01-01" "2019-01-01"

The unit of increment can include multiples and +/- sign i.e. it can be an unit of increment or decrement. In the next example, we can increment the dates in the sequence by 2 i.e. the difference between the dates should be 2 instead of 1. This is achieved by specifying the unit of increment (multiple) first followed by a space and then the unit. In our example, it is 2 year. As you can see, the sequence now goes all the way till 2028 and the gap between the dates is 2 years.

seq.Date(from = as.Date("2010-01-01"), by = "2 year", length.out = 10)
##  [1] "2010-01-01" "2012-01-01" "2014-01-01" "2016-01-01" "2018-01-01"
##  [6] "2020-01-01" "2022-01-01" "2024-01-01" "2026-01-01" "2028-01-01"

Let us say instead of increment we want to decrement the dates i.e. the sequence of dates will go backwards as shown in the next example. We achieve this by using the - sign along with the unit of decrement. The sequence of dates in next example starts from 2010 and goes back upto 1992 and the difference between the dates in 2 years.

seq.Date(from = as.Date("2010-01-01"), by = "-2 year", length.out = 10)
##  [1] "2010-01-01" "2008-01-01" "2006-01-01" "2004-01-01" "2002-01-01"
##  [6] "2000-01-01" "1998-01-01" "1996-01-01" "1994-01-01" "1992-01-01"

In the last example, we will explore the along.with argument. Here we have supplied a vector which is a sequence of numbers from 1 to 10. The length of this vector is 10 and the same length is used as the length of the sequence i.e.  the length of value supplied to along.with is also the length of the sequence.

seq.Date(from = as.Date("2010-01-01"), by = "-2 year", along.with = 1:10)
##  [1] "2010-01-01" "2008-01-01" "2006-01-01" "2004-01-01" "2002-01-01"
##  [6] "2000-01-01" "1998-01-01" "1996-01-01" "1994-01-01" "1992-01-01"

Verify Type

How do you check if the data is a date-time object? You can do that using any of the following from the lubridate package.

  • is.Date()
  • is.POSIXct()
  • is.POSIXlt()
is.Date(release_date)
## [1] FALSE
is.POSIXct(release_date)
## [1] TRUE
is.POSIXlt(release_date)
## [1] FALSE

Your Turn

  • R 2.0.0 was released on 2004-10-04 14:24:38. Create this date using both make_date() and make_datetime()

  • R 3.0.0 was released on 2013-04-03 07:12:36. Update the date created in the previous step to the above using update()

Intervals, Duration & Period

In this section, we will learn about

  • intervals
  • duration
  • and period

Interval

An interval is a timespan defined by two date-times. Let us represent the length of the course using interval.



course_start    <- as_date('2017-04-12')
course_end      <- as_date('2017-04-21')
interval(course_start, course_end)
## [1] 2017-04-12 UTC--2017-04-21 UTC

If you observe carefully, the interval is represented by the course start and end dates. We will learn how to use intervals in the case study.

Overlapping Intervals

Let us say you are planning a vacation and want to check if the vacation dates overlap with the course dates. You can do this by:

  • creating vacation and course intervals
  • use int_overlaps() to check if two intervals overlap. It returns TRUE if the intervals overlap else FALSE.

Let us use the vacation start and end dates to create vacation_interval and then check if it overlaps with course_interval.



vacation_start    <- as_date('2017-04-19')
vacation_end      <- as_date('2017-04-25')
course_interval   <- interval(course_start, course_end)
vacation_interval <- interval(vacation_start, vacation_end)
int_overlaps(course_interval, vacation_interval)
## [1] TRUE

How many invoices were settled within due date?

Let us use intervals to count the number of invoices that were settled within the due date. To do this, we will:

  • create an interval for the invoice and due date
  • create a new column due_next by incrementing the due date by 1 day
  • another interval for due_next and the payment date
  • if the intervals overlap, the payment was made within the due date
transact %>%
  mutate(
    inv_due_interval = interval(Invoice, Due),
    due_next         = Due + days(1),
    due_pay_interval = interval(due_next, Payment),
    overlaps         = int_overlaps(inv_due_interval, due_pay_interval)
  ) %>%
  select(Invoice, Due, Payment, overlaps)
## # A tibble: 2,466 x 4
##    Invoice    Due        Payment    overlaps
##    <date>     <date>     <date>     <lgl>   
##  1 2013-01-02 2013-02-01 2013-01-15 TRUE    
##  2 2013-01-26 2013-02-25 2013-03-03 FALSE   
##  3 2013-07-03 2013-08-02 2013-07-08 TRUE    
##  4 2013-02-10 2013-03-12 2013-03-17 FALSE   
##  5 2012-10-25 2012-11-24 2012-11-28 FALSE   
##  6 2012-01-27 2012-02-26 2012-02-22 TRUE    
##  7 2013-08-13 2013-09-12 2013-09-09 TRUE    
##  8 2012-12-16 2013-01-15 2013-01-12 TRUE    
##  9 2012-05-14 2012-06-13 2012-07-01 FALSE   
## 10 2013-07-01 2013-07-31 2013-07-26 TRUE    
## # ... with 2,456 more rows

Below we show another method to count the number of invoices paid within the due date. Instead of using days to change the due date, we use int_shift to shift it by 1 day.

transact %>%
  mutate(
    inv_due_interval = interval(Invoice, Due),
    due_pay_interval = interval(Due, Payment),  
    due_pay_next     = int_shift(due_pay_interval, by = days(1)),
    overlaps         = int_overlaps(inv_due_interval, due_pay_next)
  ) %>%
  select(Invoice, Due, Payment, overlaps)
## # A tibble: 2,466 x 4
##    Invoice    Due        Payment    overlaps
##    <date>     <date>     <date>     <lgl>   
##  1 2013-01-02 2013-02-01 2013-01-15 TRUE    
##  2 2013-01-26 2013-02-25 2013-03-03 FALSE   
##  3 2013-07-03 2013-08-02 2013-07-08 TRUE    
##  4 2013-02-10 2013-03-12 2013-03-17 FALSE   
##  5 2012-10-25 2012-11-24 2012-11-28 FALSE   
##  6 2012-01-27 2012-02-26 2012-02-22 TRUE    
##  7 2013-08-13 2013-09-12 2013-09-09 TRUE    
##  8 2012-12-16 2013-01-15 2013-01-12 TRUE    
##  9 2012-05-14 2012-06-13 2012-07-01 FALSE   
## 10 2013-07-01 2013-07-31 2013-07-26 TRUE    
## # ... with 2,456 more rows

You might be thinking why we incremented the due date by a day before creating the interval between the due day and the payment day. If we do not increment, both the intervals will share a common date i.e. the due date and they will always overlap as shown below:

transact %>%
  mutate(
    inv_due_interval = interval(Invoice, Due),
    due_pay_interval = interval(Due, Payment),
    overlaps         = int_overlaps(inv_due_interval, due_pay_interval)
  ) %>%
  select(Invoice, Due, Payment, overlaps)
## # A tibble: 2,466 x 4
##    Invoice    Due        Payment    overlaps
##    <date>     <date>     <date>     <lgl>   
##  1 2013-01-02 2013-02-01 2013-01-15 TRUE    
##  2 2013-01-26 2013-02-25 2013-03-03 TRUE    
##  3 2013-07-03 2013-08-02 2013-07-08 TRUE    
##  4 2013-02-10 2013-03-12 2013-03-17 TRUE    
##  5 2012-10-25 2012-11-24 2012-11-28 TRUE    
##  6 2012-01-27 2012-02-26 2012-02-22 TRUE    
##  7 2013-08-13 2013-09-12 2013-09-09 TRUE    
##  8 2012-12-16 2013-01-15 2013-01-12 TRUE    
##  9 2012-05-14 2012-06-13 2012-07-01 TRUE    
## 10 2013-07-01 2013-07-31 2013-07-26 TRUE    
## # ... with 2,456 more rows

Shift Interval

Intervals can be shifted too. In the below example, we shift the course interval by:

  • 1 day
  • 3 weeks
  • 1 year



course_interval <- interval(course_start, course_end)

# shift course_interval by 1 day 
int_shift(course_interval, by = days(1))
## [1] 2017-04-13 UTC--2017-04-22 UTC

# shift course_interval by 3 weeks
int_shift(course_interval, by = weeks(3))
## [1] 2017-05-03 UTC--2017-05-12 UTC

# shift course_interval by 1 year
int_shift(course_interval, by = years(1))
## [1] 2018-04-12 UTC--2018-04-21 UTC

Within

Let us assume that we have to attend a conference in April 2017. Does it clash with the course? We can answer this using %within% which will return TRUE if a date falls within an interval.



conference <- as_date('2017-04-15')
conference %within% course_interval
## [1] TRUE

How many invoices were settled within due date?

Let us use %within% to count the number of invoices that were settled within the due date. We will do this by:

  • creating an interval for the invoice and due date
  • check if the payment date falls within the above interval
transact %>%
  mutate(
    inv_due_interval = interval(Invoice, Due),
    overlaps         = Payment %within% inv_due_interval
  ) %>%
  select(Due, Payment, overlaps)
## # A tibble: 2,466 x 3
##    Due        Payment    overlaps
##    <date>     <date>     <lgl>   
##  1 2013-02-01 2013-01-15 TRUE    
##  2 2013-02-25 2013-03-03 FALSE   
##  3 2013-08-02 2013-07-08 TRUE    
##  4 2013-03-12 2013-03-17 FALSE   
##  5 2012-11-24 2012-11-28 FALSE   
##  6 2012-02-26 2012-02-22 TRUE    
##  7 2013-09-12 2013-09-09 TRUE    
##  8 2013-01-15 2013-01-12 TRUE    
##  9 2012-06-13 2012-07-01 FALSE   
## 10 2013-07-31 2013-07-26 TRUE    
## # ... with 2,456 more rows

Duration

Duration is timespan measured in seconds. To create a duration object, use duration(). The timespan can be anything from seconds to years but it will be represented as seconds. Let us begin by creating a duration object where the timespan is in seconds.

duration(50, "seconds")
## [1] "50s"

Another way to specify the above timespan is shown below:

duration(second = 50)
## [1] "50s"

As you can see, the output is same in both the cases. Let us increase the timespan to 60 seconds and see what happens.

duration(second = 60)
## [1] "60s (~1 minutes)"

Although the timespan is primarily measured in seconds, it also shows ~1 minutes in the brackets. As the length of the timespan increases i.e. the number becomes large, it is represented using larger units such as hours and days. In the below examples, as the number of seconds increases, you can observe larger units being used to represent the timespan.

# minutes
duration(minute = 50)
## [1] "3000s (~50 minutes)"
duration(minute = 60)
## [1] "3600s (~1 hours)"
# hours
duration(hour = 23)
## [1] "82800s (~23 hours)"
duration(hour = 24)
## [1] "86400s (~1 days)"

The following helper functions can be used to create duration objects as well.

# default
dseconds()
## [1] "1s"
dminutes()
## [1] "60s (~1 minutes)"
# seconds
duration(second = 59)
## [1] "59s"
dseconds(59)
## [1] "59s"
# minutes
duration(minute = 50)
## [1] "3000s (~50 minutes)"
dminutes(50)
## [1] "3000s (~50 minutes)"
# hours
duration(hour = 36)
## [1] "129600s (~1.5 days)"
dhours(36)
## [1] "129600s (~1.5 days)"
# weeks
duration(week = 56)
## [1] "33868800s (~1.07 years)"
dweeks(56)
## [1] "33868800s (~1.07 years)"

Let us use the above helper functions to get the course length in different units.


# course length in seconds 
course_interval / dseconds()
## [1] 777600

# course length in minutes
course_interval / dminutes()
## [1] 12960

# course length in hours
course_interval / dhours()
## [1] 216

# course length in weeks
course_interval / dweeks()
## [1] 1.285714

# course length in years
course_interval / dyears()
## [1] 0.02464066

Period

A period is a timespan defined in units such as years, months, and days. In the below examples, we use period() to represent timespan using different units.

# second
period(5, "second")
## [1] "5S"
period(second = 5)
## [1] "5S"
# minute & second
period(c(3, 5), c("minute", "second"))
## [1] "3M 5S"
period(minute = 3, second = 5)
## [1] "3M 5S"
# hour, minte & second
period(c(1, 3, 5), c("hour", "minute", "second"))
## [1] "1H 3M 5S"
period(hour = 1, minute = 3, second = 5)
## [1] "1H 3M 5S"
# day, hour, minute & second
period(c(3, 1, 3, 5), c("day", "hour", "minute", "second"))
## [1] "3d 1H 3M 5S"
period(day = 3, hour = 1, minute = 3, second = 5)
## [1] "3d 1H 3M 5S"



Let us get the course length in different units using as.period().

# course length in second
as.period(course_interval, unit = "seconds")
## [1] "777600S"

# course length in hours and minutes
as.period(course_interval, unit = "minutes")
## [1] "12960M 0S"

# course length in hours, minutes and seconds
as.period(course_interval, unit = "hours")
## [1] "216H 0M 0S"

time_length() computes the exact length of a timespan i.e. duration, interval or period. Let us use time_length() to compute the length of the course in different units.



# course length in seconds
time_length(course_interval, unit = "seconds")
## [1] 777600

# course length in minutes
time_length(course_interval, unit = "minutes")
## [1] 12960

# course length in hours
time_length(course_interval, unit = "hours")
## [1] 216

Round & Rollback

In this section, we will learn to round date/time to the nearest unit and roll back dates.

Rounding Dates

We will explore functions for rounding dates

  • to the nearest value using round_dates()
  • down using floor_date()
  • up using ceiling_date()

The unit for rounding can be any of the following:

  • second
  • minute
  • hour
  • day
  • week
  • month
  • bimonth
  • quarter
  • season
  • halfyear
  • and year

We will look at a few examples using round_date() and you will then practice using the other two functions.

# minute
round_date(release_date, unit = "minute")
## [1] "2019-12-12 08:05:00 UTC"
round_date(release_date, unit = "mins")
## [1] "2019-12-12 08:05:00 UTC"
round_date(release_date, unit = "5 mins")
## [1] "2019-12-12 08:05:00 UTC"
# hour
round_date(release_date, unit = "hour")
## [1] "2019-12-12 08:00:00 UTC"
# day
round_date(release_date, unit = "day")
## [1] "2019-12-12 UTC"

Rollback

Use rollback() if you want to change the date to the last day of the previous month or the first day of the month.

rollback(release_date)
## [1] "2019-11-30 08:05:03 UTC"

To change the date to the first day of the month, use the roll_to_first argument and set it to TRUE.

rollback(release_date, roll_to_first = TRUE)
## [1] "2019-12-01 08:05:03 UTC"

Your Turn

  • round up R release dates to hours
  • round down R release dates to minutes
  • rollback R release dates to the beginning of the month

Readings & References

*As the reader of this blog, you are our most important critic and commentator. We value your opinion and want to know what we are doing right, what we could do better, what areas you would like to see us publish in, and any other words of wisdom you are willing to pass our way.

We welcome your comments. You can email to let us know what you did or did not like about our blog as well as what we can do to make our post better.*

Email:

To leave a comment for the author, please follow the link and comment on their blog: Rsquared Academy Blog - Explore Discover Learn.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.