
Timezone conversion in R

[This article was first published on R – NPD team, and kindly contributed to R-bloggers.]

Time zone conversion is a common task in practice, and the simplest approach is to use R's built-in functions rather than writing your own functions to handle the hour offsets, especially in a country like Australia, where there are seven different local time zones.

Data preparation

First, let’s create a data frame:

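df <- data.frame(
   actualtime = c(
      '2015-04-15 13:10:00',
      '2015-04-15 14:22:00',
      '2015-04-15 10:14:00'),
   timezone = c(
      'Australia/Sydney',
      'Australia/Perth',
      'Australia/Perth'),
   # keep both columns as character (the default before R 4.0.0 was factor)
   stringsAsFactors = FALSE)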

Let’s assume the value in the actualtime column is always in the Australia/Sydney time zone, and the value in the timezone column is the local time zone we want to convert to.

First attempt

Next, let’s test the as.POSIXct and as.POSIXlt functions for time zone conversion. Keep in mind that when converting a POSIXct value to a different time zone, you need to use as.POSIXlt at the conversion step:

# parse the character timestamp as a Sydney time
raw <- as.POSIXct(strptime(
   df$actualtime[2],
   format = "%Y-%m-%d %H:%M:%S",
   tz = "Australia/Sydney"),
   tz = "Australia/Sydney")
# express the same instant in the target time zone
converted <- as.POSIXlt(raw, tz = df$timezone[2])

Since we took the second row of df, we expect a two-hour difference between the raw and converted values (Sydney time versus Perth time), and this can be verified:

> raw
[1] "2015-04-15 14:22:00 AEST"
> converted
[1] "2015-04-15 12:22:00 AWST"
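To make explicit what the conversion does and does not change, here is a minimal sketch using the same Sydney timestamp as above. The names same_instant, sydney_clock and perth_clock are introduced here for illustration and are not part of the original article.

```r
# Parse a Sydney wall-clock time, then view the same instant in Perth.
raw <- as.POSIXct("2015-04-15 14:22:00", tz = "Australia/Sydney")
converted <- as.POSIXlt(raw, tz = "Australia/Perth")

# The underlying instant (seconds since the epoch) is unchanged.
same_instant <- as.numeric(raw) == as.numeric(as.POSIXct(converted))

# Only the printed wall-clock time differs between the two zones.
sydney_clock <- format(raw, "%H:%M")        # "14:22"
perth_clock  <- format(converted, "%H:%M")  # "12:22"
```

In other words, as.POSIXlt with a tz argument changes how the instant is displayed, not the instant itself.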

Using the method above, we can then build a new data frame with the converted values appended as a column:

ts_df <- do.call(rbind, lapply(seq_len(nrow(df)), function(i) {
   tz <- df$timezone[i]
   raw <- as.POSIXct(strptime(
      df$actualtime[i],
      format = "%Y-%m-%d %H:%M:%S",
      tz = "Australia/Sydney"),
      tz = "Australia/Sydney")
   ts <- as.POSIXlt(raw, tz = tz)
   data.frame(raw = raw, tz = tz, converted = ts)
}))

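The step above puts each converted result into a data frame containing only one row, then rbinds all of them to produce ts_df. However, the result is not what we expected. A date-time column in a data frame carries a single time zone attribute for the whole column, so the per-row time zones do not survive the rbind.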
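One way to investigate a result like this is to inspect the time zone attribute of the combined column. The sketch below is an illustration under simplified assumptions, not the article's original code; row_sydney, row_perth and combined are names introduced here.

```r
# The same instant, expressed in two different time zones.
sydney <- as.POSIXct("2015-04-15 14:22:00", tz = "Australia/Sydney")
row_sydney <- data.frame(converted = as.POSIXlt(sydney, tz = "Australia/Sydney"))
row_perth  <- data.frame(converted = as.POSIXlt(sydney, tz = "Australia/Perth"))

# Combine the one-row data frames, as in the first attempt.
combined <- rbind(row_sydney, row_perth)

# A date-time column has at most one "tzone" attribute for all of its rows.
column_tz <- attr(combined$converted, "tzone")

# Both rows are therefore printed in the same zone and show the same clock time.
printed <- format(combined$converted, "%H:%M")
```

Because the column can only be displayed in one time zone, the Perth row no longer shows Perth time after the rows are combined.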

Second and successful attempt

It looks like the method above doesn’t work, so let’s try another way.

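ts_df <- do.call(rbind, lapply(seq_len(nrow(df)), function(i) {
   tz <- df$timezone[i]
   raw <- as.POSIXct(strptime(
      df$actualtime[i],
      format = "%Y-%m-%d %H:%M:%S",
      tz = "Australia/Sydney"),
      tz = "Australia/Sydney")
   # render the instant as local wall-clock text, e.g. "2015-04-15 12:22:00 AWST"
   ts <- format(raw, tz = tz, usetz = TRUE)
   # re-parse the text; the result is labelled with the session time zone
   data.frame(raw = raw, tz = tz, converted = as.POSIXct(ts))
}))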

This time the timestamps in the converted column are successfully shifted to the local time of each row.

Conclusion

The approach used here is to format the timestamp in the local time zone as a character string, and then convert that string back to a timestamp type.
As you may notice, the converted timestamp (as.POSIXct(ts)) is actually labelled with the session's local time zone, which is Australia/Sydney in this case.
However, the time zone attached to the timestamp doesn’t really matter here; all that matters is that the clock time has been offset correctly and can be used by other processes.
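If only the local clock time is needed downstream, a variation on the same idea is to stop at the character step and keep the formatted text as its own column. This is a minimal sketch under that assumption, not the author's code; target_tz, local_clock and local_df are names introduced here.

```r
# Parse all timestamps as Sydney times in one vectorised call.
raw <- as.POSIXct(
   c("2015-04-15 13:10:00", "2015-04-15 14:22:00", "2015-04-15 10:14:00"),
   tz = "Australia/Sydney")
target_tz <- c("Australia/Sydney", "Australia/Perth", "Australia/Perth")

# format() takes a single time zone per call, so format element by element.
local_clock <- vapply(seq_along(raw), function(i) {
   format(raw[i], format = "%Y-%m-%d %H:%M:%S", tz = target_tz[i])
}, character(1))

# Character columns are not affected by the one-time-zone-per-column limit.
local_df <- data.frame(raw = raw, tz = target_tz, local_clock = local_clock,
                       stringsAsFactors = FALSE)
```

Keeping the local time as text avoids relabelling a Perth clock time with the session time zone, at the cost of losing date-time arithmetic on that column.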
