
xts_0.13.0 on CRAN

[This article was first published on R on FOSS Trading, and kindly contributed to R-bloggers.]

An updated version of xts is now on CRAN. This release adds several exciting changes: open-ended time-of-day subsetting; smarter conversions to xts from data.frames, data.tables, and tibbles; custom endpoint values in to.period(); row truncation in print(), like data.table; and more informative str() output. There are also changes to make xts more consistent with zoo, some minor speed improvements, and the usual smattering of bug fixes.

For some reason, I decided it was a good idea to go through the oldest GitHub issues and determine whether they should be fixed or closed without being fixed. Some of them have been open since xts was still on R-Forge! The oldest issue fixed in this release was opened on 2013-09-15, and another one was opened on 2014-03-09! Better late than never, I guess. 😉

New Features

The coolest new feature is the ability to use open-ended ranges for time-of-day subsetting. So you can subset by time of day from the start/end of the day without providing the start/end times (i.e. 00:00:00.000/23:59:59.999). Thanks to Chris Katsulis for the suggestion! (#243)

Here’s an example:

# an hourly sequence of times, and an xts object using them
times <- timeBasedSeq("2023-02-01/2023-02-05/H")
x <- xts(seq_along(times), times)

# function to show the first and last index values for each day
index_range_by_day <- function(x) {
    by_day <- split(x, "days")
    index_range <- function(y) {
        paste(start(y), end(y), sep = " / ")
    }
    lapply(by_day, index_range)
}

# between the start of the day and 6pm
index_range_by_day(x["/T1800"])
## $`2023-02-01`
## [1] "2023-02-01 / 2023-02-01 18:00:00"
## 
## $`2023-02-02`
## [1] "2023-02-02 / 2023-02-02 18:00:00"
## 
## $`2023-02-03`
## [1] "2023-02-03 / 2023-02-03 18:00:00"
## 
## $`2023-02-04`
## [1] "2023-02-04 / 2023-02-04 18:00:00"
## 
## $`2023-02-05`
## [1] "2023-02-05 / 2023-02-05 18:00:00"

# between 5am and the end of the day
index_range_by_day(x["T0500/"])
## $`2023-02-01`
## [1] "2023-02-01 05:00:00 / 2023-02-01 23:00:00"
## 
## $`2023-02-02`
## [1] "2023-02-02 05:00:00 / 2023-02-02 23:00:00"
## 
## $`2023-02-03`
## [1] "2023-02-03 05:00:00 / 2023-02-03 23:00:00"
## 
## $`2023-02-04`
## [1] "2023-02-04 05:00:00 / 2023-02-04 23:00:00"
## 
## $`2023-02-05`
## [1] "2023-02-05 05:00:00 / 2023-02-05 23:00:00"
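
As a quick cross-check of the open-ended ranges, here is a minimal sketch that rebuilds the same hourly series and lists which hours of the day survive each subset. It assumes xts 0.13.0 or later and uses .indexhour(), which returns the hour (0-23) of each index value. The variable names are only for illustration.

```r
library(xts)

# same hourly series as in the example above
times <- timeBasedSeq("2023-02-01/2023-02-05/H")
x <- xts(seq_along(times), times)

# hours of the day kept by each open-ended subset
hours_until_1800 <- sort(unique(.indexhour(x["/T1800"])))
hours_from_0500 <- sort(unique(.indexhour(x["T0500/"])))

# expected: hours 0 through 18
range(hours_until_1800)

# expected: hours 5 through 23
range(hours_from_0500)
```

Both ends of the range are inclusive here, which matches the output above: the 18:00:00 and 05:00:00 observations are kept.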

You can now pass custom endpoints to to.period() using the ‘period’ argument. So you can aggregate on something other than the times that endpoints() supports. Thanks to Ethan B. Smith for the suggestion! (#302)

data(sample_matrix)
x <- as.xts(sample_matrix)

# aggregate to OHLC by week ending on Sunday (what endpoints(x, "weeks") returns)
week_fri <- to.period(x, endpoints(x, "weeks"))
head(week_fri)
##              x.Open   x.High    x.Low  x.Close
## 2007-01-07 50.03978 50.42188 49.95041 49.99185
## 2007-01-14 50.03555 50.62395 49.80454 50.60145
## 2007-01-21 50.61724 50.77336 50.02142 50.42090
## 2007-01-28 50.36008 50.43875 49.87468 49.88096
## 2007-02-04 49.85624 50.55509 49.76308 50.55509
## 2007-02-11 50.52389 50.91776 50.45977 50.91160

# aggregate to OHLC by week ending on Wednesday
wednesdays <- which(.indexwday(x) == 3)
week_wed <- to.period(x, wednesdays)
head(week_wed)
##              x.Open   x.High    x.Low  x.Close
## 2007-01-03 50.03978 50.42188 49.95041 50.39767
## 2007-01-10 50.42096 50.42096 49.80454 49.97246
## 2007-01-17 49.88529 50.77336 49.88529 50.48644
## 2007-01-24 50.48051 50.60712 50.02142 50.23145
## 2007-01-31 50.20738 50.28268 49.76308 50.22578
## 2007-02-07 50.22448 50.71661 50.19101 50.60611
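
Any increasing vector of row numbers can serve as the endpoints. The sketch below uses a made-up rule (close each period on the 15th of the month) purely for illustration. It assumes .indexmday(), which returns the day of the month for each index value, and passes the result by name through the 'period' argument. How observations after the final endpoint are grouped is not covered above, so check the tail of the result before relying on it.

```r
library(xts)

data(sample_matrix)
x <- as.xts(sample_matrix)

# hypothetical custom rule: each period ends on the 15th of the month
# .indexmday() gives the day of the month (1-31) for each observation
mid_month <- which(.indexmday(x) == 15)

# pass the row numbers through the 'period' argument
ohlc_mid_month <- to.period(x, period = mid_month)
head(ohlc_mid_month)
```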

Enhancements

The release also contains some quality-of-life changes to print() and str(). Now print() only shows the first and last ‘show.rows’ rows (default 10) if the object has more than ‘max.rows’ rows (default 100), similar to data.table (#321).

data(sample_matrix)
x <- as.xts(sample_matrix)
x
##                Open     High      Low    Close
## 2007-01-02 50.03978 50.11778 49.95041 50.11778
## 2007-01-03 50.23050 50.42188 50.23050 50.39767
## 2007-01-04 50.42096 50.42096 50.26414 50.33236
## 2007-01-05 50.37347 50.37347 50.22103 50.33459
## 2007-01-06 50.24433 50.24433 50.11121 50.18112
## 2007-01-07 50.13211 50.21561 49.99185 49.99185
## 2007-01-08 50.03555 50.10363 49.96971 49.98806
## 2007-01-09 49.99489 49.99489 49.80454 49.91333
## 2007-01-10 49.91228 50.13053 49.91228 49.97246
## 2007-01-11 49.88529 50.23910 49.88529 50.23910
##        ...                                    
## 2007-06-21 47.71012 47.71012 47.61106 47.62921
## 2007-06-22 47.56849 47.59266 47.32549 47.32549
## 2007-06-23 47.22873 47.24771 47.09144 47.24771
## 2007-06-24 47.23996 47.30287 47.20932 47.22764
## 2007-06-25 47.20471 47.42772 47.13405 47.42772
## 2007-06-26 47.44300 47.61611 47.44300 47.61611
## 2007-06-27 47.62323 47.71673 47.60015 47.62769
## 2007-06-28 47.67604 47.70460 47.57241 47.60716
## 2007-06-29 47.63629 47.77563 47.61733 47.66471
## 2007-06-30 47.67468 47.94127 47.67468 47.76719
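
If the defaults do not suit, the two thresholds can be changed per call. This is a minimal sketch that assumes ‘show.rows’ and ‘max.rows’ are accepted as arguments to print() for xts objects, as their names above suggest. Check ?print.xts in your installed version.

```r
library(xts)

data(sample_matrix)
x <- as.xts(sample_matrix)  # 180 rows, so above the default 'max.rows'

# show only the first and last 3 rows
print(x, show.rows = 3)

# raise the threshold to the number of rows so nothing is truncated
print(x, max.rows = nrow(x))
```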

Now str() outputs more descriptive information for xts objects. It differentiates between xts objects that are empty (no data and a zero-length index), zero-width (no data but has index values), or zero-length (no rows of data and a zero-length index, but has a column dimension and may have column names). It also adds column names to the output. (#168, #378)

empty <- numeric()

# empty -- no data and zero-length index
str(.xts(NULL, empty))
## An empty xts object 
##   Data:    double [0, 0]
##   Index:   POSIXct,POSIXt [0] (TZ: "")

# zero length -- no rows of data and a zero-length index,
#     but has a column dimension and may have column names
str(.xts(empty, empty))
## A zero-length xts object 
##   Data:    double [0, 1]
##   Index:   POSIXct,POSIXt [0] (TZ: "")

zero_length_with_colnames <-
    .xts(matrix(empty, dimnames = list(NULL, "zero")), empty)
str(zero_length_with_colnames)
## A zero-length xts object 
##   Data:    double [0, 1]
##   Columns: zero
##   Index:   POSIXct,POSIXt [0] (TZ: "")

# zero width -- no data and has index values
str(xts(NULL, Sys.Date()))
## A zero-width xts object on 2023-02-21 / 2023-02-21 containing:
##   Data:    double [0, 0]
##   Index:   Date [1] (TZ: "UTC")

There’s a nice improvement to as.xts() for data.frame and similar objects (e.g. data.table, tibble). It will look for a time-based column in the data.frame if it cannot create an index from the row names. (#381)

d <- data.frame(as.Date("2023-02-21"), A = 21, B = 42)
as.xts(d)
##             A  B
## 2023-02-21 21 42
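
The same conversion with more than one row, next to the explicit construction it replaces. This is a sketch with made-up data. The column names and values are hypothetical, and it assumes the data.frame has default row names so that no index can be built from them.

```r
library(xts)

# hypothetical data: a Date column plus two numeric columns
d <- data.frame(
    date = as.Date("2023-02-21") + 0:2,
    A = c(21, 22, 23),
    B = c(42, 43, 44)
)

# the index is taken from the time-based column
auto <- as.xts(d)

# explicit construction, for comparison
manual <- xts(as.matrix(d[, c("A", "B")]), order.by = d$date)

# compare the data, ignoring attributes
all.equal(coredata(auto), coredata(manual), check.attributes = FALSE)
```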

This release also includes a new xts method for na.fill() that significantly increases performance when ‘fill’ is a scalar. It also adds a startup warning that dplyr::lag() breaks method dispatch, which means calls to lag(my_xts) won’t work any more, and suggests a couple of ways to work around that breakage.
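
A short sketch of both points, using a small made-up series. The scalar ‘fill’ case is the one the new method targets. Calling stats::lag() with an explicit namespace is one way to make sure dplyr::lag() is not picked up. It is an assumption here that this matches one of the workarounds the startup warning suggests.

```r
library(xts)

# hypothetical series with two missing values
x <- xts(c(1, NA, 3, NA, 5), as.Date("2023-02-01") + 0:4)

# replace every NA with a single scalar value
filled <- na.fill(x, fill = 0)

# explicit namespace: dispatches to the xts method even if dplyr is attached
lagged <- stats::lag(x)

merge(x, filled, lagged)
```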

Bug Fixes

Other

Postscript

I look forward to your questions and feedback! If you have a question, please ask on Stack Overflow and use the [r] and [xts] tags. Or you can send an email to the R-SIG-Finance mailing list (you must subscribe to post). Open an issue on GitHub if you find a bug or want to request a feature. Please read the contributing guide first! It will help save time for both of us. 😉

If you love using my open-source work (e.g. quantmod, TTR, xts, IBrokers, microbenchmark, etc.), you can give back by sponsoring me on GitHub. I truly appreciate anything you’re willing and able to give!
