Brian Peterson and I had a chance to visit the University of Washington a couple of weeks ago at the behest of Doug Martin, where we gave a seminar covering various R packages we’ve written. Here are the slides we used.
We also spent quite a bit of time with Doug, Eric Zivot, Guy Yollin, David Carino, and others. We had some very good working sessions, mainly around this summer’s Google Summer of Code (GSoC) projects. I’ll talk about those projects as the summer progresses, but for now I wanted to post the slides from our seminar and say a bit about them.
For the seminar, Doug asked us to provide a one-hour overview of some of the R packages we’ve been working on. That was something we hadn’t done in quite a while, so it was a good opportunity to pull together a few of our favorite applications that we had previously shown separately. We thought it might help tie them together if we provided a bit more about the framework we’ve used to organize our thinking over time. So we dusted off some graphics that we used several years ago to frame the discussion.
After some reflection, I think that this framework is still useful.
The framework begins with a stylized view of an investment management business and the core business processes that cover innovation, production, compliance, and distribution. My focus, naturally, has been on functionality that supports the research and investment components of the business. That focus can be (and has been, in different contexts) decomposed into sub-processes covering everything from idea generation through implementation and monitoring.
The process view has its limitations, of course, because at the core of any investment business is a set of recurring decisions that need to be made. The decisions that get made are not easily contained within a sub-process — a view on risk can be as relevant to idea generation or portfolio construction as it is to risk remediation. So different tools for analyzing and supporting those decisions end up being used over and over again, in slightly different ways.
The whole purpose of developing these packages in R, then, is to provide tools that help people make high-quality decisions efficiently and effectively, wherever those decisions occur in the investment process. That requires more than tools, of course. Decision-making is itself a process. Users need decision-focused information to build evidence and confidence. They need to make decisions consistently, and to receive and assess feedback about the quality of those decisions quickly. Context does matter, of course, but it is usually better developed by the user than by the tool developer (hence the success of Excel and end-user computing in finance).
With some key decisions outlined above, a number of capabilities seem useful. Several years ago we decomposed those capabilities into applications, and then broke the applications down further to provide a functional view. In updating these slides, it was a pleasant surprise to see how much progress we and the broader R community have made in filling in that functionality. A fair amount remains to be done, certainly.
The rest of the slides discuss three specific applications built using the R toolchain. The first is returns-based performance analysis, in this case examining a hedge fund against a set of peers. The second examines the construction of a portfolio of hedge fund indexes. And the third is a backtest, in this case a simple trend-following strategy.
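The slides don’t include code for these examples, but to give a flavor of the first application, here is a minimal sketch of a returns-based peer comparison. It assumes the PerformanceAnalytics package, and uses its bundled edhec hedge fund index data as an illustrative stand-in for real fund and peer returns.

# A minimal sketch: compare one fund against a small peer group using
# PerformanceAnalytics, with the edhec index data standing in for real returns.
library(PerformanceAnalytics)

data(edhec)  # monthly returns for thirteen hedge fund style indexes
fund  <- edhec[, "Equity Market Neutral"]
peers <- edhec[, c("Event Driven", "Relative Value", "Merger Arbitrage")]

# Cumulative return, drawdown, and per-period return panels
charts.PerformanceSummary(cbind(fund, peers), main = "Fund vs. peers")

# Annualized return, standard deviation, and Sharpe ratio, side by side
table.AnnualizedReturns(cbind(fund, peers))

Keeping the fund and its peers as an ordinary time series of returns means the same comparison works regardless of where the data come from.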
I should point out that these applications bridge two different data contexts. In developing this framework, we decided to separate the returns-and-weights context from the prices-and-transactions context. That separation has helped us scope projects, although we will eventually provide functionality for bridging the two contexts seamlessly. The two contexts already work well together through a package for time series data and another for meta-data definition.
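To illustrate how closely the two contexts sit together, here is a minimal sketch. It assumes xts as the kind of time-series package referred to above, and the prices are made up purely for illustration.

# A minimal sketch of the shared time-series representation, assuming xts;
# the prices below are invented for illustration only.
library(xts)

dates  <- as.Date("2011-01-31") + cumsum(c(0, 28, 31, 30, 31))
prices <- xts(c(100, 102, 101, 105, 107), order.by = dates)

# Moving from the prices-and-transactions context to the
# returns-and-weights context is a calculation on the same time index
returns <- diff(prices) / lag.xts(prices)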
So that’s a bit of description about how we came to develop much of the functionality available today. This framework continues to be useful for identifying needs and scoping new projects, as I hope you will see as the summer progresses.