Pairs Trading Issues
[This article was first published on Eran Raviv » R, and kindly contributed to R-bloggers].
A few words for those of you who are not familiar with the “pairs trading” concept. First, you should understand that the movement of every stock is dominated not by the company's performance but by the general market movement. This is the origin of many “factor models”: the factor that drives every stock is the market factor, which is approximated by the S&P index in most cases. So, no matter how great a company I think Amazon (AMZN) is, it will not withstand any large market downturn without getting chopped itself. What a conservative player (not to say coward..) such as myself might do is to “net out” this factor from the equation. I can long AMZN and short another company, or the index itself, in the right amount so that I have exposure “only” to the intrinsic AMZN movement. Say I did just that: bought AMZN and sold the S&P index (SPY). If the index goes up, I am losing since I am short it, but I hope AMZN will go up enough to more than compensate for my loss on the index: AMZN should go up once because the market went up, and once because it's a good company. In the reverse case the index goes down, so I win on that leg since I am short the index, and I hope AMZN will not decline so much as to eat all my profits: AMZN should decline because of the market, but go up because it's a good company. That way, I express my views about AMZN without taking on the factor/market exposure. The term “pairs trade” comes from being long and short a pair of stocks. That was a plain explanation of what pairs trading is.
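A minimal sketch of the netting-out idea, using simulated returns rather than real AMZN/SPY data. The names `mkt`, `stock` and `hedged`, the true beta of 1.3 and the noise levels are all assumptions made for illustration. It estimates the market exposure by regression and checks that the long-stock/short-market position no longer moves with the market in-sample.

```r
set.seed(1)
n     <- 500
mkt   <- rnorm(n, mean = 0, sd = 0.01)         # simulated market (index) returns
idio  <- rnorm(n, mean = 0.0005, sd = 0.015)   # simulated stock-specific returns
stock <- 1.3 * mkt + idio                      # stock = beta * market + its own movement

beta_hat <- unname(coef(lm(stock ~ mkt))[2])   # estimated market exposure
hedged   <- stock - beta_hat * mkt             # long 1 of the stock, short beta_hat of the market

beta_hat
cor(stock, mkt)    # the unhedged position still moves with the market
cor(hedged, mkt)   # essentially zero in-sample, by construction of least squares
```

The zero correlation holds only in the sample used for estimation; whether it survives out of sample depends on how stable the estimated exposure is.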
It suits me just fine: I can volume up without the horrific P&L swings I used to endure when I was more stupid. I found many pairs that should co-move and went shopping with the revenues that no doubt were soon to flow in. Imagine my surprise when things did not go my way.
The upper plot is the estimation based on prices; it shows I should long 1.82 GLD for every 1 GDX. The bottom plot shows the same estimation based on returns; here I should hold twice as much GDX, since every percent in GDX is followed on average by only 0.433% in GLD.
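To see why the two estimates need not agree, here is a small sketch on simulated data (not GLD/GDX; the series names, starting prices and the return beta of 2 are assumptions). The slope from price levels is in units of one price per unit of the other, while the slope from returns is in percent per percent, so the same pair yields two different numbers.

```r
set.seed(2)
n       <- 250
ret_a   <- rnorm(n, mean = 0, sd = 0.01)               # simulated daily returns of asset A
ret_b   <- 2 * ret_a + rnorm(n, mean = 0, sd = 0.01)   # B moves 2% per 1% of A, plus noise
price_a <- 100 * cumprod(1 + ret_a)                    # price paths built from the returns
price_b <- 50 * cumprod(1 + ret_b)

beta_prices  <- unname(coef(lm(price_b ~ price_a))[2]) # slope in price units
beta_returns <- unname(coef(lm(ret_b ~ ret_a))[2])     # slope in return units

c(prices = beta_prices, returns = beta_returns)
```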
What’s more, the aforementioned regression is infected with the underlying assumption that the right-hand-side variable is fixed while only the left-hand-side variable is random, that is, only it has an error term. In fact, \( stock_b \) is also random, so when we switch the variables in the regression, plugging GDX on the “Y” side, we get different results:
This is disturbing: the amount I should trade is determined by the order in which I plug in the variables?? Does not sound like a money machine to me. Remember, I do not care that GLD is the one dragging GDX (gold is dragging gold miners and not the reverse); all I am saying is that GLD is not a given constant, but a random variable in its own right.
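The order dependence is easy to reproduce on simulated data (the variables `x` and `y` and the slope of 0.5 are assumptions for illustration). Regressing `y` on `x` and inverting the slope of `x` on `y` give two different hedge ratios; the product of the two fitted slopes equals the squared correlation, so they agree only when the fit is perfect.

```r
set.seed(3)
n <- 300
x <- rnorm(n)
y <- 0.5 * x + rnorm(n, sd = 0.5)

b_yx <- unname(coef(lm(y ~ x))[2])   # units of y per unit of x
b_xy <- unname(coef(lm(x ~ y))[2])   # units of x per unit of y

# Two candidate answers to "how much y per unit of x?"
c(y_on_x = b_yx, implied_by_x_on_y = 1 / b_xy)

# The two slopes are tied together by the squared correlation
all.equal(b_yx * b_xy, cor(x, y)^2)
```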
To make matters more interesting, \( \widehat{\beta_1} \) is not constant over time, so I have no idea how many observations to use. Have a look:
This is of course the case for returns as well, and also if you reverse the order of the LHS and RHS variables. You can copy-paste the code and try it yourself; it is pretty much stand-alone.
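A sketch of the same effect without downloading anything, on simulated data where the true beta is assumed to drift from 0.5 to 1.5 over the sample. The window lengths mirror `seq1` in the code, but here they count observations rather than calendar days.

```r
set.seed(4)
n      <- 365
beta_t <- seq(0.5, 1.5, length.out = n)   # assumed: the true beta drifts upward over time
x      <- rnorm(n)
y      <- beta_t * x + rnorm(n, sd = 0.5)

windows <- c(30, 90, 180, 365)            # look-back lengths, in observations
beta_by_window <- sapply(windows, function(w) {
  idx <- (n - w + 1):n                    # the most recent w observations
  unname(coef(lm(y[idx] ~ x[idx]))[2])
})
names(beta_by_window) <- paste0("last_", windows)
beta_by_window
```

Each window gives a different estimate of "the" beta, and nothing in the regression itself says which one to trade on.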
Possible solutions are to think about your investment horizon; for example, if you plan to hold it for a few months you can use the 365-day beta. I also tried weighting the observations so that the most recent get more weight, and other such variations, but did not reach any satisfactory rule for how much I should hold of each.
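One way such a weighting could look, as a sketch only: exponentially decaying weights passed to `lm()` through its `weights` argument. The data are simulated with an assumed drifting beta, and the half-life of 60 observations is an arbitrary choice for illustration, not a recommendation.

```r
set.seed(5)
n      <- 365
beta_t <- seq(0.5, 1.5, length.out = n)   # assumed: the true beta drifts upward over time
x      <- rnorm(n)
y      <- beta_t * x + rnorm(n, sd = 0.5)

half_life <- 60                           # assumed: a weight halves every 60 observations
age       <- (n - 1):0                    # 0 = most recent observation
w         <- 0.5^(age / half_life)        # the newest point gets weight 1

beta_equal    <- unname(coef(lm(y ~ x))[2])               # all observations count the same
beta_weighted <- unname(coef(lm(y ~ x, weights = w))[2])  # recent observations count more

c(equal = beta_equal, weighted = beta_weighted)
```

The weighted estimate leans toward the recent relation, but the choice of half-life simply replaces the choice of window length, so the original question remains.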
In theory, there is a strong relation between theory and practice, but in practice there is not. I showed here a few of the problems in pairs trading. Firstly, we do not know which measure to use for estimating the relation, prices or returns. Secondly, we do not know which time frame to use, and since the relation is not constant, it does matter. Lastly, the assumptions underlying the estimation procedure are false and invalidate whatever you hoped to feel comfortable with. As always, code and references are given below. Thanks for reading.
Quantitative Trading: How to Build Your Own Algorithmic Trading Business (Wiley Trading), by Ernie Chan
When Genius Failed: The Rise and Fall of Long-Term Capital Management, by Roger Lowenstein
Applied Quantitative Methods for Trading and Investment (The Wiley Finance Series)
Pairs Trading: Quantitative Methods and Analysis (Wiley Finance), by Ganapathy Vidyamurthy
library(quantmod)
library(PerformanceAnalytics)
tckr <- c("GLD", "GDX")
seq1 = c(30, 90, 180, 365)               # look-back windows, in calendar days
end <- format(Sys.Date(), "%Y-%m-%d")
start = character(length(seq1))          # start dates, one per window; must exist before start[ind] is assigned
Tickers = array(dim = c(260, 4, 2))      # prices: observation x window x ticker
Tickersret = array(dim = c(260, 4, 2))   # returns, same layout
for (j in 1:2){
  for (i in seq1){
    ind = match(i, seq1)
    start[ind] <- format(Sys.Date() - i, "%Y-%m-%d")
    dat0 = getSymbols(tckr[j], src = "yahoo", from = start[ind], to = end, auto.assign = FALSE)
    n = NROW(dat0)
    cl = as.numeric(dat0[, 4])                        # closing prices
    ret = (cl[2:n] - cl[1:(n - 1)]) / cl[1:(n - 1)]   # simple daily returns on the close
    Tickers[1:n, ind, j] = as.numeric( (dat0[,4] + dat0[,1] + (dat0[,2] + dat0[,3])/2)/3 ) # average price
    Tickersret[1:(n - 1), ind, j] = ret
  }
}
## Plot of prices:
par(mfrow = c(2,2))
for (i in 1:4){
  plot(na.omit(Tickers[,i,1])/na.omit(Tickers[1,i,1]), ty = "b", ylim = c(.65,1.35),
       main = paste('Last', seq1[i], 'days'), ylab = "Return", xlab = "Time")
  points(na.omit(Tickers[,i,2])/na.omit(Tickers[1,i,2]), ty = "b", col = 2)
  legend('topright', legend = c(paste(tckr[1]), paste(tckr[2])), bty = "n", col = c(1:2), pch = 1)
}
## Plot of Beta return vs prices:
i = 4
par(mfrow = c(2,1))
plot(na.omit(Tickers[,i,2]) ~ na.omit(Tickers[,i,1]), ty = "p",
     main = paste('Beta for the last', seq1[i], 'days', "=",
                  format(as.numeric(lm(na.omit(Tickers[,i,2]) ~ na.omit(Tickers[,i,1]))$coef[2]), digits = 3)),
     ylab = paste(tckr[2]), xlab = paste(tckr[1]))
abline(lm(na.omit(Tickers[,i,2]) ~ na.omit(Tickers[,i,1])), col = 2, lwd = 3)
plot(na.omit(Tickersret[,i,2]) ~ na.omit(Tickersret[,i,1]), ty = "p",
     main = paste('Beta for the last', seq1[i], 'days', "=",
                  format(as.numeric(lm(na.omit(Tickersret[,i,2]) ~ na.omit(Tickersret[,i,1]))$coef[2]), digits = 3)),
     ylab = paste(tckr[2]), xlab = paste(tckr[1]))
abline(lm(na.omit(Tickersret[,i,2]) ~ na.omit(Tickersret[,i,1])), col = 2, lwd = 3)
## Plots of beta over time:
par(mfrow = c(2,2))
for (i in 1:4){
  plot(na.omit(Tickers[,i,1]) ~ na.omit(Tickers[,i,2]), ty = "p",
       main = paste('Beta for the last', seq1[i], 'days', "=",
                    format(as.numeric(lm(na.omit(Tickers[,i,1]) ~ na.omit(Tickers[,i,2]))$coef[2]), digits = 3)),
       ylab = paste(tckr[1]), xlab = paste(tckr[2]))
  abline(lm(na.omit(Tickers[,i,1]) ~ na.omit(Tickers[,i,2])), col = 2, lwd = 3)
}



