A slice of S&P 500 skewness history

Pat

10 years ago

[This article was first published on Portfolio Probe » R language, and kindly contributed to R-bloggers].


How symmetric are the returns of the S&P 500? How does the skewness change over time?

Previously

We looked at the predictability of kurtosis and skewness in S&P constituents. We didn’t see any predictability of skewness among the constituents. Here we look at skewness from a different angle.

The data

Daily log returns of the S&P 500 from 1950 to 2011 October 17 were lying about. It is log returns (rather than simple returns) that we would expect to be symmetric.
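A small illustration of why log returns, rather than simple returns, are the natural candidates for symmetry. The prices below are made up for the example and are not S&P 500 data.

```r
# Hypothetical closing prices, invented for illustration only.
prices <- c(100, 80, 100)

# Simple returns: a 20% fall needs a 25% rise to get back to the start.
simple_ret <- diff(prices) / head(prices, -1)

# Log returns: the same two moves are equal in size and opposite in sign.
log_ret <- diff(log(prices))

print(simple_ret)
print(log_ret)
```

A round trip in price sums to zero in log returns but not in simple returns, which is why asymmetry in log returns is the more interesting thing to look for.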

Skewness through time

Figure 1 shows the rolling 250-day skewness throughout the time period. Figure 2 shows an informal method (explained in Appendix R) of assessing the variability of the skewness statistic.

Figure 1: Rolling 250-day skewness of the S&P 500.

Figure 2: Rolling 250-day skewness of the S&P 500 (blue) with indication of its variability (gold).

Note that zero is almost everywhere covered in gold. There doesn’t seem to be much reason to suppose that skewness varies from zero through time.

It is also interesting that the gold doesn’t extend much beyond the spikes in the realized series.

Skewness variability

Rather than looking at skewness over time, we can treat the data as a sample. We get a value of skewness for all the data. We can then bootstrap to see the variability of that statistic. Figure 3 shows the bootstrap distribution for skewness. It looks quite funny. What is going on? (Hint: the location of the actual statistic — the gold line — is a clue.)

Figure 3: Bootstrap distribution of skewness for the S&P 500 from 1950 to 2011.

An explanation for the shape of the distribution in Figure 3 is that there is one datapoint that has significant influence on the statistic. The modes in the distribution correspond with how many times that datapoint is in the bootstrap sample. The right-most mode corresponds to zero occurrences, the second to right has one occurrence (including the actual data), and so on.
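The multi-modal shape can be reproduced on synthetic data. This is only a sketch under stated assumptions: the "returns" are simulated normal noise plus one extreme negative value, and skew is a stand-in moment-based function, not the pp.skew used for the figures.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Stand-in moment-based skewness; not the author's pp.skew.
skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Synthetic data: symmetric noise plus one extreme negative day (last position).
n <- 2000
x <- c(rnorm(n - 1, sd = 0.01), -0.20)

# Bootstrap by resampling positions so the outlier's appearances can be counted.
B <- 1000
idx <- replicate(B, sample.int(n, n, replace = TRUE))  # n x B matrix of positions
boot_skew <- apply(idx, 2, function(i) skew(x[i]))
n_outlier <- colSums(idx == n)

# Median bootstrap skewness for each count of the outlier.
med <- tapply(boot_skew, n_outlier, median)
print(med)
```

Each count of the outlier gets its own cluster of skewness values, and the clusters move further negative as the count rises. A histogram of boot_skew shows them as separate modes.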

Figure 4: Bootstrap distribution of skewness for the S&P 500 from 1950 to 2011 except for 1987-10-19.

Figure 4 shows the bootstrap when we leave out that one influential point: 1987 October 19. This plot suggests that there is probably some negative skewness, but not enough that we can say for sure. Plus we have left out one slightly important datapoint. It is important in terms of skewness because obviously the market can “accidentally” go down 20% in a day, but it seems highly unlikely that it would “accidentally” go up 20%.

Questions

It is parsimonious to hypothesize that skewness is always zero. That doesn’t mean it is right. Are there good ways to test that hypothesis?

Epilogue

What immortal hand or eye
Could frame thy fearful symmetry?

from “The Tyger” by William Blake


Appendix R

Here is an outline of how the computations were done.

skewness function

The skewness function can be found in kurtskew.R except that by now a bit of a change is in order. Using sd on a matrix has been deprecated in R version 2.14.x. So a substitution for sd(x) when x is a matrix is:

apply(x, 2, sd)
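For readers without kurtskew.R to hand, here is a minimal sketch of a skewness function that handles both vectors and matrices. The name skew_sketch is hypothetical and the formula (third central moment over the cube of the sample standard deviation) is one common definition. It is not claimed to match pp.skew.

```r
# Hypothetical helper, not the author's pp.skew: third central moment
# divided by the cube of the sample standard deviation.
skew_sketch <- function(x) {
  if (is.matrix(x)) {
    # Column-wise case: avoid sd(x) on a matrix and use apply() instead.
    centred <- sweep(x, 2, colMeans(x))
    colMeans(centred^3) / apply(x, 2, sd)^3
  } else {
    mean((x - mean(x))^3) / sd(x)^3
  }
}

# Made-up data: one symmetric column and one with a long left tail.
m <- cbind(sym = c(-2, -1, 0, 1, 2), left = c(-10, 0, 1, 1, 2))
print(skew_sketch(m))
print(skew_sketch(m[, "left"]))
```

The symmetric column gives zero and the left-tailed column gives a negative value, with the matrix and vector branches agreeing on the same column.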

rollapply

The rollapply function from the zoo package was used to get the data for Figure 1:

spx.skew250 <- rollapply(spxret, 250, pp.skew, align='right')

A hiccup: for some reason the result of this command had no names on it, and pp.timeplot needs names that are dates in order to draw the time axis. Since spxret is just a named vector, the default method of rollapply is the one that was used. It was easy enough to add the names afterwards:

names(spx.skew250) <- tail(names(spxret), length(spx.skew250))
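To see why tail gives the right labels, here is a base-R sketch of a right-aligned rolling computation on a toy vector. It uses no zoo, the dates and values are made up, and a rolling mean stands in for skewness so the numbers are easy to check by hand.

```r
# Toy named vector standing in for spxret; dates and values are made up.
ret <- c(0.01, -0.02, 0.03, -0.01, 0.02, -0.03)
names(ret) <- as.character(as.Date("2011-10-10") + 0:5)

width <- 4  # the article uses 250

# One value per complete window; window i covers positions i..(i + width - 1).
starts <- seq_len(length(ret) - width + 1)
roll_mean <- sapply(starts, function(i) mean(ret[i:(i + width - 1)]))

# Right alignment: label each value with the last date in its window.
names(roll_mean) <- tail(names(ret), length(roll_mean))
print(roll_mean)
```

With right alignment the first width - 1 dates have no complete window, so the result lines up with the final length(roll_mean) names of the input.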

variability through time

Here’s how the variability in Figure 2 was computed. At each time point, rollapply is given a bootstrap sample of the 250-day window rather than the actual data, and the whole rolling pass is repeated 20 times:

boot.skew250 <- vector('list', 20)

for(i in 1:20) boot.skew250[[i]] <- rollapply(spxret, 250, function(x){y <- sample(x, length(x), replace=TRUE);pp.skew(y)}, align='right')

boot.skew250mat <- do.call('cbind', boot.skew250)
rownames(boot.skew250mat) <- names(spx.skew250)

pp.timeplot(boot.skew250mat, div="decade", lty=1, col="gold")

bootstrap

The sample bootstrapping was just:

bootskew1 <- rep(NA, 1000)

for(i in 1:1000) bootskew1[i] <- pp.skew(sample(spxret, 15548, replace=TRUE))
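The same kind of loop can be written more compactly with replicate, letting sample pick up the sample size from the data instead of hard-coding it. This is a sketch on simulated data: x stands in for spxret and skew for pp.skew, and both are assumptions made for the example.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Stand-in skewness function; the article uses pp.skew from kurtskew.R.
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Synthetic symmetric "returns" in place of spxret.
x <- rnorm(1000, sd = 0.01)

# sample() with replace = TRUE draws length(x) values by default.
boot_skew <- replicate(1000, skew(sample(x, replace = TRUE)))

# Percentile interval: a rough gauge of the statistic's variability.
print(quantile(boot_skew, c(0.025, 0.975)))
```

The quantiles give a quick numerical summary of the spread that the histograms in Figures 3 and 4 show graphically.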
