Lomb-Scargle periodogram for unevenly sampled time series
[This article was first published on me nugget, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The above figure shows a Lomb-Scargle periodogram of a time series of sunspot activity (1749-1997) with 50% of monthly values missing. As expected (link1, link2), the periodogram displays a a highly significant maximum peak at a frequency of ~11 years.
The function comes from a nice set of functions that I found here: http://research.stowers-institute.org/efg/2005/LombScargle/R/index.htm. An accompanying paper focusing on its application to time series of gene expression can be found here.
Below is a comparison to an FFT of the full time series. For another great resource on spectral analysis, and time series-related R methods in general, see the following website: http://zoonek2.free.fr/UNIX/48_R/15.html.
#Required function # from: http://research.stowers-institute.org/efg/2005/LombScargle/R/index.htm source("LombScargle.R") source("R-Mnemonics.R") #Required data data(sunspot.month) # Historical sunspots, class(sunspot.month) = "ts" plot(sunspot.month) #Create vector version of data nspot <- as.vector(sunspot.month) time <- seq(1749+0.5/12, 1997+11.5/12, by=1/12) plot(nspot ~ time, t="l") # same as above #make gaps in data to create "observed" data series set.seed(1) obs <- sample(seq(nspot), 0.5*length(nspot)) # 50% gaps nspot.obs <- nspot[obs] time.obs <- time[obs] plot(nspot.obs ~ time.obs, pch=".", cex=2) # Create test frequencies corresponding to # periods from 1 to 50 years M <- 4*length(time.obs) # often 2 or 4 TestFrequencies <- (1/50) + (1 - 1/50) * (1:M / M) # Use Horne & Baliunas' estimate of independent frequencies Nindependent <- NHorneBaliunas(length(nspot.obs)) #Compute RES <- ComputeAndPlotLombScargle(time.obs, nspot.obs, TestFrequencies, Nindependent, "Sunspots") #Output consists of a single figure with four plots showing: #1. time series. The peak of the loess-smoothed curve (dotted line) is used for phase. #2. Histogram of time interval variability #3. Lomb-Scargle Periodogram with peak corresponding to a 11 year period #RES$PeakPeriod #4. Peak Significance Curve: p-value of Lomb-Scargle peak is quite significant: 1 - (1-exp(-RES$SpectralPowerDensity[which(1/RES$Frequency == RES$PeakPeriod)]))^Nindependent #Alternate plots p01 <- -log(1-((1-0.01)^(1/Nindependent))) #P=0.01 plot(RES$Frequency, RES$SpectralPowerDensity, t="n", xlab="Frequency [1/years]", ylab="Normalized Power Spectral Density") abline(h=seq(par()$yaxp[1], par()$yaxp[2],, par()$yaxp[3]+1), col=8, lty=3) abline(v=seq(par()$xaxp[1], par()$xaxp[2],, par()$xaxp[3]+1), col=8, lty=3) lines(RES$Frequency, RES$SpectralPowerDensity, lwd=1) mtext("Lomb-Scargle Periodogram", side=3, line=1) abline(h=p01, col=2, lty=2) text(0.9, p01, label="p = 0.01", pos=3, col=2) plot(1/RES$Frequency, RES$SpectralPowerDensity, type="n", log = "x", xlab="Frequency [years]", ylab="Normalized Power Spectral Density") abline(h=seq(par()$yaxp[1], par()$yaxp[2],, par()$yaxp[3]+1), col=8, lty=3) abline(v=c(1,2,5,10,20,50), col=8, lty=3) lines(1/RES$Frequency, RES$SpectralPowerDensity, lwd=1) mtext("Lomb-Scargle Periodogram", side=3, line=1) abline(h=p01, col=2, lty=2) text(2, p01, label="p = 0.01", pos=3, col=2) png("GappySunspots_LombScargle.png", width=5, height=7, units="in", res=400) #x11(width=5, height=7) op <- par(mfcol=c(2,1), mar=c(5,5,2,1), ps=10) plot(nspot.obs ~ time.obs, t="n", xlab="", ylab=expression(paste("No. sunspots [", "month"^-1, "]"))) grid() points(nspot.obs ~ time.obs, pch=".", cex=2) mtext("50% missing values", side=3, line=0.5) plot(1/RES$Frequency, RES$SpectralPowerDensity, type="n", log = "x", xlab="Frequency [years]", ylab="Normalized Power Spectral Density") abline(h=seq(par()$yaxp[1], par()$yaxp[2],, par()$yaxp[3]+1), col=8, lty=3) abline(v=c(1,2,5,10,20,50), col=8, lty=3) lines(1/RES$Frequency, RES$SpectralPowerDensity, lwd=1) mtext("Lomb-Scargle Periodogram", side=3, line=0.5) abline(h=p01, col=2, lty=2) text(2, p01, label="p = 0.01", pos=3, col=2) par(op) dev.off() ############################## ###Compare to other methods### ############################## plot(sunspots) spectrum(sunspot.month) abline(v=1:10, lty=3) z <- spectrum(sunspots, spans=2) #smoothed spectrum png("LombScargle_vs_DFT.png", width=5, height=4, res=400, units="in") XLIM <- c(2,20) op <- par(mar=c(5,5,2,5), ps=10) plot(1/RES$Frequency, RES$SpectralPowerDensity, type="n", log = "x", xlab="Frequency [years]", ylab="Normalized Power Spectral Density", xlim=XLIM) abline(h=seq(par()$yaxp[1], par()$yaxp[2],, par()$yaxp[3]+1), col=8, lty=3) abline(v=c(1,2,5,10,20,50), col=8, lty=3) lines(1/RES$Frequency, RES$SpectralPowerDensity, lwd=1) par(new=TRUE) plot(1/z$freq, z$spec, t="l", col=3, log = "x", xlim=XLIM, xlab="", ylab="", xaxt="n", yaxt="n") axis(4) mtext("Spectral Density", line=3, side=4, col=3) legend("topleft", legend=c("Lomb-Scargle", "Discrete Fourier\nTransform"), bg="white", col=c(1,3), lty=1) par(op) dev.off()
To leave a comment for the author, please follow the link and comment on their blog: me nugget.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.