Circular or spherical data, and density estimation
[This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I few years ago, while I was working on kernel based density estimation
on compact support distribution (like copulas) I went through a series
of papers on circular distributions. By that time, I thought it was
something for mathematicians working on weird spaces…. but during the
past weeks, I saw several potential applications of those estimators.Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
- circular data density estimation
such that
. It can be seen as an invariance by rotation.von Mises proposed a parametric model in 1918 (see here or there), assuming that

is Bessel modified function of order 1,
(some concentration parameter) and mu a direction.From a series of observed angles
, the maximum likelihood estimator for kappa is solution of


, where those functions are modified Bessel functions. Well, that estimator is biased, but it is possible to improve it (see here or there). This can be done easily in R (actually Jeff Gill – here – used that package in several applications). But I am not a big fan of that technique….- density estimation for hours on simulated data
is the time (in hours) for the
th observation (the
th phone call received). Then set
set.seed(1) library(circular) X=rbeta(100,shape1=2,shape2=4)*24 Omega=2*pi*X/24 Omegat=2*pi*trunc(X)/24 H=circular(Omega,type="angle",units="radians",rotation="clock") Ht=circular(Omegat,type="angle",units="radians",rotation="clock") plot(Ht, stack=FALSE, shrink=1.3, cex=1.03, axes=FALSE,tol=0.8,zero=c(rad(90)),bins=24,ylim=c(0,1)) points(Ht, rotation = "clock", zero =c(rad(90)), col = "1", cex=1.03, stack=TRUE ) rose.diag(Ht-pi/2,bins=24,shrink=0.33,xlim=c(-2,2),ylim=c(-2,2), axes=FALSE,prop=1.5)
|
|
circ.dens = density(Ht+3*pi/2,bw=20) plot(Ht, stack=TRUE, shrink=.35, cex=0, sep=0.0, axes=FALSE,tol=.8,zero=c(0),bins=24, xlim=c(-2,2),ylim=c(-2,2), ticks=TRUE, tcl=.075) lines(circ.dens, col="darkgrey", lwd=3) text(0,0.8,"24", cex=2); text(0,-0.8,"12",cex=2); text(0.8,0,"6",cex=2); text(-0.8,0,"18",cex=2)The c
ode looks rather simple. But I am not very comfortable using codes that I do not completely
understand. So I did my own. The first step was to get a graph similar
to the one we have on the right, except that I prefer my own kernel
based estimator. The idea is that instead of estimating the density on
, we estimate it on the sample
. Then we multiply by 3 to get the density only on
. For the bandwidth, I took the same as the one that we would have taken on
The code is simply the following
U=seq(0,1,by=1/250)
O=U*2*pi
U12=seq(0,1,by=1/24)
O12=U12*2*pi
X=rbeta(100,shape1=2,shape2=4)*24
OM=2*pi*X/24
XL=c(X-24,X,X+24)
d=density(X)
d=density(XL,bw=d$bw,n=1500)
I=which((d$x>=6)&(d$x<=30))
Od=d$x[I]/24*2*pi-pi/2
Dd=d$y[I]/max(d$y)+1
plot(cos(O),-sin(O),xlim=c(-2,2),ylim=c(-2,2),
type="l",axes=FALSE,xlab="",ylab="")
for(i in pi/12*(0:12)){
abline(a=0,b=tan(i),lty=1,col="light yellow")}
segments(.9*cos(O12),.9*sin(O12),1.1*cos(O12),1.1*sin(O12))
lines(Dd*cos(Od),-Dd*sin(Od),col="red",lwd=1.5)
text(.7,0,"6"); text(-.7,0,"18")
text(0,-.7,"12"); text(0,.7,"24")
R=1/24/max(d$y)/3+1
lines(R*cos(O),R*sin(O),lty=2)
Note that it is possible to stress more (visually) on hours having few phone calls, or a lot (compared with an homogeneous Poisson process), e.g.
plot(cos(O),-sin(O),xlim=c(-2,2),ylim=c(-2,2),
type="l",axes=FALSE,xlab="",ylab="")
for(i in pi/12*(0:12)){
abline(a=0,b=tan(i),lty=1,col="light yellow")}
segments(2*cos(O12),2*sin(O12),1.1*cos(O12),1.1*sin(O12),
col="light grey")
segments(.9*cos(O12),.9*sin(O12),1.1*cos(O12),1.1*sin(O12))
text(.7,0,"6")
text(-.7,0,"18")
text(0,-.7,"12")
text(0,.7,"24")
R=1/24/max(d$y)/3+1
lines(R*cos(O),R*sin(O),lty=2)
AX=R*cos(Od);AY=-R*sin(Od)
BX=Dd*cos(Od);BY=-Dd*sin(Od)
COUL=rep("blue",length(AX))
COUL[R<Dd]="red"
CM=cm.colors(200)
a=trunc(100*Dd/R)
COUL=CM[a]
segments(AX,AY,BX,BY,col=COUL,lwd=2)
lines(Dd*cos(Od),-Dd*sin(Od),lwd=2)
We get here those two graphs,
|
|
|
|
- density of wind direction
A
standard application when studying angles is wind direction. For
instance, in Montréal, it is possible to find hourly observations,
starting in 1974 (we just need a R robot to pick up the information,
but I'll tell more about that in another post, someday). Here, we have
directly an angle. So we can use a code rather similar to the one used
above to estimate the distribution of wind direction in Montréal.
|
|
- density of 911 phone calls

|
|
|
|
|
|
- density of earth temperatures, or earthquakes
to densities on the unit circle
. But similarly, it is possible to go from
to the unit sphere
. A nice application being global climate studies,
library(ks)
X=cbind(EQ$Longitude,EQ$Latitude)
Hpi1 = Hpi(x = X)
DX=kde(x = X, H = Hpi1)
library(maps)
map("world")
plot(DX,add=TRUE,col="red")
points(X,cex=.2,col="blue")
Y=rbind(cbind(X[,1],X[,2]),cbind(X[,1]+360,X[,2]),
cbind(X[,1]-360,X[,2]),cbind(X[,1],X[,2]+180),
cbind(X[,1]+360,X[,2]+180),cbind(X[,1]-360,X[,2]+180),
cbind(X[,1],X[,2]-180),cbind(X[,1]+360,
X[,2]-180),cbind(X[,1]-360,X[,2]-180))
DY=kde(x = Y, H = Hpi1)
library(maps)
plot(DY,add=TRUE,col="purple")Without any correction, we get the red level curves. The pink one integrates correction.
To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics - Tag - R-english.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.










