Convex Regression Model

[This article was first published on R-english – Freakonometrics, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

This morning during the lecture on nonlinear regression, I mentioned (very) briefly the case of convex regression. Since I forgot to mention the codes in R, I will publish them here. Assume that yi=m(xi)+εi where m:RdR is some convex function.

Then m is convex if and only if x1,x2Rd, t[0,1], m(tx1+[1t]x2)tm(x1)+[1t]m(x2)Hidreth (1954) proved that ifm=argminm convex{ni=1(yim(xi))2}then θ=(m(x1),,m(xn)) is unique.

Let y=θ+ε, then θ=argminθK{ni=1(yiθi))2}whereK={θRn:m convex ,m(xi)=θi}. I.e. θ is the projection of y onto the (closed) convex cone K. The projection theorem gives existence and unicity.

For convenience, in the application, we will consider the real-valued case, m:RR, i.e. yi=m(xi)+εi. Assume that observations are ordered x1x2xn. Here K={θRn:θ2θ1x2x1θ3θ2x3x2θnθn1xnxn1}

Hence, quadratic program with n2 linear constraints.

m is a piecewise linear function (interpolation of consecutive pairs (xi,θi)).

If m is differentiable, m is convex if m(x)+m(x)T[yx]m(y)

More generally, if m is convex, then there exists ξxRn such that m(x)+ξ Tx[yx]m(y)
ξx is a subgradient of m at x. And then m(x)={m(x)+ξ T[yx]m(y),yRn}

Hence, θ is solution of argmin{yθ2}subject to θi+ξ Ti[xjxi]θj, i,j and ξ1,,ξnRn. Now, to do it for real, use cobs package for constrained (b)splines regression,

library(cobs)

To get a convex regression, use

plot(cars)
x = cars$speed
y = cars$dist
rc = conreg(x,y,convex=TRUE)
lines(rc, col = 2)


Here we can get the values of the knots

rc

Call:  conreg(x = x, y = y, convex = TRUE) 
Convex regression: From 19 separated x-values, using 5 inner knots,
     7,    8,    9,   20,   23.
RSS =  1356; R^2 = 0.8766;
 needed (5,0) iterations

and actually, if we use them in a linear-spline regression, we get the same output here

reg = lm(dist~bs(speed,degree=1,knots=c(4,7,8,9,,20,23,25)),data=cars)
u = seq(4,25,by=.1)
v = predict(reg,newdata=data.frame(speed=u))
lines(u,v,col="green")

Let us add vertical lines for the knots

abline(v=c(4,7,8,9,20,23,25),col="grey",lty=2)

To leave a comment for the author, please follow the link and comment on their blog: R-english – Freakonometrics.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)