Convex Regression Model
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
This morning during the lecture on nonlinear regression, I mentioned (very) briefly the case of convex regression. Since I forgot to mention the codes in R, I will publish them here. Assume that yi=m(xi)+εi where m:Rd→R is some convex function.
Then m is convex if and only if ∀x1,x2∈Rd, ∀t∈[0,1], m(tx1+[1−t]x2)≤tm(x1)+[1−t]m(x2)Hidreth (1954) proved that ifm⋆=argminm convex{∑ni=1(yi−m(xi))2}then θ⋆=(m⋆(x1),⋯,m⋆(xn)) is unique.
Let y=θ+ε, then θ⋆=argminθ∈K{∑ni=1(yi−θi))2}whereK={θ∈Rn:∃m convex ,m(xi)=θi}. I.e. θ⋆ is the projection of y onto the (closed) convex cone K. The projection theorem gives existence and unicity.
For convenience, in the application, we will consider the real-valued case, m:R→R, i.e. yi=m(xi)+εi. Assume that observations are ordered x1≤x2≤⋯≤xn. Here K={θ∈Rn:θ2−θ1x2−x1≤θ3−θ2x3−x2≤⋯≤θn−θn−1xn−xn−1}
Hence, quadratic program with n−2 linear constraints.
m⋆ is a piecewise linear function (interpolation of consecutive pairs (xi,θ⋆i)).
If m is differentiable, m is convex if m(x)+∇m(x)T⋅[y−x]≤m(y)
More generally, if m is convex, then there exists ξx∈Rn such that m(x)+ξ Tx⋅[y−x]≤m(y)
ξx is a subgradient of m at x. And then ∂m(x)={m(x)+ξ T⋅[y−x]≤m(y),∀y∈Rn}
Hence, θ⋆ is solution of argmin{‖y−θ‖2}subject to θi+ξ Ti[xj−xi]≤θj, ∀i,j and ξ1,⋯,ξn∈Rn. Now, to do it for real, use cobs package for constrained (b)splines regression,
library(cobs)
To get a convex regression, use
plot(cars) x = cars$speed y = cars$dist rc = conreg(x,y,convex=TRUE) lines(rc, col = 2)
Here we can get the values of the knots
rc Call: conreg(x = x, y = y, convex = TRUE) Convex regression: From 19 separated x-values, using 5 inner knots, 7, 8, 9, 20, 23. RSS = 1356; R^2 = 0.8766; needed (5,0) iterations
and actually, if we use them in a linear-spline regression, we get the same output here
reg = lm(dist~bs(speed,degree=1,knots=c(4,7,8,9,,20,23,25)),data=cars) u = seq(4,25,by=.1) v = predict(reg,newdata=data.frame(speed=u)) lines(u,v,col="green")
Let us add vertical lines for the knots
abline(v=c(4,7,8,9,20,23,25),col="grey",lty=2)
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.