Site icon R-bloggers

Linear splines with convenient parametrisations

[This article was first published on R – Brokering Closure, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I have just published a small package lspline on CRAN that implements linear splines using convenient parametrisations such that

Knot locations can be specified

The implementation follows Greene (2003, chapter 7.5.2).

The package sources are on GitHub here.

Examples

We will use the following artificial data with knots at x=5 and x=10:

set.seed(666)
n <- 200
d <- data.frame(
  x = scales::rescale(rchisq(n, 6), c(0, 20))
)
d$interval <- findInterval(d$x, c(5, 10), rightmost.closed = TRUE) + 1
d$slope <- c(2, -3, 0)[d$interval]
d$intercept <- c(0, 25, -5)[d$interval]
d$y <- with(d, intercept + slope * x + rnorm(n, 0, 1))

Plotting y against x:

library(ggplot2)
fig <- ggplot(d, aes(x=x, y=y)) + 
  geom_point(aes(shape=as.character(slope))) +
  scale_shape_discrete(name="Slope") +
  theme_bw()
fig

The slopes of the consecutive segments are 2, -3, and 0.

Setting knot locations manually

We can parametrize the spline with slopes of individual segments (default marginal=FALSE):

library(lspline)
m1 <- lm(y ~ lspline(x, c(5, 10)), data=d)
knitr::kable(broom::tidy(m1))
term estimate std.error statistic p.value
(Intercept) 0.1343204 0.2148116 0.6252941 0.5325054
lspline(x, c(5, 10))1 1.9435458 0.0597698 32.5171747 0.0000000
lspline(x, c(5, 10))2 -2.9666750 0.0503967 -58.8664832 0.0000000
lspline(x, c(5, 10))3 -0.0335289 0.0518601 -0.6465255 0.5186955

Or parametrize with coeficients measuring change in slope (with marginal=TRUE):

m2 <- lm(y ~ lspline(x, c(5,10), marginal=TRUE), data=d)
knitr::kable(broom::tidy(m2))
term estimate std.error statistic p.value
(Intercept) 0.1343204 0.2148116 0.6252941 0.5325054
lspline(x, c(5, 10), marginal = TRUE)1 1.9435458 0.0597698 32.5171747 0.0000000
lspline(x, c(5, 10), marginal = TRUE)2 -4.9102208 0.0975908 -50.3143597 0.0000000
lspline(x, c(5, 10), marginal = TRUE)3 2.9331462 0.0885445 33.1262479 0.0000000

The coefficients are

The two parametrisations (obviously) give identical predicted values:

all.equal( fitted(m1), fitted(m2) )
## [1] TRUE

graphically

fig +
  geom_smooth(method="lm", formula=formula(m1), se=FALSE) +
  geom_vline(xintercept = c(5, 10), linetype=2)

Knots at n equal-length intervals

Function elspline() sets the knots at points dividing the range of x into n equal length intervals.

m3 <- lm(y ~ elspline(x, 3), data=d)
knitr::kable(broom::tidy(m3))
term estimate std.error statistic p.value
(Intercept) 3.5484817 0.4603827 7.707678 0.00e+00
elspline(x, 3)1 0.4652507 0.1010200 4.605529 7.40e-06
elspline(x, 3)2 -2.4908385 0.1167867 -21.328105 0.00e+00
elspline(x, 3)3 0.9475630 0.2328691 4.069080 6.84e-05

Graphically

fig +
  geom_smooth(aes(group=1), method="lm", formula=formula(m3), se=FALSE, n=200)

Knots at quantiles of x

Function qlspline() sets the knots at points dividing the range of x into q equal-frequency intervals.

m4 <- lm(y ~ qlspline(x, 4), data=d)
knitr::kable(broom::tidy(m4))
term estimate std.error statistic p.value
(Intercept) 0.0782285 0.3948061 0.198144 0.8431388
qlspline(x, 4)1 2.0398804 0.1802724 11.315548 0.0000000
qlspline(x, 4)2 1.2675186 0.1471270 8.615132 0.0000000
qlspline(x, 4)3 -4.5846478 0.1476810 -31.044273 0.0000000
qlspline(x, 4)4 -0.4965858 0.0572115 -8.679818 0.0000000

Graphically

fig +
  geom_smooth(method="lm", formula=formula(m4), se=FALSE, n=200)

Installation

Stable version from CRAN or development version from GitHub with

devtools::install_github("mbojan/lspline", build_vignettes=TRUE)

Acknowledgements

Inspired by Stata command mkspline and function ares::lspline from Junger & Ponce de Leon (2011). As such, the implementation follows Greene (2003), chapter 7.5.2.

To leave a comment for the author, please follow the link and comment on their blog: R – Brokering Closure.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.