Manipulate(d) Regression!
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The R package ‘manipulate’ can be used to create interactive plots in RStudio. Though not as versatile as the ‘shiny’ package, ‘manipulate’ can be used to quickly add interactive elements to standard R plots. This can prove useful for demonstrating statistical concepts, especially to a non-statistician audience.
The R code at the end of this post uses the ‘manipulate’ package with a regression plot to illustrate the effect of outliers (and influential) points on the fitted linear regression model. The resulting manipulate(d) plot in RStudio includes a gear icon, which, when clicked, opens up a slider control. The slider can be used to move some data points. The plot changes interactively with the data.
Here are some static figures:
Initial state: It is possible to move two points in the scatter plot, one at the end and one at the center.
An outlier at center has a limited influence on the fitted regression model.
An outlier at the ends of support of x and y ‘moves’ the regression line towards it and is also an influential point!
Here is the complete R code for generating the interactive plot. This is to be run in RStudio.
library(manipulate) ## First define a custom function that fits a linear regression line ## to (x,y) points and overlays the regression line in a scatterplot. ## The plot is then 'manipulated' to change as y values change. linregIllustrate <- function(x, y, e, h.max, h.med){ max.x <- max(x) med.x <- median(x) max.xind <- which(x == max.x) med.xind <- which(x == med.x) y1 <- y ## Modified y y1[max.xind] <- y1[max.xind]+h.max ## at the end y1[med.xind] <- y1[med.xind]+h.med ## at the center plot(x, y1, xlim=c(min(x),max(x)+5), ylim=c(min(y1),max(y1)), pch=16, xlab="X", ylab="Y") text(x[max.xind], y1[max.xind],"I'm movable!", pos=3, offset = 0.3, cex=0.7, font=2, col="red") text(x[med.xind], y1[med.xind],"I'm movable too!", pos=3, offset = 0.3, cex=0.7, font=2, col="red") m <- lm(y ~ x) ## Regression with original set of points, the black line abline(m, lwd=2) m1 <- lm(y1 ~ x) ## Regression with modified y, the dashed red line abline(m1, col="red", lwd=2, lty=2) } ## Now generate some x and y data x <- rnorm(35,10,5) e <- rnorm(35,0,5) y <- 3*x+5+e ## Plot and manipulate the plot! manipulate(linregIllustrate(x, y, e, h.max, h.med), h.max=slider(-100, 100, initial=0, step=10, label="Move y at the end"), h.med=slider(-100, 100, initial=0, step=10, label="Move y at the center"))
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.