Interpolation and smoothing functions in base R
by Andrie de Vries
Every once in a while I try to remember how to do interpolation using R. This is not something I do frequently in my workflow, so I do the usual sequence of finding the appropriate help page:
??interpolate
Help pages:
stats::approx Interpolation Functions
stats::NLSstClosestX Inverse Interpolation
stats::spline Interpolating Splines
So, the help tells me to use approx() to perform linear interpolation. This is an interesting function, because the same help page also describes approxfun(). The two do the same job, except that approx() returns the interpolated values directly, whilst approxfun() returns a function that performs the interpolation when you call it with new x values.
(In other words, approxfun() acts a little bit like a predict() method for approx().)
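To make the difference concrete, here is a minimal sketch using five made-up points (the values of x and y are purely illustrative assumptions, not data from this post):

```r
# Hypothetical example data: five known points
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 3, 8, 6)

# approx() returns a list with components x and y,
# holding the interpolated values at the requested xout
res <- approx(x, y, xout = c(1.5, 2.5))
res$y                      # 3.0 3.5

# approxfun() returns a function that can be called later with new x values
f <- approxfun(x, y)
f(c(1.5, 2.5))             # same values as res$y

# With the default rule = 1, points outside the range of x give NA
f(6)                       # NA
```

The function form is handy when you want to interpolate repeatedly, for example inside curve() or integrate().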
Other functions in the interpolation family
The help page for approx() also points to stats::spline() to do spline interpolation and from there you can find smooth.spline() for smoothing splines.
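As a rough illustration of how the two spline functions differ, the sketch below uses simulated noisy data (the seed, sample size and noise level are my own assumptions). spline() interpolates, so it reproduces the observed points exactly, whereas smooth.spline() fits a penalised curve that does not have to pass through them:

```r
set.seed(1)                                  # arbitrary seed
x <- seq(0, 2 * pi, length.out = 50)
y <- sin(x) + rnorm(length(x), sd = 0.3)     # hypothetical noisy sine data

# spline() evaluates an interpolating cubic spline, here at 200 equally spaced points
interp <- spline(x, y, n = 200)

# Evaluated at the original x values, the interpolating spline returns y itself
at_knots <- spline(x, y, xout = x)
all.equal(at_knots$y, y)

# smooth.spline() fits a smoothing spline; with default arguments the
# amount of smoothing is chosen by generalised cross-validation
fit <- smooth.spline(x, y)
smoothed <- predict(fit, x = interp$x)       # list with components x and y
```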
Talking about smoothing, base R also contains the function smooth(), an implementation of running median smoothers (algorithm proposed by Tukey).
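A small sketch of what a running median smoother does, using a made-up series with two obvious spikes (the numbers are illustrative assumptions):

```r
# Hypothetical series: mostly small values with two spikes (50 and 60)
y <- c(1, 2, 50, 3, 4, 5, 4, 60, 3, 2)

# smooth() applies Tukey's running median smoothers;
# the default kind is "3RS3R"
sm <- smooth(y)

# The result has class "tukeysmooth"; as.numeric() drops the extra attributes
as.numeric(sm)
```

Because medians ignore isolated extreme values, the two spikes disappear from the smoothed series while its length stays the same.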
Finally I want to mention loess(), a function that fits a local polynomial regression. (loess() is also one of the default smoothers behind stat_smooth() in the ggplot2 package.)
Trying the different interpolation and smoothing methods
I set up a little experiment to see how the different functions behave. To do this, I simulate some random data in the shape of a sine wave. Then I use each of these functions to interpolate or smooth the data.
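A simulation along these lines might look as follows. This is only a sketch: the seed, sample size, x range and noise level are assumptions of mine, since the text does not state them.

```r
set.seed(42)                                 # arbitrary seed for reproducibility
n <- 100                                     # assumed sample size
x <- sort(runif(n, min = 0, max = 2 * pi))   # irregularly spaced x values, sorted
y <- sin(x) + rnorm(n, sd = 0.5)             # sine signal plus Gaussian noise
dat <- data.frame(x = x, y = y)

head(dat)
```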
Results
On my generated data, the interpolation functions approx() and spline() give a quite ragged result, because they pass through every noisy point. The running median smoother smooth() doesn't do much better; there is simply too much variance in the data.
The smooth.spline() function does a great job at finding a smoother using default values.
The last two plots illustrate loess(), the local regression estimator. Notice that loess() takes a tuning parameter, span (default 0.75), which sets the fraction of the data used in each local fit. The lower the span, the fewer points each local fit uses, so the fitted curve follows the noise more closely. Thus with a value of 0.5 you can see a much smoother curve than at a value of 0.1.
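The effect of span can be checked numerically. The sketch below fits loess() with both spans to simulated data (seed, sample size and noise level are assumptions) and compares a simple roughness measure, the sum of squared second differences of the fitted values. The roughness() helper is a hypothetical name introduced here for illustration, not part of base R.

```r
set.seed(42)                                 # arbitrary seed
x <- seq(0, 2 * pi, length.out = 200)
y <- sin(x) + rnorm(length(x), sd = 0.5)     # hypothetical noisy sine data
dat <- data.frame(x = x, y = y)

fit_narrow <- loess(y ~ x, data = dat, span = 0.1)   # each local fit uses ~10% of points
fit_wide   <- loess(y ~ x, data = dat, span = 0.5)   # each local fit uses ~50% of points

# Hypothetical helper: larger values mean a more wiggly fitted curve
roughness <- function(fit) sum(diff(predict(fit), differences = 2)^2)

roughness(fit_narrow) > roughness(fit_wide)
```

The narrower span should come out rougher, which is the numerical counterpart of the more ragged curve in the plot.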
The code
Here is the code:
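What follows is a minimal sketch of such a comparison rather than the original listing. The simulated data, the evaluation grid, the colours and the panel layout are all assumptions of mine; only the function names and the two span values come from the text above.

```r
set.seed(42)                                 # arbitrary seed
x <- seq(0, 2 * pi, length.out = 100)
y <- sin(x) + rnorm(length(x), sd = 0.5)     # assumed noisy sine data
grid <- seq(min(x), max(x), length.out = 300)

# Evaluate every method on the same grid so the curves are comparable
curves <- list(
  "approx()"          = approx(x, y, xout = grid)$y,
  "spline()"          = spline(x, y, xout = grid)$y,
  # smooth() returns values at the original x only, so join them linearly
  "smooth()"          = approx(x, as.numeric(smooth(y)), xout = grid)$y,
  "smooth.spline()"   = predict(smooth.spline(x, y), grid)$y,
  "loess(span = 0.1)" = predict(loess(y ~ x, span = 0.1), data.frame(x = grid)),
  "loess(span = 0.5)" = predict(loess(y ~ x, span = 0.5), data.frame(x = grid))
)

# One panel per method: raw points in grey, fitted curve in red
op <- par(mfrow = c(2, 3))
for (nm in names(curves)) {
  plot(x, y, col = "grey60", main = nm)
  lines(grid, curves[[nm]], col = "red", lwd = 2)
}
par(op)
```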