Fitting a rational function in R using ordinary least-squares regression
by Srini Kumar, VP of Product Management and Data Science, LevaData; and Bob Horton, Senior Data Scientist, Microsoft
A rational function is the ratio of two polynomial functions. The Padé approximant uses such a ratio of polynomials to approximate other functions:
$$
R(x)= \frac{\sum_{j=0}^m a_j x^j}{1+\sum_{k=1}^n b_k x^k}=\frac{a_0+a_1x+a_2x^2+\cdots+a_mx^m}{1+b_1 x+b_2x^2+\cdots+b_nx^n}
$$
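For a classic illustration of why such ratios are useful, the \([1/1]\) Padé approximant of the exponential function,

$$ e^x \approx \frac{1 + x/2}{1 - x/2}, $$

matches the Taylor series of \(e^x\) through the \(x^2\) term, even though its numerator and denominator are only first-degree polynomials.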
Here we show a way to fit this type of function to a set of data points using the lm function in R.

We'll use a data-simulating function based on a ratio of polynomials to generate a set of y values for some random x values, add some noise, then see how well we can fit a curve to these noisy points.
set.seed(123)
N <- 1000
x <- rnorm(N)
f <- function(x) 50*x^2/(1 + 4*x)  # data-simulating function
y <- f(x) + rnorm(N, sd=3)
point_data <- data.frame(x, y)

library("tidyverse")
ggplot(point_data, aes(x=x, y=y)) +
  geom_point() +
  ylim(-100, 100) +
  ggtitle("simulated data points")
Note that in the formula for the Padé approximant the numerator polynomial has a zeroth-order coefficient (\(a_0\)), while the denominator's constant term is fixed at \(1\). (Fixing that constant removes the scale ambiguity: multiplying numerator and denominator by the same constant would leave \(R(x)\) unchanged.) This makes it easy to rearrange the equation to express \(y\) as a linear combination of terms.
$$ y = \frac{a_0 + a_1 x + a_2 x^2}{1 + b_1 x + b_2 x^2} $$

$$ y (1 + b_1 x + b_2 x^2) = a_0 + a_1 x + a_2 x^2 $$

$$ y = a_0 + a_1 x + a_2 x^2 - b_1 x y - b_2 x^2 y $$
Now we have a form that lm can work with. We just need to specify a set of inputs that are powers of x (as in a traditional polynomial fit), and a set of inputs that are y times powers of x. This may seem like a strange thing to do, because we are making a model where we would need to know the value of y in order to predict y. But the trick here is that we will not try to use the fitted model to predict anything; we will just take the coefficients out and rearrange them into a function. The fit_pade function below takes a dataframe with x and y values, fits an lm model, and returns a function of x that uses the coefficients from the model to predict y:
fit_pade <- function(point_data){
  fit <- lm(y ~ x + I(x^2) + I(y*x) + I(y*x^2), point_data)
  lm_coef <- as.list(coef(fit))
  names(lm_coef) <- c("a0", paste0(rep(c('a','b'), each=2), 1:2))
  # The lm coefficients on y*x and y*x^2 estimate -b1 and -b2,
  # hence the minus signs in the denominator below.
  with(lm_coef, function(x) (a0 + a1*x + a2*x^2)/(1 - b1*x - b2*x^2))
}
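As a quick sanity check (a small sketch using the simulated point_data from above; check_fit is just an illustrative name), we can inspect the raw lm coefficients. Our data-simulating function \(f(x) = 50x^2/(1+4x)\) corresponds to \(a_2 = 50\) and \(b_1 = 4\) with the other coefficients zero, so the coefficient on I(x^2) should land near 50 and the coefficient on I(y*x) near \(-4\):

# Sketch: fit the rearranged linear model directly and inspect coefficients.
# I(x^2) estimates a2 (true value 50); I(y*x) estimates -b1 (true value -4).
check_fit <- lm(y ~ x + I(x^2) + I(y*x) + I(y*x^2), point_data)
round(coef(check_fit), 2)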
We'll use ggplot2 to visualize the results. Since ggplot2 drops points that fall outside the plotted range, it does not try to connect the points on either side of the discontinuity: near the vertical asymptote the fitted values shoot past the y-axis limits and are simply not drawn.
plot_fitted_function <- function(x_data, fitted_fun, title){
  x_data$y_hat <- fitted_fun(x_data$x)
  g <- ggplot(x_data, aes(x=x, y=y)) +
    geom_point() +
    ylim(-100, 100) +
    geom_line(aes(y=y_hat), col="red", size=1) +
    ggtitle(title)
  plot(g)
}
When we draw the fitted values over the input data points, we see that in this case the function describes the points quite well:
pade_approx <- fit_pade(point_data)
plot_fitted_function(point_data, pade_approx, title="fitted function")
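Since fit_pade returns an ordinary vectorized R function, the fitted curve can also be evaluated directly at any x values of interest (a small usage sketch):

# Evaluate the fitted rational function at a few new x values.
pade_approx(c(-1, 0, 0.5, 2))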
Here are some more examples, showing that this approach can work with some interesting patterns of points:
function_list <- list(
  function(x) (100 - 50*x - 100*x^2)/(1 - 10*x - 10*x^2),
  function(x) (100 - 50*x - 100*x^2)/(1 - 10*x - 5*x^2),
  function(x) (100 - 50*x - 100*x^2)/(1 - 50*x - 5*x^2)
)

for (f in function_list){
  sim_data <- point_data %>% mutate(y = f(x) + rnorm(nrow(point_data), sd=3))
  # use the deparsed function body as the plot title
  plot_fitted_function(sim_data, fit_pade(sim_data),
                       title = as.character(deparse(f))[2])
}
Here we have used linear regression by ordinary least squares (with lm) to fit distinctly nonlinear rational functions. For simplicity, these examples focus on equations of second order (or less) in both numerator and denominator, but the idea extends to higher orders. On a cautionary note, this approach seems to have numerical stability issues with some inputs. For example, if you take one of the data-simulating equations above and make selected coefficients much larger, you can create datasets that are fitted poorly by this method. And if the data-simulating function does not have the correct form (for example, if the zeroth-order term in the denominator is not 1), the fitted curves can be completely wrong. For practical purposes it might be preferable to use a nonlinear least squares approach (e.g., the nls function). Still, this approach works well in many examples, and it lets you fit some curves that cannot be represented by ordinary polynomials.
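For reference, here is a minimal sketch of that nls alternative, assuming the same second-order Padé form and the point_data simulated above (ols_fit, cf, start_vals, and nls_fit are illustrative names). It seeds nls with the OLS estimates, since nls can fail to converge from arbitrary starting values:

# Sketch: refit the same Padé form by nonlinear least squares.
# Starting values come from the OLS coefficients; note the sign flip
# for b1 and b2, as in fit_pade.
ols_fit <- lm(y ~ x + I(x^2) + I(y*x) + I(y*x^2), point_data)
cf <- coef(ols_fit)
start_vals <- list(a0 = cf[[1]], a1 = cf[[2]], a2 = cf[[3]],
                   b1 = -cf[[4]], b2 = -cf[[5]])
nls_fit <- nls(y ~ (a0 + a1*x + a2*x^2)/(1 + b1*x + b2*x^2),
               data = point_data, start = start_vals)
coef(nls_fit)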