Linear regression with random error giving EXACT predefined parameter estimates
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
When simulating linear models based on some defined slope/intercept and added gaussian noise, the parameter estimates vary after least-squares fitting. Here is some code I developed that does a double transform of these models as to obtain a fitted model with EXACT defined parameter estimates a
(intercept) and b
(slope).
It does so by:
1) Fitting a linear model #1 to the x,y
data.
2) Correcting y
by : .
3) Refitting linear model #2: .
4) Correcting y
by : .
5) Refitting linear model #3: , which is the final model with parameter estimates a
and b
.
Below is the code:
exactLM <- function(
x = 1:100, ## predictor values
b = 0.01, ## slope
a = 3, ## intercept
error = NULL, ## homoscedastic error
n = 1, ## number of replicates
weights = NULL, ## possible weights, i.e. when heteroscedastic
plot = TRUE, ## plot data and regression
... ## other parameters to 'lm'
)
{
if (is.null(error)) stop("Please define some error!")
## create x and y-values
x <- rep(x, n)
y <- a + b * x
if (!is.null(error) & length(error) != length(x)) stop("'x' and 'error' must be of same length!")
if (!is.null(weights) & length(weights) != length(x)) stop("'x' and 'weights' must be of same length!")
## add error
y <- y + error
## create linear model #1
LM1 <- lm(y ~ x, weights = weights, ...)
COEF1 <- coef(LM1)
## correct slope and create linear model #2
y <- y * (b/COEF1[2])
LM2 <- lm(y ~ x, weights = weights, ...)
COEF2 <- coef(LM2)
## correct intercept and create linear model #3
y <- y + (a - COEF2[1])
LM3 <- lm(y ~ x, weights = weights, ...)
## plot data and regression
plot(x, y, pch = 16)
abline(LM3, lwd = 2, col = "darkred")
return(list(model = LM3, x = x, y = y))
}
Here are some applications using replicates and weighted fitting:
############ Examples #################
## n = 1
exactLM(x = 1:100, a = 0.5, b = 0.2, error = rnorm(100, 0, 2))
## n = 3
exactLM(x = 1:100, a = 0.5, b = 0.2, error = rnorm(300, 0, 2), n = 3)
## weighted by exact 1/var
x <- 1:100
error <- rnorm(100, 0, 0.1 * x)
weights <- 1/(0.1 * x)^2
exactLM(x = x, a = 0.5, b = 0.2, error = error, weights = weights)
## weighted by empirical 1/var
x <- rep(1:100, 3)
error <- rnorm(300, 0, 0.1 * x)
weights <- rep(1/(tapply(error, x, mean)^2), 3)
exactLM(x = x, a = 0.5, b = 0.2, error = error, weights = weights)
I am curious on comments concerning simplification and more importantly, application (other than cheating data…)!
Cheers,
Andrej
Filed under: Uncategorized
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.