Comparing additive and multiplicative regressions using AIC in R

[This article was first published on R – Modern Forecasting, and kindly contributed to R-bloggers].

One of the basic things students are taught in statistics classes is that models can be compared using information criteria only when they have the same response variable. This means, for example, that if you model $\log y_t$ and calculate AIC, the value is not comparable with the AIC from a model of $y_t$, because the two variables are on different scales. But there is a way to make the criteria comparable in these two cases: both variables need to be expressed in the original scale, and we need to know what their distributions are in that scale. In our example, if we assume that $\log y_t \sim \mathcal{N}(\log \mu_t, \sigma^2_l)$ (where $\log \mu_t$ is the conditional mean of $\log y_t$ and $\sigma^2_l$ is the variance of the residuals of the model in logarithms), then the exponent of this variable has the log-normal distribution:
$$y_t \sim \log\mathcal{N}(\log \mu_t, \sigma^2_l)$$


Just as a reminder, all the information criteria rely on the log-likelihood. For example, here’s the formula of AIC:
$$\mathrm{AIC} = 2k - 2\ell,$$

where $k$ is the number of all the estimated parameters and $\ell$ is the value of the log-likelihood.
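
As a quick check of this formula, the following minimal sketch reproduces the value of AIC() by hand from logLik(). It uses the longley data from the datasets package; the object names (model, ll, k, aicManual) are chosen here purely for illustration.

```r
# Fit a simple linear regression on a built-in dataset
model <- lm(GNP ~ Employed, data = longley)

# logLik() returns the log-likelihood; its "df" attribute is the number of
# estimated parameters (two coefficients plus the residual variance)
ll <- logLik(model)
k <- attr(ll, "df")

# AIC by the formula: 2k - 2 * log-likelihood
aicManual <- 2 * k - 2 * as.numeric(ll)

# Compare with the built-in function
all.equal(aicManual, AIC(model))
```

Note that the count k includes the variance of the residuals, not only the regression coefficients.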

If we use the likelihood of the log-normal distribution for the variable $y_t$ instead of the likelihood of the normal distribution in the AIC formula, then the information criteria become comparable. To understand what this transformation involves, we need to compare the density functions of the two distributions, normal and log-normal. Here is the normal density for the variable $\log y_t$:
$$f(\log y_t | \theta, \sigma^2_l) = \frac{1}{\sqrt{2\pi\sigma^2_l}} \exp\left(-\frac{(\log y_t - \log \mu_t)^2}{2\sigma^2_l}\right)$$


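and here is the log-normal density for the variable $y_t = \exp(\log y_t)$ (the multiplicative model in the original scale):
$$f(y_t | \theta, \sigma^2_l) = \frac{1}{y_t} \frac{1}{\sqrt{2\pi\sigma^2_l}} \exp\left(-\frac{(\log y_t - \log \mu_t)^2}{2\sigma^2_l}\right),$$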

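where $\theta$ is the vector of parameters of our model. The main difference between the two distributions is the factor $\frac{1}{y_t}$. If we derive the log-likelihood based on the log-normal density, here is what we get:
$$\ell(\theta, \sigma^2_l | Y) = -\frac{1}{2}\left(T \log(2\pi\sigma^2_l) + \sum_{t=1}^T \frac{(\log y_t - \log \mu_t)^2}{\sigma^2_l}\right) - \sum_{t=1}^T \log y_t,$$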

where $Y$ is the vector of all the actual values in the sample and $T$ is the sample size. When we extract the likelihood of the model in logarithms, we obtain only the first part of this log-likelihood, the part before $-\sum_{t=1}^T \log y_t$, which corresponds to the normal distribution. So, to produce the likelihood of the model with the variable in the original scale, we need to subtract the sum of logarithms of the response variable from the extracted log-likelihood.
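
The relation between the two likelihoods can be checked numerically. The sketch below is only an illustration on the longley data from the datasets package: it assumes the maximum likelihood estimate of the standard deviation (with the sample size in the denominator), which is what logLik() uses for lm() models, and all object names are made up for this example.

```r
# Model in logarithms, fitted to the longley data
modelLog <- lm(log(GNP) ~ Employed, data = longley)
y <- longley$GNP

# Fitted values in logarithms and the maximum likelihood estimate of the
# standard deviation (sample size T in the denominator, not T - k)
muLog <- unname(fitted(modelLog))
sigmaLog <- sqrt(mean(residuals(modelLog)^2))

# Density identity: log-normal density of y = normal density of log(y) / y
all.equal(dlnorm(y, muLog, sigmaLog), dnorm(log(y), muLog, sigmaLog) / y)

# Normal log-likelihood of log(y): the part that logLik() extracts
llNormal <- sum(dnorm(log(y), muLog, sigmaLog, log = TRUE))
all.equal(llNormal, as.numeric(logLik(modelLog)))

# Log-normal log-likelihood of y: the normal part minus sum(log(y))
llLogNormal <- sum(dlnorm(y, muLog, sigmaLog, log = TRUE))
all.equal(llLogNormal, llNormal - sum(log(y)))
```

Each all.equal() call compares the two sides of one of the relations described above.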

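The function AIC() in R, applied to the model in logarithms, returns the value based on that first part of the log-likelihood only. To fix this and get AIC in the same scale as the variable $y_t$, we need to take the remaining part into account, modifying the AIC formula:
$$\mathrm{AIC}' = 2k - 2\ell_{\mathcal{N}} + 2\sum_{t=1}^T \log y_t = \mathrm{AIC} + 2\sum_{t=1}^T \log y_t,$$

where $\ell_{\mathcal{N}}$ is the normal part of the log-likelihood and $\mathrm{AIC}$ is the value returned for the model in logarithms.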

Let’s see an example in R. We will use the longley data from the datasets package. First we construct additive and multiplicative models:

modelAdditive <- lm(GNP~Employed,data=longley)
modelMultiplicative <- lm(log(GNP)~Employed,data=longley)

And extract the respective AICs:

AIC(modelAdditive)
# 142.7824
AIC(modelMultiplicative)
# -44.5661

As we see, the values are not comparable. Now let’s modify the second AIC:

AIC(modelMultiplicative) + 2*sum(log(longley$GNP))
# 145.118

So, now the values are comparable, and we can conclude that the first model (additive) is better than the second one in terms of AIC.
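
The correction can also be wrapped into a small helper. The function name aicOriginalScale() below is hypothetical (it is not part of base R or of any package), and the sketch assumes a model fitted with lm() whose response is the logarithm of a strictly positive variable.

```r
# Hypothetical helper: AIC of a log-response lm() model, expressed in the
# original scale of the response. 'y' is the untransformed response.
aicOriginalScale <- function(model, y) {
  stopifnot(all(y > 0))
  AIC(model) + 2 * sum(log(y))
}

modelAdditive <- lm(GNP ~ Employed, data = longley)
modelMultiplicative <- lm(log(GNP) ~ Employed, data = longley)

# Both values now refer to the distribution of GNP itself
aics <- c(additive = AIC(modelAdditive),
          multiplicative = aicOriginalScale(modelMultiplicative, longley$GNP))
aics

# Name of the model with the lower AIC
names(which.min(aics))
```

Passing the untransformed response explicitly keeps the helper simple; recovering it from the model object would require parsing the formula.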

A similar technique can be used for other transformations of the response variable (square root, Box-Cox transformation), but the respective distributions of the variables would need to be derived, which is not always a simple task.
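
As a rough illustration for the square root case, here is a sketch that is not taken from the original derivation. It assumes that $\sqrt{y_t}$ is approximately normal and ignores the fact that a normal variable can take negative values, so it is an approximation rather than an exact result. Under that assumption, the change of variables gives the density of $y_t$ as the normal density of $\sqrt{y_t}$ multiplied by $\frac{1}{2\sqrt{y_t}}$, which plays the same role as $\frac{1}{y_t}$ in the log-normal case.

```r
# Illustrative sketch for a square-root response, under the assumption that
# sqrt(y) is approximately normal (truncation at zero is ignored).
# Change of variables: f_Y(y) = f_Z(sqrt(y)) / (2 * sqrt(y))
modelSqrt <- lm(sqrt(GNP) ~ Employed, data = longley)
y <- longley$GNP

# The Jacobian term enters the log-likelihood as -sum(log(2 * sqrt(y))),
# so the AIC increases by twice that sum
aicSqrt <- AIC(modelSqrt) + 2 * sum(log(2 * sqrt(y)))

# For comparison: the log-response model, where the Jacobian is 1 / y
modelLog <- lm(log(GNP) ~ Employed, data = longley)
aicLog <- AIC(modelLog) + 2 * sum(log(y))

c(sqrt = aicSqrt, log = aicLog)
```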
