Weighted Effect Coding: Dummy coding when size matters
If your regression model contains a categorical predictor variable, you commonly test the significance of its categories against a preselected reference category. If all categories have (roughly) the same number of observations, you can also test all categories against the grand mean using effect (ANOVA) coding. In observational studies, however, the number of observations per category typically varies, so the grand mean (the unweighted mean of the category means) no longer coincides with the sample mean. We published a paper in the International Journal of Public Health showing how, in that situation, all categories can be tested against the sample mean.
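To make the idea concrete, here is a minimal base R sketch that builds a weighted effect contrast matrix by hand. It is an illustration under stated assumptions, not the code from the paper: the toy data and the helper name `weighted_effect_contrasts` are hypothetical. The sketch assumes the usual construction in which the omitted category's row holds the negative ratio of each category's size to the omitted category's size.

```r
# Hypothetical toy data: three groups of unequal size
set.seed(1)
group <- factor(rep(c("a", "b", "c"), times = c(50, 30, 20)))
y <- rnorm(length(group),
           mean = c(a = 10, b = 12, c = 15)[as.character(group)])

# Hypothetical helper: weighted effect contrasts for factor f,
# leaving out the category named in 'omitted'
weighted_effect_contrasts <- function(f, omitted) {
  lev  <- levels(f)
  n    <- setNames(tabulate(f, nbins = nlevels(f)), lev)  # category sizes
  kept <- setdiff(lev, omitted)
  m <- matrix(0, nrow = length(lev), ncol = length(kept),
              dimnames = list(lev, kept))
  for (k in kept) {
    m[k, k]       <- 1                       # category compared to the sample mean
    m[omitted, k] <- -n[[k]] / n[[omitted]]  # weight by relative category size
  }
  m
}

contrasts(group) <- weighted_effect_contrasts(group, omitted = "a")
fit <- lm(y ~ group)

# With this coding the intercept is the sample mean of y, and each
# coefficient is that category's mean minus the sample mean
coef(fit)
mean(y)
```

In this one-predictor model the intercept matches `mean(y)` even though the groups differ in size, which is the property that plain effect coding loses with unbalanced data.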
In a second paper in the same journal, the procedure is extended to regression models that test interaction effects. Within this framework, the weighted effect coded interaction shows the extra effect on top of the main effect found in a model without the interaction. This offers a promising new route to estimating interaction effects in observational data, where category sizes often differ.
To apply weighted effect coding, the technique introduced in these papers, we have made implementations available for R, SPSS, and Stata. For R, we created the ‘wec’ package, which can be installed by typing:
install.packages("wec")
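Once the package is installed, a first use might look like the sketch below. It assumes the package's `contr.wec(x, omitted)` function, which returns a weighted effect contrast matrix for a factor; the toy data are hypothetical and only serve to illustrate the call. The code is wrapped in a check so that it runs only when ‘wec’ is available.

```r
# Only runs when the 'wec' package is installed
if (requireNamespace("wec", quietly = TRUE)) {
  # Hypothetical toy data: three groups of unequal size
  set.seed(42)
  group <- factor(rep(c("a", "b", "c"), times = c(50, 30, 20)))
  y <- rnorm(length(group), mean = as.integer(group))

  # Assign weighted effect contrasts, omitting category "a"
  contrasts(group) <- wec::contr.wec(group, omitted = "a")

  # The intercept is now the sample mean of y; each coefficient is a
  # category's deviation from that sample mean
  fit <- lm(y ~ group)
  print(coef(fit))
}
```

To obtain the estimate for the omitted category, one option is to refit the model with a different category omitted.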