Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In this exercise set, we continue with the problems from the previous exercise set on GLMs, so the exercise numbering starts at 9. Please make sure you have read and worked through the previous set before you continue.
In the last exercise set, we found that the model was over-dispersed, so we tried quasi-Poisson regression together with step-wise variable selection. Please note that we assumed no background theory or domain knowledge influences the choice of variables. That is rarely the case in the real world, but we make the assumption here for the sake of the exercise.
Answers to these exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Load the data-set and required package before running the exercise.
Exercise 9
Load the “MASS” package, which provides the negative binomial model. Fit the model using all the explanatory variables.
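As a rough sketch of what this step can look like, the snippet below fits a negative binomial model with `glm.nb()` from MASS. The exercise data set is not reproduced here, so the data frame `dat` and its columns (`y`, `x1`, `x2`, `x3`) are simulated and purely hypothetical; substitute your own data and response.

```r
library(MASS)  # provides glm.nb() and rnegbin()

set.seed(1)
n <- 200

# Hypothetical data: a count response y and three explanatory variables
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rnegbin(n, mu = exp(0.5 + 0.4 * dat$x1 - 0.3 * dat$x2), theta = 2)

# "y ~ ." uses every other column of dat as an explanatory variable
nb_full <- glm.nb(y ~ ., data = dat)
```

The `y ~ .` formula is a convenient way to include all explanatory variables without typing each name.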
Exercise 10
Check the summary of the model.
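A minimal sketch of this step, again on simulated data (the data frame `dat` and the model name `nb_fit` are assumptions made for illustration, not the exercise data):

```r
library(MASS)

set.seed(1)
n <- 200

# Hypothetical data standing in for the exercise data set
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rnegbin(n, mu = exp(0.5 + 0.4 * dat$x1), theta = 2)

nb_fit <- glm.nb(y ~ x1 + x2, data = dat)

# Coefficient table, deviances, AIC and the estimated dispersion parameter
summary(nb_fit)

# The estimate of theta and its standard error are stored on the fitted object
nb_fit$theta
nb_fit$SE.theta
```

In the summary, look at the coefficient estimates and their p-values, the residual deviance relative to its degrees of freedom, and the estimated theta.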
Exercise 11
Set options in base R, considering missing values.
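One possible reading of this exercise, which is our assumption rather than something stated above, is that it refers to the base R `na.action` option. Setting it to `"na.fail"` makes model fitting stop with an error when the data contain missing values, which helps ensure that candidate models are compared on the same rows. The toy data frame `d` below is hypothetical.

```r
# Store the previous setting so that it can be restored later
old <- options(na.action = "na.fail")
getOption("na.action")

# Hypothetical data with one missing response value
d <- data.frame(y = c(1, 2, NA, 4), x = c(1, 2, 3, 4))

# With na.fail, fitting stops with an error instead of silently dropping the row
res <- try(lm(y ~ x, data = d), silent = TRUE)
failed <- inherits(res, "try-error")
failed

# Restore the previous setting
options(old)
```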
Exercise 12
The previous exercise suggested that variables 1, 3, 4, 6 or variables 1, 4, 6 produce the best-performing model. Therefore, refit the model using those variables.
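The sketch below shows one way to refit and compare the two candidate models. The variable names `x1` to `x6` and the simulated data frame `dat` are hypothetical placeholders for the first to sixth explanatory variables in the exercise data.

```r
library(MASS)

set.seed(1)
n <- 200

# Hypothetical data: six explanatory variables named x1..x6
dat <- as.data.frame(matrix(rnorm(n * 6), ncol = 6,
                            dimnames = list(NULL, paste0("x", 1:6))))
dat$y <- rnegbin(n,
                 mu = exp(0.5 + 0.4 * dat$x1 + 0.3 * dat$x4 - 0.3 * dat$x6),
                 theta = 2)

# Refit with each candidate subset of variables
nb_1346 <- glm.nb(y ~ x1 + x3 + x4 + x6, data = dat)
nb_146  <- glm.nb(y ~ x1 + x4 + x6, data = dat)

# Lower AIC indicates the better trade-off between fit and complexity
aic_tab <- AIC(nb_1346, nb_146)
aic_tab

# Likelihood ratio test between the nested models
anova(nb_146, nb_1346)
```

Because the smaller model is nested in the larger one, the likelihood ratio test complements the AIC comparison.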
Exercise 13
Check the diagnostic plots and draw a conclusion about whether the model gives the best performance.
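As an illustration, the standard diagnostic plots for a fitted model can be drawn with `plot()`, and a quick dispersion check can be computed from the Pearson residuals. The data and the model name `nb_fit` are simulated and hypothetical; use your refitted model instead.

```r
library(MASS)

set.seed(1)
n <- 200

# Hypothetical data standing in for the exercise data set
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rnegbin(n, mu = exp(0.5 + 0.4 * dat$x1), theta = 2)

nb_fit <- glm.nb(y ~ x1 + x2, data = dat)

# Four standard diagnostic plots: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage
op <- par(mfrow = c(2, 2))
plot(nb_fit)
par(op)

# Pearson dispersion statistic; values near 1 suggest that the
# over-dispersion has been accounted for
dispersion <- sum(residuals(nb_fit, type = "pearson")^2) / df.residual(nb_fit)
dispersion
```

Look for residuals without a clear pattern against the fitted values and for observations with high leverage or a large Cook's distance.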