[This article was first published on Sandy Muspratt's R Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
< !-- MathJax scripts -->
Specifying the model
This is the second part of a series of notes demonstrating use of the R package, R2MLwiN, an R command interface to the multilevel modelling software package, MLwiN (see the MLwiN site for getting access to MLwiN). The first set of notes showed how to get started with R2MLwiN. In these notes, I show how to fit predictors (continuous, categorical, and interactions) to the fixed-effects part of a multilevel regression model, and how to fit random slopes to the regression model. The examples use the ALNT.csv data (see Working with R2MLwiN Part 1 for a description of the data). Though the series is concerned with demonstrating Bayesian estimation using MCMC methods, the examples presented here do not depend on MCMC methods of estimations, and so to speed up the running of the examples, they use maximum likelihood estimation. It is easy enough to switch between maximum likelihood and MCMC procedures; set theestM
option to 1 for MCMC, and 0 for maximum likelihood.Some Preliminaries
Load theR2MLwiN
and the plyr
packages. The ddply()
function from the plyr
package will be used to add some derived variables to the data file.library(R2MLwiN) library(plyr)
The examples in these notes use the ALNT data file available from github. Assuming the data file is located in your working directory, the following command loads the data, and lists the variables. (See the end of this blog for an easy way to get GitHub data into an R session.)
alnt = read.csv("ALNT.csv", header = TRUE, sep = ",") names(alnt)
The alnt data file needs some modifications:
- a column of 1s for the 'intercept' variable;
- grand mean centring of the write3 variable;
- group mean centring of the write3 variable;
- group (school) means for the write3 variable; and,
- two variables, gender and location, should be changed from numeric variables to factors.
alnt$cons = 1 # Intercept alnt$write3C = alnt$write3 - mean(alnt$write3) # Grand mean centring alnt = ddply(alnt, .(school), mutate, write3GP = write3 - mean(write3)) # Group mean centring alnt = ddply(alnt, .(school), mutate, GrpMeans = mean(write3)) # Group means alnt$gender = factor(alnt$gender, labels = c("male", "female")) # Factor labels alnt$location = factor(alnt$location, labels = c("metro", "prov", "rural", "remote"))
Next, set up the information to be passed to MLwiN:
# 1. Path to MLwiN executable (mlwin.exe) mlwin = c("C:/Program Files/MLwiN v2.26/i386") # Substitute the path on your machine # 2. The variables containing IDs for all levels - highest level first levID = c("school", "student") # 3. Distribution to be modelled D = "Normal" # 4. Method of estimation (maximum likelihood) estoptions = list(EstM = 0) # Set EstM to 1 for MCMC estimation
Continuous Predictors
In all the models to follow, write7 is the dependent variable. In Part 1 of these notes, the only predictor in the model was the intercept, and so the model was an “empty model”. In this section, I take into account students earlier (year 3) writing achievements.The code demonstrates the setting up of models using different versions of the write3 variable as the predictor variable. The models are random-intercepts models. Versions of the write3 variable enter the models in turn as:
- uncentred write3;
- write3 centred on the grand mean;
- write3 centred on the group means + school means; and,
- write3 centred on the grand mean + school means.
The code for the formulas to be passed to MLwiN are:
# No centring f1 = "write7 ~ (0 | cons + write3) + (1 | cons) + (2 | cons)" # Grand mean centring f2 = "write7 ~ (0 | cons + write3C) + (1 | cons) + (2 | cons)" # Group mean centring + School means f3 = "write7 ~ (0 | cons + write3Gp + GrpMeans) + (1 | cons) + (2 | cons)" # Grand mean centring + School means f4 = "write7 ~ (0 | cons + write3C + GrpMeans) + (1 | cons) + (2 | cons)"
Run the models.
mod1 = runMLwiN(Formula = f1, levID = levID, D = D, indata = alnt, estoptions = estoptions, MLwiNPath = mlwin, workdir = tempdir()) mod2 = runMLwiN(Formula = f2, levID = levID, D = D, indata = alnt, estoptions = estoptions, MLwiNPath = mlwin, workdir = tempdir()) mod3 = runMLwiN(Formula = f3, levID = levID, D = D, indata = alnt, estoptions = estoptions, MLwiNPath = mlwin, workdir = tempdir()) mod4 = runMLwiN(Formula = f4, levID = levID, D = D, indata = alnt, estoptions = estoptions, MLwiNPath = mlwin, workdir = tempdir())
Summaries for the 3rd and 4th models only are shown below:
MODEL 3 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN multilevel model (Normal) Estimation algorithm: IGLS Elapsed time : 0.75s Number of obs: 3097 Deviance statistic: 37810.9 --------------------------------------------------------------------------------------------------- The model formula: write7~(0|cons+write3Gp+GrpMeans)+(1|cons)+(2|cons) Level 2: school Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] cons 306.12204 61.41767 4.98 6.22e-07 *** 185.74563 426.49845 write3Gp 0.42190 0.01934 21.81 1.882e-105 *** 0.38398 0.45981 GrpMeans 0.84466 0.11832 7.14 9.426e-13 *** 0.61275 1.07657 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the school level: Coef. Std. Err. [95% Conf. Interval] var_cons 1216.74207 269.28083 688.96134 1744.52279 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. [95% Conf. Interval] var_cons 11335.23047 291.24032 10764.40994 11906.05100 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MODEL 4 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN multilevel model (Normal) Estimation algorithm: IGLS Elapsed time : 0.51s Number of obs: 3097 Deviance statistic: 37810.9 --------------------------------------------------------------------------------------------------- The model formula: write7~(0|cons+write3C+GrpMeans)+(1|cons)+(2|cons) Level 2: school Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] cons 525.47400 62.23589 8.44 3.086e-17 *** 403.49390 647.45410 write3C 0.42190 0.01934 21.81 1.883e-105 *** 0.38398 0.45981 GrpMeans 0.42277 0.11989 3.53 0.0004216 *** 0.18778 0.65776 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the school level: Coef. Std. Err. [95% Conf. Interval] var_cons 1216.73645 269.27958 688.95817 1744.51473 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. [95% Conf. Interval] var_cons 11335.23242 291.24034 10764.41184 11906.05301 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
Consider the estimates of the coefficients for
write3Gp
, write3C
and GrpMeans
in the summaries of the two models. Model 3 contains coefficients for write3 centred on the group means, and the group means. The first coefficient is a within-schools slope (\( \,\beta_w \)); the second coefficient is a between-schools coefficient (\( \,\beta_b \)). Their interpretations are:- \( \beta_w \) is the expected difference in write7 scores for two students in the same school whose write3 scores differ by one point;
- \( \beta_b \) is the expected difference between the mean write7 scores for two schools that differ by one point for mean write3 scores.
Model 4 contains coefficients for write3 centred on the grand mean, and the group means. The first coefficient is again the within-groups slope (\( \,\beta_w \)) (the same within-groups slope that was estimated in Model 3 even though a different version of the predictor was used). The second coefficient is a contextual effect (\( \,\beta_c \)). Its interpretation is:
- \( \beta_c \) is the expected difference in write7 scores for two students with the same write3 scores but who attend schools that differ in mean write3 scores by one point.
The relationship among the three \( \beta \) values is: \( \beta_b = \beta_w + \beta_c \)
Thus, the two coefficients returned by either model can be used to calculate the third.
Using Model 4, \( \,\beta_b \) can be derived from \( \,\beta_w \) and \( \,\beta_c \):
- \( \beta_w = \) 0.4219
- \( \beta_c = \) 0.4228
- Thus, \( \,\beta_b = \) 0.8447
Or, using Model 3, \( \,\beta_c \) can be derived from \( \,\beta_w \) and \( \,\beta_b \):
- \( \beta_w = \) 0.4219
- \( \beta_b = \) 0.8447
- Thus, \( \,\beta_c = \) 0.4228
Categorical predictors: gender
and location
A binomial predictor (gender
) is fitted to the model by naming the variable in the fixed part of the model formula, but to get the category names included in the output, a reference category should be nominated within square brackets. A multinomial variable (location
) is similarly fitted by naming it and nominating a reference category within square brackets.f5 = "write7 ~ (0 | cons + write3C + gender[male] + location[remote]) + (1 | cons) + (2 | cons)" mod5 = runMLwiN(Formula = f5, levID = levID, D = D, indata = alnt, estoptions = estoptions, MLwiNPath = mlwin, workdir = tempdir())
MODEL 5 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN multilevel model (Normal) Estimation algorithm: IGLS Elapsed time : 0.55s Number of obs: 3097 Deviance statistic: 37711.1 --------------------------------------------------------------------------------------------------- The model formula: write7~(0|cons+write3C+gender[male]+location[remote])+(1|cons)+(2|cons) Level 2: school Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] cons 703.17004 21.45731 32.77 1.542e-235 *** 661.11449 745.22560 write3C 0.38592 0.01927 20.02 3.526e-89 *** 0.34814 0.42370 female 41.25539 3.89543 10.59 3.29e-26 *** 33.62049 48.89028 metro 23.29506 22.51005 1.03 0.3007 -20.82384 67.41395 prov 24.03177 27.16254 0.88 0.3763 -29.20582 77.26936 rural 19.27530 23.43201 0.82 0.4107 -26.65060 65.20119 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the school level: Coef. Std. Err. [95% Conf. Interval] var_cons 1508.27771 319.07181 882.90845 2133.64697 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. [95% Conf. Interval] var_cons 10928.06738 280.81009 10377.68973 11478.44504 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
Interaction between a continuous predictor and a binomial predictor: write3C:gender
Interactions are fitted to the model by naming the variables in the fixed part of the model formula, separating the two names with a colon.# No interaction f6 = "write7 ~ (0 | cons + write3C + gender) + (1 | cons) + (2 | cons)" # Interaction f7 = "write7 ~ (0 | cons + write3C + gender + write3C:gender) + (1 | cons) + (2 | cons)" mod6 = runMLwiN(Formula = f6, levID = levID, D = D, indata = alnt, estoptions = estoptions, MLwiNPath = mlwin, workdir = tempdir()) mod7 = runMLwiN(Formula = f7, levID = levID, D = D, indata = alnt, estoptions = estoptions, MLwiNPath = mlwin, workdir = tempdir())
MODEL 6 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN multilevel model (Normal) Estimation algorithm: IGLS Elapsed time : 0.56s Number of obs: 3097 Deviance statistic: 37712.2 --------------------------------------------------------------------------------------------------- The model formula: write7~(0|cons+write3C+gender)+(1|cons)+(2|cons) Level 2: school Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] cons 723.94867 5.59581 129.37 0 *** 712.98108 734.91626 write3C 0.38583 0.01927 20.02 3.471e-89 *** 0.34806 0.42359 gender 41.25045 3.89577 10.59 3.369e-26 *** 33.61488 48.88602 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the school level: Coef. Std. Err. [95% Conf. Interval] var_cons 1521.67712 321.32080 891.89993 2151.45431 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. [95% Conf. Interval] var_cons 10930.38574 280.87301 10379.88476 11480.88673 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MODEL 7 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN multilevel model (Normal) Estimation algorithm: IGLS Elapsed time : 0.52s Number of obs: 3097 Deviance statistic: 37709.4 --------------------------------------------------------------------------------------------------- The model formula: write7~(0|cons+write3C+gender+write3C:gender)+(1|cons)+(2|cons) Level 2: school Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] cons 723.23883 5.60825 128.96 0 *** 712.24686 734.23080 write3C 0.35672 0.02594 13.75 5.077e-43 *** 0.30587 0.40757 gender 41.17575 3.89428 10.57 3.959e-26 *** 33.54311 48.80839 write3C:gender 0.06165 0.03681 1.67 0.09397 . -0.01050 0.13381 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the school level: Coef. Std. Err. [95% Conf. Interval] var_cons 1519.48962 320.89186 890.55314 2148.42611 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. [95% Conf. Interval] var_cons 10920.60156 280.62144 10370.59365 11470.60947 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
Interactions with multinomal predictors: gender:location
and write3C:location
It seems that all the categories (except the reference category) of the multinomail variable implicated in the interaction have to be named.f8 = "write7 ~ (0 | cons + write3C + location[remote] + gender[male] + gender:metro + gender:prov + gender:rural) + (1 | cons) + (2 | cons)" f9 = "write7 ~ (0 | cons + write3C + location[remote] + write3C:metro + write3C:prov + write3C:rural) + (1 | cons) + (2 | cons)" mod8 = runMLwiN(Formula = f8, levID = levID, D = D, indata = alnt, estoptions = estoptions, MLwiNPath = mlwin, workdir = tempdir()) mod9 = runMLwiN(Formula = f9, levID = levID, D = D, indata = alnt, estoptions = estoptions, MLwiNPath = mlwin, workdir = tempdir())
MODEL 8 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN multilevel model (Normal) Estimation algorithm: IGLS Elapsed time : 0.61s Number of obs: 3097 Deviance statistic: 37706.8 --------------------------------------------------------------------------------------------------- The model formula: write7~(0|cons+write3C+location[metro]+gender[male]+gender:prov+gender:rural+gender:remote(1|cons)+(2|cons) Level 2: school Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] cons 728.29382 7.43251 97.99 0 *** 713.72637 742.86127 write3C 0.38605 0.01927 20.03 2.969e-89 *** 0.34828 0.42382 prov -0.24233 19.42681 -0.01 0.99 -38.31818 37.83352 rural -11.41142 12.93321 -0.88 0.3776 -36.76004 13.93720 remote -41.91774 25.48594 -1.64 0.1 -91.86926 8.03377 female 37.61590 4.50712 8.35 7.068e-17 *** 28.78211 46.44969 gender:prov 1.77819 14.83192 0.12 0.9046 -27.29185 30.84823 gender:rural 15.37919 10.37243 1.48 0.1382 -4.95041 35.70878 gender:remote 36.63394 23.55428 1.56 0.1199 -9.53159 82.79948 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the school level: Coef. Std. Err. [95% Conf. Interval] var_cons 1505.50220 318.52834 881.19812 2129.80628 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. [95% Conf. Interval] var_cons 10913.01953 280.42303 10363.40048 11462.63858 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MODEL 9 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN multilevel model (Normal) Estimation algorithm: IGLS Elapsed time : 0.59s Number of obs: 3097 Deviance statistic: 37812.2 --------------------------------------------------------------------------------------------------- The model formula: write7~(0|cons+write3C+location[metro]+write3C:prov+write3C:rural+write3C:remote)+(1|cons)+(2|cons) Level 2: school Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] cons 747.09021 7.12631 104.84 0 *** 733.12289 761.05753 write3C 0.42681 0.02207 19.34 2.435e-83 *** 0.38355 0.47006 prov -3.61633 18.43805 -0.20 0.8445 -39.75424 32.52157 rural -4.88976 12.03440 -0.41 0.6845 -28.47675 18.69722 remote -22.29548 22.68583 -0.98 0.3257 -66.75889 22.16793 write3C:prov -0.12652 0.07598 -1.67 0.09589 . -0.27544 0.02240 write3C:rural 0.03808 0.05334 0.71 0.4754 -0.06648 0.14263 write3C:remote 0.27499 0.11943 2.30 0.0213 * 0.04092 0.50906 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the school level: Coef. Std. Err. [95% Conf. Interval] var_cons 1520.18079 322.95416 887.20226 2153.15931 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. [95% Conf. Interval] var_cons 11295.73242 290.25483 10726.84341 11864.62144 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
Random intercepts and random slopes
So far, the only variation between the schools allowed by the models is in the schools' intercepts, but the slopes can also be allowed to vary. To fit random slopes to the regression model, namewrite3C
in the level 2 random part of the model formula.f10 = "write7 ~ (0 | cons + write3C) + (1 | cons) + (2 | cons + write3C)" mod10 = runMLwiN(Formula = f10, levID = levID, D = D, indata = alnt, estoptions = estoptions, MLwiNPath = mlwin, workdir = tempdir())
The model can be represented as follows:
\[ \begin{align}
write7_{ij} &=\beta_{0j} + \beta_{1j}write3C_{ij} + e_{0ij}\\\
\beta_{0j} &=\beta_{0} + u_{0j}\\\
\beta_{1j} &=\beta_{1} + u_{1j}\\\
\end{align}
\]\[ \begin{align}
\text{ where }
\begin{bmatrix}
u_{0j}\\ u_{1j}
\end{bmatrix}
& \sim N(0,\; \Omega_u) \qquad\text{:}\qquad\Omega_u =
\begin{bmatrix}
\sigma^2_{u0} & \phantom{1} \\
\sigma_{u01} & \sigma^2_{u1}
\end{bmatrix}\\\
\text{and } e_{ij} & \sim N(0,\; \sigma^2_e)
\end{align}
\]
The analysis returns estimates for the fixed parts of the model: the intercept (\( \,\beta_0 \)) and the slope (\( \,\beta_1 \)). As well, there are student level residuals (\( \,e_{ij} \)) and thus a student level (level 1) variance (\( \,\sigma^2_{ij} \)). At the school level, there are residuals for the intercepts (\( \,u_{0j} \)) and for the slopes (\( \,u_{1j} \)), and thus corresponding to the two sets of residuals, there are two school level (level 2) variances (\( \,\sigma^2_{u0} \) and \( \,\sigma^2_{u1} \)). But the school intercepts and slopes can be correlated, and thus there is also a level 2 covariance (\( \,\sigma_{u01} \)).
MODEL 10 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN multilevel model (Normal) Estimation algorithm: IGLS Elapsed time : 0.97s Number of obs: 3097 Deviance statistic: 37809 --------------------------------------------------------------------------------------------------- The model formula: write7~(0|cons+write3C)+(1|cons)+(2|cons+write3C) Level 2: school Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] cons 744.10114 5.12260 145.26 0 *** 734.06103 754.14124 write3C 0.44131 0.02517 17.53 7.906e-69 *** 0.39198 0.49064 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the school level: Coef. Std. Err. [95% Conf. Interval] var_cons 1.406e+03 304.63530 809.37110 2.004e+03 cov_cons_write3C 3.104e+00 1.11109 0.92639 5.282e+00 var_write3C 1.483e-02 0.00676 0.00157 2.809e-02 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. [95% Conf. Interval] var_cons 11227.50391 290.93843 10657.27506 11797.73276 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
The covariance (3.104) corresponds to a correlation of 0.680. It means that schools with larger mean write3 scores tend to have larger slopes.
To constrain the level 2 covariance to zero, the
clre
option is used. clre
is a matrix with three rows. Here, the matrix contains one column because there is only one coviariance to be contrained to zero. The first row of the matrix indicates the level of the covariance to be contrained to zero; here, level 2. The second and third rows indicate the covariance; here, the covariance between cons
and write3C
. clre
is passed to MLwiN as one of the list elements of estoptions
.clre = matrix(, nrow = 3, ncol = 1) clre[1, 1] = 2 clre[2, 1] = "cons" clre[3, 1] = "write3C" estoptions = list(EstM = 0, clre = clre) mod10b = runMLwiN(Formula = f10, levID = levID, D = D, indata = alnt, estoptions = estoptions, MLwiNPath = mlwin, workdir = tempdir())
MODEL 10b -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN multilevel model (Normal) Estimation algorithm: IGLS Elapsed time : 0.56s Number of obs: 3097 Deviance statistic: 37817.2 --------------------------------------------------------------------------------------------------- The model formula: write7~(0|cons+write3C)+(1|cons)+(2|cons+write3C) Level 2: school Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] cons 744.25812 5.21915 142.60 0 *** 734.02877 754.48747 write3C 0.44040 0.02370 18.58 4.536e-77 *** 0.39395 0.48686 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the school level: Coef. Std. Err. [95% Conf. Interval] var_cons 1.474e+03 315.63714 855.19228 2.092e+03 var_write3C 1.001e-02 0.00585 -0.00146 2.147e-02 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. [95% Conf. Interval] var_cons 11239.34082 291.29545 10668.41223 11810.26941 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
The only difference between this model and the previous one is in the variance-covariance structure. The variance-covariance structure for this model can be represented as:
\[ \begin{align}
\begin{bmatrix}
u_{0j}\\ u_{1j}
\end{bmatrix}
& \sim N(0,\; \Omega_u) \qquad\text{:}\qquad\Omega_u =
\begin{bmatrix}
\sigma^2_{u0} & \phantom{1} \\
0 & \sigma^2_{u1}
\end{bmatrix}\\\
\text{and } e_{ij} & \sim N(0,\; \sigma^2_e)
\end{align}
\]
Getting GitHub data into R
Christopher Gandrud's Source_GitHubData function makes it easy to get GitHub data into R. The URL for the ALNT data file is https://raw.github.com/SandyMuspratt/Blog-files/master/R2MLwiN%20I/ALNT.csv. Or, the shortened bitly URL is: http://bit.ly/10A5LBYThe following commands should get the ALNT data into your R session.
# Christopher Gandrud's source_GitHubData function source_GitHubData <-function(url, sep = ",", header = TRUE) { require(httr) request <- GET(url) stop_for_status(request) handle <- textConnection(content(request, as = 'text')) on.exit(close(handle)) read.table(handle, sep = sep, header = header) } url = "http://bit.ly/10A5LBY" alnt = source_GitHubData(url = url) names(alnt)
[1] "student" "school" "gender" "write3" "write7" "location" "dsi"
To leave a comment for the author, please follow the link and comment on their blog: Sandy Muspratt's R Blog.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.