
L is for Latent Variable Path Analysis

For the letter F, I introduced the lavaan package with confirmatory factor analysis. You may have noticed, during my video on interpreting output, that there are two functions for analysis: cfa and sem. When the model you specify is a confirmatory factor analysis, it doesn’t really matter which of these you use, because the results will be a CFA. But there are other models you can specify, which is where the sem function becomes useful.

One of those models is latent variable path analysis, or LVPA for short. This analysis technique combines path analysis, where you specify causal relationships between variables, and confirmatory factor analysis, where combinations of observed variables are used to measure a latent variable or factor. So LVPA allows you to specify which observed variables measure which factors, as well as causal relationships between those factors.

To demonstrate, I’ll revisit an analysis from my B is for Beta post, where I used linear regression to show the predictive relationship between rumination and depression. In that post, I simply used the total scores from my rumination and depression measures, but I could have conducted an LVPA instead, allowing each of the items from these measures to load onto its factor. In fact, this analysis technique is very similar to regression and is sometimes called “structural regression”.
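For reference, here's a minimal sketch of what that total-score regression might look like. It assumes the Facebook data frame loaded in the next chunk, and that the totals are simple sums of the item columns, which may not match the exact scoring used in the Beta post.

# Sketch of the total-score regression analog (assumed scoring: simple item sums)
Facebook$RumTotal<-rowSums(Facebook[, paste0("Rum", 1:22)])
Facebook$DepTotal<-rowSums(Facebook[, paste0("Dep", 1:16)])
summary(lm(DepTotal ~ RumTotal, data=Facebook))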

Once again, we’ll load our Facebook dataset – as a reminder, you can access a simulated version of this dataset (along with a simple codebook). Then we’ll load the lavaan package.

Facebook<-read.delim(file="small_facebook_set.txt", header=TRUE)
library(lavaan)

## This is lavaan 0.5-23.1097

## lavaan is BETA software! Please report any bugs.

Next, we’ll create our model. For the factors, we’ll use the syntax we used previously: the name of the factor, =~, then the names of the variables loading onto that factor. To specify a causal relationship between factors, the factor being caused (the endogenous latent variable) goes first, followed by the ~ symbol, then the causal factor or factors (the exogenous latent variable(s)).

Rum_Dep<-'
Depression =~ Dep1 + Dep2 + Dep3 + Dep4 + Dep5 + Dep6 + Dep7 + Dep8 +
              Dep9 + Dep10 + Dep11 + Dep12 + Dep13 + Dep14 + Dep15 + Dep16
Rumination =~ Rum1 + Rum2 + Rum3 + Rum4 + Rum5 + Rum6 + Rum7 + Rum8 + Rum9 +
              Rum10 + Rum11 + Rum12 + Rum13 + Rum14 + Rum15 + Rum16 +
              Rum17 + Rum18 + Rum19 + Rum20 + Rum21 + Rum22
Depression ~ Rumination
'
RD_Fit<-sem(Rum_Dep, data=Facebook)
summary(RD_Fit, standardized=TRUE, fit.measures=TRUE)

## lavaan (0.5-23.1097) converged normally after  41 iterations
## 
##   Number of observations                           257
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic             1689.041
##   Degrees of freedom                               664
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic             5182.076
##   Degrees of freedom                               703
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.771
##   Tucker-Lewis Index (TLI)                       0.758
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)             -11766.888
##   Loglikelihood unrestricted model (H1)     -10922.367
## 
##   Number of free parameters                         77
##   Akaike (AIC)                               23687.775
##   Bayesian (BIC)                             23961.054
##   Sample-size adjusted Bayesian (BIC)        23716.941
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.078
##   90 Percent Confidence Interval          0.073  0.082
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.076
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Depression =~                                                         
##     Dep1              1.000                               0.451    0.583
##     Dep2              0.806    0.123    6.565    0.000    0.364    0.469
##     Dep3              1.395    0.156    8.944    0.000    0.630    0.706
##     Dep4              0.784    0.149    5.243    0.000    0.354    0.361
##     Dep5              0.799    0.142    5.619    0.000    0.361    0.390
##     Dep6              1.614    0.160   10.084    0.000    0.729    0.856
##     Dep7              0.901    0.145    6.204    0.000    0.407    0.438
##     Dep8              1.207    0.140    8.601    0.000    0.545    0.667
##     Dep9              0.734    0.126    5.817    0.000    0.332    0.406
##     Dep10             1.401    0.157    8.902    0.000    0.633    0.701
##     Dep11             0.849    0.116    7.292    0.000    0.383    0.534
##     Dep12             1.106    0.142    7.818    0.000    0.499    0.585
##     Dep13             0.826    0.113    7.324    0.000    0.373    0.537
##     Dep14             1.420    0.142    9.976    0.000    0.641    0.840
##     Dep15             1.161    0.140    8.270    0.000    0.524    0.631
##     Dep16             0.980    0.136    7.194    0.000    0.442    0.525
##   Rumination =~                                                         
##     Rum1              1.000                               0.609    0.596
##     Rum2              0.834    0.120    6.967    0.000    0.508    0.492
##     Rum3              0.787    0.118    6.642    0.000    0.479    0.465
##     Rum4              0.899    0.120    7.492    0.000    0.548    0.538
##     Rum5              1.071    0.141    7.624    0.000    0.653    0.550
##     Rum6              1.100    0.133    8.283    0.000    0.671    0.612
##     Rum7              1.158    0.140    8.301    0.000    0.706    0.613
##     Rum8              1.133    0.136    8.341    0.000    0.691    0.617
##     Rum9              1.043    0.130    8.040    0.000    0.635    0.588
##     Rum10             1.145    0.134    8.526    0.000    0.698    0.635
##     Rum11             1.055    0.134    7.885    0.000    0.643    0.574
##     Rum12             0.564    0.115    4.891    0.000    0.343    0.329
##     Rum13             0.788    0.108    7.282    0.000    0.480    0.519
##     Rum14             1.137    0.128    8.872    0.000    0.693    0.671
##     Rum15             1.282    0.143    8.968    0.000    0.781    0.681
##     Rum16             1.363    0.142    9.581    0.000    0.830    0.748
##     Rum17             1.271    0.136    9.356    0.000    0.775    0.723
##     Rum18             1.256    0.137    9.170    0.000    0.765    0.702
##     Rum19             1.168    0.129    9.069    0.000    0.711    0.692
##     Rum20             1.338    0.142    9.404    0.000    0.815    0.728
##     Rum21             0.978    0.130    7.531    0.000    0.596    0.542
##     Rum22             1.248    0.136    9.153    0.000    0.760    0.701
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Depression ~                                                          
##     Rumination        0.434    0.068    6.396    0.000    0.586    0.586
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .Dep1              0.395    0.036   10.859    0.000    0.395    0.660
##    .Dep2              0.469    0.042   11.076    0.000    0.469    0.780
##    .Dep3              0.399    0.038   10.413    0.000    0.399    0.501
##    .Dep4              0.836    0.075   11.198    0.000    0.836    0.870
##    .Dep5              0.724    0.065   11.170    0.000    0.724    0.848
##    .Dep6              0.193    0.022    8.777    0.000    0.193    0.267
##    .Dep7              0.696    0.063   11.117    0.000    0.696    0.808
##    .Dep8              0.370    0.035   10.593    0.000    0.370    0.555
##    .Dep9              0.556    0.050   11.154    0.000    0.556    0.835
##    .Dep10             0.413    0.040   10.438    0.000    0.413    0.508
##    .Dep11             0.368    0.034   10.967    0.000    0.368    0.715
##    .Dep12             0.480    0.044   10.856    0.000    0.480    0.658
##    .Dep13             0.343    0.031   10.961    0.000    0.343    0.711
##    .Dep14             0.171    0.019    9.098    0.000    0.171    0.294
##    .Dep15             0.414    0.039   10.723    0.000    0.414    0.601
##    .Dep16             0.514    0.047   10.984    0.000    0.514    0.724
##    .Rum1              0.675    0.062   10.917    0.000    0.675    0.645
##    .Rum2              0.807    0.073   11.092    0.000    0.807    0.758
##    .Rum3              0.833    0.075   11.126    0.000    0.833    0.784
##    .Rum4              0.737    0.067   11.026    0.000    0.737    0.710
##    .Rum5              0.983    0.089   11.006    0.000    0.983    0.698
##    .Rum6              0.752    0.069   10.881    0.000    0.752    0.626
##    .Rum7              0.826    0.076   10.877    0.000    0.826    0.624
##    .Rum8              0.775    0.071   10.867    0.000    0.775    0.619
##    .Rum9              0.763    0.070   10.933    0.000    0.763    0.654
##    .Rum10             0.718    0.066   10.820    0.000    0.718    0.596
##    .Rum11             0.842    0.077   10.962    0.000    0.842    0.671
##    .Rum12             0.971    0.086   11.243    0.000    0.971    0.892
##    .Rum13             0.624    0.056   11.054    0.000    0.624    0.730
##    .Rum14             0.587    0.055   10.713    0.000    0.587    0.550
##    .Rum15             0.706    0.066   10.677    0.000    0.706    0.536
##    .Rum16             0.542    0.052   10.366    0.000    0.542    0.440
##    .Rum17             0.548    0.052   10.502    0.000    0.548    0.478
##    .Rum18             0.601    0.057   10.594    0.000    0.601    0.507
##    .Rum19             0.552    0.052   10.637    0.000    0.552    0.522
##    .Rum20             0.589    0.056   10.475    0.000    0.589    0.470
##    .Rum21             0.855    0.078   11.020    0.000    0.855    0.707
##    .Rum22             0.600    0.057   10.601    0.000    0.600    0.509
##    .Depression        0.134    0.027    4.880    0.000    0.657    0.657
##     Rumination        0.371    0.072    5.123    0.000    1.000    1.000

The factor analysis results are interpreted in the same way as before. The only difference is that we also have a path coefficient between Rumination and Depression, which describes the strength of the relationship between the two latent variables. These results confirm our regression results: rumination has a strong predictive relationship with depression, which we can see from our standardized path coefficient of 0.586. But the model also shows some signs of poor fit, based on our CFI and TLI, both less than 0.9, as well as a slightly elevated RMSEA (greater than our cutoff of 0.07).
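If you just want those fit indices (or the standardized paths) without scrolling through the full summary, lavaan's fitMeasures and standardizedSolution functions will pull them out. A quick sketch:

# Just the fit indices discussed above
fitMeasures(RD_Fit, c("cfi", "tli", "rmsea", "srmr"))

# Just the standardized regression (~) rows, including Depression ~ Rumination
subset(standardizedSolution(RD_Fit), op == "~")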

How could we potentially improve this model? In the Beta post, we also conducted a separate regression in which we broke Rumination down into its 3 subscales: Depression-Related Rumination, Brooding, and Reflecting. We could try another LVPA that uses those 3 subscales instead of a single Rumination factor.

So let’s create another model including 4 factors: Depression (using the CESD items), Depression-Related Rumination, Brooding, and Reflecting. We’ll then add causal paths from these 3 Rumination factors to Depression.

Rum3_Dep<-'
Depression =~ Dep1 + Dep2 + Dep3 + Dep4 + Dep5 + Dep6 + Dep7 + Dep8 +
              Dep9 + Dep10 + Dep11 + Dep12 + Dep13 + Dep14 + Dep15 + Dep16
DRR =~ Rum1 + Rum2 + Rum3 + Rum4 + Rum6 + Rum8 + Rum9 + Rum14 + Rum17 + Rum18 + 
              Rum19 + Rum22
Reflecting =~ Rum7 + Rum11 + Rum12 + Rum20 + Rum21
Brooding =~ Rum5 + Rum10 + Rum13 + Rum15 + Rum16
Depression ~ DRR + Reflecting + Brooding
'
RD3<-sem(Rum3_Dep, data=Facebook)
summary(RD3, standardized=TRUE, fit.measures=TRUE)

## lavaan (0.5-23.1097) converged normally after  61 iterations
## 
##   Number of observations                           257
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic             1566.635
##   Degrees of freedom                               659
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic             5182.076
##   Degrees of freedom                               703
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.797
##   Tucker-Lewis Index (TLI)                       0.784
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)             -11705.685
##   Loglikelihood unrestricted model (H1)     -10922.367
## 
##   Number of free parameters                         82
##   Akaike (AIC)                               23575.369
##   Bayesian (BIC)                             23866.394
##   Sample-size adjusted Bayesian (BIC)        23606.429
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.073
##   90 Percent Confidence Interval          0.069  0.078
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.073
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Depression =~                                                         
##     Dep1              1.000                               0.451    0.583
##     Dep2              0.807    0.123    6.569    0.000    0.364    0.469
##     Dep3              1.396    0.156    8.932    0.000    0.629    0.706
##     Dep4              0.783    0.150    5.234    0.000    0.353    0.360
##     Dep5              0.809    0.143    5.670    0.000    0.365    0.395
##     Dep6              1.616    0.160   10.075    0.000    0.729    0.856
##     Dep7              0.906    0.146    6.223    0.000    0.408    0.440
##     Dep8              1.206    0.141    8.584    0.000    0.544    0.666
##     Dep9              0.735    0.126    5.815    0.000    0.331    0.406
##     Dep10             1.405    0.158    8.903    0.000    0.633    0.702
##     Dep11             0.846    0.116    7.263    0.000    0.381    0.532
##     Dep12             1.108    0.142    7.814    0.000    0.500    0.585
##     Dep13             0.825    0.113    7.307    0.000    0.372    0.536
##     Dep14             1.421    0.143    9.964    0.000    0.641    0.840
##     Dep15             1.159    0.141    8.248    0.000    0.523    0.630
##     Dep16             0.986    0.137    7.221    0.000    0.445    0.528
##   DRR =~                                                                
##     Rum1              1.000                               0.617    0.603
##     Rum2              0.833    0.118    7.036    0.000    0.514    0.498
##     Rum3              0.812    0.118    6.898    0.000    0.501    0.486
##     Rum4              0.928    0.119    7.779    0.000    0.573    0.562
##     Rum6              1.117    0.132    8.486    0.000    0.689    0.628
##     Rum8              1.135    0.134    8.461    0.000    0.700    0.626
##     Rum9              1.052    0.128    8.201    0.000    0.649    0.601
##     Rum14             1.132    0.126    8.971    0.000    0.699    0.676
##     Rum17             1.238    0.133    9.319    0.000    0.764    0.713
##     Rum18             1.236    0.134    9.199    0.000    0.763    0.700
##     Rum19             1.174    0.127    9.234    0.000    0.724    0.704
##     Rum22             1.238    0.134    9.236    0.000    0.764    0.704
##   Reflecting =~                                                         
##     Rum7              1.000                               0.841    0.731
##     Rum11             0.907    0.089   10.149    0.000    0.763    0.681
##     Rum12             0.550    0.083    6.609    0.000    0.462    0.443
##     Rum20             1.073    0.090   11.856    0.000    0.903    0.806
##     Rum21             0.872    0.088    9.937    0.000    0.734    0.667
##   Brooding =~                                                           
##     Rum5              1.000                               0.676    0.570
##     Rum10             1.086    0.132    8.229    0.000    0.734    0.669
##     Rum13             0.705    0.103    6.835    0.000    0.477    0.516
##     Rum15             1.229    0.142    8.659    0.000    0.831    0.725
##     Rum16             1.332    0.144    9.243    0.000    0.901    0.812
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Depression ~                                                          
##     DRR               0.748    0.245    3.050    0.002    1.024    1.024
##     Reflecting       -0.067    0.068   -0.975    0.329   -0.124   -0.124
##     Brooding         -0.223    0.188   -1.186    0.235   -0.335   -0.335
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   DRR ~~                                                                
##     Reflecting        0.413    0.062    6.691    0.000    0.796    0.796
##     Brooding          0.386    0.061    6.288    0.000    0.924    0.924
##   Reflecting ~~                                                         
##     Brooding          0.420    0.068    6.214    0.000    0.738    0.738
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .Dep1              0.396    0.036   10.862    0.000    0.396    0.660
##    .Dep2              0.469    0.042   11.076    0.000    0.469    0.780
##    .Dep3              0.399    0.038   10.419    0.000    0.399    0.502
##    .Dep4              0.836    0.075   11.199    0.000    0.836    0.870
##    .Dep5              0.721    0.065   11.166    0.000    0.721    0.844
##    .Dep6              0.193    0.022    8.779    0.000    0.193    0.267
##    .Dep7              0.695    0.063   11.115    0.000    0.695    0.807
##    .Dep8              0.371    0.035   10.600    0.000    0.371    0.556
##    .Dep9              0.556    0.050   11.154    0.000    0.556    0.835
##    .Dep10             0.412    0.039   10.436    0.000    0.412    0.507
##    .Dep11             0.369    0.034   10.973    0.000    0.369    0.717
##    .Dep12             0.480    0.044   10.857    0.000    0.480    0.658
##    .Dep13             0.344    0.031   10.965    0.000    0.344    0.713
##    .Dep14             0.171    0.019    9.108    0.000    0.171    0.295
##    .Dep15             0.416    0.039   10.730    0.000    0.416    0.604
##    .Dep16             0.512    0.047   10.980    0.000    0.512    0.721
##    .Rum1              0.666    0.062   10.824    0.000    0.666    0.636
##    .Rum2              0.802    0.073   11.042    0.000    0.802    0.752
##    .Rum3              0.812    0.073   11.060    0.000    0.812    0.764
##    .Rum4              0.709    0.065   10.922    0.000    0.709    0.684
##    .Rum6              0.728    0.068   10.751    0.000    0.728    0.605
##    .Rum8              0.762    0.071   10.758    0.000    0.762    0.608
##    .Rum9              0.745    0.069   10.829    0.000    0.745    0.639
##    .Rum14             0.578    0.055   10.577    0.000    0.578    0.542
##    .Rum17             0.565    0.054   10.404    0.000    0.565    0.492
##    .Rum18             0.605    0.058   10.470    0.000    0.605    0.510
##    .Rum19             0.534    0.051   10.451    0.000    0.534    0.505
##    .Rum22             0.594    0.057   10.451    0.000    0.594    0.504
##    .Rum7              0.616    0.067    9.208    0.000    0.616    0.465
##    .Rum11             0.673    0.069    9.743    0.000    0.673    0.536
##    .Rum12             0.876    0.080   10.894    0.000    0.876    0.804
##    .Rum20             0.439    0.056    7.885    0.000    0.439    0.350
##    .Rum21             0.672    0.068    9.866    0.000    0.672    0.555
##    .Rum5              0.952    0.089   10.652    0.000    0.952    0.675
##    .Rum10             0.665    0.065   10.167    0.000    0.665    0.552
##    .Rum13             0.627    0.058   10.822    0.000    0.627    0.734
##    .Rum15             0.624    0.064    9.720    0.000    0.624    0.475
##    .Rum16             0.420    0.050    8.399    0.000    0.420    0.341
##    .Depression        0.122    0.026    4.626    0.000    0.599    0.599
##     DRR               0.380    0.074    5.176    0.000    1.000    1.000
##     Reflecting        0.708    0.111    6.406    0.000    1.000    1.000
##     Brooding          0.457    0.097    4.734    0.000    1.000    1.000

Fit measures are about the same as before – once again, this could be because we’re measuring clinical constructs in a non-clinical sample – but let’s skip those and look at our path coefficients. As we found in the regression, Depression-Related Rumination has a significant relationship with Depression; Reflecting and Brooding do not. So we could simplify our model by dropping those path coefficients if we wanted. Personally, I would leave them in as further evidence that the kind of rumination most strongly related to depression is rumination that fixates on one’s negative traits and feelings. Reflecting on feelings, or being morose in general, doesn’t seem to contribute – at least not above and beyond the first kind of rumination.
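If you want to put “about the same” into numbers, one option is to line up fit measures from the two models side by side; anova also gives a chi-square difference test, though that is only meaningful if you're comfortable treating the one-factor model as nested within the three-factor one. A quick sketch:

# Side-by-side fit measures for the one-factor and three-factor models
sapply(list(one_factor = RD_Fit, three_factor = RD3),
       fitMeasures,
       fit.measures = c("chisq", "df", "cfi", "tli", "rmsea", "srmr", "aic", "bic"))

# Chi-square difference test (assumes the models are nested)
anova(RD_Fit, RD3)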

On Sunday, we’ll dig into fit measures – how they’re calculated and what they mean – so check back then! And tomorrow in A to Z: R Markdown files, which I’ve been using this month to create most of my posts.
