In this post, I want to give a short overview of the new functions in sjstats that report different effect size measures. These are useful beyond significance tests (p-values), because they estimate the magnitude of effects, independent of sample size. sjstats provides the following functions:
- eta_sq()
- omega_sq()
- cohens_f()
- anova_stats()
First, we need a sample model:
library(sjstats)

# load sample data
data(efc)

# fit linear model
fit <- aov(
  c12hour ~ as.factor(e42dep) + as.factor(c172code) + c160age,
  data = efc
)
All functions accept objects of class aov or anova, so you can also use model fits from the car package, which allows fitting ANOVAs with different types of sums of squares. Other objects, such as lm, are coerced to anova internally.
The following functions return the effect size statistic as a named numeric vector, using the model’s term names.
Eta Squared
Eta squared is the proportion of the total variability in the dependent variable that is accounted for by the variation in the independent variable. It is the ratio of the sum of squares for each model term to the total sum of squares, and can be interpreted as the percentage of variance accounted for by a variable.
For variables with 1 degree of freedom (in the numerator), the square root of eta squared is equal to the correlation coefficient r. For variables with more than 1 degree of freedom, eta squared equals R². This makes eta squared easily interpretable. Furthermore, these effect sizes can easily be converted into effect size measures that can be, for instance, further processed in meta-analyses.
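The relation between eta squared and r is easy to check by hand in the simplest case, a model with a single numeric predictor. The following is only a sketch in base R: it uses the built-in mtcars data rather than the efc data from this post, and the object names are my own.

```r
# Illustrative only: built-in mtcars data, not the efc data used in this post.
fit_1df <- aov(mpg ~ wt, data = mtcars)

# Sums of squares from the ANOVA table: first row is wt, second is Residuals.
ss <- summary(fit_1df)[[1]][["Sum Sq"]]

# Eta squared for wt: effect sum of squares over total sum of squares.
eta2_wt <- ss[1] / sum(ss)

# Pearson correlation between outcome and predictor.
r <- cor(mtcars$mpg, mtcars$wt)

# With one 1-df predictor, sqrt(eta squared) matches the absolute value of r.
all.equal(sqrt(eta2_wt), abs(r))
```

Note that the square root gives the absolute value of r, so the sign of the association has to be taken from the model coefficients.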
Eta squared can be computed simply with:
eta_sq(fit)
#>   as.factor(e42dep) as.factor(c172code)             c160age
#>         0.266114185         0.005399167         0.048441046
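To make the definition concrete, here is a minimal base-R sketch that computes eta squared directly from the sums of squares in an ANOVA table. It is not how sjstats does it internally, as far as I can tell from the outside; eta_sq_manual() is a hypothetical helper, and the built-in warpbreaks data stands in for efc.

```r
# Illustrative only: built-in warpbreaks data, not the efc data from this post.
fit_wb <- aov(breaks ~ wool + tension, data = warpbreaks)

# ANOVA table as a data frame; its row names are padded, so trim them.
tab <- summary(fit_wb)[[1]]
rownames(tab) <- trimws(rownames(tab))

# eta_sq_manual() is a hypothetical helper, not part of sjstats.
eta_sq_manual <- function(tab) {
  ss <- setNames(tab[["Sum Sq"]], rownames(tab))
  effects <- ss[names(ss) != "Residuals"]
  # Each term's sum of squares divided by the total sum of squares.
  effects / sum(ss)
}

eta_sq_manual(tab)
```

Because every term shares the same denominator, the eta squared values plus the residual share of the sum of squares add up to 1.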
Partial Eta Squared
The partial eta squared value is the ratio of the sum of squares for each group level to the sum of squares for each group level plus the residual sum of squares. It is more difficult to interpret, because its value strongly depends on the variability of the residuals. Partial eta squared values should be reported with caution, and Levine and Hullett (2002) recommend reporting eta or omega squared rather than partial eta squared.
Use the partial argument to compute partial eta squared values:
eta_sq(fit, partial = TRUE)
#>   as.factor(e42dep) as.factor(c172code)             c160age
#>         0.281257128         0.007876882         0.066495448
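The dependence on the residuals is easier to see when the ratio is written out. This is a base-R sketch under the same caveats as any hand computation: the built-in warpbreaks data replaces efc, and the variable names are my own.

```r
# Illustrative only: built-in warpbreaks data, not the efc data from this post.
fit_wb <- aov(breaks ~ wool + tension, data = warpbreaks)
tab <- summary(fit_wb)[[1]]

# Named sums of squares; row names are padded in the summary, so trim them.
ss <- setNames(tab[["Sum Sq"]], trimws(rownames(tab)))
ss_resid  <- ss[["Residuals"]]
ss_effect <- ss[names(ss) != "Residuals"]

# Partial eta squared: the denominator holds only this effect and the residuals.
partial_eta2 <- ss_effect / (ss_effect + ss_resid)

# Eta squared: the denominator is the total sum of squares.
eta2 <- ss_effect / sum(ss)

rbind(eta2, partial_eta2)
```

Since the partial denominator leaves out the other terms, partial eta squared is never smaller than eta squared, and the values for several terms no longer need to sum to at most 1.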
Omega Squared
While eta squared estimates tend to be biased in certain situations, e.g. when the sample size is small or the independent variables have many group levels, omega squared estimates are corrected for this bias.
Omega squared can be simply computed with:
omega_sq(fit)
#>   as.factor(e42dep) as.factor(c172code)             c160age
#>         0.263453157         0.003765292         0.047586841
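A common textbook formula for omega squared is (SS_effect - df_effect * MS_residual) / (SS_total + MS_residual). I am assuming this formula here rather than quoting the sjstats source, although it is consistent with the values printed in this post. The sketch below applies it in base R to the built-in warpbreaks data (not efc), with variable names of my own choosing.

```r
# Illustrative only: built-in warpbreaks data, not the efc data from this post.
fit_wb <- aov(breaks ~ wool + tension, data = warpbreaks)
tab <- summary(fit_wb)[[1]]

terms    <- trimws(rownames(tab))
is_resid <- terms == "Residuals"
ss <- tab[["Sum Sq"]]
df <- tab[["Df"]]

# Residual mean square, used as the bias correction.
ms_resid <- ss[is_resid] / df[is_resid]

# Assumed textbook formula for omega squared (see lead-in).
omega2 <- (ss[!is_resid] - df[!is_resid] * ms_resid) / (sum(ss) + ms_resid)
names(omega2) <- terms[!is_resid]

# Eta squared for comparison.
eta2 <- setNames(ss[!is_resid] / sum(ss), terms[!is_resid])

rbind(eta2, omega2)
```

The correction shrinks the numerator and enlarges the denominator, so omega squared always comes out below eta squared, and the gap grows with the number of group levels and with smaller samples.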
Cohen’s F
Finally, cohens_f() computes Cohen’s F effect size for all independent variables in the model:
cohens_f(fit)
#>   as.factor(e42dep) as.factor(c172code)             c160age
#>          0.62555427          0.08910342          0.26689334
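The values above are consistent with the usual conversion from partial eta squared, f = sqrt(partial eta squared / (1 - partial eta squared)). Here is a small base-R sketch of that conversion, assuming this formula and using the built-in warpbreaks data instead of efc; the variable names are mine.

```r
# Illustrative only: built-in warpbreaks data, not the efc data from this post.
fit_wb <- aov(breaks ~ wool + tension, data = warpbreaks)
tab <- summary(fit_wb)[[1]]

ss <- setNames(tab[["Sum Sq"]], trimws(rownames(tab)))
ss_resid  <- ss[["Residuals"]]
ss_effect <- ss[names(ss) != "Residuals"]

# Partial eta squared per term.
partial_eta2 <- ss_effect / (ss_effect + ss_resid)

# Assumed conversion to Cohen's f (see lead-in).
f <- sqrt(partial_eta2 / (1 - partial_eta2))

# Algebraically the same as sqrt(SS_effect / SS_residual).
all.equal(f, sqrt(ss_effect / ss_resid))

f
```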
Complete Statistical Table Output
The anova_stats() function takes a model input and computes a comprehensive summary, including the above effect size measures, returned as a tidy data frame (as a tibble, to be exact):
anova_stats(fit)
#> # A tibble: 4 x 11
#>   term                 df      sumsq     meansq statistic p.value etasq partial.etasq omegasq cohens.f power
#> 1 as.factor(e42dep)     3  577756.33 192585.444   108.786   0.000 0.266         0.281   0.263    0.626  1.00
#> 2 as.factor(c172code)   2   11722.05   5861.024     3.311   0.037 0.005         0.008   0.004    0.089  0.63
#> 3 c160age               1  105169.60 105169.595    59.408   0.000 0.048         0.066   0.048    0.267  1.00
#> 4 Residuals           834 1476436.34   1770.307        NA      NA    NA            NA      NA       NA    NA
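As a sanity check, the effect size columns can be recomputed from the sumsq and df columns of this table alone. The sketch below copies those numbers by hand and applies the usual textbook formulas; the formulas are my assumption, not taken from the sjstats source. The power column is left out.

```r
# Sums of squares and degrees of freedom copied from the table above.
sumsq <- c(
  "as.factor(e42dep)"   = 577756.33,
  "as.factor(c172code)" = 11722.05,
  "c160age"             = 105169.60
)
df       <- c(3, 2, 1)
ss_resid <- 1476436.34
df_resid <- 834

ms_resid <- ss_resid / df_resid      # residual mean square
ss_total <- sum(sumsq) + ss_resid    # total sum of squares

partial <- sumsq / (sumsq + ss_resid)

check <- data.frame(
  etasq         = sumsq / ss_total,
  partial.etasq = partial,
  omegasq       = (sumsq - df * ms_resid) / (ss_total + ms_resid),
  cohens.f      = sqrt(partial / (1 - partial))
)

# Round to three digits, as in the printed table.
round(check, 3)
```

Rounded to three digits, these hand-computed values agree with the etasq, partial.etasq, omegasq and cohens.f columns above.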
Like the other functions, the input may also be an object of class anova, so you can also use model fits from the car package, which allows fitting ANOVAs with different types of sums of squares:
anova_stats(car::Anova(fit, type = 3))
#> # A tibble: 5 x 11
#>   term                      sumsq     meansq  df statistic p.value etasq partial.etasq omegasq cohens.f power
#> 1 (Intercept)           26851.070  26851.070   1    15.167   0.000 0.013         0.018   0.012    0.135 0.973
#> 2 as.factor(e42dep)    426461.571 142153.857   3    80.299   0.000 0.209         0.224   0.206    0.537 1.000
#> 3 as.factor(c172code)    7352.049   3676.025   2     2.076   0.126 0.004         0.005   0.002    0.071 0.429
#> 4 c160age              105169.595 105169.595   1    59.408   0.000 0.051         0.066   0.051    0.267 1.000
#> 5 Residuals           1476436.343   1770.307 834        NA      NA    NA            NA      NA       NA    NA
References
Levine TR, Hullett CR. Eta Squared, Partial Eta Squared, and Misreporting of Effect Size in Communication Research. Human Communication Research 2002; 28(4): 612-625.