X is for By

[This article was first published on Deeply Trivial, and kindly contributed to R-bloggers].
Today’s post will be rather short, demonstrating a set of functions from the psych package that allow you to conduct analysis by group. These commands add “By” to the end of existing functions. But first, a word of caution: With great power comes great responsibility. These functions could very easily turn into a fishing expedition (also known as p-hacking). Conducting planned group comparisons is fine. Conducting all possible group comparisons and cherry-picking any differences is problematic. So use these group by functions with care.

Let’s pull up the Facebook dataset for this.

Facebook<-read.delim(file="full_facebook_set.txt", header=TRUE)

This is the full dataset, which includes all the variables I collected. I don’t want to run analyses on all variables, so I’ll pull out the ones most important for this blog post demonstration.

smallFB<-Facebook[,c(1:2,77:80,105:116,122,133:137,170,187)]
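Selecting columns by position, as above, is compact but depends on the column order of the file never changing. As a rough sketch of an alternative, the same kind of subset can be taken by column name. The data frame toy_fb and its values are made up for illustration; only the column names mirror ones that appear in the output later in this post.

```r
# Hypothetical stand-in for the full dataset; all values are made up.
toy_fb <- data.frame(
  RespondentId = 1:4,
  gender       = c(0, 1, 1, 0),
  Unused       = c(9, 9, 9, 9),
  Rumination   = c(37, 40, 25, 51),
  Depression   = c(9, 12, 4, 20)
)

# Select by name instead of by position, so reordering columns
# in the source file does not change which variables are kept.
keep <- c("RespondentId", "gender", "Rumination", "Depression")
small_toy <- toy_fb[, keep]

names(small_toy)
```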

First, I’ll run descriptives on this smaller data frame by gender.

library(psych)

## Warning: package 'psych' was built under R version 3.4.4

describeBy(smallFB,smallFB$gender)

## 
##  Descriptive statistics by group 
## group: 0
##              vars  n      mean      sd   median   trimmed     mad      min
## RespondentId    1 73 164647.77 1711.78 164943.0 164587.37 2644.96 162373.0
## gender          2 73      0.00    0.00      0.0      0.00    0.00      0.0
## Rumination      3 73     37.66   14.27     37.0     37.41   13.34      8.0
## DepRelat        4 73     21.00    7.86     21.0     20.95    5.93      4.0
## Brood           5 73      8.49    3.76      9.0      8.42    2.97      1.0
## Reflect         6 73      8.16    4.44      8.0      8.24    4.45      0.0
## SavorPos        7 73     64.30   10.93     65.0     64.92    8.90     27.0
## SavorNeg        8 73     33.30   11.48     33.0     33.08   13.34     12.0
## SavorTot        9 73     31.00   20.15     34.0     31.15   19.27    -10.0
## AntPos         10 73     20.85    3.95     21.0     20.93    4.45     10.0
## AntNeg         11 73     11.30    4.23     11.0     11.22    4.45      4.0
## AntTot         12 73      9.55    6.90     10.0      9.31    7.41     -3.0
## MomPos         13 73     21.68    3.95     22.0     21.90    2.97      9.0
## MomNeg         14 73     11.45    4.63     11.0     11.41    5.93      4.0
## MomTot         15 73     10.23    7.63     11.0     10.36    8.90    -11.0
## RemPos         16 73     21.77    4.53     23.0     22.20    4.45      8.0
## RemNeg         17 73     10.55    4.39      9.0     10.27    4.45      4.0
## RemTot         18 73     11.22    8.05     14.0     11.68    7.41     -8.0
## LifeSat        19 73     24.63    6.80     25.0     24.93    7.41     10.0
## Extravert      20 73      4.32    1.58      4.5      4.33    1.48      1.5
## Agreeable      21 73      4.79    1.08      5.0      4.85    1.48      1.0
## Conscient      22 73      5.14    1.34      5.0      5.19    1.48      2.0
## EmotStab       23 73      5.10    1.22      5.0      5.15    1.48      1.0
## OpenExp        24 73      5.11    1.29      5.5      5.20    1.48      2.0
## Health         25 73     28.77   19.56     25.0     26.42   17.79      0.0
## Depression     26 73     10.26    7.27      9.0      9.56    5.93      0.0
##                 max  range  skew kurtosis     se
## RespondentId 168279 5906.0  0.21    -1.36 200.35
## gender            0    0.0   NaN      NaN   0.00
## Rumination       71   63.0  0.12    -0.53   1.67
## DepRelat         42   38.0  0.10    -0.04   0.92
## Brood            17   16.0  0.15    -0.38   0.44
## Reflect          19   19.0 -0.12    -0.69   0.52
## SavorPos         84   57.0 -0.69     0.76   1.28
## SavorNeg         57   45.0  0.14    -0.95   1.34
## SavorTot         72   82.0 -0.17    -0.75   2.36
## AntPos           28   18.0 -0.24    -0.46   0.46
## AntNeg           22   18.0  0.27    -0.55   0.49
## AntTot           24   27.0  0.11    -0.76   0.81
## MomPos           28   19.0 -0.69     0.55   0.46
## MomNeg           22   18.0  0.08    -0.98   0.54
## MomTot           24   35.0 -0.25    -0.55   0.89
## RemPos           28   20.0 -0.88     0.35   0.53
## RemNeg           22   18.0  0.56    -0.66   0.51
## RemTot           24   32.0 -0.53    -0.77   0.94
## LifeSat          35   25.0 -0.37    -0.84   0.80
## Extravert         7    5.5 -0.09    -0.93   0.19
## Agreeable         7    6.0 -0.60     1.04   0.13
## Conscient         7    5.0 -0.24    -0.98   0.16
## EmotStab          7    6.0 -0.60     0.28   0.14
## OpenExp           7    5.0 -0.49    -0.55   0.15
## Health           91   91.0  1.13     1.14   2.29
## Depression       36   36.0  1.02     0.95   0.85
## -------------------------------------------------------- 
## group: 1
##              vars   n      mean      sd    median   trimmed     mad
## RespondentId    1 184 164373.49 1515.34 164388.00 164253.72 1891.80
## gender          2 184      1.00    0.00      1.00      1.00    0.00
## Rumination      3 184     38.09   15.28     40.00     38.16   17.05
## DepRelat        4 184     21.67    8.78     21.00     21.66    8.90
## Brood           5 184      8.57    4.14      8.50      8.47    3.71
## Reflect         6 184      7.84    4.06      8.00      7.73    4.45
## SavorPos        7 184     67.22    9.63     68.00     67.71    8.90
## SavorNeg        8 184     29.75   11.62     27.50     28.72    9.64
## SavorTot        9 184     37.47   19.30     40.00     38.66   20.02
## AntPos         10 184     22.18    3.37     23.00     22.28    2.97
## AntNeg         11 184     10.10    4.44      9.00      9.78    4.45
## AntTot         12 184     12.08    6.85     14.00     12.36    5.93
## MomPos         13 184     22.28    3.88     23.00     22.59    2.97
## MomNeg         14 184     10.60    4.88      9.50     10.13    5.19
## MomTot         15 184     11.68    7.75     13.00     12.29    7.41
## RemPos         16 184     22.76    3.85     23.00     23.10    2.97
## RemNeg         17 184      9.05    3.79      8.00      8.68    2.97
## RemTot         18 184     13.71    6.97     15.00     14.34    5.93
## LifeSat        19 184     23.76    6.25     24.00     24.18    7.41
## Extravert      20 184      4.66    1.57      5.00      4.74    1.48
## Agreeable      21 184      5.22    1.06      5.50      5.26    1.48
## Conscient      22 184      5.32    1.24      5.50      5.42    1.48
## EmotStab       23 184      4.70    1.31      4.75      4.75    1.11
## OpenExp        24 184      5.47    1.08      5.50      5.56    0.74
## Health         25 184     32.54   16.17     30.00     31.43   16.31
## Depression     26 184     12.19    8.48      9.00     11.09    5.93
##                   min    max  range  skew kurtosis     se
## RespondentId 162350.0 167714 5364.0  0.46    -0.90 111.71
## gender            1.0      1    0.0   NaN      NaN   0.00
## Rumination        3.0     74   71.0 -0.05    -0.60   1.13
## DepRelat          0.0     42   42.0  0.00    -0.46   0.65
## Brood             0.0     19   19.0  0.19    -0.62   0.31
## Reflect           0.0     19   19.0  0.25    -0.48   0.30
## SavorPos         33.0     84   51.0 -0.59     0.36   0.71
## SavorNeg         12.0     64   52.0  0.79     0.25   0.86
## SavorTot        -18.0     72   90.0 -0.57    -0.10   1.42
## AntPos            9.0     28   19.0 -0.49     0.41   0.25
## AntNeg            4.0     22   18.0  0.63    -0.39   0.33
## AntTot           -8.0     24   32.0 -0.43    -0.48   0.50
## MomPos           10.0     28   18.0 -0.81     0.54   0.29
## MomNeg            4.0     24   20.0  0.81    -0.03   0.36
## MomTot          -13.0     24   37.0 -0.69    -0.03   0.57
## RemPos            9.0     28   19.0 -0.87     0.81   0.28
## RemNeg            4.0     21   17.0  0.83     0.33   0.28
## RemTot           -9.0     24   33.0 -0.82     0.50   0.51
## LifeSat           8.0     35   27.0 -0.53    -0.32   0.46
## Extravert         1.0      7    6.0 -0.36    -0.72   0.12
## Agreeable         2.5      7    4.5 -0.27    -0.63   0.08
## Conscient         1.0      7    6.0 -0.70     0.13   0.09
## EmotStab          1.5      7    5.5 -0.35    -0.73   0.10
## OpenExp           1.5      7    5.5 -0.91     0.62   0.08
## Health            2.0     85   83.0  0.60    -0.05   1.19
## Depression        0.0     39   39.0  1.14     0.66   0.62

In this dataset, I coded men as 0 and women as 1. The descriptive statistics table includes all scale and subscale scores and gives me the sample size, mean, standard deviation, median, trimmed mean (dropping very low and very high values), median absolute deviation, minimum and maximum values, range, skewness, kurtosis, and standard error. I’d need to run t-tests to find out whether any differences were significant, but this still gives me some idea of how men and women might differ on these measures.
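The t-tests mentioned above could be sketched roughly as follows, using base R only. The data here are simulated (group sizes and rough means and standard deviations are loosely borrowed from the table above), so the p-values say nothing about the real Facebook dataset. The choice of a Holm adjustment is my assumption about one reasonable way to account for running more than one planned test.

```r
set.seed(42)  # simulated data only; not the Facebook dataset

# Hypothetical scores for two groups coded 0 and 1
sim <- data.frame(
  gender     = rep(c(0, 1), times = c(73, 184)),
  Rumination = c(rnorm(73, mean = 37.7, sd = 14.3),
                 rnorm(184, mean = 38.1, sd = 15.3)),
  Depression = c(rnorm(73, mean = 10.3, sd = 7.3),
                 rnorm(184, mean = 12.2, sd = 8.5))
)

# Planned comparisons, chosen before looking at the data
planned <- c("Rumination", "Depression")

# Welch two-sample t-test (the t.test default) for each planned outcome
p_raw <- sapply(planned, function(v) {
  t.test(sim[[v]][sim$gender == 0], sim[[v]][sim$gender == 1])$p.value
})

# Holm adjustment for the number of tests actually run
p_holm <- p.adjust(p_raw, method = "holm")

round(cbind(p_raw, p_holm), 3)
```

Listing the outcomes in planned before testing keeps the comparison set fixed, which is the point of the earlier caution about fishing expeditions.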

There are certain measures I included that we might hypothesize would show gender differences. For instance, some research suggests gender differences for rumination and depression. In addition to running descriptives by group, I might also want to display these differences in a violin plot. The psych package can quickly generate such a plot by group.

violinBy(smallFB,"Rumination","gender",grp.name=c("M","F"))
violinBy(smallFB,"Depression","gender",grp.name=c("M","F"))

ggplot2 will also generate a violin plot by group, so this feature might not be as useful for final displays, but it could help in quickly visualizing the data during analysis. And you may find that you prefer the appearance of these plots. To each his own.
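For comparison, a ggplot2 version of a grouped violin plot might look something like the sketch below. It uses simulated Rumination scores rather than the real data, and the axis labels are my own choices.

```r
library(ggplot2)

set.seed(1)  # simulated scores; not the Facebook data
sim <- data.frame(
  gender     = factor(rep(c("M", "F"), times = c(73, 184)),
                      levels = c("M", "F")),
  Rumination = c(rnorm(73, 37.7, 14.3), rnorm(184, 38.1, 15.3))
)

# One violin per gender group
p <- ggplot(sim, aes(x = gender, y = Rumination)) +
  geom_violin() +
  labs(x = "Gender", y = "Rumination")

print(p)
```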

Another function is error.bars.by, which plots means and confidence intervals by group for multiple variables. Again, this is a way to get some quick visuals, though differences in scale among measures should be taken into consideration when generating this plot. One set of variables for which this display might be useful is the five subscales of the Five-Factor Personality Inventory. This 10-item measure assesses where participants fall on the so-called Big Five personality traits: Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism (reported in this dataset as Emotional Stability). These subscales are all on the same metric.

error.bars.by(smallFB[,c(20:24)],group=smallFB$gender,xlab="Big Five Personality Traits",ylab="Score on Subscale")
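To see the kind of numbers behind such a plot, here is a minimal base R sketch that computes a group mean and a t-based 95% confidence interval for one simulated subscale. This is the textbook interval and is not necessarily the exact computation psych performs; the helper mean_ci is hypothetical and the data are simulated.

```r
set.seed(7)  # simulated scores; not the real data
sim <- data.frame(
  gender    = rep(c(0, 1), times = c(73, 184)),
  Extravert = c(rnorm(73, 4.3, 1.6), rnorm(184, 4.7, 1.6))
)

# Mean and t-based confidence interval for one group's scores
mean_ci <- function(x, level = 0.95) {
  n    <- length(x)
  se   <- sd(x) / sqrt(n)                              # standard error of the mean
  half <- qt(1 - (1 - level) / 2, df = n - 1) * se     # half-width of the interval
  c(mean = mean(x), lower = mean(x) - half, upper = mean(x) + half)
}

# One row per gender group, columns: mean, lower, upper
ci_by_group <- t(sapply(split(sim$Extravert, sim$gender), mean_ci))
ci_by_group
```

The smaller group gets a wider interval for the same spread of scores, which is worth keeping in mind when comparing bars across groups of unequal size.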

Finally, we have the statsBy function, which gives descriptive statistics by group as well as between-group statistics. This function generates a lot of output, and you can read more about everything it gives you in the function's documentation (for example, via ?statsBy).

FBstats<-statsBy(smallFB[,2:26],"gender",cors=TRUE,method="pearson",use="pairwise")
print(FBstats,short=FALSE)

## Statistics within and between groups  
## Call: statsBy(data = smallFB[, 2:26], group = "gender", cors = TRUE, 
##     method = "pearson", use = "pairwise")
## Intraclass Correlation 1 (Percentage of variance due to groups) 
##     gender Rumination   DepRelat      Brood    Reflect   SavorPos 
##       1.00      -0.01      -0.01      -0.01      -0.01       0.03 
##   SavorNeg   SavorTot     AntPos     AntNeg     AntTot     MomPos 
##       0.03       0.04       0.05       0.02       0.05       0.00 
##     MomNeg     MomTot     RemPos     RemNeg     RemTot    LifeSat 
##       0.00       0.01       0.02       0.05       0.04       0.00 
##  Extravert  Agreeable  Conscient   EmotStab    OpenExp     Health 
##       0.01       0.05       0.00       0.03       0.03       0.01 
## Depression 
##       0.01 
## Intraclass Correlation 2 (Reliability of group differences) 
##     gender Rumination   DepRelat      Brood    Reflect   SavorPos 
##       1.00     -22.34      -2.06     -50.93      -2.21       0.77 
##   SavorNeg   SavorTot     AntPos     AntNeg     AntTot     MomPos 
##       0.80       0.83       0.86       0.75       0.86       0.19 
##     MomNeg     MomTot     RemPos     RemNeg     RemTot    LifeSat 
##       0.39       0.46       0.68       0.87       0.84      -0.04 
##  Extravert  Agreeable  Conscient   EmotStab    OpenExp     Health 
##       0.60       0.88       0.05       0.80       0.81       0.60 
## Depression 
##       0.66 
## eta^2 between groups  
## Rumination.bg   DepRelat.bg      Brood.bg    Reflect.bg   SavorPos.bg 
##          0.00          0.00          0.00          0.00          0.02 
##   SavorNeg.bg   SavorTot.bg     AntPos.bg     AntNeg.bg     AntTot.bg 
##          0.02          0.02          0.03          0.02          0.03 
##     MomPos.bg     MomNeg.bg     MomTot.bg     RemPos.bg     RemNeg.bg 
##          0.00          0.01          0.01          0.01          0.03 
##     RemTot.bg    LifeSat.bg  Extravert.bg  Agreeable.bg  Conscient.bg 
##          0.02          0.00          0.01          0.03          0.00 
##   EmotStab.bg    OpenExp.bg     Health.bg Depression.bg 
##          0.02          0.02          0.01          0.01 
## Correlation between groups 
##               Rmnt. DpRl. Brd.b Rflc. SvrP. SvrN. SvrT. AntP. AntN. AntT.
## Rumination.bg  1                                                         
## DepRelat.bg    1     1                                                   
## Brood.bg       1     1     1                                             
## Reflect.bg    -1    -1    -1     1                                       
## SavorPos.bg    1     1     1    -1     1                                 
## SavorNeg.bg   -1    -1    -1     1    -1     1                           
## SavorTot.bg    1     1     1    -1     1    -1     1                     
## AntPos.bg      1     1     1    -1     1    -1     1     1               
## AntNeg.bg     -1    -1    -1     1    -1     1    -1    -1     1         
## AntTot.bg      1     1     1    -1     1    -1     1     1    -1     1   
## MomPos.bg      1     1     1    -1     1    -1     1     1    -1     1   
## MomNeg.bg     -1    -1    -1     1    -1     1    -1    -1     1    -1   
## MomTot.bg      1     1     1    -1     1    -1     1     1    -1     1   
## RemPos.bg      1     1     1    -1     1    -1     1     1    -1     1   
## RemNeg.bg     -1    -1    -1     1    -1     1    -1    -1     1    -1   
## RemTot.bg      1     1     1    -1     1    -1     1     1    -1     1   
## LifeSat.bg    -1    -1    -1     1    -1     1    -1    -1     1    -1   
## Extravert.bg   1     1     1    -1     1    -1     1     1    -1     1   
## Agreeable.bg   1     1     1    -1     1    -1     1     1    -1     1   
## Conscient.bg   1     1     1    -1     1    -1     1     1    -1     1   
## EmotStab.bg   -1    -1    -1     1    -1     1    -1    -1     1    -1   
## OpenExp.bg     1     1     1    -1     1    -1     1     1    -1     1   
## Health.bg      1     1     1    -1     1    -1     1     1    -1     1   
## Depression.bg  1     1     1    -1     1    -1     1     1    -1     1   
##               MmPs. MmNg. MmTt. RmPs. RmNg. RmTt. LfSt. Extr. Agrb. Cnsc.
## MomPos.bg      1                                                         
## MomNeg.bg     -1     1                                                   
## MomTot.bg      1    -1     1                                             
## RemPos.bg      1    -1     1     1                                       
## RemNeg.bg     -1     1    -1    -1     1                                 
## RemTot.bg      1    -1     1     1    -1     1                           
## LifeSat.bg    -1     1    -1    -1     1    -1     1                     
## Extravert.bg   1    -1     1     1    -1     1    -1     1               
## Agreeable.bg   1    -1     1     1    -1     1    -1     1     1         
## Conscient.bg   1    -1     1     1    -1     1    -1     1     1     1   
## EmotStab.bg   -1     1    -1    -1     1    -1     1    -1    -1    -1   
## OpenExp.bg     1    -1     1     1    -1     1    -1     1     1     1   
## Health.bg      1    -1     1     1    -1     1    -1     1     1     1   
## Depression.bg  1    -1     1     1    -1     1    -1     1     1     1   
##               EmtS. OpnE. Hlth. Dprs.
## EmotStab.bg    1                     
## OpenExp.bg    -1     1               
## Health.bg     -1     1     1         
## Depression.bg -1     1     1     1   
## Correlation within groups 
##               Rmnt. DpRl. Brd.w Rflc. SvrP. SvrN. SvrT. AntP. AntN. AntT.
## Rumination.wg  1.00                                                      
## DepRelat.wg    0.95  1.00                                                
## Brood.wg       0.88  0.78  1.00                                          
## Reflect.wg     0.80  0.63  0.59  1.00                                    
## SavorPos.wg   -0.20 -0.20 -0.18 -0.15  1.00                              
## SavorNeg.wg    0.43  0.43  0.36  0.30 -0.64  1.00                        
## SavorTot.wg   -0.36 -0.36 -0.31 -0.25  0.89 -0.92  1.00                  
## AntPos.wg     -0.06 -0.05 -0.08 -0.03  0.86 -0.49  0.73  1.00            
## AntNeg.wg      0.32  0.32  0.28  0.21 -0.54  0.89 -0.80 -0.50  1.00      
## AntTot.wg     -0.23 -0.23 -0.21 -0.15  0.78 -0.82  0.89  0.83 -0.89  1.00
## MomPos.wg     -0.26 -0.26 -0.22 -0.19  0.86 -0.60  0.80  0.60 -0.47  0.61
## MomNeg.wg      0.46  0.46  0.39  0.35 -0.51  0.88 -0.78 -0.33  0.66 -0.59
## MomTot.wg     -0.42 -0.42 -0.36 -0.32  0.75 -0.85  0.89  0.51 -0.65  0.68
## RemPos.wg     -0.20 -0.19 -0.17 -0.15  0.89 -0.56  0.79  0.66 -0.44  0.62
## RemNeg.wg      0.34  0.35  0.28  0.23 -0.65  0.87 -0.85 -0.49  0.69 -0.69
## RemTot.wg     -0.29 -0.30 -0.25 -0.21  0.85 -0.79  0.90  0.63 -0.62  0.72
## LifeSat.wg    -0.47 -0.47 -0.43 -0.31  0.54 -0.50  0.57  0.39 -0.33  0.41
## Extravert.wg  -0.20 -0.19 -0.11 -0.20  0.34 -0.35  0.38  0.21 -0.29  0.29
## Agreeable.wg  -0.18 -0.18 -0.20 -0.10  0.35 -0.45  0.45  0.28 -0.39  0.39
## Conscient.wg  -0.25 -0.30 -0.20 -0.10  0.24 -0.21  0.25  0.16 -0.14  0.17
## EmotStab.wg   -0.48 -0.44 -0.49 -0.34  0.34 -0.44  0.43  0.20 -0.33  0.32
## OpenExp.wg    -0.16 -0.14 -0.21 -0.10  0.37 -0.31  0.37  0.27 -0.27  0.31
## Health.wg      0.44  0.47  0.36  0.29 -0.30  0.34 -0.35 -0.21  0.26 -0.27
## Depression.wg  0.57  0.58  0.49  0.38 -0.44  0.55 -0.55 -0.27  0.39 -0.39
##               MmPs. MmNg. MmTt. RmPs. RmNg. RmTt. LfSt. Extr. Agrb. Cnsc.
## MomPos.wg      1.00                                                      
## MomNeg.wg     -0.56  1.00                                                
## MomTot.wg      0.86 -0.91  1.00                                          
## RemPos.wg      0.65 -0.42  0.59  1.00                                    
## RemNeg.wg     -0.55  0.63 -0.67 -0.65  1.00                              
## RemTot.wg      0.66 -0.58  0.69  0.91 -0.91  1.00                        
## LifeSat.wg     0.55 -0.55  0.62  0.48 -0.42  0.49  1.00                  
## Extravert.wg   0.39 -0.37  0.43  0.28 -0.25  0.29  0.27  1.00            
## Agreeable.wg   0.33 -0.43  0.43  0.31 -0.36  0.37  0.25  0.12  1.00      
## Conscient.wg   0.25 -0.16  0.22  0.23 -0.26  0.26  0.33  0.03  0.29  1.00
## EmotStab.wg    0.40 -0.50  0.51  0.27 -0.32  0.32  0.44  0.12  0.41  0.27
## OpenExp.wg     0.39 -0.26  0.36  0.30 -0.28  0.32  0.34  0.29  0.36  0.14
## Health.wg     -0.30  0.33 -0.36 -0.27  0.29 -0.31 -0.42 -0.10 -0.25 -0.24
## Depression.wg -0.45  0.56 -0.58 -0.41  0.49 -0.50 -0.65 -0.24 -0.29 -0.26
##               EmtS. OpnE. Hlth. Dprs.
## EmotStab.wg    1.00                  
## OpenExp.wg     0.24  1.00            
## Health.wg     -0.31 -0.18  1.00      
## Depression.wg -0.54 -0.28  0.56  1.00
## 
## Many results are not shown directly. To see specific objects select from the following list:
##  mean sd n F ICC1 ICC2 ci1 ci2 r within pooled sd.r raw rbg pbg rwg nw pwg etabg etawg nwg nG Call

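The variance explained by gender is quite small for all of the variables: eta-squared is 0.03 or less in every case. Note also that the between-group correlations are all 1 or -1. With only two groups, each variable contributes just two group means, so any pair of variables is perfectly correlated at the group level. The within-group relationships between the variables are more meaningful.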
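To make "variance explained by gender" concrete, the sketch below computes eta-squared by hand as the between-group sum of squares divided by the total sum of squares. It uses a standard definition on simulated data and is not necessarily how psych calculates its values internally.

```r
set.seed(3)  # simulated data; not the Facebook dataset
gender <- rep(c(0, 1), times = c(73, 184))
score  <- c(rnorm(73, 10.3, 7.3), rnorm(184, 12.2, 8.5))

grand_mean  <- mean(score)
group_means <- tapply(score, gender, mean)
group_n     <- tapply(score, gender, length)

# Between-group and total sums of squares
ss_between <- sum(group_n * (group_means - grand_mean)^2)
ss_total   <- sum((score - grand_mean)^2)

eta_sq <- ss_between / ss_total
eta_sq

# With a single two-level grouping variable, this matches R^2
# from regressing the score on the group code
r_sq <- summary(lm(score ~ gender))$r.squared
all.equal(eta_sq, r_sq)
```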

A to Z is almost done! Just Y and Z, plus look for an A-to-Z-influenced Statistics Sunday post!
