Reveal the stories behind those Likert-type data

[This article was first published on R-Blog on Data modelling to develop ..., and kindly contributed to R-bloggers.]

Introduction

This blog is about two new functions, Model_factors and garrett_ranking, that have been added to the Dyn4cast package. The two functions provide a means of gaining deeper insight into the meaning behind Likert-type variables collected from respondents. Garrett ranking ranks the observations of the variables according to the level of seriousness that respondents attach to them. Model_factors, on the other hand, determines and retrieves the latent factors inherent in such data, which then become continuous data. The factors, or the data frame retrieved from the variables, can be used in other analyses such as regression and machine learning.

The two functions are part of factor analysis, essentially exploratory factor analysis (EFA), which is used to unravel the underlying structure of the observed variables. The analysis also helps to reduce that complex structure by determining a smaller number of latent factors that sufficiently represent the variation in the observed variables. EFA assumes no prior knowledge or hypothesis about the number or nature of the factors. These are great tools to help tell the story behind your data. The data used for Model_factors is prepared using the fa.parallel and fa functions in the psych package. The interesting thing about these functions is their simplicity, and we still maintain the one-line-code technique.

The basic usage of the functions is:

garrett_ranking(data, num_rank, ranking = NULL, m_rank = c(2:15))

data: The data for the Garrett ranking.

num_rank: Number of ranks applied to the data. If the data is five-point Likert-type data, then the number of ranks is 5.

ranking: A vector listing the ranks applied to the data. If not supplied, positional ranks are applied.

m_rank: Scope of ranking (2-15).

Model_factors(data = dat, DATA = Data)

data: R object obtained from EFA using the fa function in the psych package.

DATA: data.frame of the raw data used to obtain the data object.

Let us go!

Load library

library(Dyn4cast)

Garrett Ranking

ranking is supplied

garrett_data <- data.frame(garrett_data)
ranking <- c(
"Serious constraint", "Constraint",
"Not certain it is a constraint", "Not a constraint",
"Not a serious constraint"
)
garrett_ranking(garrett_data, 5, ranking)
$`Data mean table`
S/No Description Mean Remark Rank
1 1 S1 14.758621 Above 1
2 2 S2 8.172414 Above 2
3 7 S7 7.034483 Above 3
4 13 S13 7.034483 Above 4
5 3 S3 5.965517 Above 5
6 9 S9 4.517241 Above 6
7 15 S15 4.517241 Above 7
8 6 S6 3.965517 Above 8
9 12 S12 3.965517 Above 9
10 5 S5 3.413793 Above 10
11 11 S11 3.413793 Above 11
12 4 S4 3.310345 Above 12
13 10 S10 3.310345 Above 13
14 8 S8 1.862069 Below 14
15 14 S14 1.862069 Below 15
$`Garrett value`
# A tibble: 5 × 4
Number `Garrett point` `Garrett index` `Garrett value`
<dbl> <dbl> <dbl> <dbl>
1 1 3.33 15 85
2 2 10 25 75
3 3 16.7 31 69
4 4 23.3 36 64
5 5 30 40 60
$`Garrett ranked data`
S/No Description Serious constraint Constraint
1 2 S2 5 3
2 9 S9 7 6
3 15 S15 7 6
4 5 S5 10 2
5 11 S11 10 2
6 4 S4 4 4
7 10 S10 4 4
8 3 S3 1 2
9 1 S1 0 0
10 6 S6 0 4
11 12 S12 0 4
12 7 S7 0 2
13 13 S13 0 2
14 8 S8 0 0
15 14 S14 0 0
Not certain it is a constraint Not a constraint Not a serious constraint
1 2 2 1
2 0 5 1
3 0 5 1
4 8 5 0
5 8 5 0
6 6 7 3
7 6 7 3
8 5 5 1
9 2 1 0
10 6 5 6
11 6 5 6
12 0 2 2
13 0 2 2
14 5 2 17
15 5 2 17
Total Total Garrett Score Mean score Rank
1 13 976 75.07692 1
2 19 1425 75.00000 2
3 19 1425 75.00000 3
4 25 1872 74.88000 4
5 25 1872 74.88000 5
6 24 1682 70.08333 6
7 24 1682 70.08333 7
8 14 960 68.57143 8
9 3 202 67.33333 9
10 21 1394 66.38095 10
11 21 1394 66.38095 11
12 6 398 66.33333 12
13 6 398 66.33333 13
14 24 1493 62.20833 14
15 24 1493 62.20833 15
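
The arithmetic behind the printed tables can be checked by hand. The sketch below is not the package's internal code; it only reproduces the figures shown above for item S2 in base R. It assumes that the Garrett point is the usual percent position 100 * (R - 0.5) / N with N = 15, an assumption inferred from the printed output (3.33, 10, 16.7, ...) rather than taken from the Dyn4cast documentation. The Garrett values themselves are copied from the table, where each Garrett index equals 100 minus the Garrett value.

```r
# Garrett values copied from the `Garrett value` table above (ranks 1 to 5)
garrett_value <- c(85, 75, 69, 64, 60)

# Percent positions; the divisor 15 is an assumption inferred from the output
n_items <- 15
percent_position <- 100 * (seq_along(garrett_value) - 0.5) / n_items
round(percent_position, 2)  # 3.33 10.00 16.67 23.33 30.00

# Frequencies for item S2 across the five ranks, from `Garrett ranked data`
s2_counts <- c(5, 3, 2, 2, 1)

# Total Garrett score: frequencies weighted by the Garrett values
total_score <- sum(s2_counts * garrett_value)  # 976

# Mean score: total Garrett score divided by the number of responses (13)
mean_score <- total_score / sum(s2_counts)     # 75.07692

c(total = total_score, mean = mean_score)
```

The same weighting applied to S9 (7, 6, 0, 5, 1) gives 1425 / 19 = 75, which matches the second row of the ranked table.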

ranking not supplied

garrett_ranking(garrett_data, 5)
$`Data mean table`
S/No Description Mean Remark Rank
1 1 S1 14.758621 Above 1
2 2 S2 8.172414 Above 2
3 7 S7 7.034483 Above 3
4 13 S13 7.034483 Above 4
5 3 S3 5.965517 Above 5
6 9 S9 4.517241 Above 6
7 15 S15 4.517241 Above 7
8 6 S6 3.965517 Above 8
9 12 S12 3.965517 Above 9
10 5 S5 3.413793 Above 10
11 11 S11 3.413793 Above 11
12 4 S4 3.310345 Above 12
13 10 S10 3.310345 Above 13
14 8 S8 1.862069 Below 14
15 14 S14 1.862069 Below 15
$`Garrett value`
# A tibble: 5 × 4
Number `Garrett point` `Garrett index` `Garrett value`
<dbl> <dbl> <dbl> <dbl>
1 1 3.33 15 85
2 2 10 25 75
3 3 16.7 31 69
4 4 23.3 36 64
5 5 30 40 60
$`Garrett ranked data`
S/No Description 1st Rank 2nd Rank 3rd Rank 4th Rank 5th Rank Total
1 2 S2 5 3 2 2 1 13
2 9 S9 7 6 0 5 1 19
3 15 S15 7 6 0 5 1 19
4 5 S5 10 2 8 5 0 25
5 11 S11 10 2 8 5 0 25
6 4 S4 4 4 6 7 3 24
7 10 S10 4 4 6 7 3 24
8 3 S3 1 2 5 5 1 14
9 1 S1 0 0 2 1 0 3
10 6 S6 0 4 6 5 6 21
11 12 S12 0 4 6 5 6 21
12 7 S7 0 2 0 2 2 6
13 13 S13 0 2 0 2 2 6
14 8 S8 0 0 5 2 17 24
15 14 S14 0 0 5 2 17 24
Total Garrett Score Mean score Rank
1 976 75.07692 1
2 1425 75.00000 2
3 1425 75.00000 3
4 1872 74.88000 4
5 1872 74.88000 5
6 1682 70.08333 6
7 1682 70.08333 7
8 960 68.57143 8
9 202 67.33333 9
10 1394 66.38095 10
11 1394 66.38095 11
12 398 66.33333 12
13 398 66.33333 13
14 1493 62.20833 14
15 1493 62.20833 15

You can rank a subset of the data

garrett_ranking(garrett_data, 8)
$`Data mean table`
S/No Description Mean Remark Rank
1 1 S1 14.758621 Above 1
2 2 S2 8.172414 Above 2
3 7 S7 7.034483 Above 3
4 13 S13 7.034483 Above 4
5 3 S3 5.965517 Above 5
6 9 S9 4.517241 Above 6
7 15 S15 4.517241 Above 7
8 6 S6 3.965517 Below 8
9 12 S12 3.965517 Below 9
10 5 S5 3.413793 Below 10
11 11 S11 3.413793 Below 11
12 4 S4 3.310345 Below 12
13 10 S10 3.310345 Below 13
14 8 S8 1.862069 Below 14
15 14 S14 1.862069 Below 15
$`Garrett value`
# A tibble: 8 × 4
Number `Garrett point` `Garrett index` `Garrett value`
<dbl> <dbl> <dbl> <dbl>
1 1 3.33 15 85
2 2 10 25 75
3 3 16.7 31 69
4 4 23.3 36 64
5 5 30 40 60
6 6 36.7 43 57
7 7 43.3 47 53
8 8 50 50 50
$`Garrett ranked data`
S/No Description 1st Rank 2nd Rank 3rd Rank 4th Rank 5th Rank 6th Rank
1 7 S7 4 2 2 0 2 0
2 13 S13 4 2 2 0 2 0
3 2 S2 2 0 2 5 3 2
4 9 S9 0 4 4 7 6 0
5 15 S15 0 4 4 7 6 0
6 3 S3 1 3 4 1 2 5
7 5 S5 0 1 0 10 2 8
8 11 S11 0 1 0 10 2 8
9 4 S4 0 1 3 4 4 6
10 10 S10 0 1 3 4 4 6
11 6 S6 0 1 1 0 4 6
12 12 S12 0 1 1 0 4 6
13 1 S1 0 0 0 0 0 2
14 8 S8 1 0 0 0 0 5
15 14 S14 1 0 0 0 0 5
7th Rank 8th Rank Total Total Garrett Score Mean score Rank
1 2 2 14 954 68.14286 1
2 2 2 14 954 68.14286 2
3 2 1 17 1078 63.41176 3
4 5 1 27 1699 62.92593 4
5 5 1 27 1699 62.92593 5
6 5 1 22 1370 62.27273 6
7 5 0 26 1556 59.84615 7
8 5 0 26 1556 59.84615 8
9 7 3 28 1641 58.60714 9
10 7 3 28 1641 58.60714 10
11 5 6 23 1291 56.13043 11
12 5 6 23 1291 56.13043 12
13 1 0 3 167 55.66667 13
14 2 17 25 1326 53.04000 14
15 2 17 25 1326 53.04000 15
garrett_ranking(garrett_data, 4)
$`Data mean table`
S/No Description Mean Remark Rank
1 1 S1 14.758621 Above 1
2 2 S2 8.172414 Above 2
3 7 S7 7.034483 Above 3
4 13 S13 7.034483 Above 4
5 3 S3 5.965517 Above 5
6 9 S9 4.517241 Above 6
7 15 S15 4.517241 Above 7
8 6 S6 3.965517 Above 8
9 12 S12 3.965517 Above 9
10 5 S5 3.413793 Above 10
11 11 S11 3.413793 Above 11
12 4 S4 3.310345 Above 12
13 10 S10 3.310345 Above 13
14 8 S8 1.862069 Below 14
15 14 S14 1.862069 Below 15
$`Garrett value`
# A tibble: 4 × 4
Number `Garrett point` `Garrett index` `Garrett value`
<dbl> <dbl> <dbl> <dbl>
1 1 3.33 15 85
2 2 10 25 75
3 3 16.7 31 69
4 4 23.3 36 64
$`Garrett ranked data`
S/No Description 1st Rank 2nd Rank 3rd Rank 4th Rank Total
1 9 S9 6 0 5 1 12
2 15 S15 6 0 5 1 12
3 2 S2 3 2 2 1 8
4 5 S5 2 8 5 0 15
5 11 S11 2 8 5 0 15
6 3 S3 2 5 5 1 13
7 4 S4 4 6 7 3 20
8 10 S10 4 6 7 3 20
9 1 S1 0 2 1 0 3
10 7 S7 2 0 2 2 6
11 13 S13 2 0 2 2 6
12 6 S6 4 6 5 6 21
13 12 S12 4 6 5 6 21
14 8 S8 0 5 2 17 24
15 14 S14 0 5 2 17 24
Total Garrett Score Mean score Rank
1 919 76.58333 1
2 919 76.58333 2
3 607 75.87500 3
4 1115 74.33333 4
5 1115 74.33333 5
6 954 73.38462 6
7 1465 73.25000 7
8 1465 73.25000 8
9 219 73.00000 9
10 436 72.66667 10
11 436 72.66667 11
12 1519 72.33333 12
13 1519 72.33333 13
14 1601 66.70833 14
15 1601 66.70833 15

Latent Variables Recovery

library(psych)
Data <- Quicksummary
GGn <- names(Data)
GG <- ncol(Data)
GGx <- c(paste0("x0", 1:9), paste("x", 10:ncol(Data), sep = ""))
names(Data) <- GGx
lll <- fa.parallel(Data, fm = "minres", fa = "fa")

Parallel analysis suggests that the number of factors = 5 and the number of components = NA 
dat <- fa(Data, nfactors = lll[["nfact"]], rotate = "varimax", fm = "minres")
DD <- Model_factors(data = dat, DATA = Data)

Loadings:
MR1 MR2 MR3 MR5 MR4
x11 0.513
x12 0.611
x13 0.559
x20 0.556
x24 0.617 0.527
x25 0.718
x26 0.595
x01 0.625
x02 0.783 0.541
x10 0.631
x28 -0.610
x04 0.740
x05 0.792
x06 0.720
x08 0.594 0.452
x17 0.667
x18 0.527
x19 0.592
x03 0.523
x07 0.417
x09 0.403
x14
x15 0.480
x16
x21 0.492
x22 0.481
x23 -0.440 0.499
x27 0.465
x29
MR1 MR2 MR3 MR5 MR4
SS loadings 3.854 2.895 2.786 2.441 2.203
Proportion Var 0.133 0.100 0.096 0.084 0.076
Cumulative Var 0.133 0.233 0.329 0.413 0.489
DD$Latent_1
 MR1 loading
1 x11 0.513
2 x12 0.611
3 x13 0.559
4 x20 0.556
6 x25 0.718
7 x26 0.595
8 x15 0.480
9 x21 0.492
DD$Latent_3
 MR3 loading
1 x04 0.740
2 x05 0.792
3 x06 0.720
DD$Latent_5
 MR5 loading
1 x17 0.667
2 x18 0.527
3 x19 0.592
4 x07 0.417
5 x09 0.403
DD$Latent_frame
# A tibble: 103 × 5
MR1 MR2 MR3 MR4 MR5
<dbl> <dbl> <dbl> <dbl> <dbl>
1 16.7 6.28 2.99 11.2 10.4
2 18.6 6.28 2.99 9.76 10.4
3 16.3 3.23 2.99 11.5 9.22
4 16.7 6.28 2.99 11.2 10.4
5 18.1 5.65 2.99 11.2 10.4
6 18.1 6.28 2.99 11.2 10.4
7 19.1 6.28 2.25 11.2 9.22
8 18.1 5.65 2.99 11.2 10.4
9 18.1 5.65 2.99 11.2 10.4
10 19.1 6.28 2.25 11.2 9.22
# ℹ 93 more rows
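
As a rough illustration of using the recovered factors in another analysis such as regression, the sketch below fits a linear model on a data frame shaped like DD$Latent_frame. Everything in it is assumed for illustration: the factor scores are simulated stand-ins rather than the Quicksummary results, and the outcome `y` is a hypothetical variable that does not appear in the post. In practice you would bind your own outcome to DD$Latent_frame instead.

```r
set.seed(42)
n <- 103  # same number of rows as the printed Latent_frame

# Simulated stand-in for DD$Latent_frame (values are NOT from the post)
latent <- data.frame(
  MR1 = rnorm(n, mean = 18, sd = 1),
  MR2 = rnorm(n, mean = 6,  sd = 1),
  MR3 = rnorm(n, mean = 3,  sd = 0.5),
  MR4 = rnorm(n, mean = 11, sd = 1),
  MR5 = rnorm(n, mean = 10, sd = 1)
)

# Hypothetical outcome variable, invented purely for illustration
latent$y <- 2 + 0.5 * latent$MR1 - 0.3 * latent$MR3 + rnorm(n)

# Use the continuous latent factors as ordinary regression predictors
fit <- lm(y ~ MR1 + MR2 + MR3 + MR4 + MR5, data = latent)
round(coef(summary(fit)), 3)
```

Because the latent factors are continuous, they can enter the model like any other numeric predictor, which is the point made in the introduction.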

Welcome to the world of Data Science and easy Machine Learning!
