Reveal the stories behind those Likert-type data
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Introduction
This post is about two new functions, Model_factors and garrett_ranking, that have been added to the Dyn4cast package. The two functions provide a means of gaining deeper insight into the meaning behind Likert-type variables collected from respondents. Garrett ranking ranks the variables according to the level of seriousness the respondents attach to them. Model_factors, on the other hand, determines and retrieves the latent factors inherent in such data, which then become continuous data. The factors, or the data frame retrieved from the variables, can be used in other analyses such as regression and machine learning.
The two functions are part of factor analysis, essentially exploratory factor analysis (EFA), which is used to unravel the underlying structure of the observed variables. The analysis also reduces the complex structure by determining a smaller number of latent factors that sufficiently represent the variation in the observed variables. With EFA, no prior knowledge or hypothesis about the number or nature of the factors is assumed. These are great tools to help tell the story behind your data. The data used for Model_factors is prepared using the fa.parallel and fa functions in the psych package. The interesting thing about these functions is their simplicity, and we still maintain the one-line code technique.
The basic usage of the functions is:
garrett_ranking(data, num_rank, ranking = NULL, m_rank = c(2:15))
data: The data for the Garrett ranking.
num_rank: The number of ranks applied to the data. For five-point Likert-type data, the number of ranks is 5.
ranking: A vector or list of the ranks applied to the data. If not supplied, positional ranks (1st Rank, 2nd Rank, and so on) are applied.
m_rank: The scope of ranking (2-15).
Model_factors(data = dat, DATA = Data)
data: An R object obtained from EFA using the fa function in the psych package.
DATA: A data.frame of the raw data used to obtain the data object.
Let us go!
Load library
library(Dyn4cast)
Garrett Ranking
ranking is supplied
garrett_data <- data.frame(garrett_data)
ranking <- c(
  "Serious constraint", "Constraint", "Not certain it is a constraint",
  "Not a constraint", "Not a serious constraint"
)
garrett_ranking(garrett_data, 5, ranking)

$`Data mean table`
   S/No Description      Mean Remark Rank
1     1          S1 14.758621  Above    1
2     2          S2  8.172414  Above    2
3     7          S7  7.034483  Above    3
4    13         S13  7.034483  Above    4
5     3          S3  5.965517  Above    5
6     9          S9  4.517241  Above    6
7    15         S15  4.517241  Above    7
8     6          S6  3.965517  Above    8
9    12         S12  3.965517  Above    9
10    5          S5  3.413793  Above   10
11   11         S11  3.413793  Above   11
12    4          S4  3.310345  Above   12
13   10         S10  3.310345  Above   13
14    8          S8  1.862069  Below   14
15   14         S14  1.862069  Below   15

$`Garrett value`
# A tibble: 5 × 4
  Number `Garrett point` `Garrett index` `Garrett value`
   <dbl>           <dbl>           <dbl>           <dbl>
1      1            3.33              15              85
2      2           10                 25              75
3      3           16.7               31              69
4      4           23.3               36              64
5      5           30                 40              60

$`Garrett ranked data`
   S/No Description Serious constraint Constraint
1     2          S2                  5          3
2     9          S9                  7          6
3    15         S15                  7          6
4     5          S5                 10          2
5    11         S11                 10          2
6     4          S4                  4          4
7    10         S10                  4          4
8     3          S3                  1          2
9     1          S1                  0          0
10    6          S6                  0          4
11   12         S12                  0          4
12    7          S7                  0          2
13   13         S13                  0          2
14    8          S8                  0          0
15   14         S14                  0          0
   Not certain it is a constraint Not a constraint Not a serious constraint
1                               2                2                        1
2                               0                5                        1
3                               0                5                        1
4                               8                5                        0
5                               8                5                        0
6                               6                7                        3
7                               6                7                        3
8                               5                5                        1
9                               2                1                        0
10                              6                5                        6
11                              6                5                        6
12                              0                2                        2
13                              0                2                        2
14                              5                2                       17
15                              5                2                       17
   Total Total Garrett Score Mean score Rank
1     13                 976   75.07692    1
2     19                1425   75.00000    2
3     19                1425   75.00000    3
4     25                1872   74.88000    4
5     25                1872   74.88000    5
6     24                1682   70.08333    6
7     24                1682   70.08333    7
8     14                 960   68.57143    8
9      3                 202   67.33333    9
10    21                1394   66.38095   10
11    21                1394   66.38095   11
12     6                 398   66.33333   12
13     6                 398   66.33333   13
14    24                1493   62.20833   14
15    24                1493   62.20833   15
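To see where these numbers come from, here is a minimal sketch in base R that reproduces the `Garrett point` column and the first row (S2) of the ranked table. It assumes the usual Garrett percent-position formula, 100 * (R - 0.5) / N, with N = 15. That denominator matches the printed output, but whether the package takes it from the 15 statements or from the upper bound of m_rank is our assumption, not something checked against the package source. The Garrett values are copied from the output above rather than computed.

```r
# Sketch only: reproduces printed numbers, not the package's internal code.
n_items <- 15   # assumption: denominator N in the percent-position formula
ranks <- 1:5    # five-point Likert-type scale

# Garrett percent position: 100 * (R - 0.5) / N
garrett_point <- 100 * (ranks - 0.5) / n_items

# Garrett values copied from the `Garrett value` table above
garrett_value <- c(85, 75, 69, 64, 60)

# Rank frequencies for statement S2, copied from the ranked table above
s2_freq <- c(5, 3, 2, 2, 1)

# Total Garrett score = sum of frequency x Garrett value
total_score <- sum(s2_freq * garrett_value)

# Mean score = total Garrett score / number of responses
mean_score <- total_score / sum(s2_freq)

print(round(garrett_point, 2))
print(total_score)
print(round(mean_score, 5))
```

The total and mean agree with the S2 row in the output (976 and 75.07692), which suggests that the ranking is driven by the mean score.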
ranking not supplied
garrett_ranking(garrett_data, 5)

$`Data mean table`
   S/No Description      Mean Remark Rank
1     1          S1 14.758621  Above    1
2     2          S2  8.172414  Above    2
3     7          S7  7.034483  Above    3
4    13         S13  7.034483  Above    4
5     3          S3  5.965517  Above    5
6     9          S9  4.517241  Above    6
7    15         S15  4.517241  Above    7
8     6          S6  3.965517  Above    8
9    12         S12  3.965517  Above    9
10    5          S5  3.413793  Above   10
11   11         S11  3.413793  Above   11
12    4          S4  3.310345  Above   12
13   10         S10  3.310345  Above   13
14    8          S8  1.862069  Below   14
15   14         S14  1.862069  Below   15

$`Garrett value`
# A tibble: 5 × 4
  Number `Garrett point` `Garrett index` `Garrett value`
   <dbl>           <dbl>           <dbl>           <dbl>
1      1            3.33              15              85
2      2           10                 25              75
3      3           16.7               31              69
4      4           23.3               36              64
5      5           30                 40              60

$`Garrett ranked data`
   S/No Description 1st Rank 2nd Rank 3rd Rank 4th Rank 5th Rank Total
1     2          S2        5        3        2        2        1    13
2     9          S9        7        6        0        5        1    19
3    15         S15        7        6        0        5        1    19
4     5          S5       10        2        8        5        0    25
5    11         S11       10        2        8        5        0    25
6     4          S4        4        4        6        7        3    24
7    10         S10        4        4        6        7        3    24
8     3          S3        1        2        5        5        1    14
9     1          S1        0        0        2        1        0     3
10    6          S6        0        4        6        5        6    21
11   12         S12        0        4        6        5        6    21
12    7          S7        0        2        0        2        2     6
13   13         S13        0        2        0        2        2     6
14    8          S8        0        0        5        2       17    24
15   14         S14        0        0        5        2       17    24
   Total Garrett Score Mean score Rank
1                  976   75.07692    1
2                 1425   75.00000    2
3                 1425   75.00000    3
4                 1872   74.88000    4
5                 1872   74.88000    5
6                 1682   70.08333    6
7                 1682   70.08333    7
8                  960   68.57143    8
9                  202   67.33333    9
10                1394   66.38095   10
11                1394   66.38095   11
12                 398   66.33333   12
13                 398   66.33333   13
14                1493   62.20833   14
15                1493   62.20833   15
you can rank a subset of the data
garrett_ranking(garrett_data, 8)

$`Data mean table`
   S/No Description      Mean Remark Rank
1     1          S1 14.758621  Above    1
2     2          S2  8.172414  Above    2
3     7          S7  7.034483  Above    3
4    13         S13  7.034483  Above    4
5     3          S3  5.965517  Above    5
6     9          S9  4.517241  Above    6
7    15         S15  4.517241  Above    7
8     6          S6  3.965517  Below    8
9    12         S12  3.965517  Below    9
10    5          S5  3.413793  Below   10
11   11         S11  3.413793  Below   11
12    4          S4  3.310345  Below   12
13   10         S10  3.310345  Below   13
14    8          S8  1.862069  Below   14
15   14         S14  1.862069  Below   15

$`Garrett value`
# A tibble: 8 × 4
  Number `Garrett point` `Garrett index` `Garrett value`
   <dbl>           <dbl>           <dbl>           <dbl>
1      1            3.33              15              85
2      2           10                 25              75
3      3           16.7               31              69
4      4           23.3               36              64
5      5           30                 40              60
6      6           36.7               43              57
7      7           43.3               47              53
8      8           50                 50              50

$`Garrett ranked data`
   S/No Description 1st Rank 2nd Rank 3rd Rank 4th Rank 5th Rank 6th Rank
1     7          S7        4        2        2        0        2        0
2    13         S13        4        2        2        0        2        0
3     2          S2        2        0        2        5        3        2
4     9          S9        0        4        4        7        6        0
5    15         S15        0        4        4        7        6        0
6     3          S3        1        3        4        1        2        5
7     5          S5        0        1        0       10        2        8
8    11         S11        0        1        0       10        2        8
9     4          S4        0        1        3        4        4        6
10   10         S10        0        1        3        4        4        6
11    6          S6        0        1        1        0        4        6
12   12         S12        0        1        1        0        4        6
13    1          S1        0        0        0        0        0        2
14    8          S8        1        0        0        0        0        5
15   14         S14        1        0        0        0        0        5
   7th Rank 8th Rank Total Total Garrett Score Mean score Rank
1         2        2    14                 954   68.14286    1
2         2        2    14                 954   68.14286    2
3         2        1    17                1078   63.41176    3
4         5        1    27                1699   62.92593    4
5         5        1    27                1699   62.92593    5
6         5        1    22                1370   62.27273    6
7         5        0    26                1556   59.84615    7
8         5        0    26                1556   59.84615    8
9         7        3    28                1641   58.60714    9
10        7        3    28                1641   58.60714   10
11        5        6    23                1291   56.13043   11
12        5        6    23                1291   56.13043   12
13        1        0     3                 167   55.66667   13
14        2       17    25                1326   53.04000   14
15        2       17    25                1326   53.04000   15

garrett_ranking(garrett_data, 4)

$`Data mean table`
   S/No Description      Mean Remark Rank
1     1          S1 14.758621  Above    1
2     2          S2  8.172414  Above    2
3     7          S7  7.034483  Above    3
4    13         S13  7.034483  Above    4
5     3          S3  5.965517  Above    5
6     9          S9  4.517241  Above    6
7    15         S15  4.517241  Above    7
8     6          S6  3.965517  Above    8
9    12         S12  3.965517  Above    9
10    5          S5  3.413793  Above   10
11   11         S11  3.413793  Above   11
12    4          S4  3.310345  Above   12
13   10         S10  3.310345  Above   13
14    8          S8  1.862069  Below   14
15   14         S14  1.862069  Below   15

$`Garrett value`
# A tibble: 4 × 4
  Number `Garrett point` `Garrett index` `Garrett value`
   <dbl>           <dbl>           <dbl>           <dbl>
1      1            3.33              15              85
2      2           10                 25              75
3      3           16.7               31              69
4      4           23.3               36              64

$`Garrett ranked data`
   S/No Description 1st Rank 2nd Rank 3rd Rank 4th Rank Total
1     9          S9        6        0        5        1    12
2    15         S15        6        0        5        1    12
3     2          S2        3        2        2        1     8
4     5          S5        2        8        5        0    15
5    11         S11        2        8        5        0    15
6     3          S3        2        5        5        1    13
7     4          S4        4        6        7        3    20
8    10         S10        4        6        7        3    20
9     1          S1        0        2        1        0     3
10    7          S7        2        0        2        2     6
11   13         S13        2        0        2        2     6
12    6          S6        4        6        5        6    21
13   12         S12        4        6        5        6    21
14    8          S8        0        5        2       17    24
15   14         S14        0        5        2       17    24
   Total Garrett Score Mean score Rank
1                  919   76.58333    1
2                  919   76.58333    2
3                  607   75.87500    3
4                 1115   74.33333    4
5                 1115   74.33333    5
6                  954   73.38462    6
7                 1465   73.25000    7
8                 1465   73.25000    8
9                  219   73.00000    9
10                 436   72.66667   10
11                 436   72.66667   11
12                1519   72.33333   12
13                1519   72.33333   13
14                1601   66.70833   14
15                1601   66.70833   15
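The scoring step generalises to a whole table of rank frequencies. The sketch below uses a hypothetical helper, score_garrett, which is not part of Dyn4cast; it is only our reading of how the total score, mean score and rank in the output relate to one another. The three frequency rows (S1, S2, S9) and the Garrett values are copied from the four-rank output above.

```r
# Hypothetical helper (not a Dyn4cast function): score and rank a matrix of
# rank frequencies, one row per statement and one column per rank.
score_garrett <- function(freq, garrett_value) {
  freq <- as.matrix(freq)
  stopifnot(ncol(freq) == length(garrett_value))
  total <- rowSums(freq)                      # number of responses per statement
  score <- as.vector(freq %*% garrett_value)  # total Garrett score
  out <- data.frame(
    Description = rownames(freq),
    Total = total,
    Score = score,
    Mean = score / total,
    stringsAsFactors = FALSE
  )
  out <- out[order(-out$Mean), ]              # highest mean score first
  out$Rank <- seq_len(nrow(out))
  rownames(out) <- NULL
  out
}

# Frequencies copied from the four-rank output above
freq <- rbind(
  S1 = c(0, 2, 1, 0),
  S2 = c(3, 2, 2, 1),
  S9 = c(6, 0, 5, 1)
)

res <- score_garrett(freq, garrett_value = c(85, 75, 69, 64))
print(res)
```

The scores and means for S9, S2 and S1 match the printed table (919, 607 and 219). Ties are left in row order here, which is a simplification on our part.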
Latent Variables Recovery
library(psych)
Data <- Quicksummary
GGn <- names(Data)
GG <- ncol(Data)
GGx <- c(paste0("x0", 1:9), paste("x", 10:ncol(Data), sep = ""))
names(Data) <- GGx
lll <- fa.parallel(Data, fm = "minres", fa = "fa")
Parallel analysis suggests that the number of factors = 5 and the number of components = NA

dat <- fa(Data, nfactors = lll[["nfact"]], rotate = "varimax", fm = "minres")
DD <- Model_factors(data = dat, DATA = Data)

Loadings:
MR1 MR2 MR3 MR5 MR4
x11 0.513
x12 0.611
x13 0.559
x20 0.556
x24 0.617 0.527
x25 0.718
x26 0.595
x01 0.625
x02 0.783 0.541
x10 0.631
x28 -0.610
x04 0.740
x05 0.792
x06 0.720
x08 0.594 0.452
x17 0.667
x18 0.527
x19 0.592
x03 0.523
x07 0.417
x09 0.403
x14
x15 0.480
x16
x21 0.492
x22 0.481
x23 -0.440 0.499
x27 0.465
x29

                 MR1   MR2   MR3   MR5   MR4
SS loadings    3.854 2.895 2.786 2.441 2.203
Proportion Var 0.133 0.100 0.096 0.084 0.076
Cumulative Var 0.133 0.233 0.329 0.413 0.489

DD$Latent_1
  MR1 loading
1 x11   0.513
2 x12   0.611
3 x13   0.559
4 x20   0.556
6 x25   0.718
7 x26   0.595
8 x15   0.480
9 x21   0.492

DD$Latent_3
  MR3 loading
1 x04   0.740
2 x05   0.792
3 x06   0.720

DD$Latent_5
  MR5 loading
1 x17   0.667
2 x18   0.527
3 x19   0.592
4 x07   0.417
5 x09   0.403

DD$Latent_frame
# A tibble: 103 × 5
     MR1   MR2   MR3   MR4   MR5
   <dbl> <dbl> <dbl> <dbl> <dbl>
 1  16.7  6.28  2.99 11.2  10.4 
 2  18.6  6.28  2.99  9.76 10.4 
 3  16.3  3.23  2.99 11.5   9.22
 4  16.7  6.28  2.99 11.2  10.4 
 5  18.1  5.65  2.99 11.2  10.4 
 6  18.1  6.28  2.99 11.2  10.4 
 7  19.1  6.28  2.25 11.2   9.22
 8  18.1  5.65  2.99 11.2  10.4 
 9  18.1  5.65  2.99 11.2  10.4 
10  19.1  6.28  2.25 11.2   9.22
# ℹ 93 more rows
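Since the recovered factors are continuous, they can go straight into a model. Here is a rough sketch of that idea using only base R. The ten rows are copied (rounded, as printed) from DD$Latent_frame above; the outcome y is simulated and purely hypothetical, because the post's data does not include a response variable. In a real analysis you would bind your own outcome to the full DD$Latent_frame instead.

```r
# First ten rows of DD$Latent_frame, copied from the printed output above
latent <- data.frame(
  MR1 = c(16.7, 18.6, 16.3, 16.7, 18.1, 18.1, 19.1, 18.1, 18.1, 19.1),
  MR2 = c(6.28, 6.28, 3.23, 6.28, 5.65, 6.28, 6.28, 5.65, 5.65, 6.28),
  MR3 = c(2.99, 2.99, 2.99, 2.99, 2.99, 2.99, 2.25, 2.99, 2.99, 2.25),
  MR4 = c(11.2, 9.76, 11.5, 11.2, 11.2, 11.2, 11.2, 11.2, 11.2, 11.2),
  MR5 = c(10.4, 10.4, 9.22, 10.4, 10.4, 10.4, 9.22, 10.4, 10.4, 9.22)
)

# Hypothetical outcome: simulated noise, for illustration only
set.seed(1)
latent$y <- rnorm(nrow(latent))

# Ordinary least squares with two of the latent factors as predictors
fit <- lm(y ~ MR1 + MR2, data = latent)
print(coef(fit))
```

Because y is random noise, the coefficients carry no meaning; the point is only that the latent frame behaves like any other numeric data frame.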
Welcome to the world of Data Science and easy Machine Learning!