In a previous post, I analysed the feature importance for the percentage of engineers in Sweden who are women, and found that the size of the region is a significant feature. In this post, I will analyse the feature importance for the percentage of women in different occupational groups in Sweden, using an ensemble of linear models.
Statistics Sweden uses NUTS (Nomenclature des Unités Territoriales Statistiques), which is the EU’s hierarchical regional division, to specify the regions.
Please send suggestions for improvement of the analysis to ranalystatisticssweden@gmail.com.
First, define libraries and functions.

    library (tidyverse)
    ## -- Attaching packages -------------------------------------------------- tidyverse 1.3.0 --
    ## v ggplot2 3.3.0     v purrr   0.3.4
    ## v tibble  3.0.0     v dplyr   0.8.5
    ## v tidyr   1.0.2     v stringr 1.4.0
    ## v readr   1.3.1     v forcats 0.5.0
    ## -- Conflicts ----------------------------------------------------- tidyverse_conflicts() --
    ## x dplyr::filter() masks stats::filter()
    ## x dplyr::lag()    masks stats::lag()
    library (broom)
    library (car)
    ## Loading required package: carData
    ##
    ## Attaching package: 'car'
    ## The following object is masked from 'package:dplyr':
    ##
    ##     recode
    ## The following object is masked from 'package:purrr':
    ##
    ##     some
    library (caret)
    ## Loading required package: lattice
    ##
    ## Attaching package: 'caret'
    ## The following object is masked from 'package:purrr':
    ##
    ##     lift
    library (recipes)
    ##
    ## Attaching package: 'recipes'
    ## The following object is masked from 'package:stringr':
    ##
    ##     fixed
    ## The following object is masked from 'package:stats':
    ##
    ##     step
    library (PerformanceAnalytics)
    ## Loading required package: xts
    ## Loading required package: zoo
    ##
    ## Attaching package: 'zoo'
    ## The following objects are masked from 'package:base':
    ##
    ##     as.Date, as.Date.numeric
    ##
    ## Attaching package: 'xts'
    ## The following objects are masked from 'package:dplyr':
    ##
    ##     first, last
    ##
    ## Attaching package: 'PerformanceAnalytics'
    ## The following object is masked from 'package:graphics':
    ##
    ##     legend
    library (ggpubr)
    ## Loading required package: magrittr
    ##
    ## Attaching package: 'magrittr'
    ## The following object is masked from 'package:purrr':
    ##
    ##     set_names
    ## The following object is masked from 'package:tidyr':
    ##
    ##     extract
    library (ipred)
    library (iml)
    library (SuperLearner)
    ## Loading required package: nnls
    ## Super Learner
    ## Version: 2.0-26
    ## Package created on 2019-10-27
    library (scatterplot3d)

    readfile <- function (file1){
      read_csv (file1, col_types = cols(), locale = readr::locale (encoding = "latin1"), na = c("..", "NA")) %>%
        gather (starts_with("19"), starts_with("20"), key = "year", value = groupsize) %>%
        drop_na() %>%
        mutate (year_n = parse_number (year))
    }

    perc_women <- function(x){
      ifelse (length(x) == 2, x[2] / (x[1] + x[2]), NA)
    }

    nuts <- read.csv("nuts.csv") %>%
      mutate(NUTS2_sh = substr(NUTS2, 3, 4))

    nuts %>%
      distinct (NUTS2_en) %>%
      knitr::kable(
        booktabs = TRUE,
        caption = 'Nomenclature des Unités Territoriales Statistiques (NUTS)')

NUTS2_en |
---|
SE11 Stockholm |
SE12 East-Central Sweden |
SE21 Småland and islands |
SE22 South Sweden |
SE23 West Sweden |
SE31 North-Central Sweden |
SE32 Central Norrland |
SE33 Upper Norrland |

    SL.lm.caret <- function(..., method = "lm", tuneLength = 3, obsWeights = obsWeights,
                            trControl = caret::trainControl(method = "cv", number = 10, verboseIter = FALSE)){
      suppressWarnings(SL.caret(..., obsWeights = obsWeights, method = method,
                                tuneLength = tuneLength, trControl = trControl))
    }

    SL.lmStepAIC.caret <- function(..., method = "lmStepAIC", tuneLength = 3, obsWeights = obsWeights,
                                   trControl = caret::trainControl(method = "cv", number = 10, verboseIter = FALSE)){
      suppressWarnings(SL.caret(..., obsWeights = obsWeights, method = method,
                                tuneLength = tuneLength, trControl = trControl))
    }

    SL.bayesglm.caret <- function(..., method = "bayesglm", tuneLength = 3, obsWeights = obsWeights,
                                  trControl = caret::trainControl(method = "cv", number = 10, verboseIter = FALSE)){
      suppressWarnings(SL.caret(..., obsWeights = obsWeights, method = method,
                                tuneLength = tuneLength, trControl = trControl))
    }

    SL.rlm.caret <- function(..., method = "rlm", tuneLength = 3, obsWeights = obsWeights,
                             trControl = caret::trainControl(method = "cv", number = 10, verboseIter = FALSE)){
      suppressWarnings(SL.caret(..., obsWeights = obsWeights, method = method,
                                tuneLength = tuneLength, trControl = trControl))
    }

The data tables are downloaded from Statistics Sweden, http://www.statistikdatabasen.scb.se/pxweb/en/ssd/, and saved as comma-delimited files without heading, e.g. UF0506A1.csv.
The tables:
UF0506A1_1.csv: Population 16-74 years of age by region, highest level of education, age and sex. Year 1985 – 2018, NUTS 2 level 2008-, 10 year intervals (16-74).
000000CG_1.csv: Average basic salary, monthly salary and women's salary as a percentage of men's salary by region, sector, occupational group (SSYK 2012) and sex. Year 2014 – 2018, Monthly salary, All sectors.
000000CD_1.csv: Average basic salary, monthly salary and women's salary as a percentage of men's salary by region, sector, occupational group (SSYK 2012) and sex. Year 2014 – 2018, Number of employees, All sectors.
The data is aggregated; the size of each group is in the column groupsize.
I have also included some calculated predictors from the original data.
perc_women: The percentage of women within each group defined by edulevel, region and year
perc_women_region: The percentage of women within each group defined by year and region
regioneduyears: The average number of education years per capita within each group defined by year and region
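eduquotient: The quotient between regioneduyears for women and men (women divided by men, as computed in the code below)
salaryquotient: The quotient between salary for women and men (women divided by men) within each group defined by year and region
perc_women_eng_region: The percentage of women among the employees in the occupational group, within each group defined by year and region (the name is kept from the earlier post on engineers)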
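To make the definitions above concrete, here is a minimal sketch on invented numbers, using only base R. The data frame `toy` and its values are assumptions made for illustration and are not taken from the Statistics Sweden tables; only `perc_women` is copied from the post.

```r
# Toy aggregated data: one region and year, two sexes, two education levels.
# All numbers are invented for illustration.
toy <- data.frame(
  sex       = c("men", "women", "men", "women"),
  eduyears  = c(12, 12, 15, 15),
  groupsize = c(300, 200, 100, 400)
)

# Same helper as in the post: share of the second element (women)
perc_women <- function(x) {
  ifelse(length(x) == 2, x[2] / (x[1] + x[2]), NA)
}

# Group totals per sex; tapply orders the result alphabetically: men, women
size_by_sex <- tapply(toy$groupsize, toy$sex, sum)
share_women <- perc_women(as.numeric(size_by_sex))

# Group-size weighted mean of education years, overall and per sex
eduyears_all <- sum(toy$groupsize * toy$eduyears) / sum(toy$groupsize)
eduyears_sex <- sapply(split(toy, toy$sex), function(d) {
  sum(d$groupsize * d$eduyears) / sum(d$groupsize)
})

# Quotient with women in the numerator, mirroring element [2] / element [1]
eduquotient <- unname(eduyears_sex["women"] / eduyears_sex["men"])

print(c(share_women = share_women, eduyears_all = eduyears_all,
        eduquotient = eduquotient))
```

Because the data is aggregated, every mean has to be weighted by `groupsize`; an unweighted mean of `eduyears` would treat a group of 100 people the same as a group of 400.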

    numedulevel <- read.csv("edulevel_1.csv")
    numedulevel[, 2] <- data.frame(c(8, 9, 10, 12, 13, 15, 22, NA))

    tb <- readfile("000000CG_1.csv")
    tb <- readfile("000000CD_1.csv") %>%
      left_join(tb, by = c("region", "year", "sex", "sector","occuptional (SSYK 2012)"))

    tb <- readfile("UF0506A1_1.csv") %>%
      right_join(tb, by = c("region", "year", "sex")) %>%
      right_join(numedulevel, by = c("level of education" = "level.of.education")) %>%
      filter(!is.na(eduyears)) %>%
      mutate(edulevel = `level of education`) %>%
      group_by(edulevel, region, year, sex, `occuptional (SSYK 2012)`) %>%
      mutate(groupsize_all_ages = sum(groupsize)) %>%
      group_by(edulevel, region, year, `occuptional (SSYK 2012)`) %>%
      mutate (perc_women = perc_women (groupsize_all_ages[1:2])) %>%
      mutate (suming = sum(groupsize.x)) %>%
      mutate (salary = (groupsize.y[2] * groupsize.x[2] + groupsize.y[1] * groupsize.x[1])/(groupsize.x[2] + groupsize.x[1])) %>%
      group_by (sex, year, region, `occuptional (SSYK 2012)`) %>%
      mutate(regioneduyears_sex = sum(groupsize * eduyears) / sum(groupsize)) %>%
      mutate(regiongroupsize = sum(groupsize)) %>%
      mutate(suming_sex = groupsize.x) %>%
      group_by(region, year, `occuptional (SSYK 2012)`) %>%
      mutate (sum_pop = sum(groupsize)) %>%
      mutate (regioneduyears = sum(groupsize * eduyears) / sum(groupsize)) %>%
      mutate (perc_women_region = perc_women (regiongroupsize[1:2])) %>%
      mutate (eduquotient = regioneduyears_sex[2] / regioneduyears_sex[1]) %>%
      mutate (salary_sex = groupsize.y) %>%
      mutate (salaryquotient = salary_sex[2] / salary_sex[1]) %>%
      mutate (perc_women_eng_region = perc_women(suming_sex[1:2])) %>%
      left_join(nuts %>% distinct (NUTS2_en, NUTS2_sh), by = c("region" = "NUTS2_en")) %>%
      drop_na()

    summary(tb)
    ##     region              age            level of education     sex
    ##  Length:29050       Length:29050       Length:29050       Length:29050
    ##  Class :character   Class :character   Class :character   Class :character
    ##  Mode  :character   Mode  :character   Mode  :character   Mode  :character
    ##
    ##
    ##
    ##      year             groupsize          year_n        sector
    ##  Length:29050       Min.   :   405   Min.   :2014   Length:29050
    ##  Class :character   1st Qu.: 25412   1st Qu.:2015   Class :character
    ##  Mode  :character   Median : 61291   Median :2016   Mode  :character
    ##                     Mean   : 71345   Mean   :2016
    ##                     3rd Qu.:113524   3rd Qu.:2017
    ##                     Max.   :271889   Max.   :2018
    ##  occuptional (SSYK 2012)  groupsize.x       year_n.x     groupsize.y
    ##  Length:29050             Min.   :  100   Min.   :2014   Min.   : 20200
    ##  Class :character         1st Qu.:  490   1st Qu.:2015   1st Qu.: 28900
    ##  Mode  :character         Median : 1300   Median :2016   Median : 33900
    ##                           Mean   : 3258   Mean   :2016   Mean   : 37066
    ##                           3rd Qu.: 3400   3rd Qu.:2017   3rd Qu.: 42100
    ##                           Max.   :45000   Max.   :2018   Max.   :133600
    ##     year_n.y       eduyears        edulevel         groupsize_all_ages
    ##  Min.   :2014   Min.   : 8.00   Length:29050       Min.   :   405
    ##  1st Qu.:2015   1st Qu.: 9.00   Class :character   1st Qu.: 25412
    ##  Median :2016   Median :12.00   Mode  :character   Median : 61291
    ##  Mean   :2016   Mean   :12.71                      Mean   : 71345
    ##  3rd Qu.:2017   3rd Qu.:15.00                      3rd Qu.:113524
    ##  Max.   :2018   Max.   :22.00                      Max.   :271889
    ##    perc_women         suming           salary       regioneduyears_sex
    ##  Min.   :0.3575   Min.   :  240   Min.   : 20661   Min.   :11.18
    ##  1st Qu.:0.4343   1st Qu.: 1330   1st Qu.: 29046   1st Qu.:11.63
    ##  Median :0.4655   Median : 3100   Median : 34041   Median :11.78
    ##  Mean   :0.4775   Mean   : 6515   Mean   : 37105   Mean   :11.83
    ##  3rd Qu.:0.5132   3rd Qu.: 7400   3rd Qu.: 42068   3rd Qu.:12.09
    ##  Max.   :0.6423   Max.   :60000   Max.   :113976   Max.   :12.55
    ##  regiongroupsize    suming_sex       sum_pop        regioneduyears
    ##  Min.   :128262   Min.   :  100   Min.   : 262870   Min.   :11.39
    ##  1st Qu.:292864   1st Qu.:  490   1st Qu.: 596546   1st Qu.:11.56
    ##  Median :528643   Median : 1300   Median :1057419   Median :11.82
    ##  Mean   :499413   Mean   : 3258   Mean   : 998826   Mean   :11.83
    ##  3rd Qu.:708813   3rd Qu.: 3400   3rd Qu.:1417931   3rd Qu.:11.93
    ##  Max.   :827940   Max.   :45000   Max.   :1655215   Max.   :12.41
    ##  perc_women_region  eduquotient      salary_sex     salaryquotient
    ##  Min.   :0.4831    Min.   :1.019   Min.   : 20200   Min.   :0.6423
    ##  1st Qu.:0.4890    1st Qu.:1.027   1st Qu.: 28900   1st Qu.:0.9144
    ##  Median :0.4937    Median :1.032   Median : 33900   Median :0.9556
    ##  Mean   :0.4931    Mean   :1.033   Mean   : 37066   Mean   :0.9502
    ##  3rd Qu.:0.4971    3rd Qu.:1.040   3rd Qu.: 42100   3rd Qu.:0.9941
    ##  Max.   :0.5014    Max.   :1.047   Max.   :133600   Max.   :1.3090
    ##  perc_women_eng_region   NUTS2_sh
    ##  Min.   :0.01659       Length:29050
    ##  1st Qu.:0.30876       Class :character
    ##  Median :0.56000       Mode  :character
    ##  Mean   :0.52565
    ##  3rd Qu.:0.72414
    ##  Max.   :0.94527

    tbtemp <- ungroup(tb) %>%
      dplyr::select(salary, suming, year_n, sum_pop, regioneduyears, perc_women_region,
                    salaryquotient, eduquotient, perc_women_eng_region, `occuptional (SSYK 2012)`)
    tb_unique <- unique(tbtemp)

I will use SuperLearner to train the ensemble consisting of four linear models without interactions. The four models are Linear Regression (lm), Linear Regression with Stepwise Selection (lmStepAIC), Bayesian Generalized Linear Model (bayesglm) and Robust Linear Model (rlm).
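As a rough illustration of the idea behind such an ensemble, the sketch below combines two linear models with a single weight chosen on cross-validated predictions. This is a simplification on simulated data, written in base R only: SuperLearner combines any number of learners (by default with non-negative least squares, as far as I know), whereas this sketch handles exactly two hypothetical learners, `full` and `reduced`, with `optimize`.

```r
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 - x2 + rnorm(n, sd = 0.5)
dat <- data.frame(y, x1, x2)

# Two hypothetical base learners: a full and a reduced linear model
forms <- list(full = y ~ x1 + x2, reduced = y ~ x1)

# 10-fold cross-validated predictions for each learner
folds <- sample(rep(1:10, length.out = n))
cv_pred <- sapply(forms, function(f) {
  p <- numeric(n)
  for (k in 1:10) {
    fit <- lm(f, data = dat[folds != k, ])
    p[folds == k] <- predict(fit, newdata = dat[folds == k, ])
  }
  p
})

# Cross-validated mean squared error of the weighted combination
cv_mse <- function(w) {
  mean((y - (w * cv_pred[, "full"] + (1 - w) * cv_pred[, "reduced"]))^2)
}

# Choose the weight w in [0, 1] that minimises the cross-validated error
w_full <- optimize(cv_mse, interval = c(0, 1))$minimum
weights <- c(full = w_full, reduced = 1 - w_full)
print(round(weights, 3))
```

The weights play the same role as the `coef` element of a fitted SuperLearner object, which the loop below stores in `sp_table`.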

    summary_table = vector()
    cor_table = vector()
    sp_table <- vector()
    rmse_table <- vector()

    for (i in unique(tb_unique$`occuptional (SSYK 2012)`)){
      temp <- filter(tb_unique, `occuptional (SSYK 2012)` == i)
      # Only model occupational groups with more than 20 observations
      if (dim(temp)[1] > 20){
        # The number of employees in the group is used as observation weight
        temp_weights = temp$suming
        temp <- dplyr::select(temp, - c(`occuptional (SSYK 2012)`, suming))

        blueprint <- recipe(perc_women_eng_region ~ ., data = temp) %>%
          step_integer(matches("Qual|Cond|QC|Qu")) %>%
          step_center(all_numeric(), -all_outcomes()) %>%
          step_scale(all_numeric(), -all_outcomes()) %>%
          step_dummy(all_nominal(), -all_outcomes(), one_hot = TRUE)
        prepare <- prep(blueprint, training = temp)
        temp <- bake(prepare, new_data = temp)

        invisible(capture.output(model <- SuperLearner(
          temp$perc_women_eng_region,
          data.frame(dplyr::select(temp, -c(perc_women_eng_region))),
          family = gaussian(), verbose = FALSE, obsWeights = temp_weights,
          SL.library = list("SL.lm.caret", "SL.lmStepAIC.caret", "SL.bayesglm.caret", "SL.rlm.caret"))))

        # Prediction wrapper for iml; predicts with the model object it is given
        pred <- function(object, newdata){
          predict(object, newdata=newdata, onlySL = TRUE)$pred
        }
        predictor <- Predictor$new(model,
                                   data = dplyr::select(temp, -perc_women_eng_region),
                                   y = temp$perc_women_eng_region,
                                   predict.fun = pred)
        imp <- FeatureImp$new(predictor, loss = "mae", n.repetitions = 30)

        # Feature importance per occupational group
        summary_table <- rbind(summary_table,
          mutate(tibble(.rows = 7),
                 importance = imp$results$importance,
                 feature = imp$results$feature,
                 importance.05 = imp$results$importance.05,
                 ssyk = i))
        # Correlation between each feature and the response
        cor_table <- rbind(cor_table,
          mutate(tibble(.rows = 7),
                 feature = colnames(dplyr::select(temp, -c(perc_women_eng_region))),
                 cor = cor(dplyr::select(temp, -c(perc_women_eng_region)), temp$perc_women_eng_region),
                 ssyk = i))
        # Weight of each of the four models in the ensemble
        sp_table <- rbind(sp_table,
          mutate(tibble(.rows = 4),
                 coef = model$coef,
                 model = names(model$coef),
                 ssyk = i))
        # Fit of the ensemble on the training data
        prs <- postResample(pred = predict(model)$pred, obs = temp$perc_women_eng_region)
        rmse_table <- rbind(rmse_table,
          mutate(tibble(.rows = 1),
                 RMSE = prs[1], Rsquared = prs[2], MAE = prs[3],
                 ssyk = i))
      }
    }
    ## Registered S3 methods overwritten by 'lme4':
    ##   method                          from
    ##   cooks.distance.influence.merMod car
    ##   influence.merMod                car
    ##   dfbeta.influence.merMod         car
    ##   dfbetas.influence.merMod        car

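The loop above relies on permutation feature importance. Below is a minimal sketch of that technique on simulated data, using only base R and an ordinary `lm` fit instead of the ensemble. The feature names `strong`, `weak` and `noise` are invented. I assume here that the importance is the ratio of the permuted error to the original error, summarised by the median and the 5 % quantile over the repetitions, which is my understanding of what `FeatureImp` reports.

```r
set.seed(42)
n <- 300
dat <- data.frame(strong = rnorm(n), weak = rnorm(n), noise = rnorm(n))
dat$y <- 3 * dat$strong + 0.5 * dat$weak + rnorm(n, sd = 0.5)

fit <- lm(y ~ strong + weak + noise, data = dat)
mae <- function(obs, pred) mean(abs(obs - pred))
base_error <- mae(dat$y, predict(fit, newdata = dat))

# Permute one feature at a time and record the error ratio per repetition
perm_importance <- function(feature, n_rep = 30) {
  ratios <- replicate(n_rep, {
    shuffled <- dat
    shuffled[[feature]] <- sample(shuffled[[feature]])
    mae(dat$y, predict(fit, newdata = shuffled)) / base_error
  })
  c(importance = median(ratios),
    importance.05 = unname(quantile(ratios, 0.05)))
}

features <- c("strong", "weak", "noise")
imp <- t(sapply(features, perm_importance))
imp <- imp[order(-imp[, "importance"]), ]   # most important feature first

# Ratio used in the post: 5 % quantile of the top feature vs. the runner-up
diff1 <- imp[1, "importance.05"] / imp[2, "importance"]
print(imp)
print(diff1)
```

A feature whose permutation leaves the error unchanged gets an importance close to 1, which is why many rows in the table below show exactly 1.0000000.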
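The table below shows the feature importance for the different occupational groups. diff1 is the 5 % quantile of the importance of the most important feature divided by the importance of the second most important feature, and diff2 is the same ratio for the second and third features. A diff1 above 1 suggests that there is a single important feature for the occupational group, and a diff2 above 1 that there are two important features. The Rsquared value indicates how well the model fits the data for the occupational group.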

    summary_table %>%
      group_by(ssyk) %>%
      dplyr::mutate(diff1 = importance.05[1] / importance[2]) %>%
      dplyr::mutate(diff2 = importance.05[2] / importance[3]) %>%
      left_join(cor_table, by = c("ssyk", "feature")) %>%
      left_join(sp_table %>% spread(model, coef), by=c("ssyk")) %>%
      left_join(rmse_table, by=c("ssyk")) %>%
      dplyr::select(ssyk, feature, importance, importance.05, diff1, diff2, Rsquared) %>%
      knitr::kable(
        booktabs = TRUE,
        caption = 'Feature values for different occupation groups')

ssyk | feature | importance | importance.05 | diff1 | diff2 | Rsquared |
---|---|---|---|---|---|---|
123 Administration and planning managers | eduquotient | 3.8170179 | 3.2624044 | 0.9992710 | 0.9746650 | 0.5296568 |
123 Administration and planning managers | sum_pop | 3.2647844 | 2.6462447 | 0.9992710 | 0.9746650 | 0.5296568 |
123 Administration and planning managers | salary | 2.7150299 | 2.3345443 | 0.9992710 | 0.9746650 | 0.5296568 |
123 Administration and planning managers | regioneduyears | 2.5260824 | 2.1015060 | 0.9992710 | 0.9746650 | 0.5296568 |
123 Administration and planning managers | salaryquotient | 1.2063518 | 1.0434914 | 0.9992710 | 0.9746650 | 0.5296568 |
123 Administration and planning managers | perc_women_region | 1.1813452 | 1.0858792 | 0.9992710 | 0.9746650 | 0.5296568 |
123 Administration and planning managers | year_n | 1.0740710 | 1.0155535 | 0.9992710 | 0.9746650 | 0.5296568 |
141 Primary and secondary schools and adult education managers | regioneduyears | 2.7494050 | 2.2970027 | 1.3327022 | 0.9503406 | 0.6950328 |
141 Primary and secondary schools and adult education managers | year_n | 1.7235678 | 1.4984603 | 1.3327022 | 0.9503406 | 0.6950328 |
141 Primary and secondary schools and adult education managers | salary | 1.5767613 | 1.4698417 | 1.3327022 | 0.9503406 | 0.6950328 |
141 Primary and secondary schools and adult education managers | sum_pop | 1.0000000 | 1.0000000 | 1.3327022 | 0.9503406 | 0.6950328 |
141 Primary and secondary schools and adult education managers | perc_women_region | 1.0000000 | 1.0000000 | 1.3327022 | 0.9503406 | 0.6950328 |
141 Primary and secondary schools and adult education managers | salaryquotient | 1.0000000 | 1.0000000 | 1.3327022 | 0.9503406 | 0.6950328 |
141 Primary and secondary schools and adult education managers | eduquotient | 1.0000000 | 1.0000000 | 1.3327022 | 0.9503406 | 0.6950328 |
151 Health care managers | salaryquotient | 2.4538279 | 1.9290583 | 0.8094247 | 0.9296316 | 0.5799653 |
151 Health care managers | regioneduyears | 2.3832462 | 2.1362284 | 0.8094247 | 0.9296316 | 0.5799653 |
151 Health care managers | salary | 2.2979301 | 1.9260029 | 0.8094247 | 0.9296316 | 0.5799653 |
151 Health care managers | eduquotient | 2.2848362 | 1.7029592 | 0.8094247 | 0.9296316 | 0.5799653 |
151 Health care managers | year_n | 1.5379944 | 1.3579666 | 0.8094247 | 0.9296316 | 0.5799653 |
151 Health care managers | sum_pop | 1.3662380 | 1.1596424 | 0.8094247 | 0.9296316 | 0.5799653 |
151 Health care managers | perc_women_region | 1.0388730 | 1.0045578 | 0.8094247 | 0.9296316 | 0.5799653 |
153 Elderly care managers | year_n | 8.1559615 | 6.4775973 | 0.9123375 | 1.4669946 | 0.8177027 |
153 Elderly care managers | salary | 7.1000012 | 5.9284122 | 0.9123375 | 1.4669946 | 0.8177027 |
153 Elderly care managers | perc_women_region | 4.0411955 | 3.4263668 | 0.9123375 | 1.4669946 | 0.8177027 |
153 Elderly care managers | salaryquotient | 1.5390920 | 1.2329029 | 0.9123375 | 1.4669946 | 0.8177027 |
153 Elderly care managers | sum_pop | 1.0000000 | 1.0000000 | 0.9123375 | 1.4669946 | 0.8177027 |
153 Elderly care managers | regioneduyears | 1.0000000 | 1.0000000 | 0.9123375 | 1.4669946 | 0.8177027 |
153 Elderly care managers | eduquotient | 1.0000000 | 1.0000000 | 0.9123375 | 1.4669946 | 0.8177027 |
159 Other social services managers | sum_pop | 3.0725550 | 2.5099215 | 0.8325266 | 0.9953769 | 0.7325679 |
159 Other social services managers | salary | 3.0148243 | 2.6352584 | 0.8325266 | 0.9953769 | 0.7325679 |
159 Other social services managers | regioneduyears | 2.6474981 | 2.2659272 | 0.8325266 | 0.9953769 | 0.7325679 |
159 Other social services managers | eduquotient | 2.1941589 | 1.9031203 | 0.8325266 | 0.9953769 | 0.7325679 |
159 Other social services managers | year_n | 1.7175697 | 1.4021163 | 0.8325266 | 0.9953769 | 0.7325679 |
159 Other social services managers | salaryquotient | 1.5834901 | 1.3638281 | 0.8325266 | 0.9953769 | 0.7325679 |
159 Other social services managers | perc_women_region | 1.0000000 | 1.0000000 | 0.8325266 | 0.9953769 | 0.7325679 |
211 Physicists and chemists | eduquotient | 3.1793564 | 2.4217960 | 0.8168496 | 1.4762352 | 0.7777887 |
211 Physicists and chemists | perc_women_region | 2.9648004 | 2.4885379 | 0.8168496 | 1.4762352 | 0.7777887 |
211 Physicists and chemists | year_n | 1.6857326 | 1.4429244 | 0.8168496 | 1.4762352 | 0.7777887 |
211 Physicists and chemists | regioneduyears | 1.6402704 | 1.3256462 | 0.8168496 | 1.4762352 | 0.7777887 |
211 Physicists and chemists | sum_pop | 1.5988744 | 1.2158115 | 0.8168496 | 1.4762352 | 0.7777887 |
211 Physicists and chemists | salary | 1.5727207 | 1.3608653 | 0.8168496 | 1.4762352 | 0.7777887 |
211 Physicists and chemists | salaryquotient | 1.3311748 | 1.0615814 | 0.8168496 | 1.4762352 | 0.7777887 |
214 Engineering professionals | sum_pop | 3.2085990 | 2.5852735 | 1.0619019 | 1.1085082 | 0.8501315 |
214 Engineering professionals | regioneduyears | 2.4345690 | 2.0768022 | 1.0619019 | 1.1085082 | 0.8501315 |
214 Engineering professionals | eduquotient | 1.8735109 | 1.5486497 | 1.0619019 | 1.1085082 | 0.8501315 |
214 Engineering professionals | salary | 1.0000000 | 1.0000000 | 1.0619019 | 1.1085082 | 0.8501315 |
214 Engineering professionals | year_n | 1.0000000 | 1.0000000 | 1.0619019 | 1.1085082 | 0.8501315 |
214 Engineering professionals | perc_women_region | 1.0000000 | 1.0000000 | 1.0619019 | 1.1085082 | 0.8501315 |
214 Engineering professionals | salaryquotient | 1.0000000 | 1.0000000 | 1.0619019 | 1.1085082 | 0.8501315 |
218 Specialists within environmental and health protection | year_n | 1.1998968 | 1.0271753 | 0.9319265 | 1.0204712 | 0.2072889 |
218 Specialists within environmental and health protection | sum_pop | 1.1022064 | 1.0204712 | 0.9319265 | 1.0204712 | 0.2072889 |
218 Specialists within environmental and health protection | salary | 1.0000000 | 1.0000000 | 0.9319265 | 1.0204712 | 0.2072889 |
218 Specialists within environmental and health protection | regioneduyears | 1.0000000 | 1.0000000 | 0.9319265 | 1.0204712 | 0.2072889 |
218 Specialists within environmental and health protection | perc_women_region | 1.0000000 | 1.0000000 | 0.9319265 | 1.0204712 | 0.2072889 |
218 Specialists within environmental and health protection | salaryquotient | 1.0000000 | 1.0000000 | 0.9319265 | 1.0204712 | 0.2072889 |
218 Specialists within environmental and health protection | eduquotient | 1.0000000 | 1.0000000 | 0.9319265 | 1.0204712 | 0.2072889 |
221 Medical doctors | regioneduyears | 3.2538623 | 2.7126722 | 1.6832188 | 1.0066594 | 0.7935628 |
221 Medical doctors | eduquotient | 1.6115981 | 1.4986615 | 1.6832188 | 1.0066594 | 0.7935628 |
221 Medical doctors | perc_women_region | 1.4887473 | 1.3049164 | 1.6832188 | 1.0066594 | 0.7935628 |
221 Medical doctors | sum_pop | 1.0890646 | 1.0519492 | 1.6832188 | 1.0066594 | 0.7935628 |
221 Medical doctors | salaryquotient | 1.0136252 | 0.9725422 | 1.6832188 | 1.0066594 | 0.7935628 |
221 Medical doctors | salary | 1.0079123 | 0.9912346 | 1.6832188 | 1.0066594 | 0.7935628 |
221 Medical doctors | year_n | 0.9691336 | 0.9264889 | 1.6832188 | 1.0066594 | 0.7935628 |
222 Nursing professionals | perc_women_region | 1.3570281 | 1.2297860 | 1.1599907 | 0.9741302 | 0.2680452 |
222 Nursing professionals | salaryquotient | 1.0601688 | 0.9943949 | 1.1599907 | 0.9741302 | 0.2680452 |
222 Nursing professionals | eduquotient | 1.0208028 | 0.9745776 | 1.1599907 | 0.9741302 | 0.2680452 |
222 Nursing professionals | year_n | 1.0058709 | 0.9866984 | 1.1599907 | 0.9741302 | 0.2680452 |
222 Nursing professionals | sum_pop | 1.0015554 | 0.9908744 | 1.1599907 | 0.9741302 | 0.2680452 |
222 Nursing professionals | salary | 0.9999898 | 0.9996937 | 1.1599907 | 0.9741302 | 0.2680452 |
222 Nursing professionals | regioneduyears | 0.9991395 | 0.9967976 | 1.1599907 | 0.9741302 | 0.2680452 |
223 Nursing professionals (cont.) | perc_women_region | 1.7410583 | 1.4227133 | 0.8555381 | 0.9923183 | 0.6124752 |
223 Nursing professionals (cont.) | eduquotient | 1.6629455 | 1.4414067 | 0.8555381 | 0.9923183 | 0.6124752 |
223 Nursing professionals (cont.) | sum_pop | 1.4525648 | 1.3870921 | 0.8555381 | 0.9923183 | 0.6124752 |
223 Nursing professionals (cont.) | salaryquotient | 1.2526913 | 1.1340334 | 0.8555381 | 0.9923183 | 0.6124752 |
223 Nursing professionals (cont.) | year_n | 1.1251733 | 1.0377315 | 0.8555381 | 0.9923183 | 0.6124752 |
223 Nursing professionals (cont.) | regioneduyears | 1.0772452 | 0.9979456 | 0.8555381 | 0.9923183 | 0.6124752 |
223 Nursing professionals (cont.) | salary | 1.0064257 | 0.9699734 | 0.8555381 | 0.9923183 | 0.6124752 |
227 Naprapaths, physiotherapists, occupational therapists | year_n | 3.3148421 | 2.8516422 | 1.2860370 | 1.4787861 | 0.5424810 |
227 Naprapaths, physiotherapists, occupational therapists | salary | 2.2173875 | 1.9400838 | 1.2860370 | 1.4787861 | 0.5424810 |
227 Naprapaths, physiotherapists, occupational therapists | eduquotient | 1.3119435 | 1.1242123 | 1.2860370 | 1.4787861 | 0.5424810 |
227 Naprapaths, physiotherapists, occupational therapists | regioneduyears | 1.3023137 | 1.1403014 | 1.2860370 | 1.4787861 | 0.5424810 |
227 Naprapaths, physiotherapists, occupational therapists | salaryquotient | 1.0993687 | 0.9919804 | 1.2860370 | 1.4787861 | 0.5424810 |
227 Naprapaths, physiotherapists, occupational therapists | perc_women_region | 1.0570268 | 0.9703933 | 1.2860370 | 1.4787861 | 0.5424810 |
227 Naprapaths, physiotherapists, occupational therapists | sum_pop | 0.9923096 | 0.9587380 | 1.2860370 | 1.4787861 | 0.5424810 |
231 University and higher education teachers | perc_women_region | 6.9035680 | 6.1896016 | 1.0668866 | 0.8353526 | 0.9357939 |
231 University and higher education teachers | year_n | 5.8015553 | 4.8319783 | 1.0668866 | 0.8353526 | 0.9357939 |
231 University and higher education teachers | salary | 5.7843576 | 4.9118836 | 1.0668866 | 0.8353526 | 0.9357939 |
231 University and higher education teachers | sum_pop | 4.2107594 | 3.2672779 | 1.0668866 | 0.8353526 | 0.9357939 |
231 University and higher education teachers | eduquotient | 3.6947346 | 3.1285856 | 1.0668866 | 0.8353526 | 0.9357939 |
231 University and higher education teachers | regioneduyears | 2.8014376 | 2.4699874 | 1.0668866 | 0.8353526 | 0.9357939 |
231 University and higher education teachers | salaryquotient | 1.7814310 | 1.4686631 | 1.0668866 | 0.8353526 | 0.9357939 |
232 Vocational education teachers | perc_women_region | 5.8995689 | 4.5458286 | 0.8486146 | 0.9931816 | 0.9152722 |
232 Vocational education teachers | salary | 5.3567644 | 4.4992796 | 0.8486146 | 0.9931816 | 0.9152722 |
232 Vocational education teachers | regioneduyears | 4.5301682 | 4.0344389 | 0.8486146 | 0.9931816 | 0.9152722 |
232 Vocational education teachers | year_n | 2.5787996 | 2.2283684 | 0.8486146 | 0.9931816 | 0.9152722 |
232 Vocational education teachers | eduquotient | 1.9948566 | 1.7227708 | 0.8486146 | 0.9931816 | 0.9152722 |
232 Vocational education teachers | salaryquotient | 1.7484881 | 1.4137498 | 0.8486146 | 0.9931816 | 0.9152722 |
232 Vocational education teachers | sum_pop | 1.1795101 | 1.0464747 | 0.8486146 | 0.9931816 | 0.9152722 |
233 Secondary education teachers | year_n | 1.7519346 | 1.5963138 | 0.9861296 | 0.8798435 | 0.2711955 |
233 Secondary education teachers | salary | 1.6187667 | 1.3647524 | 0.9861296 | 0.8798435 | 0.2711955 |
233 Secondary education teachers | perc_women_region | 1.5511308 | 1.3237125 | 0.9861296 | 0.8798435 | 0.2711955 |
233 Secondary education teachers | eduquotient | 1.4901622 | 1.3515191 | 0.9861296 | 0.8798435 | 0.2711955 |
233 Secondary education teachers | regioneduyears | 1.1340296 | 1.0608823 | 0.9861296 | 0.8798435 | 0.2711955 |
233 Secondary education teachers | sum_pop | 1.1115054 | 1.0431384 | 0.9861296 | 0.8798435 | 0.2711955 |
233 Secondary education teachers | salaryquotient | 1.0011883 | 0.9739857 | 0.9861296 | 0.8798435 | 0.2711955 |
234 Primary- and pre-school teachers | regioneduyears | 2.6651473 | 2.3148568 | 1.0820339 | 0.9396348 | 0.7919968 |
234 Primary- and pre-school teachers | eduquotient | 2.1393570 | 1.8980735 | 1.0820339 | 0.9396348 | 0.7919968 |
234 Primary- and pre-school teachers | sum_pop | 2.0200119 | 1.7615651 | 1.0820339 | 0.9396348 | 0.7919968 |
234 Primary- and pre-school teachers | year_n | 1.9879886 | 1.7570799 | 1.0820339 | 0.9396348 | 0.7919968 |
234 Primary- and pre-school teachers | salaryquotient | 1.5711047 | 1.3697916 | 1.0820339 | 0.9396348 | 0.7919968 |
234 Primary- and pre-school teachers | salary | 1.5376109 | 1.3834061 | 1.0820339 | 0.9396348 | 0.7919968 |
234 Primary- and pre-school teachers | perc_women_region | 1.0541899 | 1.0163654 | 1.0820339 | 0.9396348 | 0.7919968 |
235 Teaching professionals not elsewhere classified | eduquotient | 3.4752946 | 3.1173913 | 1.1572182 | 1.0071504 | 0.7038429 |
235 Teaching professionals not elsewhere classified | perc_women_region | 2.6938664 | 2.2359223 | 1.1572182 | 1.0071504 | 0.7038429 |
235 Teaching professionals not elsewhere classified | year_n | 2.2200482 | 1.9098982 | 1.1572182 | 1.0071504 | 0.7038429 |
235 Teaching professionals not elsewhere classified | salaryquotient | 1.8916217 | 1.6391206 | 1.1572182 | 1.0071504 | 0.7038429 |
235 Teaching professionals not elsewhere classified | regioneduyears | 1.3369762 | 1.1453482 | 1.1572182 | 1.0071504 | 0.7038429 |
235 Teaching professionals not elsewhere classified | sum_pop | 1.0086567 | 0.9599072 | 1.1572182 | 1.0071504 | 0.7038429 |
235 Teaching professionals not elsewhere classified | salary | 1.0045400 | 0.9954951 | 1.1572182 | 1.0071504 | 0.7038429 |
241 Accountants, financial analysts and fund managers | perc_women_region | 2.7081423 | 2.2326348 | 0.8460985 | 1.0919083 | 0.7445476 |
241 Accountants, financial analysts and fund managers | eduquotient | 2.6387410 | 2.1737040 | 0.8460985 | 1.0919083 | 0.7445476 |
241 Accountants, financial analysts and fund managers | year_n | 1.9907387 | 1.5693998 | 0.8460985 | 1.0919083 | 0.7445476 |
241 Accountants, financial analysts and fund managers | salary | 1.4932763 | 1.3223917 | 0.8460985 | 1.0919083 | 0.7445476 |
241 Accountants, financial analysts and fund managers | regioneduyears | 1.3933309 | 1.2289757 | 0.8460985 | 1.0919083 | 0.7445476 |
241 Accountants, financial analysts and fund managers | salaryquotient | 1.0962361 | 1.0154278 | 0.8460985 | 1.0919083 | 0.7445476 |
241 Accountants, financial analysts and fund managers | sum_pop | 0.9995289 | 0.9973319 | 0.8460985 | 1.0919083 | 0.7445476 |
242 Organisation analysts, policy administrators and human resource specialists | salary | 4.1453246 | 3.4636361 | 1.4988404 | 1.0425932 | 0.6524219 |
242 Organisation analysts, policy administrators and human resource specialists | perc_women_region | 2.3108772 | 1.9737256 | 1.4988404 | 1.0425932 | 0.6524219 |
242 Organisation analysts, policy administrators and human resource specialists | year_n | 1.8930927 | 1.6469206 | 1.4988404 | 1.0425932 | 0.6524219 |
242 Organisation analysts, policy administrators and human resource specialists | regioneduyears | 1.8639424 | 1.6551601 | 1.4988404 | 1.0425932 | 0.6524219 |
242 Organisation analysts, policy administrators and human resource specialists | eduquotient | 1.3041251 | 1.2098787 | 1.4988404 | 1.0425932 | 0.6524219 |
242 Organisation analysts, policy administrators and human resource specialists | sum_pop | 1.0982455 | 1.0016594 | 1.4988404 | 1.0425932 | 0.6524219 |
242 Organisation analysts, policy administrators and human resource specialists | salaryquotient | 1.0559543 | 0.9813767 | 1.4988404 | 1.0425932 | 0.6524219 |
243 Marketing and public relations professionals | sum_pop | 4.9569245 | 3.9496983 | 1.1206542 | 0.9392641 | 0.6445752 |
243 Marketing and public relations professionals | regioneduyears | 3.5244578 | 2.8349264 | 1.1206542 | 0.9392641 | 0.6445752 |
243 Marketing and public relations professionals | salary | 3.0182422 | 2.2538137 | 1.1206542 | 0.9392641 | 0.6445752 |
243 Marketing and public relations professionals | eduquotient | 2.6524352 | 1.9742342 | 1.1206542 | 0.9392641 | 0.6445752 |
243 Marketing and public relations professionals | year_n | 1.7174528 | 1.4543276 | 1.1206542 | 0.9392641 | 0.6445752 |
243 Marketing and public relations professionals | salaryquotient | 1.4552259 | 1.2338961 | 1.1206542 | 0.9392641 | 0.6445752 |
243 Marketing and public relations professionals | perc_women_region | 1.3152657 | 1.1626737 | 1.1206542 | 0.9392641 | 0.6445752 |
251 ICT architects, systems analysts and test managers | perc_women_region | 2.7358787 | 2.5568847 | 0.9528479 | 1.1920886 | 0.4818438 |
251 ICT architects, systems analysts and test managers | salary | 2.6834131 | 2.1391217 | 0.9528479 | 1.1920886 | 0.4818438 |
251 ICT architects, systems analysts and test managers | year_n | 1.7944318 | 1.5618951 | 0.9528479 | 1.1920886 | 0.4818438 |
251 ICT architects, systems analysts and test managers | eduquotient | 1.0100090 | 1.0001357 | 0.9528479 | 1.1920886 | 0.4818438 |
251 ICT architects, systems analysts and test managers | sum_pop | 1.0047852 | 1.0012175 | 0.9528479 | 1.1920886 | 0.4818438 |
251 ICT architects, systems analysts and test managers | salaryquotient | 0.9996955 | 0.9963383 | 0.9528479 | 1.1920886 | 0.4818438 |
251 ICT architects, systems analysts and test managers | regioneduyears | 0.9930233 | 0.9888326 | 0.9528479 | 1.1920886 | 0.4818438 |
261 Legal professionals | salary | 4.7549424 | 3.8162165 | 0.8451578 | 1.1021776 | 0.7483456 |
261 Legal professionals | sum_pop | 4.5153896 | 3.1719595 | 0.8451578 | 1.1021776 | 0.7483456 |
261 Legal professionals | perc_women_region | 2.8779023 | 2.5745840 | 0.8451578 | 1.1021776 | 0.7483456 |
261 Legal professionals | year_n | 2.6379752 | 2.2504586 | 0.8451578 | 1.1021776 | 0.7483456 |
261 Legal professionals | regioneduyears | 2.5557708 | 2.1184918 | 0.8451578 | 1.1021776 | 0.7483456 |
261 Legal professionals | eduquotient | 2.0561141 | 1.7014956 | 0.8451578 | 1.1021776 | 0.7483456 |
261 Legal professionals | salaryquotient | 1.4548882 | 1.2819586 | 0.8451578 | 1.1021776 | 0.7483456 |
262 Museum curators and librarians and related professionals | sum_pop | 3.3548098 | 2.4968169 | 0.7595766 | 0.8090532 | 0.7594220 |
262 Museum curators and librarians and related professionals | eduquotient | 3.2871165 | 2.5885880 | 0.7595766 | 0.8090532 | 0.7594220 |
262 Museum curators and librarians and related professionals | perc_women_region | 3.1995277 | 2.8162677 | 0.7595766 | 0.8090532 | 0.7594220 |
262 Museum curators and librarians and related professionals | salary | 1.9883411 | 1.6716447 | 0.7595766 | 0.8090532 | 0.7594220 |
262 Museum curators and librarians and related professionals | regioneduyears | 1.4723407 | 1.3110794 | 0.7595766 | 0.8090532 | 0.7594220 |
262 Museum curators and librarians and related professionals | year_n | 1.1910891 | 1.1044098 | 0.7595766 | 0.8090532 | 0.7594220 |
262 Museum curators and librarians and related professionals | salaryquotient | 1.0596886 | 0.9983240 | 0.7595766 | 0.8090532 | 0.7594220 |
266 Social work and counselling professionals | year_n | 2.0699805 | 1.8890549 | 1.3195137 | 0.9204363 | 0.6423319 |
266 Social work and counselling professionals | regioneduyears | 1.4316296 | 1.1948843 | 1.3195137 | 0.9204363 | 0.6423319 |
266 Social work and counselling professionals | perc_women_region | 1.2981716 | 1.0993452 | 1.3195137 | 0.9204363 | 0.6423319 |
266 Social work and counselling professionals | sum_pop | 1.2974744 | 1.1788171 | 1.3195137 | 0.9204363 | 0.6423319 |
266 Social work and counselling professionals | salaryquotient | 1.0015134 | 0.9980390 | 1.3195137 | 0.9204363 | 0.6423319 |
266 Social work and counselling professionals | eduquotient | 1.0004463 | 0.9980253 | 1.3195137 | 0.9204363 | 0.6423319 |
266 Social work and counselling professionals | salary | 1.0002662 | 0.9921359 | 1.3195137 | 0.9204363 | 0.6423319 |
311 Physical and engineering science technicians | perc_women_region | 2.8239037 | 2.3308712 | 1.2124932 | 1.1060730 | 0.6129610 |
311 Physical and engineering science technicians | year_n | 1.9223788 | 1.6234583 | 1.2124932 | 1.1060730 | 0.6129610 |
311 Physical and engineering science technicians | salary | 1.4677678 | 1.2914193 | 1.2124932 | 1.1060730 | 0.6129610 |
311 Physical and engineering science technicians | salaryquotient | 1.0353719 | 0.9889612 | 1.2124932 | 1.1060730 | 0.6129610 |
311 Physical and engineering science technicians | eduquotient | 1.0236570 | 0.9963231 | 1.2124932 | 1.1060730 | 0.6129610 |
311 Physical and engineering science technicians | sum_pop | 1.0233421 | 0.9919135 | 1.2124932 | 1.1060730 | 0.6129610 |
311 Physical and engineering science technicians | regioneduyears | 1.0116614 | 0.9855591 | 1.2124932 | 1.1060730 | 0.6129610 |
331 Financial and accounting associate professionals | eduquotient | 3.7511968 | 3.2140254 | 1.3911100 | 0.9346430 | 0.6495278 |
331 Financial and accounting associate professionals | perc_women_region | 2.3104035 | 1.9925122 | 1.3911100 | 0.9346430 | 0.6495278 |
331 Financial and accounting associate professionals | salary | 2.1318431 | 1.8521291 | 1.3911100 | 0.9346430 | 0.6495278 |
331 Financial and accounting associate professionals | sum_pop | 1.4214195 | 1.2916594 | 1.3911100 | 0.9346430 | 0.6495278 |
331 Financial and accounting associate professionals | salaryquotient | 1.3864519 | 1.1803667 | 1.3911100 | 0.9346430 | 0.6495278 |
331 Financial and accounting associate professionals | regioneduyears | 1.0725493 | 1.0231476 | 1.3911100 | 0.9346430 | 0.6495278 |
331 Financial and accounting associate professionals | year_n | 1.0156559 | 0.9924611 | 1.3911100 | 0.9346430 | 0.6495278 |
332 Insurance advisers, sales and purchasing agents | perc_women_region | 3.8010119 | 3.0275822 | 1.5742205 | 1.3483567 | 0.7642041 |
332 Insurance advisers, sales and purchasing agents | sum_pop | 1.9232262 | 1.7322825 | 1.5742205 | 1.3483567 | 0.7642041 |
332 Insurance advisers, sales and purchasing agents | salaryquotient | 1.2847360 | 1.1413503 | 1.5742205 | 1.3483567 | 0.7642041 |
332 Insurance advisers, sales and purchasing agents | year_n | 1.2196519 | 1.0707646 | 1.5742205 | 1.3483567 | 0.7642041 |
332 Insurance advisers, sales and purchasing agents | salary | 1.0000000 | 1.0000000 | 1.5742205 | 1.3483567 | 0.7642041 |
332 Insurance advisers, sales and purchasing agents | regioneduyears | 1.0000000 | 1.0000000 | 1.5742205 | 1.3483567 | 0.7642041 |
332 Insurance advisers, sales and purchasing agents | eduquotient | 1.0000000 | 1.0000000 | 1.5742205 | 1.3483567 | 0.7642041 |
333 Business services agents | regioneduyears | 3.1945994 | 2.5604487 | 1.4159454 | 1.0419269 | 0.4301719 |
333 Business services agents | eduquotient | 1.8082963 | 1.4856024 | 1.4159454 | 1.0419269 | 0.4301719 |
333 Business services agents | year_n | 1.4258221 | 1.1847659 | 1.4159454 | 1.0419269 | 0.4301719 |
333 Business services agents | sum_pop | 1.1998397 | 1.0205232 | 1.4159454 | 1.0419269 | 0.4301719 |
333 Business services agents | salaryquotient | 1.0691040 | 1.0183061 | 1.4159454 | 1.0419269 | 0.4301719 |
333 Business services agents | perc_women_region | 1.0376611 | 0.9739336 | 1.4159454 | 1.0419269 | 0.4301719 |
333 Business services agents | salary | 1.0005005 | 0.9973068 | 1.4159454 | 1.0419269 | 0.4301719 |
335 Tax and related government associate professionals | eduquotient | 4.1013441 | 3.4857810 | 1.2177057 | 1.1768387 | 0.7469849 |
335 Tax and related government associate professionals | sum_pop | 2.8625809 | 2.6778006 | 1.2177057 | 1.1768387 | 0.7469849 |
335 Tax and related government associate professionals | perc_women_region | 2.2754185 | 1.9606631 | 1.2177057 | 1.1768387 | 0.7469849 |
335 Tax and related government associate professionals | salary | 1.0000000 | 1.0000000 | 1.2177057 | 1.1768387 | 0.7469849 |
335 Tax and related government associate professionals | year_n | 1.0000000 | 1.0000000 | 1.2177057 | 1.1768387 | 0.7469849 |
335 Tax and related government associate professionals | regioneduyears | 1.0000000 | 1.0000000 | 1.2177057 | 1.1768387 | 0.7469849 |
335 Tax and related government associate professionals | salaryquotient | 1.0000000 | 1.0000000 | 1.2177057 | 1.1768387 | 0.7469849 |
336 Police officers | eduquotient | 6.6149010 | 5.4101856 | 1.3781681 | 0.9403836 | 0.6720628 |
336 Police officers | sum_pop | 3.9256356 | 3.1250520 | 1.3781681 | 0.9403836 | 0.6720628 |
336 Police officers | salary | 3.3231673 | 2.8227248 | 1.3781681 | 0.9403836 | 0.6720628 |
336 Police officers | regioneduyears | 3.2344097 | 2.8043712 | 1.3781681 | 0.9403836 | 0.6720628 |
336 Police officers | perc_women_region | 2.0402484 | 1.7876511 | 1.3781681 | 0.9403836 | 0.6720628 |
336 Police officers | year_n | 1.8543180 | 1.6559515 | 1.3781681 | 0.9403836 | 0.6720628 |
336 Police officers | salaryquotient | 1.1808364 | 1.0825015 | 1.3781681 | 0.9403836 | 0.6720628 |
411 Office assistants and other secretaries | perc_women_region | 2.1872367 | 1.7804951 | 0.8268219 | 0.9425405 | 0.5768773 |
411 Office assistants and other secretaries | sum_pop | 2.1534202 | 1.8012277 | 0.8268219 | 0.9425405 | 0.5768773 |
411 Office assistants and other secretaries | salary | 1.9110349 | 1.5828173 | 0.8268219 | 0.9425405 | 0.5768773 |
411 Office assistants and other secretaries | year_n | 1.3145981 | 1.1258907 | 0.8268219 | 0.9425405 | 0.5768773 |
411 Office assistants and other secretaries | salaryquotient | 1.1366153 | 1.0368470 | 0.8268219 | 0.9425405 | 0.5768773 |
411 Office assistants and other secretaries | regioneduyears | 1.1011724 | 1.0330416 | 0.8268219 | 0.9425405 | 0.5768773 |
411 Office assistants and other secretaries | eduquotient | 1.0150045 | 0.9988372 | 0.8268219 | 0.9425405 | 0.5768773 |
422 Client information clerks | sum_pop | 2.2556210 | 2.0199751 | 1.1114007 | 1.2922028 | 0.5754468 |
422 Client information clerks | regioneduyears | 1.8175038 | 1.6419553 | 1.1114007 | 1.2922028 | 0.5754468 |
422 Client information clerks | salaryquotient | 1.2706638 | 1.1570272 | 1.1114007 | 1.2922028 | 0.5754468 |
422 Client information clerks | salary | 1.0000000 | 1.0000000 | 1.1114007 | 1.2922028 | 0.5754468 |
422 Client information clerks | year_n | 1.0000000 | 1.0000000 | 1.1114007 | 1.2922028 | 0.5754468 |
422 Client information clerks | perc_women_region | 1.0000000 | 1.0000000 | 1.1114007 | 1.2922028 | 0.5754468 |
422 Client information clerks | eduquotient | 1.0000000 | 1.0000000 | 1.1114007 | 1.2922028 | 0.5754468 |
532 Personal care workers in health services | regioneduyears | 6.6360229 | 5.5777322 | 1.6115974 | 0.8774912 | 0.8998367 |
532 Personal care workers in health services | eduquotient | 3.4609960 | 2.8222051 | 1.6115974 | 0.8774912 | 0.8998367 |
532 Personal care workers in health services | salary | 3.2162205 | 2.7855818 | 1.6115974 | 0.8774912 | 0.8998367 |
532 Personal care workers in health services | year_n | 2.1602674 | 1.8489474 | 1.6115974 | 0.8774912 | 0.8998367 |
532 Personal care workers in health services | sum_pop | 1.2635892 | 1.1422356 | 1.6115974 | 0.8774912 | 0.8998367 |
532 Personal care workers in health services | salaryquotient | 1.0352898 | 0.9386980 | 1.6115974 | 0.8774912 | 0.8998367 |
532 Personal care workers in health services | perc_women_region | 0.9978554 | 0.9931738 | 1.6115974 | 0.8774912 | 0.8998367 |
533 Health care assistants | regioneduyears | 5.8197354 | 4.7731043 | 1.3421333 | 1.1939419 | 0.9165128 |
533 Health care assistants | eduquotient | 3.5563562 | 3.0151663 | 1.3421333 | 1.1939419 | 0.9165128 |
533 Health care assistants | year_n | 2.5253877 | 2.3079860 | 1.3421333 | 1.1939419 | 0.9165128 |
533 Health care assistants | sum_pop | 1.6773638 | 1.3046246 | 1.3421333 | 1.1939419 | 0.9165128 |
533 Health care assistants | salary | 1.6134876 | 1.4965381 | 1.3421333 | 1.1939419 | 0.9165128 |
533 Health care assistants | perc_women_region | 1.3948836 | 1.1740007 | 1.3421333 | 1.1939419 | 0.9165128 |
533 Health care assistants | salaryquotient | 1.3923600 | 1.1347137 | 1.3421333 | 1.1939419 | 0.9165128 |
534 Attendants, personal assistants and related workers | salary | 3.5389050 | 3.1870397 | 0.9663243 | 0.9694028 | 0.6206695 |
534 Attendants, personal assistants and related workers | year_n | 3.2981057 | 2.6457771 | 0.9663243 | 0.9694028 | 0.6206695 |
534 Attendants, personal assistants and related workers | regioneduyears | 2.7292856 | 2.3165318 | 0.9663243 | 0.9694028 | 0.6206695 |
534 Attendants, personal assistants and related workers | sum_pop | 2.2153663 | 1.9380926 | 0.9663243 | 0.9694028 | 0.6206695 |
534 Attendants, personal assistants and related workers | eduquotient | 2.1263593 | 1.8101511 | 0.9663243 | 0.9694028 | 0.6206695 |
534 Attendants, personal assistants and related workers | perc_women_region | 1.5857830 | 1.4075435 | 0.9663243 | 0.9694028 | 0.6206695 |
534 Attendants, personal assistants and related workers | salaryquotient | 1.0341103 | 0.9911844 | 0.9663243 | 0.9694028 | 0.6206695 |
541 Other surveillance and security workers | salary | 4.7774908 | 4.1261786 | 1.0229660 | 0.8534003 | 0.6723747 |
541 Other surveillance and security workers | perc_women_region | 4.0335443 | 3.0521977 | 1.0229660 | 0.8534003 | 0.6723747 |
541 Other surveillance and security workers | year_n | 3.5765135 | 2.9675823 | 1.0229660 | 0.8534003 | 0.6723747 |
541 Other surveillance and security workers | eduquotient | 2.0845999 | 1.6709589 | 1.0229660 | 0.8534003 | 0.6723747 |
541 Other surveillance and security workers | sum_pop | 1.8146068 | 1.5619745 | 1.0229660 | 0.8534003 | 0.6723747 |
541 Other surveillance and security workers | regioneduyears | 1.0411189 | 0.9862411 | 1.0229660 | 0.8534003 | 0.6723747 |
541 Other surveillance and security workers | salaryquotient | 1.0341765 | 0.9505941 | 1.0229660 | 0.8534003 | 0.6723747 |
962 Newspaper distributors, janitors and other service workers | sum_pop | 2.7464002 | 2.2523244 | 1.1644473 | 0.8815180 | 0.7418281 |
962 Newspaper distributors, janitors and other service workers | perc_women_region | 1.9342434 | 1.6159634 | 1.1644473 | 0.8815180 | 0.7418281 |
962 Newspaper distributors, janitors and other service workers | regioneduyears | 1.8331599 | 1.5608802 | 1.1644473 | 0.8815180 | 0.7418281 |
962 Newspaper distributors, janitors and other service workers | salary | 1.6450591 | 1.3684304 | 1.1644473 | 0.8815180 | 0.7418281 |
962 Newspaper distributors, janitors and other service workers | eduquotient | 1.2975906 | 1.1453299 | 1.1644473 | 0.8815180 | 0.7418281 |
962 Newspaper distributors, janitors and other service workers | salaryquotient | 1.0911504 | 1.0023219 | 1.1644473 | 0.8815180 | 0.7418281 |
962 Newspaper distributors, janitors and other service workers | year_n | 1.0074452 | 0.9471405 | 1.1644473 | 0.8815180 | 0.7418281 |
134 Architectural and engineering managers | salary | 6.6628692 | 5.5853307 | 0.9346946 | 0.8201284 | 0.9151922 |
134 Architectural and engineering managers | eduquotient | 5.9755676 | 4.7127798 | 0.9346946 | 0.8201284 | 0.9151922 |
134 Architectural and engineering managers | regioneduyears | 5.7463923 | 4.7283129 | 0.9346946 | 0.8201284 | 0.9151922 |
134 Architectural and engineering managers | perc_women_region | 2.1729423 | 1.7202449 | 0.9346946 | 0.8201284 | 0.9151922 |
134 Architectural and engineering managers | salaryquotient | 1.7104284 | 1.4330406 | 0.9346946 | 0.8201284 | 0.9151922 |
134 Architectural and engineering managers | sum_pop | 1.5720877 | 1.3353975 | 0.9346946 | 0.8201284 | 0.9151922 |
134 Architectural and engineering managers | year_n | 1.3342337 | 1.0743796 | 0.9346946 | 0.8201284 | 0.9151922 |
321 Medical and pharmaceutical technicians | sum_pop | 2.7792282 | 2.4396571 | 1.3557086 | 0.9362958 | 0.4082033 |
321 Medical and pharmaceutical technicians | salaryquotient | 1.7995439 | 1.5614282 | 1.3557086 | 0.9362958 | 0.4082033 |
321 Medical and pharmaceutical technicians | regioneduyears | 1.6676655 | 1.4757725 | 1.3557086 | 0.9362958 | 0.4082033 |
321 Medical and pharmaceutical technicians | salary | 1.6228142 | 1.3713812 | 1.3557086 | 0.9362958 | 0.4082033 |
321 Medical and pharmaceutical technicians | perc_women_region | 1.2073206 | 1.1181762 | 1.3557086 | 0.9362958 | 0.4082033 |
321 Medical and pharmaceutical technicians | year_n | 1.1558014 | 1.0202584 | 1.3557086 | 0.9362958 | 0.4082033 |
321 Medical and pharmaceutical technicians | eduquotient | 1.0268559 | 0.9480420 | 1.3557086 | 0.9362958 | 0.4082033 |
351 ICT operations and user support technicians | perc_women_region | 2.3485720 | 1.9425995 | 0.9224562 | 1.3928600 | 0.2801627 |
351 ICT operations and user support technicians | sum_pop | 2.1058989 | 1.8259434 | 0.9224562 | 1.3928600 | 0.2801627 |
351 ICT operations and user support technicians | regioneduyears | 1.3109310 | 1.1486190 | 0.9224562 | 1.3928600 | 0.2801627 |
351 ICT operations and user support technicians | salaryquotient | 1.1923807 | 1.0975135 | 0.9224562 | 1.3928600 | 0.2801627 |
351 ICT operations and user support technicians | eduquotient | 1.1053954 | 0.9764972 | 0.9224562 | 1.3928600 | 0.2801627 |
351 ICT operations and user support technicians | year_n | 1.0054393 | 0.9660905 | 0.9224562 | 1.3928600 | 0.2801627 |
351 ICT operations and user support technicians | salary | 0.9995223 | 0.9843242 | 0.9224562 | 1.3928600 | 0.2801627 |
432 Stores and transport clerks | sum_pop | 5.3245183 | 4.2675746 | 1.9035879 | 0.9786471 | 0.7755035 |
432 Stores and transport clerks | regioneduyears | 2.2418584 | 1.8267779 | 1.9035879 | 0.9786471 | 0.7755035 |
432 Stores and transport clerks | perc_women_region | 1.8666360 | 1.5362588 | 1.9035879 | 0.9786471 | 0.7755035 |
432 Stores and transport clerks | salaryquotient | 1.6103759 | 1.3880391 | 1.9035879 | 0.9786471 | 0.7755035 |
432 Stores and transport clerks | eduquotient | 1.4745504 | 1.2612910 | 1.9035879 | 0.9786471 | 0.7755035 |
432 Stores and transport clerks | year_n | 1.1201795 | 0.9753529 | 1.9035879 | 0.9786471 | 0.7755035 |
432 Stores and transport clerks | salary | 1.1088593 | 0.9673804 | 1.9035879 | 0.9786471 | 0.7755035 |
531 Child care workers and teachers aides | perc_women_region | 2.2875316 | 1.9727938 | 0.8746626 | 1.0323371 | 0.4740983 |
531 Child care workers and teachers aides | year_n | 2.2554913 | 1.9691707 | 0.8746626 | 1.0323371 | 0.4740983 |
531 Child care workers and teachers aides | salary | 1.9074879 | 1.6568279 | 0.8746626 | 1.0323371 | 0.4740983 |
531 Child care workers and teachers aides | sum_pop | 1.7777613 | 1.5135123 | 0.8746626 | 1.0323371 | 0.4740983 |
531 Child care workers and teachers aides | regioneduyears | 1.6700858 | 1.5224323 | 0.8746626 | 1.0323371 | 0.4740983 |
531 Child care workers and teachers aides | eduquotient | 1.6139928 | 1.4242386 | 0.8746626 | 1.0323371 | 0.4740983 |
531 Child care workers and teachers aides | salaryquotient | 1.0492472 | 0.9970015 | 0.8746626 | 1.0323371 | 0.4740983 |
819 Process control technicians | eduquotient | 2.0817719 | 1.7942830 | 1.1691890 | 0.8777738 | 0.4932349 |
819 Process control technicians | year_n | 1.5346389 | 1.2918349 | 1.1691890 | 0.8777738 | 0.4932349 |
819 Process control technicians | salaryquotient | 1.4717173 | 1.3207021 | 1.1691890 | 0.8777738 | 0.4932349 |
819 Process control technicians | perc_women_region | 1.0716159 | 0.9923244 | 1.1691890 | 0.8777738 | 0.4932349 |
819 Process control technicians | regioneduyears | 1.0646529 | 1.0173561 | 1.1691890 | 0.8777738 | 0.4932349 |
819 Process control technicians | salary | 1.0537050 | 0.9954603 | 1.1691890 | 0.8777738 | 0.4932349 |
819 Process control technicians | sum_pop | 0.9959705 | 0.9677651 | 1.1691890 | 0.8777738 | 0.4932349 |
821 Assemblers | regioneduyears | 15.1386847 | 12.3171157 | 1.3260624 | 0.8326956 | 0.8026313 |
821 Assemblers | sum_pop | 9.2884888 | 6.7454161 | 1.3260624 | 0.8326956 | 0.8026313 |
821 Assemblers | perc_women_region | 8.1006984 | 6.2366514 | 1.3260624 | 0.8326956 | 0.8026313 |
821 Assemblers | year_n | 5.1637791 | 4.0079990 | 1.3260624 | 0.8326956 | 0.8026313 |
821 Assemblers | salaryquotient | 1.5702498 | 1.3980286 | 1.3260624 | 0.8326956 | 0.8026313 |
821 Assemblers | salary | 1.3401459 | 1.1613060 | 1.3260624 | 0.8326956 | 0.8026313 |
821 Assemblers | eduquotient | 1.2776958 | 1.0033735 | 1.3260624 | 0.8326956 | 0.8026313 |
The summed weight (in per cent) that the SuperLearner assigned to each model across the different occupational groups.
sp_table %>% ggplot (aes(coef, model)) + geom_col ()
The summed importance of the strongest feature in every occupational group, shown per feature.
summary_table %>% arrange(desc(importance)) %>% group_by(ssyk) %>% slice(1) %>% ggplot (aes(importance, feature)) + geom_col ()
Let’s see what we have found. First, check the occupational groups where a single feature is clearly stronger than all the other features. Linear models are not suitable for every occupational group, so the model will not always have a high R squared value.
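As a rough sketch of how such groups could be picked out programmatically, the snippet below compares the strongest feature with the runner-up in each occupational group. The toy data frame copies a few rows from the table above, and I assume that the first numeric column is the `importance` column of `summary_table`. The cut-off `min_ratio` is a hypothetical value chosen for illustration, not something used in the analysis.

```r
library(dplyr)

# Toy subset of the importance table (values copied from the table above;
# assumes the first numeric column is `importance`).
toy <- data.frame(
  ssyk = rep(c("251 ICT architects, systems analysts and test managers",
               "332 Insurance advisers, sales and purchasing agents",
               "532 Personal care workers in health services"), each = 2),
  feature = c("perc_women_region", "salary",
              "perc_women_region", "sum_pop",
              "regioneduyears", "eduquotient"),
  importance = c(2.7358787, 2.6834131,
                 3.8010119, 1.9232262,
                 6.6360229, 3.4609960),
  stringsAsFactors = FALSE
)

min_ratio <- 1.5  # hypothetical cut-off for "clearly stronger"

dominant <- toy %>%
  group_by(ssyk) %>%
  arrange(desc(importance), .by_group = TRUE) %>%
  summarise(
    top_feature = dplyr::first(feature),
    # ratio between the strongest and the second strongest feature
    ratio = dplyr::first(importance) / dplyr::nth(importance, 2)
  ) %>%
  ungroup() %>%
  filter(ratio > min_ratio)

print(dominant)
```

With this cut-off, the ICT architects group drops out because its two strongest features are almost equally important. The calls to `first()` and `nth()` are prefixed with `dplyr::` because `xts` masks `first()` when it is loaded, as it is in this post.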
A strong signal, the average number of education years in the region, Personal care workers in health services
temp <- filter(tb_unique, `occuptional (SSYK 2012)` == "532 Personal care workers in health services")
model <- lm(perc_women_eng_region ~ regioneduyears, weights = suming, data = temp)
temp %>%
  ggplot() +
  geom_jitter(mapping = aes(x = regioneduyears, y = perc_women_eng_region, colour = suming)) +
  geom_abline(slope = model$coefficients[2], intercept = model$coefficients[1]) +
  labs(
    x = "Education years",
    y = "Per cent of women in the occupation"
  )
summary(model)$adj.r.squared
## [1] 0.7732263
anova(model)
## Analysis of Variance Table
##
## Response: perc_women_eng_region
##                Df  Sum Sq Mean Sq F value    Pr(>F)
## regioneduyears  1 315.573 315.573  133.98 5.039e-14 ***
## Residuals      38  89.506   2.355
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
postResample(pred = predict(model), obs = temp$perc_women_eng_region)
##       RMSE   Rsquared        MAE
## 0.01225219 0.69069055 0.01023249
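The R squared from `summary()` and the `Rsquared` from `postResample()` differ because they measure different things. The first comes from the weighted fit, while the second is, as far as I know, the squared correlation between the unweighted predictions and observations. The sketch below illustrates the difference in base R on simulated data. The variables `x`, `w` and `y` are made-up stand-ins for `regioneduyears`, `suming` and `perc_women_eng_region`, not the real data.

```r
set.seed(1)
n <- 40
x <- runif(n, 11, 13)                      # stand-in for regioneduyears
w <- runif(n, 100, 10000)                  # stand-in for the weights (suming)
y <- 0.5 + 0.03 * x + rnorm(n, sd = 0.01)  # stand-in for the response

fit <- lm(y ~ x, weights = w)
res <- residuals(fit)                      # unweighted residuals, y - fitted

# Weighted R squared, the quantity that summary(fit)$r.squared reports
r2_weighted <- 1 - sum(w * res^2) / sum(w * (y - weighted.mean(y, w))^2)

# Unweighted measures computed from predictions and observations
r2_corr <- cor(fitted(fit), y)^2
rmse    <- sqrt(mean((fitted(fit) - y)^2))
mae     <- mean(abs(fitted(fit) - y))

c(r2_weighted = r2_weighted, r2_summary = summary(fit)$r.squared,
  r2_corr = r2_corr, rmse = rmse, mae = mae)
```

This also suggests why the sums of squares in the ANOVA table are so much larger than the RMSE: they are scaled by the weights, whereas the RMSE stays on the scale of the response.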
A strong signal, the average number of education years in the region, Medical doctors
temp <- filter(tb_unique, `occuptional (SSYK 2012)` == "221 Medical doctors")
model <- lm(perc_women_eng_region ~ regioneduyears, weights = suming, data = temp)
temp %>%
  ggplot() +
  geom_jitter(mapping = aes(x = regioneduyears, y = perc_women_eng_region, colour = suming)) +
  geom_abline(slope = model$coefficients[2], intercept = model$coefficients[1]) +
  labs(
    x = "Education years",
    y = "Per cent of women in the occupation"
  )
summary(model)$adj.r.squared
## [1] 0.8057127
anova(model)
## Analysis of Variance Table
##
## Response: perc_women_eng_region
##                Df  Sum Sq Mean Sq F value    Pr(>F)
## regioneduyears  1 164.765 164.765  154.44 1.385e-14 ***
## Residuals      36  38.407   1.067
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
postResample(pred = predict(model), obs = temp$perc_women_eng_region)
##       RMSE   Rsquared        MAE
## 0.01683530 0.72088034 0.01385548
A strong signal, the per cent of women in the region, Insurance advisers, sales and purchasing agents
temp <- filter(tb_unique, `occuptional (SSYK 2012)` == "332 Insurance advisers, sales and purchasing agents")
model <- lm(perc_women_eng_region ~ perc_women_region, weights = suming, data = temp)
temp %>%
  ggplot() +
  geom_jitter(mapping = aes(x = perc_women_region, y = perc_women_eng_region, colour = suming)) +
  geom_abline(slope = model$coefficients[2], intercept = model$coefficients[1]) +
  labs(
    x = "Per cent of women in the region",
    y = "Per cent of women in the occupation"
  )
summary(model)$adj.r.squared
## [1] 0.6283407
anova(model)
## Analysis of Variance Table
##
## Response: perc_women_eng_region
##                   Df Sum Sq Mean Sq F value    Pr(>F)
## perc_women_region  1 529.66  529.66  56.791 1.395e-08 ***
## Residuals         32 298.45    9.33
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
postResample(pred = predict(model), obs = temp$perc_women_eng_region)
##       RMSE   Rsquared        MAE
## 0.02935038 0.49206133 0.02250770
Two strong signals, population size in the region and the average number of education years in the region, Engineering professionals
temp <- filter(tb_unique, `occuptional (SSYK 2012)` == "214 Engineering professionals")
s3d <- scatterplot3d(
  temp$sum_pop, temp$regioneduyears, temp$perc_women_eng_region,
  type = "h", color = "blue",
  xlab = "Population in region",
  ylab = "Education years",
  zlab = "Per cent of women in the occupation"
)
model <- lm(perc_women_eng_region ~ sum_pop + regioneduyears, weights = suming, data = temp)
s3d$plane3d(model)
summary(model)$adj.r.squared
## [1] 0.8121964
anova(model)
## Analysis of Variance Table
##
## Response: perc_women_eng_region
##                Df  Sum Sq Mean Sq F value    Pr(>F)
## sum_pop         1 255.902 255.902 144.321 5.673e-14 ***
## regioneduyears  1  31.373  31.373  17.693 0.0001712 ***
## Residuals      35  62.060   1.773
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
postResample(pred = predict(model), obs = temp$perc_women_eng_region)
##        RMSE    Rsquared         MAE
## 0.012229213 0.835386966 0.009935413
Two strong signals, population size in the region and the per cent of women in the region, Insurance advisers, sales and purchasing agents
temp <- filter(tb_unique, `occuptional (SSYK 2012)` == "332 Insurance advisers, sales and purchasing agents")
s3d <- scatterplot3d(
  temp$sum_pop, temp$perc_women_region, temp$perc_women_eng_region,
  type = "h", color = "blue",
  xlab = "Population in region",
  ylab = "Per cent of women in the region",
  zlab = "Per cent of women in the occupation"
)
model <- lm(perc_women_eng_region ~ sum_pop + perc_women_region, weights = suming, data = temp)
s3d$plane3d(model)
summary(model)$adj.r.squared
## [1] 0.6525952
anova(model)
## Analysis of Variance Table
##
## Response: perc_women_eng_region
##                   Df Sum Sq Mean Sq F value    Pr(>F)
## sum_pop            1 263.40 263.403  30.214 5.168e-06 ***
## perc_women_region  1 294.45 294.455  33.776 2.099e-06 ***
## Residuals         31 270.25   8.718
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
postResample(pred = predict(model), obs = temp$perc_women_eng_region)
##       RMSE   Rsquared        MAE
## 0.02638844 0.57325855 0.02034915
Two strong signals, year and the per cent of women in the region, Physical and engineering science technicians
temp <- filter(tb_unique, `occuptional (SSYK 2012)` == "311 Physical and engineering science technicians")
s3d <- scatterplot3d(
  temp$year_n, temp$perc_women_region, temp$perc_women_eng_region,
  type = "h", color = "blue",
  xlab = "Year",
  ylab = "Per cent of women in the region",
  zlab = "Per cent of women in the occupation"
)
model <- lm(perc_women_eng_region ~ year_n + perc_women_region, weights = suming, data = temp)
s3d$plane3d(model)
summary(model)$adj.r.squared
## [1] 0.5373011
anova(model)
## Analysis of Variance Table
##
## Response: perc_women_eng_region
##                   Df Sum Sq Mean Sq F value    Pr(>F)
## year_n             1  32.63  32.630  7.6503  0.009621 **
## perc_women_region  1 134.39 134.393 31.5091 4.127e-06 ***
## Residuals         30 127.96   4.265
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
postResample(pred = predict(model), obs = temp$perc_women_eng_region)
##       RMSE   Rsquared        MAE
## 0.01695193 0.59082239 0.01266243
Two strong signals, year and salary, Naprapaths, physiotherapists, occupational therapists
temp <- filter(tb_unique, `occuptional (SSYK 2012)` == "227 Naprapaths, physiotherapists, occupational therapists")
s3d <- scatterplot3d(
  temp$year_n, temp$salary, temp$perc_women_eng_region,
  type = "h", color = "blue",
  xlab = "Year",
  ylab = "Salary",
  zlab = "Per cent of women in the occupation"
)
model <- lm(perc_women_eng_region ~ year_n + salary, weights = suming, data = temp)
s3d$plane3d(model)
summary(model)$adj.r.squared
## [1] 0.5269917
anova(model)
## Analysis of Variance Table
##
## Response: perc_women_eng_region
##           Df Sum Sq Mean Sq F value    Pr(>F)
## year_n     1 5.8240  5.8240  16.077 0.0005492 ***
## salary     1 4.9902  4.9902  13.776 0.0011481 **
## Residuals 23 8.3317  0.3622
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
postResample(pred = predict(model), obs = temp$perc_women_eng_region)
##       RMSE   Rsquared        MAE
## 0.01261698 0.46523146 0.01003402