Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
This is the first in an ongoing series of posts where I’ll take data on various topics and create simple visualizations of it using the ggplot2 package in R. While my day job involves analyzing data, I rarely work on projects where I’m expected to produce “publication-worthy” graphics. These posts are therefore a way for me to keep practicing static graphics in ggplot2 and to develop a better understanding of data visualization.
This first post looks at data on employee job satisfaction at top technology companies. For each company, we’re given a value ranging from zero to one for each of four variables: job satisfaction, work stress, job meaning, and job flexibility. Using the ggplot2 package in R, I constructed a variety of bar plots.
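The plotting calls in this post read from a long-format data frame with `Employer`, `variable`, and `value` columns. The post does not show how that table was built, so the following is only a minimal sketch of one way to get there from a wide table using base R. The company names, the short column names, and every number are illustrative placeholders, not the data behind the plots.

```r
# Hypothetical wide table: one row per employer, one column per measure.
# All names and values here are placeholders for illustration only.
wide <- data.frame(
  Employer     = c("Company A", "Company B"),
  satisfaction = c(0.80, 0.70),
  stress       = c(0.60, 0.50),
  meaning      = c(0.40, 0.55),
  flexibility  = c(0.90, 0.65),
  stringsAsFactors = FALSE
)

# Every column except Employer is treated as a measure.
measures <- setdiff(names(wide), "Employer")

# Stack the measure columns into one row per employer and measure.
ndat <- data.frame(
  Employer = rep(wide$Employer, times = length(measures)),
  variable = factor(rep(measures, each = nrow(wide)), levels = measures),
  value    = unlist(wide[measures], use.names = FALSE)
)

print(ndat)
```

Packages such as reshape2 or tidyr offer one-line equivalents; base R is used here only to keep the sketch free of extra dependencies.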
    library(ggplot2)
    library(gridExtra)

    # ndat is not defined in this post; the calls below imply a long-format
    # data frame with columns Employer, variable, and value.

    # Plot 1: dodged bars, one bar per variable within each employer
    ggplot(data=ndat, aes(x=Employer, y=value, fill=factor(variable))) +
      geom_bar(position="dodge", stat="identity") +
      coord_flip() +
      ylim(c(0,1.25)) +
      ylab("Job Satisfaction") +
      ggtitle("Plot 1: Employee Job Satisfaction at Top Tech Companies")

    # Plot 2: stacked bars (geom_bar stacks by default)
    ggplot(data=ndat, aes(x=Employer, y=value, fill=factor(variable))) +
      geom_bar(stat="identity") +
      coord_flip() +
      ylab("Job Satisfaction") +
      ggtitle("Plot 2: Employee Job Satisfaction at Top Tech Companies")

    # Plot 3: one panel per variable, employers on the axis
    ggplot(data=ndat, aes(x=Employer, y=value, fill=factor(Employer))) +
      geom_bar(stat="identity") +
      coord_flip() +
      ylim(c(0,1.5)) +
      facet_wrap(~ variable, ncol=2) +
      theme(legend.position="none") +
      ggtitle("Plot 3: Employee Job Satisfaction at Top Tech Companies") +
      ylab("Job Satisfaction")

    # Plot 4: one panel per employer, variables on the axis
    ggplot(data=ndat, aes(x=variable, y=value, fill=factor(variable))) +
      geom_bar(stat="identity") +
      coord_flip() +
      ylim(c(0,1.5)) +
      facet_wrap(~ Employer, ncol=3) +
      theme(legend.position="none") +
      ggtitle("Plot 4: Employee Job Satisfaction at Top Tech Companies") +
      ylab("Job Satisfaction")

    # Plot 5: two single-employer plots stacked vertically.
    # nndat1 and nndat2 are not defined in this post; judging by the titles,
    # they are presumably the LinkedIn and HP rows of the data.
    p1 <- ggplot(data=nndat1, aes(x=variable, y=value, fill=factor(variable))) +
      geom_bar(stat="identity", colour="black") +
      ylab("Job Satisfaction") +
      ylim(c(0,1.5)) +
      ggtitle("Employee Job Satisfaction at LinkedIn") +
      theme(legend.position="none") +
      xlab("")

    p2 <- ggplot(data=nndat2, aes(x=variable, y=value, fill=factor(variable))) +
      geom_bar(stat="identity", colour="black") +
      ylab("Job Satisfaction") +
      ylim(c(0,1.5)) +
      ggtitle("Employee Job Satisfaction at HP") +
      theme(legend.position="none") +
      xlab("")

    grid.arrange(p1, p2, ncol=1)

Comments:

1. I'm not completely sure whether horizontal bar plots are better than Cleveland dot plots for visualizing this data.
2. The default colors in ggplot2 are ugly.
3. One issue with bar plots is bars that are too wide, and that is the case with plot 5 (the stacked LinkedIn and HP panels).
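The three comments above each point at a small change to the plotting code. The following is a rough sketch of what those changes could look like, not a claim about the best design: a Cleveland-style dot plot built with `geom_point()`, a ColorBrewer palette in place of the default colors, and narrower bars via the `width` argument of `geom_bar()`. The data frame is a stand-in with placeholder company names and made-up values, and the palette and width are arbitrary choices.

```r
library(ggplot2)

# Placeholder long-format data; names and values are illustrative only
# and are not the data used in the post.
ndat <- data.frame(
  Employer = rep(c("Company A", "Company B", "Company C"), times = 4),
  variable = rep(c("satisfaction", "stress", "meaning", "flexibility"), each = 3),
  value    = c(0.80, 0.70, 0.75,
               0.60, 0.50, 0.55,
               0.40, 0.55, 0.45,
               0.90, 0.65, 0.70)
)

# Comments 1 and 2: a Cleveland-style dot plot, one point per employer and
# variable, colored with a ColorBrewer palette instead of the default hues.
dot_plot <- ggplot(ndat, aes(x = value, y = Employer, colour = variable)) +
  geom_point(size = 3) +
  scale_colour_brewer(palette = "Dark2") +
  xlim(0, 1) +
  xlab("Score (0 to 1)") +
  ggtitle("Dot plot of the four variables by employer")

# Comment 3: a width below geom_bar's default of 0.9 narrows the bars.
narrow_bars <- ggplot(subset(ndat, Employer == "Company A"),
                      aes(x = variable, y = value, fill = variable)) +
  geom_bar(stat = "identity", width = 0.5, colour = "black") +
  scale_fill_brewer(palette = "Dark2") +
  ylim(0, 1) +
  xlab("") +
  theme(legend.position = "none")

print(dot_plot)
print(narrow_bars)
```

Whether the dot plot reads better than the bars is still a judgment call; it mainly trades ink for easier comparison of the four variables along a common scale.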