Visualizing Sampling Distributions in ggplot2: Adding area under the curve
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Thank you ggplot2tutor for solving one of my struggles. Apparently this is all it takes:
ggplot(NULL, aes(x = c(-3, 3))) + stat_function(fun = dnorm, geom = "line")
I can’t begin to count how often I have wanted to visualize a (normal) distribution in a plot. For instance, to show how my sample differs from expectations, or to highlight the skewness of the scores on a particular variable. I wish I’d known earlier that I could just add one simple stat_function layer to my ggplot!
Want a different mean and standard deviation? Just pass a list to the args argument (this example also switches geom to "area" to fill the region under the curve):
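To make the "sample versus expectations" use case concrete, here is a minimal sketch that overlays a normal curve on a histogram. The data frame `scores` and its column `value` are hypothetical and simulated only for illustration, and `after_stat()` assumes ggplot2 3.3.0 or newer.

```r
library(ggplot2)

# Hypothetical sample, simulated purely for illustration
set.seed(42)
scores <- data.frame(value = rnorm(200, mean = 10, sd = 3))

p_hist <- ggplot(scores, aes(x = value)) +
  # Put the histogram on the density scale so it is comparable to dnorm
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 fill = "grey80", colour = "white") +
  # Normal curve using the sample's own mean and standard deviation
  stat_function(
    fun = dnorm,
    geom = "line",
    args = list(mean = mean(scores$value), sd = sd(scores$value))
  )

p_hist
```

If the bars lean away from the curve on one side, that is a quick visual hint of skewness.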
ggplot(NULL, aes(x = c(0, 20))) +
  stat_function(fun = dnorm, geom = "area", args = list(mean = 10, sd = 3))
Need a different distribution? Just pass a different distribution function to stat_function. For instance, an F-distribution, with the df function:
ggplot(NULL, aes(x = c(0, 5))) +
  stat_function(fun = df, geom = "area", args = list(df1 = 2, df2 = 10))
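The same pattern should work for any density function that takes x as its first argument. As a sketch, this compares a standard normal with a t-distribution via base R's `dt`; the choice of 3 degrees of freedom and the object name `p_t` are arbitrary and only for illustration.

```r
library(ggplot2)

p_t <- ggplot(NULL, aes(x = c(-4, 4))) +
  # Standard normal as a solid reference line
  stat_function(fun = dnorm, geom = "line") +
  # t-distribution with 3 degrees of freedom as a dashed line
  stat_function(fun = dt, geom = "line", linetype = 2, args = list(df = 3))

p_t
```

The dashed curve sits above the solid one in the tails, which is the heavier-tail behaviour you would expect from a t-distribution with few degrees of freedom.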
You can make it as complex as you want. The original ggplot2tutor blog provides this example:
ggplot(NULL, aes(x = c(-3, 5))) +
  # Standard normal density, lightly shaded
  stat_function(fun = dnorm, geom = "area", fill = "steelblue", alpha = .3) +
  # Darker shading for the upper 5% tail, from the critical value onwards
  stat_function(fun = dnorm, geom = "area", fill = "steelblue", xlim = c(qnorm(.95), 4)) +
  # Dashed outline of a second normal curve centred at 2
  stat_function(fun = dnorm, geom = "line", linetype = 2, fill = "steelblue", alpha = .5, args = list(mean = 2)) +
  labs(title = "Type I Error", x = "z-score", y = "Density") +
  scale_x_continuous(limits = c(-3, 5))
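The dark region in that plot starts at qnorm(.95), so its area under the standard normal curve can be checked with pnorm. The sketch below does that and, on the assumption that the dashed curve with mean = 2 represents an alternative distribution (the example does not say so explicitly), also shades the area of that curve beyond the same cutoff. The names `crit`, `alpha_area`, `power_area`, and `p_power` are mine, not from the original blog.

```r
library(ggplot2)

crit <- qnorm(.95)  # one-sided critical z-score at the 5% level

# Area beyond the cutoff under the standard normal (the dark region)
alpha_area <- pnorm(crit, lower.tail = FALSE)

# Area beyond the same cutoff under the curve centred at 2
power_area <- pnorm(crit, mean = 2, lower.tail = FALSE)

p_power <- ggplot(NULL, aes(x = c(-3, 5))) +
  stat_function(fun = dnorm, geom = "line") +
  stat_function(fun = dnorm, geom = "line", linetype = 2,
                args = list(mean = 2)) +
  # Shade only the part of the shifted curve to the right of the cutoff
  stat_function(fun = dnorm, geom = "area", fill = "steelblue", alpha = .3,
                args = list(mean = 2), xlim = c(crit, 5)) +
  labs(title = sprintf("Area beyond cutoff: %.2f", power_area),
       x = "z-score", y = "Density")

p_power
```

Here `alpha_area` comes out at 0.05 by construction, since the cutoff was defined as the 95th percentile.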
Have a look at the original blog here: https://ggplot2tutor.com/sampling_distribution/sampling_distribution/