[This article was first published on bRogramming, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In class today we discussed several types of survey sampling, then split into groups for a small investigation. We were given a page of 100 rectangles of varying areas and took three samples of size 10. The first was a convenience sample: we simply picked 10 rectangles adjacent to one another and recorded their areas. Next, we took a simple random sample (SRS), numbering the rectangles 1 through 100 and choosing 10 with a random number generator. Last, we took a stratified random sample by marking 50 rectangles as “Large” and 50 as “Small”, then randomly selecting 5 from each stratum. Our estimates of the total area of all 100 rectangles and their 95% confidence intervals are shown in the plot above, along with the true value. The experiment turned out exactly as it was supposed to. Our convenience sample had the largest variability and our stratified sample the smallest. All three confidence intervals captured the true value, which is what you would expect from each interval about 95% of the time. I would share my R code along with the figure, but it’s really sloppy and not nearly as nice as the succinct confirmation of statistical principles offered by the image alone.
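For anyone who wants to try the comparison themselves, here is a minimal sketch of the SRS and stratified estimates of a total with 95% confidence intervals. It is not the code behind the figure. The rectangle areas are simulated stand-ins for the class handout, the "Small"/"Large" split is assumed to be a cut at the median area, and the helper names `srs_total` and `strat_total` are made up for illustration. The convenience sample is left out because its behaviour depends on how the rectangles are laid out on the page.

```r
# Hypothetical population: 100 simulated rectangle areas (NOT the class handout).
set.seed(1)
N <- 100
areas <- sample(1:18, N, replace = TRUE)
true_total <- sum(areas)

# Estimate a population total from a simple random sample,
# with a t-based 95% confidence interval.
srs_total <- function(y, N) {
  n <- length(y)
  est <- N * mean(y)
  # Standard error of the total, including the finite population correction.
  se <- N * sqrt((1 - n / N) * var(y) / n)
  half <- qt(0.975, df = n - 1) * se
  c(estimate = est, lower = est - half, upper = est + half)
}

# Simple random sample of 10 rectangles.
srs <- srs_total(sample(areas, 10), N)

# Assumed stratification: the 50 smallest areas vs. the 50 largest.
ord <- order(areas)
small <- areas[ord[1:50]]
large <- areas[ord[51:100]]

# Estimate a population total from a stratified random sample.
# samples: list of sampled values, one element per stratum
# sizes:   population size of each stratum
strat_total <- function(samples, sizes) {
  est <- sum(mapply(function(y, Nh) Nh * mean(y), samples, sizes))
  v <- sum(mapply(
    function(y, Nh) Nh^2 * (1 - length(y) / Nh) * var(y) / length(y),
    samples, sizes
  ))
  # Degrees of freedom: a simple approximation (sample size minus strata).
  df <- sum(lengths(samples)) - length(samples)
  half <- qt(0.975, df = df) * sqrt(v)
  c(estimate = est, lower = est - half, upper = est + half)
}

# Stratified sample: 5 rectangles from each stratum.
strat <- strat_total(list(sample(small, 5), sample(large, 5)), c(50, 50))

# Compare both estimates against the true total.
rbind(srs = srs, stratified = strat, truth = c(true_total, NA, NA))
```

Because the strata separate small rectangles from large ones, the within-stratum variances tend to be much smaller than the overall variance, which is why the stratified interval usually comes out narrower. Changing the seed gives a different draw, and roughly 1 interval in 20 should miss the true total.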
To leave a comment for the author, please follow the link and comment on their blog: bRogramming.