Power Analysis and the Probability of Errors
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Power analysis is a useful tool for estimating the statistical power of a study. It allows a researcher to determine the sample size needed to obtain the required statistical power. Clients often ask (and rightfully so) what the sample size should be for a proposed project. Sample size ends up being a delicate balance between the amount of acceptable error, the detectable effect size, power, and financial cost. A lot of factors go into this decision. This example discusses one approach.
In more specific terms, power is the probability that a statistical test will reject the null hypothesis when the null hypothesis is in fact false. This means that as power increases, the probability of making a Type II error decreases. The probability of a Type II error is denoted by $\beta$, and power is calculated as $1 - \beta$.
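To make the definition concrete, power can be approximated by simulation: generate many studies in which the null hypothesis is false and count how often the test rejects it. The sketch below assumes a two-sample t-test; the sample size, effect size, standard deviation, and number of simulations are all hypothetical values chosen for illustration.

```r
# Monte Carlo estimate of power for a two-sample t-test.
# All numeric settings below are hypothetical, chosen only for illustration.
set.seed(42)

n_per_group <- 20    # assumed observations per group
true_diff   <- 1     # assumed true difference in means (so the null is false)
sigma       <- 1     # assumed common standard deviation
alpha       <- 0.05  # Type I error rate
n_sims      <- 2000  # number of simulated studies

# One simulated study: returns TRUE if the test rejects the null hypothesis
rejects_null <- function() {
  x <- rnorm(n_per_group, mean = 0, sd = sigma)
  y <- rnorm(n_per_group, mean = true_diff, sd = sigma)
  t.test(x, y, var.equal = TRUE)$p.value < alpha
}

rejections <- replicate(n_sims, rejects_null())

power_hat <- mean(rejections)  # share of simulated studies that reject a false null
beta_hat  <- 1 - power_hat     # estimated probability of a Type II error

cat("Estimated power:", power_hat, "\n")
cat("Estimated beta: ", beta_hat, "\n")
```

Under these assumptions the estimate should land close to the analytic value reported by `power.t.test(n = 20, delta = 1)`.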
To calculate the probability of a Type II error, a researcher needs to know a few pieces of information: $\mu$, $\sigma^2$, $n$, and $\alpha$ (the probability of a Type I error). Normally, if a researcher already knows the population mean ($\mu$) and variance ($\sigma^2$), there is no need to take a sample to estimate them. However, we can set up the calculation over a range of possible unknown population means and variances to see what the probability of a Type II error is for those values.
The following code shows a basic calculation and the density plot of a Type II error.
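A minimal sketch of such a calculation, assuming a one-sided, one-sample z-test with a known standard deviation. The values of `mu0`, `mu_a`, `sigma`, `n`, and `alpha` are hypothetical and not taken from the post. The helper `plot_type2()` is also an illustrative name; it draws the sampling densities under the null and the alternative and shades the Type II error region.

```r
# Type II error for a one-sided, one-sample z-test (H0: mu = mu0 vs Ha: mu > mu0).
# Every numeric value here is a hypothetical assumption for illustration.
mu0   <- 100   # mean under the null hypothesis
mu_a  <- 105   # assumed true mean under the alternative
sigma <- 15    # assumed known population standard deviation
n     <- 36    # sample size
alpha <- 0.05  # probability of a Type I error

se       <- sigma / sqrt(n)               # standard error of the sample mean
critical <- mu0 + qnorm(1 - alpha) * se   # reject H0 when the sample mean exceeds this

# beta: probability the sample mean falls below the critical value
# even though the true mean is mu_a
beta  <- pnorm(critical, mean = mu_a, sd = se)
power <- 1 - beta

cat("beta =", round(beta, 4), " power =", round(power, 4), "\n")

# Density plot: dashed curve is the null, solid curve is the alternative,
# and the shaded area under the alternative is the Type II error probability.
plot_type2 <- function() {
  x <- seq(mu0 - 4 * se, mu_a + 4 * se, length.out = 400)
  plot(x, dnorm(x, mu0, se), type = "l", lty = 2,
       xlab = "Sample mean", ylab = "Density")
  lines(x, dnorm(x, mu_a, se))
  shade <- x[x <= critical]
  polygon(c(shade, rev(shade)),
          c(dnorm(shade, mu_a, se), rep(0, length(shade))),
          col = "grey80", border = NA)
  abline(v = critical, lty = 3)
}

if (interactive()) plot_type2()
```

Changing `mu_a` or `sigma` and rerunning shows how the Type II error probability moves across a range of plausible population values.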
This graph shows what the power will be at a variety of sample sizes. In this example, to obtain a power of 0.90 ($\beta = 0.10$), a sample of size 23 per group is needed, for a total of 46 observations. It's then up to the researcher to determine the appropriate sample size based on the needed power, desired effect size, $\alpha$ level, and cost.
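A power curve of this kind can be sketched with `power.t.test()` from base R's `stats` package. The effect size and standard deviation used in the post are not shown, so the values below (a difference of 1 with a standard deviation of 1, two-sided test at 0.05) are assumptions; they happen to give a required sample size that rounds up to 23 per group.

```r
# Power of a two-sample t-test across a range of per-group sample sizes.
# effect, sigma, and alpha are assumed values, not confirmed by the post.
effect <- 1     # assumed true difference in means
sigma  <- 1     # assumed common standard deviation
alpha  <- 0.05  # two-sided significance level
target <- 0.90  # desired power

n_values <- 2:50

# Power at each candidate per-group sample size
power_by_n <- sapply(n_values, function(n) {
  power.t.test(n = n, delta = effect, sd = sigma, sig.level = alpha)$power
})

# Solve directly for the per-group n that reaches the target power
needed      <- power.t.test(delta = effect, sd = sigma,
                            sig.level = alpha, power = target)
n_per_group <- ceiling(needed$n)  # round up to a whole observation

cat("Required n per group:", n_per_group, "\n")
cat("Total observations:  ", 2 * n_per_group, "\n")

if (interactive()) {
  plot(n_values, power_by_n, type = "l",
       xlab = "Sample size per group", ylab = "Power")
  abline(h = target, lty = 2)       # target power
  abline(v = n_per_group, lty = 3)  # smallest n that reaches it
}
```

Leaving out exactly one of `n`, `delta`, `sd`, `power`, or `sig.level` tells `power.t.test()` which quantity to solve for, so the same call can also answer "what effect size is detectable with the sample I can afford?".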
There is no fixed standard for power. However, 0.8 and 0.9 are often used, which means the probability of a Type II error is 0.2 and 0.1, respectively. Ultimately, the choice depends on which error the researcher is more willing to accept: a Type I error or a Type II error. For example, it's probably better to erroneously have a healthy patient return for a follow-up test (a false positive) than it is to tell a sick patient they're healthy (a false negative).
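The trade-off between the two error types can be seen by holding the design fixed and varying $\alpha$. The sketch below again uses `power.t.test()`; the sample size, effect size, and standard deviation are assumed values for illustration only.

```r
# For a fixed design, a stricter alpha (fewer Type I errors)
# comes at the cost of a larger beta (more Type II errors).
# n_per_group, effect, and sigma are hypothetical values.
n_per_group <- 23
effect      <- 1
sigma       <- 1

alphas <- c(0.01, 0.05, 0.10)

# Type II error probability at each significance level
betas <- sapply(alphas, function(a) {
  1 - power.t.test(n = n_per_group, delta = effect, sd = sigma,
                   sig.level = a)$power
})

tradeoff <- data.frame(alpha = alphas, beta = round(betas, 3))
print(tradeoff)
```

Reading down the table, beta shrinks as alpha grows, so the researcher has to decide which kind of mistake is more costly in the application at hand.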