Estimating data parameters using R
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Say we have some data and we are fairly confident that it comes from a random variable which follows a Normal distribution. We would now like to estimate the parameters of that distribution. Since the best unbiased estimator of the population mean is the sample mean, and the best unbiased estimator of the variance is the corrected (n-1) variance estimator, we could use those two estimators to compute point estimates of the parameters we need. But what if we would like a rough idea of the range those parameters could lie in, at a certain level of confidence? Well, then we would have to find an interval that contains the parameter at, say, a 95% confidence level (that is, a 5% significance level).
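As a minimal sketch of the point estimates mentioned above, assuming a simulated sample in place of the real data (the variable names are chosen here for illustration only):

```r
# Simulated sample standing in for the real data (an assumption for illustration)
set.seed(42)
x <- rnorm(30, mean = 10, sd = 2)

n      <- length(x)
mu_hat <- mean(x)   # sample mean: point estimate of the population mean
s2_hat <- var(x)    # var() divides by n - 1: the corrected variance estimator

# The corrected estimator differs from the uncorrected one by a factor n / (n - 1)
s2_uncorrected <- sum((x - mu_hat)^2) / n
```

Note that base R's var() and sd() already apply the n-1 correction, so no manual adjustment is needed.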
In order to do this, since the variance is unknown and needs to be estimated, we use the Student's t distribution and the following formula for the two-sided interval for the mean:
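Assuming the usual notation (these symbols are not defined in the text above), with $\bar{x}$ the sample mean, $s$ the corrected sample standard deviation, $n$ the sample size and $t_{p,\,n-1}$ the $p$-quantile of the Student's t distribution with $n-1$ degrees of freedom, the standard two-sided interval at confidence level $1-\alpha$ can be written as:

$$
\bar{x} - t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
$$

For a 95% interval, $\alpha = 0.05$ and the quantile used is $t_{0.975,\,n-1}$.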
In the same way, we could find one-sided bounds for the value of the population mean, simply by adjusting the formula above:
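With the same assumed notation ($\bar{x}$ the sample mean, $s$ the corrected sample standard deviation, $n$ the sample size, $t_{p,\,n-1}$ the $p$-quantile of the Student's t distribution with $n-1$ degrees of freedom), the standard one-sided bounds at confidence level $1-\alpha$ put the whole of $\alpha$ in a single tail:

$$
\mu \;\le\; \bar{x} + t_{1-\alpha,\,n-1}\,\frac{s}{\sqrt{n}} \quad \text{(upper bound)}
\qquad\qquad
\mu \;\ge\; \bar{x} - t_{1-\alpha,\,n-1}\,\frac{s}{\sqrt{n}} \quad \text{(lower bound)}
$$

Since $t_{1-\alpha,\,n-1} < t_{1-\alpha/2,\,n-1}$, each one-sided bound lies closer to $\bar{x}$ than the corresponding end of the two-sided interval.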
Furthermore, one could visualise the outcome by plotting the selected regions on a standard Student's t distribution with n-1 degrees of freedom:
Plots: 1) Confidence interval; 2) Upper bound; 3) Lower bound
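A rough sketch of how such regions could be drawn with base R graphics is given below. The helper shade_t_region, the sample size and the significance level are all assumptions made for illustration, and this is not the plotting function used for the plots above. The regions are expressed in terms of the pivot T = (sample mean - mu) / (s / sqrt(n)), under which the upper bound on mu corresponds to T >= -q and the lower bound to T <= q.

```r
# Hypothetical helper: shade the region [from, to] under a Student's t density.
shade_t_region <- function(df, from, to, xlim = c(-4, 4)) {
  # Clip infinite limits to the plotting window
  from <- max(from, xlim[1])
  to   <- min(to, xlim[2])
  xs   <- seq(from, to, length.out = 200)
  curve(dt(x, df = df), from = xlim[1], to = xlim[2],
        xlab = "t", ylab = "density")
  polygon(c(from, xs, to), c(0, dt(xs, df = df), 0),
          col = "grey80", border = NA)
  # Probability mass of the (clipped) shaded region
  invisible(pt(to, df) - pt(from, df))
}

n     <- 25      # assumed sample size
alpha <- 0.05    # assumed significance level
q2    <- qt(1 - alpha / 2, df = n - 1)   # two-sided quantile
q1    <- qt(1 - alpha, df = n - 1)       # one-sided quantile

op <- par(mfrow = c(1, 3))                   # three panels side by side
p_two   <- shade_t_region(n - 1, -q2, q2)    # two-sided confidence interval
p_upper <- shade_t_region(n - 1, -q1, Inf)   # region giving the upper bound on mu
p_lower <- shade_t_region(n - 1, -Inf, q1)   # region giving the lower bound on mu
par(op)
```

The helper returns the shaded probability invisibly, which gives a quick check that each region covers roughly 1 - alpha of the distribution (slightly less for the one-sided regions, because they are clipped to the plotting window).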
Below is the implementation in R of the process described above:
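A minimal sketch of such an implementation, assuming simulated data and a hypothetical helper named t_interval (neither comes from the original listing), might look as follows:

```r
# Hypothetical helper: t-based bounds for the mean of a Normal sample.
# `type` selects the two-sided interval or one of the one-sided bounds.
t_interval <- function(x, conf_level = 0.95,
                       type = c("two-sided", "upper", "lower")) {
  type  <- match.arg(type)
  n     <- length(x)
  alpha <- 1 - conf_level
  m     <- mean(x)
  se    <- sd(x) / sqrt(n)   # sd() uses the corrected (n - 1) estimator
  switch(type,
    "two-sided" = m + c(-1, 1) * qt(1 - alpha / 2, df = n - 1) * se,
    "upper"     = c(-Inf, m + qt(1 - alpha, df = n - 1) * se),
    "lower"     = c(m - qt(1 - alpha, df = n - 1) * se, Inf)
  )
}

set.seed(1)
x <- rnorm(25, mean = 5, sd = 3)   # simulated data, for illustration only

ci    <- t_interval(x)                   # 95% two-sided interval
upper <- t_interval(x, type = "upper")   # 95% upper bound
lower <- t_interval(x, type = "lower")   # 95% lower bound

# Cross-check against base R's t.test(), which computes the same interval
ci_ref <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)
```

In practice t.test() alone is enough to obtain the interval; writing the computation out by hand mainly makes the role of the t quantile and the standard error explicit.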
By clicking here you can download the code of the plotting function I used to generate the plots above.