- The outcome variable is continuous and the observations are independent
- The outcome variable is normally distributed in each group
- The variances in each group are equal
When these assumptions are satisfied, the results of the t test are valid. When they are clearly violated, the results can be unreliable and a non-parametric test is the safer choice. When the data are not normally distributed, you can also try a transformation, such as the log, to bring them closer to normality.
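As a rough sketch of the non-parametric route mentioned above, base R's wilcox.test() runs the Wilcoxon rank-sum (Mann-Whitney) test, which does not assume normality. Setting exact = FALSE is a choice made here because mpg contains tied values; it is not something the original text prescribes.

```r
# Wilcoxon rank-sum test of mpg by transmission type (am: 0 = automatic, 1 = manual)
# exact = FALSE requests the normal approximation, since mpg has ties
w <- wilcox.test(mpg ~ am, data = mtcars, exact = FALSE)
print(w)
```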
For these exercises, it is important to have a good understanding of data normality and hypothesis testing.
For this set of exercises we will use the Motor Trend Car Road Tests data set, which is already available in R as mtcars. The data consist of fuel consumption and vehicle characteristics related to design and performance. Our interest in this exercise is to test whether there is a significant difference in miles per gallon (mpg) between manual and automatic transmission vehicles.
Answers to the exercises are available here. If you have an alternative answer please post in the comments.
Exercise 1
Inspect the structure of the data
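One possible way to approach this, offered as a sketch rather than the official answer, is to combine str() with a couple of other base R helpers:

```r
# mtcars ships with base R (datasets package), so nothing needs to be loaded
str(mtcars)    # one line per variable: type and first few values
dim(mtcars)    # number of rows and columns
head(mtcars)   # first six rows
```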
Exercise 2
Recode the am variable (0 = automatic, 1 = manual) as a factor with the labels automatic and manual
Check that the labeling was successful
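A minimal sketch of one way to do this with factor(). The copy named mt is an illustrative choice, made so that the built-in data set stays untouched; the exercise itself may intend you to modify mtcars directly.

```r
# Work on a copy ("mt" is a hypothetical name used only in this sketch)
mt <- mtcars

# In mtcars, am = 0 means automatic and am = 1 means manual
mt$am <- factor(mt$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Check the labeling: group counts should match the original 0/1 counts
table(mt$am)
table(mtcars$am)
```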
Exercise 3
Attach mtcars data so that its variables are easily accessible
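A short sketch of attach() in action. Note that attach() places a copy of the data frame on the search path, so changes made to mtcars afterwards are not picked up until you detach and attach again. The with() call at the end is an alternative added here, not part of the exercise.

```r
attach(mtcars)     # put the columns of mtcars on the search path
mean(mpg)          # columns can now be used without the mtcars$ prefix
detach(mtcars)     # remove it again to avoid name clashes later

# Alternative that leaves the search path alone
with(mtcars, mean(mpg))
```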
Exercise 4
Generate descriptive statistics for each group
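One way this could look in base R, assuming the groups are the two transmission types and mpg is the variable of interest. The name mt is again just an illustrative copy of mtcars.

```r
# Labeled copy of the data ("mt" is a hypothetical name)
mt <- mtcars
mt$am <- factor(mt$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Mean, standard deviation and group size of mpg per transmission type
group_stats <- aggregate(mpg ~ am, data = mt,
                         FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x)))
print(group_stats)

# Minimum, quartiles, median, mean and maximum for each group
tapply(mt$mpg, mt$am, summary)
```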
Exercise 5
Generate a box plot for each group
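A minimal sketch using the formula interface of boxplot(). The axis labels and title are illustrative choices.

```r
# Labeled copy of the data ("mt" is a hypothetical name)
mt <- mtcars
mt$am <- factor(mt$am, levels = c(0, 1), labels = c("automatic", "manual"))

# One box per transmission type; boxplot() invisibly returns the plotted statistics
bp <- boxplot(mpg ~ am, data = mt,
              xlab = "Transmission", ylab = "Miles per gallon",
              main = "mpg by transmission type")
```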
Exercise 6
Test for normality in each group
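The exercise does not say which normality test to use. The sketch below assumes the Shapiro-Wilk test, available in base R as shapiro.test(), applied separately to the mpg values of each group.

```r
# Labeled copy of the data ("mt" is a hypothetical name)
mt <- mtcars
mt$am <- factor(mt$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Shapiro-Wilk test per group; the null hypothesis is that the data are normal
normality <- lapply(split(mt$mpg, mt$am), shapiro.test)
print(normality$automatic)
print(normality$manual)

# Collect just the p-values for a quick comparison
p_values <- sapply(normality, function(res) res$p.value)
print(p_values)
```

A large p-value means the test found no evidence against normality; it does not prove the data are normal.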
Exercise 7
Perform a Levene test for equality of variances in the two groups
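Base R has no built-in Levene test, so the sketch below computes it by hand as a one-way ANOVA on absolute deviations from each group's median (the Brown-Forsythe variant). If the car package is installed, car::leveneTest() offers the same test and also centers on the median by default; whether the original answer uses that package is an assumption.

```r
# Labeled copy of the data ("mt" is a hypothetical name)
mt <- mtcars
mt$am <- factor(mt$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Absolute deviation of each mpg value from the median of its own group
group_median <- ave(mt$mpg, mt$am, FUN = median)
mt$abs_dev <- abs(mt$mpg - group_median)

# Levene's test = one-way ANOVA on those absolute deviations
levene <- anova(lm(abs_dev ~ am, data = mt))
print(levene)

# With the car package installed, the equivalent call would be:
# car::leveneTest(mpg ~ am, data = mt)
```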
Exercise 8
Apply a log transformation to stabilize data variance
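A sketch of the transformation, assuming it is applied to mpg and that the natural log is meant. The column name log_mpg is an illustrative choice.

```r
# Labeled copy of the data ("mt" is a hypothetical name)
mt <- mtcars
mt$am <- factor(mt$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Natural log of mpg; mpg is strictly positive, so the log is always defined
mt$log_mpg <- log(mt$mpg)

# Compare the group variances before and after the transformation
tapply(mt$mpg, mt$am, var)
tapply(mt$log_mpg, mt$am, var)
```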
Exercise 9
Perform a t test on the transformed variable
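One way this could be written with t.test(). Setting var.equal = TRUE requests the classic Student's t test, which matches the equal-variance assumption listed at the top; leaving it at the default (FALSE) would run Welch's test instead. Which of the two the original answer uses is not stated here.

```r
# Labeled copy of the data with the log-transformed variable ("mt" and "log_mpg" are hypothetical names)
mt <- mtcars
mt$am <- factor(mt$am, levels = c(0, 1), labels = c("automatic", "manual"))
mt$log_mpg <- log(mt$mpg)

# Two-sample t test on log(mpg), assuming equal variances in the two groups
tt <- t.test(log_mpg ~ am, data = mt, var.equal = TRUE)
print(tt)
```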
Exercise 10
Interpret the results
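To support the interpretation, the pieces of the test result can be pulled out individually. The sketch below assumes a significance level of 0.05 and an equal-variance t test on log(mpg). Because the test was run on the log scale, the difference in means back-transforms to a ratio of geometric means.

```r
# Labeled copy of the data with the log-transformed variable ("mt" and "log_mpg" are hypothetical names)
mt <- mtcars
mt$am <- factor(mt$am, levels = c(0, 1), labels = c("automatic", "manual"))
mt$log_mpg <- log(mt$mpg)
tt <- t.test(log_mpg ~ am, data = mt, var.equal = TRUE)

alpha <- 0.05                      # assumed significance level
reject_null <- tt$p.value < alpha  # TRUE: the group means of log(mpg) differ significantly
print(reject_null)

# Difference of log means (automatic minus manual) back-transformed to a ratio
ratio <- exp(unname(tt$estimate[1] - tt$estimate[2]))
ratio_ci <- exp(tt$conf.int)       # 95% confidence interval for that ratio
print(ratio)
print(ratio_ci)
```

A ratio below 1 indicates that automatic cars in this sample achieve fewer miles per gallon than manual cars, on the geometric-mean scale.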