A chi-square test is used to analyze nominal (also called categorical) data. Chi is pronounced "kai", and the test is frequently written as the χ² test. It compares the observed frequencies in each response category with the frequencies we would expect if the variables were unrelated. The null hypothesis of a chi-square test is that the nominal variables have no relationship, that is, they are independent. That means:
- H0: There is no relationship between the nominal variables, that is, the variables are independent.
- H1: H0 is not true.
Creating or Importing data
In this step, we either import our data into R or generate an example data set.
Let’s create some nominal data:
    set.seed(150)
    data <- data.frame(sampleA = sample(c("Positive","Positive","Negative"), 300, replace = TRUE),
                       sampleB = sample(c("Positive","Positive","Negative"), 300, replace = TRUE))
Perform the chi-square test using the chisq.test function:
    test <- chisq.test(x = data$sampleA, y = data$sampleB)
Analyse the result:
    > test

        Pearson's Chi-squared test with Yates' continuity correction

    data:  data$sampleA and data$sampleB
    X-squared = 1.7444, df = 1, p-value = 0.1866
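To see what the test is actually comparing, it can help to look at the observed counts next to the counts expected under independence. The following is a minimal sketch that regenerates the example data; the names observed and expected_by_hand are introduced here for illustration only, and the exact counts may differ between R versions because the default sampling algorithm changed in R 3.6.0.

```r
# Regenerate the example data (same call as in the post)
set.seed(150)
data <- data.frame(
  sampleA = sample(c("Positive", "Positive", "Negative"), 300, replace = TRUE),
  sampleB = sample(c("Positive", "Positive", "Negative"), 300, replace = TRUE)
)

# Cross-tabulate the two variables into a 2x2 contingency table
observed <- table(data$sampleA, data$sampleB)
print(observed)

# Passing the table to chisq.test() is equivalent to passing x and y
test <- chisq.test(observed)

# Counts expected if the two variables were independent
print(test$expected)

# The same expected counts by hand: (row total * column total) / grand total
expected_by_hand <- outer(rowSums(observed), colSums(observed)) / sum(observed)
print(all.equal(test$expected, expected_by_hand, check.attributes = FALSE))
```

If the observed and expected tables look similar, the χ² statistic will be small, which is consistent with independence.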
Interpretation of Chi-square test
To interpret the chi-square test we use the p-value. If the p-value is less than or equal to 0.05, we reject the null hypothesis and conclude that the categorical variables are not independent. Here the p-value is 0.1866, which is above the 5% significance level, so the null hypothesis cannot be rejected: the data give no evidence of a relationship between sampleA and sampleB.
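As a rough sketch of this decision rule, the p-value can be recomputed from the reported statistic and degrees of freedom with pchisq. The variable names below (statistic, alpha, reject_null) are illustrative assumptions, not part of the original output.

```r
# Values reported in the test output above
statistic <- 1.7444
df <- 1

# Significance level (5%, as used in this post)
alpha <- 0.05

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)
print(round(p_value, 4))

# Reject H0 only when the p-value is at or below alpha
reject_null <- p_value <= alpha
print(reject_null)
```

With a test object in hand, the same comparison can be made directly as test$p.value <= alpha.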
Chi-Square (χ²) statistic
A large χ² statistic is evidence against the null hypothesis. To determine how large it needs to be, the critical value can be found using the degrees of freedom and the significance level.
In our example, we have 1 degree of freedom, because a 2 x 2 table has (2 - 1) x (2 - 1) = 1. Using a table of probabilities for the χ² distribution, we can see that the critical χ² value at the 5% significance level is 3.841. The null hypothesis can therefore be rejected when χ² >= 3.841. In this case the statistic is 1.7444, which is below 3.841, so the null hypothesis cannot be rejected.
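Instead of a printed table, the critical value can also be looked up in R with qchisq. This is a small sketch using the 5% significance level and the statistic reported earlier; the variable names are chosen here for illustration.

```r
alpha <- 0.05   # significance level
df <- 1         # degrees of freedom for a 2x2 table

# Critical value: the 95th percentile of the chi-square distribution
critical_value <- qchisq(1 - alpha, df = df)
print(round(critical_value, 3))

# Compare the reported statistic with the critical value
statistic <- 1.7444
print(statistic >= critical_value)
```

This comparison and the p-value comparison always lead to the same decision, since both are based on the same χ² distribution.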
The post Chi-Square test using R appeared first on Statistical Aid: A School of Statistics.