Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
We will work with a health-related data set, the famous “Pima Indians Diabetes Database”. It was generously donated by Vincent Sigillito of Johns Hopkins University. Please find further information regarding the dataset here.
This is the first part of the series; it covers data display.
Before proceeding, it might be helpful to look over the help pages for table, pie, geom_bar, coord_polar, barplot, stripchart, geom_jitter, density, geom_density, hist, geom_histogram, boxplot, geom_boxplot, qqnorm, qqline, geom_point, and plot.
install.packages('ggplot2')
library(ggplot2)
Please run the code below in order to load the data set and make it into a proper data frame format:
url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
data <- read.table(url, fileEncoding="UTF-8", sep=",")
names <- c('preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class')
colnames(data) <- names
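Before moving on, it can help to check that the columns were read and named as expected. The following is a minimal sketch, not part of the original exercise set: it applies the same read.table and colnames steps to two illustrative rows passed inline through the text argument, so it runs without a network connection. The names sample_text, toy, and col_names are hypothetical and used only for this illustration.

```r
# Two illustrative rows in the same nine-column, comma-separated layout
# (a hypothetical stand-in for the downloaded file).
sample_text <- "6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0"

# read.table() accepts inline text through its `text` argument.
toy <- read.table(text = sample_text, sep = ",")

# Same column names as in the post, stored under a name that
# does not mask base R's names() function.
col_names <- c('preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class')
colnames(toy) <- col_names

str(toy)  # column names and types
dim(toy)  # number of rows and columns
```

Running str(data) and dim(data) on the full data frame in the same way should report nine columns with these names.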
Answers to the exercises are available here.
If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.
Exercise 1
Create a frequency table of the class variable.
Exercise 2
class.fac <- factor(data[['class']], levels = c(0, 1), labels = c("Negative", "Positive"))
Create a pie chart of the class.fac variable.
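The factor() call in this exercise maps the numeric codes 0 and 1 to readable labels, which is what makes plot legends and axis labels meaningful later on. As a minimal sketch of that mapping, here it is applied to a small made-up vector (codes and codes.fac are hypothetical names, and the values are not taken from the data set):

```r
# Hypothetical class codes, for illustration only.
codes <- c(0, 1, 1, 0, 1)

# levels lists the raw values; labels gives the name shown for each level,
# matched by position (0 -> "Negative", 1 -> "Positive").
codes.fac <- factor(codes, levels = c(0, 1), labels = c("Negative", "Positive"))

levels(codes.fac)        # the two labels, in level order
as.character(codes.fac)  # each observation shown by its label
```

Any value not listed in levels would become NA, so it is worth confirming that the raw column contains only 0 and 1 before converting.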
Exercise 3
Create a bar plot for the age variable.
Exercise 4
Create a strip chart for mass against class.fac.
Exercise 5
Create a density plot for the preg variable.
Exercise 6
Create a histogram for the preg variable.
Exercise 7
Create a boxplot for age against class.fac.
Exercise 8
Create a normal QQ plot and a line which passes through the first and third quartiles.
Exercise 9
Create a scatter plot of the age variable against the mass variable.
Exercise 10
Create scatter plots for every variable of the data set against every other variable, all in a single window.
Hint: it is quite simple; don’t overthink it.