
Investigating the relationship between two variables using a scatter plot

[This article was first published on Software for Exploratory Data Analysis and Statistical Modelling, and kindly contributed to R-bloggers].

A scatter plot gives a visual representation of the relationship between two variables. It provides some insight into the correlation between the variables and suggests possible models to describe the relationship. Scatter plots can be produced in R in several ways: with the base graphics system, the lattice graphics library, ggplot2 or other packages.

R ships with various data sets that are available for analysis. One example is the Puromycin data, which describes an experiment to study the relationship between reaction velocity and substrate concentration in an enzymatic reaction involving cells that were either untreated or treated with Puromycin.
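Before plotting, it can help to look at how the data frame is laid out. The following is a minimal sketch that assumes only a standard R installation, where Puromycin is provided by the datasets package:

```r
# Puromycin is part of the datasets package that is attached by default,
# so nothing needs to be downloaded or installed.
data(Puromycin)

# Show the structure: conc (substrate concentration), rate (reaction
# velocity) and state (a factor marking treated or untreated cells).
str(Puromycin)

# Count the observations in each treatment group.
table(Puromycin$state)
```

The state column is the one used later to separate treated from untreated cells.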

The variable rate can be plotted against the variable conc to investigate the relationship. After loading the lattice package, the xyplot function creates the graph with the following code:

# Load lattice, which provides xyplot
library(lattice)

# Plot reaction velocity (rate) against substrate concentration (conc)
xyplot(rate ~ conc, data = Puromycin,
  xlab = "Substrate concentration (ppm)",
  ylab = "Reaction velocity (counts/min/min)",
  main = "Reaction velocity of an enzymatic reaction")

This code produces the graph below, which does not distinguish between the untreated and treated cells:

Figure: Reaction rate plotted against concentration for the Puromycin data


We can use different plotting symbols to distinguish between the treated and untreated cells by supplying the groups argument. We adjust the above code as follows:

xyplot(rate ~ conc, data = Puromycin,
  xlab = "Substrate concentration (ppm)",
  ylab = "Reaction velocity (counts/min/min)",
  main = "Reaction velocity of an enzymatic reaction",
  groups = state)

The resulting graph is:

Figure: Reaction rate plotted against concentration, by treatment
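A grouped plot is easier to read when it carries a legend that maps each symbol to its group. The original post does not add one. The sketch below is one possible extension that uses the auto.key argument of xyplot, and the object name grouped_plot is just an illustrative choice:

```r
library(lattice)

# Same grouped scatter plot, with a legend drawn to the right of the panel.
# auto.key asks lattice to build the legend from the levels of 'state'.
grouped_plot <- xyplot(rate ~ conc, data = Puromycin,
  groups = state,
  auto.key = list(space = "right"),
  xlab = "Substrate concentration (ppm)",
  ylab = "Reaction velocity (counts/min/min)",
  main = "Reaction velocity of an enzymatic reaction")

# Lattice functions return an object; printing it draws the graph.
# At the interactive prompt this happens automatically, but inside a
# script or function an explicit print() is needed.
print(grouped_plot)
```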


An alternative is to use the panelling facilities in lattice graphics to plot the data for the treated and untreated cells in separate panels.
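The original post does not show the code for this alternative. As a minimal sketch, the conditioning operator | in a lattice formula splits the data by a factor and draws one panel per level. The layout setting and the object name panel_plot here are illustrative choices:

```r
library(lattice)

# 'rate ~ conc | state' conditions on state, giving one panel for the
# treated cells and one for the untreated cells, with shared axes.
panel_plot <- xyplot(rate ~ conc | state, data = Puromycin,
  layout = c(2, 1),  # two columns, one row: panels side by side
  xlab = "Substrate concentration (ppm)",
  ylab = "Reaction velocity (counts/min/min)",
  main = "Reaction velocity of an enzymatic reaction")

# Draw the graph (needed when run from a script).
print(panel_plot)
```

Because the panels share the same axis scales by default, the two curves can still be compared directly, while overlapping points from the two groups no longer obscure each other.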
