[This article was first published on W. Andrew Barr's Paleoecology Blog, and kindly contributed to R-bloggers].
I consider ggplot2 to be nothing short of a revolution in R graphics. I simply haven’t found anything like this package for quickly and elegantly producing usable graphics. I covered the quick and dirty basics of ggplot2 in a previous post. Now I am going to go into how you can tweak and customize ggplot graphics to look the way you want.
#load ggplot2
library(ggplot2)
#create some sample data
variable1<-rnorm(200,sd=3) #200 random points from normal dist
variable2<-variable1 * .4 + rnorm(200,sd=.5) #correlated variable
category<-sample(c("A","B","C"),200,replace=TRUE) #categorical variable
myDF<-data.frame(variable1,variable2,category) #put in data frame
rm(list=c("variable1","variable2","category")) #delete unneeded variables
So now we have a single object (a data frame) that contains two correlated continuous variables and one categorical variable. Using the qplot() function from ggplot2, we can make a quick scatterplot of the continuous variables that is color-coded by the categorical variable.
qplot(x=variable1,y=variable2,data=myDF,color=category,geom="point")
That creates this nice looking default plot.
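The qplot() call above is shorthand, and recent ggplot2 releases mark qplot() as deprecated. As a minimal sketch, the same scatterplot can be spelled out with ggplot() and geom_point(). The set.seed() call and the name fullPlot are my own additions for illustration, not part of the original post.

```r
library(ggplot2)

# Recreate the sample data so this block runs on its own
set.seed(1)  # assumption: fixed seed, added only for reproducibility
variable1 <- rnorm(200, sd = 3)
variable2 <- variable1 * 0.4 + rnorm(200, sd = 0.5)
category <- sample(c("A", "B", "C"), 200, replace = TRUE)
myDF <- data.frame(variable1, variable2, category)

# Same scatterplot as the qplot() call: map the columns to aesthetics,
# then add a point layer
fullPlot <- ggplot(myDF, aes(x = variable1, y = variable2, color = category)) +
  geom_point()

# Printing the object draws the plot
fullPlot
```

The longer form makes the pieces explicit: the data, the aesthetic mapping, and the layer that draws the points.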
Changing Themes
Now, let's say that you don't want a solid gray background, and you want a bigger default text size. These are really common tweaks, so ggplot2 makes it easy to change the default "theme" in order to get white backgrounds and custom text sizes. You can set this globally for an entire R session with the following command.
theme_set(theme_bw(24))
Then we can reuse the single line of code we already used to make a scatter plot, and the “black and white” theme will be applied, along with a larger base size (the default is 12, so we have set it to be twice as large as the default). Note that the different sizes of axes, tick marks, legend titles, etc are automatically calculated relative to the base size that we set to 24.
qplot(x=variable1,y=variable2,data=myDF,color=category,geom="point")
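Because theme_set() changes the theme for the whole session, it can be handy to undo it later. theme_set() invisibly returns the theme that was active before the call, so a rough sketch of saving and restoring it looks like this (the name oldTheme is just for illustration):

```r
library(ggplot2)

# theme_set() returns the previously active theme, so keep hold of it
oldTheme <- theme_set(theme_bw(24))

# ... plots made here use the black and white theme at base size 24 ...

# Hand the saved theme back to theme_set() to undo the global change
theme_set(oldTheme)
```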
It is easy to change theme elements for a single plot, rather than changing them globally as we did before with theme_set(). If I decide I want to go back to the default gray theme for any given plot, I can simply add a theme element to the plot object produced by the qplot() function.
Note: I save the result of the qplot() function into a variable called myPlot. This doesn't print anything to the screen; it just saves the ggplot object in a variable that I can print whenever I want and manipulate to my heart's content. This is a useful way to avoid typing commands again. The plot gets printed to the screen in the second line of code.
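myPlot<-qplot(x=category,y=variable1,data=myDF,geom="boxplot",fill=category)
myPlot + theme_gray(12)
This produces a boxplot with a gray background, because we have added the theme_gray(12) theme element to the plot. Note that this command also resulted in a smaller base size. Also note that the next plot we make will have a white background and the larger base size, because the theme_set() command we ran earlier is still in effect globally.
Changing individual theme elements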
Although all I did was add the function theme_gray(12) to the previous plot, ggplot2 did several things. It changed the background to gray AND the gridlines to white (rather than black gridlines and white background), AND changed the base size. It did these several independent things behind the scenes, but you can of course manually change all the theme elements yourself. Let's say you like super-heinous-ugly plots. You might want to do the following.
myPlot + opts(plot.background=theme_rect(fill="yellow"),
panel.background=theme_rect(fill="purple"))
This is where the rubber hits the road for tweaking theme elements. We can change theme elements by using the opts() function to pass new values for various theme elements. There are about 30 theme elements you might want to change (listed here). In our case we changed two: the plot.background element and the panel.background element. Note the use of theme_rect(fill=”yellow”) to specify that we want a yellow plot background. Any time you tweak a theme element with the opts() command, you do so with one of five functions.
- theme_blank() – used to prevent an element from being drawn
- theme_text() – used to tweak text and labels
- theme_line() & theme_segment() – for drawing lines and segments for things like axes
- theme_rect() – for things like the overall plot background and the plot panel background
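The opts() and theme_rect() calls above match the ggplot2 version this post was written against. From ggplot2 0.9.2 onward, opts() was replaced by theme(), and the theme_*() element functions were renamed to element_*() (for example element_rect(), element_text(), element_line(), and element_blank()). A minimal sketch of the same yellow-and-purple plot in the newer syntax follows; the data is recreated so the block runs on its own, and the seed and the name uglyPlot are my own additions.

```r
library(ggplot2)

# Recreate the sample data so this block runs on its own
set.seed(1)  # assumption: fixed seed, added only for reproducibility
myDF <- data.frame(
  variable1 = rnorm(200, sd = 3),
  category = sample(c("A", "B", "C"), 200, replace = TRUE)
)

# The boxplot from earlier, written with ggplot() instead of qplot()
myPlot <- ggplot(myDF, aes(x = category, y = variable1, fill = category)) +
  geom_boxplot()

# theme() takes the place of opts(); element_rect() takes the place of theme_rect()
uglyPlot <- myPlot + theme(
  plot.background = element_rect(fill = "yellow"),
  panel.background = element_rect(fill = "purple")
)
uglyPlot
```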
Annotating a Plot
We can use the annotate() function as a quick way to add text to our plot. I position the text label by specifying the coordinates where I want the center of the text, in this case centered at 0 on the x-axis and at the maximum y value. Specifying x and y coordinates works on a boxplot too: just think of the categories as sequential integers.
myScatterplot<-qplot(x=variable1,y=variable2,data=myDF,color=category,geom="point")
myScatterplot + annotate("text", x = 0, y = max(myDF$variable2), label = "Relationship between x and y")
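To illustrate the point about boxplots, here is a small sketch that places a label over the second category by passing x = 2. The data is recreated so the block runs on its own; the seed, the label text, and the names myBoxplot and labelledBoxplot are made up for this example.

```r
library(ggplot2)

# Recreate the sample data so this block runs on its own
set.seed(1)  # assumption: fixed seed, added only for reproducibility
myDF <- data.frame(
  variable1 = rnorm(200, sd = 3),
  category = sample(c("A", "B", "C"), 200, replace = TRUE)
)

myBoxplot <- ggplot(myDF, aes(x = category, y = variable1, fill = category)) +
  geom_boxplot()

# Categories sit at integer positions 1, 2, 3 along the x-axis,
# so x = 2 centers the label over the second category ("B")
labelledBoxplot <- myBoxplot +
  annotate("text", x = 2, y = max(myDF$variable1), label = "Category B")
labelledBoxplot
```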