Creating shaded areas in R

[This article was first published on fernandohrosa.com.br - en » R, and kindly contributed to R-bloggers.]

Sometimes we want to shade an area under a density curve in a plot, for instance to illustrate a p-value or a region under the normal curve.

Using the polygon() function from base R we can make graphics like the following one:

Standard normal density. Shaded area shows P(1 < X < 2)

How to do it

First we have to plot the function whose area we are going to shade. We can usually do that with the curve() function. For a standard normal density we can use:

curve(dnorm(x,0,1))

The command above draws the curve over a default range of x values (0 to 1 on a fresh graphics device), which is rarely the range we want. Since we want to shade a specific area under the curve, it is advisable to specify the interval on the x-axis over which the function is plotted. We can do that using the xlim parameter:

curve(dnorm(x,0,1),xlim=c(-3,3),main='Normal Density')

Now that we have our curve plotted, we are going to use the polygon(x,y) function. Its most important parameters are x and y, which give the coordinates of the polygon's vertices. If x and y are vectors with n elements each, polygon(x,y) draws a polygon with vertices (xi,yi), i=1,…,n, joined in order. The trick now consists of expressing our region of interest as a polygon. The idea might seem odd at first, since the region is bounded by a continuous curve, but if we take a large enough number of vertices the approximation will be very good.
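To see how polygon() treats its x and y arguments before applying it to a density, here is a minimal sketch that draws a filled triangle. The plotting limits, the vertex coordinates and the names tri.x and tri.y are arbitrary choices made for this illustration only.

```r
# Empty plotting region with known limits (type = "n" draws no points).
plot(c(0, 4), c(0, 3), type = "n", xlab = "x", ylab = "y",
     main = "polygon() demo")

# Vertices (1,0), (3,0), (2,2). polygon() joins them in the order given
# and closes the shape by connecting the last vertex back to the first.
tri.x <- c(1, 3, 2)
tri.y <- c(0, 0, 2)
polygon(tri.x, tri.y, col = "skyblue", border = "black")
```

The same logic carries over to the density: every point we add along the curve is just one more vertex in these two vectors.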

Let’s say we want to shade the region represented by P(-3 < X < -2). The first vertex we want for our polygon is (-3,0). We have then:

cord.x <- c(-3)
cord.y <- c(0)

The second vertex will be (-3,f(-3)), where f(-3) is the normal density evaluated at -3. We can readily obtain this value in R using the dnorm() function:

cord.x <- c(cord.x,-3) 
cord.y <- c(cord.y,dnorm(-3))

As the third and fourth vertices we can take (-2,f(-2)) and (-2,0). We now have:

cord.x <- c(cord.x,-2,-2)
cord.y <- c(cord.y,dnorm(-2),0)

We then issue:

polygon(cord.x,cord.y,col='skyblue')

This gives (if you followed the instructions correctly):

First attempt at creating a shaded area in R
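One way to judge this first attempt is to compare the area of the four-vertex polygon with the probability it is meant to represent. The sketch below does that; the variable names trapezoid.area and exact.area are made up for this illustration. The polygon is a trapezoid whose top edge is a straight line from (-3,f(-3)) to (-2,f(-2)), and since the normal density is convex on this interval, that straight edge lies above the curve.

```r
# Area of the four-vertex polygon: a trapezoid of width 1 whose
# parallel sides have heights f(-3) and f(-2).
trapezoid.area <- (dnorm(-3) + dnorm(-2)) / 2 * (-2 - (-3))

# Exact probability P(-3 < X < -2) from the normal CDF.
exact.area <- pnorm(-2) - pnorm(-3)

# Show both values side by side.
print(c(trapezoid = trapezoid.area, exact = exact.area))
```

The trapezoid is visibly larger than the true region, which is why more vertices along the curve are worth adding.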

Not that bad, but we can do better. We just have to place a sequence of vertices along the curve between the two endpoints of the interval (x = -3 and x = -2 here), so that the larger number of vertices gives us a better approximation. To accomplish that we are going to use the seq() function to create a sequence between -3 and -2, with steps of 0.01:

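# Vertices: start on the x-axis, follow the density, return to the x-axis
cord.x <- c(-3, seq(-3, -2, 0.01), -2)
cord.y <- c(0, dnorm(seq(-3, -2, 0.01)), 0)
# Redraw the curve, then shade the region on top of it
curve(dnorm(x, 0, 1), xlim = c(-3, 3), main = 'Standard Normal')
polygon(cord.x, cord.y, col = 'skyblue')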

Which gives us the result:

Final graphic.
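As a rough check that a step of 0.01 is fine enough, the area enclosed by the many-vertex polygon can be compared with the exact probability. This is only a sketch: the names xs, ys, poly.area and exact.area are introduced here for illustration, and the polygon area is computed by summing the thin trapezoids between consecutive vertices.

```r
# Same grid of x values used for the polygon vertices.
xs <- seq(-3, -2, 0.01)
ys <- dnorm(xs)

# Area under the piecewise-linear top edge of the polygon:
# each strip between neighbouring vertices is a thin trapezoid.
poly.area <- sum(diff(xs) * (head(ys, -1) + tail(ys, -1)) / 2)

# Exact probability P(-3 < X < -2) from the normal CDF.
exact.area <- pnorm(-2) - pnorm(-3)

# Show both values and the absolute difference between them.
print(c(polygon = poly.area, exact = exact.area,
        difference = abs(poly.area - exact.area)))
```

With this step size the two areas agree far more closely than anything visible on screen, so a finer grid would not change the picture.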

Comments

Note that we used the standard normal density. If we wanted to use another probability distribution, we would have to use its density function and specify its parameters properly. Try playing with areas under the exponential distribution, for example. To learn more about the uses and misuses of the polygon function, take a look at its documentation (?polygon).
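As a starting point for the exponential case, the same recipe might look like the sketch below. The rate of 1, the interval from 0.5 to 1.5, the plotting range and the variable names rate, lower, upper and xs are all assumptions chosen for illustration; dexp() and pexp() are the exponential density and CDF from base R.

```r
# Illustrative choices: rate = 1, shade P(0.5 < X < 1.5).
rate  <- 1
lower <- 0.5
upper <- 1.5

# Plot the exponential density over a range that contains the interval.
curve(dexp(x, rate = rate), xlim = c(0, 5), ylab = "density",
      main = "Exponential density (rate = 1)")

# Vertices: down on the x-axis at both ends, on the curve in between.
xs <- seq(lower, upper, 0.01)
cord.x <- c(lower, xs, upper)
cord.y <- c(0, dexp(xs, rate = rate), 0)
polygon(cord.x, cord.y, col = "skyblue")

# Probability represented by the shaded region.
pexp(upper, rate = rate) - pexp(lower, rate = rate)
```

Only the density function, its parameters and the interval change; the construction of the vertex vectors stays the same as in the normal example.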
