Individuals Control Charts

[This article was first published on R on jmarriott.com, and kindly contributed to R-bloggers.]

This post is a follow-up to my two recent posts on generating control charts in R and animating them. One thing that I’ve been wondering about is how to calculate a moving range chart without using a package like ggQC or qcc. I knew that I could do it with a loop, but I was looking for a dplyr method when I stumbled on the dplyr::lag() function in this post on Stack Overflow. It works like a charm!

First, let’s load the tidyverse and generate some sample data.

library(tidyverse)
set.seed(1558635665)
control_chart_tibble <- tibble(x = rnorm(n = 30, mean = 25, sd = .005))
control_chart_tibble
## # A tibble: 30 x 1
##        x
##    <dbl>
##  1  25.0
##  2  25.0
##  3  25.0
##  4  25.0
##  5  25.0
##  6  25.0
##  7  25.0
##  8  25.0
##  9  25.0
## 10  25.0
## # ... with 20 more rows

Now we can use the lag() function to calculate the moving range, i.e. the absolute difference between each row and the row before it.

control_chart_tibble <- control_chart_tibble %>%
  mutate(mr = abs(x - lag(x, 1)))
control_chart_tibble
## # A tibble: 30 x 2
##        x        mr
##    <dbl>     <dbl>
##  1  25.0 NA       
##  2  25.0  0.0118  
##  3  25.0  0.00396 
##  4  25.0  0.000981
##  5  25.0  0.00289 
##  6  25.0  0.00344 
##  7  25.0  0.00420 
##  8  25.0  0.00482 
##  9  25.0  0.00306 
## 10  25.0  0.000677
## # ... with 20 more rows

The first row of the mr column is NA because the first moving range is the difference between the first and second rows, and it gets stored in the second row.
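
To make the alignment concrete, here is a minimal sketch on a hypothetical four-value vector (the numbers are made up for illustration). It compares the dplyr::lag() approach with what I believe is the base R equivalent, abs(diff(x)) padded with a leading NA.

```r
library(dplyr)

# Hypothetical measurements, only for illustration
x <- c(25.010, 24.998, 25.002, 25.001)

# dplyr::lag() shifts x down by one position, so element i is compared with i - 1
mr_lag <- abs(x - dplyr::lag(x, 1))

# Base R: diff() returns n - 1 differences, so pad with NA to keep the length
mr_diff <- c(NA, abs(diff(x)))

# Both versions have NA in the first position and agree everywhere else
all.equal(mr_lag, mr_diff)
```

The lag() version is convenient inside mutate() because it already returns a vector the same length as the column.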

Now the only thing left to do is calculate the control limits for the individuals and range charts.

To estimate the process standard deviation we’ll use the average of the mr column (i.e. r-bar) divided by the d2 constant, as if we had two data points for each observation. This is usually a better basis for control limits than R’s built-in sd() function, because sd() on the whole data set also picks up shifts and drifts in the process mean, which inflates the estimate. Keep in mind when you do this that individuals charts like these won’t work very well if your data deviate from a normal distribution. x-bar charts, where you take the average of n data points, are less sensitive to the assumption of normality because of the central limit theorem.
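
Here is a small sketch of why the two estimates can differ. The data are hypothetical: a process that alternates between two values and then shifts by 10 units halfway through. The moving-range estimate only sees the shift once, while sd() sees it in every point.

```r
# Hypothetical data: stable short-term variation with one shift in the mean
x <- c(10, 11, 10, 11, 10, 11, 20, 21, 20, 21, 20, 21)

mr <- abs(diff(x))          # moving ranges between consecutive points
d2 <- 1.1284                # d2 constant for subgroups of size 2

sigma_mr <- mean(mr) / d2   # short-term estimate: r-bar / d2
sigma_sd <- sd(x)           # overall estimate, inflated by the shift

c(sigma_mr = sigma_mr, sigma_sd = sigma_sd)
```

With limits based on sigma_sd, the shift would be much harder to see on the chart, because the shift itself has widened the limits.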

Ok, enough boring stuff. Let’s do the math!

control_chart_tibble <- control_chart_tibble %>%
  mutate(x_cl = mean(x), # individuals centerline (mean of x)
         mr_cl  = mean(mr, na.rm = TRUE), # moving range centerline (r-bar)
         x_ucl  = x_cl + 3 * mr_cl/1.1284, # individuals UCL = mean + 3 * r-bar / d2
         x_lcl  = x_cl - 3 * mr_cl/1.1284, # individuals LCL = mean - 3 * r-bar / d2
         mr_ucl = 3.2665 * mr_cl, # moving range UCL = D4 * r-bar
         mr_lcl = 0, # moving range LCL = D3 * r-bar = 0 for n = 2
         n      = row_number()) %>%
  gather(key = "type", value = "value", -n) %>%
  mutate(chart = case_when( # Add a column to facet by
    str_detect(type, "x") ~ "Individuals",
    str_detect(type, "mr") ~ "Moving Range"))

control_chart_tibble %>%
  ggplot(aes(y = value, x = n, group = type)) + 
  # the "group" aes allows us to display different data without differentiating it on the chart
  geom_line() +
  facet_grid(chart~., scales = "free") +
  ggtitle("Example X-Moving Range Chart") +
  theme_bw() +
  theme(axis.title.x = element_blank(), axis.title.y = element_blank())
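
A side note on the reshaping step: gather() still works, but the tidyr documentation lists it as superseded by pivot_longer(). Below is a minimal sketch of what I think the equivalent call looks like, on a hypothetical three-row tibble rather than the data above.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, only to illustrate the wide-to-long reshape
wide <- tibble(x = c(25.01, 24.99, 25.00)) %>%
  mutate(mr = abs(x - lag(x, 1)),
         n  = row_number())

# Same idea as gather(key = "type", value = "value", -n)
long <- wide %>%
  pivot_longer(cols = -n, names_to = "type", values_to = "value")

long
```

The rows come out in a different order than with gather(), but that should not matter for the plot since ggplot2 groups by the type column.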

That’s a wrap! Let me know what you think in the comments.
