Auto Encoders for Anomaly Detection in Predictive Maintenance
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Autoencoders is an unsupervised version of neural network that is used for data encoding. This technique is mainly used to learn the representation of data that can be used for dimensionality reduction by training network to ignore noise. Autoencoders play an important role in unsupervised learning and deep architectures mainly for transfer learning (Pierre. B, 2012). When autoencoders are decoded, they are simple linear circuits that transforms inputs to outputs with least distortion. Autoencoders were first introduced in 1980’s to address the issue of back propagation without training and rather use input as a teacher (Rumelhart et al., 1986). Since then, autoencoders have taken a phase change to the form on Restricted Boltzman Machine. Today, autoencoders are used in various applications such as predicting sentiment distributions in Natural Language Processing (NLP) (Socher et al., 2011a) (Socher et al., 2011b), feature extraction (Masci et al., 2011), anomaly detection (Sakurada et al., 2014), facial recognition (Gao et al., 2015), clustering (Dilokthanakul et al., 2016), image classification (Geng et al., 2015) and many other application.
Image: Simple auto encoder representation
In today’s tutorial, I will go over on how to use auto encoders for anomaly detection in predictive maintenance.
Load Libraries
You will need only two libraries for this analysis.
options(warn=-1)
# load libraries
library(dplyr)
library(h2o)
Load data
Here we are using data from a bench press and can be downloaded from my github repo. This is an experimental data I generated in a lab for my PhD dissertation. There are total of four different states in this machine and they are split into four different csv files. We need to load the data first. In the data time represents the time between samples, ax is the acceleration on x axis, ay is the acceleration on y axis, az is the acceleration on z axis and at is the G’s. The data was collected at sample rate of 100hz.
Four different states of the machine were collected
1. Nothing attached to drill press
2. Wooden base attached to drill press
3. Imbalance created by adding weight to one end of wooden base
4. Imbalance created by adding weight to two ends of wooden base.
setwd("/home/")
#read csv files
file1 = read.csv("dry run.csv", sep=",", header =T)
file2 = read.csv("base.csv", sep=",", header =T)
file3 = read.csv("imbalance 1.csv", sep=",", header =T)
file4 = read.csv("imbalance 2.csv", sep=",", header =T)
head(file1)
We can look at the summary of each file using summary function in R. Below, we can observe that 66 seconds long data is available. We also have min, max and mean for each of the variables.
# summary of each file
summary(file2)
Data Aggregation and feature extraction
Here, the data is aggregated by 1 minute and features are extracted. Features are extracted to reduce the size of the data and only storing the representation of the data.
file1$group = as.factor(round(file1$time))
file2$group = as.factor(round(file2$time))
file3$group = as.factor(round(file3$time))
file4$group = as.factor(round(file4$time))
#(file1,20)
#list of all files
files = list(file1, file2, file3, file4)
#loop through all files and combine
features = NULL
for (i in 1:4){
res = files[[i]] %>%
group_by(group) %>%
summarize(ax_mean = mean(ax),
ax_sd = sd(ax),
ax_min = min(ax),
ax_max = max(ax),
ax_median = median(ax),
ay_mean = mean(ay),
ay_sd = sd(ay),
ay_min = min(ay),
ay_may = max(ay),
ay_median = median(ay),
az_mean = mean(az),
az_sd = sd(az),
az_min = min(az),
az_maz = max(az),
az_median = median(az),
aT_mean = mean(aT),
aT_sd = sd(aT),
aT_min = min(aT),
aT_maT = max(aT),
aT_median = median(aT)
)
features = rbind(features, res)
}
#view all features
head(features)
Create Train and Test Set
To build an anomaly detection model, a train and test set is required. Here, the normal condition of the data is used for training and remaining is used for testing.
# create train and test set
train = features[1:67,2:ncol(features)]
test = features[68:nrow(features),2:ncol(features)]
Auto Encoders
Auto Encoders using H2O package
Use the h2o.init()method to initialize H2O. This method accepts the following options. Note: that in most cases, simply using h2o.init() is all that a user is required to do.
# initialize h2o cluser
h2o.init()
The R object to be converted to an H2O object should be named so that it can be used in subsequent analysis. Also note that the R object is converted to a parsed H2O data object, and will be treated as a data frame by H2O in subsequent analysis.
# convert train and test to h2o object
train_h2o = as.h2o(train)
test_h2o = as.h2o(test)
The h2o.deeplearning function fits H2O’s Deep Learning models from within R. While H2O Deep Learning has many parameters, it was designed to be just as easy to use as the other supervised training methods in H2O. Early stopping, automatic data standardization and handling of categorical variables and missing values and adaptive learning rates (per weight) reduce the amount of parameters the user has to specify. Often, it’s just the number and sizes of hidden layers, the number of epochs and the activation function and maybe some regularization techniques.
# build auto encoder model with 3 layers
model_unsup = h2o.deeplearning(x = 2:ncol(features)
, training_frame = train_h2o
, model_id = "Test01"
, autoencoder = TRUE
, reproducible = TRUE
, ignore_const_cols = FALSE
, seed = 42
, hidden = c(50,10,50,100,100)
, epochs = 100
, activation ="Tanh")
# view the model
model_unsup
Detect anomalies in an H2O data set using an H2O deep learning model with auto-encoding trained previously.
# now we need to calculate MSE or anomaly score
anmlt = h2o.anomaly(model_unsup
, train_h2o
, per_feature = FALSE) %>% as.data.frame()
# create a label for healthy data
anmlt$y = 0
# view top data
head(anmlt)
Calculate the threshold value for trainanomaly scores. Various methods can be used such as calculating the quantiles, max, median, min etc. It all depends on the use case. Here we will use quantile with probability of 99.9%.
# calculate thresholds from train data
threshold = quantile(anmlt$Reconstruction.MSE, probs = 0.999)
Now, we have anomaly score for train and its thresholds, we can predict the new anomaly scores for test data and plot it to see how it differs from train data.
# calculate anomaly scores for test data
test_anmlt = h2o.anomaly(model_unsup
, test_h2o
, per_feature = FALSE) %>% as.data.frame()
# create a label for healthy data
test_anmlt$y = 1
# combine the train and test anomaly scores for visulaizatio
results = data.frame(rbind(anmlt,test_anmlt), threshold)
head(results)
The results are plotted below. The x axis is the observations and y axis is the anomaly score. The green points are the trained data and red are test data. We can note that all the data that was trained except one lied below the anomaly limit. Its also interesting to note the increasing trend pattern for the anomaly scores for other state of the machine.
# Adjust plot sizes
options(repr.plot.width = 15, repr.plot.height = 6)
plot(results$Reconstruction.MSE, type = 'n', xlab='observations', ylab='Reconstruction.MSE', main = "Anomaly Detection Results")
points(results$Reconstruction.MSE, pch=19, col=ifelse(results$Reconstruction.MSE < threshold, "green", "red"))
abline(h=threshold, col='red', lwd=2)
Conclusion
Auto encoder is a very powerful tool and very fun to play with. They have been used in image analysis, image reconstruction and image colorization. In this tutorial you have seen how to perform anomaly detection on a simple signal data and few lines of code. The possibilities of using this are many. Let me know what you think about auto encoders in the comments below.
Follow my work
Github, Researchgate, and LinkedIn
Session info
Below is the session info for the the packages and their versions used in this analysis.
sessionInfo()
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.