Functional Regression Control Chart: a New Framework for Profile Monitoring
Introduction
New statistical process control (SPC) methods must be developed to handle the increasingly complex data made available by new data acquisition technologies. In particular, in many practical situations the quality characteristic of a process can be modelled as a function defined on a compact domain. Data of this kind are the foundation of a rapidly expanding area of statistics referred to as functional data analysis (FDA). SPC methods that allow monitoring and controlling such processes are known as profile monitoring techniques. As in classical SPC (i.e., where data are scalars), profile monitoring control charts continuously monitor the quality characteristic and trigger a signal when assignable sources of variation (i.e., special causes) act on it. When this happens, the process is said to be out-of-control (OC). Conversely, the process is said to be in-control (IC) when only normal sources of variation (i.e., common causes) apply.
Often, measurements of other functional covariates related to the quality
characteristic are available. To exploit them, we propose a new control
chart that continuously monitors the quality characteristic using
information coming from these functional covariates. The idea is to
adjust the quality characteristic for the effect of the covariates in
order to improve the accuracy and the effectiveness of the chart in
identifying assignable sources of variation acting on the process. This
chart is referred to as the functional regression control chart (FRCC)
because of its similarity to the regression control chart, which arises
in the multivariate (non-functional) context. The proposed methodology
is implemented in the R package funcharts, available at
https://github.com/unina-sfere/funcharts.
The Functional Regression Control Chart Framework
The FRCC can be regarded as a general framework for profile monitoring
that can be divided into three main steps. Firstly, (i) define a
functional regression model to be fitted
$$\tilde{Y}=g\left(\mathbf{\tilde{X}}\right)+\varepsilon,$$
where $\tilde{Y}$ is the functional response variable, which represents the
functional quality characteristic, and $\varepsilon$ is a functional
error term, both defined on the compact domain $\mathcal{T}$, and $g$ is a
generic function of a vector $\mathbf{\tilde{X}}$ of random functional
covariates $\tilde{X}_1,\dots,\tilde{X}_p$, defined on the compact
domain $\mathcal{S}$. Secondly, (ii) define the estimation method of the
chosen model, and, thirdly, (iii) define the monitoring strategy of the
functional residual defined as
$$\tilde{e}=\tilde{Y}-\widehat{\tilde{Y}},$$
where $\widehat{\tilde{Y}}$ is the fitted value of $\tilde{Y}$.
In particular, to obtain a specific implementation of the FRCC, we
assume that the covariates $\mathbf{X}$ linearly influence the response
$Y$ through the multivariate functional linear regression model, that is
$$Y\left(t\right)=\int_{\mathcal{S}}\left(\mathbf{\beta}\left(s,t\right)\right)^{T}\mathbf{X}\left(s\right)ds+\varepsilon\left(t\right),\quad t \in \mathcal{T},$$
where $Y$ and $\mathbf{X}$ are the standardized versions of $\tilde{Y}$ and
$\tilde{\mathbf{X}}$, and
$\mathbf{\beta}=\left(\beta_1,\dots,\beta_p\right)^{T}$ is the coefficient
vector. An estimator $\hat{\mathbf{\beta}}$ of the coefficient vector
$\mathbf{\beta}$ is obtained from $n$ i.i.d. observations of the response
and predictor variables, using the *multivariate functional principal component* (or
Karhunen–Loève) decomposition of $Y$ and $\mathbf{X}$. To monitor the
residual $\tilde{e}$, we consider Hotelling’s $T^{2}$ and the squared
prediction error ($SPE$) control charts based on the scores of the
functional principal component decomposition. The control limits are
calculated as percentiles of the empirical distributions of the two
statistics, estimated from observations acquired under in-control
conditions and chosen to meet a given overall Type I error. This phase,
along with the estimation of $\mathbf{\beta}$, will be referred to as
Phase I. For a new observation, the residual and, thus, the $T^{2}$ and
$SPE$ statistics are calculated, and an alarm signal is issued if at
least one statistic violates its control limit (Phase II).
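The Phase I / Phase II logic described above can be illustrated with a minimal Python sketch. It is not the funcharts implementation: all function names are hypothetical, the scores and eigenvalues are assumed to come from a principal component decomposition computed elsewhere, the $SPE$ is approximated by a Riemann sum on a regular grid, and the Šidák split of the overall Type I error between the two charts is an assumption, since the text does not say which correction is used.

```python
import math

def t2_statistic(scores, eigenvalues):
    """Hotelling's T^2: squared retained scores scaled by their variances."""
    return sum(s * s / lam for s, lam in zip(scores, eigenvalues))

def spe_statistic(residual, reconstruction, step):
    """SPE: integrated squared gap between a residual curve and its
    reconstruction from the retained components (regular grid, spacing `step`)."""
    return step * sum((r - h) ** 2 for r, h in zip(residual, reconstruction))

def sidak_alpha(overall_alpha, n_charts=2):
    """Per-chart Type I error giving the stated overall error
    (Sidak correction; an assumption, not taken from the article)."""
    return 1 - (1 - overall_alpha) ** (1 / n_charts)

def empirical_limit(values, alpha):
    """Upper control limit: empirical (1 - alpha) quantile of Phase I values."""
    ordered = sorted(values)
    k = math.ceil((1 - alpha) * len(ordered)) - 1
    return ordered[min(max(k, 0), len(ordered) - 1)]

def signals(t2, spe, t2_limit, spe_limit):
    """Phase II rule: alarm if at least one statistic exceeds its limit."""
    return t2 > t2_limit or spe > spe_limit

# Phase I: hypothetical in-control statistics (illustrative numbers only).
phase1_t2 = [0.8, 1.5, 2.1, 0.4, 3.0, 1.1, 0.9, 2.6]
phase1_spe = [0.10, 0.22, 0.15, 0.31, 0.08, 0.27, 0.19, 0.12]
alpha_each = sidak_alpha(0.05)
t2_limit = empirical_limit(phase1_t2, alpha_each)
spe_limit = empirical_limit(phase1_spe, alpha_each)

# Phase II: statistics of a new (hypothetical) residual.
new_t2 = t2_statistic([2.0, -1.5], [1.0, 0.5])
print(signals(new_t2, 0.2, t2_limit, spe_limit))
```

With so few Phase I observations the empirical quantile collapses to the sample maximum; in practice a much larger in-control sample is needed for the limits to be meaningful.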
Real-case Study: Fuel Consumption Monitoring in the Shipping Industry
To demonstrate the potential and the applicability of the proposed
control chart in practical situations, a real-case study in the shipping
industry is presented. It addresses the issue of monitoring ship fuel
consumption and, thus, $\text{CO}_{\text{2}}$ emissions, which, in view
of climate change, have become of great interest in the maritime field
in recent years. In particular, real data are collected from a Ro-Pax ship
owned by the Italian shipping company Grimaldi Group linking two ports
in the Mediterranean Sea from December 2014 to October 2017.
The following figure shows the
315 profiles observed for the covariates and response.
In particular, the cumulative fuel consumption ($CFC$) for each voyage
is considered as the response variable, whereas the sailing time
($T$), measured in hours ($h$), the speed over ground ($SOG$),
measured in knots ($kn$), and the longitudinal and transverse wind
components ($W_{lo}$ and $W_{tr}$), measured in knots ($kn$), are
used as the predictors.
During February 2016, energy efficiency operations were performed that
produced a shift in the response mean. In light of this, observations
collected before the energy efficiency operations are used in Phase I,
whereas the remaining observations are used to perform Phase II. To
evaluate the FRCC performance, two competitor profile monitoring schemes
are considered. The first monitors the scores coming from a principal
component decomposition of the response by means of Hotelling’s $T^{2}$
and $SPE$ control charts (hereafter denoted as the RESP control chart);
the second monitors the area under the response curve (hereafter denoted
as the INBA control chart). The performance of the three charts is
evaluated by means of the average run length ($\text{ARL}$).
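To make the INBA idea concrete, the following Python sketch computes the area under a discretized response profile with the trapezoidal rule and derives simple limits from Phase I areas. The function names are hypothetical, and the three-sigma Shewhart-type limits are an assumption made for illustration; the text only states that the area under the response curve is monitored, not how its limits are set.

```python
import statistics

def area_under_curve(times, values):
    """Trapezoidal approximation of the integral of a profile over its domain."""
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (values[i] + values[i - 1]) * (times[i] - times[i - 1])
    return area

def shewhart_limits(phase1_areas, k=3.0):
    """Lower and upper limits as mean -/+ k standard deviations
    (an illustrative choice, not taken from the article)."""
    centre = statistics.mean(phase1_areas)
    spread = statistics.stdev(phase1_areas)
    return centre - k * spread, centre + k * spread

# Hypothetical profile observed at three time points.
print(area_under_curve([0.0, 1.0, 2.0], [0.0, 1.0, 2.0]))
```

Reducing each profile to a single number makes the chart simple to run, but it discards changes in the shape of the curve that leave the area unchanged.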
In the following figure, each observation is plotted onto the FRCC control chart and the two competitor ones.
A comparison of the three charts shows that the FRCC is clearly more
responsive than the INBA and the RESP control charts, which signal far
fewer OC observations. In particular, for the FRCC the
change in the response mean is almost exclusively captured by the
$T^{2}$ control chart, which means that dissimilarities between the
Phase I and Phase II samples occur mostly in the space spanned by the
retained principal components. Moreover, as the following table shows,
the estimated $\text{ARL}$ ($\widehat{\text{ARL}}$) achieved by the FRCC
is less than a quarter of those achieved by the RESP and INBA control
charts. This further confirms that the FRCC outperforms the competitor
control charts.
| | FRCC | RESP | INBA |
|---|---|---|---|
| $\widehat{\text{ARL}}$ | 2.07 | 9.46 | 11.28 |
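The article does not say how $\widehat{\text{ARL}}$ is computed. One common estimator, which assumes independent Phase II observations with a constant signal probability, is the reciprocal of the observed signal rate. The Python sketch below uses a hypothetical function name and made-up alarm flags, not the case-study data.

```python
def estimated_arl(signal_flags):
    """Estimate the average run length as (number of observations) /
    (number of signals); returns infinity when the chart never signals."""
    n_signals = sum(1 for flag in signal_flags if flag)
    if n_signals == 0:
        return float("inf")
    return len(signal_flags) / n_signals

# Hypothetical Phase II alarm sequence: one signal every second observation.
print(estimated_arl([True, False, False, True]))
```

When the process is known to have shifted, as in Phase II here, a smaller value means that the chart reacts to the change more quickly.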
This article is based on
Centofanti, Fabio, Antonio Lepore, Alessandra Menafoglio, Biagio
Palumbo, and Simone Vantini. “Functional Regression Control Chart.”
Technometrics (2020): 1-14, DOI:
https://doi.org/10.1080/00401706.2020.1753581.
Authors’ biography
Fabio Centofanti is a PhD student at the Department of Industrial Engineering of the University of Naples Federico II, Italy. His research interests include functional data analysis and statistical process monitoring.
Antonio Lepore is an Assistant Professor at the Department of Industrial Engineering of the University of Naples Federico II, Italy. His main research interests include the industrial application of statistical techniques to the monitoring of complex measurement profiles from multi-sensor acquisition systems, with particular attention to renewable energy and harmful emissions.
Alessandra Menafoglio is an Assistant Professor at MOX, Department of Mathematics, Politecnico di Milano. Her research interests focus on the development of innovative statistical models and methods for the analysis and statistical process control of complex observations (e.g., curves, images, functional signals), possibly characterized by spatial dependence.
Biagio Palumbo is an Associate Professor in “Statistics for experimental and technological research” at the Department of Industrial Engineering of the University of Naples Federico II, Italy. His major research interests include reliability, design and analysis of experiments, statistical methods for process monitoring and optimization and data science for technology.
Simone Vantini is Associate Professor of Statistics at the Politecnico di Milano, Italy. He has been publishing widely in Functional and Object-Oriented Data Analysis. His current research interests include: permutation testing, nonparametric forecasting, process control, non-Euclidean data, and in general statistical methods and applications motivated by business or industrial problems.