An R Package `EnergyOnlineCPM' for Nonparametric Multivariate Statistical Process Control Chart Based on Energy Test (by Ya Fei Xu, yafei.xu@hotmail.de)




Keywords: statistical process control, SPC, statistical process monitoring, SPM, Phase II, online, nonparametric, sequential detection, high dimensional process, high dimensional time series, multiple change points, change point model, energy statistic, energy test, distributional divergence, R package, change point in R 

Attention: This package requires R version >= 3.3.2 and INDEPENDENT observations.
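
The version requirement can be checked before installation with base R alone. The following is a minimal sketch; the variable names `requiredVersion` and `versionOk` are illustrative and not part of the package.

```r
# Minimum R version stated in this README
requiredVersion <- "3.3.2"

# getRversion() returns the running R version as a comparable object
versionOk <- getRversion() >= requiredVersion

if (!versionOk) {
  stop("EnergyOnlineCPM needs R >= ", requiredVersion,
       ", but this session runs R ", as.character(getRversion()))
}
```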

1. Description: The R package `EnergyOnlineCPM' is the first package centered on a Phase II nonparametric change point model for online detection of multiple change points in high dimensional time series. Detection is based on the maximum energy test statistic, computed with permutation samples.
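
To illustrate the idea behind an energy test with permutation samples, the sketch below computes the standard two-sample energy statistic for one fixed split of the data and derives a permutation p-value from it. It is a simplified illustration in base R, not the package's implementation: the function names `energyStat` and `energyPermTest`, their arguments, and the toy data are all assumptions made for this example. It also does not maximise over candidate split points as a change point model would.

```r
# Two-sample energy statistic for the rows of x (n x d) and y (m x d)
energyStat <- function(x, y) {
  n <- nrow(x)
  m <- nrow(y)
  # pairwise Euclidean distances over the pooled sample
  d <- as.matrix(dist(rbind(x, y)))
  ix <- seq_len(n)
  iy <- n + seq_len(m)
  between <- mean(d[ix, iy])  # mean distance across the two samples
  withinX <- mean(d[ix, ix])  # mean distance inside x
  withinY <- mean(d[iy, iy])  # mean distance inside y
  (n * m / (n + m)) * (2 * between - withinX - withinY)
}

# Permutation p-value: reshuffle the pooled rows and recompute the statistic
energyPermTest <- function(x, y, permNr = 200) {
  pooled <- rbind(x, y)
  n <- nrow(x)
  N <- nrow(pooled)
  observed <- energyStat(x, y)
  permuted <- replicate(permNr, {
    idx <- sample(N)
    energyStat(pooled[idx[1:n], , drop = FALSE],
               pooled[idx[-(1:n)], , drop = FALSE])
  })
  # share of permuted statistics at least as large as the observed one
  (1 + sum(permuted >= observed)) / (permNr + 1)
}

# toy data: 20 rows before and 30 rows after a mean shift of 3
set.seed(1)
before <- matrix(rnorm(20 * 5), 20, 5)
after <- matrix(rnorm(30 * 5, mean = 3), 30, 5)
pShift <- energyPermTest(before, after)
```

A small p-value suggests that the two segments come from different distributions. Because the statistic relies only on pairwise distances, no distributional form has to be specified.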

2. Installation
Two installation methods are available.

The first method is to install the package from CRAN [link]:
    install.packages("EnergyOnlineCPM")
    library(EnergyOnlineCPM)

The second method is to install the package from GitHub [link]:
    install.packages("devtools")
    library(devtools)
    install_github("YafeiXu/EnergyOnlineCPM")
    library(EnergyOnlineCPM)


4. Example: The following simulation study applies the proposed SPC control chart to a five-dimensional time series with two change points, both of them mean shifts: from a baseline Gaussian to a shifted Gaussian and back.

    library(MASS)             # provides mvrnorm
    library(EnergyOnlineCPM)  # provides maxEnergyCPMv

    # number of observations to simulate per series
    simNr=300

    # simulate a 5-dimensional Gaussian series of length 300
    # with mean 1 in every component and identity covariance
    Sigma2 <- matrix(c(1,0,0,0,0, 0,1,0,0,0, 0,0,1,0,0, 0,0,0,1,0, 0,0,0,0,1),5,5)
    Mean2=rep(1,5)
    sim2=(mvrnorm(n = simNr, Mean2, Sigma2))

    # simulate a 5-dimensional standard Gaussian series of length 300
    Sigma3 <- matrix(c(1,0,0,0,0, 0,1,0,0,0, 0,0,1,0,0, 0,0,0,1,0, 0,0,0,0,1),5,5)
    Mean3=rep(0,5)
    sim3=(mvrnorm(n = simNr, Mean3, Sigma3))

    # construct a data set of length 90:
    # the first 20 points come from the mean-1 Gaussian,
    # the next 30 points from a Gaussian shifted to mean 555,
    # the last 40 points again from the mean-1 Gaussian.
    # rows 21:60 of sim2 are used for the last segment so that no row of
    # sim2 appears twice (the package requires independent observations).
    data1=rbind(sim2[1:20,],(sim3+555)[1:30,],sim2[21:60,])

    # set the warm-up number to 20, the number of permutations to 200,
    # and the significance level to 0.005
    wNr=20
    permNr=200
    alpha=1/200
    maxEnergyCPMv(data1,wNr,permNr,alpha)


5. Value: Note that the function maxEnergyCPMv uses two special numbers in its return value: 2600000 indicates that the function has reached the end of the sequence, and 1999 indicates that the function encountered a numerical problem during the computation. Either number may appear in the value returned by maxEnergyCPMv.
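
One way to keep these two special numbers from being misread as ordinary results is to separate them out before further processing. The sketch below assumes that the values of interest have already been extracted from the return value into a plain numeric vector. The helper `dropSpecialCodes`, the constant names, and the example vector are hypothetical and are not part of the package.

```r
# Special numbers described in this README
END_OF_SEQUENCE <- 2600000
NUMERICAL_PROBLEM <- 1999

# Split a numeric vector into ordinary values and flags for the special numbers
dropSpecialCodes <- function(x) {
  x <- as.numeric(x)
  list(
    values = x[!(x %in% c(END_OF_SEQUENCE, NUMERICAL_PROBLEM))],
    reachedEnd = any(x == END_OF_SEQUENCE),
    numericalProblem = any(x == NUMERICAL_PROBLEM)
  )
}

# made-up vector for illustration only, not real package output
exampleValues <- c(21, 52, 2600000)
cleaned <- dropSpecialCodes(exampleValues)
```

Here `cleaned$values` keeps only the ordinary entries, while `cleaned$reachedEnd` and `cleaned$numericalProblem` record whether either special number was present.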

6. Author: Please contact Yafei.Xu@hotmail.de [homepage]