## Change Point Analysis

### Introduction

This is an open collaborative project; please contact Taha Kass-Hout, MD, MS for contributor access to the site...

#### Purpose

Change point analysis (CPA) aims to detect any change in the mean of a process in historical data.

#### Example questions CPA can answer

- Did a change occur?
- Did more than one change occur?
- When did the changes occur?
- How confident are we that they are real changes?

#### Assumptions

CPA assumes that the observations in the process (time series) are identically distributed and independent (or at least that there is no strong autocorrelation).

#### Pros

- No specific distribution is assumed.
- It can handle all types of time-ordered data, including data from non-normal distributions, ill-behaved data such as particle counts and complaint data, and data with outliers.
- If CPA is applied to the ranks, it provides results that are robust to outliers.
- CPA can detect subtle changes that control charts may miss. Thus, CPA and control charts can be used in a complementary fashion (see below).
- CPA better characterizes the detected changes by providing associated confidence levels, and confidence intervals (CIs) for the times of the changes.

#### Cons

- It is not a monitoring tool but a tool for analyzing historical time-ordered data.
- It is less efficient at detecting isolated abnormal points than methods like C2/W2 or control charts such as CUSUM.
- If there is too much autocorrelation in the data, some changes could be confused with autoregressive effects.
- The bootstrapping approach used in CPA will not produce identical results each time it is performed.

#### How to Calculate CPA

1. Determine the series mean.
2. Accumulate the running sum of the differences between the mean and the individual values.
3. Plot the CUSUM series.
4. The point farthest from 0 denotes a change point (CP).
5. Break the series into two sections at the CP and analyze each subseries for additional significant CPs.

Bootstrapping provides a measure of each CP's significance.

#### CPA versus Aberration Detection

Aberration detection algorithms detect isolated or grouped abnormalities in order to catch major changes quickly. They find abnormalities by updating the data collection one time point at a time, and they control the change-wise errors.

CPA, on the other hand, uses a recursive algorithm to detect multiple change points (orange vertical lines): it repeatedly splits a given time series into two sub-series and applies the CPA algorithm to each sub-series to find a change point based on the cumulative sums of that sub-series. A change point indicates that the series mean shifts from its previous value to another. The green piecewise-constant lines represent the mean shifts.

Aberration detection algorithms are generally better at detecting isolated or grouped abnormalities, while the CPA algorithm is better at detecting subtle changes that aberration methods may miss. The two methods can be used in a complementary fashion to gain a better understanding of the data.
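
The calculation steps described above (series mean, cumulative sum of deviations, point farthest from 0, recursive splitting, bootstrapping) can be sketched in Python with NumPy. This is a minimal illustration, not a reference implementation. The function names `cusum`, `detect_change_point`, and `find_change_points`, and the parameters `n_bootstrap`, `min_confidence`, and `min_size`, are hypothetical names chosen for this example. The confidence measure used here (the fraction of shuffled series whose CUSUM range is smaller than that of the original series) is one common choice and is an assumption, since the text does not specify how the bootstrap significance is computed.

```python
import numpy as np


def cusum(values):
    """Cumulative sum of deviations from the series mean, starting at 0."""
    values = np.asarray(values, dtype=float)
    return np.concatenate(([0.0], np.cumsum(values - values.mean())))


def detect_change_point(values, n_bootstrap=1000, rng=None):
    """Return (index, confidence) for the single most likely change point.

    index is the position of the first observation after the change.
    confidence is the fraction of shuffled series whose CUSUM range is
    smaller than the original range (an assumed significance measure).
    """
    rng = np.random.default_rng(rng)
    values = np.asarray(values, dtype=float)
    s = cusum(values)
    index = int(np.argmax(np.abs(s)))  # CUSUM point farthest from 0
    s_range = s.max() - s.min()
    smaller = 0
    for _ in range(n_bootstrap):
        # Shuffling removes any ordering, so it mimics "no change occurred".
        b = cusum(rng.permutation(values))
        if b.max() - b.min() < s_range:
            smaller += 1
    return index, smaller / n_bootstrap


def find_change_points(values, min_confidence=0.95, min_size=5,
                       n_bootstrap=1000, rng=None, offset=0):
    """Recursively split the series at significant change points.

    Returns a list of (index, confidence) pairs sorted by index, where
    index refers to the position in the original series.
    """
    rng = np.random.default_rng(rng)
    values = np.asarray(values, dtype=float)
    if len(values) < 2 * min_size:
        return []  # too short to split into two usable sub-series
    index, confidence = detect_change_point(values, n_bootstrap, rng)
    if confidence < min_confidence or index in (0, len(values)):
        return []
    left = find_change_points(values[:index], min_confidence, min_size,
                              n_bootstrap, rng, offset)
    right = find_change_points(values[index:], min_confidence, min_size,
                               n_bootstrap, rng, offset + index)
    return left + [(offset + index, confidence)] + right


# Illustrative data only: 20 values around 10 followed by 20 values around 20,
# so the mean shift occurs at index 20 by construction.
low = [10, 11, 9, 10, 12, 9, 11, 10, 9, 11]
series = low * 2 + [v + 10 for v in low] * 2
for index, confidence in find_change_points(series, rng=0):
    print(f"change at index {index}, confidence {confidence:.3f}")
```

Because the bootstrap relies on random shuffles, the reported confidence can vary slightly between runs unless a seed is passed through `rng`, which matches the caveat listed under the cons. To make the result robust to outliers, the same functions could be applied to the ranks of the values instead of the raw values.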