Change Point Detection in R
What is the difference between a change point and an outlier?
To answer this question, we first need to understand what a change point in a time series is.
Changepoints are also known as:
- breakpoints
- segmentation
- structural breaks
- regime switching
- detecting disorder
and can be found in a wide range of literature including
- quality control
- economics
- medicine
- environment
- linguistics
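For data y_1, ..., y_n, a changepoint at time τ means that y_1, ..., y_τ differ in some way from y_{τ+1}, ..., y_n.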
There are many different types of change.
Thus a changepoint model for a change in mean has the following formulation:
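One common way to write a change-in-mean model is sketched here. The notation is an assumption for illustration: y_t is the observation at time t, m is the number of changepoints, τ_1 < ... < τ_m are their locations, μ_j is the mean of segment j, and ε_t is zero-mean noise.

```latex
y_t = \mu_j + \varepsilon_t, \qquad \tau_{j-1} < t \le \tau_j, \quad j = 1, \dots, m+1,
\qquad \text{with } \tau_0 = 0 \text{ and } \tau_{m+1} = n .
```

With a single changepoint (m = 1) this reduces to mean μ_1 up to time τ_1 and mean μ_2 afterwards.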
What is the goal?
- Has a change occurred?
- If yes, where is the change?
- What is the difference between the pre-change and post-change data?
  - Maybe this is the type of change
  - Maybe it is the parameter values before and after the change
- What is the probability that a change has occurred?
- How certain are we of the changepoint location?
- How many changes have occurred (+ all the above for each change)?
- Why has there been a change?
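The first few questions can be illustrated with a minimal base R sketch for a single change in mean. The helper `find_mean_change()`, its `min_seg` argument, and the simulated data are hypothetical and exist only for illustration. The sketch picks the split that minimises the residual sum of squares. It does not give a probability of change, a measure of location uncertainty, or handle multiple changes.

```r
# Hypothetical helper: least-squares search for a single change in mean.
# min_seg is an assumed minimum number of points on each side of the split.
find_mean_change <- function(y, min_seg = 2) {
  n <- length(y)
  stopifnot(n >= 2 * min_seg)
  candidates <- min_seg:(n - min_seg)
  # Residual sum of squares when the series is split after position tau
  rss <- vapply(candidates, function(tau) {
    left <- y[1:tau]
    right <- y[(tau + 1):n]
    sum((left - mean(left))^2) + sum((right - mean(right))^2)
  }, numeric(1))
  tau_hat <- candidates[which.min(rss)]
  list(
    tau = tau_hat,                               # where is the change?
    mean_before = mean(y[1:tau_hat]),            # parameter before the change
    mean_after = mean(y[(tau_hat + 1):n]),       # parameter after the change
    rss_no_change = sum((y - mean(y))^2),        # fit with no changepoint
    rss_one_change = min(rss)                    # fit with one changepoint
  )
}

set.seed(1)
# Simulated data (illustrative only): the mean shifts from 0 to 3 after t = 60
y <- c(rnorm(60, mean = 0), rnorm(40, mean = 3))
fit <- find_mean_change(y)
fit$tau
c(fit$mean_before, fit$mean_after)
```

A large drop from `rss_no_change` to `rss_one_change` suggests a change, but turning that into a formal decision needs a test or a penalty, which this sketch leaves out.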
Online vs Offline
- Online
  - Processes data as it arrives or in batches
  - Goal is quickest detection of a change
  - Often used in process control, intrusion detection
- Offline
  - Processes all the data in one go
  - Goal is accurate detection of a change
  - Often used in genome analysis, audiology
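To make the online setting concrete, here is a minimal base R sketch of a one-sided CUSUM detector, a classic process-control technique. It is swapped in purely for illustration and is not a method named in these notes. The helper `cusum_online()` and the values of `mu0` (assumed in-control mean), `k` (allowance), and `h` (alarm threshold) are assumptions.

```r
# Hypothetical helper: one-sided CUSUM for an upward shift in mean.
# Observations are consumed one at a time, as they would arrive in a stream.
cusum_online <- function(stream, mu0, k, h) {
  s <- 0
  for (t in seq_along(stream)) {
    # Accumulate evidence of an upward shift; reset to 0 when evidence is negative
    s <- max(0, s + (stream[t] - mu0 - k))
    if (s > h) return(t)  # raise the alarm at the first threshold crossing
  }
  NA_integer_             # no alarm raised
}

set.seed(2)
# Simulated stream (illustrative only): the mean shifts from 0 to 2 after t = 100
stream <- c(rnorm(100, mean = 0), rnorm(50, mean = 2))
alarm <- cusum_online(stream, mu0 = 0, k = 0.5, h = 8)
alarm
```

The alarm time reflects the online trade-off: a lower `h` detects the change sooner but raises more false alarms. An offline method would instead look at all 150 points at once to locate the change as accurately as possible.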
Change point detection function using the ecp package
library(zeta.forecast)
zetaEDA::enable_zeta_ggplot_theme()
plot_ts_change_point(eg_diamond_ts)
## # A tibble: 2 × 3
##   time        value cpt
##   <date>      <dbl> <fct>
## 1 2013-12-01 652769 yes
## 2 2019-09-01 305745 yes
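How `plot_ts_change_point()` calls ecp internally is not shown here. As a rough sketch, a direct call to the package's `e.divisive()` function might look like the following. The simulated data and the argument values are assumptions chosen for illustration, and the ecp package must be installed.

```r
library(ecp)

set.seed(100)
# Simulated series (illustrative only): the mean shifts after observation 100
x <- matrix(c(rnorm(100, mean = 0), rnorm(100, mean = 3)), ncol = 1)

# e.divisive() expects a matrix with one row per observation.
# sig.lvl: significance level of the permutation test
# R:       maximum number of permutations
# min.size: minimum number of observations between changepoints
fit <- e.divisive(X = x, sig.lvl = 0.05, R = 199, min.size = 30, alpha = 1)

# Estimated segment boundaries; in the package's documented example these
# include the first index and one past the last observation
fit$estimates
```

Because the significance test is permutation based, results can vary slightly between runs unless a seed is set.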