Change Point Detection in R

What is the difference between a change point and an outlier?

To answer this question, we first need to understand what a change point is in a time series.

Changepoints are also known as:

  • breakpoints
  • segmentation
  • structural breaks
  • regime switching
  • detecting disorder

and can be found in a wide range of literature including

  • quality control
  • economics
  • medicine
  • environment
  • linguistics
  • \(\cdots\)

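For data \(y_1, \ldots, y_n\), if a changepoint exists at \(\tau\), then \(y_1,\ldots,y_{\tau}\) differ from \(y_{\tau+1},\ldots,y_n\) in some way. An outlier, by contrast, is an isolated observation that departs from its neighbours while the underlying process stays the same; after a changepoint, the new behaviour persists.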

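There are many different types of change, such as a change in mean, in variance, or in trend.

For example, a changepoint model for a change in mean with \(m\) changepoints \(\tau_1 < \cdots < \tau_m\) has the following formulation for the level of the series (noise around each level is omitted):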

\[ y_t = \left\{ \begin{array}{lcl} \mu_1 & \mbox{if} & 1\leq t \leq \tau_1 \\ \mu_2 & \mbox{if} & \tau_1 < t \leq \tau_2 \\ \vdots & & \vdots \\ \mu_{m+1} & \mbox{if} & \tau_m < t \leq \tau_{m+1}=n \end{array} \right. \]
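To make the formulation concrete, here is a minimal sketch in base R that simulates a series from a change-in-mean model. The sample size, changepoint locations, segment means, and the Gaussian noise are all illustrative assumptions, not values from this post.

```r
# Simulate a series from the change-in-mean model with m = 2 changepoints.
# All values below (n, tau, mu, noise sd) are illustrative assumptions.
set.seed(1)
n   <- 300
tau <- c(100, 200, n)   # tau_1, tau_2, and tau_{m+1} = n
mu  <- c(0, 3, 1)       # mu_1, mu_2, mu_3

# Segment lengths are the gaps between consecutive changepoints.
seg_len <- diff(c(0, tau))
mean_t  <- rep(mu, times = seg_len)    # piecewise-constant mean
y       <- mean_t + rnorm(n, sd = 1)   # observations = level + noise

# The sample mean of each segment should sit close to its mu.
segment   <- rep(seq_along(mu), times = seg_len)
seg_means <- as.numeric(tapply(y, segment, mean))
print(round(seg_means, 2))
```

Plotting `y` against time with `plot(y, type = "l")` shows the two level shifts that a changepoint method is meant to recover.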

What is the goal?

  • Has a change occurred?
  • If yes, where is the change?
  • What is the difference between the pre-change and post-change data?
    • Maybe this is the type of change
    • Maybe it is the parameter values before and after the change
  • What is the probability that a change has occurred?
  • How certain are we of the changepoint location?
  • How many changes have occurred (+ all the above for each change)?
  • Why has there been a change?
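Two of these questions, where the change is and what the parameter values are before and after it, can be sketched for the simplest case of a single change in mean. The function below is a hypothetical helper (`single_mean_change` is not from any package) that picks the split minimising the total residual sum of squares; it is one possible approach, not necessarily the one used later in this post.

```r
# Least-squares estimate of a single change in mean (hypothetical helper).
single_mean_change <- function(y) {
  n <- length(y)
  stopifnot(n >= 2)
  cs  <- cumsum(y)
  cs2 <- cumsum(y^2)
  k   <- 1:(n - 1)   # candidate changepoint locations

  # Residual sum of squares of a segment: sum(y^2) - (sum(y))^2 / length
  rss_left  <- cs2[k] - cs[k]^2 / k
  rss_right <- (cs2[n] - cs2[k]) - (cs[n] - cs[k])^2 / (n - k)
  rss_null  <- cs2[n] - cs[n]^2 / n   # cost of fitting no change at all

  total   <- rss_left + rss_right
  tau_hat <- which.min(total)
  list(
    tau         = tau_hat,                                   # where?
    mean_before = cs[tau_hat] / tau_hat,                     # pre-change mean
    mean_after  = (cs[n] - cs[tau_hat]) / (n - tau_hat),     # post-change mean
    rss_drop    = rss_null - total[tau_hat]                  # evidence for a change
  )
}

# Simulated example (assumed values): true change after t = 100.
set.seed(42)
y   <- c(rnorm(100, mean = 0), rnorm(100, mean = 3))
fit <- single_mean_change(y)
str(fit)
```

Deciding whether a change has occurred at all requires comparing `rss_drop` with a threshold or penalty, and quantifying uncertainty in `tau` needs more machinery; both are left out of this sketch.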

Online vs Offline

  • Online
    • Processes data as it arrives or in batches
    • Goal is quickest detection of a change
    • Often used in process control, intrusion detection
  • Offline
    • Processes all the data in one go
    • Goal is accurate detection of a change
    • Often used in genome analysis, audiology
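As an illustration of the online setting, the sketch below runs a one-sided CUSUM detector over a stream one observation at a time and stops at the first alarm. CUSUM is a standard online method chosen here for illustration; the function name `cusum_online` and the tuning values `mu0`, `k`, and `h` are assumptions.

```r
# One-sided CUSUM for an upward shift in mean, fed one observation at a time.
# mu0 (in-control mean), k (allowance) and h (alarm threshold) are assumed values.
cusum_online <- function(stream, mu0 = 0, k = 1, h = 6) {
  s <- 0
  for (t in seq_along(stream)) {
    # Accumulate evidence above mu0 + k; reset to zero when it goes negative.
    s <- max(0, s + (stream[t] - mu0 - k))
    if (s > h) return(t)   # raise the alarm as soon as the threshold is crossed
  }
  NA_integer_              # no alarm raised
}

# Simulated stream (assumed values): the mean shifts from 0 to 2 after t = 200.
set.seed(7)
stream <- c(rnorm(200, mean = 0), rnorm(50, mean = 2))
alarm  <- cusum_online(stream)
print(alarm)
```

A lower `h` shortens the detection delay but raises the false alarm rate, which is the trade-off behind "quickest detection". An offline method would instead look at all 250 points before deciding.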

Change point detection function using the ecp package

library(zeta.forecast)
zetaEDA::enable_zeta_ggplot_theme()
plot_ts_change_point(eg_diamond_ts)

## # A tibble: 2 × 3
##   time        value cpt  
##   <date>      <dbl> <fct>
## 1 2013-12-01 652769 yes  
## 2 2019-09-01 305745 yes
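For readers without access to `zeta.forecast`, the following sketch calls `ecp::e.divisive()` directly on simulated data. How `plot_ts_change_point()` uses ecp internally is not shown in this post, so this is only an assumed stand-in, and the data and settings below are illustrative.

```r
# Assumed stand-in for the call above: uses the ecp package directly on
# simulated data, since zeta.forecast and eg_diamond_ts may not be available.
if (requireNamespace("ecp", quietly = TRUE)) {
  set.seed(123)
  n <- 120
  x <- c(rnorm(60, mean = 0), rnorm(60, mean = 5))   # change after t = 60

  # e.divisive() expects a matrix with one row per observation.
  fit <- ecp::e.divisive(matrix(x, ncol = 1), sig.lvl = 0.05, R = 199,
                         min.size = 30, alpha = 1)

  # fit$estimates holds segment start indices, including 1 and n + 1,
  # so drop the two end markers to keep only the interior changepoints.
  cpts <- setdiff(fit$estimates, c(1, n + 1))
  print(cpts)
}
```

Because the significance test is permutation-based, results can vary slightly between runs unless a seed is set.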
Chen Xing
Founder & Data Scientist
