Change Point Detection in R

What is the difference between a change point and an outlier?

To answer this question, we first need to understand what a change point is in a time series.

Changepoints are also known as:

  • breakpoints
  • segmentation
  • structural breaks
  • regime switching
  • detecting disorder

and can be found in a wide range of literature including

  • quality control
  • economics
  • medicine
  • environment
  • linguistics

For data $y_1, \ldots, y_n$, if a changepoint exists at $\tau$, then $y_1, \ldots, y_\tau$ differ from $y_{\tau+1}, \ldots, y_n$ in some way.

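There are many different types of change, for example a change in mean, in variance, or in trend.

Taking the simplest case, a changepoint model for a change in mean with $m$ changepoints $\tau_1, \ldots, \tau_m$ has the following formulation: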

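$$
y_t = \begin{cases}
\mu_1 & \text{if } 1 \le t \le \tau_1 \\
\mu_2 & \text{if } \tau_1 < t \le \tau_2 \\
\vdots & \\
\mu_{m+1} & \text{if } \tau_m < t \le \tau_{m+1} = n
\end{cases}
$$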
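To make the formulation concrete, here is a minimal base R sketch that simulates a series with a piecewise-constant mean. The changepoint locations, the segment means, and the added Gaussian noise are all illustrative assumptions; the formula itself only describes the mean level in each segment.

```r
set.seed(1)

# Hypothetical layout: m = 2 changepoints at tau = (100, 200), series length n = 300
tau <- c(100, 200)
n   <- 300
mu  <- c(0, 3, 1)                 # mu_1, ..., mu_{m+1}

# Segment lengths are the gaps between the boundaries 0, tau_1, ..., tau_m, n
seg_len <- diff(c(0, tau, n))

# Piecewise-constant mean, plus unit-variance Gaussian noise (an assumption)
signal <- rep(mu, times = seg_len)
y      <- signal + rnorm(n)

plot(y, type = "l", main = "Simulated change in mean")
abline(v = tau, col = "red", lty = 2)   # mark the true changepoints
```

The dashed lines mark the last observation of each segment, matching the convention $\tau_k < t \le \tau_{k+1}$ used above.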

What is the goal?

  • Has a change occurred?
  • If yes, where is the change?
  • What is the difference between the pre-change and post-change data?
    • Maybe this is the type of change
    • Maybe it is the parameter values before and after the change
  • What is the probability that a change has occurred?
  • How certain are we of the changepoint location?
  • How many changes have occurred (+ all the above for each change)?
  • Why has there been a change?
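Some of these questions can be illustrated with a small base R sketch for the single change-in-mean case. It is a plain least-squares search, not the method of any particular package, and the function name `find_single_cpt` and its `min_seg` argument are hypothetical. It estimates where the change is and what the means are before and after it; deciding whether a change has occurred at all would additionally require comparing the gain in fit against a penalty or a permutation test.

```r
# Least-squares estimate of a single change in mean (illustrative sketch).
# For each candidate split k, fit one mean to y[1:k] and another to y[(k+1):n],
# then keep the split with the smallest total residual sum of squares (RSS).
find_single_cpt <- function(y, min_seg = 5) {
  n <- length(y)
  stopifnot(n >= 2 * min_seg)
  candidates <- min_seg:(n - min_seg)
  rss <- function(x) sum((x - mean(x))^2)
  cost <- vapply(candidates,
                 function(k) rss(y[1:k]) + rss(y[(k + 1):n]),
                 numeric(1))
  k_hat <- candidates[which.min(cost)]
  list(
    location  = k_hat,                     # where is the change?
    mean_pre  = mean(y[1:k_hat]),          # parameter value before the change
    mean_post = mean(y[(k_hat + 1):n]),    # parameter value after the change
    rss_gain  = rss(y) - min(cost)         # improvement in fit from allowing one change
  )
}

# Noiseless toy series: mean 0 for 50 points, then mean 5 for 50 points
y   <- c(rep(0, 50), rep(5, 50))
fit <- find_single_cpt(y)
fit$location   # 50
```

Questions about uncertainty in the location, or about multiple changes, need more machinery than this sketch provides.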

Online vs Offline

  • Online
    • Processes data as it arrives or in batches
    • Goal is quickest detection of a change
    • Often used in process control, intrusion detection
  • Offline
    • Processes all the data in one go
    • Goal is accurate detection of a change
    • Often used in genome analysis, audiology
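As a rough illustration of the online setting, the sketch below implements a one-sided CUSUM detector in base R that looks at one observation at a time and stops at the first alarm. The function name `cusum_online` and the values of `mu0`, `k`, and `h` are assumptions chosen for the toy example; in practice they would be tuned from historical data.

```r
# One-sided CUSUM for an upward shift in mean, processing data as it arrives.
# mu0: assumed in-control mean; k: allowance; h: alarm threshold.
cusum_online <- function(stream, mu0 = 0, k = 0.5, h = 5) {
  s <- 0
  for (t in seq_along(stream)) {
    s <- max(0, s + (stream[t] - mu0 - k))  # accumulate evidence of an upward shift
    if (s > h) return(t)                    # raise the alarm at the first crossing
  }
  NA_integer_                               # no alarm raised
}

# Noiseless toy stream: the mean jumps from 0 to 2 after t = 30
stream <- c(rep(0, 30), rep(2, 30))
cusum_online(stream)   # 34, i.e. a detection delay of 4 observations
```

The alarm time is later than the true change, which reflects the online trade-off: the detector aims for a short delay rather than an exact location. An offline method sees the whole series at once and can instead search over all candidate locations.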

Change point detection function using the ecp package

library(zeta.forecast)
zetaEDA::enable_zeta_ggplot_theme()
plot_ts_change_point(eg_diamond_ts)

## # A tibble: 2 × 3
##   time        value cpt  
##   <date>      <dbl> <fct>
## 1 2013-12-01 652769 yes  
## 2 2019-09-01 305745 yes
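For readers without access to the helper packages used above, the following sketch calls `e.divisive()` from the ecp package directly on a simulated series. It is not necessarily what `plot_ts_change_point()` does internally; the simulated data and the tuning values (`sig.lvl`, `R`, `min.size`, `alpha`) are illustrative assumptions.

```r
# Sketch: divisive changepoint estimation with ecp::e.divisive on simulated data.
if (requireNamespace("ecp", quietly = TRUE)) {
  set.seed(42)
  # Assumed toy data: mean 0 for 60 points, then mean 4 for 60 points
  y <- c(rnorm(60, mean = 0), rnorm(60, mean = 4))

  # e.divisive expects a matrix with one row per observation
  fit <- ecp::e.divisive(matrix(y, ncol = 1),
                         sig.lvl = 0.05, R = 199, min.size = 20, alpha = 1)

  # `estimates` also contains the boundaries 1 and n + 1; drop them
  cpts <- setdiff(fit$estimates, c(1, length(y) + 1))
  print(cpts)
}
```

Because significance is assessed with random permutations, the reported changepoints can vary slightly between runs unless the seed is fixed.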
Chen Xing
Founder & Data Scientist
