Journal ArticleDOI

A survey of methods for time series change point detection

01 May 2017-Knowledge and Information Systems (Springer London)-Vol. 51, Iss: 2, pp 339-367
TL;DR: This survey article enumerates, categorizes, and compares many of the methods that have been proposed to detect change points in time series, and presents some grand challenges for the community to consider.
Abstract: Change points are abrupt variations in time series data. Such abrupt changes may represent transitions that occur between states. Detection of change points is useful in modelling and prediction of time series and is found in application areas such as medical condition monitoring, climate change detection, speech and image analysis, and human activity analysis. This survey article enumerates, categorizes, and compares many of the methods that have been proposed to detect change points in time series. The methods examined include both supervised and unsupervised algorithms that have been introduced and evaluated. We introduce several criteria to compare the algorithms. Finally, we present some grand challenges for the community to consider.
Citations
Journal ArticleDOI
TL;DR: In this article, the authors present a selective survey of algorithms for the offline detection of multiple change points in multivariate time series, and a general yet structuring methodological strategy is adopted to organize this vast body of work.

506 citations

Journal ArticleDOI
01 Jan 2020
TL;DR: The results allow for moderate optimism related to the gradual lifting of social distance measures in the general population, and call for specific attention to the protection of focal micro-societies enriching high-risk elderly subjects, including nursing homes and chronic care facilities.
Abstract: Following the introduction of unprecedented "stay-at-home" national policies, the COVID-19 pandemic recently started declining in Europe. Our research aims were to characterize the changepoint in the flow of the COVID-19 epidemic in each European country and to evaluate the association of the level of social distancing with the observed decline in the national epidemics. Interrupted time series analyses were conducted in 28 European countries. A social distance index was calculated based on Google Community Mobility Reports. Changepoints were estimated by threshold regression, national findings were analyzed by Poisson regression, and the effect of social distancing by a mixed effects Poisson regression model. Our findings identified the most probable changepoints in 28 European countries. Before the changepoint, the incidence of new COVID-19 cases grew by 24% per day on average. From the changepoint onward, this growth rate was reduced to a 0.9% and a 0.3% increase, and to a 0.7% and a 1.7% decrease, across increasing social distancing quartiles. The beneficial effect of the higher social distance quartiles (i.e., turning the increase into decline) was statistically significant for the fourth quartile. Notably, many countries in lower quartiles also achieved a flat epidemic curve. In these countries, other plausible COVID-19 containment measures could have contributed to controlling the first wave of the disease. The association of social distance quartiles with viral spread could also be hindered by local bottlenecks in infection control. Our results allow for moderate optimism related to the gradual lifting of social distance measures in the general population, and call for specific attention to the protection of focal micro-societies enriching high-risk elderly subjects, including nursing homes and chronic care facilities.

120 citations

Posted Content
TL;DR: This study shows that binary segmentation and Bayesian online change point detection are among the best performing methods.
Abstract: Change point detection is an important part of time series analysis, as the presence of a change point indicates an abrupt and significant change in the data generating process. While many algorithms for change point detection exist, little attention has been paid to evaluating their performance on real-world time series. Algorithms are typically evaluated on simulated data and a small number of commonly-used series with unreliable ground truth. Clearly this does not provide sufficient insight into the comparative performance of these algorithms. Therefore, instead of developing yet another change point detection method, we consider it vastly more important to properly evaluate existing algorithms on real-world data. To achieve this, we present the first data set specifically designed for the evaluation of change point detection algorithms, consisting of 37 time series from various domains. Each time series was annotated by five expert human annotators to provide ground truth on the presence and location of change points. We analyze the consistency of the human annotators, and describe evaluation metrics that can be used to measure algorithm performance in the presence of multiple ground truth annotations. Subsequently, we present a benchmark study where 14 existing algorithms are evaluated on each of the time series in the data set. This study shows that binary segmentation (Scott and Knott, 1974) and Bayesian online change point detection (Adams and MacKay, 2007) are among the best performing methods. Our aim is that this data set will serve as a proving ground in the development of novel change point detection algorithms.
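The binary segmentation method highlighted in this benchmark can be sketched in a few lines: greedily split whichever segment offers the largest reduction in squared-error cost under a piecewise-constant (mean) fit. The following is a minimal illustrative sketch, not the Scott and Knott implementation, and it omits stopping criteria and penalties (the number of changes is given directly):

```python
import numpy as np

def best_split(y):
    """Return the split index minimizing total squared error of a
    piecewise-constant (mean) fit, plus the resulting cost."""
    best_t, best_cost = None, np.inf
    for t in range(1, len(y)):
        cost = ((y[:t] - y[:t].mean())**2).sum() + ((y[t:] - y[t:].mean())**2).sum()
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

def binary_segmentation(y, n_changes):
    """Greedy binary segmentation: repeatedly split the segment whose
    best split yields the largest cost reduction."""
    segments = [(0, len(y))]
    changes = []
    for _ in range(n_changes):
        gains = []
        for a, b in segments:
            if b - a < 2:          # too short to split
                continue
            base = ((y[a:b] - y[a:b].mean())**2).sum()
            t, cost = best_split(y[a:b])
            gains.append((base - cost, a + t, (a, b)))
        gain, t, (a, b) = max(gains)
        changes.append(t)
        segments.remove((a, b))
        segments += [(a, t), (t, b)]
    return sorted(changes)

y = np.concatenate([np.zeros(50), 5 * np.ones(50), np.zeros(50)])
print(binary_segmentation(y, 2))  # → [50, 100]
```

Each split only re-examines the two new sub-segments, which is what makes the greedy scheme fast relative to exhaustive search over all segmentations.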

80 citations


Cites background or methods from "A survey of methods for time series..."



  • ..., Aminikhanghahi and Cook (2017) or Truong et al. (2020). Note that we focus here on the unsupervised change point detection problem, without access to external data that can be used to tune hyperparameters (Hocking et al., 2013a; Truong et al., 2017). Let y_t ∈ Y denote the observations for time steps t = 1, 2, . . ., where the domain Y is d-dimensional and typically assumed to be a subset of R^d. A segment of the series from t = a, a + 1, . . . , b will be written as y_{a:b}. The ordered set of change points is denoted by T = {τ_0, τ_1, . . . , τ_n}, with τ_0 = 1 for notational convenience. In offline change point detection we further use T to denote the length of the series and define τ_{n+1} = T + 1. Note that we assume that a change point marks the first observation of a new segment. Early work on CPD originates from the quality control literature. Page (1954) introduced the CUSUM method that detects where the corrected cumulative sum of observations exceeds a threshold value. Theoretical analysis of this method was subsequently provided by Lorden (1971). Hinkley (1970) generalized this approach to testing for differences in the maximum likelihood, i....

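Under the notation in the excerpt above (τ_0 = 1, τ_{n+1} = T + 1, and each change point marking the first observation of a new segment), converting an ordered change-point set into segments y_{a:b} is mechanical. A small illustrative helper (function and variable names are ours), using the excerpt's 1-based indexing:

```python
def segments_from_changepoints(taus, T):
    """Given ordered change points taus = [tau_1, ..., tau_n] (1-based,
    each marking the first observation of a new segment) and series
    length T, return the (a, b) index pairs of each segment y_{a:b}."""
    bounds = [1] + list(taus) + [T + 1]   # tau_0 = 1, tau_{n+1} = T + 1
    return [(bounds[i], bounds[i + 1] - 1) for i in range(len(bounds) - 1)]

print(segments_from_changepoints([4, 8], T=10))  # → [(1, 3), (4, 7), (8, 10)]
```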

  • ...An alternative view of evaluating CPD algorithms considers change point detection as a classification problem between the “change point” and “non-change point” classes (Killick et al., 2012; Aminikhanghahi and Cook, 2017)....

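Under this classification view, a detection typically counts as a true positive when it falls within a tolerance margin of an as-yet-unmatched ground-truth change point. A minimal sketch of margin-based precision and recall; the margin value and greedy matching rule here are illustrative, not those of any cited paper:

```python
def precision_recall(predicted, truth, margin=5):
    """Margin-based matching: each predicted change point is a true
    positive if it lies within `margin` steps of an unmatched
    ground-truth change point; each truth point matches at most once."""
    matched = set()
    tp = 0
    for p in predicted:
        for t in truth:
            if t not in matched and abs(p - t) <= margin:
                matched.add(t)
                tp += 1
                break
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall

# Two of three predictions land near a true change point: precision 2/3, recall 1.0
print(precision_recall([48, 103, 160], [50, 100], margin=5))
```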

  • ...For a more expansive review of change point detection algorithms, see e.g., Aminikhanghahi and Cook (2017) or Truong et al. (2020)....


Journal ArticleDOI
TL;DR: In this article, a change point detection framework using likelihood ratio, regression structure and a Bayesian change point detector was proposed to quantify the time lag effect reflected in transportation systems when authorities take action in response to the COVID-19 pandemic.
Abstract: The unprecedented challenges caused by the COVID-19 pandemic demand timely action. However, due to the complex nature of policy making, a lag may exist between the time a problem is recognized and the time a policy has its impact on a system. To understand this lag and to expedite decision making, this study proposes a change point detection framework using likelihood ratio, regression structure and a Bayesian change point detection method. The objective is to quantify the time lag effect reflected in transportation systems when authorities take action in response to the COVID-19 pandemic. Using travel patterns as an indicator of policy effectiveness, the length of policy lag and magnitude of policy impacts on the road system, mass transit, and micromobility are investigated through the case studies of New York City (NYC), and Seattle—two U.S. cities significantly affected by COVID-19. The quantitative findings show that the National declaration of emergency had no policy lag while stay-at-home and reopening policies had a lead effect on mobility. The magnitude of impact largely depended on the land use and sociodemographic characteristics of the area, as well as the type of transportation system.
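The likelihood-ratio component of such a framework reduces, in the simplest Gaussian known-variance case, to scanning for the split that most reduces the residual sum of squares relative to a no-change fit. The following is a generic sketch of that idea, not the authors' implementation:

```python
import numpy as np

def lr_scan(y):
    """Scan candidate change points t and return (statistic, t) for the
    split maximizing the mean-shift log-likelihood ratio, which in the
    Gaussian known-variance case is proportional to the reduction in
    residual sum of squares versus the no-change model."""
    n = len(y)
    rss0 = ((y - y.mean())**2).sum()              # no-change model
    stats = []
    for t in range(2, n - 1):
        rss1 = ((y[:t] - y[:t].mean())**2).sum() + ((y[t:] - y[t:].mean())**2).sum()
        stats.append((rss0 - rss1, t))            # larger = stronger evidence
    return max(stats)

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0, 1, 60), rng.normal(3, 1, 60)])
stat, t_hat = lr_scan(y)
print(t_hat)   # should land at or very near the true change point, 60
```

In practice the maximal statistic is compared against a threshold (analytical or simulated) before declaring a change.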

74 citations

Proceedings ArticleDOI
01 Nov 2017
TL;DR: This work presents an algorithm which is domain agnostic, has only one easily determined parameter, and can handle data streaming at a high rate, and is the first to show that semantic segmentation may be possible at superhuman performance levels.
Abstract: Unsupervised semantic segmentation in the time series domain is a much-studied problem due to its potential to detect unexpected regularities and regimes in poorly understood data. However, the current techniques have several shortcomings, which have limited the adoption of time series semantic segmentation beyond academic settings for three primary reasons. First, most methods require setting/learning many parameters and thus may have problems generalizing to novel situations. Second, most methods implicitly assume that all the data is segmentable, and have difficulty when that assumption is unwarranted. Finally, most research efforts have been confined to the batch case, but online segmentation is clearly more useful and actionable. To address these issues, we present an algorithm which is domain agnostic, has only one easily determined parameter, and can handle data streaming at a high rate. In this context, we test our algorithm on the largest and most diverse collection of time series datasets ever considered, and demonstrate our algorithm's superiority over current solutions. Furthermore, we are the first to show that semantic segmentation may be possible at superhuman performance levels.

74 citations


Cites background from "A survey of methods for time series..."

  • ...A good representation of the literature on this problem is surveyed in detail in a recent paper in [1]....


  • ...For clarity, this latter task is sometimes called “semantic segmentation” [1][31]; where there is no danger of confusion, we will refer to it as segmentation....


  • ...In essence, many researchers overlay the results of the segmentation on the original data, and we invite the reader to confirm that it matches human intuition [1][5][17][19]....


  • ...While there are many techniques for segmentation [1][15] [17][19][25], they all have one or more limitations that have prevented their utilization in real world settings....


  • ...• Most research efforts in this domain test on limited datasets [1][15]....


References
Journal ArticleDOI
TL;DR: A unified framework for the design and the performance analysis of the algorithms for solving change detection problems and links with the analytical redundancy approach to fault detection in linear systems are established.
Abstract: This book is downloadable from http://www.irisa.fr/sisthem/kniga/. Many monitoring problems can be stated as the problem of detecting a change in the parameters of a static or dynamic stochastic system. The main goal of this book is to describe a unified framework for the design and the performance analysis of algorithms for solving these change detection problems. The book also contains the key mathematical background necessary for this purpose. Finally, links with the analytical redundancy approach to fault detection in linear systems are established. We call an abrupt change any change in the parameters of the system that occurs either instantaneously or at least very fast with respect to the sampling period of the measurements. Abrupt changes by no means refer to changes with large magnitude; on the contrary, in most applications the main problem is to detect small changes. Moreover, in some applications, the early warning of small, and not necessarily fast, changes is of crucial interest in order to avoid the economic or even catastrophic consequences that can result from an accumulation of such small changes. For example, small faults arising in the sensors of a navigation system can result, through the underlying integration, in serious errors in the estimated position of the plane. Another example is the early warning of small deviations from the normal operating conditions of an industrial process. The early detection of slight changes in the state of the process allows the periods during which the process should be inspected and possibly repaired to be planned more adequately, thus reducing operating costs.

3,830 citations


"A survey of methods for time series..." refers background in this paper

  • ...The most familiar change point algorithm is cumulative sum [7,10,18,33], which accumulates deviations relative to a specified target of incoming measurements and indicates that a change point exists when the cumulative sum exceeds a specified threshold....

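The CUSUM rule described above can be sketched directly: accumulate deviations of incoming measurements from the target (less a small allowance), and signal a change when the running sum crosses the threshold. A one-sided illustrative version (parameter names and values are ours):

```python
def cusum(xs, target, drift, threshold):
    """One-sided CUSUM: accumulate positive deviations from `target`
    (less an allowance `drift`, which suppresses noise); signal a change
    point at the first index where the running sum exceeds `threshold`."""
    s = 0.0
    for i, x in enumerate(xs):
        s = max(0.0, s + (x - target) - drift)
        if s > threshold:
            return i
    return None   # no change detected

data = [0.1, -0.2, 0.0, 0.1, 1.1, 0.9, 1.2, 1.0]
print(cusum(data, target=0.0, drift=0.2, threshold=2.0))  # → 6
```

A two-sided detector simply runs a mirrored accumulator for downward shifts in parallel.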

Book
01 Jan 2000
TL;DR: Characteristics of Time Series * Time Series Regression and ARIMA Models * Dynamic Linear Models and Kalman Filtering * Spectral Analysis and Its Applications.

1,812 citations


"A survey of methods for time series..." refers background in this paper

  • ...Definition 2 A stationary time series is a finite variance process whose statistical properties are all constant over time [57]....

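A quick informal check of this definition is to compare summary statistics across windows of the series: roughly constant means and variances are consistent with (but do not prove) stationarity, while systematic drift rules it out. An illustrative sketch, not a formal test:

```python
import numpy as np

def rolling_stats(y, n_windows=4):
    """Split the series into equal windows and report each window's
    mean and variance; large drift across windows suggests
    non-stationarity (an informal diagnostic, not a formal test)."""
    windows = np.array_split(np.asarray(y), n_windows)
    return [(w.mean(), w.var()) for w in windows]

trend = np.arange(100, dtype=float)   # clearly non-stationary: mean drifts
for m, v in rolling_stats(trend):
    print(f"mean={m:.1f} var={v:.1f}")
# Window means climb (12.0, 37.0, 62.0, 87.0) while each window's
# variance stays at 52.0 — the drifting mean betrays non-stationarity.
```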

01 Jan 2005
TL;DR: A systematic survey of the common processing steps and core decision rules in modern change detection algorithms, including significance and hypothesis testing, predictive models, the shading model, and background modeling is presented.

1,750 citations


"A survey of methods for time series..." refers background in this paper

  • ...Here, the observation at each time point is the digital encoding of an image [47]....


Journal ArticleDOI
TL;DR: In this paper, the authors present a systematic survey of the common processing steps and core decision rules in modern change detection algorithms, including significance and hypothesis testing, predictive models, the shading model, and background modeling.
Abstract: Detecting regions of change in multiple images of the same scene taken at different times is of widespread interest due to a large number of applications in diverse disciplines, including remote sensing, surveillance, medical diagnosis and treatment, civil infrastructure, and underwater sensing. This paper presents a systematic survey of the common processing steps and core decision rules in modern change detection algorithms, including significance and hypothesis testing, predictive models, the shading model, and background modeling. We also discuss important preprocessing methods, approaches to enforcing the consistency of the change mask, and principles for evaluating and comparing the performance of change detection algorithms. It is hoped that our classification of algorithms into a relatively small number of categories will provide useful guidance to the algorithm designer.

1,693 citations

Journal ArticleDOI
TL;DR: A simple and fast computational method, the visibility algorithm, that converts a time series into a graph, which inherits several properties of the series in its structure, enhancing the fact that power law degree distributions are related to fractality.
Abstract: In this work we present a simple and fast computational method, the visibility algorithm, that converts a time series into a graph. The constructed graph inherits several properties of the series in its structure. Thereby, periodic series convert into regular graphs, and random series do so into random graphs. Moreover, fractal series convert into scale-free networks, enhancing the fact that power law degree distributions are related to fractality, something highly discussed recently. Some remarkable examples and analytical tools are outlined to test the method's reliability. Many different measures, recently developed in the complex network theory, could by means of this new approach characterize time series from a new point of view.
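The natural visibility criterion can be written down directly: time points a and b are connected whenever every intermediate sample lies strictly below the straight line joining (a, y[a]) and (b, y[b]). A brute-force O(n²) sketch of the construction described in the abstract:

```python
def visibility_graph(y):
    """Natural visibility graph: nodes are time points; (a, b) is an
    edge if every intermediate point c lies strictly below the line
    joining (a, y[a]) and (b, y[b])."""
    n = len(y)
    edges = []
    for a in range(n):
        for b in range(a + 1, n):
            visible = all(
                # height of the a-b line at time c
                y[c] < y[b] + (y[a] - y[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.append((a, b))
    return edges

print(visibility_graph([3.0, 1.0, 2.0, 0.5, 4.0]))
# → [(0, 1), (0, 2), (0, 4), (1, 2), (2, 3), (2, 4), (3, 4)]
```

Adjacent points are always mutually visible, so the graph is connected by construction; graph-theoretic measures (degree distribution, clustering) can then be read off the edge list.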

1,320 citations


"A survey of methods for time series..." refers methods in this paper

  • ...This graph can be defined based on a minimum spanning tree [26], minimum distance pairing [52], nearest neighbor graph [26,52], or visibility graph [40,66]....
