Optimal Detection of Changepoints With a Linear Computational Cost

doi:10.1080/01621459.2012.737745

Journal Article•DOI•

Rebecca Killick¹, Paul Fearnhead¹, Idris A. Eckley¹•Institutions (1)

Lancaster University¹

17 Oct 2012-Journal of the American Statistical Association (Taylor & Francis Group)-Vol. 107, Iss: 500, pp 1590-1598

TL;DR: This work considers the problem of detecting multiple changepoints in large data sets and introduces a new method for finding the minimum of such cost functions and hence the optimal number and location of changepoints that has a computational cost which is linear in the number of observations.

Abstract: In this article, we consider the problem of detecting multiple changepoints in large datasets. Our focus is on applications where the number of changepoints will increase as we collect more data: for example, in genetics as we analyze larger regions of the genome, or in finance as we observe time series over longer periods. We consider the common approach of detecting changepoints through minimizing a cost function over possible numbers and locations of changepoints. This includes several established procedures for detecting changing points, such as penalized likelihood and minimum description length. We introduce a new method for finding the minimum of such cost functions and hence the optimal number and location of changepoints that has a computational cost, which, under mild conditions, is linear in the number of observations. This compares favorably with existing methods for the same problem whose computational cost can be quadratic or even cubic. In simulation studies, we show that our new method can...
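The pruned dynamic programme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a change-in-mean model with a squared-error segment cost, a fixed penalty `beta` per changepoint, and a pruning constant of zero, and the function name `pelt_mean` is hypothetical.

```python
import numpy as np

def pelt_mean(y, beta):
    """Sketch of a pruned exact search for changes in mean (squared-error cost assumed)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Prefix sums let each segment cost be evaluated in constant time.
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y ** 2)))

    def cost(s, t):
        # Sum of squared deviations of y[s:t] from its own mean.
        seg_sum = s1[t] - s1[s]
        return (s2[t] - s2[s]) - seg_sum ** 2 / (t - s)

    F = np.empty(n + 1)              # F[t]: optimal penalised cost of y[:t]
    F[0] = -beta
    last = np.zeros(n + 1, dtype=int)  # last[t]: most recent changepoint before t
    candidates = [0]                 # changepoint positions that survive pruning

    for t in range(1, n + 1):
        vals = [F[s] + cost(s, t) + beta for s in candidates]
        best = int(np.argmin(vals))
        F[t] = vals[best]
        last[t] = candidates[best]
        # Pruning step: drop any s that can no longer be optimal for later t.
        candidates = [s for s, v in zip(candidates, vals) if v - beta <= F[t]]
        candidates.append(t)

    # Backtrack through the stored changepoints.
    cps = []
    t = last[n]
    while t > 0:
        cps.append(int(t))
        t = last[t]
    return sorted(cps)
```

A changepoint at position `k` here means the segment boundary falls between `y[k-1]` and `y[k]`. Pruning discards candidates without changing the minimum, so the returned segmentation is the same as the unpruned dynamic programme would give.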

Citations

Journal Article•DOI•

changepoint: An R Package for Changepoint Analysis

Rebecca Killick, Idris A. Eckley

25 Jun 2014-Journal of Statistical Software

TL;DR: The changepoint package has been developed to provide users with a choice of multiple changepoint search methods to use in conjunction with a given changepoint method and in particular provides an implementation of the recently proposed PELT algorithm.

Abstract: One of the key challenges in changepoint analysis is the ability to detect multiple changes within a given time series or sequence. The changepoint package has been developed to provide users with a choice of multiple changepoint search methods to use in conjunction with a given changepoint method and in particular provides an implementation of the recently proposed PELT algorithm. This article describes the search methods which are implemented in the package as well as some of the available test statistics whilst highlighting their application with simulated and practical examples. Particular emphasis is placed on the PELT algorithm and how results differ from the binary segmentation approach.

1,068 citations

Cites background or methods from "Optimal Detection of Changepoints W..."

...The PELT algorithm proposed by Killick et al. (2012) is similar to that of the Segment Neighbourhood algorithm since it provides an exact segmentation....
[...]
...The computational complexity of the algorithm is O(n log n) but this speed can come at the expense of accuracy of the resulting changepoints (see Killick et al. (2012) for details)....
[...]
...Over the years several multiple changepoint search algorithms have been proposed to overcome this challenge, most notably the binary segmentation algorithm (Scott and Knott, 1974; Sen and Srivastava, 1975); the Segment Neighbourhood algorithm (Auger and Lawrence, 1989; Bai and Perron, 1998) and more recently the PELT algorithm (Killick et al., 2012)....
[...]
..., detect a changepoint, then we estimate its position as τ̂1 the value of τ1 that maximises ML(τ1). The appropriate value for this parameter c is still an open research question with several authors devising p-values and other information criterion under different types of changes. We refer the interested reader to Guyon and Yao (1999); Chen and Gupta (2000); Lavielle (2005); Birge and Massart (2007) for interesting discussions and suggestions for c....
[...]

Journal Article•DOI•

Selective review of offline change point detection methods

Charles Truong¹, Laurent Oudre, Nicolas Vayatis¹•Institutions (1)

École Normale Supérieure¹

01 Feb 2020-Signal Processing

TL;DR: In this article, the authors present a selective survey of algorithms for the offline detection of multiple change points in multivariate time series, and a general yet structuring methodological strategy is adopted to organize this vast body of work.

506 citations

Journal Article•DOI•

Wild binary segmentation for multiple change-point detection

Piotr Fryzlewicz¹•Institutions (1)

London School of Economics and Political Science¹

01 Dec 2014-Annals of Statistics

TL;DR: Wild binary segmentation (WBS) as discussed by the authors is a new technique for consistent estimation of the number and locations of multiple change-points in data, which does not require the choice of a window or span parameter and does not lead to a significant increase in computational complexity.

Abstract: We propose a new technique, called wild binary segmentation (WBS), for consistent estimation of the number and locations of multiple change-points in data. We assume that the number of change-points can increase to infinity with the sample size. Due to a certain random localisation mechanism, WBS works even for very short spacings between the change-points and/or very small jump magnitudes, unlike standard binary segmentation. On the other hand, despite its use of localisation, WBS does not require the choice of a window or span parameter, and does not lead to a significant increase in computational complexity. WBS is also easy to code. We propose two stopping criteria for WBS: one based on thresholding and the other based on what we term the ‘strengthened Schwarz information criterion’. We provide default recommended values of the parameters of the procedure and show that it offers very good practical performance in comparison with the state of the art. The WBS methodology is implemented in the R package wbs, available on CRAN. In addition, we provide a new proof of consistency of binary segmentation with improved rates of convergence, as well as a corresponding result for WBS.

493 citations

Journal Article•DOI•

Cicero Predicts cis-Regulatory DNA Interactions from Single-Cell Chromatin Accessibility Data.

Hannah A. Pliner¹, Jonathan S. Packer¹, José L. McFaline-Figueroa¹, Darren A. Cusanovich¹, Riza M. Daza¹, Delasa Aghamirzaie¹, Sanjay Srivatsan¹, Xiaojie Qiu¹, Dana Jackson¹, Anna Minkina¹, Andrew Adey², Frank J. Steemers³, Jay Shendure¹,⁴, Cole Trapnell¹•Institutions (4)

University of Washington¹, Oregon Health & Science University², Illumina³, Howard Hughes Medical Institute⁴

06 Sep 2018-Molecular Cell

TL;DR: Cicero is introduced, an algorithm that identifies co-accessible pairs of DNA elements using single-cell chromatin accessibility data and so connects regulatory elements to their putative target genes and is applied to investigate how dynamically accessible elements orchestrate gene regulation in differentiating myoblasts.

488 citations

Journal Article•DOI•

A Nonparametric Approach for Multiple Change Point Analysis of Multivariate Data

David S. Matteson¹, Nicholas A. James•Institutions (1)

Cornell University¹

19 Mar 2014-Journal of the American Statistical Association

TL;DR: The divisive method is shown to provide consistent estimates of both the number and the location of change points under standard regularity assumptions, and methods from cluster analysis are applied to assess performance and to allow simple comparisons of location estimates, even when the estimated number differs.

Abstract: Change point analysis has applications in a wide variety of fields. The general problem concerns the inference of a change in distribution for a set of time-ordered observations. Sequential detection is an online version in which new data are continually arriving and are analyzed adaptively. We are concerned with the related, but distinct, offline version, in which retrospective analysis of an entire sequence is performed. For a set of multivariate observations of arbitrary dimension, we consider nonparametric estimation of both the number of change points and the positions at which they occur. We do not make any assumptions regarding the nature of the change in distribution or any distribution assumptions beyond the existence of the αth absolute moment, for some α ∈ (0, 2). Estimation is based on hierarchical clustering and we propose both divisive and agglomerative algorithms. The divisive method is shown to provide consistent estimates of both the number and the location of change points under standard...

454 citations

Cites methods from "Optimal Detection of Changepoints W..."

...…Cappé 2011), which is based on a generalization of a Wilcoxon/Mann–Whitney (marginal) rank-based approach, the parametric Pruned Exact Linear Time (PELT) procedure (Killick, Fearnhead, and Eckley 2012), and the nonparametric Kernel Change Point (KCP) procedure (Arlot, Celisse, and Harchaoui 2012)....
[...]

References

Journal Article•DOI•

A new look at the statistical model identification

Hirotugu Akaike

01 Dec 1974-IEEE Transactions on Automatic Control

TL;DR: In this article, a new estimate minimum information theoretical criterion estimate (MAICE) is introduced for the purpose of statistical identification, which is free from the ambiguities inherent in the application of conventional hypothesis testing procedure.

Abstract: The history of the development of statistical hypothesis testing in time series analysis is reviewed briefly and it is pointed out that the hypothesis testing procedure is not adequately defined as the procedure for statistical model identification. The classical maximum likelihood estimation procedure is reviewed and a new estimate minimum information theoretical criterion (AIC) estimate (MAICE) which is designed for the purpose of statistical identification is introduced. When there are several competing models the MAICE is defined by the model and the maximum likelihood estimates of the parameters which give the minimum of AIC defined by AIC = (-2)log-(maximum likelihood) + 2(number of independently adjusted parameters within the model). MAICE provides a versatile procedure for statistical model identification which is free from the ambiguities inherent in the application of conventional hypothesis testing procedure. The practical utility of MAICE in time series analysis is demonstrated with some numerical examples.

47,133 citations

"Optimal Detection of Changepoints W..." refers background in this paper

...Examples of such penalties include Akaike’s information criterion (AIC; Akaike 1974) (β = 2p) and Schwarz information criterion (SIC, also known as BIC; Schwarz 1978) (β = p log n, where p is the number of additional parameters introduced by adding a changepoint)....
[...]
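As a small worked illustration of the two penalties quoted above, the sketch below evaluates β = 2p (AIC) and β = p log n (SIC). The function names are hypothetical, and the natural logarithm is assumed for "log".

```python
import math

def aic_penalty(p):
    """AIC penalty per changepoint: beta = 2p."""
    return 2 * p

def sic_penalty(p, n):
    """SIC (BIC) penalty per changepoint: beta = p * log(n), natural log assumed."""
    return p * math.log(n)

# Example: one additional parameter per changepoint, 1000 observations.
p, n = 1, 1000
beta_aic = aic_penalty(p)
beta_sic = sic_penalty(p, n)
```

Under these assumptions the SIC penalty exceeds the AIC penalty whenever log n > 2, that is, for n of 8 or more, so SIC tends to favour fewer changepoints on long series.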

Journal Article•DOI•

Estimating the Dimension of a Model

Gideon Schwarz

01 Mar 1978-Annals of Statistics

TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.

Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.

38,681 citations

Estimating the dimension of a model

Gideon Schwarz

01 Jan 2005

36,760 citations

"Optimal Detection of Changepoints W..." refers background in this paper

...Examples of such penalties include Akaike’s information criterion (AIC; Akaike 1974) (β = 2p) and Schwarz information criterion (SIC, also known as BIC; Schwarz 1978) (β = p log n, where p is the number of additional parameters introduced by adding a changepoint)....
[...]

Journal Article•DOI•

A Cluster Analysis Method for Grouping Means in the Analysis of Variance

A. J. Scott, M. Knott

01 Sep 1974-Biometrics

TL;DR: In this paper, the authors used the techniques of cluster analysis to split the treatments into reasonably homogeneous groups and developed a likelihood ratio test for judging the significance of differences among the resulting groups.

Abstract: It is sometimes useful in an analysis of variance to split the treatments into reasonably homogeneous groups. Multiple comparison procedures are often used for this purpose, but a more direct method is to use the techniques of cluster analysis. This approach is illustrated for several sets of data, and a likelihood ratio test is developed for judging the significance of differences among the resulting groups.

2,491 citations

"Optimal Detection of Changepoints W..." refers background or methods in this paper

...Early applications include Scott and Knott (1974) and Sen and Srivastava (1975). In essence, the method extends any single changepoint method to multiple changepoints by iteratively repeating the method on different subsets of the sequence....
[...]
...The remainder of this section describes two commonly used methods for multiple changepoint detection: BS (Scott and Knott 1974) and SN (Auger and Lawrence 1989)....
[...]
...At the time of writing, binary segmentation (BS) proposed by Scott and Knott (1974) is arguably the most widely used changepoint search method....
[...]
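The idea quoted above, extending a single-changepoint method by repeating it on subsets of the sequence, can be sketched as a short recursion. This is an illustrative version only: it assumes a change-in-mean model with a squared-error cost, accepts a split when the cost reduction exceeds a penalty `beta`, and recomputes segment costs naively rather than efficiently. All function names are hypothetical.

```python
import numpy as np

def sse(y):
    """Squared-error cost of a segment around its own mean."""
    return float(np.sum((y - y.mean()) ** 2)) if len(y) else 0.0

def best_split(y):
    """Single-changepoint step: the split with the largest cost reduction."""
    total = sse(y)
    best_gain, best_k = 0.0, None
    for k in range(1, len(y)):
        gain = total - sse(y[:k]) - sse(y[k:])
        if gain > best_gain:
            best_gain, best_k = gain, k
    return best_k, best_gain

def binary_segmentation(y, beta, offset=0):
    """Apply the single-changepoint step recursively to each half."""
    y = np.asarray(y, dtype=float)
    if len(y) < 2:
        return []
    k, gain = best_split(y)
    # Stop when no split reduces the cost by more than the penalty.
    if k is None or gain <= beta:
        return []
    left = binary_segmentation(y[:k], beta, offset)
    right = binary_segmentation(y[k:], beta, offset + k)
    return left + [offset + k] + right
```

Because each split is chosen greedily and never revisited, the result is not guaranteed to minimise the overall penalised cost, in contrast to an exact search.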