Posted Content

A One-Class Support Vector Machine Calibration Method for Time Series Change Point Detection

TL;DR: In this paper, a heuristic search method is proposed to find a good set of input data and hyperparameters that yields a well-performing model for detecting change points in time series using less training data than state-of-the-art deep learning approaches.
Abstract: It is important to identify the change point of a system's health status, which usually signifies an incipient fault under development. The One-Class Support Vector Machine (OC-SVM) is a popular machine learning model for anomaly detection and hence could be used for identifying change points; however, it is sometimes difficult to obtain a good OC-SVM model that can be used on sensor measurement time series to identify the change points in system health status. In this paper, we propose a novel approach for calibrating OC-SVM models. The approach uses a heuristic search method to find a good set of input data and hyperparameters that yield a well-performing model. Our results on the C-MAPSS dataset demonstrate that OC-SVM can achieve satisfactory accuracy in detecting change points in time series with less training data than state-of-the-art deep learning approaches. In our case study, the OC-SVM calibrated by the proposed method is shown to be especially useful in scenarios with a limited amount of training data.
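
Since the heuristic search itself is not detailed in this abstract, here is a minimal sketch of the calibration idea under illustrative assumptions: slide a window over the series, fit an OC-SVM on early (assumed healthy) windows, and randomly search the window size and hyperparameters for the configuration that best localizes a known change point in a calibration run. The data, scoring rule, and parameter ranges below are all hypothetical.

```python
# Minimal sketch of heuristic OC-SVM calibration for change point detection.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic series: healthy regime, then a shifted regime after t = 300.
series = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(2.0, 1.0, 200)])

def windows(x, w):
    """Slide a length-w window over x, one row per window."""
    return np.lib.stride_tricks.sliding_window_view(x, w)

def detected_change(x, w, nu, gamma, train_end=200):
    """Train on early (assumed healthy) windows, return first flagged index."""
    X = windows(x, w)
    model = OneClassSVM(nu=nu, gamma=gamma).fit(X[:train_end])
    flags = model.predict(X[train_end:]) == -1          # -1 = outlier
    hits = np.flatnonzero(flags)
    return train_end + hits[0] + w if hits.size else None

# Heuristic (random) search over input window size and hyperparameters.
best = None
for _ in range(30):
    w = int(rng.integers(5, 40))
    nu = float(rng.uniform(0.01, 0.2))
    gamma = float(10 ** rng.uniform(-3, 0))
    cp = detected_change(series, w, nu, gamma)
    # Score by closeness to the (known, for calibration) change at t = 300.
    if cp is not None and (best is None or abs(cp - 300) < abs(best[0] - 300)):
        best = (cp, w, nu, gamma)

print("best (detection, window, nu, gamma):", best)
```
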
Citations
Proceedings ArticleDOI
26 Jul 2019
TL;DR: It is shown that the encoder-decoder model is able to identify the injected anomalies in a modern Additive Manufacturing (AM) process in an unsupervised fashion, and it also gives hints about the temperature non-uniformity of the testbed during manufacturing, which was not known prior to the experiment.
Abstract: We present a novel unsupervised deep learning approach that utilizes an encoder-decoder architecture for detecting anomalies in sequential sensor data collected during industrial manufacturing. Our approach is designed not only to detect whether there exists an anomaly at a given time step, but also to predict what will happen next in the (sequential) process. We demonstrate our approach on a dataset collected from a real-world Additive Manufacturing (AM) testbed. The dataset contains infrared (IR) images collected under both normal conditions and synthetic anomalies. We show that our encoder-decoder model is able to identify the injected anomalies in a modern AM process in an unsupervised fashion. In addition, our approach gives hints about the temperature non-uniformity of the testbed during manufacturing, which was not known prior to the experiment.
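
The abstract leaves the architecture unspecified, so the following is a generic sketch of the encoder-decoder reconstruction idea rather than the authors' exact model: an LSTM autoencoder trained on normal sequences, with time steps flagged where per-step reconstruction error exceeds a threshold. All sizes and the threshold rule are assumptions.

```python
# Hypothetical sketch of encoder-decoder anomaly scoring on sensor sequences.
import numpy as np
import tensorflow as tf

T, D = 50, 4                                   # sequence length, sensor channels
normal = np.random.normal(size=(256, T, D)).astype("float32")

inputs = tf.keras.Input(shape=(T, D))
z = tf.keras.layers.LSTM(16)(inputs)           # encoder: sequence -> latent vector
z = tf.keras.layers.RepeatVector(T)(z)         # repeat latent for each time step
h = tf.keras.layers.LSTM(16, return_sequences=True)(z)
outputs = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(D))(h)

autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(normal, normal, epochs=5, batch_size=32, verbose=0)

test = normal.copy()
test[0, 20:30, :] += 3.0                       # inject a synthetic anomaly
err = np.mean((autoencoder.predict(test, verbose=0) - test) ** 2, axis=-1)
threshold = err[1:].mean() + 3 * err[1:].std() # threshold from clean sequences
print("anomalous steps in seq 0:", np.flatnonzero(err[0] > threshold))
```
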

30 citations

DOI
15 Oct 2021
TL;DR: Wang et al. propose CNN-LSTMED (Convolutional Neural Network Long Short-Term Memory Encoder-Decoder), a data anomaly detection algorithm based on a convolutional neural network and an encoder-decoder architecture.
Abstract: The purpose of anomaly detection is to detect data that deviate from the expected, and it is widely used in intrusion detection, data preprocessing, and so on. For data anomaly detection, we propose CNN-LSTMED (Convolutional Neural Network Long Short-Term Memory Encoder-Decoder), an algorithm based on a convolutional neural network and an encoder-decoder architecture. First, we use the convolutional neural network to encode the time series data to obtain the encoded sequence, and use the features extracted from the sequence as the input of a nonlinear long short-term memory (LSTM) network to decode and output the decoded sequence. Finally, the reconstruction error is calculated and a threshold is set to determine the abnormal points. Experimental comparisons with GRUED (Gated Recurrent Unit Encoder-Decoder), LSTMED (Long Short-Term Memory Encoder-Decoder), and other algorithms on the KDD99 and credit card fraud data sets show that our algorithm has strong robustness and accuracy.
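
A rough sketch of the CNN-encoder / LSTM-decoder pattern described above; all layer sizes, the pooling choices, and the thresholding rule are assumptions, not the paper's configuration.

```python
# Conv1D layers encode the series, an LSTM decodes it back, and sequences with
# high reconstruction error are declared anomalous.
import numpy as np
import tensorflow as tf

T, D = 64, 1
x_train = np.random.normal(size=(512, T, D)).astype("float32")

inputs = tf.keras.Input(shape=(T, D))
h = tf.keras.layers.Conv1D(32, 5, padding="same", activation="relu")(inputs)
h = tf.keras.layers.MaxPooling1D(2)(h)          # encoded sequence (T/2 steps)
h = tf.keras.layers.LSTM(32, return_sequences=True)(h)
h = tf.keras.layers.UpSampling1D(2)(h)          # back to T steps
outputs = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(D))(h)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
model.fit(x_train, x_train, epochs=5, batch_size=64, verbose=0)

recon = model.predict(x_train, verbose=0)
errors = np.mean((recon - x_train) ** 2, axis=(1, 2))
threshold = np.percentile(errors, 99)           # threshold set on training errors
print("flagged sequences:", np.flatnonzero(errors > threshold))
```
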

7 citations

Posted Content
TL;DR: This question is answered via a rigorous analysis of two commonly used uncertainty metrics in ensemble learning, namely ensemble mean and ensemble variance: ensemble mean is preferable to ensemble variance as an uncertainty metric for decision making.
Abstract: Ensemble learning is widely applied in Machine Learning (ML) to improve model performance and to mitigate decision risks. In this approach, predictions from a diverse set of learners are combined to obtain a joint decision. Recently, various methods have been explored in the literature for estimating decision uncertainties using ensemble learning; however, determining which metrics are a better fit for certain decision-making applications remains a challenging task. In this paper, we study the following key research question in the selection of uncertainty metrics: when does one uncertainty metric outperform another? We answer this question via a rigorous analysis of two commonly used uncertainty metrics in ensemble learning, namely ensemble mean and ensemble variance. We show that, under mild assumptions on the ensemble learners, ensemble mean is preferable to ensemble variance as an uncertainty metric for decision making. We empirically validate our assumptions and theoretical results via an extensive case study: the diagnosis of referable diabetic retinopathy.
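
As a toy illustration of the comparison (not the paper's analysis or dataset), one can simulate an ensemble, compute both metrics, and measure accuracy after referring the most uncertain cases under each metric. The referral rule and simulated data below are assumptions.

```python
# Compare ensemble-mean confidence vs. ensemble variance as referral criteria.
import numpy as np

rng = np.random.default_rng(1)
n_models, n_cases = 10, 1000

# Simulated per-model predicted probabilities for the positive class.
true = rng.integers(0, 2, n_cases)
logits = true * 1.5 + rng.normal(0, 1.0, (n_models, n_cases))
probs = 1 / (1 + np.exp(-logits))

mean = probs.mean(axis=0)          # ensemble mean
var = probs.var(axis=0)            # ensemble variance

def accuracy_after_referral(uncertainty, frac=0.2):
    """Abstain on the frac most uncertain cases, score accuracy on the rest."""
    keep = uncertainty <= np.quantile(uncertainty, 1 - frac)
    return np.mean((mean[keep] > 0.5) == true[keep])

# Mean-based uncertainty: highest when the mean probability is near 0.5.
print("refer by |mean - 0.5|:", accuracy_after_referral(-np.abs(mean - 0.5)))
print("refer by variance    :", accuracy_after_referral(var))
```
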

5 citations

Posted Content
TL;DR: The proposed novel approach of augmenting the classification model with an additional unsupervised learning task leads to improved fault detection and diagnosis performance, especially on out-of-distribution examples including both incipient and unknown faults.
Abstract: The Monte Carlo dropout method has proved to be a scalable and easy-to-use approach for estimating the uncertainty of deep neural network predictions. This approach was recently applied to Fault Detection and Diagnosis (FDD) applications to improve the classification performance on incipient faults. In this paper, we propose a novel approach of augmenting the classification model with an additional unsupervised learning task. We justify our choice of algorithm design via an information-theoretical analysis. Our experimental results on three datasets from diverse application domains show that the proposed method leads to improved fault detection and diagnosis performance, especially on out-of-distribution examples including both incipient and unknown faults.
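
For context, here is a minimal sketch of the Monte Carlo dropout baseline this work builds on (the proposed auxiliary unsupervised task is omitted): keep dropout active at inference and treat the spread across stochastic forward passes as the uncertainty estimate. The model, data, and spread-based score are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

x = np.random.normal(size=(64, 20)).astype("float32")
y = np.random.randint(0, 3, size=64)               # dummy labels for the sketch

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(x, y, epochs=3, verbose=0)

# Keep dropout active at inference (training=True) and average T stochastic
# forward passes; the spread across passes serves as the uncertainty estimate.
T = 50
samples = np.stack([model(x[:8], training=True).numpy() for _ in range(T)])
mean_pred = samples.mean(axis=0)                   # MC-averaged class probabilities
uncertainty = samples.std(axis=0).max(axis=-1)     # simple per-example spread
print(mean_pred.argmax(axis=-1), uncertainty)
```
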

5 citations

Posted Content
TL;DR: This work identifies common pitfalls in ensemble models through extensive experiments with several popular ensemble models on two real-world datasets, and discusses how to design more effective ensemble models for detecting and diagnosing Intermediate-Severity faults.
Abstract: Intermediate-Severity (IS) faults present milder symptoms compared to severe faults, and are more difficult to detect and diagnose due to their close resemblance to normal operating conditions. The lack of IS fault examples in the training data can pose severe risks to Fault Detection and Diagnosis (FDD) methods that are built upon Machine Learning (ML) techniques, because these faults can be easily mistaken as normal operating conditions. Ensemble models are widely applied in ML and are considered promising methods for detecting out-of-distribution (OOD) data. We identify common pitfalls in these models through extensive experiments with several popular ensemble models on two real-world datasets. Then, we discuss how to design more effective ensemble models for detecting and diagnosing IS faults.
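
A common ensemble OOD score of the kind this work scrutinizes is disagreement across independently trained members. The sketch below is illustrative only; the models, data, and score are assumptions, and such scores can fail in exactly the ways the paper investigates.

```python
# Ensemble disagreement (variance of member probabilities) as an OOD score.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Train an ensemble on bootstrap resamples of the training data.
members = []
for seed in range(5):
    idx = rng.integers(0, len(X), len(X))
    members.append(RandomForestClassifier(n_estimators=50, random_state=seed)
                   .fit(X[idx], y[idx]))

def ood_score(x):
    """Variance of member probabilities; high = members disagree = possible OOD."""
    p = np.stack([m.predict_proba(x)[:, 1] for m in members])
    return p.var(axis=0)

in_dist = rng.normal(size=(5, 6))
shifted = rng.normal(loc=4.0, size=(5, 6))      # crude distribution-shifted inputs
print("in-dist scores:", ood_score(in_dist))
print("shifted scores:", ood_score(shifted))
```
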

3 citations

References
Journal ArticleDOI
13 May 1983 - Science
TL;DR: There is a deep and useful connection between statistical mechanics and multivariate or combinatorial optimization (finding the minimum of a given function depending on many parameters), and a detailed analogy with annealing in solids provides a framework for optimization of very large and complex systems.
Abstract: There is a deep and useful connection between statistical mechanics (the behavior of systems with many degrees of freedom in thermal equilibrium at a finite temperature) and multivariate or combinatorial optimization (finding the minimum of a given function depending on many parameters). A detailed analogy with annealing in solids provides a framework for optimization of the properties of very large and complex systems. This connection to statistical mechanics exposes new information and provides an unfamiliar perspective on traditional optimization problems and methods.
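
A toy implementation of the annealing analogy, assuming a one-dimensional objective and a geometric cooling schedule: accept uphill moves with Boltzmann probability exp(-Δ/T) while the temperature T cools.

```python
import math
import random

def simulated_annealing(f, x0, step=0.5, t0=1.0, cooling=0.995, iters=5000):
    x, fx = x0, f(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(iters):
        cand = x + random.uniform(-step, step)
        fc = f(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if fc < fx or random.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling                      # geometric cooling schedule
    return best, fbest

# Multimodal test function with many local minima.
print(simulated_annealing(lambda x: x**2 + 10 * math.sin(3 * x), x0=5.0))
```
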

41,772 citations

Journal ArticleDOI
Rainer Storn, Kenneth Price
TL;DR: In this article, a new heuristic approach for minimizing possibly nonlinear and non-differentiable continuous space functions is presented, which requires few control variables, is robust, easy to use, and lends itself very well to parallel computation.
Abstract: A new heuristic approach for minimizing possibly nonlinear and non-differentiable continuous space functions is presented. By means of an extensive testbed it is demonstrated that the new method converges faster and with more certainty than many other acclaimed global optimization methods. The new method requires few control variables, is robust, easy to use, and lends itself very well to parallel computation.
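
In practice, the Storn-Price algorithm is available as SciPy's differential_evolution; a quick usage sketch on the standard Rastrigin test function:

```python
import numpy as np
from scipy.optimize import differential_evolution

def rastrigin(x):
    """Standard multimodal benchmark; global minimum 0 at the origin."""
    x = np.asarray(x)
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

result = differential_evolution(
    rastrigin,
    bounds=[(-5.12, 5.12)] * 5,   # 5-dimensional search space
    seed=0,
    tol=1e-8,
)
print(result.x, result.fun)       # should be near the global optimum at 0
```
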

24,053 citations

Proceedings Article
03 Dec 2012
TL;DR: This work describes new algorithms that take into account the variable cost of learning algorithm experiments and that can leverage the presence of multiple cores for parallel experimentation and shows that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms.
Abstract: The use of machine learning algorithms frequently involves careful tuning of learning parameters and model hyperparameters. Unfortunately, this tuning is often a "black art" requiring expert experience, rules of thumb, or sometimes brute-force search. There is therefore great appeal for automatic approaches that can optimize the performance of any given learning algorithm to the problem at hand. In this work, we consider this problem through the framework of Bayesian optimization, in which a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP). We show that certain choices for the nature of the GP, such as the type of kernel and the treatment of its hyperparameters, can play a crucial role in obtaining a good optimizer that can achieve expert-level performance. We describe new algorithms that take into account the variable cost (duration) of learning algorithm experiments and that can leverage the presence of multiple cores for parallel experimentation. We show that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms including latent Dirichlet allocation, structured SVMs and convolutional neural networks.
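
A hedged usage sketch of GP-based Bayesian optimization in this spirit, assuming the scikit-optimize package (not the authors' implementation); the objective is a stand-in for an expensive validation-error evaluation.

```python
import numpy as np
from skopt import gp_minimize

def expensive_objective(params):
    """Stand-in for a learning algorithm's validation error."""
    lr, reg = params
    return (np.log10(lr) + 2) ** 2 + (reg - 0.1) ** 2

result = gp_minimize(
    expensive_objective,
    dimensions=[(1e-4, 1e-1, "log-uniform"),   # learning rate
                (0.0, 1.0)],                   # regularization strength
    n_calls=25,                                # budget of expensive evaluations
    random_state=0,
)
print("best params:", result.x, "best value:", result.fun)
```
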

5,654 citations

Journal ArticleDOI
TL;DR: In this paper, the authors propose a method to estimate a function f that is positive on S and negative on the complement of S. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space.
Abstract: Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.
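
This is the ν-formulation implemented in scikit-learn's OneClassSVM, where nu upper-bounds the fraction of training points treated as outliers. A quick empirical check of that property, on synthetic data with illustrative settings:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))

for nu in (0.01, 0.05, 0.2):
    model = OneClassSVM(nu=nu, kernel="rbf", gamma="scale").fit(X)
    outlier_frac = np.mean(model.predict(X) == -1)   # -1 = flagged as outlier
    print(f"nu={nu:.2f} -> training outlier fraction ~ {outlier_frac:.3f}")
```
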

4,397 citations

Book
25 Nov 2014
TL;DR: The differential evolution (DE) algorithm is a practical approach to global numerical optimization which is easy to understand, simple to implement, reliable, and fast; the book is a valuable resource for professionals needing a proven optimizer and for students wanting an evolutionary perspective on global numerical optimization.
Abstract: Problems demanding globally optimal solutions are ubiquitous, yet many are intractable when they involve constrained functions having many local optima and interacting, mixed-type variables. The differential evolution (DE) algorithm is a practical approach to global numerical optimization which is easy to understand, simple to implement, reliable, and fast. Packed with illustrations, computer code, new insights, and practical advice, this volume explores DE in both principle and practice. It is a valuable resource for professionals needing a proven optimizer and for students wanting an evolutionary perspective on global numerical optimization.

4,273 citations