Author

Juan José Rodríguez Diez

Bio: Juan José Rodríguez Diez is an academic researcher from the University of Burgos. The author has contributed to research in topics: Boosting (machine learning) & Fault detection and isolation. The author has an h-index of 4 and has co-authored 6 publications receiving 45 citations.

Papers
01 Jan 2003
TL;DR: A system for supervised time series classification that learns from series of different lengths and can provide a classification when only part of a series is presented to the classifier.
Abstract: This work presents a system for supervised time series classification that can learn from series of different lengths and can provide a classification when only part of a series is presented to the classifier. The induced classifiers consist of a linear combination of literals, obtained by boosting base classifiers that contain only one literal. These literals are specifically designed for the task at hand: they test properties of fragments of the time series on temporal intervals. The method had already been developed for fixed-length time series. This work exploits the symbolic nature of the classifier to add two new features to it. First, the system has been slightly modified so that it can learn directly from variable-length time series. Second, the classifier can be used to identify partial time series. This “early classification” is essential in some tasks, such as online supervision or diagnosis, where an alarm signal must be given as soon as possible. Several experiments on different data sets are presented, which illustrate that the proposed method is highly competitive with previous approaches in terms of classification accuracy.
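The idea of a weighted vote over interval literals that still works on a partial series can be sketched as follows. This is an illustration only, not the authors' implementation: the `IntervalLiteral` class, the "mean above threshold" predicate, and the rule that a literal abstains when its interval is not yet fully observed are all assumptions made for the example, and the weights are taken as already produced by boosting.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class IntervalLiteral:
    """Hypothetical literal: 'the mean of series[start:end] exceeds threshold'."""
    start: int        # first index of the interval (inclusive)
    end: int          # last index of the interval (exclusive)
    threshold: float
    weight: float     # coefficient in the linear combination (assumed given)

    def evaluate(self, series: Sequence[float]) -> Optional[bool]:
        # Assumption: if the interval is not fully observed, the literal abstains.
        if self.end > len(series):
            return None
        window = series[self.start:self.end]
        return sum(window) / len(window) > self.threshold


def classify(literals: Sequence[IntervalLiteral], series: Sequence[float]) -> int:
    """Return +1 or -1 from the weighted vote of the literals that can be evaluated."""
    score = 0.0
    for literal in literals:
        result = literal.evaluate(series)
        if result is None:
            continue  # interval lies beyond the observed prefix
        score += literal.weight if result else -literal.weight
    return 1 if score >= 0 else -1


literals = [
    IntervalLiteral(start=0, end=3, threshold=0.5, weight=1.0),
    IntervalLiteral(start=5, end=8, threshold=0.5, weight=2.0),
]
full_series = [1, 1, 1, 0, 0, 0, 0, 0]
print(classify(literals, full_series))      # both literals vote
print(classify(literals, full_series[:4]))  # only the first literal can vote
```

In this toy case the label changes once the later interval becomes available, which shows why an early answer is a trade-off between speed and evidence.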

24 citations

Book ChapterDOI
02 Jul 2001
TL;DR: This work proposes a novel method for constructing RBF networks, based on boosting, where the task assigned to the base learner is to select an RBF, while the boosting algorithm linearly combines the different RBFs.
Abstract: This work proposes a novel method for constructing RBF networks, based on boosting. The task assigned to the base learner is to select an RBF, while the boosting algorithm linearly combines the different RBFs. At each iteration of boosting, a new neuron is incorporated into the network. Each RBF is selected by randomly choosing several examples as centers, treating the distances to these centers as attributes of the examples, and selecting the best split on one of these attributes. This selection of the best split is done in the same way as in the construction of decision trees. The RBF is computed from the selected center (attribute) and threshold. This work is not about using RBFNs as base learners for boosting, but about constructing RBFNs by boosting.
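The selection step described in the abstract (random centers, distances as attributes, best split on one attribute) might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the paper's algorithm: the function name `select_center_and_threshold` is hypothetical, labels are assumed to be in {-1, +1}, and the split criterion used here is the weighted classification error, whereas the abstract only says the split is chosen as in decision tree construction. How the RBF itself is computed from the center and threshold is not specified in the abstract and is left out.

```python
import math
import random


def select_center_and_threshold(X, y, weights, n_centers=5, rng=None):
    """Return (weighted_error, center, threshold, polarity) for the best split found.

    X: list of feature tuples; y: labels in {-1, +1}; weights: example weights.
    polarity +1 predicts +1 inside the ball (distance <= threshold), -1 the opposite.
    """
    rng = rng or random.Random(0)
    total = sum(weights)
    best = None
    # Randomly choose several training examples as candidate centers.
    for center in rng.sample(X, min(n_centers, len(X))):
        # The distance to the center acts as one numeric attribute.
        dists = [math.dist(x, center) for x in X]
        values = sorted(set(dists))
        # Candidate thresholds: midpoints between consecutive distinct distances.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            err = sum(
                w for d, label, w in zip(dists, y, weights)
                if (1 if d <= threshold else -1) != label
            )
            # Flipping the prediction turns an error of err into total - err.
            for polarity, e in ((1, err), (-1, total - err)):
                if best is None or e < best[0]:
                    best = (e, center, threshold, polarity)
    return best


X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = [1, 1, 1, -1, -1, -1]
weights = [1 / len(X)] * len(X)
error, center, threshold, polarity = select_center_and_threshold(X, y, weights, n_centers=3)
print(error, center, threshold, polarity)
```

In a boosting loop, the weights would be updated after each selection so that the next neuron focuses on the examples the current network still misclassifies.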

6 citations

Journal ArticleDOI
TL;DR: This article proposes an integrated approach to the diagnosis of complex dynamic systems that combines model-based diagnosis with machine learning techniques, together with a simple framework that makes them cooperate, improving on the diagnosis capabilities of each individual method.

5 citations

Book ChapterDOI
01 Jan 2003
TL;DR: This work presents a learning system for the classification of multivariate time series that is useful in domains such as biomedical signals, continuous systems diagnosis, or data mining in temporal databases.
Abstract: This work presents a learning system for the classification of multivariate time series. This classification is useful in domains such as biomedical signals [9], continuous systems diagnosis [2], or data mining in temporal databases [3].

4 citations

Proceedings ArticleDOI
02 Jul 2007
TL;DR: This work studies the combination of a consistency-based diagnosis system with a Case-based Reasoning (CBR) system: the consistency-based diagnosis performs fault detection and localization, while the CBR system provides an accurate indication of the most probable fault mode at early stages of the localization process.
Abstract: Consistency-based diagnosis automatically provides fault detection and localization capabilities, using just models for correct behavior. However, it may exhibit a lack of discrimination power. Knowledge about fault modes can be added to tackle the problem. Unfortunately, this brings additional complexity issues, since it will be necessary to discriminate among a maximum of K^N mode assignments, for N components and K possible fault modes per component. Usually, some kind of heuristic information is included in the diagnosis process to focus the model-based diagnostician. In this work we study the combination of a consistency-based diagnosis system together with a Case-based Reasoning (CBR) system. The consistency-based diagnosis will perform fault detection and localization. The CBR system provides an accurate indication of the most probable fault mode at early stages of the localization process.
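The growth in the number of mode assignments is easy to see by enumeration: with K possible modes for each of N components, there are K to the power N joint assignments. The snippet below is only a counting illustration; the mode names and the helper `mode_assignments` are made up for the example and do not come from the paper.

```python
from itertools import product


def mode_assignments(n_components, modes):
    """Lazily enumerate every joint assignment of one mode per component."""
    return product(modes, repeat=n_components)


modes = ("ok", "stuck", "leak")  # hypothetical modes, K = 3
count = sum(1 for _ in mode_assignments(4, modes))  # N = 4 components
print(count)  # 81, i.e. 3 ** 4
```

Even at this toy size the space is far larger than the set of components, which is why some heuristic focus is needed before enumerating assignments.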

4 citations


Cited by
Proceedings Article
01 Dec 2011
TL;DR: This paper advocates local shapelets as features, which are segments of time series remaining in the same space of the input data and thus are highly interpretable and can achieve effective early classification.
Abstract: Early classification on time series data has been found highly useful in a few important applications, such as medical and health informatics, industry production management, safety and security management. While some classifiers have been proposed to achieve good earliness in classification, the interpretability of early classification remains largely an open problem. Without interpretable features, application domain experts such as medical doctors may be reluctant to adopt early classification. In this paper, we tackle the problem of extracting interpretable features on time series for early classification. Specifically, we advocate local shapelets as features, which are segments of time series remaining in the same space of the input data and thus are highly interpretable. We extract local shapelets distinctly manifesting a target class locally and early so that they are effective for early classification. Our experimental results on seven benchmark real data sets clearly show that the local shapelets extracted by our methods are highly interpretable and can achieve effective early classification.

181 citations

Proceedings Article
11 Jul 2009
TL;DR: This paper introduces ECTS (Early Classification on Time Series), an effective 1-nearest neighbor classification method that makes early predictions while retaining accuracy comparable to that of a 1NN classifier using the full-length time series.
Abstract: In this paper, we formulate the problem of early classification of time series data, which is important in some time-sensitive applications such as health informatics. We introduce a novel concept of MPL (Minimum Prediction Length) and develop ECTS (Early Classification on Time Series), an effective 1-nearest neighbor classification method. ECTS makes early predictions and at the same time retains accuracy comparable to that of a 1NN classifier using the full-length time series. Our empirical study using benchmark time series data sets shows that ECTS works well on the real data sets where 1NN classification is effective.
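As a point of reference for what 1NN classification on an incomplete series means, the sketch below compares an observed prefix against the same-length prefix of each training series. It is a plain prefix-based nearest neighbor baseline and does not implement ECTS or the MPL concept, whose details are not given in the abstract; the function name `nearest_label_on_prefix` is hypothetical.

```python
import math


def nearest_label_on_prefix(train, prefix):
    """Return the label of the training series whose prefix is closest.

    train: list of (series, label) pairs; prefix: the values observed so far.
    Training series shorter than the prefix are skipped.
    """
    n = len(prefix)
    best_label, best_dist = None, math.inf
    for series, label in train:
        if len(series) < n:
            continue
        # Squared Euclidean distance over the first n points only.
        dist = sum((a - b) ** 2 for a, b in zip(series[:n], prefix))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


train = [([0.0, 0.0, 0.0, 0.0], "a"), ([1.0, 1.0, 1.0, 1.0], "b")]
print(nearest_label_on_prefix(train, [0.9, 1.0]))
print(nearest_label_on_prefix(train, [0.1]))
```

A baseline like this answers at every prefix length; deciding how long a prefix must be before the answer can be trusted is the part the paper addresses.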

102 citations

Proceedings ArticleDOI
01 Oct 2008
TL;DR: This paper identifies the novel problem of mining sequence classifiers for early prediction, and proposes two interesting methods that achieve accuracy comparable to that of the state-of-the-art methods, but typically need to use only very short prefixes of the sequences.
Abstract: Supervised learning on sequence data, also known as sequence classification, has been well recognized as an important data mining task with many significant applications. Since temporal order is important in sequence data, in many critical applications of sequence classification such as medical diagnosis and disaster prediction, early prediction is a highly desirable feature of sequence classifiers. In early prediction, a sequence classifier should use a prefix of a sequence as short as possible to make a reasonably accurate prediction. To the best of our knowledge, early prediction on sequence data has not been studied systematically. In this paper, we identify the novel problem of mining sequence classifiers for early prediction. We analyze the problem and the challenges. As the first attempt to tackle the problem, we propose two interesting methods. The sequential classification rule (SCR) method mines a set of sequential classification rules as a classifier. A so-called early-prediction utility is defined and used to select features and rules. The generalized sequential decision tree (GSDT) method adopts a divide-and-conquer strategy to generate a classification model. We conduct an extensive empirical evaluation on several real data sets. Interestingly, our two methods achieve accuracy comparable to that of the state-of-the-art methods, but typically need to use only very short prefixes of the sequences. The results clearly indicate that early prediction is highly feasible and effective.

77 citations

DissertationDOI
17 Jan 2003
TL;DR: A new framework for analyzing sequential or temporal data such as time series is proposed, with special emphasis on the interpretability of the results, which makes it possible to deal with irregular or chaotic series.
Abstract: A new framework for analyzing sequential or temporal data such as time series is proposed. It differs from other approaches by the special emphasis on the interpretability of the results, since interpretability is of vital importance for knowledge discovery, that is, the development of new knowledge (in the head of a human) from a list of discovered patterns. While traditional approaches try to model and predict all time series observations, the focus in this work is on modelling local dependencies in multivariate time series. This makes it possible to deal with irregular or chaotic series. The proposed discovery process consists of (1) time series abstraction to get a representation close to the human perception of time series, (2) the enumeration and ranking of qualitative relationships in the data, (3) the specialization with quantitative constraints and the generalization of patterns to overcome limitations that are implicitly induced by the search bias.

50 citations