Showing papers by "Amaury Lendasse published in 2012"

Open Access

Book Chapter•DOI•

Evolutive Approaches for Variable Selection Using a Non-parametric Noise Estimator

Alberto Guillén¹, Dušan Sovilj², Mark van Heeswijk², Luis Javier Herrera¹, Amaury Lendasse², Héctor Pomares¹, Ignacio Rojas¹•Institutions (2)

University of Granada¹, Aalto University²

01 Jan 2012

TL;DR: This chapter presents several methodologies to perform variable selection in a local or a global manner using a non-parametric noise estimator to determine the quality of a subset of variables.

Abstract: The design of a model to approximate a function relies significantly on the data used in the training stage. The problem of selecting an adequate set of variables should be treated carefully due to its importance. If the number of variables is high, the number of samples needed to design the model becomes too large and the interpretability of the model is lost. This chapter presents several methodologies to perform variable selection in a local or a global manner using a non-parametric noise estimator to determine the quality of a subset of variables. Because the problem is computationally expensive, several methods that apply parallel paradigms on different architectures are compared from the optimization and efficiency points of view.

6 citations

Journal Article•DOI•

Adaptive kernel smoothing regression for spatio-temporal environmental datasets

Federico Montesino Pouzols¹, Amaury Lendasse²•Institutions (2)

University of Helsinki¹, Aalto University²

01 Aug 2012-Neurocomputing

TL;DR: A simple and fast combination of incremental vector quantization with kernel smoothing regression using adaptive bandwidth is shown to be effective for online modeling of environmental datasets.
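
As a rough illustration of the kernel smoothing part only, the sketch below computes a Nadaraya-Watson estimate from a set of prototypes, each with its own bandwidth. The Gaussian kernel, the function name `kernel_smooth`, and the way bandwidths are supplied are assumptions made for this example; the entry does not describe the paper's actual quantization or bandwidth adaptation rules.

```python
import math


def kernel_smooth(query, prototypes, targets, bandwidths):
    """Nadaraya-Watson estimate at `query` (hypothetical helper).

    prototypes: list of points (tuples), e.g. produced by a vector quantizer
    targets:    one output value per prototype
    bandwidths: one Gaussian bandwidth per prototype (assumed given)
    """
    weights = []
    for proto, h in zip(prototypes, bandwidths):
        sq_dist = sum((q - p) ** 2 for q, p in zip(query, proto))
        weights.append(math.exp(-sq_dist / (2.0 * h * h)))
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed: fall back to the plain mean of the targets.
        return sum(targets) / len(targets)
    # Weighted average of the targets, with weights set by kernel similarity.
    return sum(w * t for w, t in zip(weights, targets)) / total
```

A query that is equidistant from two prototypes with equal bandwidths gets the average of their targets; shrinking one prototype's bandwidth makes its influence more local.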

3 citations

Proceedings Article•

Relevance learning for time series inspection

Andrej Gisbrecht¹, Dušan Sovilj², Barbara Hammer¹, Amaury Lendasse²•Institutions (2)

Bielefeld University¹, Aalto University²

01 Jan 2012

TL;DR: An extension of relevance learning to time series regression with GTM is proposed, which automatically adapts according to the relevant time lags resulting in a sparser representation, improved accuracy, and smoother visualization of the data.

Abstract: By means of local neighborhood regression and time windows, the generative topographic mapping (GTM) makes it possible to predict and visually inspect time series data. GTM itself, however, is fully unsupervised. In this contribution, we propose an extension of relevance learning to time series regression with GTM. This way, the metric automatically adapts according to the relevant time lags, resulting in a sparser representation, improved accuracy, and smoother visualization of the data.

2 citations

Proceedings Article•DOI•

Fast variable selection for memetracker phrases time series prediction

Yoan Miche¹, Tatiana Chistiakova¹, Anton Akusok¹, Amaury Lendasse¹, Rui Nian², Alberto Guillén³•Institutions (3)

Aalto University¹, Ocean University of China², University of Granada³

06 Jun 2012

TL;DR: This paper proposes a methodology using a fast variable selection as a modified version of the Forward-Backward algorithm adapted to the specificities of the data used: very small number of samples and high number of variables.

Abstract: This paper proposes a methodology using a fast variable selection as a modified version of the Forward-Backward algorithm. This methodology is adapted to the specificities of the data used: very small number of samples and high number of variables. Such data is generated using underlying dependencies and seasonality assumptions, from Meme phrases volume data. By the use of a resampling technique along with the proposed variable selection scheme, significant results are obtained, and the test Normalized Mean Square Error (NMSE) performances are improved. The results indicate that with the assumptions made on the data structure, variable selection is desirable. Also, the obtained information on the selected variables seems to cluster the time series in two very different classes: a set of approximately 600 series, which yield good NMSE and seem to require very similar sets of variables for the prediction; and another set of 300–400 series, for which only the previous series value is of interest for the prediction. This first analysis clearly illustrates the future need to perform a more thorough analysis of the selected variables for each batch of series. Also, taking a close look at the possible dependences between the series inside a batch should give information as to why and how they are similar and have found themselves to be grouped under the same batch.
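
For orientation, here is a minimal sketch of a plain Forward-Backward search together with an NMSE computation. It is not the paper's modified version, which the abstract does not detail. The names `nmse` and `forward_backward_select`, the definition of NMSE as mean squared error divided by the target variance, and the user-supplied `criterion` callable are all assumptions made for this example.

```python
def nmse(y_true, y_pred):
    """Mean squared error divided by the variance of the targets (assumed definition)."""
    n = len(y_true)
    mean = sum(y_true) / n
    var = sum((y - mean) ** 2 for y in y_true) / n
    mse = sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n
    return mse / var


def forward_backward_select(n_vars, criterion, start=()):
    """Greedy search over variable subsets; assumes n_vars >= 1.

    At each step, flip the state of one variable (add it if absent,
    remove it if present) and keep the flip that lowers `criterion`
    the most. Stop when no single flip improves the score.
    `criterion` maps a frozenset of variable indices to a number to minimize.
    """
    selected = frozenset(start)
    best = criterion(selected)
    while True:
        flips = [selected ^ frozenset([j]) for j in range(n_vars)]
        candidate = min(flips, key=criterion)
        score = criterion(candidate)
        if score >= best:
            return sorted(selected), best
        selected, best = candidate, score
```

In practice `criterion` would be something like a resampled validation NMSE of a model trained on the chosen variables. Because the result depends on the starting subset, the search is often restarted from several initial subsets.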
