Open Access Journal Article

Feature Discovery in Non-Metric Pairwise Data

TL;DR
A simple, exploratory analysis shows that the negative eigenvalues can code for relevant structure in the data, leading to the discovery of new features that would be lost by conventional data analysis techniques.
Abstract
Pairwise proximity data, given as a similarity or dissimilarity matrix, can violate metricity. This occurs either due to noise and fallible estimates, or due to intrinsically non-metric features such as those arising from human judgments. So far, the problem of non-metric pairwise data has been tackled essentially by omitting the negative eigenvalues or by shifting the spectrum of the associated (pseudo-)covariance matrix for a subsequent embedding. However, little attention has been paid to the negative part of the spectrum itself. In particular, no answer has been given to the question of whether the directions associated with the negative eigenvalues code any variance beyond noise. We show by a simple, exploratory analysis that the negative eigenvalues can code for relevant structure in the data, thus leading to the discovery of new features that would be lost by conventional data analysis techniques. The information hidden in the negative part of the eigenvalue spectrum is illustrated and discussed for three data sets: USPS handwritten digits, text-mining data, and data from cognitive psychology.
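The spectral effect the abstract describes is easy to reproduce. Below is a minimal sketch (not the authors' code, and using a small toy dissimilarity matrix rather than the USPS, text-mining, or psychology data): it double-centers a non-metric dissimilarity matrix as in classical MDS, inspects the resulting eigenvalue spectrum, and contrasts the conventional clip/shift corrections with a pseudo-Euclidean embedding that keeps the directions belonging to the negative eigenvalues.

```python
# Minimal sketch (not the authors' code): classical-MDS-style eigenanalysis of a
# non-metric dissimilarity matrix, exposing the negative part of the spectrum.
import numpy as np

rng = np.random.default_rng(0)

# Toy non-metric dissimilarities: a symmetric matrix with zero diagonal that
# deliberately violates the triangle inequality (stand-in for human judgments).
n = 6
D = rng.uniform(1.0, 2.0, size=(n, n))
D = 0.5 * (D + D.T)
np.fill_diagonal(D, 0.0)
D[0, 1] = D[1, 0] = 5.0        # force a triangle-inequality violation

# Double centering: C = -1/2 * J D^2 J with J = I - (1/n) 11^T.
# For Euclidean distances C is positive semidefinite; negative eigenvalues
# signal that no exact Euclidean embedding exists (e.g. non-metric data).
J = np.eye(n) - np.ones((n, n)) / n
C = -0.5 * J @ (D ** 2) @ J

eigvals, eigvecs = np.linalg.eigh(C)          # ascending order
print("eigenvalues:", np.round(eigvals, 3))   # negative entries = non-metric part

# Conventional fixes mentioned in the abstract: 'clip' drops the negative
# eigenvalues, 'shift' adds |lambda_min| to the whole spectrum.
clipped = np.clip(eigvals, 0.0, None)
shifted = eigvals - eigvals.min()

# A pseudo-Euclidean embedding instead keeps both parts: coordinates are scaled
# by sqrt(|lambda|), and the sign of lambda tells you which subspace an axis
# belongs to.
coords = eigvecs * np.sqrt(np.abs(eigvals))
neg_axes = coords[:, eigvals < 0]             # axes spanned by negative eigenvalues
print("variance in negative axes:", np.round(np.abs(eigvals[eigvals < 0]), 3))
```

Because the toy matrix violates the triangle inequality, at least one eigenvalue is guaranteed to be negative; the paper's analysis inspects the coordinates along exactly such axes for meaningful structure.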


Citations
Journal Article

Similarity-based Classification: Concepts and Algorithms

TL;DR: The generalizability of using similarities as features is analyzed, design goals and methods for weighting nearest-neighbors for similarity-based learning are proposed, and different methods for consistently converting similarities into kernels are compared.
Journal Article

What matters in differences between life trajectories: a comparative review of sequence dissimilarity measures

TL;DR: The study shows that there is no universally optimal distance index and that the choice of a measure depends on which aspect the authors want to focus on; it also introduces novel ways of measuring dissimilarities that overcome some flaws in existing measures.
Journal Article

Visualizing non-metric similarities in multiple maps

TL;DR: An extension of t-SNE is presented that addresses the problems traditional multidimensional scaling techniques face when used to visualize non-metric similarities, by constructing a collection of maps that reveal complementary structure in the similarity data.
Journal Article

DOA Estimation of Quasi-Stationary Signals With Less Sensors Than Sources and Unknown Spatial Noise Covariance: A Khatri–Rao Subspace Approach

TL;DR: This paper considers the problem of direction-of-arrival (DOA) estimation of quasi-stationary signals and develops a Khatri–Rao (KR) subspace approach that provides a simple yet effective way of eliminating the unknown spatial noise covariance from the signals' second-order statistics (SOSs).
Journal Article

A novel method for reliable and fast extraction of neuronal EEG/MEG oscillations on the basis of spatio-spectral decomposition.

TL;DR: A novel method is presented, based on a linear decomposition of the recordings, that maximizes the signal power at a peak frequency while simultaneously minimizing it at the neighboring, surrounding frequency bins, allowing extraction of components with a characteristic "peaky" spectral profile typical of oscillatory processes.
References
Journal Article

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Journal Article

Nonlinear dimensionality reduction by locally linear embedding.

TL;DR: Locally linear embedding (LLE) is introduced, an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs and thereby learns the global structure of nonlinear manifolds.
Journal Article

A global geometric framework for nonlinear dimensionality reduction.

TL;DR: An approach to solving dimensionality reduction problems is described that uses easily measured local metric information to learn the underlying global geometry of a data set; it efficiently computes a globally optimal solution and is guaranteed to converge asymptotically to the true structure.
Journal Article

Nonlinear component analysis as a kernel eigenvalue problem

TL;DR: A new method for performing a nonlinear form of principal component analysis by the use of integral operator kernel functions is proposed, and experimental results on polynomial feature extraction for pattern recognition are presented.