Open Access Journal Article

Feature Discovery in Non-Metric Pairwise Data

TL;DR
A simple, exploratory analysis shows that the negative eigenvalues can code for relevant structure in the data, leading to the discovery of new features that would be lost by conventional data analysis techniques.
Abstract
Pairwise proximity data, given as a similarity or dissimilarity matrix, can violate metricity. This occurs either due to noise and fallible estimates, or due to intrinsically non-metric features such as those arising from human judgments. So far, the problem of non-metric pairwise data has been tackled essentially by omitting the negative eigenvalues or by shifting the spectrum of the associated (pseudo-)covariance matrix for a subsequent embedding. However, little attention has been paid to the negative part of the spectrum itself. In particular, no answer has been given to the question of whether the directions associated with the negative eigenvalues code any variance beyond noise. We show by a simple, exploratory analysis that the negative eigenvalues can code for relevant structure in the data, thus leading to the discovery of new features that would be lost by conventional data analysis techniques. The information hidden in the negative part of the eigenvalue spectrum is illustrated and discussed for three data sets: USPS handwritten digits, text-mining data, and data from cognitive psychology.
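The spectral effect the abstract describes is easy to reproduce. Below is a minimal sketch (not the authors' code, and using a small toy dissimilarity matrix rather than the USPS, text-mining, or psychology data): it double-centers a non-metric dissimilarity matrix as in classical MDS, inspects the resulting eigenvalue spectrum, and contrasts the conventional clip/shift corrections with a pseudo-Euclidean embedding that keeps the directions belonging to the negative eigenvalues.

```python
# Minimal sketch (not the authors' code): classical-MDS-style eigenanalysis of a
# non-metric dissimilarity matrix, exposing the negative part of the spectrum.
import numpy as np

rng = np.random.default_rng(0)

# Toy non-metric dissimilarities: a symmetric matrix with zero diagonal that
# deliberately violates the triangle inequality (stand-in for human judgments).
n = 6
D = rng.uniform(1.0, 2.0, size=(n, n))
D = 0.5 * (D + D.T)
np.fill_diagonal(D, 0.0)
D[0, 1] = D[1, 0] = 5.0        # force a triangle-inequality violation

# Double centering: C = -1/2 * J D^2 J with J = I - (1/n) 11^T.
# For Euclidean distances C is positive semidefinite; negative eigenvalues
# signal that no exact Euclidean embedding exists (e.g. non-metric data).
J = np.eye(n) - np.ones((n, n)) / n
C = -0.5 * J @ (D ** 2) @ J

eigvals, eigvecs = np.linalg.eigh(C)          # ascending order
print("eigenvalues:", np.round(eigvals, 3))   # negative entries = non-metric part

# Conventional fixes mentioned in the abstract: 'clip' drops the negative
# eigenvalues, 'shift' adds |lambda_min| to the whole spectrum.
clipped = np.clip(eigvals, 0.0, None)
shifted = eigvals - eigvals.min()

# A pseudo-Euclidean embedding instead keeps both parts: coordinates are scaled
# by sqrt(|lambda|), and the sign of lambda tells you which subspace an axis
# belongs to.
coords = eigvecs * np.sqrt(np.abs(eigvals))
neg_axes = coords[:, eigvals < 0]             # axes spanned by negative eigenvalues
print("variance in negative axes:", np.round(np.abs(eigvals[eigvals < 0]), 3))
```

Because the toy matrix violates the triangle inequality, at least one eigenvalue is guaranteed to be negative; the paper's analysis inspects the coordinates along exactly such axes for meaningful structure.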


Citations
Journal Article

Similarity-based Classification: Concepts and Algorithms

TL;DR: The generalizability of using similarities as features is analyzed, design goals and methods for weighting nearest-neighbors for similarity-based learning are proposed, and different methods for consistently converting similarities into kernels are compared.
Journal Article

What matters in differences between life trajectories: a comparative review of sequence dissimilarity measures

TL;DR: The study shows that there is no universally optimal distance index and that the choice of a measure depends on which aspect the authors want to focus on; it also introduces novel ways of measuring dissimilarities that overcome some flaws in existing measures.
Journal Article

Visualizing non-metric similarities in multiple maps

TL;DR: An extension of t-SNE is presented that addresses the problems traditional multidimensional scaling techniques face when used to visualize non-metric similarities, by constructing a collection of maps that reveal complementary structure in the similarity data.
Journal Article

DOA Estimation of Quasi-Stationary Signals With Less Sensors Than Sources and Unknown Spatial Noise Covariance: A Khatri–Rao Subspace Approach

TL;DR: This paper considers the problem of direction-of-arrival (DOA) estimation of quasi-stationary signals and develops a Khatri–Rao (KR) subspace approach that provides a simple yet effective way of eliminating the unknown spatial noise covariance from the signals' second-order statistics (SOSs).
Journal Article

A novel method for reliable and fast extraction of neuronal EEG/MEG oscillations on the basis of spatio-spectral decomposition.

TL;DR: A novel method is presented, based on a linear decomposition of the recordings, that maximizes the signal power at a peak frequency while simultaneously minimizing it at the neighboring, surrounding frequency bins, allowing extraction of components with a characteristic "peaky" spectral profile typical of oscillatory processes.
References
Journal Article

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Journal Article

Nonlinear dimensionality reduction by locally linear embedding.

TL;DR: Locally linear embedding (LLE) is introduced, an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs and thereby learns the global structure of nonlinear manifolds.
Journal Article

A global geometric framework for nonlinear dimensionality reduction.

TL;DR: An approach to solving dimensionality reduction problems is described that uses easily measured local metric information to learn the underlying global geometry of a data set; it efficiently computes a globally optimal solution and is guaranteed to converge asymptotically to the true structure.
Journal Article

Nonlinear component analysis as a kernel eigenvalue problem

TL;DR: A new method for performing a nonlinear form of principal component analysis by the use of integral operator kernel functions is proposed, and experimental results on polynomial feature extraction for pattern recognition are presented.