
Showing papers by Nello Cristianini, published in 2001


Book Chapter
03 Jan 2001
TL;DR: Introduces kernel alignment, a measure of similarity between two kernel functions or between a kernel and a target function, and reports experiments in which adapting the kernel to improve alignment on the labelled data significantly increases alignment on the test set, improving classification accuracy.
Abstract: We introduce the notion of kernel-alignment, a measure of similarity between two kernel functions or between a kernel and a target function. This quantity captures the degree of agreement between a kernel and a given learning task, and has very natural interpretations in machine learning, leading also to simple algorithms for model selection and learning. We analyse its theoretical properties, proving that it is sharply concentrated around its expected value, and we discuss its relation with other standard measures of performance. Finally we describe some of the algorithms that can be obtained within this framework, giving experimental results showing that adapting the kernel to improve alignment on the labelled data significantly increases the alignment on the test set, yielding improved classification accuracy. Hence, the approach provides a principled method of performing transduction.
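The alignment itself is a normalised Frobenius inner product between Gram matrices, A(K1, K2) = <K1, K2>_F / sqrt(<K1, K1>_F <K2, K2>_F), with the ideal target kernel being y y^T for labels y in {-1, +1}. A minimal sketch of the empirical kernel-target alignment under that definition (the function names and toy data are illustrative, not from the paper):

```python
# Empirical kernel-target alignment: a sketch, assuming labels y in {-1, +1}
# and the normalised Frobenius inner product between Gram matrices.
import numpy as np

def alignment(K1: np.ndarray, K2: np.ndarray) -> float:
    """A(K1, K2) = <K1, K2>_F / sqrt(<K1, K1>_F <K2, K2>_F)."""
    num = np.sum(K1 * K2)
    den = np.sqrt(np.sum(K1 * K1) * np.sum(K2 * K2))
    return num / den

def kernel_target_alignment(K: np.ndarray, y: np.ndarray) -> float:
    """Alignment between a kernel matrix K and the ideal kernel y y^T."""
    return alignment(K, np.outer(y, y))

# Toy check: a linear kernel on data that separates along the labels
# should align well with the target.
X = np.array([[1.0, 0.2], [0.9, 0.1], [-1.0, -0.3], [-0.8, 0.0]])
y = np.array([1, 1, -1, -1])
print(kernel_target_alignment(X @ X.T, y))  # close to 1 here
```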

1,083 citations


Journal Article
28 Jun 2001
TL;DR: Shows how the LSI approach can be implemented in a kernel-defined feature space; experimental results demonstrate that the approach can significantly improve performance and does not impair it.
Abstract: Kernel methods like support vector machines have successfully been used for text categorization. A standard choice of kernel function has been the inner product between the vector-space representation of two documents, in analogy with classical information retrieval (IR) approaches. Latent semantic indexing (LSI) has been successfully used for IR purposes as a technique for capturing semantic relations between terms and inserting them into the similarity measure between two documents. One of its main drawbacks, in IR, is its computational cost. In this paper we describe how the LSI approach can be implemented in a kernel-defined feature space. We provide experimental results demonstrating that the approach can significantly improve performance, and that it does not impair it.
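A minimal sketch of the idea, assuming a rank-k truncation of the document Gram matrix stands in for the LSI projection (the kernel choice, the value of k, and the toy term counts are illustrative assumptions, not the paper's experimental setup):

```python
# Sketch: project a document kernel onto its top-k eigendirections,
# in the spirit of performing LSI in a kernel-defined feature space.
import numpy as np

def latent_semantic_kernel(K: np.ndarray, k: int) -> np.ndarray:
    """Rank-k approximation of a Gram matrix from its leading eigenpairs."""
    eigvals, eigvecs = np.linalg.eigh(K)        # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:k]         # indices of the k largest
    V, lam = eigvecs[:, top], eigvals[top]
    return V @ np.diag(lam) @ V.T               # semantically smoothed kernel

# Usage: start from a plain bag-of-words linear kernel between documents.
rng = np.random.RandomState(0)
D = rng.poisson(1.0, size=(6, 50)).astype(float)   # 6 docs, 50 terms
K = D @ D.T                                        # vector-space kernel
K_sem = latent_semantic_kernel(K, k=3)
```

Working with the m x m Gram matrix rather than the full term-document SVD is the computational saving the abstract alludes to when documents are far fewer than terms; extending the projection to unseen test documents takes extra bookkeeping that this sketch omits.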

303 citations



Proceedings Article
03 Jan 2001
TL;DR: Introduces new algorithms for unsupervised learning based on a kernel matrix, showing how the optimal label assignment can be approximated by slightly relaxing the corresponding optimization problem, which amounts to using eigenvector information.
Abstract: In this paper we introduce new algorithms for unsupervised learning based on the use of a kernel matrix. All the information required by such algorithms is contained in the eigenvectors of the matrix or of closely related matrices. We use two different but related cost functions, the Alignment and the 'cut cost'. The first one is discussed in a companion paper [3], the second one is based on graph theoretic concepts. Both functions measure the level of clustering of a labeled dataset, or the correlation between data clusters and labels. We state the problem of unsupervised learning as assigning labels so as to optimize these cost functions. We show how the optimal solution can be approximated by slightly relaxing the corresponding optimization problem, and how this corresponds to using eigenvector information. The resulting simple algorithms are tested on real world data with positive results.
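A hedged sketch of the eigenvector relaxation: the combinatorial search over label vectors y in {-1, +1}^m is relaxed to the dominant eigenvector of the kernel matrix, whose entries are then thresholded, here keeping the threshold whose induced labelling aligns best with K (the toy data and kernel bandwidth are illustrative assumptions):

```python
# Two-class unsupervised labelling from the first eigenvector of K:
# relax y in {-1,+1}^m to a real vector (the dominant eigenvector),
# then scan thresholds and keep the labelling best aligned with K.
import numpy as np

def spectral_labels(K: np.ndarray) -> np.ndarray:
    eigvals, eigvecs = np.linalg.eigh(K)
    v = eigvecs[:, np.argmax(eigvals)]          # dominant eigenvector
    m, fro = len(v), np.sqrt(np.sum(K * K))
    best_y, best_a = None, -np.inf
    for t in np.sort(v)[:-1]:                   # candidate split points
        y = np.where(v > t, 1.0, -1.0)
        a = (y @ K @ y) / (m * fro)             # alignment of K with y y^T
        if a > best_a:
            best_a, best_y = a, y
    return best_y.astype(int)

# Usage: two well-separated blobs under an RBF kernel recover the clusters.
rng = np.random.RandomState(1)
X = np.vstack([rng.randn(10, 2) + 3.0, rng.randn(10, 2) - 3.0])
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
print(spectral_labels(np.exp(-sq / 2.0)))
```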

99 citations


Proceedings Article
03 Jan 2001
TL;DR: The residuals when data is projected into a subspace are shown to be reliably estimated from a random sample of points, as can the sum of the tail of eigenvalues.
Abstract: We consider the problem of measuring the eigenvalues of a randomly drawn sample of points. We show that these values can be reliably estimated, as can the sum of the tail of eigenvalues. Furthermore, the residuals when data is projected into a subspace are shown to be reliably estimated on a random sample. Experiments are presented that confirm the theoretical results.
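A small numerical illustration of the claim, assuming a linear kernel and synthetic data with a decaying spectrum (the sample sizes, scales, and k below are arbitrary choices for the demonstration, not values from the paper):

```python
# Estimate the tail of the Gram-matrix eigenvalue spectrum from a
# random subsample and compare it with the full-sample estimate.
# The tail sum also equals the average squared residual left after
# projecting the points onto the span of the top-k eigenvectors.
import numpy as np

def tail_eigen_sum(X: np.ndarray, k: int) -> float:
    """Sum of eigenvalues of the normalised Gram matrix beyond the k-th."""
    K = X @ X.T / len(X)                        # linear kernel, scaled by 1/m
    eigs = np.sort(np.linalg.eigvalsh(K))[::-1]
    return float(np.sum(eigs[k:]))

rng = np.random.RandomState(0)
X_full = rng.randn(1000, 10) * np.linspace(3.0, 0.1, 10)  # decaying spectrum
X_sub = X_full[rng.choice(len(X_full), size=100, replace=False)]
k = 4
print(tail_eigen_sum(X_full, k), tail_eigen_sum(X_sub, k))  # close values
```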

36 citations