Learning the Kernel Matrix with Semidefinite Programming
Citations
11,357 citations
Cites background or methods from "Learning the Kernel Matrix with Sem..."
...Lanckriet et al. [2004] show that if K is a convex combination of Gram matrices K_i, so that K = ∑_i ν_i K_i with ν_i ≥ 0 for all i, then the optimization of the alignment score w....
[...]
...The subset of datapoints (SD) method for GPC was proposed in Lawrence et al. [2003], using an EP-style approximation of the posterior, and the differential entropy score (see section 8....
[...]
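The first excerpt above describes restricting K to a nonnegative combination of Gram matrices, K = ∑_i ν_i K_i with ν_i ≥ 0, and optimizing an alignment score. A minimal NumPy sketch of that alignment computation follows; the linear/RBF base kernels, the fixed weights, and the toy data are my own illustrative assumptions, not the cited papers' setup.

```python
# Sketch (not from the paper): kernel-target alignment of a nonnegative
# combination K = sum_i nu_i * K_i of base Gram matrices.
import numpy as np

def alignment(K, y):
    """A(K, yy^T) = <K, yy^T>_F / (||K||_F * ||yy^T||_F)."""
    target = np.outer(y, y)
    return np.sum(K * target) / (np.linalg.norm(K) * np.linalg.norm(target))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = np.sign(X[:, 0])                      # toy labels in {-1, +1}

# Two base Gram matrices: a linear kernel and a Gaussian (RBF) kernel.
K1 = X @ X.T
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K2 = np.exp(-sq / 2.0)

nu = np.array([0.3, 0.7])                 # nonnegative weights, nu_i >= 0
K = nu[0] * K1 + nu[1] * K2
print("alignment of combined kernel:", alignment(K, y))
```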
7,767 citations
Cites background from "Learning the Kernel Matrix with Sem..."
...This question also motivated much research (Lanckriet, Cristianini, Bartlett, El Ghaoui, & Jordan, 2002; Wang & Chan, 2002; Cristianini, Shawe-Taylor, Elisseeff, & Kandola, 2002), and deep architectures can be viewed as a promising development in this direction....
[...]
...…to construct a supervised classifier, and in that case the unsupervised learning component can clearly be seen as a regularizer or a prior (Ng & Jordan, 2002; Lasserre et al., 2006; Liang & Jordan, 2008; Erhan et al., 2009) that forces the resulting parameters to make sense not only to model…...
[...]
...(discriminant models) (Ng & Jordan, 2002; Liang & Jordan, 2008)....
[...]
References
[...]
33,341 citations
13,736 citations
"Learning the Kernel Matrix with Sem..." refers background or methods in this paper
...Kernel-based learning algorithms (see, for example, Cristianini and Shawe-Taylor, 2000; Schölkopf and Smola, 2002) work by embedding the data into a Hilbert space, and searching for linear relations in such a space....
[...]
...(See, for example, Cristianini and Shawe-Taylor, 2000; Schölkopf and Smola, 2002)....
[...]
...With a fixed kernel, all of these criteria give upper bounds on misclassification probability (see, for example, Chapter 4 of Cristianini and Shawe-Taylor, 2000)....
[...]
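As a concrete illustration of "embedding the data into a Hilbert space, and searching for linear relations in such a space", here is a short kernel ridge regression sketch: the Gram matrix supplies the inner products in the feature space, and the fitted function is linear in that space. The RBF kernel, the regularization constant, and the toy data are my own assumptions, not taken from either book or from the paper.

```python
# Sketch: linear relations in a kernel-induced feature space via kernel ridge regression.
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)

K = rbf_kernel(X, X)                                    # Gram matrix = feature-space inner products
alpha = np.linalg.solve(K + 1e-2 * np.eye(len(X)), y)   # linear model in feature space

X_test = np.linspace(-3, 3, 5)[:, None]
y_pred = rbf_kernel(X_test, X) @ alpha                  # f(x) = sum_i alpha_i k(x, x_i)
print(y_pred)
```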
12,059 citations
"Learning the Kernel Matrix with Sem..." refers background in this paper
...The first kernel K1 is derived as a linear kernel from the "bag-of-words" representation of the different documents, capturing information about the frequency of terms in the different documents (Salton and McGill, 1983)....
[...]
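A minimal sketch of the "bag-of-words" linear kernel described above: each document becomes a term-frequency vector and K1 is the matrix of their inner products. The tiny corpus and the raw term-frequency weighting are illustrative assumptions; the paper builds K1 from its own document collection.

```python
# Sketch: linear kernel on bag-of-words term-frequency vectors.
import numpy as np

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "kernels embed documents in a feature space",
]

vocab = sorted({w for d in docs for w in d.split()})
tf = np.array([[d.split().count(w) for w in vocab] for d in docs], dtype=float)

K1 = tf @ tf.T          # K1[i, j] = <tf_i, tf_j>
print(K1)
```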
10,262 citations
"Learning the Kernel Matrix with Sem..." refers methods in this paper
...Moreover, an additional kernel matrix is constructed by applying the Smith-Waterman (SW) pairwise sequence comparison algorithm (Smith and Waterman, 1981) to the yeast protein sequences and applying the empirical kernel map (Tsuda, 1999)....
[...]
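A sketch of the empirical-kernel-map construction mentioned above: each item is represented by its vector of pairwise similarity scores against all items, which turns an arbitrary (possibly indefinite) similarity such as Smith-Waterman into a positive semidefinite kernel. The random score matrix below is a stand-in; the paper computes real SW scores between yeast protein sequences.

```python
# Sketch: empirical kernel map applied to a pairwise similarity matrix S.
import numpy as np

rng = np.random.default_rng(2)
n = 6
S = rng.uniform(0, 100, size=(n, n))      # stand-in for pairwise SW scores
S = (S + S.T) / 2                         # pairwise scores are symmetric

Phi = S                                   # phi(x_i) = i-th row of scores against all items
K_sw = Phi @ Phi.T                        # positive semidefinite by construction
print(np.all(np.linalg.eigvalsh(K_sw) >= -1e-9))   # PSD check
```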
7,655 citations
"Learning the Kernel Matrix with Sem..." refers background or methods in this paper
...Note also that the optimal weights µ_i, i = 1, …, m, can be recovered from the primal-dual solution found by standard software such as SeDuMi (Sturm, 1999)....
[...]
...General-purpose programs such as SeDuMi (Sturm, 1999) use interior-point methods to solve SDP problems (Nesterov and Nemirovsky, 1994); they are polynomial time, but have a worst-case complexity O(n^4....
[...]
...A general-purpose program such as SeDuMi (Sturm, 1999) handles those problems efficiently....
[...]
...…weights {µ_i}_{i=1}^{3} are optimized according to a hard margin, a 1-norm soft margin and a 2-norm soft margin criterion, respectively; the semi-definite programs (27), (32) and (38) are solved using the general-purpose optimization software SeDuMi (Sturm, 1999), leading to optimal weights {µ_i^*}_{i=1}^{3}....
[...]
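The excerpts above describe solving the paper's semi-definite programs (27), (32) and (38) with SeDuMi and reading the optimal weights µ_i off the primal-dual solution. As a rough stand-in, the sketch below learns nonnegative weights for a combination of three toy kernels with CVXPY rather than SeDuMi, maximizing the unnormalized alignment ⟨K(µ), yy^T⟩_F under a Frobenius-norm constraint; this is a simpler member of the same family of convex problems, not the paper's margin-based SDP formulations, and the kernels and labels are toy data.

```python
# Sketch: learning nonnegative kernel-combination weights by convex optimization.
# CVXPY with its default solver stands in for SeDuMi; the objective is alignment
# maximization, not the paper's hard/soft margin criteria.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 4))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=30))
target = np.outer(y, y)                                    # y y^T

K1 = X @ X.T                                               # linear kernel
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K2 = np.exp(-sq)                                           # Gaussian (RBF) kernel
K3 = (X @ X.T + 1.0) ** 2                                  # polynomial kernel
kernels = [K1, K2, K3]

mu = cp.Variable(len(kernels), nonneg=True)                # mu_i >= 0
K = sum(mu[i] * kernels[i] for i in range(len(kernels)))   # K(mu) = sum_i mu_i K_i
objective = cp.Maximize(cp.sum(cp.multiply(target, K)))    # <K(mu), y y^T>_F
problem = cp.Problem(objective, [cp.norm(K, "fro") <= 1])
problem.solve()
print("optimal weights mu:", mu.value)
```

In the paper itself the optimal weights for the margin-based criteria are instead recovered from the dual variables of the SDP/QCQP solution, as the excerpt above notes.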