An introduction to variable and feature selection
Citations
Cites background from "An introduction to variable and fea..."
...As many pattern recognition techniques were originally not designed to cope with large amounts of irrelevant features, combining them with FS techniques has become a necessity in many applications [43, 78, 79]....
[...]
Cites background from "An introduction to variable and fea..."
...For example, Guyon et al. (2002) demonstrated recursive feature elimination with support vector machine classification models for a well-known colon cancer microarray data set....
[...]
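A minimal sketch of what the excerpt describes: recursive feature elimination (RFE) wrapped around a linear SVM, in the spirit of Guyon et al. (2002), using scikit-learn. The data here are a synthetic stand-in, not the colon cancer microarray set.

```python
# RFE with a linear SVM: at each step, drop the feature(s) with the
# smallest |w_i| of the trained SVM, then retrain on the survivors.
# Synthetic data; scikit-learn assumed available.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=100, n_features=50,
                           n_informative=5, random_state=0)

selector = RFE(estimator=LinearSVC(C=1.0, dual=False, max_iter=10000),
               n_features_to_select=5, step=1)
selector.fit(X, y)
print(selector.support_.sum())  # 5 features retained
```

`selector.support_` is a boolean mask of the retained features and `selector.ranking_` gives the elimination order (1 = kept).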
Cites background or methods from "An introduction to variable and fea..."
...Another method used in the literature is to use the weights of a classifier [1,2,50] to rank the features for removal....
[...]
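The weight-based ranking mentioned in the excerpt can be done in a single pass, without the elimination loop: train one linear classifier and sort features by the magnitude of their weights. A sketch with synthetic data, scikit-learn assumed:

```python
# Rank features by the weight magnitudes |w_i| of a trained linear
# classifier (here a linear SVM); larger |w_i| suggests more relevance.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=3, n_redundant=0,
                           random_state=0)

clf = LinearSVC(dual=False, max_iter=10000).fit(X, y)
order = np.argsort(-np.abs(clf.coef_.ravel()))  # best-ranked first
print(order)  # feature indices, highest |w_i| first
```

Note this one-shot ranking can differ from RFE's, since RFE re-estimates the weights after every elimination step.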
...Embedded methods [1,9,10] include variable selection as part of the training process without splitting the data into training and testing sets....
[...]
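A common concrete instance of the embedded methods the excerpt refers to is L1-regularized regression: the Lasso's penalty drives some coefficients exactly to zero during fitting, so selection happens as part of training itself. A sketch on synthetic data, scikit-learn assumed:

```python
# Embedded selection via the Lasso: the L1 penalty zeroes out some
# coefficients while the model is being fit. Only features 0 and 1
# carry signal in this synthetic example.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)

lasso = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(lasso.coef_)  # surviving feature indices
print(selected)
```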
...One of the simplest criteria is the Pearson correlation coefficient [1,12] defined as:...
[...]
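The excerpt is cut off before the definition; the per-feature Pearson criterion used for ranking is R(i) = cov(X_i, Y) / sqrt(var(X_i) var(Y)), as in Guyon and Elisseeff (2003). A sketch of its sample estimate on synthetic data:

```python
# Per-feature Pearson correlation ranking criterion R(i), estimated
# from centered sums. Feature 0 is nearly a copy of y; feature 1 is
# independent noise, so |R(0)| should dominate |R(1)|.
import numpy as np

def pearson_criterion(X, y):
    """R(i) = cov(X_i, y) / sqrt(var(X_i) * var(y)), per column of X."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = Xc.T @ yc
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return num / den

rng = np.random.default_rng(0)
y = rng.normal(size=200)
X = np.column_stack([y + 0.1 * rng.normal(size=200),  # relevant
                     rng.normal(size=200)])           # irrelevant
R = pearson_criterion(X, y)
print(R)  # |R[0]| close to 1, |R[1]| close to 0
```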
...The focus of feature selection is to select a subset of variables from the input which can efficiently describe the input data while reducing effects from noise or irrelevant variables and still provide good prediction results [1]....
[...]
...relevant variables is addressed in [1] with good examples....
[...]
References
"An introduction to variable and fea..." refers background or methods in this paper
...The proposal of Rakotomamonjy (2003) is to train non-linear SVMs (Boser et al., 1992, Vapnik, 1998) with a regular training procedure and select features with backward elimination like in RFE (Guyon et al., 2002)....
[...]
...Many other types of penalization of the training error have been proposed in the literature (see, e.g., Vapnik, 1998, Hastie et al., 2001)....
[...]
...Many authors resort to using the leave-one-out cross-validation procedure, even though it is known to be a high-variance estimator of generalization error (Vapnik, 1982) and to give overly optimistic results, particularly when data are not properly independently and identically sampled from the "true" distribution....
[...]
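A sketch of the leave-one-out procedure the excerpt criticizes: n folds, each holding out a single sample, so every per-fold classification score is 0 or 1, which hints at why the estimator has high variance. Synthetic data, scikit-learn assumed:

```python
# Leave-one-out cross-validation: one fold per sample. Each fold's
# accuracy on its single held-out point is either 0.0 or 1.0.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = make_classification(n_samples=30, n_features=5, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=LeaveOneOut())
print(len(scores))  # one score per held-out sample: 30
```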
...This is the case, for instance, for the linear least squares model using J = ∑_k (w · x_k + b − y_k)² and for the linear SVM or optimum margin classifier, which minimizes J = (1/2)||w||², under constraints (Vapnik, 1982)....
[...]
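The least-squares objective J = ∑_k (w · x_k + b − y_k)² in the excerpt can be minimized in closed form. A numpy sketch, folding the bias b into the weight vector via a column of ones; the data are synthetic and noiseless so the true parameters are recovered exactly:

```python
# Closed-form minimization of J = sum_k (w . x_k + b - y_k)^2.
# Augmenting X with a ones column turns [w, b] into a single
# least-squares solve.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true, b_true = np.array([1.0, -2.0, 0.5]), 0.3
y = X @ w_true + b_true  # noiseless linear targets

A = np.hstack([X, np.ones((50, 1))])  # bias column
wb, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)
print(wb)  # first three entries = w_true, last entry = b_true
```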
...This correspondence is formally established in the paper of Weston et al. (2003) for the particular case of classification with linear predictors f(x) = w · x + b, in the SVM framework (Boser et al., 1992, Vapnik, 1998)....
[...]