Journal ArticleDOI

Optimal feature subset selection using hybrid binary Jaya optimization algorithm for text classification

TLDR
A new hybrid feature selection method based on the normalized difference measure and the binary Jaya optimization algorithm (NDM-BJO) is proposed to obtain an optimal feature subset from the text corpus, using the error rate as the minimizing objective function to measure the fitness of a solution.
Abstract
Feature selection is an important task in the high-dimensional problem of text classification. Most current feature selection methods employ an optimization algorithm to select an optimal subset of features from the high-dimensional feature space; an optimal feature subset reduces computation cost and increases text classifier accuracy. In this paper, we propose a new hybrid feature selection method based on the normalized difference measure and the binary Jaya optimization algorithm (NDM-BJO) to obtain an appropriate subset of optimal features from the text corpus. We use the error rate as the minimizing objective function to measure the fitness of a solution. The nominated optimal feature subsets are evaluated using Naive Bayes and Support Vector Machine classifiers on several popular benchmark text corpora. The observed results confirm that the proposed NDM-BJO method shows promising improvements over existing work.
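To make the abstract's wrapper scheme concrete, the sketch below shows a minimal binary Jaya feature-selection loop that minimizes an error-rate-style fitness over bit vectors. This is an illustrative assumption, not the authors' exact NDM-BJO implementation: the `binary_jaya` function, its parameters, the sigmoid transfer function, and the toy fitness are all hypothetical stand-ins (the real method also uses the normalized difference measure as a filter stage, which is omitted here).

```python
import numpy as np

def binary_jaya(fitness, n_features, pop_size=20, iters=50, seed=0):
    """Minimal binary Jaya sketch: minimize `fitness` over bit vectors.

    Hypothetical helper for illustration only; not the paper's NDM-BJO code.
    """
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(pop_size, n_features)).astype(float)
    scores = np.array([fitness(ind) for ind in pop])
    for _ in range(iters):
        best = pop[scores.argmin()]    # current best candidate
        worst = pop[scores.argmax()]   # current worst candidate
        r1 = rng.random((pop_size, n_features))
        r2 = rng.random((pop_size, n_features))
        # Jaya update: move toward the best solution, away from the worst.
        cand = pop + r1 * (best - np.abs(pop)) - r2 * (worst - np.abs(pop))
        # A sigmoid transfer function maps the continuous step back to bits.
        prob = 1.0 / (1.0 + np.exp(-cand))
        cand_bits = (rng.random((pop_size, n_features)) < prob).astype(float)
        cand_scores = np.array([fitness(ind) for ind in cand_bits])
        improved = cand_scores < scores      # greedy replacement
        pop[improved] = cand_bits[improved]
        scores[improved] = cand_scores[improved]
    return pop[scores.argmin()], scores.min()

# Toy fitness standing in for a classifier's error rate: it rewards
# selecting exactly features 0 and 1 out of 8.
target = np.zeros(8)
target[:2] = 1
err = lambda bits: np.abs(bits - target).sum()
mask, best_err = binary_jaya(err, n_features=8)
```

In the paper's actual wrapper, the fitness would instead train a Naive Bayes or SVM classifier on the selected columns and return its error rate.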


Citations
Journal ArticleDOI

An Intensive and Comprehensive Overview of JAYA Algorithm, its Versions and Applications.

TL;DR: The JAYA algorithm combines the survival-of-the-fittest principle of evolutionary algorithms with the attraction toward the global optimum found in swarm intelligence methods, as discussed by the authors. The survey covers the algorithm's proposed versions, including modified, binary, hybridized, parallel, chaotic, multi-objective and other variants.
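For reference, the attraction described in this summary comes from Jaya's single update rule (as given in Rao's original formulation, restated here for context rather than quoted from this survey), which moves each candidate toward the current best solution and away from the current worst:

```latex
X'_{j,k,i} = X_{j,k,i}
  + r_{1,j,i}\,\bigl(X_{j,\mathrm{best},i} - \lvert X_{j,k,i}\rvert\bigr)
  - r_{2,j,i}\,\bigl(X_{j,\mathrm{worst},i} - \lvert X_{j,k,i}\rvert\bigr)
```

where $j$ indexes the decision variable, $k$ the candidate, $i$ the iteration, and $r_{1}, r_{2}$ are uniform random numbers in $[0, 1]$. The update is accepted only if it improves the objective, and the algorithm needs no algorithm-specific tuning parameters.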
Journal ArticleDOI

A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities.

TL;DR: In this paper, a systematic literature review of over 200 articles is presented to help analysts, practitioners and researchers in the field of data analytics understand and implement effective FS optimization methods for improved text classification tasks.
Journal ArticleDOI

CS-BPSO: Hybrid feature selection based on chi-square and binary PSO algorithm for Arabic email authorship analysis

TL;DR: An efficient hybrid feature selection algorithm based on binary particle swarm optimization and the chi-square test (CS-BPSO) was developed to enhance the performance of Arabic email authorship analysis.
Journal ArticleDOI

A feature selection model for software defect prediction using binary Rao optimization algorithm

K. Thirumoorthy, +1 more
TL;DR: In this article, a hybrid feature selection (filter-wrapper) approach based on the multi-criteria decision making (MCDM) method and the Rao optimization method was proposed for selecting the most informative features to improve the software defect prediction rate.
Journal ArticleDOI

Wrapper and Hybrid Feature Selection Methods Using Metaheuristic Algorithms for English Text Classification: A Systematic Review

TL;DR: A comprehensive overview systematically explores the available studies of different metaheuristic algorithms used for FS to improve TC, answering four research questions (RQs); a list of thirty-seven related articles was extracted and investigated to generate new knowledge in the domain of study.
References
Journal ArticleDOI

An introduction to variable and feature selection

TL;DR: The contributions of this special issue cover a wide range of aspects of variable selection: providing a better definition of the objective function, feature construction, feature ranking, multivariate feature selection, efficient search methods, and feature validity assessment methods.
Journal ArticleDOI

Wrappers for feature subset selection

TL;DR: The wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain, and the wrapper approach is compared to induction without feature subset selection and to Relief, a filter approach to feature subset selection.
Journal ArticleDOI

A comparison of methods for multiclass support vector machines

TL;DR: Decomposition implementations for two "all-together" multiclass SVM methods are given, and it is shown that for large problems, methods that consider all data at once generally need fewer support vectors.
Journal Article

An extensive empirical study of feature selection metrics for text classification

TL;DR: An empirical comparison of twelve feature selection methods evaluated on a benchmark of 229 text classification problem instances, revealing that a new feature selection metric, called 'Bi-Normal Separation' (BNS), outperformed the others by a substantial margin in most situations and was the top single choice for all goals except precision.
Journal ArticleDOI

Support vector machines for spam categorization

TL;DR: The use of support vector machines in classifying e-mail as spam or nonspam is studied by comparing them to three other classification algorithms: Ripper, Rocchio, and boosting decision trees; SVMs performed best when using binary features.