Top 2 papers published by Saptarsi Goswami from Bangabasi College in 2014

Journal Article

A Novel Feature Selection Technique for Text Classification Using Naïve Bayes

Subhajit Dey Sarkar, Saptarsi Goswami, Aman Agarwal, Javed Aktar

29 Oct 2014-International Scholarly Research Notices

TL;DR: A two-step feature selection method that applies univariate feature selection first and feature clustering second; the proposed algorithm is shown to outperform traditional methods such as greedy-search-based wrappers or CFS.

Abstract: With the proliferation of unstructured data, text classification or text categorization has found many applications in topic classification, sentiment analysis, authorship identification, spam detection, and so on. There are many classification algorithms available. Naive Bayes remains one of the oldest and most popular classifiers. On one hand, implementation of naive Bayes is simple and, on the other hand, this also requires fewer amounts of training data. From the literature review, it is found that naive Bayes performs poorly compared to other classifiers in text classification. As a result, this makes the naive Bayes classifier unusable in spite of the simplicity and intuitiveness of the model. In this paper, we propose a two-step feature selection method based on firstly a univariate feature selection and then feature clustering, where we use the univariate feature selection method to reduce the search space and then apply clustering to select relatively independent feature sets. We demonstrate the effectiveness of our method by a thorough evaluation and comparison over 13 datasets. The performance improvement thus achieved makes naive Bayes comparable or superior to other classifiers. The proposed algorithm is shown to outperform other traditional methods like greedy search based wrapper or CFS.

57 citations

Journal Article

Feature Selection: A Practitioner View

Saptarsi Goswami, Amlan Chakrabarti

08 Oct 2014-International Journal of Information Technology and Computer Science

TL;DR: A near-comprehensive list of problems solved using feature selection across technical and commercial domains is produced, which can serve as a valuable tool for practitioners across industry and academia.

Abstract: Feature selection is one of the most important preprocessing steps in data mining and knowledge Engineering. In this short review paper, apart from a brief taxonomy of current feature selection methods, we review feature selection methods that are being used in practice. Subsequently we produce a near comprehensive list of problems that have been solved using feature selection across technical and commercial domain. This can serve as a valuable tool to practitioners across industry and academia. We also present empirical results of filter based methods on various datasets. The empirical study covers task of classification, regression, text classification and clustering respectively. We also compare filter based ranking methods using rank correlation.

44 citations
