Support vector machines for text categorization

doi:10.1109/HICSS.2003.1174243

Proceedings Article

Vol. 5, p. 103

TLDR

This paper compares artificial neural network and support vector machine algorithms for use as text classifiers of news items and identifies a reduction in feature set that provides improved results.

Abstract:

Text categorization is the process of sorting text documents into one or more predefined categories or classes of similar documents. Differences in the results of such categorization arise from the feature set on which the association of a given document with a given category is based. Advocates of text categorization recognize that sorting text documents into categories of like documents reduces the overhead required for fast retrieval of such documents and provides smaller domains in which users may explore similar documents. In this paper we examine whether automatic classification of news texts can be improved by prefiltering the vocabulary to reduce the feature set used in the computations. First, we compare artificial neural network and support vector machine algorithms for use as text classifiers of news items. Second, we identify a reduction in feature set that provides improved results.
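
The idea of prefiltering the vocabulary before training a classifier can be illustrated with a minimal sketch. It is not the paper's method or data: the toy corpus, the stop-word list, the `min_df` threshold, and all function names below are assumptions made for illustration. The linear SVM is trained with a deterministic, Pegasos-style subgradient pass over the regularized hinge loss and omits the bias term for brevity.

```python
from collections import Counter

# Hypothetical toy corpus (not the paper's news data): +1 = sports, -1 = finance.
DOCS = [
    ("the team won the match with a late goal", 1),
    ("the coach praised the team after the game", 1),
    ("a late goal decided the cup match", 1),
    ("the striker scored in the game", 1),
    ("the bank raised the interest rate", -1),
    ("the market fell after the rate decision", -1),
    ("shares in the bank rose on the market", -1),
    ("investors sold shares after the report", -1),
]

# Hypothetical stop-word list used for prefiltering.
STOP_WORDS = frozenset({"the", "a", "in", "on", "with", "after"})


def build_vocabulary(texts, min_df=1, stop_words=frozenset()):
    """Keep terms found in at least min_df documents and not in stop_words."""
    df = Counter()
    for text in texts:
        df.update(set(text.split()))  # count each term once per document
    return sorted(t for t, n in df.items() if n >= min_df and t not in stop_words)


def vectorize(text, vocab):
    """Binary bag-of-words vector over the given vocabulary."""
    words = set(text.split())
    return [1.0 if term in words else 0.0 for term in vocab]


def train_linear_svm(X, y, lam=0.01, epochs=200):
    """Subgradient descent on the L2-regularized hinge loss (no bias term)."""
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            w = [wj * (1.0 - eta * lam) for wj in w]  # regularization shrinkage
            if margin < 1.0:  # hinge loss is active for this example
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
    return w


def predict(text, vocab, w):
    """Return +1 or -1 from the sign of the linear score."""
    score = sum(wj * xj for wj, xj in zip(w, vectorize(text, vocab)))
    return 1 if score > 0 else -1


texts = [text for text, _ in DOCS]
labels = [label for _, label in DOCS]

full_vocab = build_vocabulary(texts)  # no prefiltering
small_vocab = build_vocabulary(texts, min_df=2, stop_words=STOP_WORDS)

w = train_linear_svm([vectorize(text, small_vocab) for text in texts], labels)
print(len(full_vocab), "terms before prefiltering,", len(small_vocab), "after")
print(predict("the team scored a late goal", small_vocab, w))
```

On this toy corpus the filter shrinks the vocabulary from 30 terms to 9 while the classifier still separates the two classes. Whether a given reduction improves accuracy on real news text is an empirical question of the kind the paper investigates; the sketch only shows where such a filter sits in the pipeline.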

Citations

Journal Article

Application of SVM and ANN for intrusion detection

Wun-Hwa Chen, +2 more

- 01 Oct 2005 -

Computers & Operations Research

TL;DR: Two data mining methodologies (Artificial Neural Networks and Support Vector Machines) and two encoding methods (a simple frequency-based scheme and the tf×idf scheme) are used to detect potential system intrusions in this study.

Book Chapter

The intention behind web queries

Ricardo Baeza-Yates, +2 more

TL;DR: This work presents a framework for the identification of user’s interest in an automatic way, based on the analysis of query logs, and establishes that the combination of supervised and unsupervised learning is a good alternative to find user’s goals.

Proceedings Article

On Attacking Statistical Spam Filters.

Greg Wittel, +1 more

TL;DR: This work examines the general attack methods spammers use, along with challenges faced by developers and spammers, and demonstrates an attack that, while easy to implement, attempts to more strongly work against the statistical nature behind filters.

Journal Article

Using support vector machines with a novel hybrid feature selection method for diagnosis of erythemato-squamous diseases

Juanying Xie, +1 more

- 01 May 2011 -

Expert Systems With Applications

TL;DR: A diagnosis model based on support vector machines (SVM) with a novel hybrid feature selection method, which combines an improved F-score with Sequential Forward Search (IFSFS), is developed to diagnose erythemato-squamous diseases; its results are very promising compared with previously reported results.

Journal Article

Development of early-warning protocol for predicting chlorophyll-a concentration using machine learning models in freshwater and estuarine reservoirs, Korea

Yongeun Park, +4 more

- 01 Jan 2015 -

Science of The Total Environment

TL;DR: An effective early-warning prediction method for Chl-a concentration and a eutrophication management scheme for reservoirs are suggested, and a 7-day interval was determined to be an efficient early-warning interval in the two reservoirs.

References

Journal Article

LIBSVM: A library for support vector machines

Chih-Chung Chang, +1 more

- 06 May 2011 -

ACM Transactions on Intelligent Systems ...

TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

Book

The Nature of Statistical Learning Theory

Vladimir Vapnik

TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

Journal Article

A Tutorial on Support Vector Machines for Pattern Recognition

Christopher John Burges

- 01 Jun 1998 -

Data Mining and Knowledge Discovery

TL;DR: Several arguments that support the observed high accuracy of SVMs are reviewed, and numerous examples and proofs of most of the key theorems are given.

Journal Article

Practical Methods of Optimization.

Christoph Witzgall, +1 more

- 01 Oct 1989 -

Mathematics of Computation

Book Chapter

Text Categorization with Suport Vector Machines: Learning with Many Relevant Features

Thorsten Joachims

TL;DR: This paper explores the use of Support Vector Machines for learning text classifiers from examples and analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task.
