Proceedings Article

Support vector machines for text categorization

TLDR
This paper compares artificial neural network and support vector machine algorithms for use as text classifiers of news items and identifies a reduction in feature set that provides improved results.
Abstract
Text categorization is the process of sorting text documents into one or more predefined categories or classes of similar documents. Differences in the results of such categorization arise from the feature set chosen as the basis for associating a given document with a given category. Advocates of text categorization recognize that sorting text documents into categories of like documents reduces the overhead required for fast retrieval of such documents and provides smaller domains in which users may explore similar documents. In this paper we examine whether automatic classification of news texts can be improved by prefiltering the vocabulary to reduce the feature set used in the computations. First, we compare artificial neural network and support vector machine algorithms for use as text classifiers of news items. Second, we identify a reduction in feature set that provides improved results.
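
As a minimal illustration of what prefiltering the vocabulary can look like, the sketch below builds a bag-of-words feature set from a few toy news sentences, once with the full vocabulary and once after dropping stop words and rare terms. The stop-word list, the document-frequency threshold `min_df`, the helper names, and the example documents are all assumptions made for illustration; the abstract does not describe the filter actually used in the paper.

```python
from collections import Counter

# Hypothetical stop-word list; the paper's actual filter is not given here.
STOP_WORDS = frozenset({"the", "a", "of", "in", "on", "and", "to", "after"})


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,;:!?\"'()") for word in text.lower().split())
    return [token for token in tokens if token]


def build_vocabulary(documents, min_df=1, stop_words=frozenset()):
    """Return the sorted terms that occur in at least `min_df` documents
    and are not in `stop_words`."""
    doc_freq = Counter()
    for text in documents:
        # Count each term once per document (document frequency).
        doc_freq.update(set(tokenize(text)))
    return sorted(
        term for term, df in doc_freq.items()
        if df >= min_df and term not in stop_words
    )


def vectorize(text, vocabulary):
    """Map a document to a vector of term counts over `vocabulary`."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocabulary]


# Toy documents, invented for this example.
docs = [
    "The market rallied after the central bank cut rates.",
    "The bank reported record profits in the quarter.",
    "The team won the match in the final minute.",
    "Fans cheered after the team scored in the match.",
]

full_vocab = build_vocabulary(docs)
reduced_vocab = build_vocabulary(docs, min_df=2, stop_words=STOP_WORDS)

print(len(full_vocab), "features before prefiltering")
print(len(reduced_vocab), "features after prefiltering:", reduced_vocab)
print(vectorize(docs[3], reduced_vocab))
```

The count vectors produced by `vectorize` are the kind of input either classifier compared in the paper, a neural network or a support vector machine, could be trained on; whether a smaller vocabulary helps or hurts accuracy is an empirical question that this sketch does not answer.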

