Proceedings Article

Text classification based on semi-supervised learning

TLDR
Experiments show that classification quality improves after the feature model is improved through a semi-supervised machine learning method and an SVM-based classification algorithm.
Abstract
In this paper, we present our solution and experimental results for applying semi-supervised machine learning techniques and an improved SVM algorithm to build text classification applications. First, we create a feature model based on labeled data, and then we improve it using unlabeled data. The technique for assigning a label to new data is based on binary classification. Our experiment is run on three data layers, extracted from articles on three topics (sports, entertainment, and education) on VNEXPRESS.NET. We compared the classification accuracy before and after improving the feature model with the semi-supervised machine learning method and an SVM-based classification algorithm. The experiments show that classification quality improves once the feature model has been improved.
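
The abstract does not spell out the algorithm, so the following is only a minimal self-training sketch under stated assumptions, not the authors' method. It assumes a bag-of-words feature model, one binary (one-vs-rest) linear SVM per topic trained with Pegasos-style sub-gradient descent, and an acceptance rule in which an unlabeled document receives a label only when exactly one binary classifier scores it positive. All function names, the toy documents, and the parameters `lam`, `epochs`, and `max_rounds` are hypothetical.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words term counts (an assumed stand-in for the feature model)."""
    return Counter(text.lower().split())


def score(w, x):
    """Linear decision value <w, x> for sparse features x."""
    return sum(w.get(f, 0.0) * v for f, v in x.items())


def train_binary_svm(samples, lam=0.01, epochs=20):
    """Pegasos-style sub-gradient descent on the L2-regularised hinge loss.

    samples: list of (Counter, label) pairs with label in {-1, +1}.
    """
    w, t = {}, 0
    for _ in range(epochs):
        for x, y in samples:
            t += 1
            eta = 1.0 / (lam * t)                # decaying step size
            violated = y * score(w, x) < 1.0     # margin check on current w
            shrink = 1.0 - 1.0 / t               # equals 1 - eta * lam
            w = {f: v * shrink for f, v in w.items()}
            if violated:
                for f, v in x.items():
                    w[f] = w.get(f, 0.0) + eta * y * v
    return w


def train_one_vs_rest(labeled):
    """Train one binary SVM per class: that class against all others."""
    classes = sorted({label for _, label in labeled})
    return {
        c: train_binary_svm(
            [(featurize(d), 1 if l == c else -1) for d, l in labeled]
        )
        for c in classes
    }


def confident_label(models, text):
    """Return a class only if exactly one binary classifier votes positive."""
    x = featurize(text)
    positive = [c for c, w in models.items() if score(w, x) > 0]
    return positive[0] if len(positive) == 1 else None


def predict(models, text):
    """Pick the class whose binary classifier gives the highest score."""
    x = featurize(text)
    return max(models, key=lambda c: score(models[c], x))


def self_train(labeled, unlabeled, max_rounds=5):
    """Grow the labeled set with confidently labeled documents, then retrain."""
    labeled, pool = list(labeled), list(unlabeled)
    models = train_one_vs_rest(labeled)
    for _ in range(max_rounds):
        guesses = [(d, confident_label(models, d)) for d in pool]
        accepted = [(d, c) for d, c in guesses if c is not None]
        if not accepted:
            break
        labeled.extend(accepted)
        pool = [d for d, c in guesses if c is None]
        models = train_one_vs_rest(labeled)
    return models, labeled


# Toy data, invented for illustration only.
seed_docs = [
    ("match goal team", "sports"),
    ("match coach team", "sports"),
    ("film singer concert", "entertainment"),
    ("film actor show", "entertainment"),
    ("school exam teacher", "education"),
    ("school student class", "education"),
]
unlabeled_docs = ["match stadium", "film premiere", "school university"]

models, grown = self_train(seed_docs, unlabeled_docs)
print(grown[len(seed_docs):])           # pseudo-labeled documents
print(predict(models, "school exam"))   # topic predicted for a new document
```

Comparing `train_one_vs_rest(seed_docs)` with the models returned by `self_train` on a held-out set would mirror the before/after comparison the abstract describes. With scikit-learn available, `LinearSVC` and `SelfTrainingClassifier` could presumably replace the hand-written trainer and loop.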


Citations

An Augmented PAC Model for Semi-Supervised Learning

TL;DR: This chapter contains sections titled: Introduction, A Formal Framework, Sample Complexity Results, Algorithmic Results, Related Models and Discussion.
Proceedings Article

Experimental Exploration of Support Vector Machine for Cancer Cell Classification

TL;DR: The Breast Cancer Wisconsin (Diagnostic) data sets are classified using four SVM kernel methods (linear, polynomial, sigmoid, and radial), revealing that the radial kernel method is best suited to these data sets.
Journal Article

Text Classification Based on SVM and Text Summarization

TL;DR: A solution that combines two algorithms, searching for maximal frequent wordsets and clustering, to extract the main idea of the text before classifying; it is more stable and faster than supervised or semi-supervised learning based on the support vector.
Proceedings Article

A Definitive Survey of How to Use Unsupervised Text Classifiers

Sachin Sharma, +1 more
TL;DR: This paper used the Punjabi collection for basic news items from various news outlets to increase the precision of their classifications, such as Unusual Forest, Svms, and Rapid deployment Regressors, to explore the exactness of classifiers.
Proceedings Article

The Use of Supervised Text Classification Techniques: A Comprehensive Study

TL;DR: Text classification accuracy was investigated using Naive Bayes and other machine learning classification techniques, as well as Random Forest, Support Vector Machines, and Logistic Regression approaches.
References
Proceedings Article

Combining labeled and unlabeled data with co-training

TL;DR: A PAC-style analysis is provided for a problem setting motivated by the task of learning to classify web pages, in which the description of each example can be partitioned into two distinct views, to allow inexpensive unlabeled data to augment a much smaller set of labeled examples.
Book

Semi-Supervised Learning

TL;DR: Semi-supervised learning (SSL) is the middle ground between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no label data are given).
Proceedings Article

A comparison of event models for naive bayes text classification

TL;DR: It is found that the multi-variate Bernoulli performs well with small vocabulary sizes, but that the multinomial usually performs even better at larger vocabulary sizes, providing on average a 27% reduction in error over the multi-variate Bernoulli model at any vocabulary size.
Journal Article

Text Classification from Labeled and Unlabeled Documents using EM

TL;DR: This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents, and presents two extensions to the algorithm that improve classification accuracy under these conditions.
Proceedings Article

Transductive Inference for Text Classification using Support Vector Machines

TL;DR: An analysis of why Transductive Support Vector Machines are well suited for text classification is presented, and an algorithm for training TSVMs, handling 10,000 examples and more, is proposed.