Text classification based on semi-supervised learning

doi:10.1109/SOCPAR.2013.7054133

Proceedings Article

Vo Duy Thanh, +3 more

- pp 232-236

TLDR

Experiments show that classification quality improves after the feature model is refined. The paper compares classification accuracy before and after refining the feature model with a semi-supervised machine learning method and an SVM-based classification algorithm.

Abstract:

In this paper, we present our solution and experimental results for applying semi-supervised machine learning techniques and an improved SVM algorithm to build text classification applications. First, we create a feature model based on labeled data, and then we improve it using unlabeled data. The technique for assigning labels to new data is based on binary classification. Our experiment is carried out on three data classes extracted from articles on three topics (sports, entertainment, and education) on VNEXPRESS.NET. We compare the accuracy of the classification results before and after improving the feature model through the semi-supervised machine learning method and the SVM-based classification algorithm. The experiments show that classification quality improves once the feature model has been improved.
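The abstract does not spell out the labeling procedure, so the following is only a minimal sketch of one plausible reading: self-training with a binary linear SVM, where confidently scored unlabeled texts are pseudo-labeled and folded back into the feature model. The function names, the toy English sentences, the confidence `threshold`, and the hinge-loss subgradient trainer are all assumptions made for illustration; they are not taken from the paper, which works on Vietnamese news articles and three topics.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def build_vocab(labeled):
    """Feature model (assumed): sorted vocabulary of all words in labeled texts."""
    return sorted({w for text, _ in labeled for w in tokenize(text)})

def featurize(text, vocab):
    """Bag-of-words count vector; words outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    return [float(counts[w]) for w in vocab]

def train_linear_svm(X, y, epochs=500, lam=0.01, lr=0.01):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss.
    Labels y must be +1 or -1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1
            for j, xj in enumerate(xi):
                w[j] -= lr * (lam * w[j] - (yi * xj if violated else 0.0))
            if violated:
                b += lr * yi
    return w, b

def fit(labeled):
    """Rebuild the feature model from the labeled set and train an SVM on it."""
    vocab = build_vocab(labeled)
    X = [featurize(t, vocab) for t, _ in labeled]
    y = [label for _, label in labeled]
    return vocab, train_linear_svm(X, y)

def score(vocab, model, text):
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, featurize(text, vocab))) + b

def predict(vocab, model, text):
    return 1 if score(vocab, model, text) > 0 else -1

def self_train(labeled, unlabeled, threshold=0.5, max_rounds=5):
    """Pseudo-label unlabeled texts whose |score| >= threshold, then retrain."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        vocab, model = fit(labeled)
        scored = [(t, score(vocab, model, t)) for t in pool]
        confident = [(t, 1 if s > 0 else -1) for t, s in scored if abs(s) >= threshold]
        if not confident:
            break
        labeled.extend(confident)
        pool = [t for t, s in scored if abs(s) < threshold]
    vocab, model = fit(labeled)  # final model on the enlarged labeled set
    return vocab, model, labeled

# Toy data (hypothetical): +1 stands for sports, -1 for education.
seed = [("football match goal", 1), ("team wins football cup", 1),
        ("school exam teacher", -1), ("students pass school exam", -1)]
unlabeled = ["football team goal striker", "teacher school students classroom"]
vocab, model, grown = self_train(seed, unlabeled)
```

In this sketch the vocabulary is rebuilt from the enlarged labeled set on every round, so words that occur only in unlabeled texts (here `striker` and `classroom`) enter the feature model once those texts are pseudo-labeled. A three-topic setup like the one in the abstract could be assembled from several such binary classifiers, for example one per topic, though the paper's actual arrangement is not stated here.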

Citations

An Augmented PAC Model for Semi-Supervised Learning

Olivier Chapelle, +2 more

TL;DR: This chapter contains sections titled: Introduction, A Formal Framework, Sample Complexity Results, Algorithmic Results, Related Models and Discussion.

Proceedings Article

Experimental Exploration of Support Vector Machine for Cancer Cell Classification

Kevin Joy Dsouza, +1 more

TL;DR: The Breast Cancer Wisconsin (Diagnostic) data sets are classified using four SVM kernel methods (linear, polynomial, sigmoid, and radial), revealing that the radial kernel method is best suited to these data sets.

Journal Article

Text Classification Based on SVM and Text Summarization

Vo Duy Thanh, +3 more

- 02 Nov 2015 -

International journal of engineering res...

TL;DR: A solution that combines two algorithms, maximal frequent wordset search and clustering, to extract the main idea of the text before classifying; it is reported to be more stable and faster than supervised or semi-supervised learning based on the support vector.

Proceedings Article

A Definitive Survey of How to Use Unsupervised Text Classifiers

Sachin Sharma, +1 more

TL;DR: This paper used the Punjabi collection for basic news items from various news outlets to increase the precision of their classifications, such as Unusual Forest, Svms, and Rapid deployment Regressors, to explore the exactness of classifiers.

Proceedings Article

The Use of Supervised Text Classification Techniques: A Comprehensive Study

Vijay Kumar Soni, +1 more

TL;DR: Text classification accuracy was investigated using Naive Bayes and other machine learning classification techniques, as well as Random Forest, Support Vector Machines, and Logistic Regression approaches.

References

Proceedings Article

Combining labeled and unlabeled data with co-training

Avrim Blum, +1 more

TL;DR: A PAC-style analysis is provided for a problem setting motivated by the task of learning to classify web pages, in which the description of each example can be partitioned into two distinct views, allowing inexpensive unlabeled data to augment a much smaller set of labeled examples.

Book

Semi-Supervised Learning

Olivier Chapelle, +2 more

TL;DR: Semi-supervised learning (SSL) as discussed by the authors is the middle ground between supervised learning (in which all training examples are labeled) and unsupervised training (where no label data are given).

Proceedings Article

A comparison of event models for naive bayes text classification

Andrew McCallum, +1 more

TL;DR: It is found that the multi-variate Bernoulli performs well with small vocabulary sizes, but that the multinomial usually performs even better at larger vocabulary sizes, providing on average a 27% reduction in error over the multi-variate Bernoulli model at any vocabulary size.

Journal Article

Text Classification from Labeled and Unlabeled Documents using EM

Kamal Nigam, +3 more

- 01 May 2000 -

Machine Learning

TL;DR: This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents, and presents two extensions to the algorithm that improve classification accuracy under these conditions.

Proceedings Article

Transductive Inference for Text Classification using Support Vector Machines

Thorsten Joachims

TL;DR: An analysis of why Transductive Support Vector Machines are well suited for text classification is presented, and an algorithm for training TSVMs that handles 10,000 examples and more is proposed.

Related Papers (5)

Chinese News Text Classification Based on Machine Learning Algorithm

Trained SVMs based rules extraction method for text classification

The instructional design of Chinese text classification based on SVM

Efficient text classification by weighted proximal SVM

Hierarchical Text Classification Incremental Learning