Proceedings Article
Text classification based on semi-supervised learning
Vo Duy Thanh, Pham Minh Tuan, Vo Trung Hung, Doan Van Ban
- pp 232-236
TLDR
Experiments show that classification quality improves after the feature model is refined. The paper compares classification accuracy before and after refining the feature model with a semi-supervised machine learning method and an SVM-based classification algorithm.
Abstract:
In this paper, we present our solution and experimental results for applying semi-supervised machine learning techniques and an improved SVM algorithm to build text classification applications. First, we create a feature model based on labeled data, and then we improve it using unlabeled data. New data is labeled with a technique based on binary classification. Our experiment uses three classes of data extracted from articles on three topics (sports, entertainment, and education) on VNEXPRESS.NET. We compare the accuracy of the classification results before and after improving the feature model with the semi-supervised machine learning method and the SVM-based classification algorithm. The experiments show that classification quality improves after the feature model is improved.
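The workflow described in the abstract (fit a model on labeled data, label new data with a binary classifier, then refit) resembles what is usually called self-training. The sketch below illustrates that general idea only and is not the authors' implementation. The abstract does not specify the features, the confidence rule, or the SVM improvements, so the bag-of-words features, the `min_confidence` threshold, the plain sub-gradient linear SVM, the toy English documents, and every function name here are assumptions made for illustration.

```python
from collections import Counter

def featurize(text):
    """Bag-of-words term counts (an assumed stand-in for the feature model)."""
    return Counter(text.lower().split())

def score(w, x):
    """Signed decision value w . x for a sparse feature vector x."""
    return sum(w.get(term, 0.0) * count for term, count in x.items())

def train_linear_svm(samples, labels, epochs=50, eta=0.1, lam=0.01):
    """Linear SVM fitted by sub-gradient descent on the L2-regularized hinge loss."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * score(w, x)
            for term in w:                    # gradient step on the L2 penalty
                w[term] *= 1.0 - eta * lam
            if margin < 1.0:                  # hinge loss is active for this sample
                for term, count in x.items():
                    w[term] = w.get(term, 0.0) + eta * y * count
    return w

def self_train(labeled, unlabeled, min_confidence=0.5, max_rounds=5):
    """Add confidently pseudo-labeled documents to the training set, then refit."""
    samples = [featurize(text) for text, _ in labeled]
    labels = [label for _, label in labeled]
    pool = [featurize(text) for text in unlabeled]
    w = train_linear_svm(samples, labels)
    for _ in range(max_rounds):
        confident = [x for x in pool if abs(score(w, x)) >= min_confidence]
        if not confident:
            break
        for x in confident:
            samples.append(x)
            labels.append(1 if score(w, x) > 0 else -1)
        pool = [x for x in pool if abs(score(w, x)) < min_confidence]
        w = train_linear_svm(samples, labels)
    return w

# Toy binary task (assumed): +1 = sports, -1 = education.
labeled = [("football match goal", 1), ("teacher school exam", -1)]
unlabeled = ["football goal striker penalty", "school exam student curriculum"]

base = train_linear_svm([featurize(t) for t, _ in labeled],
                        [y for _, y in labeled])
refined = self_train(labeled, unlabeled)

query = featurize("striker penalty")
print("before:", score(base, query), "after:", score(refined, query))
```

In this toy run the base model scores the query at zero, because neither word occurs in the labeled set, while the refined model gives it a positive (sports) score learned from the pseudo-labeled document. Handling three topics, as in the paper's experiment, would need a multi-class scheme such as one binary classifier per topic, which this sketch leaves out.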