Author

Ronglu Li

Bio: Ronglu Li is an academic researcher from Fudan University. The author has contributed to research on topics including Bayes' theorem and categorization, has an h-index of 2, and has co-authored 2 publications receiving 41 citations.

Papers
Journal Article
TL;DR: In this article, the authors used the maximum entropy model for text categorization and compared it to Bayes, KNN, and SVM, showing that its performance is higher than Bayes and comparable with KNN and SVM.
Abstract: The Maximum Entropy Model is a probability estimation technique widely used for a variety of natural language tasks. It offers a clean and flexible framework for combining diverse pieces of contextual information to estimate the probability of certain linguistic phenomena. For many NLP tasks this approach performs near the state-of-the-art level, or outperforms other competing probabilistic methods when trained and tested under similar conditions. In this paper, we use the maximum entropy model for text categorization. We compare and analyze its categorization performance using different approaches to text feature generation, different numbers of features, and smoothing techniques. Moreover, in experiments we compare it to Bayes, KNN and SVM, and show that its performance is higher than Bayes and comparable with KNN and SVM. We think it is a promising technique for text categorization.
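
As a rough, hedged illustration of the technique the abstract describes, the sketch below trains a maximum entropy classifier (equivalently, multinomial logistic regression, where p(c|d) is proportional to the exponential of a weighted sum of features) on TF-IDF features. The 20 Newsgroups dataset, the TF-IDF feature generation, and the L2 regularization standing in for smoothing are illustrative assumptions, not the paper's actual experimental setup.

```python
# Minimal maximum-entropy text categorization sketch (assumptions noted above).
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

train = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes"))
test = fetch_20newsgroups(subset="test", remove=("headers", "footers", "quotes"))

# TF-IDF stands in for the paper's feature-generation step; the L2 penalty (C)
# plays a role analogous to smoothing the maximum entropy parameter estimates.
maxent = make_pipeline(
    TfidfVectorizer(max_features=20000, sublinear_tf=True),
    LogisticRegression(max_iter=1000, C=1.0),
)
maxent.fit(train.data, train.target)
print("macro-F1:", f1_score(test.target, maxent.predict(test.data), average="macro"))
```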

35 citations

Book ChapterDOI
14 Apr 2004
TL;DR: This work uses the maximum entropy model for text categorization, compares and analyzes its categorization performance using different approaches to text feature generation, different numbers of features, and smoothing techniques, and concludes that it is a promising technique for text categorization.
Abstract: The Maximum Entropy Model is a probability estimation technique widely used for a variety of natural language tasks. It offers a clean and flexible framework for combining diverse pieces of contextual information to estimate the probability of certain linguistic phenomena. For many NLP tasks this approach performs near the state-of-the-art level, or outperforms other competing probabilistic methods when trained and tested under similar conditions. In this paper, we use the maximum entropy model for text categorization. We compare and analyze its categorization performance using different approaches to text feature generation, different numbers of features, and smoothing techniques. Moreover, in experiments we compare it to Bayes, KNN and SVM, and show that its performance is higher than Bayes and comparable with KNN and SVM. We think it is a promising technique for text categorization.

6 citations


Cited by
Journal ArticleDOI
TL;DR: Experimental results show that the models using MBPNN outperform the basic BPNN, and that applying LSA in this system leads to dramatic dimensionality reduction while achieving good classification results.
Abstract: New text categorization models using a back-propagation neural network (BPNN) and a modified back-propagation neural network (MBPNN) are proposed. An efficient feature selection method is used to reduce the dimensionality as well as improve the performance. The basic BPNN learning algorithm has the drawback of slow training speed, so we modify it to accelerate training; the categorization accuracy is also improved as a consequence. A traditional word-matching based text categorization system uses the vector space model (VSM) to represent documents. However, it needs a high-dimensional space to represent each document and does not take into account the semantic relationships between terms, which can lead to poor classification accuracy. Latent semantic analysis (LSA) can overcome these problems by using statistically derived conceptual indices instead of individual words. It constructs a conceptual vector space in which each term or document is represented as a vector. It not only greatly reduces the dimensionality but also discovers important associative relationships between terms. We test our categorization models on the 20-newsgroup data set; experimental results show that the models using MBPNN outperform the basic BPNN, and that applying LSA in our system leads to dramatic dimensionality reduction while achieving good classification results.
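
A minimal sketch of the LSA-plus-neural-network pipeline outlined above: TF-IDF document vectors are projected into a low-dimensional latent semantic space with truncated SVD (LSA) and then classified with a small feed-forward network. The plain MLPClassifier below is an ordinary back-propagation network, not the paper's MBPNN, and the dimensions and dataset are assumptions.

```python
# LSA dimensionality reduction followed by a feed-forward classifier (sketch).
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")

model = make_pipeline(
    TfidfVectorizer(max_features=30000),      # high-dimensional VSM representation
    TruncatedSVD(n_components=300),           # LSA: project onto ~300 latent concepts
    MLPClassifier(hidden_layer_sizes=(100,), max_iter=50),  # plain BPNN-style classifier
)
model.fit(train.data, train.target)
print("accuracy:", model.score(test.data, test.target))
```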

115 citations

Journal ArticleDOI
TL;DR: It is pointed out that problems such as nonlinearity, skewed data distributions, the labeling bottleneck, hierarchical categorization, scalability of algorithms, and categorization of Web pages are the key problems in the study of text categorization.
Abstract: In recent years, there have been extensive studies and rapid progress in automatic text categorization, which is one of the hotspots and key techniques in the information retrieval and data mining field. Highlighting the state-of-the-art challenges and research trends in content information processing for the Internet and other complex applications, this paper presents a survey of up-to-date developments in text categorization based on machine learning, including models, algorithms and evaluation. It is pointed out that problems such as nonlinearity, skewed data distributions, the labeling bottleneck, hierarchical categorization, scalability of algorithms, and categorization of Web pages are the key problems in the study of text categorization. Possible solutions to these problems are also discussed. Finally, some future directions of research are given.

112 citations

Proceedings ArticleDOI
01 Oct 2018
TL;DR: A hybrid model of LSTM and CNN is proposed that can effectively improve the accuracy of text classification; the performance of the hybrid model is compared with that of other models in the experiments.
Abstract: Text classification is a classic task in the field of natural language processing. However, existing text classification methods still need improvement because of the complex abstraction of text semantic information and the strong relevance of context. In this paper, we combine the advantages of two traditional neural network models, Long Short-Term Memory (LSTM) and the Convolutional Neural Network (CNN): the LSTM can effectively preserve the characteristics of historical information in long text sequences, while the CNN structure extracts local features of the text. We propose a hybrid model of LSTM and CNN, constructing a CNN on top of the LSTM so that the text feature vectors output by the LSTM are further processed by the CNN structure. The performance of the hybrid model is compared with that of other models in the experiments. The experimental results show that the hybrid model can effectively improve the accuracy of text classification.
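
A minimal PyTorch sketch of an LSTM-then-CNN hybrid in the spirit of the abstract: an LSTM reads the token sequence, and a 1-D convolution with max-pooling extracts local features from the LSTM's hidden states before a linear classifier. Vocabulary size, layer dimensions, kernel size, and the number of classes are illustrative assumptions.

```python
# LSTM -> CNN hybrid text classifier (sketch under the assumptions above).
import torch
import torch.nn as nn

class LSTMCNNClassifier(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=128, hidden_dim=128,
                 num_filters=100, kernel_size=3, num_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.conv = nn.Conv1d(hidden_dim, num_filters, kernel_size)
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, token_ids):                # (batch, seq_len)
        x = self.embed(token_ids)                # (batch, seq_len, embed_dim)
        h, _ = self.lstm(x)                      # (batch, seq_len, hidden_dim)
        h = h.transpose(1, 2)                    # (batch, hidden_dim, seq_len)
        c = torch.relu(self.conv(h))             # local features over LSTM states
        pooled = torch.max(c, dim=2).values      # global max pool over time
        return self.fc(pooled)                   # (batch, num_classes)

# Example forward pass on a random batch of token ids.
logits = LSTMCNNClassifier()(torch.randint(1, 20000, (4, 50)))
print(logits.shape)  # torch.Size([4, 10])
```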

67 citations

Journal ArticleDOI
TL;DR: The MBPNN is proposed to accelerate the training speed of BPNN and improve the categorization accuracy, and the application of LSA for the system can lead to dramatic dimensionality reduction while achieving good classification results.
Abstract: This paper proposes a new text categorization model based on the combination of a modified back-propagation neural network (MBPNN) and latent semantic analysis (LSA). The traditional back-propagation neural network (BPNN) has a slow training speed and is easily trapped in a local minimum, which leads to poor performance and efficiency. In this paper, we propose the MBPNN to accelerate the training of the BPNN and improve the categorization accuracy. LSA can overcome the problems of word-based representations by using statistically derived conceptual indices instead of individual words. It constructs a conceptual vector space in which each term or document is represented as a vector. It not only greatly reduces the dimensionality but also discovers important associative relationships between terms. We test our categorization model on the 20-newsgroup and Reuters-21578 corpora; experimental results show that the MBPNN is much faster than the traditional BPNN and also enhances its performance, and that applying LSA in our system leads to dramatic dimensionality reduction while achieving good classification results.
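
The abstract does not spell out the exact modification, but a common way to speed up plain back-propagation and reduce the chance of getting stuck in shallow local minima is to add a momentum term to the weight update. The sketch below shows only that generic update rule, as an assumption about what a "modified BPNN" could look like, not the paper's actual MBPNN.

```python
# Gradient-descent weight update with a momentum term (illustrative assumption,
# not necessarily the paper's MBPNN modification).
import numpy as np

def backprop_step(W, grad, velocity, lr=0.1, momentum=0.9):
    """One weight update: gradient descent plus an accumulated momentum term."""
    velocity = momentum * velocity - lr * grad   # smooth the descent direction
    return W + velocity, velocity

# Toy usage: minimise f(W) = ||W||^2, whose gradient is 2 * W.
W = np.array([1.0, -2.0])
v = np.zeros_like(W)
for _ in range(200):
    W, v = backprop_step(W, grad=2 * W, velocity=v)
print(W)  # converges toward the minimum at [0, 0]
```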

36 citations

Proceedings ArticleDOI
29 Jul 2010
TL;DR: Building on several commonly used classic text classification algorithms, and focusing mainly on the major feature extraction methods, a short text classification approach based on statistics and rules is proposed and shown to outperform other algorithms.
Abstract: In this paper, we first give an overview of short text research and short text classification. Building on several commonly used classic text classification algorithms, and focusing mainly on the major feature extraction methods, we propose a short text classification approach based on statistics and rules. Experiments show that this algorithm performs better than other algorithms. In order to improve the recall rate of short text classification, a two-step classification method is put forward.
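
A hedged sketch of what a two-step "rules plus statistics" short-text classifier could look like: step one applies hand-written keyword rules, and step two falls back to a statistical classifier when no rule fires, which is one way to trade precision in the first step for recall in the second. The rule table, class labels, training snippets, and naive Bayes fallback are all illustrative assumptions, not the system described in the paper.

```python
# Two-step short-text classification: keyword rules first, statistics as fallback.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

RULES = {                 # keyword -> class label (hypothetical examples)
    "goal": "sports",
    "election": "politics",
    "stock": "finance",
}

train_texts = ["the team scored a late goal", "the election results are in",
               "stock prices fell sharply", "parliament passed the bill"]
train_labels = ["sports", "politics", "finance", "politics"]

fallback = make_pipeline(TfidfVectorizer(), MultinomialNB())
fallback.fit(train_texts, train_labels)

def classify_short_text(text):
    for keyword, label in RULES.items():       # step 1: rule matching
        if keyword in text.lower():
            return label
    return fallback.predict([text])[0]         # step 2: statistical fallback

print(classify_short_text("a stunning goal in extra time"))   # matched by a rule
print(classify_short_text("the senate debate continues"))     # handled statistically
```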

33 citations