Author

Li Ronglu

Bio: Li Ronglu is an academic researcher from Fudan University. The author has contributed to research in topics: Classification rule & n-gram. The author has an h-index of 2 and has co-authored 3 publications receiving 54 citations.

Papers
Journal ArticleDOI
Chen Xiao-Yun, Chen Yi, Wang Lei, Li Ronglu, Hu Yunfa 
TL;DR: This study shows that word frequency helps improve the accuracy of association categorization and that a classification rule tree improves its efficiency.

Abstract: The association categorization approach based on frequent patterns has recently been presented; it builds classification rules from the frequent patterns in each category and classifies new text using these rules. However, the method has two shortcomings when applied to text data: it ignores each word's frequency within a text, and the rule pruning used to improve classification efficiency causes a marked drop in accuracy when large numbers of rules are generated. Therefore, a text categorization algorithm based on frequent patterns with term frequency is presented. This study shows that word frequency helps improve the accuracy of association categorization and that a classification rule tree improves its efficiency. Experimental results show that this association classification outperforms three typical text classification methods, Bayes, kNN (k-nearest neighbor), and SVM (support vector machines), making it a promising text classification method.
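The rule-based classification with term-frequency weighting described above might look like the following minimal sketch. The rule format, the example rules, and the scoring function are illustrative assumptions, not the paper's exact algorithm:

```python
# Minimal sketch of association-based text classification weighted by
# term frequency. Rules, confidences, and scoring are hypothetical.
from collections import Counter

# Each rule: (frozenset of words, category, confidence) - assumed format.
RULES = [
    (frozenset({"goal", "match"}), "sports", 0.9),
    (frozenset({"stock", "market"}), "finance", 0.8),
    (frozenset({"match", "market"}), "finance", 0.4),
]

def classify(text, rules=RULES):
    """Score each category by its matched rules, weighting each rule's
    confidence by the frequency of its terms in the text."""
    tf = Counter(text.lower().split())
    scores = Counter()
    for pattern, category, conf in rules:
        if all(word in tf for word in pattern):
            weight = sum(tf[word] for word in pattern)  # term-frequency weight
            scores[category] += conf * weight
    return scores.most_common(1)[0][0] if scores else None

print(classify("the stock market fell as the market opened"))  # finance
```

Weighting by term frequency is the paper's key addition over plain rule matching: a rule whose terms occur often in the document contributes more to its category's score.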

7 citations

Book ChapterDOI
Ma Haibing1, Wang Chen1, Li Ronglu1, Liu Yong1, Hu Yunfa1 
29 Mar 2005
TL;DR: TG, an efficient pattern-growth algorithm for mining frequent embedded subtrees in a forest of rooted, labeled, and ordered trees, is presented; TG is found to outperform TreeMiner, one of the fastest previously proposed methods, by a factor of 4 to 15.

Abstract: Methods for mining frequent trees are widely used in domains such as bioinformatics, web mining, and chemical compound structure mining. In this paper, we present TG, an efficient pattern-growth algorithm for mining frequent embedded subtrees in a forest of rooted, labeled, and ordered trees. It uses a rightmost-path expansion scheme to construct the complete pattern growth space, and creates a projected database for every grow point of the pattern ready to grow. The problem is thereby transformed from mining frequent trees to finding frequent nodes in the projected database. We conduct detailed experiments to test its performance and scalability and find that TG outperforms TreeMiner, one of the fastest methods proposed before, by a factor of 4 to 15.
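The rightmost-path expansion idea can be sketched as follows: a new node may only be attached to nodes on the rightmost root-to-leaf path of the current pattern, which enumerates each ordered tree exactly once. The (label, children-list) tree encoding is an assumption for illustration; this is a sketch of the expansion scheme, not the TG algorithm itself:

```python
# Sketch of rightmost-path expansion for ordered, labeled trees.
# A tree is encoded as (label, [children]) - a hypothetical encoding.
import copy

def rightmost_path(tree):
    """Return the nodes along the rightmost root-to-leaf path."""
    path = [tree]
    node = tree
    while node[1]:                 # while the node has children
        node = node[1][-1]          # descend into the rightmost child
        path.append(node)
    return path

def expansions(tree, new_label):
    """All one-node extensions attaching new_label to a rightmost-path node."""
    results = []
    for depth in range(len(rightmost_path(tree))):
        t = copy.deepcopy(tree)     # copy, then append a leaf at the grow point
        node = t
        for _ in range(depth):
            node = node[1][-1]
        node[1].append((new_label, []))
        results.append(t)
    return results
```

Restricting growth to the rightmost path is what makes the candidate space complete without duplicates; in TG each such grow point additionally gets its own projected database, which is not modeled here.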

1 citation


Cited by
Journal ArticleDOI
TL;DR: Experimental results show that the models using MBPNN outperform the basic BPNN, and that applying LSA to the system leads to dramatic dimensionality reduction while achieving good classification results.

Abstract: New text categorization models using a back-propagation neural network (BPNN) and a modified back-propagation neural network (MBPNN) are proposed. An efficient feature selection method is used to reduce the dimensionality as well as improve the performance. The basic BPNN learning algorithm has the drawback of slow training speed, so we modify it to accelerate training; categorization accuracy also improves as a result. Traditional word-matching-based text categorization uses the vector space model (VSM) to represent documents. However, the VSM needs a high-dimensional space and does not take into account the semantic relationships between terms, which can also lead to poor classification accuracy. Latent semantic analysis (LSA) can overcome these problems by using statistically derived conceptual indices instead of individual words. It constructs a conceptual vector space in which each term or document is represented as a vector. This not only greatly reduces the dimensionality but also discovers important associative relationships between terms. We test our categorization models on the 20-newsgroup data set; experimental results show that the models using MBPNN outperform the basic BPNN, and that the application of LSA in our system leads to dramatic dimensionality reduction while achieving good classification results.
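The LSA dimensionality-reduction step described above amounts to a truncated SVD of the term-document matrix. A minimal sketch, assuming a toy count matrix (the paper's preprocessing and feature selection are not reproduced here):

```python
# Minimal LSA sketch: truncated SVD of a toy term-document matrix.
import numpy as np

# Rows = terms, columns = documents (illustrative counts).
X = np.array([
    [2, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 2, 0, 1],
    [0, 0, 2, 2],
], dtype=float)

k = 2  # number of latent dimensions to keep
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Each document becomes a k-dimensional vector in the concept space.
doc_vectors = (np.diag(s[:k]) @ Vt[:k]).T

print(doc_vectors.shape)  # (4, 2): 4 documents in a 2-dimensional space
```

Classifiers (such as the BPNN/MBPNN above) are then trained on these low-dimensional document vectors instead of the raw high-dimensional term counts.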

115 citations

Journal ArticleDOI
TL;DR: It is pointed out that problems such as nonlinearity, skewed data distribution, the labeling bottleneck, hierarchical categorization, algorithm scalability, and the categorization of Web pages are the key problems in the study of text categorization.

Abstract: In recent years, there have been extensive studies and rapid progress in automatic text categorization, which is one of the hotspots and key techniques in the information retrieval and data mining fields. Highlighting the state-of-the-art challenges and research trends in content information processing for the Internet and other complex applications, this paper presents a survey of recent developments in machine-learning-based text categorization, covering models, algorithms, and evaluation. It is pointed out that problems such as nonlinearity, skewed data distribution, the labeling bottleneck, hierarchical categorization, algorithm scalability, and the categorization of Web pages are the key problems in the study of text categorization. Possible solutions to these problems are also discussed. Finally, some future research directions are given.

112 citations

Journal ArticleDOI
TL;DR: A novel deep neural network model, the Attention-Based BiGRU-CNN network (ABBC), is proposed; it combines the characteristics and advantages of convolutional neural networks, attention mechanisms, and recurrent neural networks, and achieves the best performance on the Chinese question classification task.

Abstract: Chinese question classification is one of the essential tasks in natural language processing (NLP) for Chinese, owing to the language's distinctive characteristics. Methods in the literature are usually based on rules or traditional machine learning, which require manually created rules or features, so classification accuracy is constrained by the inherent limitations of these methods. As deep learning-based methods have been shown to mine deep information from text, this article proposes a novel deep neural network model, the Attention-Based BiGRU-CNN network (ABBC), and applies it to the Chinese question classification task. The model combines the characteristics and advantages of convolutional neural networks, attention mechanisms, and recurrent neural networks. Our model can not only extract the features of Chinese questions effectively but also learn the context information of words, addressing the Text-CNN model's loss of positional features. Comparing our model with four other classic models, the experimental results show that our model achieves the best performance on the Chinese question classification task.
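The "attention" part of such a model typically pools the recurrent hidden states into a single context vector. A minimal NumPy sketch of attention pooling over BiGRU-style outputs; the shapes and the scoring vector are illustrative assumptions, not the ABBC architecture:

```python
# Sketch of attention pooling over a sequence of recurrent hidden states.
import numpy as np

def attention_pool(H, w):
    """H: (T, d) hidden states for T time steps; w: (d,) learned scoring
    vector (assumed). Returns a softmax-weighted sum of the hidden states."""
    scores = H @ w                       # one scalar score per time step
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                 # softmax over time steps
    return alpha @ H                     # (d,) context vector

T, d = 5, 8
H = np.random.default_rng(0).standard_normal((T, d))  # stand-in for BiGRU outputs
w = np.ones(d)                                        # stand-in scoring vector
c = attention_pool(H, w)
print(c.shape)  # (8,)
```

In the full model, the context vector (possibly combined with convolutional features) would feed a final classification layer; here only the pooling step is shown.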

73 citations

Proceedings ArticleDOI
01 Oct 2018
TL;DR: A hybrid model of LSTM and CNN is proposed that can effectively improve the accuracy of text classification; its performance is compared with that of other models in experiments.

Abstract: Text classification is a classic task in the field of natural language processing. However, existing text classification methods still need improvement because of the complex abstraction of text semantic information and the strong relevance of context. In this paper, we combine the advantages of two traditional neural network models: Long Short-Term Memory (LSTM) and the Convolutional Neural Network (CNN). LSTM can effectively preserve the characteristics of historical information in long text sequences, while the CNN structure extracts local features of the text. We propose a hybrid model of LSTM and CNN, constructing the CNN on top of the LSTM so that the text feature vectors output by the LSTM are further processed by the CNN structure. The performance of the hybrid model is compared with that of other models in the experiment. The experimental results show that the hybrid model can effectively improve the accuracy of text classification.
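The "CNN on top of LSTM outputs" idea can be sketched as a 1-D convolution with max-over-time pooling applied to a sequence of hidden vectors. The LSTM itself is stubbed with random vectors here, and the filter shapes are assumptions, so this shows only the data flow, not the paper's trained model:

```python
# Sketch of a 1-D convolution + max-over-time pooling over LSTM outputs.
import numpy as np

def conv1d_maxpool(H, filters, width=3):
    """H: (T, d) sequence of hidden vectors; filters: (f, width*d) kernels.
    Slides each filter over windows of `width` time steps, then takes the
    max over time, returning an (f,)-dim feature vector."""
    T, d = H.shape
    windows = np.stack([H[t:t + width].ravel() for t in range(T - width + 1)])
    feature_maps = windows @ filters.T          # (T - width + 1, f)
    return feature_maps.max(axis=0)             # max pooling over time

rng = np.random.default_rng(1)
H = rng.standard_normal((10, 16))               # stand-in for LSTM outputs
F = rng.standard_normal((4, 3 * 16))            # 4 filters of width 3
print(conv1d_maxpool(H, F).shape)  # (4,)
```

The pooled feature vector would then feed a softmax classifier; in the full hybrid model both the LSTM and the filters are learned jointly by back-propagation.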

67 citations

Journal ArticleDOI
TL;DR: The MBPNN is proposed to accelerate the training speed of the BPNN and improve categorization accuracy; applying LSA to the system leads to dramatic dimensionality reduction while achieving good classification results.

Abstract: This paper proposes a new text categorization model based on the combination of a modified back-propagation neural network (MBPNN) and latent semantic analysis (LSA). The traditional back-propagation neural network (BPNN) trains slowly and easily becomes trapped in a local minimum, leading to poor performance and efficiency. In this paper, we propose the MBPNN to accelerate the training of the BPNN and improve categorization accuracy. LSA can overcome the problems of word-based representations by using statistically derived conceptual indices instead of individual words. It constructs a conceptual vector space in which each term or document is represented as a vector; this not only greatly reduces the dimensionality but also discovers important associative relationships between terms. We test our categorization model on the 20-newsgroup and Reuters-21578 corpora; experimental results show that the MBPNN is much faster than the traditional BPNN and also improves its performance, and that the application of LSA in our system leads to dramatic dimensionality reduction while achieving good classification results.

36 citations