Proceedings ArticleDOI

LSTM-CNN Hybrid Model for Text Classification

TL;DR: A hybrid model of LSTM and CNN is proposed that can effectively improve the accuracy of text classification; its performance is compared with that of other models in the experiments.
Abstract: Text classification is a classic task in the field of natural language processing. However, existing methods for text classification still need improvement because of the complex abstraction of text semantic information and the strong relevance of context. In this paper, we combine the advantages of two traditional neural network models, Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). LSTM can effectively preserve the characteristics of historical information in long text sequences, while the CNN structure extracts local features of the text. We propose a hybrid model of LSTM and CNN, constructing the CNN on top of the LSTM so that the text feature vectors output by the LSTM are further processed by the CNN. The performance of the hybrid model is compared with that of other models in the experiments. The experimental results show that the hybrid model can effectively improve the accuracy of text classification.
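
The abstract describes stacking a CNN on top of an LSTM so that the LSTM's sequence output is further filtered by convolution. A minimal PyTorch sketch of that layout is given below; the vocabulary size, embedding and hidden dimensions, kernel width, and class count are illustrative assumptions, not the configuration reported in the paper.

```python
import torch
import torch.nn as nn

class LSTMCNNClassifier(nn.Module):
    """Minimal LSTM -> CNN text classifier sketch (illustrative hyperparameters)."""
    def __init__(self, vocab_size=20000, embed_dim=128, lstm_hidden=128,
                 num_filters=100, kernel_size=3, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # The LSTM preserves long-range context across the token sequence.
        self.lstm = nn.LSTM(embed_dim, lstm_hidden, batch_first=True)
        # A 1-D convolution over the LSTM output extracts local n-gram-like features.
        self.conv = nn.Conv1d(lstm_hidden, num_filters, kernel_size, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, token_ids):                 # (batch, seq_len)
        x = self.embedding(token_ids)             # (batch, seq_len, embed_dim)
        x, _ = self.lstm(x)                       # (batch, seq_len, lstm_hidden)
        x = x.transpose(1, 2)                     # (batch, lstm_hidden, seq_len)
        x = torch.relu(self.conv(x))              # (batch, num_filters, seq_len)
        x = self.pool(x).squeeze(-1)              # (batch, num_filters)
        return self.fc(x)                         # class logits

# Example: a batch of 4 padded sequences of 50 token ids.
logits = LSTMCNNClassifier()(torch.randint(1, 20000, (4, 50)))
print(logits.shape)  # torch.Size([4, 2])
```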
Citations
Journal ArticleDOI
TL;DR: An LSTM-Attention-CNN combined model is proposed to detect anomalies in human interactions that are indicators of suicidal intentions; it achieved an accuracy of 90.3% and an F1-score of 92.6%.

4 citations

Posted ContentDOI
22 Jul 2021-medRxiv
TL;DR: A novel three-step deep learning approach is proposed to identify phenotype-related single nucleotide polymorphisms (SNPs) and develop accurate classification models for genome-wide association studies.
Abstract: Deep learning is a promising tool that uses nonlinear transformations to extract features from high-dimensional data. Although deep learning has been used in several genetic studies, it is challenging in genome-wide association studies (GWAS) with high-dimensional genomic data. Here we propose a novel three-step approach for identification of genetic variants using deep learning to identify phenotype-related single nucleotide polymorphisms (SNPs) and develop accurate classification models. In the first step, we divided the whole genome into non-overlapping fragments of an optimal size and then ran a Convolutional Neural Network (CNN) on each fragment to select phenotype-associated fragments. In the second step, using an overlapping window approach, we ran CNN on the selected fragments to calculate phenotype influence scores (PIS) and identify phenotype-associated SNPs based on PIS. In the third step, we ran CNN on all identified SNPs to develop a classification model. We tested our approach using genome-wide genotyping data for Alzheimer's disease (AD) (N=981; cognitively normal older adults (CN)=650 and AD=331). Our approach identified the well-known APOE region as the most significant genetic locus for AD. Our classification model achieved an area under the curve (AUC) of 0.82, which outperformed traditional machine learning approaches, Random Forest and XGBoost. By using a novel deep learning-based GWAS approach, we were able to identify AD-associated SNPs and develop a better classification model for AD.

Author summary: Although deep learning has been successfully applied to many scientific fields, deep learning has not been used in genome-wide association studies (GWAS) in practice due to the high dimensionality of genomic data. To overcome this challenge, we propose a novel three-step approach for identification of genetic variants using deep learning to identify disease-associated single nucleotide polymorphisms (SNPs) and develop accurate classification models. To accomplish this, we divided the whole genome into non-overlapping fragments of an optimal size and ran a deep learning algorithm on each fragment to select disease-associated fragments. We calculated phenotype influence scores (PIS) of each SNP within selected fragments to identify disease-associated significant SNPs and developed a disease classification model by using overlapping window and deep learning algorithms. In the application of our method to Alzheimer's disease (AD), we identified well-known significant genetic loci for AD and achieved higher classification accuracies than traditional machine learning methods. This study is, to our knowledge, the first to develop a deep learning-based identification of genetic variants using fragmentation and window approaches as well as deep learning algorithms to identify disease-related SNPs and develop accurate classification models.
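
The first step of the three-step approach (splitting the genome into non-overlapping fragments and scoring each fragment with a small CNN) can be sketched roughly as follows. The fragment size, network shape, and toy genotype data are assumptions for illustration; the training loop and the later PIS and windowing steps are omitted.

```python
import numpy as np
import torch
import torch.nn as nn

def fragment_genotypes(genotypes, fragment_size):
    """Split a (samples x SNPs) genotype matrix into non-overlapping SNP fragments."""
    n_snps = genotypes.shape[1]
    return [genotypes[:, start:start + fragment_size]
            for start in range(0, n_snps, fragment_size)]

class FragmentCNN(nn.Module):
    """Tiny 1-D CNN scoring one genotype fragment for phenotype association."""
    def __init__(self, n_filters=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, n_filters, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1), nn.Flatten(),
            nn.Linear(n_filters, 2))                  # e.g. CN vs. AD logits

    def forward(self, x):          # x: (samples, fragment_len) genotype codes 0/1/2
        return self.net(x.unsqueeze(1).float())

# Step 1 sketch: each fragment gets its own CNN (training omitted); the
# best-performing fragments would be kept for the windowed PIS step.
genotypes = np.random.randint(0, 3, size=(100, 1000))   # toy data: 100 samples x 1000 SNPs
fragments = fragment_genotypes(genotypes, fragment_size=200)
logits = FragmentCNN()(torch.from_numpy(fragments[0]))
print(len(fragments), logits.shape)   # 5 fragments; logits for 100 samples x 2 classes
```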

3 citations

Journal ArticleDOI
TL;DR: A stacking ensemble model that combines the predictive power of CNN and hybrid deep learning models is proposed to predict Arabic sentiment accurately; experiments show that the proposed deep stacking model achieves the best performance compared with previous models.
Abstract: Sentiment analysis (SA) is a machine learning application that derives people's opinions from text using natural language processing (NLP) techniques. Implementing Arabic SA is challenging for many reasons, including equivocation, numerous dialects, lack of resources, morphological diversity, lack of contextual information, and hiding of sentiment terms in implicit text. Deep learning models such as convolutional neural networks (CNN) and long short-term memory (LSTM) have brought significant improvements to the Arabic SA domain. Hybrid models based on CNN combined with LSTM or a gated recurrent unit (GRU) have further improved the performance of single DL models. In addition, ensembles of deep learning models, especially stacking ensembles, are expected to increase the robustness and accuracy of the previous DL models. In this paper, we propose a stacking ensemble model that combines the predictive power of CNN and hybrid deep learning models to predict Arabic sentiment accurately. The stacking ensemble algorithm has two main phases. Three DL models were optimized in the first phase: a deep CNN, a hybrid CNN-LSTM, and a hybrid CNN-GRU. In the second phase, the outputs of these three separately pre-trained models were integrated with a support vector machine (SVM) meta-learner. To extract features for the DL models, the continuous bag of words (CBOW) and skip-gram models with 300-dimensional word embeddings were used. The Arabic health services datasets (Main-AHS and Sub-AHS) and the Arabic sentiment tweets dataset (ASTD) were used to train and test the models. A number of well-known deep learning models, including the deep CNN, hybrid CNN-LSTM, hybrid CNN-GRU, and conventional ML algorithms, were used to compare the performance of the proposed ensemble model. We found that the proposed deep stacking model achieved the best performance compared with the previous models. Based on the CBOW word embedding, the proposed model achieved the highest accuracy of 92.12%, 95.81%, and 81.4% for the Main-AHS, Sub-AHS, and ASTD datasets, respectively.
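
The second phase of the stacking ensemble (feeding the base models' outputs to an SVM meta-learner) can be illustrated with the scikit-learn sketch below. The base-model probabilities are random placeholders standing in for the real deep CNN, CNN-LSTM, and CNN-GRU predictions, so the numbers are not meaningful.

```python
import numpy as np
from sklearn.svm import SVC

# Phase 2 sketch: each pre-trained base model (deep CNN, CNN-LSTM, CNN-GRU) emits a
# positive-class probability per held-out review; the stacked outputs feed an SVM.
def stack_base_outputs(*probas):
    return np.column_stack(probas)

rng = np.random.default_rng(0)
n = 500
y = rng.integers(0, 2, n)                      # toy sentiment labels
# Placeholders standing in for real base-model probabilities on a validation split.
p_cnn, p_cnn_lstm, p_cnn_gru = (np.clip(y + rng.normal(0, 0.4, n), 0, 1) for _ in range(3))

meta_X = stack_base_outputs(p_cnn, p_cnn_lstm, p_cnn_gru)
meta_learner = SVC(kernel="rbf").fit(meta_X, y)
print("meta-learner train accuracy:", meta_learner.score(meta_X, y))
```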

3 citations

Journal ArticleDOI
TL;DR: Bug reports are classified by topic-based severity and features are extracted from the severity of each topic; severity is then predicted with a CNN-LSTM model, achieving F-measures of 90.62% and 93.22% on Eclipse and Mozilla, respectively.
Abstract: Increasing software usage has gradually increased the occurrence of bugs. When writing a bug report, the severity of the bug can be freely selected, so the subjective judgment of the author is involved. In subjective judgment, a severity error may occur depending on the difference in background knowledge between the user and the developer. To resolve this problem, in this paper, severity was predicted using a feature selection algorithm over the severity of each topic. We utilize datasets from the Eclipse and Mozilla open source projects. First, we classify bug reports by topic-based severity and extract features from the severity of each topic. Severity was then predicted by learning these characteristics with the CNN-LSTM algorithm, and the F-measure was 90.62% for Eclipse and 93.22% for Mozilla. To evaluate the effectiveness of the proposed model, we compared it with baselines, including the DeepSeverity and EWD-Multinomial studies, on the Eclipse and Mozilla open source projects and showed that the proposed model is more efficient.
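
The paper groups bug reports by topic before selecting severity features, but this summary does not spell out the topic model used; the sketch below uses LDA and per-topic top terms purely as an assumed illustration of that grouping step, with the CNN-LSTM severity classifier itself omitted.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

reports = ["ui freezes when saving a file", "crash on startup with null pointer",
           "button label is misaligned and overlaps the toolbar",
           "data loss after unexpected shutdown during crash"]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(reports)

# Assign each bug report to its dominant topic ...
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
dominant_topic = lda.transform(X).argmax(axis=1)

# ... and inspect the top terms per topic, which serve as topic-specific feature
# candidates before a severity classifier is trained on each group.
terms = vec.get_feature_names_out()
for t, weights in enumerate(lda.components_):
    print(f"topic {t}:", [terms[i] for i in weights.argsort()[::-1][:4]])
print("dominant topic per report:", dominant_topic)
```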

3 citations

Proceedings ArticleDOI
28 Feb 2020
TL;DR: This paper uses pre-trained word vectors to handle the resource-demanding problem and studies the effectiveness of a joint Convolutional Neural Network and Long Short Term Memory (CNN-LSTM) for Myanmar text classification.
Abstract: Text classification is one of the most critical areas of research in the field of natural language processing (NLP). Recently, most of the NLP tasks achieve remarkable performance by using deep learning models. Generally, deep learning models require a huge amount of data to be utilized. This paper uses pre-trained word vectors to handle the resource-demanding problem and studies the effectiveness of a joint Convolutional Neural Network and Long Short Term Memory (CNN-LSTM) for Myanmar text classification. The comparative analysis is performed on the baseline Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN) and their combined model CNN-RNN.
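
A rough sketch of such a CNN-LSTM classifier over frozen pre-trained word vectors is shown below; the random embedding matrix stands in for real pre-trained Myanmar word vectors, and all layer sizes and the class count are assumptions.

```python
import torch
import torch.nn as nn

class CNNLSTMClassifier(nn.Module):
    """CNN followed by LSTM over frozen pre-trained word vectors (illustrative sizes)."""
    def __init__(self, embedding_matrix, num_classes=5, n_filters=64, lstm_hidden=64):
        super().__init__()
        # freeze=True keeps the pre-trained vectors fixed, easing the data requirement.
        self.embedding = nn.Embedding.from_pretrained(embedding_matrix, freeze=True)
        self.conv = nn.Conv1d(embedding_matrix.size(1), n_filters, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(n_filters, lstm_hidden, batch_first=True)
        self.fc = nn.Linear(lstm_hidden, num_classes)

    def forward(self, ids):                           # (batch, seq_len)
        x = self.embedding(ids).transpose(1, 2)       # (batch, embed_dim, seq_len)
        x = torch.relu(self.conv(x)).transpose(1, 2)  # (batch, seq_len, n_filters)
        _, (h, _) = self.lstm(x)                      # h: (1, batch, lstm_hidden)
        return self.fc(h[-1])                         # class logits

# Stand-in for pre-trained word vectors: 1000 words x 300 dimensions.
pretrained = torch.randn(1000, 300)
logits = CNNLSTMClassifier(pretrained)(torch.randint(0, 1000, (2, 40)))
print(logits.shape)  # torch.Size([2, 5])
```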

3 citations


Cites methods from "LSTM-CNN Hybrid Model for Text Classification"

  • ...[9] constructed the CNN model on the top of LSTM....


References
Journal ArticleDOI
TL;DR: The authors propose to learn a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences, which can be expressed in terms of these representations.
Abstract: A goal of statistical language modeling is to learn the joint probability function of sequences of words in a language. This is intrinsically difficult because of the curse of dimensionality: a word sequence on which the model will be tested is likely to be different from all the word sequences seen during training. Traditional but very successful approaches based on n-grams obtain generalization by concatenating very short overlapping sequences seen in the training set. We propose to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences. The model learns simultaneously (1) a distributed representation for each word along with (2) the probability function for word sequences, expressed in terms of these representations. Generalization is obtained because a sequence of words that has never been seen before gets high probability if it is made of words that are similar (in the sense of having a nearby representation) to words forming an already seen sentence. Training such large models (with millions of parameters) within a reasonable time is itself a significant challenge. We report on experiments using neural networks for the probability function, showing on two text corpora that the proposed approach significantly improves on state-of-the-art n-gram models, and that the proposed approach allows to take advantage of longer contexts.
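
A stripped-down sketch of this kind of feed-forward neural language model (shared word embeddings feeding a hidden layer that predicts the next word) might look as follows; the dimensions are illustrative and the direct word-to-output connections of the original model are omitted.

```python
import torch
import torch.nn as nn

class NeuralLM(nn.Module):
    """Feed-forward neural LM sketch: shared embeddings + one hidden layer."""
    def __init__(self, vocab_size=10000, embed_dim=60, context_size=4, hidden=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)    # the distributed representations
        self.hidden = nn.Linear(context_size * embed_dim, hidden)
        self.out = nn.Linear(hidden, vocab_size)            # probability function over words

    def forward(self, context_ids):                  # (batch, context_size)
        x = self.embed(context_ids).flatten(1)       # concatenate the context embeddings
        return self.out(torch.tanh(self.hidden(x)))  # next-word logits (softmax via loss)

# Next-word logits for a batch of two 4-word contexts.
lm = NeuralLM()
print(lm(torch.randint(0, 10000, (2, 4))).shape)  # torch.Size([2, 10000])
```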

6,832 citations

Proceedings Article
01 Jan 2010
TL;DR: Results indicate that it is possible to obtain around 50% reduction of perplexity by using mixture of several RNN LMs, compared to a state of the art backoff language model.
Abstract: A new recurrent neural network based language model (RNN LM) with applications to speech recognition is presented. Results indicate that it is possible to obtain around 50% reduction of perplexity by using mixture of several RNN LMs, compared to a state of the art backoff language model. Speech recognition experiments show around 18% reduction of word error rate on the Wall Street Journal task when comparing models trained on the same amount of data, and around 5% on the much harder NIST RT05 task, even when the backoff model is trained on much more data than the RNN LM. We provide ample empirical evidence to suggest that connectionist language models are superior to standard n-gram techniques, except their high computational (training) complexity. Index Terms: language modeling, recurrent neural networks, speech recognition
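
A minimal sketch of an Elman-style RNN language model, together with a simple averaged ("mixture") perplexity computation over several such models, is given below; the sizes and data are toy assumptions, not the reference's setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RNNLM(nn.Module):
    """Minimal Elman-style recurrent language model (illustrative sizes)."""
    def __init__(self, vocab_size=5000, embed_dim=100, hidden=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, ids):                       # (batch, seq_len)
        h, _ = self.rnn(self.embed(ids))
        return self.out(h)                        # (batch, seq_len, vocab) logits

def mixture_perplexity(models, ids, targets):
    """Average the next-word distributions of several RNN LMs and report perplexity."""
    probs = torch.stack([F.softmax(m(ids), dim=-1) for m in models]).mean(0)
    nll = -torch.log(probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1))
    return nll.mean().exp().item()

ids = torch.randint(0, 5000, (2, 20))             # toy token ids
print(mixture_perplexity([RNNLM(), RNNLM(), RNNLM()], ids[:, :-1], ids[:, 1:]))
```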

5,751 citations


"LSTM-CNN Hybrid Model for Text Clas..." refers methods in this paper

  • ...Neural network models, such as Convolutional Neural Network(CNN)[8] and Recurrent Neural Network(RNN)[9] are used for text classification tasks, and the performance of neural network models are better than traditional machine learning methods....


Journal Article
TL;DR: In this article, the authors used the maximum entropy model for text categorization and compared it with Bayes, KNN, and SVM, showing that its performance is higher than Bayes and comparable with KNN and SVM.
Abstract: The Maximum Entropy Model is a probability estimation technique widely used for a variety of natural language tasks. It offers a clean and accommodating framework for combining diverse pieces of contextual information to estimate the probability of certain linguistic phenomena. For many NLP tasks this approach performs at a near state-of-the-art level, or outperforms other competing probabilistic methods when trained and tested under similar conditions. In this paper, we use the maximum entropy model for text categorization. We compare and analyze its categorization performance using different approaches for text feature generation, different numbers of features, and smoothing techniques. Moreover, in experiments we compare it to Bayes, KNN and SVM, and show that its performance is higher than Bayes and comparable with KNN and SVM. We think it is a promising technique for text categorization.
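
As a classifier, a maximum entropy model is equivalent to multinomial logistic regression over text features, so a minimal scikit-learn sketch might look as follows; the toy documents and plain bag-of-words features are assumptions and do not reproduce the paper's feature generation or smoothing.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy documents and categories only; real experiments would use a labelled corpus.
docs = ["stock markets rallied on earnings news", "the team won the championship game",
        "central bank raised interest rates", "star striker scored twice in the final"]
labels = ["finance", "sports", "finance", "sports"]

maxent = make_pipeline(CountVectorizer(),
                       LogisticRegression(max_iter=1000))  # multinomial logistic regression
maxent.fit(docs, labels)
print(maxent.predict(["rates and earnings moved the markets"]))  # likely ['finance'] here
```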

35 citations