Proceedings ArticleDOI

LSTM-CNN Hybrid Model for Text Classification

TL;DR: A hybrid model of LSTM and CNN is proposed that effectively improves the accuracy of text classification; its performance is compared with that of other models in experiments.
Abstract: Text classification is a classic task in the field of natural language processing. However, existing text classification methods still need to be improved because of the complex abstraction of textual semantic information and the strong relevance of context. In this paper, we combine the advantages of two traditional neural network models, Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN): the LSTM can effectively preserve the characteristics of historical information in long text sequences, while the CNN structure extracts local features of the text. We propose a hybrid model of LSTM and CNN that constructs a CNN on top of an LSTM, so that the text feature vectors output by the LSTM are further processed by the CNN structure. The performance of the hybrid model is compared with that of other models in the experiments. The experimental results show that the hybrid model can effectively improve the accuracy of text classification.
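To make the stacking order concrete, here is a minimal Keras sketch of an LSTM-then-CNN text classifier of the kind the abstract describes; the vocabulary size, sequence length, layer widths, and class count are illustrative assumptions, not the paper's reported settings.

```python
# Sketch of the LSTM -> CNN stacking described above (Keras).
# All hyperparameters here are assumed for illustration.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE, EMBED_DIM, SEQ_LEN, NUM_CLASSES = 20000, 128, 100, 2  # assumed

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN,)),
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    # return_sequences=True exposes the full hidden-state sequence,
    # so the CNN can extract local features from the LSTM output.
    layers.LSTM(128, return_sequences=True),
    layers.Conv1D(filters=64, kernel_size=3, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```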
Citations
Journal ArticleDOI
TL;DR: An improved emotion analysis model based on Bi-LSTM is proposed to classify the four-dimensional emotions of Pleasure, Anger, Sorrow, and Joy, and tags such as comment time and user name are added to the danmaku information.
Abstract: With the rapid development of social media, danmaku video provides a platform for users to communicate online. To some extent, danmaku video provides emotional timing information and an innovative way to analyze video data. In the age of big data, studying the characteristics of danmaku and its emotional tendencies can not only help us understand the psychological characteristics of users but also feed effective user information back to video platforms, which can help the platforms optimize related short-video recommendations and provide a more accurate way to select audiences during video production. However, danmaku differs from traditional comments: current emotion classification methods only support two-dimensional classification and are not suitable for danmaku emotion analysis. Aiming at problems such as the colloquialism, diversity, spelling errors, and structurally non-linear informal language of the Internet, the diversity of social topics, and the context dependency of danmaku emotion analysis, this paper proposes an improved emotion analysis model based on Bi-LSTM to classify the four-dimensional emotions of Pleasure, Anger, Sorrow, and Joy. Furthermore, we add tags such as comment time and user name to the danmaku information. Experimental results show that, under the same conditions, the improved model achieves higher Accuracy, Recall, Precision, and F1-Score than CNN and SVM, with a classification effect close to the state of the art (SOTA). Experimental results also show that the improved model can be effectively applied to the analysis of irregular danmaku emotion.
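As a rough illustration of the architecture this abstract describes, the following is a minimal Bi-LSTM sketch with a side input for the added tags (comment time, user name); fusing the metadata by concatenation and all layer sizes are assumptions, not the authors' implementation.

```python
# Hedged sketch of a Bi-LSTM four-class emotion classifier with extra
# metadata tags; sizes and the fusion strategy are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

SEQ_LEN, VOCAB_SIZE, META_DIM = 50, 30000, 8  # assumed

text_in = layers.Input(shape=(SEQ_LEN,), name="danmaku_tokens")
meta_in = layers.Input(shape=(META_DIM,), name="time_and_user_tags")

x = layers.Embedding(VOCAB_SIZE, 128)(text_in)
x = layers.Bidirectional(layers.LSTM(64))(x)   # forward + backward context
x = layers.concatenate([x, meta_in])           # fuse the added tags
out = layers.Dense(4, activation="softmax",    # Pleasure/Anger/Sorrow/Joy
                   name="emotion")(x)

model = models.Model([text_in, meta_in], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```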

11 citations


Cites methods from "LSTM-CNN Hybrid Model for Text Clas..."

  • ...Through statistical analysis, the four performance indexes of Accuracy, Recall, Precision, and F1-Score (as shown in FIGURE 4) in the model used in the experiment are higher than those in the CNN (Convolutional Neural Network) model [21]....


Journal ArticleDOI
TL;DR: In this paper, a novel neural network-based hybrid model (GCL) is introduced: a dataset-augmentation fusion of long short-term memory (LSTM) and convolutional neural network (CNN) with a generative adversarial network (GAN).
Abstract: The demand for agricultural products has increased exponentially as the global population has grown. The rapid development of computer vision-based artificial intelligence and deep learning-related technologies has impacted a wide range of industries, including disease detection and classification. This paper introduces a novel neural network-based hybrid model (GCL). GCL is a dataset-augmentation fusion of long short-term memory (LSTM) and convolutional neural network (CNN) with a generative adversarial network (GAN). The GAN is used for augmentation of the dataset, the CNN extracts the features, and the LSTM classifies the various paddy diseases. The GCL model is investigated to improve the classification model's accuracy and reliability. The dataset was compiled using secondary resources such as Mendeley, Kaggle, UCI, and GitHub, with images of bacterial blight, leaf smut, and rice blast. The experimental setup for proving the efficacy of the GCL model demonstrates that GCL is suitable for disease classification and achieves 97% testing accuracy. GCL can further be used for the classification of more paddy diseases.
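A rough sketch of the CNN-feature/LSTM-classifier portion of the GCL pipeline might look as follows; the image size, layer sizes, and the reshape that turns feature-map rows into a sequence are assumptions, and the GAN augmentation step is only indicated in a comment.

```python
# Hedged sketch of the GCL pipeline (GAN -> CNN -> LSTM); shapes and
# layer sizes are assumed, not the paper's configuration.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 3  # bacterial blight, leaf smut, rice blast

inp = layers.Input(shape=(128, 128, 3))           # assumed image size
x = layers.Conv2D(32, 3, activation="relu")(inp)  # CNN extracts features
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
# Treat each row of the feature map as one step of a sequence so an
# LSTM can perform the final classification, as GCL describes.
x = layers.Reshape((x.shape[1], x.shape[2] * x.shape[3]))(x)
x = layers.LSTM(64)(x)
out = layers.Dense(NUM_CLASSES, activation="softmax")(x)

gcl = models.Model(inp, out)
gcl.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
# In the paper a GAN augments the training set first; here that step
# would simply add generated images to the training data before fit().
```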

11 citations

Journal ArticleDOI
TL;DR: Tamp-X, as proposed in this paper, attacks explainable natural language classifiers by tampering with their activations, and demonstrates that the explanations generated for the tampered classifiers are not reliable and significantly disagree with those generated for the untampered classifiers.

10 citations

Journal ArticleDOI
TL;DR: A CLSTM-based topic memory network for marketing intention detection is proposed, a new combination that ensembles both long short-term memory (LSTM) and convolutional neural network (CNN).

10 citations

Proceedings ArticleDOI
20 Jul 2019
TL;DR: This work uses sequential patterns of users' behavior, ordered by time and drawn from tourists' opinions and reviews, as input data, and applies Convolutional Long Short-Term Deep Learning (CLSTDL), a deep learning technique that combines a Convolutional Neural Network (CNN) with Long Short-Term Memory (LSTM), to predict the expected location.
Abstract: Statistics from the World Travel & Tourism Council (2018) indicate that tourism GDP is increasing every year. Moreover, the travel industry is not only considered the most dynamic sector but also the most important generator of income and jobs in a country. Thus, prototypes for tourism plans are needed for strategic planning. Currently, the social web is a great tool for providing useful insights into tourist behavior, especially through the text data that comes from travelers' opinions. In this work, we use sequential patterns of users' behavior, ordered by time and drawn from tourists' opinions and reviews, as our input data. Then, we use Convolutional Long Short-Term Deep Learning (CLSTDL), a deep learning technique that combines a Convolutional Neural Network (CNN) with Long Short-Term Memory (LSTM), to predict the expected location. During the process, the output of the CNN is fed into the LSTM to learn the sequential behavior patterns of travelers. The model output is then used to predict the next location that a particular traveler is likely to visit. The experimental results show that CLSTDL outperforms other models when evaluated on accuracy and loss metrics.
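The CNN-into-LSTM ordering the abstract describes (the reverse of the main paper's LSTM-into-CNN stacking) might look like the following Keras sketch; the token vocabulary, sequence length, and number of candidate locations are invented for illustration.

```python
# Hedged sketch of the CLSTDL idea: CNN output fed into an LSTM to
# predict the next location. All sizes are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_LOCATIONS, SEQ_LEN, VOCAB_SIZE = 500, 20, 10000  # assumed

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN,)),       # time-ordered behavior tokens
    layers.Embedding(VOCAB_SIZE, 128),
    layers.Conv1D(64, 3, padding="same",  # local patterns in the sequence
                  activation="relu"),
    layers.LSTM(64),                      # learns the sequential behavior
    layers.Dense(NUM_LOCATIONS,           # scores over candidate locations
                 activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```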

6 citations


Cites methods from "LSTM-CNN Hybrid Model for Text Clas..."

  • ...Other research described in Section II [1,3,4,5,6,7,8,9,10,11] used hybrid methods for image processing and text classification....


  • ...al. that extracted features with a CNN and learned sequences with an LSTM [8]....


References
Journal ArticleDOI
TL;DR: The authors propose to learn a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences, which can be expressed in terms of these representations.
Abstract: A goal of statistical language modeling is to learn the joint probability function of sequences of words in a language. This is intrinsically difficult because of the curse of dimensionality: a word sequence on which the model will be tested is likely to be different from all the word sequences seen during training. Traditional but very successful approaches based on n-grams obtain generalization by concatenating very short overlapping sequences seen in the training set. We propose to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences. The model learns simultaneously (1) a distributed representation for each word along with (2) the probability function for word sequences, expressed in terms of these representations. Generalization is obtained because a sequence of words that has never been seen before gets high probability if it is made of words that are similar (in the sense of having a nearby representation) to words forming an already seen sentence. Training such large models (with millions of parameters) within a reasonable time is itself a significant challenge. We report on experiments using neural networks for the probability function, showing on two text corpora that the proposed approach significantly improves on state-of-the-art n-gram models, and that it allows the model to take advantage of longer contexts.
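The core idea, learning word embeddings jointly with a next-word probability function, can be sketched as a small feed-forward model; the sizes below are illustrative assumptions, not those of the original paper.

```python
# Hedged sketch of a Bengio-style neural probabilistic language model:
# the previous n-1 word embeddings are mapped to a distribution over
# the next word. All sizes are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE, EMBED_DIM, CONTEXT = 10000, 100, 4  # assumed

model = models.Sequential([
    layers.Input(shape=(CONTEXT,)),            # previous n-1 word ids
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),   # shared word representations
    layers.Flatten(),                          # concatenate context embeddings
    layers.Dense(256, activation="tanh"),
    layers.Dense(VOCAB_SIZE,                   # P(next word | context)
                 activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```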

6,832 citations

Proceedings Article
01 Jan 2010
TL;DR: Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
Abstract: A new recurrent neural network based language model (RNN LM) with applications to speech recognition is presented. Results indicate that it is possible to obtain around 50% reduction of perplexity by using mixture of several RNN LMs, compared to a state of the art backoff language model. Speech recognition experiments show around 18% reduction of word error rate on the Wall Street Journal task when comparing models trained on the same amount of data, and around 5% on the much harder NIST RT05 task, even when the backoff model is trained on much more data than the RNN LM. We provide ample empirical evidence to suggest that connectionist language models are superior to standard n-gram techniques, except their high computational (training) complexity. Index Terms: language modeling, recurrent neural networks, speech recognition
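Two pieces of this abstract are easy to make concrete: perplexity, the evaluation metric, and the linear interpolation used to mix several language models' predictions. The sketch below uses toy numbers, not the paper's results.

```python
# Hedged sketch of perplexity and of mixing several LMs' next-word
# distributions by linear interpolation; values are toy examples.
import numpy as np

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    return float(np.exp(-np.mean(np.log(token_probs))))

def mixture(prob_dists, weights):
    """Linearly interpolate several models' next-word distributions."""
    return np.average(np.asarray(prob_dists), axis=0,
                      weights=np.asarray(weights, dtype=float))

# Toy per-token probabilities a model assigned to a test sentence.
print(perplexity([0.1, 0.05, 0.2, 0.08]))   # lower is better

# Toy next-word distributions from two models over a 3-word vocabulary.
print(mixture([[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]], [0.6, 0.4]))
```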

5,751 citations


"LSTM-CNN Hybrid Model for Text Clas..." refers methods in this paper

  • ...Neural network models, such as Convolutional Neural Network (CNN) [8] and Recurrent Neural Network (RNN) [9], are used for text classification tasks, and the performance of neural network models is better than that of traditional machine learning methods....


Journal Article
TL;DR: In this article, the authors used a maximum entropy model for text categorization and compared it to Bayes, KNN, and SVM, showing that its performance is higher than Bayes and comparable with KNN and SVM.
Abstract: The Maximum Entropy Model is a probability estimation technique widely used for a variety of natural language tasks. It offers a clean and accommodating framework for combining diverse pieces of contextual information to estimate the probability of certain linguistic phenomena. For many NLP tasks this approach performs at near state-of-the-art level, or outperforms other competing probability methods when trained and tested under similar conditions. In this paper, we use a maximum entropy model for text categorization. We compare and analyze its categorization performance using different approaches to text feature generation, different numbers of features, and smoothing techniques. Moreover, in experiments we compare it to Bayes, KNN, and SVM, and show that its performance is higher than Bayes and comparable with KNN and SVM. We think it is a promising technique for text categorization.
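Since a maximum entropy classifier is equivalent to multinomial logistic regression over text features, the paper's setup can be sketched in a few lines of scikit-learn; the toy documents and labels below are invented, not the paper's corpus.

```python
# Hedged sketch of maximum entropy text categorization as multinomial
# logistic regression; the tiny corpus here is a made-up example.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = ["cheap flights and hotels", "goal scored in the final minute"]
labels = ["travel", "sports"]  # toy data, not the paper's corpus

maxent = make_pipeline(TfidfVectorizer(),
                       LogisticRegression(max_iter=1000))
maxent.fit(docs, labels)
print(maxent.predict(["late winning goal"]))  # expected: ['sports']
```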

35 citations