Proceedings ArticleDOI

LSTM-CNN Hybrid Model for Text Classification

TL;DR: A hybrid model of LSTM and CNN is proposed that effectively improves the accuracy of text classification; its performance is compared with that of other models in experiments.
Abstract: Text classification is a classic task in the field of natural language processing. However, existing text classification methods still need improvement because of the complex abstraction of text semantic information and the strong relevance of context. In this paper, we combine the advantages of two traditional neural network models, Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN): LSTM can effectively preserve the characteristics of historical information in long text sequences, while the CNN structure can extract local features of the text. We propose a hybrid model of LSTM and CNN, constructing the CNN on top of the LSTM so that the text feature vectors output by the LSTM are further processed by the CNN structure. The performance of the hybrid model is compared with that of other models in the experiments. The experimental results show that the hybrid model can effectively improve the accuracy of text classification.
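
Since the abstract describes the architecture only at a high level, here is a minimal PyTorch sketch of an LSTM-with-CNN-on-top model in that spirit. All hyperparameters (vocabulary size, dimensions, filter widths, class count) are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class LSTMCNN(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=128,
                 num_filters=100, filter_sizes=(3, 4, 5), num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # The LSTM preserves long-range context across the token sequence.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Convolutions of several widths extract local features from the
        # LSTM's output sequence, as the abstract describes.
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden_dim, num_filters, k) for k in filter_sizes
        )
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, token_ids):               # (batch, seq_len)
        x = self.embedding(token_ids)           # (batch, seq_len, embed_dim)
        x, _ = self.lstm(x)                     # (batch, seq_len, hidden_dim)
        x = x.transpose(1, 2)                   # channels-first for Conv1d
        # Max-pool each feature map over time, then concatenate.
        feats = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(feats, dim=1))  # (batch, num_classes)

logits = LSTMCNN()(torch.randint(0, 10000, (4, 50)))  # 4 texts of 50 tokens
```
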
Citations
Journal ArticleDOI
TL;DR: This study addresses the early detection of suicide ideation through deep learning and machine learning-based classification approaches applied to Reddit social media, employing a combined LSTM-CNN model that is evaluated against other classification models.
Abstract: Suicide ideation expressed in social media has an impact on language usage. Many at-risk individuals use social forum platforms to discuss their problems or get access to information on similar tasks. The key objective of our study is to present ongoing work on the automatic recognition of suicidal posts. We address the early detection of suicide ideation through deep learning and machine learning-based classification approaches applied to Reddit social media. For this purpose, we employ a combined LSTM-CNN model and evaluate it against other classification models. Our experiments show that the combined neural network architecture with word embedding techniques achieves the best relevance classification results. Additionally, our results support the strength and ability of deep learning architectures to build an effective model for suicide risk assessment in various text classification tasks.

109 citations


Cites background or methods from "LSTM-CNN Hybrid Model for Text Clas..."

  • ...Thus, a single neuron in CNN represents a region within an input sample such as a piece of image or text, in our convolution layer we follow the work by [46]....

  • ...In our experiment, we use multiple convolutional filters with various parameter initializations to extract multiple maps from the text [46]....

  • ...input sequences X = (xt) with a d-dimensional word embedding vector, while H represents the number of LSTM hidden layer nodes [46]....

Journal ArticleDOI
TL;DR: In this paper, a hybrid CNN-LSTM deep learning model was proposed to forecast layer-wise melt pool temperature. The results showed that combining CNN and LSTM networks can extract both spatial and temporal information.
Abstract: Melt pool temperature contains abundant information on metallurgical and mechanical aspects of products produced by additive manufacturing. Forecasting the melt pool temperature profile during a process can help in reducing microstructural porosity and residual stresses. Although analytical and numerical models have been reported, their performance is questionable when applied in real time. Hence, we developed data-driven models to address this challenge, continuously forecasting layer-wise melt pool temperature using a hybrid deep learning technique. The melt pool temperature forecast produced by the proposed CNN-LSTM model is found to be better than that of other benchmark models in terms of accuracy and efficiency. The model results have shown that combining CNN and LSTM networks can extract the spatial and temporal information from the melt pool temperature data. Further, the proposed model results are compared with existing statistical and machine learning models. The performance measures of the proposed CNN-LSTM model indicate a strong potential for in-situ monitoring of the additive manufacturing process.
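
As a hedged illustration of the technique named above, the following sketch shows a generic CNN-LSTM forecaster: 1-D convolutions capture local patterns within the signal and an LSTM models its temporal evolution. All shapes and hyperparameters are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CNNLSTMForecaster(nn.Module):
    def __init__(self, n_channels=1, conv_filters=32, hidden_dim=64):
        super().__init__()
        # Convolution over the time axis extracts local (spatial) patterns.
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, conv_filters, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # The LSTM models how those patterns evolve over time.
        self.lstm = nn.LSTM(conv_filters, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)  # next-step temperature

    def forward(self, x):             # x: (batch, seq_len, n_channels)
        x = self.conv(x.transpose(1, 2)).transpose(1, 2)  # local features
        out, _ = self.lstm(x)         # temporal dynamics
        return self.head(out[:, -1])  # forecast from the last time step

# e.g. predict the next reading from 100 past temperature samples
pred = CNNLSTMForecaster()(torch.randn(8, 100, 1))
```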

14 citations

Journal ArticleDOI
TL;DR: In this article, a bidirectional gated temporal convolutional attention (BG-TCA) model is proposed for text classification, which uses the attention mechanism to distinguish the importance of different features while retaining the text features.

13 citations
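
As a hedged illustration of the core idea in this citation, the sketch below shows a generic attention-pooling layer (not the BG-TCA architecture itself): a learned score weights each position's feature vector so that important features contribute more to the final representation. Names and sizes are assumptions.

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)  # learned importance score

    def forward(self, feats):                # (batch, seq_len, feat_dim)
        # Softmax over positions turns scores into attention weights.
        weights = torch.softmax(self.score(feats), dim=1)  # (batch, seq_len, 1)
        # Weighted sum keeps the text features but emphasizes important ones.
        return (weights * feats).sum(dim=1)  # (batch, feat_dim)

pooled = AttentionPooling()(torch.randn(4, 50, 128))  # (4, 128)
```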

Proceedings ArticleDOI
17 Jun 2019
TL;DR: A C-LSTM with word embedding model is proposed to deal with this problem, and experiments show that the model has great advantages in Chinese news text classification.
Abstract: Traditional text classification methods are based on statistics and feature selection and do not perform well in processing large-scale corpora. In recent years, with the rapid development of deep learning and artificial neural networks, many scholars have used them to solve text classification problems and achieved good results. Common text classification neural network models include textCNN, LSTM, and C-LSTM. Using a specific model can obtain more accurate features but ignores context information. This paper proposes a C-LSTM with word embedding model to deal with this problem. Experiments show that the proposed model has great advantages in Chinese news text classification.
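
The abstract names C-LSTM without detailing it. Under the common assumption that C-LSTM runs a convolution first (extracting local n-gram features) and then an LSTM over the resulting feature sequence (keeping context), a minimal sketch might look like this; dimensions are illustrative.

```python
import torch
import torch.nn as nn

class CLSTM(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128,
                 num_filters=100, hidden_dim=128, num_classes=10):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Convolution over word embeddings yields local n-gram features.
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size=3)
        # The LSTM reads the feature sequence, preserving context.
        self.lstm = nn.LSTM(num_filters, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):                     # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed, seq)
        x = self.conv(x).relu().transpose(1, 2)       # n-gram feature sequence
        _, (h, _) = self.lstm(x)                      # final hidden state
        return self.fc(h[-1])                         # class scores

scores = CLSTM()(torch.randint(0, 10000, (4, 60)))
```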

11 citations


Cites methods from "LSTM-CNN Hybrid Model for Text Clas..."

  • ...Later, methods combining CNN and LSTM appeared [11][12]....

References
Journal ArticleDOI
TL;DR: The authors propose to learn a distributed representation for words that allows each training sentence to inform the model about an exponential number of semantically neighboring sentences expressed in terms of these representations.
Abstract: A goal of statistical language modeling is to learn the joint probability function of sequences of words in a language. This is intrinsically difficult because of the curse of dimensionality: a word sequence on which the model will be tested is likely to be different from all the word sequences seen during training. Traditional but very successful approaches based on n-grams obtain generalization by concatenating very short overlapping sequences seen in the training set. We propose to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences. The model learns simultaneously (1) a distributed representation for each word along with (2) the probability function for word sequences, expressed in terms of these representations. Generalization is obtained because a sequence of words that has never been seen before gets high probability if it is made of words that are similar (in the sense of having a nearby representation) to words forming an already seen sentence. Training such large models (with millions of parameters) within a reasonable time is itself a significant challenge. We report on experiments using neural networks for the probability function, showing on two text corpora that the proposed approach significantly improves on state-of-the-art n-gram models, and that it makes it possible to take advantage of longer contexts.
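
To make the idea concrete, here is a minimal sketch, under assumed sizes, of a neural language model in this spirit: word embeddings are learned jointly with a next-word probability function over a fixed context window.

```python
import torch
import torch.nn as nn

class NeuralLM(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=64, context=4, hidden=128):
        super().__init__()
        # One shared distributed representation per word.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # The probability function maps the concatenated context
        # embeddings to a score for every possible next word.
        self.mlp = nn.Sequential(
            nn.Linear(context * embed_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, vocab_size),
        )

    def forward(self, context_ids):              # (batch, context)
        e = self.embedding(context_ids).flatten(1)  # concatenated embeddings
        return self.mlp(e).log_softmax(dim=1)    # log P(next word | context)

log_probs = NeuralLM()(torch.randint(0, 10000, (4, 4)))
```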

6,832 citations

Proceedings Article
01 Jan 2010
TL;DR: Results indicate that it is possible to obtain around 50% reduction of perplexity by using mixture of several RNN LMs, compared to a state of the art backoff language model.
Abstract: A new recurrent neural network based language model (RNN LM) with applications to speech recognition is presented. Results indicate that it is possible to obtain around a 50% reduction of perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model. Speech recognition experiments show around an 18% reduction of word error rate on the Wall Street Journal task when comparing models trained on the same amount of data, and around 5% on the much harder NIST RT05 task, even when the backoff model is trained on much more data than the RNN LM. We provide ample empirical evidence to suggest that connectionist language models are superior to standard n-gram techniques, except for their high computational (training) complexity. Index Terms: language modeling, recurrent neural networks, speech recognition
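
As a hedged illustration of the mixture result reported above, the sketch below interpolates the per-word probabilities of several language models and computes the perplexity of the mixture; the probability values are invented stand-ins, not the paper's data.

```python
import math

def perplexity(per_word_probs):
    """Perplexity from a list of P(w_t | history) values."""
    nll = -sum(math.log(p) for p in per_word_probs)
    return math.exp(nll / len(per_word_probs))

def interpolate(prob_streams, weights):
    """Linearly interpolate several models' per-word probabilities."""
    return [sum(w * p for w, p in zip(weights, ps))
            for ps in zip(*prob_streams)]

# e.g. three hypothetical RNN LMs scored the same 4-word test text:
model_probs = [[0.10, 0.20, 0.05, 0.30],
               [0.12, 0.18, 0.07, 0.25],
               [0.08, 0.22, 0.06, 0.28]]
mixed = interpolate(model_probs, weights=[1/3, 1/3, 1/3])
print(perplexity(mixed))  # perplexity of the mixture on the test words
```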

5,751 citations


"LSTM-CNN Hybrid Model for Text Clas..." refers methods in this paper

  • ...Neural network models, such as Convolutional Neural Network(CNN)[8] and Recurrent Neural Network(RNN)[9] are used for text classification tasks, and the performance of neural network models are better than traditional machine learning methods....

Journal Article
TL;DR: In this article, the authors used the maximum entropy model for text categorization and compared it to Bayes, KNN, and SVM, showing that its performance is higher than Bayes and comparable with KNN and SVM.
Abstract: The Maximum Entropy Model is a probability estimation technique widely used for a variety of natural language tasks. It offers a clean and accommodating framework for combining diverse pieces of contextual information to estimate the probability of certain linguistic phenomena. For many NLP tasks, this approach performs near the state-of-the-art level, or outperforms other competing probability methods when trained and tested under similar conditions. In this paper, we use the maximum entropy model for text categorization. We compare and analyze its categorization performance using different approaches to text feature generation, different numbers of features, and smoothing techniques. Moreover, in experiments we compare it to Bayes, KNN, and SVM, and show that its performance is higher than Bayes and comparable with KNN and SVM. We think it is a promising technique for text categorization.

35 citations
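
A conditional maximum entropy classifier with softmax outputs is mathematically equivalent to multinomial logistic regression, so a hedged sketch of maxent text categorization can use scikit-learn's LogisticRegression as a stand-in; the toy documents and labels below are invented for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = ["stock markets rally", "team wins final match",
        "shares fall on earnings", "player scores twice"]
labels = ["business", "sports", "business", "sports"]

# Bag-of-words features feeding a maxent (softmax) classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(docs, labels)
print(model.predict(["quarterly earnings beat forecasts"]))
```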