Topic

Latent Dirichlet allocation

About: Latent Dirichlet allocation is a research topic. Over its lifetime, 5,351 publications have been published within this topic, receiving 212,555 citations. The topic is also known as LDA.

Journal Article

Weakly supervised topic sentiment joint model with word embeddings

Xianghua Fu¹, Xudong Sun¹, Haiying Wu¹, Laizhong Cui¹, Joshua Zhexue Huang¹

Shenzhen University¹

01 May 2018-Knowledge Based Systems

TL;DR: Proposes a novel topic sentiment joint model, the weakly supervised topic sentiment joint model with word embeddings (WS-TSWE), which incorporates word embeddings and the HowNet lexicon simultaneously to improve topic identification and sentiment recognition.

Abstract: Topic sentiment joint models aim to handle the mixture of topics and sentiment in online reviews simultaneously. Most existing topic sentiment modeling algorithms are based on the state-of-the-art latent Dirichlet allocation (LDA) and probabilistic latent semantic analysis (PLSA), which infer sentiment and topic distributions from the co-occurrence of words. These methods have been successfully used for topic and sentiment analysis. However, when the training corpus is small or the documents are short, the textual features become sparse, so the resulting sentiment and topic distributions may be unsatisfactory. In this paper, we propose a novel topic sentiment joint model called the weakly supervised topic sentiment joint model with word embeddings (WS-TSWE), which incorporates word embeddings and the HowNet lexicon simultaneously to improve topic identification and sentiment recognition. The main contributions of WS-TSWE are twofold. (1) Existing models generate words only from the sentiment-topic-to-word Dirichlet multinomial component, whereas the WS-TSWE model replaces it with a mixture of two components: a Dirichlet multinomial component and a word embeddings component. Because the word embeddings are trained on very large corpora and can extend the semantic information of words, they offer a partial solution to the problem of textual sparsity. (2) Most previous models incorporate sentiment knowledge in the β priors, which are usually set from a dictionary and rely entirely on prior domain knowledge to identify positive and negative words. In contrast, the WS-TSWE model calculates the sentiment orientation of each word with the HowNet lexicon and automatically infers sentiment-based β priors for sentiment analysis and opinion mining. Furthermore, we implement WS-TSWE with Gibbs sampling. Experimental results on Chinese and English data sets show that WS-TSWE achieves significant performance in detecting sentiment and topics simultaneously.

29 citations
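
The abstract above states that WS-TSWE is implemented with Gibbs sampling. As a rough illustration of that family of algorithms, here is a minimal sketch of a collapsed Gibbs sampler for plain LDA. It is not the WS-TSWE model: there is no sentiment layer, no word embedding component, and no HowNet-derived prior. The function name `lda_gibbs`, the symmetric hyperparameters `alpha` and `beta`, and the integer word-id input format are assumptions made for this example.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, sweeps=50, seed=0):
    """Collapsed Gibbs sampling for plain LDA (illustrative, hypothetical helper).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (doc_topic, topic_word) count matrices after the final sweep.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalised full conditional p(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]

                # Record the new assignment and restore the counts.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word
```

Normalising each row of `doc_topic` (after adding `alpha`) gives a point estimate of a document's topic proportions; normalising each row of `topic_word` (after adding `beta`) gives an estimate of a topic's word distribution.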

Posted Content

End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture

Jianshu Chen¹, Ji He², Yelong Shen¹, Lin Xiao¹, Xiaodong He¹, Jianfeng Gao¹, Xinying Song¹, Li Deng¹

Microsoft¹, University of Washington²

14 Aug 2015-arXiv: Learning

TL;DR: In this article, a fully discriminative learning approach for supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i.e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document, was developed.

Abstract: We develop a fully discriminative learning approach for the supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i.e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document. Different from traditional variational learning or Gibbs sampling approaches, the proposed learning method applies (i) the mirror descent algorithm for maximum a posteriori inference and (ii) back propagation over a deep architecture together with stochastic gradient/mirror descent for model parameter estimation, leading to scalable and end-to-end discriminative learning of the model. As a byproduct, we also apply this technique to develop a new learning method for the traditional unsupervised LDA model (i.e., BP-LDA). Experimental results on three real-world regression and classification tasks show that the proposed methods significantly outperform previous supervised topic models and neural networks, and are on par with deep neural networks.

29 citations
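
The abstract above applies mirror descent to maximum a posteriori inference. As a hedged illustration of that general idea, and not the authors' BP-sLDA implementation, the sketch below runs entropic mirror descent (exponentiated gradient) to estimate one document's topic proportions on the probability simplex, holding the topic-word distributions fixed. The function name `map_topic_proportions`, the symmetric Dirichlet parameter `alpha` (assumed greater than 1 so the prior keeps proportions away from zero), the fixed step size, and the requirement that every entry of `phi` is strictly positive are all assumptions of this example.

```python
import math


def map_topic_proportions(counts, phi, alpha=1.1, step=0.5, iters=200):
    """Entropic mirror descent for MAP topic proportions (illustrative sketch).

    counts: dict mapping word id -> count in the document.
    phi: list of K topic-word distributions with strictly positive entries.
    Maximises sum_w n_w * log(sum_k theta_k * phi[k][w])
              + (alpha - 1) * sum_k log(theta_k) over the simplex.
    """
    k = len(phi)
    total = sum(counts.values())
    eta = step / total  # scale the step by document length
    theta = [1.0 / k] * k

    for _ in range(iters):
        # Gradient of the log prior term.
        grad = [(alpha - 1.0) / theta[j] for j in range(k)]
        # Gradient of the log likelihood term.
        for w, n in counts.items():
            mix = sum(theta[j] * phi[j][w] for j in range(k))
            for j in range(k):
                grad[j] += n * phi[j][w] / mix

        # Multiplicative update, then renormalise onto the simplex.
        # Subtracting the max gradient only improves numerical stability.
        top = max(grad)
        unnorm = [theta[j] * math.exp(eta * (grad[j] - top)) for j in range(k)]
        z = sum(unnorm)
        theta = [u / z for u in unnorm]

    return theta
```

The multiplicative form keeps every iterate strictly inside the simplex without an explicit projection, which is the usual reason for choosing the entropic mirror map over plain projected gradient ascent in this setting.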

Journal Article

SAFE: A Sentiment Analysis Framework for E-Learning

Francesco Colace¹, Massimo De Santo, Luca Greco

University of Salerno¹

08 Dec 2014-International Journal of Emerging Technologies in Learning (ijet)

TL;DR: This paper investigates the adoption of a probabilistic approach based on Latent Dirichlet Allocation (LDA) as a sentiment grabber for e-learning, and shows how the graph extracted by this approach contains a set of weighted word pairs, which are discriminative for sentiment classification.

Abstract: The spread of social networks allows people to share opinions on different aspects of life, and millions of messages appear on the web daily. This textual information can be a rich source of data for opinion mining and sentiment analysis: the computational study of opinions, sentiments and emotions expressed in a text. Its main aim is the identification of statements of agreement or disagreement that deal with positive or negative feelings in comments or reviews. In this paper, we investigate the adoption, in the field of e-learning, of a probabilistic approach based on Latent Dirichlet Allocation (LDA) as a sentiment grabber. With this approach, a graph, the Mixed Graph of Terms, can be automatically extracted for a set of documents belonging to the same knowledge domain. The paper shows how this graph contains a set of weighted word pairs, which are discriminative for sentiment classification. In this way, the system can detect the feelings of students about some topics, and teachers can better tune their teaching approach. The proposed method has been tested on datasets from e-learning platforms. A preliminary experimental campaign shows that the proposed approach is effective and satisfactory.

29 citations

Journal Article

Forecasting Violent Extremist Cyber Recruitment

Jacob R. Scanlon¹, Matthew S. Gerber²

Teradata¹, University of Virginia²

05 Aug 2015-IEEE Transactions on Information Forensics and Security

TL;DR: Quantitative evaluations showed that employing LDA-based topics as predictors within time series models reduces forecast error compared with naive, autoregressive integrated moving average, and exponential smoothing baselines.

Abstract: The Internet’s increasing use as a means of communication has led to the formation of cyber communities, which have become appealing to violent extremist (VE) groups. This paper presents research on forecasting the daily level of cyber-recruitment activity of VE groups. We used a previously developed support vector machine model to identify recruitment posts within a Western jihadist discussion forum. We analyzed the textual content of this data set with latent Dirichlet allocation (LDA), and we fed these analyses into a variety of time series models to forecast cyber-recruitment activity within the forum. Quantitative evaluations showed that employing LDA-based topics as predictors within time series models reduces forecast error compared with naive (random-walk), autoregressive integrated moving average, and exponential smoothing baselines. To the best of our knowledge, this is the first result reported on this forecasting task. This research could ultimately help assist with efficient allocation of intelligence analysts in response to predicted levels of cyber-recruitment activity.

29 citations

Journal Article

Concept-LDA: Incorporating Babelfy into LDA for aspect extraction

Ekin Ekinci¹, Sevinç İlhan Omurca¹

Kocaeli University¹

01 Jun 2020-Journal of Information Science

TL;DR: This article proposes a novel method based on topic modelling to determine the latent aspects of online review documents called Concept-LDA, which achieves better topic representations than an LDA model alone, as measured by topic coherence and F-measure.

Abstract: Latent Dirichlet allocation (LDA) is one of the probabilistic topic models; it discovers the latent topic structure in a document collection. The basic assumption under LDA is that documents are vi...

29 citations

Metrics

Papers: 6,513

Citations: 245,225

No. of papers in the topic in previous years
Year	Papers
2023	323
2022	842
2021	418
2020	429
2019	473
2018	446
