Topic
Latent Dirichlet allocation
About: Latent Dirichlet allocation is a research topic. Over the lifetime, 5351 publications have been published within this topic receiving 212555 citations. The topic is also known as: LDA.
Papers published on a yearly basis
Papers
More filters
••
24 Aug 2008TL;DR: A novel collapsed Gibbs sampling method for the widely used latent Dirichlet allocation (LDA) model, which can be as much as 8 times faster than the standard collapsed Gibbs sampler for LDA and results in significant speedups on real world text corpora.
Abstract: In this paper we introduce a novel collapsed Gibbs sampling method for the widely used latent Dirichlet allocation (LDA) model. Our new method results in significant speedups on real world text corpora. Conventional Gibbs sampling schemes for LDA require O(K) operations per sample where K is the number of topics in the model. Our proposed method draws equivalent samples but requires on average significantly less then K operations per sample. On real-word corpora FastLDA can be as much as 8 times faster than the standard collapsed Gibbs sampler for LDA. No approximations are necessary, and we show that our fast sampling scheme produces exactly the same results as the standard (but slower) sampling scheme. Experiments on four real world data sets demonstrate speedups for a wide range of collection sizes. For the PubMed collection of over 8 million documents with a required computation time of 6 CPU months for LDA, our speedup of 5.7 can save 5 CPU months of computation.
591 citations
••
TL;DR: This paper identifies the key dimensions of customer service voiced by hotel visitors use a data mining approach, latent dirichlet analysis (LDA), which uncovers 19 controllable dimensions that are key for hotels to manage their interactions with visitors.
570 citations
••
TL;DR: In this article, the authors propose a unified framework for this purpose using unsupervised latent Dirichlet allocation, which enables marketers to track dimensions' importance over time and allows for dynamic mapping of competitive brand positions on those dimensions over time.
Abstract: Online chatter, or user-generated content, constitutes an excellent emerging source for marketers to mine meaning at a high temporal frequency. This article posits that this meaning consists of extracting the key latent dimensions of consumer satisfaction with quality and ascertaining the valence, labels, validity, importance, dynamics, and heterogeneity of those dimensions. The authors propose a unified framework for this purpose using unsupervised latent Dirichlet allocation. The sample of user-generated content consists of rich data on product reviews across 15 firms in five markets over four years. The results suggest that a few dimensions with good face validity and external validity are enough to capture quality. Dynamic analysis enables marketers to track dimensions' importance over time and allows for dynamic mapping of competitive brand positions on those dimensions over time. For vertically differentiated markets (e.g., mobile phones, computers), objective dimensions dominate and are similar acr...
562 citations
••
04 Dec 2006TL;DR: This paper proposes the collapsed variational Bayesian inference algorithm for LDA, and shows that it is computationally efficient, easy to implement and significantly more accurate than standard variationalBayesian inference for L DA.
Abstract: Latent Dirichlet allocation (LDA) is a Bayesian network that has recently gained much popularity in applications ranging from document modeling to computer vision. Due to the large scale nature of these applications, current inference procedures like variational Bayes and Gibbs sampling have been found lacking. In this paper we propose the collapsed variational Bayesian inference algorithm for LDA, and show that it is computationally efficient, easy to implement and significantly more accurate than standard variational Bayesian inference for LDA.
561 citations
•
TL;DR: In this article, the authors investigated the research development, current trends and intellectual structure of topic modeling based on Latent Dirichlet Allocation (LDA), and summarized challenges and introduced famous tools and datasets in topic modelling based on LDA.
Abstract: Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data, text documents. Researchers have published many articles in the field of topic modeling and applied in various fields such as software engineering, political science, medical and linguistic science, etc. There are various methods for topic modeling, which Latent Dirichlet allocation (LDA) is one of the most popular methods in this field. Researchers have proposed various models based on the LDA in topic modeling. According to previous work, this paper can be very useful and valuable for introducing LDA approaches in topic modeling. In this paper, we investigated scholarly articles highly (between 2003 to 2016) related to Topic Modeling based on LDA to discover the research development, current trends and intellectual structure of topic modeling. Also, we summarize challenges and introduce famous tools and datasets in topic modeling based on LDA.
546 citations