
Latent Dirichlet allocation

About: Latent Dirichlet allocation is a research topic. Over its lifetime, 5,351 publications have been published within this topic, receiving 212,555 citations. The topic is also known as: LDA.
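As a concrete illustration (not drawn from any of the papers below), the following is a minimal sketch of fitting an LDA model with scikit-learn; the toy corpus, topic count, and library choice are all assumptions made for demonstration:

```python
# Minimal LDA sketch using scikit-learn (an illustrative choice of
# library; the papers on this page use various implementations).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "topic models uncover latent themes in text",
    "latent dirichlet allocation assigns topics to words",
    "stochastic gradients train deep neural networks",
    "neural networks learn representations from data",
]

# Bag-of-words counts, the standard input representation for LDA.
counts = CountVectorizer().fit_transform(docs)

# Two topics for this toy corpus; fixed seed for repeatability.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # shape: (n_docs, n_topics)

# Each row is a per-document distribution over topics (sums to ~1).
print(doc_topics.shape)
```

The rows of `doc_topics` are the per-document topic mixtures that most of the applications summarized below build on.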


Papers
Journal ArticleDOI
TL;DR: In this paper, the authors explore differences in language between favorable and unfavorable reviews in three service settings (hotels, restaurants and attractions) and illustrate the discrimination between positive and negative reviews based on single word items and the sector-specific relevance of hidden topics.
Abstract: Purpose: Online reviews have been gaining relevance in hospitality and tourism management and represent an important research avenue for academia. This study aims to illustrate the discrimination between positive and negative reviews based on single word items and the sector-specific relevance of hidden topics. Design/methodology/approach: Using two parallel, entirely unrelated analytical methods (penalized support vector machines and latent Dirichlet allocation), the authors explore differences in language between favorable and unfavorable reviews in three service settings (hotels, restaurants and attractions). Findings: The percentage of correctly predicted positive and negative review reports by means of individual word items does not decrease if reports from the three tourism businesses are analyzed together. Originality/value: However, the discriminant words have limited generalizability across the three businesses, and the latent topics relevant for generating customers’ review reports differ significantly between the three sectors of tourism businesses.

31 citations

Journal ArticleDOI
TL;DR: Neuro-SERKET is an extension of SERKET, which can compose elemental PGMs developed in a distributed manner and provide a scheme that allows the composed PGMs to learn throughout the system in an unsupervised way.
Abstract: This paper describes a framework for the development of an integrative cognitive system based on probabilistic generative models (PGMs) called Neuro-SERKET. Neuro-SERKET is an extension of SERKET, which can compose elemental PGMs developed in a distributed manner and provide a scheme that allows the composed PGMs to learn throughout the system in an unsupervised way. In addition to the head-to-tail connection supported by SERKET, Neuro-SERKET supports tail-to-tail and head-to-head connections, as well as neural network-based modules, i.e., deep generative models. As an example of a Neuro-SERKET application, an integrative model was developed by composing a variational autoencoder (VAE), a Gaussian mixture model (GMM), latent Dirichlet allocation (LDA), and automatic speech recognition (ASR). The model is called VAE + GMM + LDA + ASR. The performance of VAE + GMM + LDA + ASR and the validity of Neuro-SERKET were demonstrated through a multimodal categorization task using image data and a speech signal of numerical digits.

31 citations

Proceedings Article
01 Dec 2009
TL;DR: This work addresses the problem of building a predictive model that not only predicts a caption for an image but also labels the individual objects in the image, using a multi-modal hierarchical Dirichlet process model (MoM-HDP), a stochastic process for modeling multimodal data.
Abstract: Many real-world applications call for learning predictive relationships from multi-modal data. In particular, in multi-media and web applications, given a dataset of images and their associated captions, one might want to construct a predictive model that not only predicts a caption for the image but also labels the individual objects in the image. We address this problem using a multi-modal hierarchical Dirichlet process model (MoM-HDP), a stochastic process for modeling multimodal data. MoM-HDP is an analog of multi-modal latent Dirichlet allocation (MoM-LDA) with an infinite number of mixture components. Thus MoM-HDP circumvents the need for an a priori choice of the number of mixture components and the computational expense of model selection. During training, the model has access to an un-segmented image and its caption, but not the labels for each object in the image. The trained model is used to predict the label for each region of interest in a segmented image. The model parameters are estimated efficiently using variational inference. We use two large benchmark datasets to compare the performance of the proposed MoM-HDP model with that of the MoM-LDA model as well as some simple alternatives: Naive Bayes and logistic regression classifiers based on the formulation of the image annotation and image-label correspondence problems as one-against-all classification. Our experimental results show that, unlike MoM-LDA, the performance of MoM-HDP is invariant to the number of mixture components. Furthermore, our experimental evaluation shows that the generalization performance of MoM-HDP is superior to that of MoM-LDA as well as the one-against-all Naive Bayes and logistic regression classifiers.
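The key idea above, that a Dirichlet-process prior makes the effective number of mixture components part of inference rather than a model-selection step, can be illustrated with a single-modality analog (not MoM-HDP itself): scikit-learn's truncated Dirichlet-process Gaussian mixture. The data and truncation level are assumptions for demonstration:

```python
# Single-modality analog of the infinite-mixture idea: a truncated
# Dirichlet-process Gaussian mixture. This is NOT MoM-HDP; it only
# illustrates how the DP prior sidesteps choosing a component count.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Toy data drawn from two well-separated clusters.
data = np.vstack([
    rng.normal(-5.0, 0.5, size=(100, 2)),
    rng.normal(5.0, 0.5, size=(100, 2)),
])

# n_components=10 is a truncation level (an upper bound), not a
# model choice: the DP prior shrinks unused component weights
# toward zero during variational inference.
dpgmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
)
dpgmm.fit(data)

# Count the components that actually carry appreciable weight.
active = int(np.sum(dpgmm.weights_ > 0.01))
print(active)
```

With clearly clustered data such as this, only a couple of the ten available components retain non-negligible weight after fitting, mirroring the paper's claim that performance need not depend on the truncation level.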

31 citations

Proceedings ArticleDOI
15 Sep 2016
TL;DR: This paper investigated style and topic aspects of language in online communities and found that style is a better indicator of community identity than topic, even for communities organized around specific topics; community reception of a contribution correlates positively with its stylistic similarity to the community, but not with its topical similarity.
Abstract: This work investigates style and topic aspects of language in online communities: looking at both utility as an identifier of the community and correlation with community reception of content. Style is characterized using a hybrid word and part-of-speech tag n-gram language model, while topic is represented using Latent Dirichlet Allocation. Experiments with several Reddit forums show that style is a better indicator of community identity than topic, even for communities organized around specific topics. Further, there is a positive correlation between the community reception to a contribution and the style similarity to that community, but not so for topic similarity.

31 citations

Journal ArticleDOI
TL;DR: Experimental results show that the hidden topic based suggestion is much more efficient than the traditional term or URL based approach, and is effective in finding topically related queries for suggestion.
Abstract: Keyword-based Web search is a widely used approach for locating information on the Web. However, Web users usually suffer from the difficulties of organizing and formulating appropriate input queries due to the lack of sufficient domain knowledge, which greatly affects the search performance. An effective tool to meet the information needs of a search engine user is to suggest Web queries that are topically related to their initial inquiry. Accurately computing query-to-query similarity scores is a key to improve the quality of these suggestions. Because of the short lengths of queries, traditional pseudo-relevance or implicit-relevance based approaches expand the expression of the queries for the similarity computation. They explicitly use a search engine as a complementary source and directly extract additional features (such as terms or URLs) from the top-listed or clicked search results. In this paper, we propose a novel approach by utilizing the hidden topic as an expandable feature. This has two steps. In the offline model-learning step, a hidden topic model is trained, and for each candidate query, its posterior distribution over the hidden topic space is determined to re-express the query instead of the lexical expression. In the online query suggestion step, after inferring the topic distribution for an input query in a similar way, we then calculate the similarity between candidate queries and the input query in terms of their corresponding topic distributions; and produce a suggestion list of candidate queries based on the similarity scores. Our experimental results on two real data sets show that the hidden topic based suggestion is much more efficient than the traditional term or URL based approach, and is effective in finding topically related queries for suggestion.
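The online step described above, ranking candidate queries by the similarity of their topic distributions to the input query's distribution, can be sketched as follows. Cosine similarity is used here as the measure, and the topic distributions are hypothetical; the paper's exact similarity function may differ:

```python
# Sketch of the online query-suggestion step: rank candidate
# queries by similarity between topic distributions. The
# distributions below are made-up illustrative posteriors.
import numpy as np

def cosine(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

# Hypothetical posterior topic distributions (each sums to 1).
input_query = [0.7, 0.2, 0.1]
candidates = {
    "cheap hotels paris": [0.65, 0.25, 0.10],
    "python lda tutorial": [0.05, 0.15, 0.80],
    "paris attractions": [0.60, 0.30, 0.10],
}

# Suggestion list: candidates sorted by topical similarity.
ranked = sorted(
    candidates,
    key=lambda q: cosine(input_query, candidates[q]),
    reverse=True,
)
print(ranked)
```

Because similarity is computed in the low-dimensional topic space rather than over raw query terms, two queries can rank as close matches even when they share no words, which is the point of the hidden-topic expansion.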

31 citations


Network Information
Related Topics (5)
- Cluster analysis: 146.5K papers, 2.9M citations (86% related)
- Support vector machine: 73.6K papers, 1.7M citations (86% related)
- Deep learning: 79.8K papers, 2.1M citations (85% related)
- Feature extraction: 111.8K papers, 2.1M citations (84% related)
- Convolutional neural network: 74.7K papers, 2M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    323
2022    842
2021    418
2020    429
2019    473
2018    446