
Latent Dirichlet allocation

About: Latent Dirichlet allocation is a research topic. Over its lifetime, 5,351 publications have been published on this topic, receiving 212,555 citations. The topic is also known as LDA.


Papers
Patent
28 Sep 2011
TL;DR: Latent Dirichlet Allocation (LDA) is used to determine the relationship between named entities; it is performed on text associated with the named entities rather than on an entire document.
Abstract: For generating a word space, manual thresholding of word scores is used. Rather than requiring the user to select the threshold arbitrarily or review each word, the user is iteratively requested to indicate the relevance of a given word. Words with greater or lesser scores are labeled in the same way depending upon the response. For determining the relationship between named entities, Latent Dirichlet Allocation (LDA) is performed on text associated with the named entities rather than on an entire document. LDA for relationship mining may include context information and/or supervised learning.

39 citations
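
The iterative thresholding described in the patent abstract above can be read as a binary search over words sorted by score. The following is only a sketch of that reading, not the patent's actual procedure: the function name `label_by_threshold`, the `is_relevant` callback standing in for the user, and the toy scores are all hypothetical, and the sketch assumes that relevance is monotone in score.

```python
def label_by_threshold(scored_words, is_relevant):
    """Label words as relevant or not by binary search over their scores.

    scored_words: dict mapping word -> score (hypothetical input format).
    is_relevant:  callback standing in for the user's yes/no answer.
    Assumption: a higher score never makes a word less relevant.
    """
    ranked = sorted(scored_words, key=scored_words.get, reverse=True)
    lo, hi = 0, len(ranked)  # ranked[:lo] are relevant, ranked[hi:] are not
    questions = 0
    while lo < hi:
        mid = (lo + hi) // 2
        questions += 1
        if is_relevant(ranked[mid]):
            lo = mid + 1  # this word and every higher-scored word is relevant
        else:
            hi = mid      # this word and every lower-scored word is not
    labels = {word: (i < lo) for i, word in enumerate(ranked)}
    return labels, questions


# Hypothetical scores and a simulated user; not data from the patent.
scores = {"topic": 0.9, "model": 0.8, "corpus": 0.6, "the": 0.2, "of": 0.1}
simulated_user = lambda word: scores[word] >= 0.5
labels, questions = label_by_threshold(scores, simulated_user)
print(labels, questions)
```

Under the monotonicity assumption, the user answers roughly log2(n) questions instead of reviewing all n words.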

Posted Content
Yu Wang, Jiebo Luo, Richard G. Niemi, Yuncheng Li, Tianran Hu
TL;DR: This paper proposes a framework to infer the topic preferences of Donald Trump's followers on Twitter. It uses latent Dirichlet allocation (LDA) to derive the weighted mixture of topics for each Trump tweet, then uses negative binomial regression to model the "likes", with the weights of each topic serving as explanatory variables.
Abstract: In this paper, we propose a framework to infer the topic preferences of Donald Trump's followers on Twitter. We first use latent Dirichlet allocation (LDA) to derive the weighted mixture of topics for each Trump tweet. Then we use negative binomial regression to model the "likes," with the weights of each topic serving as explanatory variables. Our study shows that attacking Democrats such as President Obama and former Secretary of State Hillary Clinton earns Trump the most "likes." Our framework of inference is generalizable to the study of other politicians.

38 citations
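
The first step in the abstract above, deriving a weighted topic mixture for each document, can be illustrated with a small collapsed Gibbs sampler. This is a minimal sketch under stated assumptions, not the authors' implementation: the function `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy tokenized "tweets" are all hypothetical, and the negative binomial regression step is not shown.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic mixtures."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables: document-topic, topic-word, and per-topic totals.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional p(z = k | all other assignments), unnormalized.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed estimate of each document's topic mixture (each row sums to 1).
    mixtures = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        mixtures.append(
            [(doc_topic[d][k] + alpha) / denom for k in range(num_topics)]
        )
    return mixtures


# Hypothetical toy documents, already tokenized; not data from the paper.
toy_docs = [
    "tax jobs economy jobs".split(),
    "economy tax trade jobs".split(),
    "border wall security border".split(),
    "security wall border police".split(),
]
mixtures = lda_gibbs(toy_docs, num_topics=2)
for row in mixtures:
    print([round(p, 3) for p in row])
```

Each row of `mixtures` is one document's topic weights. In a pipeline like the one the abstract describes, these weights would serve as the explanatory variables of the regression.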

Journal Article
17 Jul 2019
TL;DR: PREREQ is a new supervised learning method for inferring concept prerequisite relations using latent representations of concepts obtained from the Pairwise Latent Dirichlet Allocation model and a neural network based on the Siamese network architecture that can learn unknown concept prerequisites from course prerequisites and labeled concept prerequisite data.
Abstract: The Internet has rich and rapidly increasing sources of high quality educational content. Inferring prerequisite relations between educational concepts is required for modern large-scale online educational technology applications such as personalized recommendations and automatic curriculum creation. We present PREREQ, a new supervised learning method for inferring concept prerequisite relations. PREREQ is designed using latent representations of concepts obtained from the Pairwise Latent Dirichlet Allocation model, and a neural network based on the Siamese network architecture. PREREQ can learn unknown concept prerequisites from course prerequisites and labeled concept prerequisite data. It outperforms state-of-the-art approaches on benchmark datasets and can learn effectively from very little training data. PREREQ can also use unlabeled video playlists, a steadily growing source of training data, to learn concept prerequisites, thus obviating the need for manual annotation of course prerequisites.

38 citations

Journal Article
TL;DR: LDA and PLS produce relevant informative summaries of corpora, and confirm and address more specifically the results of the previous literature concerning relationship quality.
Abstract: The purpose of this paper is to analyze the occurrence of terms to identify the relevant topics and then to investigate the area (based on topics) of hospitality services that is highly associated with relationship quality. This research represents an opportunity to fill the gap in the current literature, and clarify the understanding of guests’ affective states by evaluating all aspects of their relationship with a hotel. This research focuses on natural opinions upon which machine-learning algorithms can be executed: text summarization, sentiment analysis and latent Dirichlet allocation (LDA). Our data set contains 47,172 reviews of 33 hotels located in Las Vegas, and registered with Yelp. A component-based structural equation modeling (partial least squares (PLS)) is applied, with a dual – exploratory and predictive – purpose. To maintain a truly loyal relationship and to achieve competitive success, hospitality managers must take into account both tangible and intangible features when allocating their marketing efforts to satisfaction-, trust- and commitment-based cues. On the other hand, the application of the PLS predict algorithm demonstrates the predictive performance (out-of-sample prediction) of our model that supports its ability to predict new and accurate values for individual cases when further samples are added. LDA and PLS produce relevant informative summaries of corpora, and confirm and address more specifically the results of the previous literature concerning relationship quality. Our results are more reliable and accurate (providing insights not indicated in guests’ ratings into how hotels can improve their services) than prior statistical results based on limited sample data and on numerical satisfaction ratings alone.

38 citations

Proceedings Article
15 May 2009
TL;DR: A framework is proposed to discover latent topics from the web sites of terrorists or extremists by analyzing the contents of dark websites; the LDA-based analysis assigns a probability to a document and captures exchangeability of both words and documents.
Abstract: Analysis of dark websites is important for developing effective strategies to combat terrorism and extremism, as more and more scattered terrorist cells use the ubiquity of the Internet to form communities in virtual space at fairly low cost. Terrorists or extremists anonymously set up various web sites embedded in the public Internet, exchanging ideology, spreading propaganda, and recruiting new members. In this paper, we propose a framework to discover latent topics by analyzing the contents of dark websites. The content and data from dark websites are gathered and extracted by crawlers and exported to documents. The Latent Dirichlet Allocation (LDA) algorithm is used to analyze the extracted documents so as to discover latent topics from the web sites of terrorists or extremists. In contrast to traditional Information Retrieval (IR) schemes, LDA-based analysis assigns a probability to a document and captures exchangeability of both words and documents. Our work helps to gain insights into the structure and communities of terrorists and extremists.

38 citations


Network Information
Related Topics (5)
Cluster analysis: 146.5K papers, 2.9M citations, 86% related
Support vector machine: 73.6K papers, 1.7M citations, 86% related
Deep learning: 79.8K papers, 2.1M citations, 85% related
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Convolutional neural network: 74.7K papers, 2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    323
2022    842
2021    418
2020    429
2019    473
2018    446