Topic

Latent Dirichlet allocation

About: Latent Dirichlet allocation is a research topic. Over its lifetime, 5,351 publications have been published on this topic, receiving 212,555 citations. The topic is also known as: LDA.


Papers
Journal ArticleDOI
TL;DR: This paper employs Latent Dirichlet Allocation (LDA) to discover treatment patterns as a probabilistic combination of clinical activities and surmises the essential features of treatment patterns.
Abstract: A clinical process is typically a mixture of various latent treatment patterns, implicitly indicating the likelihood of what clinical activities are essential/critical to the process. Discovering these hidden patterns is one of the most important components of clinical process analysis. What makes the pattern discovery problem complex is that these patterns are hidden in clinical processes, are composed of variable clinical activities, and often vary significantly between patient individuals. This paper employs Latent Dirichlet Allocation (LDA) to discover treatment patterns as a probabilistic combination of clinical activities. The probability distribution derived from LDA surmises the essential features of treatment patterns, and clinical processes can be accurately described by combining different classes of distributions. The presented approach has been implemented and evaluated via real-world data sets.

51 citations

Journal ArticleDOI
TL;DR: A general Bayesian framework is provided in which a semiparametric hierarchical modeling with an approximate truncation Dirichlet process prior distribution is specified for the latent variables in SEMs with covariates.
Abstract: Latent variables play the most important role in structural equation modeling. In almost all existing structural equation models (SEMs), it is assumed that the distribution of the latent variables is normal. As this assumption is likely to be violated in many biomedical researches, a semiparametric Bayesian approach for relaxing it is developed in this paper. In the context of SEMs with covariates, we provide a general Bayesian framework in which a semiparametric hierarchical modeling with an approximate truncation Dirichlet process prior distribution is specified for the latent variables. The stick-breaking prior and the blocked Gibbs sampler are used for efficient simulation in the posterior analysis. The developed methodology is applied to a study of kidney disease in diabetes patients. A simulation study is conducted to reveal the empirical performance of the proposed approach. Supplementary electronic material for this paper is available in Wiley InterScience at http://www.mrw.interscience.wiley.com/suppmat/1097-0258/suppmat/.

50 citations
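
The abstract above mentions a stick-breaking prior with an approximate truncation of the Dirichlet process. As a rough illustration only, the following sketch draws mixture weights from a truncated stick-breaking construction. The function name, the concentration parameter `alpha`, and the truncation level are assumptions chosen for this example; they are not taken from the paper, and the paper's own sampler (a blocked Gibbs sampler) is not reproduced here.

```python
import random


def truncated_stick_breaking(alpha, truncation, rng=None):
    """Draw mixture weights from a truncated stick-breaking construction.

    alpha      -- concentration parameter (hypothetical value, for illustration)
    truncation -- number of components kept in the truncated approximation
    """
    rng = rng or random.Random()
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # The last component absorbs whatever is left, so the weights sum to 1.
    weights.append(remaining)
    return weights


weights = truncated_stick_breaking(alpha=2.0, truncation=10, rng=random.Random(0))
```

Smaller values of `alpha` tend to concentrate the mass on the first few components, while larger values spread it across more of them.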

Proceedings ArticleDOI
13 Jun 2011
TL;DR: This paper proposes two event detection approaches using generative models that combine the popular LDA model with temporal segmentation and spatial clustering, and adapt an image segmentation model, SLDA, for spatial-temporal event detection on text.
Abstract: A large number of news articles are generated every day on the Web. Automatically identifying events from a large document collection is a challenging problem. In this paper, we propose two event detection approaches using generative models. We combine the popular LDA model with temporal segmentation and spatial clustering. In addition, we adapt an image segmentation model, SLDA, for spatial-temporal event detection on text. The results of our experiments show that both approaches outperform the traditional content-based clustering approaches on our datasets.

50 citations

Journal ArticleDOI
TL;DR: This article explores the use of social media as an observation source for timely decision making and proposes a novel computational method to fit an existing clustering model called k-means latent Dirichlet allocation (KLDA), which is illustrated using a cybersecurity problem.
Abstract: Many decision problems are set in changing environments. For example, determining the optimal investment in cyber maintenance depends on whether there is evidence of an unusual vulnerability, such as “Heartbleed,” that is causing an especially high rate of incidents. This gives rise to the need for timely information to update decision models so that optimal policies can be generated for each decision period. Social media provide a streaming source of relevant information, but that information needs to be efficiently transformed into numbers to enable the needed updates. This article explores the use of social media as an observation source for timely decision making. To efficiently generate the observations for Bayesian updates, we propose a novel computational method to fit an existing clustering model. The proposed method is called k-means latent Dirichlet allocation (KLDA). We illustrate the method using a cybersecurity problem. Many organizations ignore “medium” vulnerabilities identified during peri...

50 citations

Journal ArticleDOI
TL;DR: The analytical definitions of the Chernoff, Bhattacharyya and Jeffreys-Matusita probabilistic distances between two Dirichlet distributions and two Beta distributions are given and their inappropriateness is shown in the analytical case.

50 citations
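
For orientation, the Bhattacharyya distance between two Dirichlet distributions has a well-known closed form: the Bhattacharyya coefficient is B((α+β)/2) / sqrt(B(α)·B(β)), where B is the multivariate Beta function, and the distance is its negative logarithm. The sketch below computes it using only the standard library. The function names are hypothetical, and the paper's own definitions and notation may differ from this textbook form.

```python
import math


def log_multivariate_beta(alpha):
    """Log of the multivariate Beta function, the Dirichlet normalising constant."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))


def bhattacharyya_distance_dirichlet(alpha, beta):
    """Bhattacharyya distance between Dir(alpha) and Dir(beta).

    Both arguments are sequences of positive concentration parameters
    of the same length.
    """
    if len(alpha) != len(beta):
        raise ValueError("alpha and beta must have the same length")
    midpoint = [(a + b) / 2.0 for a, b in zip(alpha, beta)]
    # log of the Bhattacharyya coefficient, computed in log space for stability
    log_coefficient = log_multivariate_beta(midpoint) - 0.5 * (
        log_multivariate_beta(alpha) + log_multivariate_beta(beta)
    )
    return -log_coefficient


same = bhattacharyya_distance_dirichlet([2.0, 3.0, 4.0], [2.0, 3.0, 4.0])
different = bhattacharyya_distance_dirichlet([1.0, 1.0, 1.0], [5.0, 2.0, 1.0])
```

The distance is zero for identical parameter vectors and grows as the two distributions overlap less.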


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
86% related
Support vector machine
73.6K papers, 1.7M citations
86% related
Deep learning
79.8K papers, 2.1M citations
85% related
Feature extraction
111.8K papers, 2.1M citations
84% related
Convolutional neural network
74.7K papers, 2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    323
2022    842
2021    418
2020    429
2019    473
2018    446