Topic
Latent Dirichlet allocation
About: Latent Dirichlet allocation is a research topic. Over its lifetime, 5,351 publications have been published within this topic, receiving 212,555 citations. The topic is also known as: LDA.
Papers
••
TL;DR: The proposed automated classification model and LDA-based network analysis method enable machine-assisted interpretation of text-based accident narratives and can provide managers with much-needed information and knowledge to improve safety on-site.
80 citations
••
TL;DR: A scalable Bayesian topic model is proposed to measure and understand changes in consumer opinion about health (and other topics) and calibrate the model on 761,962 online reviews of restaurants posted over eight years.
Abstract: In 2008, New York City mandated that all chain restaurants post calorie information on their menus. For managers of chain and standalone restaurants, as well as for policy makers, a pertinent goal might be to monitor the impact of this regulation on consumer conversations. We propose a scalable Bayesian topic model to measure and understand changes in consumer opinion about health (and other topics). We calibrate the model on 761,962 online reviews of restaurants posted over eight years. Our model allows managers to specify prior topics of interest such as “health” for a calorie posting regulation. It also allows the distribution of topic proportions within a review to be affected by its length, valence, and the experience level of its author. Using a difference-in-differences estimation approach, we isolate the potentially causal effect of the regulation on consumer opinion. Following the regulation, there was a statistically small but significant increase in the proportion of discussion of the health to...
79 citations
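The difference-in-differences idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' model: it assumes a hypothetical setup in which each review already has a "health" topic share, and it compares the before/after change for a group subject to the regulation against the same change for a comparison group. The function name `diff_in_diff` and all numbers are made up for illustration.

```python
from statistics import mean


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Return the difference-in-differences estimate.

    Each argument is a list of outcome values (here, hypothetical
    per-review shares of a "health" topic).
    """
    # Change over time in the group exposed to the regulation.
    treated_change = mean(treated_after) - mean(treated_before)
    # Change over time in the comparison group.
    control_change = mean(control_after) - mean(control_before)
    # The estimate is the change in the treated group beyond the
    # change the comparison group experienced over the same period.
    return treated_change - control_change


# Hypothetical topic shares, NOT data from the paper.
treated_before = [0.10, 0.12, 0.11]
treated_after = [0.15, 0.16, 0.14]
control_before = [0.10, 0.11, 0.09]
control_after = [0.11, 0.12, 0.10]

effect = diff_in_diff(treated_before, treated_after, control_before, control_after)
print(round(effect, 3))
```

The estimate is only interpretable as an effect of the regulation if the two groups would have followed parallel trends without it, which this sketch does not check.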
••
TL;DR: This article proposes a quantitative approach for describing entertainment products in a way that improves the predictive performance of consumer choice models for these products.
Abstract: The authors propose a quantitative approach for describing entertainment products, in a way that allows for improving the predictive performance of consumer choice models for these products. Their ...
79 citations
••
TL;DR: Fuzzy latent semantic analysis (FLSA) is described, a novel topic modeling approach using a fuzzy perspective that can handle the redundancy issue in health and medical corpora and provides a new method to estimate the number of topics.
Abstract: The majority of medical documents and electronic health records are in text format that poses a challenge for data processing and finding relevant documents. Looking for ways to automatically retrieve the enormous amount of health and medical knowledge has always been an intriguing topic. Powerful methods have been developed in recent years to make the text processing automatic. One of the popular approaches to retrieve information based on discovering the themes in health and medical corpora is topic modeling; however, this approach still needs new perspectives. In this research, we describe fuzzy latent semantic analysis (FLSA), a novel approach in topic modeling using fuzzy perspective. FLSA can handle health and medical corpora redundancy issue and provides a new method to estimate the number of topics. The quantitative evaluations show that FLSA produces superior performance and features to latent Dirichlet allocation, the most popular topic model.
79 citations
•
TL;DR: The results show that the use of learning materials as training data for the grading model outperforms the k-NN-based grading methods and the division of the learning materials in the training data is crucial.
Abstract: Automatic Essay Assessor (AEA) is a system that utilizes information retrieval techniques such as Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis (PLSA), and Latent Dirichlet Allocation (LDA) for automatic essay grading. The system uses learning materials and relatively few teacher-graded essays for calibrating the scoring mechanism before grading. We performed a series of experiments using LSA, PLSA and LDA for document comparisons in AEA. In addition to comparing the methods on a theoretical level, we compared the applicability of LSA, PLSA, and LDA to essay grading with empirical data. The results show that the use of learning materials as training data for the grading model outperforms the k-NN-based grading methods. In addition to this, we found that using LSA yielded slightly more accurate grading than PLSA and LDA. We also found that the division of the learning materials in the training data is crucial. It is better to divide learning materials into sentences than paragraphs.
79 citations