scispace - formally typeset
Topic

Latent Dirichlet allocation

About: Latent Dirichlet allocation is a research topic. Over its lifetime, 5,351 publications have been published within this topic, receiving 212,555 citations. The topic is also known as: LDA.


Papers
Journal Article
TL;DR: The BTM (Biterm Topic Model) is employed to process short micro-blog texts to alleviate the sparsity problem, and the K-means clustering algorithm is integrated into BTM for further topic discovery.
Abstract: The development of micro-blogs, which generate large-scale short texts, provides people with convenient communication. Meanwhile, discovering topics from short texts has become an intractable problem. Traditional topic models, such as probabilistic latent semantic analysis (PLSA) and Latent Dirichlet Allocation (LDA), struggle to model short texts: they suffer from severe data sparsity when processing them. Moreover, the K-means clustering algorithm can make topics discriminative when the dataset is dense and the differences among topic documents are distinct. In this paper, the BTM topic model is employed to process short micro-blog texts to alleviate the sparsity problem. At the same time, we integrate the K-means clustering algorithm into BTM (Biterm Topic Model) for further topic discovery. Experimental results on Sina micro-blog short text collections demonstrate that our method can discover topics effectively.

33 citations
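
The BTM entry above relies on biterms, that is, unordered pairs of words that co-occur in the same short text, pooled over the whole corpus to counter per-document sparsity. The following is a minimal sketch of biterm extraction only, not the paper's implementation; the function names and the toy documents are hypothetical, and topic inference and the K-means step are omitted.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one short text."""
    # Sorting each pair makes ("a", "b") and ("b", "a") the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def corpus_biterm_counts(docs):
    """Pool biterms over the whole corpus instead of per document."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_biterms(tokens))
    return counts


# Toy, hypothetical short texts (already tokenized).
docs = [
    ["apple", "iphone", "launch"],
    ["iphone", "launch", "event"],
]
biterm_counts = corpus_biterm_counts(docs)
```

Pooling the counts over the corpus is what gives a biterm-based model more co-occurrence evidence than any single short document provides.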

Journal Article
TL;DR: This paper presents a visual analytics-based approach that assists an expert user in tracking topics related to his/her company on Twitter; it was developed for visualizing the long-term evolution of topics and detecting weak signals.

33 citations

Journal Article
14 Nov 2016
TL;DR: The results show the potential of the GRT to support both teachers and students, and the Vector Space Model with Euclidean distance-based clustering proved to be particularly well suited for detecting text differences as a basis for group formation.
Abstract: Orchestrating collaborative learning in the classroom involves tasks such as forming learning groups with heterogeneous knowledge and making learners aware of the knowledge differences. However, gathering information on which the formation of appropriate groups and the creation of graphical knowledge representations can be based is very effortful for teachers. Tools supporting cognitive group awareness provide such representations to guide students during their collaboration, but mainly rely on specifically created input. Our work is guided by the questions of how the analysis and visualization of cognitive information can be supported by automatic mechanisms (especially using text mining), and what effects a corresponding tool can achieve in the classroom. We systematically compared different methods to be used in a Grouping and Representing Tool (GRT), and evaluated the tool in an experimental field study. Latent Dirichlet Allocation proved successful in transforming the topics of texts into values as a basis for representing cognitive information graphically. The Vector Space Model with Euclidean distance-based clustering proved to be particularly well suited for detecting text differences as a basis for group formation. The subsequent evaluation of the GRT with 54 high school students further confirmed the GRT’s impact on learning support: students who used the tool added twice as many concepts in an essay after discussing as those in the unsupported group. These results show the potential of the GRT to support both teachers and students.

33 citations
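
The GRT entry above names the Vector Space Model with Euclidean distance as the basis for detecting text differences. A minimal sketch of that distance computation follows, assuming plain term-frequency vectors; the student names, tokens, and the choice to report the most dissimilar pair are illustrative assumptions and not the tool's actual grouping procedure.

```python
import math
from collections import Counter
from itertools import combinations


def tf_vector(tokens, vocab):
    """Represent a text as raw term counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy, hypothetical student texts (already tokenized).
texts = {
    "ann": ["cell", "membrane", "protein"],
    "ben": ["cell", "membrane", "energy"],
    "cat": ["enzyme", "reaction", "energy"],
}
vocab = sorted({word for tokens in texts.values() for word in tokens})
vectors = {name: tf_vector(tokens, vocab) for name, tokens in texts.items()}

# Assumption: the most distant pair is treated as the most heterogeneous.
most_different = max(
    combinations(sorted(vectors), 2),
    key=lambda pair: euclidean(vectors[pair[0]], vectors[pair[1]]),
)
```

Texts that share no vocabulary end up farthest apart, which is the property a heterogeneous grouping strategy would exploit.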

Journal Article
TL;DR: This paper presents ADR-SPLDA, an unsupervised model for human activity discovery and recognition in pervasive environments, which studies the relationship between the activities and the sequential patterns extracted from the sequences, and empirically demonstrates the effectiveness of the model for activity recognition.

33 citations

Patent
Suresh Chari, Ian M. Molloy, Youngja Park
02 Mar 2012
TL;DR: A method for performing role mining given a plurality of users and a plurality of permissions is provided, where at least one generative machine learning technique, e.g., LDA, is used to obtain a probability distribution θ for user-to-role assignments and a probability distribution β for role-to-permission assignments.
Abstract: Applications of machine learning techniques such as Latent Dirichlet Allocation (LDA) and author-topic models (ATM) to the problem of mining user roles to specify access control policies, from entitlements as well as from logs that record the usage of these entitlements, are provided. In one aspect, a method for performing role mining given a plurality of users and a plurality of permissions is provided. The method includes the following steps. At least one generative machine learning technique, e.g., LDA, is used to obtain a probability distribution θ for user-to-role assignments and a probability distribution β for role-to-permission assignments. The probability distribution θ for user-to-role assignments and the probability distribution β for role-to-permission assignments are used to produce a final set of roles, including user-to-role assignments and role-to-permission assignments.

33 citations
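
The patent entry above says that θ (user-to-role) and β (role-to-permission) are used to produce a final set of roles, but the listing does not say how. One possible reading, sketched below purely as an assumption, is to keep every assignment whose probability reaches a cutoff. The probability values, the `discretize` helper, and the 0.25 threshold are all hypothetical and are not fitted by any model.

```python
def discretize(dist, threshold):
    """Keep only the assignments whose probability reaches the threshold."""
    return {
        row: {col for col, p in probs.items() if p >= threshold}
        for row, probs in dist.items()
    }


# Hypothetical theta: P(role | user). Hand-written, not the output of LDA.
theta = {
    "alice": {"admin": 0.70, "dev": 0.30},
    "bob": {"admin": 0.05, "dev": 0.95},
}
# Hypothetical beta: P(permission | role). Hand-written, not the output of LDA.
beta = {
    "admin": {"create_user": 0.60, "read_logs": 0.35, "push_code": 0.05},
    "dev": {"push_code": 0.80, "read_logs": 0.20, "create_user": 0.00},
}

THRESHOLD = 0.25  # assumed cutoff, chosen only for this illustration
user_roles = discretize(theta, THRESHOLD)
role_permissions = discretize(beta, THRESHOLD)

# Effective permissions: union of the permissions of each assigned role.
user_permissions = {
    user: set().union(*(role_permissions[r] for r in roles))
    for user, roles in user_roles.items()
}
```

Under these toy values, a user keeps every role above the cutoff, so one user can hold several roles at once.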


Network Information
Related Topics (5)
Cluster analysis: 146.5K papers, 2.9M citations, 86% related
Support vector machine: 73.6K papers, 1.7M citations, 86% related
Deep learning: 79.8K papers, 2.1M citations, 85% related
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Convolutional neural network: 74.7K papers, 2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    323
2022    842
2021    418
2020    429
2019    473
2018    446