Latent Dirichlet allocation
Citations
170 citations
170 citations
Cites background from "Latent Dirichlet allocation"
...The basic idea of the topic model [Blei et al., 2003; Blei, 2012] is that data are represented as random mixtures over an underlying set of topics, where each topic is characterized by a distribution over words....
[...]
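The generative idea quoted above (a document as a random mixture over topics, each topic a distribution over words) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the procedure from either paper: the function name `sample_document`, the toy `TOPICS`, and the `alpha` values are all hypothetical.

```python
import random

def sample_document(topics, alpha, length, rng):
    """Draw one document as a random mixture over topics (illustrative only)."""
    # Dirichlet draw via normalized Gamma variates: the per-document topic mixture.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    theta = [g / total for g in gammas]
    words = []
    for _ in range(length):
        # Pick a topic from the mixture, then a word from that topic's distribution.
        k = rng.choices(range(len(topics)), weights=theta)[0]
        vocab, probs = zip(*topics[k].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words

# Hypothetical toy topics: each one is a distribution over words.
TOPICS = [
    {"gene": 0.5, "dna": 0.4, "ball": 0.1},
    {"ball": 0.6, "team": 0.3, "gene": 0.1},
]
theta, doc = sample_document(TOPICS, alpha=[0.5, 0.5], length=8, rng=random.Random(0))
```

Here `theta` holds the document's topic proportions and `doc` the sampled words. Fitting such a model to real data (inference) is a separate problem that this sketch does not address.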
170 citations
169 citations
Cites methods from "Latent Dirichlet allocation"
...For LDA, we ran two experiments, with 30 and 6 topics respectively....
[...]
...We find that although K-means does better than LDA, overall, both methods produce clusters of very poor quality (with respect to our topical hash-tags)....
[...]
...With these definitions, precision (P), recall (R) and pairwise F1 scores are defined as: The results for LDA and K-means algorithms are shown in Table 2....
[...]
...Although the tweets do not tend to form natural topic-based clusters, as evidenced by the performance of methods such as K-means and LDA, the use of supervised models and training data significantly improves performance for the task of topical clustering....
[...]
...Ideally, if tweets tended to cluster together along topic lines, one would expect to see each LDA topic correspond to an actual hash-tag based topic....
[...]
169 citations
References
17,608 citations
16,079 citations
"Latent Dirichlet allocation" cites this paper for background
...Finally, Griffiths and Steyvers (2002) have presented a Markov chain Monte Carlo algorithm for LDA....
[...]
...Structures similar to that shown in Figure 1 are often studied in Bayesian statistical modeling, where they are referred to as hierarchical models (Gelman et al., 1995), or more precisely as conditionally independent hierarchical models (Kass and Steffey, 1989)....
[...]
12,443 citations
"Latent Dirichlet allocation" cites this paper for methods
...To address these shortcomings, IR researchers have proposed several other dimensionality reduction techniques, most notably latent semantic indexing (LSI) (Deerwester et al., 1990)....
[...]
12,059 citations
"Latent Dirichlet allocation" cites this paper for background or methods
...In the popular tf-idf scheme (Salton and McGill, 1983), a basic vocabulary of “words” or “terms” is chosen, and, for each document in the corpus, a count is formed of the number of occurrences of each word....
[...]
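The counting step described in the excerpt above (choose a vocabulary, then count each word's occurrences per document) might look like the sketch below. It covers only the raw counts the excerpt mentions, not any later weighting or normalization. The function name `term_counts`, the whitespace tokenization, and the toy `VOCAB` and `CORPUS` are assumptions made for illustration.

```python
from collections import Counter

def term_counts(corpus, vocabulary):
    """Return one row of counts per document, one column per vocabulary word."""
    rows = []
    for text in corpus:
        # Naive lowercase whitespace tokenization; an assumption for this sketch.
        counts = Counter(text.lower().split())
        # Words outside the chosen vocabulary are ignored.
        rows.append([counts[w] for w in vocabulary])
    return rows

# Hypothetical vocabulary and two-document corpus.
VOCAB = ["topic", "model", "word"]
CORPUS = ["Topic model of word and word", "model model"]
matrix = term_counts(CORPUS, VOCAB)
```

Each row of `matrix` is a fixed-length count vector for one document, with columns in the order of `VOCAB`.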
...We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model....
[...]
7,086 citations