Detecting topic evolution in scientific literature: how can citations help?

doi:10.1145/1645953.1646076

Proceedings Article

Qi He, +5 more

- pp 957-966

TLDR

An iterative topic evolution learning framework is proposed by adapting the Latent Dirichlet Allocation model to the citation network, and a novel inheritance topic model is developed. The results clearly show that citations can help to understand topic evolution better.

Abstract:

Understanding how topics in scientific literature evolve is an interesting and important problem. Previous work simply models each paper as a bag of words and also considers the impact of authors. However, the impact of one document on another as captured by citations, one important inherent element in scientific literature, has not been considered. In this paper, we address the problem of understanding topic evolution by leveraging citations, and develop citation-aware approaches. We propose an iterative topic evolution learning framework by adapting the Latent Dirichlet Allocation model to the citation network and develop a novel inheritance topic model. We evaluate the effectiveness and efficiency of our approaches and compare them with state-of-the-art approaches on a large collection of more than 650,000 research papers from the last 16 years and the citation network enabled by CiteSeerX. The results clearly show that citations can help to understand topic evolution better.
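The abstract contrasts a plain bag-of-words view of each paper with a citation-aware one. The following is a minimal sketch of that contrast only, not the paper's inheritance topic model: it assumes a hypothetical helper `citation_aware_counts` that adds a down-weighted share of each cited paper's words to the citing paper's own word counts. The function name, the `weight` parameter, and the toy papers are all illustrative assumptions.

```python
from collections import Counter


def citation_aware_counts(docs, cites, weight=0.5):
    """Hypothetical helper: own word counts plus down-weighted counts
    of the words in each cited paper (an illustration, not the paper's model)."""
    # Plain bag-of-words counts for every paper.
    own = {pid: Counter(text.lower().split()) for pid, text in docs.items()}
    mixed = {}
    for pid, counts in own.items():
        combined = Counter()
        # Each paper keeps its own words at full weight.
        for word, count in counts.items():
            combined[word] += float(count)
        # Words from cited papers contribute a fraction of their counts.
        for cited in cites.get(pid, []):
            for word, count in own.get(cited, {}).items():
                combined[word] += weight * count
        mixed[pid] = dict(combined)
    return mixed


# Toy data (assumed): paper p2 cites paper p1.
docs = {
    "p1": "topic model inference",
    "p2": "citation network topic evolution",
}
cites = {"p2": ["p1"]}
mixed = citation_aware_counts(docs, cites, weight=0.5)
```

In this toy setup, `p2` picks up part of the vocabulary of `p1` through the citation link, while `p1`, which cites nothing, keeps its plain bag-of-words counts. A topic model fitted on such counts would see cited content, which is the general intuition behind citation-aware approaches; the paper's actual generative model is more involved.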

Citations

Journal Article

Quantitative Horizon Scanning for Mitigating Technological Surprise: Detecting the Potential for Collaboration at the Interface

Carey E. Priebe, +3 more

- 01 Jun 2012 -

Statistical Analysis and Data Mining

TL;DR: This work develops an innovative statistical approach thereto—not a final etched‐in‐stone approach, but perhaps the first complete quantitative methodology explicitly addressing QHS for MTS.

Dissertation

Towards structured representation of academic search results

Daniil Mirylenka

TL;DR: A novel method of representing academic search results with concise and informative topic maps, based on sequential prediction to automatically learn to build informative summaries from examples, and an interactive learning method for selecting the categories of Wikipedia relevant to a given domain.

Journal Article

Refining the Measurement of Topic Similarities Through Bibliographic Coupling and LDA

Omer Hanif, +3 more

- 09 Dec 2019 -

IEEE Access

TL;DR: This paper presents an approach for measuring the similarity between topics based on bibliographic coupling. The authors believe that finding such associations between unrelated innovative inventions across various industries may help public and private research units plan research directions and serve as a reference for future research.

Journal Article

Measuring the innovation of method knowledge elements in scientific literature

Zhong-Yi Wang, +4 more

- 25 Mar 2022 -

Scientometrics

Journal Article

MRT: Tracing the Evolution of Scientific Publications

Da Yin, +3 more

- 01 Jan 2021 -

IEEE Transactions on Knowledge and Data ...

TL;DR: This work proposed a practical framework called Master Reading Tree (MRT), which can build annotated evolution roadmaps for publications and identify important previous works or evolution tracks by generating expressive embeddings and clustering them into various groups.

References

Journal Article

Latent Dirichlet allocation

David M. Blei, +2 more

- 01 Mar 2003 -

Journal of Machine Learning Research

TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.

Proceedings Article

Latent Dirichlet Allocation

David M. Blei, +2 more

TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).

Journal Article

Term Weighting Approaches in Automatic Text Retrieval

Gerard Salton, +1 more

- 01 Aug 1988 -

Information Processing and Management

TL;DR: This paper summarizes the insights gained in automatic term weighting, and provides baseline single term indexing models with which other more elaborate content analysis procedures can be compared.

Journal Article

Finding scientific topics

Thomas L. Griffiths, +1 more

- 06 Apr 2004 -

Proceedings of the National Academy of S...

TL;DR: A generative model for documents is described, introduced by Blei, Ng, and Jordan, and a Markov chain Monte Carlo algorithm is presented for inference in this model, which is used to analyze abstracts from PNAS by using Bayesian model selection to establish the number of topics.

Journal Article

Hierarchical Dirichlet Processes

Yee Whye Teh, +3 more

- 01 Dec 2006 -

Journal of the American Statistical Asso...

TL;DR: This work considers problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups, and considers a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process.

Related Papers (5)

Latent Dirichlet allocation

Dynamic topic models

Finding scientific topics

Topics over time: a non-Markov continuous-time model of topical trends

The author-topic model for authors and documents