Showing papers on "Dynamic topic model" published in 2021


Journal Article
TL;DR: The authors propose a parallel dynamic topic model that adjusts evolving topics and reduces the sampling probabilities of topic-indiscriminate words, achieving parallelism without sacrificing the non-parametric nature of the Hierarchical Dirichlet Process.

1 citation



Journal Article
TL;DR: In this article, the authors proposed a mixed framework with two components to capture customer buying behavior and its changes over time and visualize these results to better help retailers choose and target products strategically for marketing.
Abstract: Deciding when and which products to recommend to whom is always an essential issue for retailers. In this study, we propose a mixed framework with two components to capture customer buying behavior and its changes over time and visualize these results to better help retailers choose and target products strategically for marketing. In this framework, a topic model is first used to extract customers' purchase behavior instead of association rules or K-means, which are mainly used in the marketing field. To automatically choose the optimal number of topics, we implement an approach proposed by Koltcov et al. on point-of-sale (POS) data from a supermarket. Meanwhile, to grasp the change of topics over time, we divided monthly POS data in half and applied the topic model with Rényi entropy separately to each half. The results suggest that splitting data might be a better way to understand customer behavior. Second, we consider how to develop an effective way to visualize the results of the topic model, which is essential because, in a supermarket context, simply knowing which product categories are included under which topics is not enough to support how a supermarket promotes its products. To address this, we design a three-layer visualization approach to better interpret the topic model results and to help retailers design targeted promotion strategies. The design of visualization has been overlooked by studies that apply topic models to supermarket data. Finally, to demonstrate the usefulness of our proposed framework, we conduct a simple scenario-based comparison between our framework and other models, such as Latent Dirichlet Allocation (LDA) and the Dynamic Topic Model (DTM). The results show that for most periods, our proposed framework outperforms LDA and DTM.
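The abstract above relies on Rényi entropy to pick the number of topics. As a rough illustration only, the sketch below computes the textbook Rényi entropy of order alpha for a single discrete distribution. It does not reproduce the selection procedure of Koltcov et al., and the function name `renyi_entropy` and the two example distributions are hypothetical.

```python
import math


def renyi_entropy(probs, alpha):
    """Textbook Rényi entropy of order alpha for a discrete distribution.

    This is the generic definition, not the topic-number selection
    procedure referenced in the abstract.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = sum(probs)
    # Normalize and drop zero-probability entries.
    p = [x / total for x in probs if x > 0]
    if math.isclose(alpha, 1.0):
        # The limit alpha -> 1 is the Shannon entropy.
        return -sum(x * math.log(x) for x in p)
    return math.log(sum(x ** alpha for x in p)) / (1.0 - alpha)


# Hypothetical topic-word distributions: one flat, one peaked.
flat_topic = [0.25, 0.25, 0.25, 0.25]
peaked_topic = [0.85, 0.05, 0.05, 0.05]

print(renyi_entropy(flat_topic, 2.0))
print(renyi_entropy(peaked_topic, 2.0))
```

A flat distribution attains the maximum value log(n) for any order, while a peaked one scores lower, which is why entropy can serve as a signal of how sharply a model separates words into topics.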

1 citation


Journal Article
28 May 2021-PLOS ONE
TL;DR: In this article, the authors investigated the evolution of provincial new energy policies and industries of China using a topic modeling approach and found that underdeveloped provinces tend to use environment-oriented tools to regulate and control CO2 emissions, while developed regions employ a more balanced policy mix for improving new energy vehicles and other industries.
Abstract: This study investigates the evolution of provincial new energy policies and industries of China using a topic modeling approach. To this end, six of the 31 provinces in China are first selected as research samples, and central and provincial new energy policies from 2010 to 2019 are collected to establish a text corpus of 23,674 documents. Then, the policy corpus is fed to two different topic models: Latent Dirichlet Allocation for modeling static policy topics and the Dynamic Topic Model for extracting topics over time. Finally, the obtained topics are mapped onto policy tools for comparison. The dynamic policy topics are further analyzed with panel data from provincial new energy industries. The results show that provincial new energy policies moved onto different tracks after about 2014 owing to regional conditions such as the economy and CO2 emission intensity. Underdeveloped provinces tend to use environment-oriented tools to regulate and control CO2 emissions, while developed regions employ a more balanced policy mix for improving new energy vehicles and other industries. Widespread hysteretic effects are revealed in the correlation analysis of the policy topics and new energy capacity.
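To make the contrast between static and dynamic topics concrete, here is a minimal generative sketch of the idea behind the Dynamic Topic Model as formulated by Blei and Lafferty: a topic's unnormalized word weights drift by a Gaussian random walk from one time slice to the next and are mapped to word probabilities with a softmax. It illustrates the generative assumption only, not inference, and it is not the study's pipeline. The vocabulary, `sigma`, and the function names are hypothetical.

```python
import math
import random


def softmax(values):
    """Map real-valued weights to a probability distribution."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    s = sum(exps)
    return [e / s for e in exps]


def simulate_topic_drift(beta0, n_slices, sigma, seed=0):
    """Return one topic's word distribution for each time slice.

    beta[t] = beta[t-1] + Gaussian noise with standard deviation sigma.
    """
    rng = random.Random(seed)
    beta = list(beta0)
    trajectory = []
    for _ in range(n_slices):
        trajectory.append(softmax(beta))
        beta = [b + rng.gauss(0.0, sigma) for b in beta]
    return trajectory


# Hypothetical four-word vocabulary for a policy topic.
vocab = ["solar", "wind", "subsidy", "emission"]
topic_over_time = simulate_topic_drift([0.0] * len(vocab), n_slices=10, sigma=0.3)

for year, dist in zip(range(2010, 2020), topic_over_time):
    print(year, [round(p, 3) for p in dist])
```

A static LDA topic corresponds to the special case `sigma = 0`, where the same word distribution is used in every slice.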

1 citation


Posted Content
TL;DR: In this article, a jointly dynamic topic model is proposed to recognize the lead-lag relationship between two text corpora and to use this relationship to improve topic modeling; an embedding extension addresses large-scale text corpora.
Abstract: Topic evolution modeling has received significant attention in recent decades. Although various topic evolution models have been proposed, most studies focus on a single document corpus. In practice, however, we can easily access data from multiple sources and also observe relationships between them. It is therefore of great interest to recognize the relationship between multiple text corpora and further utilize this relationship to improve topic modeling. In this work, we focus on a special type of relationship between two text corpora, which we define as the "lead-lag relationship". This relationship characterizes the phenomenon that one text corpus influences the topics to be discussed in the other text corpus in the future. To discover the lead-lag relationship, we propose a jointly dynamic topic model and also develop an embedding extension to address the modeling problem of large-scale text corpora. With the recognized lead-lag relationship, the similarities of the two text corpora can be identified and the quality of topic learning in both corpora can be improved. We numerically investigate the performance of the jointly dynamic topic modeling approach using synthetic data. Finally, we apply the proposed model to two text corpora consisting of statistical papers and graduation theses. Results show that the proposed model recognizes the lead-lag relationship between the two corpora well, and the specific and shared topic patterns in the two corpora are also discovered.
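The lead-lag idea can be illustrated with a much simpler diagnostic than the jointly dynamic topic model itself: given one topic's share over time in each corpus, find the shift at which the two series correlate most strongly. The sketch below is a plain lagged Pearson correlation, not the authors' model, and the two series are made-up numbers.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def best_lag(leader, follower, max_lag):
    """Return (lag, corr) maximizing corr(leader[t], follower[t + lag])."""
    scores = {}
    for lag in range(max_lag + 1):
        x = leader[: len(leader) - lag]
        y = follower[lag:]
        scores[lag] = pearson(x, y)
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


# Hypothetical topic shares: the follower repeats the leader two periods later.
leader = [0.10, 0.30, 0.20, 0.50, 0.40, 0.70, 0.30, 0.20, 0.60, 0.10]
follower = [0.20, 0.20] + leader[:-2]

lag, corr = best_lag(leader, follower, max_lag=4)
print(lag, round(corr, 3))
```

Unlike this post hoc check, the model described in the abstract estimates the relationship jointly with the topics, so the relationship can also improve topic quality in both corpora.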

Posted Content
TL;DR: This paper studied the content of central bank speech communication from 1997 through 2020 and asked the following questions: (i) What global topics do central banks talk about? (ii) How do these topics evolve over time?
Abstract: This paper studies the content of central bank speech communication from 1997 through 2020 and asks the following questions: (i) What global topics do central banks talk about? (ii) How do these topics evolve over time? I turn to natural language processing, and more specifically Dynamic Topic Models, to answer these questions. The analysis consists of an aggregate study of nine major central banks and a case study of the Federal Reserve, which allows for region-specific control variables. I show that: (i) Central banks address a broad range of topics. (ii) The topics are well captured by Dynamic Topic Models. (iii) The global topics exhibit strong and significant autoregressive properties not easily explained by financial control variables.
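To show what "autoregressive properties" means for a topic series, the sketch below fits a first-order autoregression, y[t] = c + phi * y[t-1], by ordinary least squares. It is a bare AR(1) without the control variables mentioned in the abstract, and the series is synthetic, constructed to follow the recursion exactly so that the fit can be checked.

```python
def fit_ar1(series):
    """OLS fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    x = series[:-1]  # lagged values
    y = series[1:]   # current values
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("series is constant; phi is not identified")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    c = my - phi * mx
    return c, phi


# Synthetic topic share that follows y[t] = 0.05 + 0.8 * y[t-1] exactly.
share = [0.10]
for _ in range(20):
    share.append(0.05 + 0.8 * share[-1])

c, phi = fit_ar1(share)
print(round(c, 4), round(phi, 4))
```

A value of phi close to 1 means a topic's prominence in one period carries over strongly into the next, which is the kind of persistence the abstract reports.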

Posted Content
TL;DR: Topic Scaling is a two-stage method: Wordfish first estimates document positions, which then serve as the dependent variable for a supervised Latent Dirichlet Allocation that learns topics ranked on the same document scale.
Abstract: This paper proposes a new methodology to study sequential corpora by implementing a two-stage algorithm that learns time-based topics with respect to a scale of document positions and introduces the concept of Topic Scaling, which ranks learned topics within the same document scale. The first stage ranks documents using Wordfish, a Poisson-based document scaling method, to estimate document positions that serve, in the second stage, as a dependent variable to learn relevant topics via a supervised Latent Dirichlet Allocation. This novelty brings two innovations in text mining as it explains document positions, whose scale is a latent variable, and ranks the inferred topics on the document scale to match their occurrences within the corpus and track their evolution. Tested on the U.S. State of the Union two-party addresses, this inductive approach reveals that each party dominates one end of the learned scale with interchangeable transitions that follow the parties' terms of office. Besides a demonstrated high accuracy in predicting in-sample documents' positions from topic scores, this method reveals further hidden topics that differentiate similar documents by increasing the number of learned topics to unfold potential nested hierarchical topic structures. Compared to other popular topic models, Topic Scaling learns topics with respect to document similarities without specifying a time frequency to learn topic evolution, thus capturing broader topic patterns than dynamic topic models and yielding more interpretable outputs than a plain latent Dirichlet allocation.
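The first stage rests on the Wordfish model, which in its standard formulation treats the count of word j in document i as Poisson with rate exp(alpha_i + psi_j + beta_j * theta_i), where theta_i is the document position. The sketch below evaluates that log-likelihood for given parameters. It does not estimate the parameters and omits the supervised LDA stage, and the counts and parameter values are invented for illustration.

```python
import math


def wordfish_rate(alpha_i, psi_j, beta_j, theta_i):
    """Expected count of word j in document i under the Wordfish model."""
    return math.exp(alpha_i + psi_j + beta_j * theta_i)


def wordfish_loglik(counts, alpha, psi, beta, theta):
    """Poisson log-likelihood of a document-by-word count matrix."""
    total = 0.0
    for i, row in enumerate(counts):
        for j, y in enumerate(row):
            lam = wordfish_rate(alpha[i], psi[j], beta[j], theta[i])
            # log Poisson pmf: y*log(lam) - lam - log(y!)
            total += y * math.log(lam) - lam - math.lgamma(y + 1)
    return total


# Hypothetical corpus: two documents, two words with opposite usage.
counts = [[8, 1], [1, 8]]
alpha = [0.0, 0.0]   # document fixed effects
psi = [1.0, 1.0]     # word fixed effects
beta = [1.0, -1.0]   # word discrimination weights

aligned = wordfish_loglik(counts, alpha, psi, beta, theta=[1.0, -1.0])
reversed_ = wordfish_loglik(counts, alpha, psi, beta, theta=[-1.0, 1.0])
print(aligned > reversed_)
```

Positions that match each document's word usage yield a higher likelihood than the reversed positions, which is the signal an estimation routine would exploit to place documents on the scale.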