Bolun Chen

Bio: Bolun Chen is an academic researcher who has contributed to research on topic models and academic writing. The author has an h-index of 3 and has co-authored 3 publications receiving 22 citations.

Papers

Journal Article•DOI•

Collective topical PageRank: a model to evaluate the topic-dependent academic impact of scientific papers

Yongjun Zhang¹, Jialin Ma, Zijian Wang¹, Bolun Chen, Yongtao Yu•Institutions (1)

Hohai University¹

01 Mar 2018-Scientometrics

TL;DR: A pipeline model, named collective topical PageRank, is proposed, which incorporates the venue, the correlations of the scientific topics, and the publication year of each paper into a random walk to evaluate the topic-dependent impact of scientific papers.

Abstract: With the explosive growth of academic writing, it is difficult for researchers to find significant papers in their area of interest. In this paper, we propose a pipeline model, named collective topical PageRank, to evaluate the topic-dependent impact of scientific papers. First, we fit the model to a correlation topic model based on the textual content of papers to extract scientific topics and correlations. Then, we present a modified PageRank algorithm, which incorporates the venue, the correlations of the scientific topics, and the publication year of each paper into a random walk to evaluate the paper’s topic-dependent academic impact. Our experiments showed that the model can effectively identify significant papers as well as venues for each scientific topic, recommend papers for further reading or citing, explore the evolution of scientific topics, and calculate the venues’ dynamic topic-dependent academic impact.

16 citations

Journal Article•DOI•

LF-LDA: A Supervised Topic Model for Multi-Label Documents Classification

Yongjun Zhang¹, Zijian Wang¹, Yongtao Yu, Bolun Chen, Jialin Ma¹, Liang Shi•Institutions (1)

Hohai University¹

01 Apr 2018-International Journal of Data Warehousing and Mining

TL;DR: A supervised topic model, named labeled LDA with function terms (LF-LDA), is introduced to filter out noisy function terms from text documents and thereby improve multi-label classification of those documents.

Abstract: This article describes how text documents are a major data structure in the era of big data. With the explosive growth of data, the number of documents with multiple labels has increased dramatically. The popular multi-label classification technology, which is usually employed to handle multinomial text documents, is sensitive to the noise terms of text documents. Therefore, considerable room for improvement remains in multi-label classification of text documents. This article introduces a supervised topic model, named labeled LDA with function terms (LF-LDA), to filter out the noisy function terms from text documents, which can help to improve the performance of multi-label classification of text documents. The article also shows the derivation of the Gibbs Sampling formulas in detail, which can be generalized to other similar topic models. Based on the textual data set RCV1-v2, the article compares the proposed model with two other state-of-the-art multi-label classifiers, Tuned SVM and labeled LDA, on both Macro-F1 and Micro-F1 metrics. The results show that LF-LDA outperforms them and has the lowest variance, which indicates the robustness of the LF-LDA classifier.

Keywords: Function Terms, Gibbs Sampling, Graph Model, Multi Label, Parameter Estimation, Probability Generation Process, Text Classification, Topic Model

8 citations

Book Chapter•DOI•

LF-LDA: A Topic Model for Multi-label Classification

Yongjun Zhang, Jialin Ma, Zijian Wang¹, Bolun Chen•Institutions (1)

Hohai University¹

10 Jun 2017

TL;DR: The experimental result on the RCV1-v2 textual dataset shows that LF-LDA can outperform two other state-of-the-art multi-label classifiers, Tuned SVM and L-LDA, on both Macro-F1 and Micro-F1 metrics, and the low variance also indicates that LF-LDA is a robust classifier.

Abstract: Textual data is growing explosively with the advent of the era of big data, and a significant portion of it consists of text documents carrying multiple labels, such as papers with keywords. Multi-label classification is a powerful technology for handling multi-labeled textual data, but considerable room remains for improving its effectiveness on such data. This paper introduces labeled LDA with function terms (LF-LDA), a topic model that extracts noisy function terms from textual data to improve the performance of multi-label classification. The experimental result on the RCV1-v2 textual dataset shows that LF-LDA can outperform two other state-of-the-art multi-label classifiers, Tuned SVM and L-LDA, on both Macro-F1 and Micro-F1 metrics. The low variance also indicates that LF-LDA is a robust classifier.

6 citations

Cited by

Journal Article•DOI•

Deep neural network for hierarchical extreme multi-label text classification

Francesco Gargiulo¹, Stefano Silvestri¹,², Mario Ciampi¹, Giuseppe De Pietro¹•Institutions (2)

Indian Council of Agricultural Research¹, University of Naples Federico II²

01 Jun 2019-Applied Soft Computing

TL;DR: An analysis of a Deep Learning architecture devoted to text classification is presented for the extreme multi-class and multi-label setting in which a hierarchical label set is defined, together with a methodology named Hierarchical Label Set Expansion (HLSE).

86 citations

Journal Article•DOI•

Mapping of topics in DESIDOC Journal of Library and Information Technology, India: a study

Manika Lamba¹, Margam Madhusudhan¹•Institutions (1)

University of Delhi¹

01 Aug 2019-Scientometrics

TL;DR: It was found that there were some unique sub-fields to Indian Library and Information Science research, such as open access; online exhibition; virtual libraries; multimedia libraries; open source software; library automation; and library management system.

Abstract: This study analyzed 928 full-text research articles retrieved from DESIDOC Journal of Library and Information Technology for the period of 1981–2018 using Latent Dirichlet Allocation. The study further tagged the articles with the modeled topics. 50 core topics were identified throughout the period of 38 years, whereas only 26 topics were unique in nature. Bibliometrics, ICT, information retrieval, and user studies were highly researched areas in India for the epoch. Further, Spain and Taiwan showed research trends and areas in common with India, whereas India has quite distinct research interests from America and China. Therefore, researchers in Library and Information Science in India should pay more attention to the topics which are under-researched. Further, it was found that there were some unique sub-fields to Indian Library and Information Science research, such as open access; online exhibition; virtual libraries; multimedia libraries; open source software; library automation; and library management system. Topics evolve over time: new topics emerge, and old ones become obsolete. Topic modeling not only helps researchers to determine the trending themes or related fields with respect to their field of interest but also helps them to identify new concepts and fields over time.

30 citations

Journal Article•DOI•

Correction: A correlated topic model of Science

David M. Blei, John Lafferty

10 Dec 2007-arXiv: Applications

TL;DR: A correction to the article "A correlated topic model of Science", published in Annals of Applied Statistics 1 (2007) 17--35.

Abstract: Correction to Annals of Applied Statistics 1 (2007) 17--35 [doi:10.1214/07-AOAS114]

27 citations

Proceedings Article•DOI•

Deep Convolution Neural Network for Extreme Multi-label Text Classification

Francesco Gargiulo¹, Stefano Silvestri¹, Mario Ciampi¹•Institutions (1)

Indian Council of Agricultural Research¹

01 Jan 2018

TL;DR: This paper presents an analysis of the usage of Deep Neural Networks for extreme multi-label and multiclass text classification, and investigates the behaviour of the neural networks as a function of the training hyperparameters, analysing the link between them and the dataset complexity.

Abstract: In this paper we present an analysis of the usage of Deep Neural Networks for extreme multi-label and multiclass text classification. We consider two network models: the first is formed by a word embeddings (WEs) stage followed by two dense layers, hereinafter Dense, and the second adds a convolution stage between the WEs and the dense layers, hereinafter CNN-Dense. We take into account classification problems characterized by different numbers of labels, ranging from an order of 10 to an order of 30,000, showing how the performance of the neural networks varies with the total number of labels and the average number of labels per sample, exploiting the hierarchical structure of the label space of the dataset used for experimental assessment. It is worth noting that multi-label classification is a harder problem than multi-class classification, due to the variable number of labels associated with each sample. We also investigate the behaviour of the neural networks as a function of the training hyperparameters, analysing the link between them and the dataset complexity. All the results will be evaluated using the PubMed scientific articles collection as

19 citations

Book•

Research Advances in the Integration of Big Data and Smart Computing

Pradeep Kumar Mallick

13 Oct 2015

16 citations
