TeKET : a Tree-Based Unsupervised Keyphrase Extraction Technique

doi:10.1007/S12559-019-09706-3

Open Access · Journal Article

Gollam Rabby, +7 more

- 05 Mar 2020 -

Cognitive Computation

- Vol. 12, Iss: 4, pp 811-833

TLDR

The proposed unsupervised keyphrase extraction technique, named TeKET (Tree-based Keyphrase Extraction Technique), is domain-independent, employs limited statistical knowledge, and requires no training data.

Abstract:

Automatic keyphrase extraction techniques aim to extract quality keyphrases for higher-level summarization of a document. Most existing techniques are domain-specific: they require application domain knowledge, employ higher-order statistical methods, are computationally expensive, and require large amounts of training data, which is rare for many applications. To overcome these issues, this paper proposes a new unsupervised keyphrase extraction technique. The proposed technique, named TeKET (Tree-based Keyphrase Extraction Technique), is domain-independent, employs limited statistical knowledge, and requires no training data. It also introduces a new variant of a binary tree, called the KeyPhrase Extraction (KePhEx) tree, to extract final keyphrases from candidate keyphrases. In addition, a measure called the Cohesiveness Index (CI) is derived, which denotes a given node's degree of cohesiveness with respect to the root. The CI is used to flexibly extract final keyphrases from the KePhEx tree and is also used in the ranking process. The effectiveness of the proposed technique and its domain and language independence are evaluated experimentally on available benchmark corpora, namely SemEval-2010 (a scientific articles dataset), Theses100 (a thesis dataset), and a German Research Article dataset, respectively. The results are compared with those of other relevant unsupervised techniques from both the statistical and graph-based families. They demonstrate that the proposed technique outperforms the compared techniques in terms of precision, recall, and F1 scores.
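The abstract reports results in terms of precision, recall, and F1. As a minimal sketch of how these scores are commonly computed for keyphrase extraction, the snippet below compares a set of extracted keyphrases against a gold-standard set. The function name `keyphrase_prf1`, the exact-match comparison after lowercasing, and the sample phrases are all assumptions made for illustration; the abstract does not state the matching protocol used in the paper (benchmark evaluations often also apply stemming, which is omitted here).

```python
def keyphrase_prf1(predicted, gold):
    """Return (precision, recall, F1) for extracted vs. gold keyphrases.

    Assumption: a predicted phrase counts as correct only if it matches a
    gold phrase exactly after trimming whitespace and lowercasing.
    """
    pred_set = {p.strip().lower() for p in predicted}
    gold_set = {g.strip().lower() for g in gold}

    true_positives = len(pred_set & gold_set)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example data, not taken from the paper.
predicted = ["binary tree", "Keyphrase Extraction", "ranking", "corpus"]
gold = ["keyphrase extraction", "binary tree", "cohesiveness index"]

p, r, f = keyphrase_prf1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Here two of the four predicted phrases appear in the gold set of three, so precision is 2/4, recall is 2/3, and F1 is 4/7.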
