Open Access Proceedings Article

Applying Graph-based Keyword Extraction to Document Retrieval

Abstract
This paper proposes a keyword extraction process, based on the PageRank algorithm, that reduces noise in the input data used to measure semantic similarity. It introduces several implementation-related features and discusses their effects. It also discusses experimental results, which showed significantly improved document retrieval performance with this extraction process in place.
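
As an illustration of what a PageRank-based keyword extraction step can look like, here is a minimal sketch in the style of TextRank. It is not the paper's implementation: the abstract does not specify the graph construction or the features it introduces, so the co-occurrence window, the tiny stopword list, the damping factor, and the function names `tokenize`, `build_graph`, `pagerank`, and `extract_keywords` are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately small stopword list (an assumption for this sketch).
STOPWORDS = {"the", "and", "for", "are", "that", "with", "this", "from", "its"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if len(w) > 2 and w not in STOPWORDS]


def build_graph(tokens, window=3):
    """Link each word to the words that follow it within a sliding window."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Run a fixed number of PageRank iterations on an undirected graph."""
    nodes = list(graph)
    if not nodes:
        return {}
    scores = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbor shares its score equally among its own neighbors.
            incoming = sum(scores[m] / len(graph[m]) for m in graph[n])
            updated[n] = (1 - damping) / len(nodes) + damping * incoming
        scores = updated
    return scores


def extract_keywords(text, top_k=5):
    """Return up to top_k words ranked by PageRank score (ties broken by name)."""
    scores = pagerank(build_graph(tokenize(text)))
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
```

In a retrieval setting, one plausible use is to call `extract_keywords` on each document and on the query, then compute similarity over the extracted keywords rather than the full text, so that low-ranked words contribute less noise.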


Citations
Proceedings Article

DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases

TL;DR: An end-to-end method called DivGraphPointer is presented for extracting a set of diversified keyphrases from a document that combines the advantages of traditional graph-based ranking methods and recent neural network-based approaches.
Journal Article

Fast and Constrained Absent Keyphrase Generation by Prompt-Based Learning

TL;DR: The results show that the proposed constrained absent keyphrase generation method can generate more consistent keyphrases, which can improve document retrieval performance, and that its non-autoregressive decoding can speed up absent keyphrase generation by 8.67× compared with the autoregressive method.
Proceedings Article

Hyperbolic Relevance Matching for Neural Keyphrase Extraction

TL;DR: A new hyperbolic matching model (HyperMatch) is designed to explore keyphrase extraction in hyperbolic space and outperforms the recent state-of-the-art baselines on six benchmark datasets.
Proceedings Article

Extraction of keyphrases from single document based on hierarchical concepts

TL;DR: This paper modifies approaches for extracting keyphrases from a single text document, based on hierarchical concepts built from the text of that document using an FCA-based algorithm known as the generalized one-sided concept lattice.