Open Access · DOI

Using lexical chains for text summarization

TL;DR
Empirical results on the identification of strong chains and of significant sentences are presented in this paper, and plans to address shortcomings are briefly presented.
Abstract
We investigate one technique to produce a summary of an original text without requiring its full semantic interpretation, relying instead on a model of the topic progression in the text derived from lexical chains. We present a new algorithm to compute lexical chains in a text, merging several robust knowledge sources: the WordNet thesaurus, a part-of-speech tagger, a shallow parser for the identification of nominal groups, and a segmentation algorithm. Summarization proceeds in four steps: the original text is segmented, lexical chains are constructed, strong chains are identified, and significant sentences are extracted. We present in this paper empirical results on the identification of strong chains and of significant sentences. Preliminary results indicate that quality indicative summaries are produced. Pending problems are identified, and plans to address these shortcomings are briefly presented.
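The four-step pipeline in the abstract can be sketched as follows. This is a hypothetical, heavily simplified illustration, not the paper's algorithm: real lexical chains link words through WordNet relations (synonymy, hypernymy, and the like), while here a "chain" is merely the set of sentences sharing a repeated content word, and the stopword list, sentence list, and scoring are invented for the example.

```python
import re
from collections import defaultdict

# Stand-in for part-of-speech tagging / nominal-group extraction:
# drop a few function words and treat remaining tokens as candidates.
STOPWORDS = {"the", "a", "an", "on", "of", "in"}

def build_chains(sentences):
    """Map each candidate term to the indices of sentences containing it."""
    chains = defaultdict(list)
    for i, sent in enumerate(sentences):
        for word in set(re.findall(r"[a-z]+", sent.lower())) - STOPWORDS:
            chains[word].append(i)
    return chains

def strong_chains(chains, min_len=2):
    """Call a chain 'strong' if its term recurs across several sentences."""
    return {w: idxs for w, idxs in chains.items() if len(idxs) >= min_len}

def summarize(sentences, n=1):
    """Extract the n sentences participating in the most strong chains."""
    strong = strong_chains(build_chains(sentences))
    scores = [sum(i in idxs for idxs in strong.values())
              for i in range(len(sentences))]
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    return [sentences[i] for i in sorted(ranked[:n])]

sentences = ["The cat sat on the mat.",
             "A cat chased the dog.",
             "The dog barked loudly."]
print(summarize(sentences, n=1))  # ['A cat chased the dog.']
```

The middle sentence wins because it sits on both repeated-word chains ("cat" and "dog"), mirroring the intuition that sentences intersecting many strong chains are significant.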



Citations
Journal Article · DOI

LexRank: graph-based lexical centrality as salience in text summarization

TL;DR: LexRank is a stochastic graph-based method for computing the relative importance of textual units in natural language processing, based on the concept of eigenvector centrality in a graph representation of sentences.
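The eigenvector-centrality idea behind LexRank can be illustrated with a small power iteration written from scratch (this is not the paper's implementation; the similarity matrix and damping value are invented for the example). Each row of the similarity matrix is normalized into a probability distribution, and a damped random-walk update is iterated until the scores settle.

```python
def centrality(sim, damping=0.85, iters=50):
    """Return a stationary importance score for each node of `sim`."""
    n = len(sim)
    # Row-normalize the similarity matrix so each row is a distribution.
    rows = [[x / (sum(row) or 1.0) for x in row] for row in sim]
    p = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        p = [(1.0 - damping) / n
             + damping * sum(p[j] * rows[j][i] for j in range(n))
             for i in range(n)]
    return p

# Sentences 0 and 1 are highly similar to each other; sentence 2 stands apart.
sim = [[1.0, 0.8, 0.1],
       [0.8, 1.0, 0.1],
       [0.1, 0.1, 1.0]]
scores = centrality(sim)  # sentences 0 and 1 outrank sentence 2
```

Because the update preserves the total probability mass, the scores remain a distribution; the well-connected pair accumulates more of it than the isolated node.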
Journal Article · DOI

Word sense disambiguation: A survey

TL;DR: This work introduces the reader to the motivations for resolving word ambiguity, provides a description of the task, and surveys supervised, unsupervised, and knowledge-based approaches.
Proceedings Article

Mining opinion features in customer reviews

TL;DR: This project aims to summarize all the customer reviews of a product by mining the opinion/product features that the reviewers have commented on; a number of techniques are presented to mine such features.
Proceedings Article · DOI

Generic text summarization using relevance measure and latent semantic analysis

Yihong Gong, +1 more
TL;DR: This paper proposes two generic text summarization methods that create text summaries by ranking and extracting sentences from the original documents, using the latent semantic analysis technique to identify semantically important sentences for summary creation.
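The LSA-based idea in that TL;DR can be shown in miniature (a toy sketch, not the paper's code; the term list and sentences are invented): build a term-by-sentence count matrix, take its singular value decomposition, and select the sentence carrying the most weight along the leading singular (topic) direction.

```python
import numpy as np

# Term-by-sentence count matrix over a tiny invented corpus.
terms = ["cat", "dog", "mat", "bark"]
sents = ["cat mat", "cat dog", "dog bark"]
A = np.array([[s.split().count(t) for s in sents] for t in terms], dtype=float)

# Rows of Vt give each sentence's loading on successive latent topics.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
best = int(np.argmax(np.abs(Vt[0])))  # sentence strongest on the top topic
print(sents[best])  # 'cat dog' bridges both topics, so it ranks first
```

Selecting one sentence per leading singular direction is the essence of the LSA method; the relevance-measure method the paper pairs it with is not sketched here.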
Book

Automatic Summarization

TL;DR: The challenges that remain open are discussed, in particular the need for language generation and for deeper semantic understanding of language, which would be necessary for future advances in the field.
References
Book

Cohesion in English

TL;DR: This book studies the cohesion that arises from semantic relations between sentences: reference from one sentence to another, repetition of word meanings, and the conjunctive force of but, so, then, and the like.
Journal Article · DOI

Introduction to WordNet: An On-line Lexical Database

TL;DR: Standard alphabetical procedures for organizing lexical information put together words that are spelled alike and scatter words with similar or related meanings haphazardly through the list.
Journal Article · DOI

Rhetorical Structure Theory : Toward a Functional Theory of Text Organization

TL;DR: Rhetorical Structure Theory (RST) is a descriptive theory of a major aspect of the organization of natural text: a linguistically useful method for describing natural texts, characterizing their structure primarily in terms of relations that hold between parts of the text.
Journal Article · DOI

The automatic creation of literature abstracts

TL;DR: In the exploratory research described, the complete text of an article in machine-readable form is scanned by an IBM 704 data-processing machine and analyzed in accordance with a standard program.
Journal Article · DOI

New Methods in Automatic Extracting

TL;DR: New methods of automatically extracting documents for screening purposes, i.e. the computer selection of sentences having the greatest potential for conveying to the reader the substance of the document, indicate that the three newly proposed components dominate the frequency component in the production of better extracts.