Showing papers on "Multi-document summarization" published in 1997

Open Access

Using lexical chains for text summarization

Regina Barzilay, Michael Elhadad¹

01 Jan 1997

TL;DR: Empirical results on the identification of strong chains and of significant sentences are presented in this paper, and plans to address short-comings are briefly presented.

Abstract: We investigate one technique to produce a summary of an original text without requiring its full semantic interpretation, but instead relying on a model of the topic progression in the text derived from lexical chains. We present a new algorithm to compute lexical chains in a text, merging several robust knowledge sources: the WordNet thesaurus, a part-of-speech tagger, shallow parser for the identification of nominal groups, and a segmentation algorithm. Summarization proceeds in four steps: the original text is segmented, lexical chains are constructed, strong chains are identified and significant sentences are extracted. We present in this paper empirical results on the identification of strong chains and of significant sentences. Preliminary results indicate that quality indicative summaries are produced. Pending problems are identified. Plans to address these short-comings are briefly presented.

1,047 citations

Journal Article

Automatic text structuring and summarization

Gerard Salton¹, Amit Singhal¹, Mandar Mitra¹, Chris Buckley¹

Cornell University¹

01 Mar 1997

TL;DR: This study applies the ideas from the automatic link generation research to attack another important problem in text processing—automatic text summarization, and generates intra-document links between passages of a document.

Abstract: In recent years, information retrieval techniques have been used for automatic generation of semantic hypertext links. This study applies the ideas from the automatic link generation research to attack another important problem in text processing—automatic text summarization. An automatic “general purpose” text summarization tool would be of immense utility in this age of information overload. Using the techniques used (by most automatic hypertext link generation algorithms) for inter-document link generation, we generate intra-document links between passages of a document. Based on the intra-document linkage pattern of a text, we characterize the structure of the text. We apply the knowledge of text structure to do automatic text summarization by passage extraction. We evaluate a set of fifty summaries generated using our techniques by comparing them to paragraph extracts constructed by humans. The automatic summarization methods perform well, especially in view of the fact that the summaries generated by two humans for the same article are surprisingly dissimilar.
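The link-then-extract idea in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: raw term-frequency vectors, cosine similarity, the 0.2 link threshold, and selection of the most-linked passages. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector for one passage."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link_passages(passages, threshold=0.2):
    """Return index pairs (i, j), i < j, whose similarity reaches the threshold.

    The threshold value is an assumption; the abstract does not state one.
    """
    vecs = [term_vector(p) for p in passages]
    return {
        (i, j)
        for i in range(len(vecs))
        for j in range(i + 1, len(vecs))
        if cosine(vecs[i], vecs[j]) >= threshold
    }


def extract_summary(passages, k=2, threshold=0.2):
    """Keep the k passages with the most links, in original text order."""
    degree = Counter()
    for i, j in link_passages(passages, threshold):
        degree[i] += 1
        degree[j] += 1
    # Rank by link count (descending), breaking ties by position in the text.
    ranked = sorted(range(len(passages)), key=lambda i: (-degree[i], i))
    return [passages[i] for i in sorted(ranked[:k])]
```

A real system would likely weight terms (for example with tf-idf) and remove stopwords before linking; the sketch omits both to stay short.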

525 citations

Proceedings Article

Automated Text Summarization in SUMMARIST

Eduard Hovy¹, Chin-Yew Lin¹

Information Sciences Institute¹

01 Jul 1997

TL;DR: The system’s architecture is described and details of some of its modules, many of them trained on large corpora of text, are provided.

Abstract: SUMMARIST is an attempt to create a robust automated text summarization system, based on the ‘equation’: summarization = topic identification + interpretation + generation. Each of these stages contains several independent modules, many of them trained on large corpora of text. We describe the system’s architecture and provide details of some of its modules.

484 citations

Journal Article

Automatic analysis, theme generation, and summarization of machine-readable texts

Gerard Salton¹, James Allan¹, Chris Buckley¹, Amit Singhal¹

Cornell University¹

01 Dec 1997-Science

TL;DR: In this paper, approaches are outlined for manipulating and accessing texts in arbitrary subject areas in accordance with user needs, and methods are given for determining text themes, traversing texts selectively, and extracting summary statements that reflect text content.

Abstract: Vast amounts of text material are now available in machine-readable form for automatic processing. Here, approaches are outlined for manipulating and accessing texts in arbitrary subject areas in accordance with user needs. In particular, methods are given for determining text themes, traversing texts selectively, and extracting summary statements that reflect text content.

326 citations

Proceedings Article

Identifying Topics by Position

Chin-Yew Lin¹, Eduard Hovy¹

University of Southern California¹

31 Mar 1997

TL;DR: The automated training and evaluation of an Optimal Position Policy is described, a method of locating the likely positions of topic-bearing sentences based on genre-specific regularities of discourse structure that can be used in applications such as information retrieval, routing, and text summarization.

Abstract: This paper addresses the problem of identifying likely topics of texts by their position in the text. It describes the automated training and evaluation of an Optimal Position Policy, a method of locating the likely positions of topic-bearing sentences based on genre-specific regularities of discourse structure. This method can be used in applications such as information retrieval, routing, and text summarization.
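One way to picture training a position policy is to rank sentence positions by how often they carry topic keywords in a training set. The sketch below assumes each training document comes with a list of topic keywords and scores a position by the mean fraction of those keywords it covers. The scoring rule, the whitespace tokenization, and the name `position_policy` are illustrative assumptions, not details taken from the abstract.

```python
from collections import defaultdict


def position_policy(training_docs):
    """Rank sentence positions by mean topic-keyword coverage.

    training_docs: list of (sentences, topic_keywords) pairs, where sentences
    is a list of strings and topic_keywords a list of words (assumed input).
    Returns 0-based sentence positions, best first.
    """
    totals = defaultdict(float)  # summed coverage per position
    counts = defaultdict(int)    # number of documents that have this position
    for sentences, keywords in training_docs:
        keys = {k.lower() for k in keywords}
        for pos, sentence in enumerate(sentences):
            words = set(sentence.lower().split())
            totals[pos] += len(keys & words) / len(keys) if keys else 0.0
            counts[pos] += 1
    # Higher mean coverage first; ties go to the earlier position.
    return sorted(totals, key=lambda p: (-totals[p] / counts[p], p))
```

At summarization time, a policy like this would be applied by taking sentences from a new document of the same genre in the returned position order.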

312 citations

Proceedings Article

Multi-document summarization by graph search and matching

Inderjeet Mani¹, Eric Bloedorn¹

Mitre Corporation¹

27 Jul 1997

TL;DR: In this article, the authors describe a method for summarizing similarities and differences in a pair of related documents using a graph representation for text, where concepts denoted by words, phrases, and proper names in the document are represented positionally as nodes in the graph along with edges corresponding to semantic relations between items.

Abstract: We describe a new method for summarizing similarities and differences in a pair of related documents using a graph representation for text. Concepts denoted by words, phrases, and proper names in the document are represented positionally as nodes in the graph along with edges corresponding to semantic relations between items. Given a perspective in terms of which the pair of documents is to be summarized, the algorithm first uses a spreading activation technique to discover, in each document, nodes semantically related to the topic. The activated graphs of each document are then matched to yield a graph corresponding to similarities and differences between the pair, which is rendered in natural language. An evaluation of these techniques has been carried out.
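The two stages described in this abstract, spreading activation from topic nodes and matching of the activated graphs, can be sketched roughly as follows. The unweighted adjacency lists, the decay factor, the cut-off threshold, and the set-based matching are all simplifying assumptions; the abstract gives no such parameters, and the function names are hypothetical.

```python
def spread_activation(edges, seeds, decay=0.5, threshold=0.1):
    """Propagate activation outward from topic (seed) nodes.

    edges: dict mapping a node to a list of neighbouring nodes.
    seeds: nodes matching the topic; each starts with activation 1.0.
    Activation shrinks by `decay` per hop and stops below `threshold`
    (both values are assumptions for illustration).
    """
    activation = {s: 1.0 for s in seeds}
    frontier = list(activation.items())
    while frontier:
        node, level = frontier.pop()
        passed = level * decay
        if passed < threshold:
            continue
        for neighbour in edges.get(node, ()):
            # Keep only the strongest activation seen for each node.
            if passed > activation.get(neighbour, 0.0):
                activation[neighbour] = passed
                frontier.append((neighbour, passed))
    return activation


def compare_activated(act_a, act_b):
    """Split activated nodes into those shared by both documents and those unique to one."""
    a, b = set(act_a), set(act_b)
    return {"common": a & b, "only_a": a - b, "only_b": b - a}
```

The result of `compare_activated` stands in for the similarity-and-difference graph; rendering it in natural language is outside the scope of this sketch.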

247 citations

Journal Article

[...]

Inderjeet Mani¹, Eric Bloedorn¹

Mitre Corporation¹

25 Jun 1997-Information Retrieval

TL;DR: The approach described here exploits the results of recent progress in information extraction to represent salient units of text and their relationships, identifying meaningful relations between units based on an analysis of text cohesion and the context in which the comparison is desired.

Abstract: This summarization approach exploits the results of recent progress in information extraction to represent salient units of text and their relationships. By exploiting meaningful relations between units and the perspective from which the comparison is desired, the summarizer can pinpoint similarities and differences, generate composite summaries, and align text segments. These techniques have also been evaluated.

237 citations

Development of a document summarization system for effective information services

Dong-Hyun Jang¹, Sung Hyun Myaeng¹

Chungnam National University¹

25 Jun 1997

TL;DR: A system that constructs a summary by extracting sentences that are likely to represent the main theme of a document by using a probabilistic model that takes into account lexical and statistical information obtained from a document corpus.

Abstract: This paper describes a system that constructs a summary by extracting sentences that are likely to represent the main theme of a document. As a way of selecting summary sentences, the system uses a probabilistic model that takes into account lexical and statistical information obtained from a document corpus. As such, the system consists of two parts: the training part and the summarization part. The former processes sentences that have been manually tagged for summary sentences and extracts necessary statistical information of various kinds, and the latter uses the information to calculate the likelihood of each sentence to become part of a summary. There are at least three unique aspects of this research. First of all, the system uses a text model to identify different components of a text and eliminates parts of text that are not likely to contain summary sentences. Second, although the probabilistic model stems from an existing model developed for English texts, it applies the model to compute multiple probability values based on several features, and computes the final value by combining pieces of evidence from different sources (features) with the Dempster-Shafer theory. Finally, the system is the first of this kind for Korean texts.
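Combining per-feature evidence with Dempster-Shafer theory can be sketched with Dempster's rule of combination over a two-element frame (a sentence either belongs in the summary or does not). The frame labels, the mass values, and the feature names in the usage note are made up for illustration; the abstract does not say which features or masses the system uses.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to its mass, and the
    masses of one function sum to 1. Mass assigned to contradictory pairs
    (empty intersection) is treated as conflict and normalized away.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("evidence is totally conflicting; the rule is undefined")
    return {hyp: mass / (1.0 - conflict) for hyp, mass in combined.items()}
```

As a usage illustration, with `S = frozenset({"summary"})`, `N = frozenset({"other"})` and `T = S | N` for "undecided", a hypothetical position feature `{S: 0.6, T: 0.4}` and a hypothetical cue-word feature `{S: 0.5, N: 0.2, T: 0.3}` can be merged with `combine(...)`; the combined mass on `S` then serves as the sentence's summary score.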

12 citations

Posted Content

Multi-document Summarization by Graph Search and Matching

Inderjeet Mani¹, Eric Bloedorn¹

Mitre Corporation¹

10 Dec 1997-arXiv: Computation and Language

TL;DR: A new method for summarizing similarities and differences in a pair of related documents using a graph representation for text, with a spreading activation technique to discover nodes semantically related to the topic.

9 citations

Proceedings Article

Goal-Directed Approach for Text Summarization

Ryo Ochitani, Yoshio Nakao, Fumihito Nishino

01 Jan 1997

TL;DR: The information to include in a summary varies depending on the author's intention and the use of the summary, and the appropriate goals of the extracting process should be set and a guide should be outlined that instructs the system how to meet the tasks.

Abstract: The information to include in a summary varies depending on the author's intention and the use of the summary. To create the best summaries, the appropriate goals of the extracting process should be set and a guide should be outlined that instructs the system how to meet the tasks. The approach described in this report is intended to be a basic architecture to extract a set of concise sentences that are indicated or predicted by goals and contexts. To evaluate a sentence, the sentence selection algorithm simply measures the informativeness of each sentence by comparing with the determined goals, and the algorithm extracts a set of the highest scored sentences by repeat application of this com-

5 citations