Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published on this topic, receiving 71,850 citations.

Papers

Proceedings Article

Bayesian Query-Focused Summarization

Hal Daumé¹, Daniel Marcu¹•Institutions (1)

Information Sciences Institute¹

17 Jul 2006

TL;DR: It is shown that approximate inference in BAYESUM is possible on large data sets and results in a state-of-the-art summarization system, and that BAYESUM can be understood as a justified query expansion technique in the language modeling for IR framework.

Abstract: We present BAYESUM (for "Bayesian summarization"), a model for sentence extraction in query-focused summarization. BAYESUM leverages the common case in which multiple documents are relevant to a single query. Using these documents as reinforcement for query terms, BAYESUM is not afflicted by the paucity of information in short queries. We show that approximate inference in BAYESUM is possible on large data sets and results in a state-of-the-art summarization system. Furthermore, we show how BAYESUM can be understood as a justified query expansion technique in the language modeling for IR framework.

265 citations

Proceedings Article

Headline generation based on statistical translation

Michele Banko¹, Vibhu Mittal², Michael Witbrock•Institutions (2)

Johns Hopkins University¹, Jordan University of Science and Technology²

03 Oct 2000

TL;DR: This paper views summarization as a problem analogous to statistical machine translation and presents experiments in which statistical models of term selection and term ordering are jointly applied to produce summaries in a style learned from a training corpus.

Abstract: Extractive summarization techniques cannot generate document summaries shorter than a single sentence, something that is often required. An ideal summarization system would understand each document and generate an appropriate summary directly from the results of that understanding. A more practical approach to this problem results in the use of an approximation: viewing summarization as a problem analogous to statistical machine translation. The issue then becomes one of generating a target document in a more concise language from a source document in a more verbose language. This paper presents results on experiments using this approach, in which statistical models of the term selection and term ordering are jointly applied to produce summaries in a style learned from a training corpus.

255 citations

Proceedings Article

Multi-document summarization by graph search and matching

Inderjeet Mani¹, Eric Bloedorn¹•Institutions (1)

Mitre Corporation¹

27 Jul 1997

TL;DR: In this article, the authors describe a method for summarizing similarities and differences in a pair of related documents using a graph representation for text, where concepts denoted by words, phrases, and proper names in the document are represented positionally as nodes in the graph along with edges corresponding to semantic relations between items.

Abstract: We describe a new method for summarizing similarities and differences in a pair of related documents using a graph representation for text. Concepts denoted by words, phrases, and proper names in the document are represented positionally as nodes in the graph along with edges corresponding to semantic relations between items. Given a perspective in terms of which the pair of documents is to be summarized, the algorithm first uses a spreading activation technique to discover, in each document, nodes semantically related to the topic. The activated graphs of each document are then matched to yield a graph corresponding to similarities and differences between the pair, which is rendered in natural language. An evaluation of these techniques has been carried out.

247 citations

Proceedings Article

A compositional context sensitive multi-document summarizer: exploring the factors that influence summarization

Ani Nenkova¹, Lucy Vanderwende², Kathleen McKeown¹•Institutions (2)

Stanford University¹, Microsoft²

06 Aug 2006

TL;DR: The research shows that a frequency based summarizer can achieve performance comparable to that of state-of-the-art systems, but only with a good composition function; context sensitivity improves performance and significantly reduces repetition.

Abstract: The usual approach for automatic summarization is sentence extraction, where key sentences from the input documents are selected based on a suite of features. While word frequency often is used as a feature in summarization, its impact on system performance has not been isolated. In this paper, we study the contribution to summarization of three factors related to frequency: content word frequency, composition functions for estimating sentence importance from word frequency, and adjustment of frequency weights based on context. We carry out our analysis using datasets from the Document Understanding Conferences, studying not only the impact of these features on automatic summarizers, but also their role in human summarization. Our research shows that a frequency based summarizer can achieve performance comparable to that of state-of-the-art systems, but only with a good composition function; context sensitivity improves performance and significantly reduces repetition.
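The three frequency-related factors named in the abstract (content word frequency, a composition function, and context adjustment) can be illustrated with a minimal sketch. This is not the authors' system. The averaging composition function, the squaring of word probabilities after each selection, the tiny stopword list, and the names `content_words` and `summarize` are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "was", "by", "for"}


def content_words(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def summarize(sentences, max_sentences=2):
    """Greedy frequency-based sentence extraction.

    Factor 1: content word frequency, estimated over all input sentences.
    Factor 2: composition function; here the average word probability (assumed).
    Factor 3: context adjustment; here the probability of every word in a
              selected sentence is squared, so repeated content scores lower (assumed).
    """
    tokenized = [content_words(s) for s in sentences]
    counts = Counter(w for words in tokenized for w in words)
    total = sum(counts.values())
    prob = {w: c / total for w, c in counts.items()}

    def score(i):
        words = tokenized[i]
        return sum(prob[w] for w in words) / len(words)

    chosen = []
    candidates = [i for i, words in enumerate(tokenized) if words]
    while candidates and len(chosen) < max_sentences:
        best = max(candidates, key=score)
        chosen.append(best)
        candidates.remove(best)
        for w in set(tokenized[best]):
            prob[w] = prob[w] ** 2  # down-weight words already covered
    return [sentences[i] for i in chosen]
```

With the inputs "The storm flooded the coast.", "The storm flooded the coast again." and "Rescue teams reached the town.", the sketch picks the first sentence and then the third: once the first is chosen, the down-weighting makes the near-duplicate second sentence score below the unrelated third one.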

246 citations

Journal Article

Opinion mining from online hotel reviews: A text summarization approach

Ya Han Hu¹, Yen-Liang Chen², Hui-Ling Chou²•Institutions (2)

National Chung Cheng University¹, National Central University²

01 Mar 2017-Information Processing and Management

TL;DR: This study proposes a novel multi-text summarization technique for identifying the top-k most informative sentences of hotel reviews, and developed a new sentence importance metric.

Abstract: Text summarization technique can extract essential information from online reviews. Our method can identify top-k most informative sentences from online hotel reviews. We jointly considered author, review time, usefulness, and opinion factors. Online hotel reviews were collected from TripAdvisor in experimental evaluation. The results show that our approach provides more comprehensive hotel information. Online travel forums and social networks have become the most popular platform for sharing travel information, with enormous numbers of reviews posted daily. Automatically generated hotel summaries could aid travelers in selecting hotels. This study proposes a novel multi-text summarization technique for identifying the top-k most informative sentences of hotel reviews. Previous studies on review summarization have primarily examined content analysis, which disregards critical factors like author credibility and conflicting opinions. We considered such factors and developed a new sentence importance metric. Both the content and sentiment similarities were used to determine the similarity of two sentences. To identify the top-k sentences, the k-medoids clustering algorithm was used to partition sentences into k groups. The medoids from these groups were then selected as the final summarization results. To evaluate the performance of the proposed method, we collected two sets of reviews for the two hotels posted on TripAdvisor.com. A total of 20 subjects were invited to review the text summarization results from the proposed approach and two conventional approaches for the two hotels. The results indicate that the proposed approach outperforms the other two, and most of the subjects believed that the proposed approach can provide more comprehensive hotel information.
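The k-medoids selection step described in the abstract can be sketched as follows. This is a simplified illustration, not the authors' method. Sentence distance is assumed to be Jaccard distance over word sets, the paper's sentiment similarity and sentence importance metric are left out, the medoids are initialized to the first k sentences, and the names `jaccard_distance`, `k_medoids`, and `top_k_sentences` are hypothetical. The sketch also assumes k does not exceed the number of sentences and that the sentences are distinct.

```python
import re


def jaccard_distance(a, b):
    """Distance between two word sets: 1 - |a & b| / |a | b| (assumed metric)."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def k_medoids(dist, k, max_iter=100):
    """Alternate between assigning points to the nearest medoid and
    re-picking each cluster's medoid until the medoid set stops changing.

    dist is a symmetric n x n matrix; the first k points seed the medoids
    (a deterministic initialization chosen for this sketch).
    """
    n = len(dist)
    medoids = list(range(k))
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for i in range(n):
            nearest = min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        new_medoids = sorted(
            # The member with the smallest total distance to its cluster wins;
            # an empty cluster keeps its old medoid.
            min(members or [m], key=lambda c: sum(dist[c][j] for j in members))
            for m, members in clusters.items()
        )
        if new_medoids == sorted(medoids):
            break
        medoids = new_medoids
    return sorted(medoids)


def top_k_sentences(sentences, k):
    """Return the k medoid sentences as the summary."""
    words = [set(re.findall(r"[a-z]+", s.lower())) for s in sentences]
    dist = [[jaccard_distance(a, b) for b in words] for a in words]
    return [sentences[m] for m in k_medoids(dist, k)]
```

For five short reviews, two about the room and three about the staff, calling `top_k_sentences(reviews, 2)` returns one representative sentence from each group, even though both initial medoids start in the room group.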

243 citations

Performance Metrics

Papers: 2,507

Citations: 81,726

No. of papers in the topic in previous years
Year	Papers
2023	74
2022	160
2021	52
2020	61
2019	47
2018	52
