scispace - formally typeset
Search or ask a question
Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over the lifetime, 2270 publications have been published within this topic receiving 71850 citations.


Papers
More filters
Journal ArticleDOI
TL;DR: This paper presents an automatic extractive Arabic text summarization system where the user can cap the size of the final summary, a direct system where no machine learning is involved.

56 citations

Journal Article
TL;DR: The Text Analysis Conference MultiLing Pilot of 2011 posed a multi-lingual summarization task to the summarization community, aiming to quantify and measure the performance of multi-lingsual, multi-document summarization systems.
Abstract: The Text Analysis Conference MultiLing Pilot of 2011 posed a multi-lingual summarization task to the summarization community, aiming to quantify and measure the performance of multi-lingual, multi-document summarization systems. The task was to create a 240‐250 word summary from 10 news texts, describing a given topic. The texts of each topic were provided in seven languages (Arabic, Czech, English, French, Greek, Hebrew, Hindi) and each participant generated summaries for at least 2 languages. The evaluation of the summaries was performed using automatic (AutoSummENG, Rouge) and manual processes (Overall Responsiveness score). The participating systems were 8, some of which providing summaries across all languages. This paper provides a brief description for the collection of the data, the evaluation methodology, the problems and challenges faced, and an overview of participation and corresponding results.

56 citations

Proceedings Article
Xiaojun Wan1
19 Jun 2011
TL;DR: Two summarization methods (SimFusion and CoRank) are proposed to leverage the bilingual information in the graph-based ranking framework for cross-language summary extraction to improve the effectiveness of these methods on the DUC2001 dataset.
Abstract: Cross-language document summarization is defined as the task of producing a summary in a target language (e.g. Chinese) for a set of documents in a source language (e.g. English). Existing methods for addressing this task make use of either the information from the original documents in the source language or the information from the translated documents in the target language. In this study, we propose to use the bilingual information from both the source and translated documents for this task. Two summarization methods (SimFusion and CoRank) are proposed to leverage the bilingual information in the graph-based ranking framework for cross-language summary extraction. Experimental results on the DUC2001 dataset with manually translated reference Chinese summaries show the effectiveness of the proposed methods.

56 citations

01 Jan 2004
TL;DR: A sentence extraction system that produces two sorts of multi-document summaries: the rst is a general-purpose summary of a cluster of related documents while the second is an entity-based summary of documents related to a particular person.
Abstract: We describe a sentence extraction system that produces two sorts of multi-document summaries: the rst is a general-purpose summary of a cluster of related documents while the second is an entity-based summary of documents related to a particular person. The general-purpose summary is generated by a process that ranks sentences based on their document and cluster \worthiness". The personality-based summary is constructed by a process that ranks sentences according to a metric that uses coreference and lexical information in a person prole. In both cases, a process of redundancy removal is applied to exclude repeated information.

56 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202374
2022160
202152
202061
201947
201852