Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Journal ArticleDOI
01 Jul 2022
TL;DR: In this paper, a memetic algorithm, specifically a Multi-Objective Shuffled Frog-Leaping Algorithm (MOSFLA), has been developed, implemented, and applied to solve the query-oriented extractive multi-document text summarization problem.
Abstract: Automatic text summarization is a topic of great interest in many fields of knowledge. In particular, query-oriented extractive multi-document text summarization methods have recently grown in importance, since they can automatically generate a summary according to a query given by the user. One way to address this problem is through multi-objective optimization approaches. In this paper, a memetic algorithm, specifically a Multi-Objective Shuffled Frog-Leaping Algorithm (MOSFLA), has been developed, implemented, and applied to solve the query-oriented extractive multi-document text summarization problem. Experiments have been conducted with datasets from the Text Analysis Conference (TAC), and the results have been evaluated with Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. The results show that the proposed approach achieves important improvements over previous works in the scientific literature: specifically, percentage improvements of 25.41%, 7.13%, and 30.22% in ROUGE-1, ROUGE-2, and ROUGE-SU4 scores, respectively. In addition, MOSFLA has been applied to medical texts from the Topically Diverse Query Focus Summarization (TD-QFS) dataset as a case study.
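To make the setting concrete, below is a minimal Python sketch of query-oriented extractive summarization: it greedily picks sentences that are similar to the query while penalizing redundancy with already-selected sentences, the same trade-off that the paper's multi-objective MOSFLA explores far more thoroughly. This is an illustrative greedy baseline, not the MOSFLA algorithm; the TF-IDF representation, weights, and function names are assumptions.

```python
# Greedy query-oriented extractive baseline (illustrative, not MOSFLA):
# balance relevance to the query against redundancy with chosen sentences.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def query_oriented_summary(sentences, query, max_sentences=3, redundancy_penalty=0.7):
    # Represent sentences and the query in the same TF-IDF space.
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(sentences + [query])
    sent_vecs, query_vec = matrix[:-1], matrix[-1]

    # Relevance of each sentence to the query.
    relevance = cosine_similarity(sent_vecs, query_vec).ravel()

    selected = []
    while len(selected) < min(max_sentences, len(sentences)):
        best, best_score = None, float("-inf")
        for i in range(len(sentences)):
            if i in selected:
                continue
            # Redundancy = highest similarity to any already-selected sentence.
            redundancy = max(
                (cosine_similarity(sent_vecs[i], sent_vecs[j])[0, 0] for j in selected),
                default=0.0,
            )
            score = relevance[i] - redundancy_penalty * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return [sentences[i] for i in sorted(selected)]
```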

4 citations

Book ChapterDOI
20 Mar 2016
TL;DR: This paper proposes a novel methodology for evaluating summaries in the context of online reputation, which profits from an analogy between reputation reports and the problem of diversity in search, and provides empirical evidence that incorporating priority signals may benefit this summarization task.
Abstract: Producing online reputation reports for an entity (company, brand, etc.) is a focused summarization task with a distinctive feature: issues that may affect the reputation of the entity take priority in the summary. In this paper we (i) propose a novel methodology to evaluate summaries in the context of online reputation which profits from an analogy between reputation reports and the problem of diversity in search; and (ii) provide empirical evidence that incorporating priority signals may benefit this summarization task.
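As a rough illustration of the diversity analogy, the hypothetical metric below scores a reputation report by the priority-weighted fraction of known issues it covers. It is not the evaluation measure defined in the paper; the issue labels, weights, and function name are invented for the example.

```python
# Illustrative priority-weighted coverage score for a reputation report
# (hypothetical metric, not the measure defined in the paper).
def priority_coverage(report_issues, issue_priority):
    """report_issues: iterable of sets of issue ids covered by each selected item.
    issue_priority: dict mapping issue id -> priority weight (higher = more important)."""
    covered = set().union(*report_issues) if report_issues else set()
    total = sum(issue_priority.values())
    if total == 0:
        return 0.0
    return sum(w for issue, w in issue_priority.items() if issue in covered) / total


# Example: covering the two high-priority issues yields most of the score.
priorities = {"data_breach": 3.0, "ceo_scandal": 2.0, "ad_campaign": 0.5}
print(priority_coverage([{"data_breach"}, {"ceo_scandal"}], priorities))  # ~0.91
```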

4 citations

Posted Content
TL;DR: This article studies the problem of domain adaptation for neural abstractive summarization and finds that combining in-domain and out-of-domain setups yields better summaries when in-domain data is insufficient.
Abstract: We study the problem of domain adaptation for neural abstractive summarization. We make initial efforts in investigating what information can be transferred to a new domain. Experimental results on news stories and opinion articles indicate that a neural summarization model benefits from pre-training based on extractive summaries. We also find that combining in-domain and out-of-domain data yields better summaries when in-domain data is insufficient. Further analysis shows that the model is capable of selecting salient content even when trained on out-of-domain data, but requires in-domain data to capture the style of the target domain.
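The general recipe the abstract describes, continuing training of an out-of-domain pre-trained summarizer on whatever in-domain examples exist, can be sketched with the Hugging Face transformers library as below. The checkpoint name, toy data, and hyperparameters are placeholders; the paper's own model and training setup differ.

```python
# Sketch of the two-stage recipe: start from a summarizer pre-trained on
# out-of-domain data, then fine-tune on the (small) in-domain set.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

checkpoint = "facebook/bart-base"  # generic seq2seq checkpoint (placeholder)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Tiny stand-in for the in-domain corpus (e.g. opinion articles).
in_domain = Dataset.from_dict({
    "document": ["The new phone's battery life disappointed most reviewers ..."],
    "summary": ["Reviewers were unhappy with the phone's battery life."],
})

def tokenize(batch):
    inputs = tokenizer(batch["document"], truncation=True, max_length=512)
    labels = tokenizer(text_target=batch["summary"], truncation=True, max_length=64)
    inputs["labels"] = labels["input_ids"]
    return inputs

train_set = in_domain.map(tokenize, batched=True, remove_columns=in_domain.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="indomain-ft", num_train_epochs=1,
                                  per_device_train_batch_size=1),
    train_dataset=train_set,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()  # continue training on in-domain data (domain adaptation step)
```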

4 citations

Proceedings ArticleDOI
12 Jul 2009
TL;DR: This paper proposes a novel approach to multi-document summarization based on subtopic segmentation, which first detects the subtopics within a topic and then finds the central sentence for each subtopic.
Abstract: This paper proposes a novel approach to multi-document summarization based on subtopic segmentation. It first detects the subtopics within a topic and then finds the central sentence for each subtopic. Sentences are scored based on their importance in the document and in the subtopic, and two anti-redundancy strategies are used to extract sentences to form the summary. Since the approach is intrinsically incremental, it remains effective when new documents are added to the document set. Experimental results indicate that the proposed approach is both effective and efficient.
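A rough sketch of this pipeline is shown below: cluster the sentences into subtopics and take the sentence nearest each cluster centroid as the subtopic's central sentence. TF-IDF vectors and k-means are illustrative stand-ins for the paper's segmentation method, and the importance scoring and anti-redundancy steps are omitted.

```python
# Rough sketch of subtopic-based extraction: cluster sentences into subtopics
# and pick the sentence nearest each cluster centroid as its "central" sentence.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def central_sentences(sentences, n_subtopics=3):
    vectors = TfidfVectorizer(stop_words="english").fit_transform(sentences)
    kmeans = KMeans(n_clusters=n_subtopics, n_init=10, random_state=0).fit(vectors)
    summary = []
    for k in range(n_subtopics):
        members = np.where(kmeans.labels_ == k)[0]
        # Central sentence = cluster member closest to the subtopic centroid.
        dists = np.linalg.norm(vectors[members].toarray() - kmeans.cluster_centers_[k], axis=1)
        summary.append(sentences[members[int(np.argmin(dists))]])
    return summary
```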

4 citations

Book ChapterDOI
30 Aug 1999
TL;DR: Fuzzy technology is explored to provide semantics for the summarizations and aggregates developed in data warehousing systems, and query capabilities against such enhanced data warehouses are provided through extensions of SQL.
Abstract: A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it provides decision makers with easy access to such historical and aggregate data, the real meaning of the data is often ignored: for example, whether a total sales amount of 1000 items indicates good or bad sales performance remains unclear. From the decision makers' point of view, the semantics that convey the meaning of the data matter more than the raw numbers. In this paper, we explore fuzzy technology to provide this semantics for the summarizations and aggregates developed in data warehousing systems. A three-layered data summarization architecture, namely quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed. To facilitate the construction of these three summarization levels, two operators are introduced. We provide query capabilities against such enhanced data warehouses through extensions of SQL.
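A toy Python illustration of layering fuzzy semantics over numeric aggregates follows: from a raw total (quantitative), to a fuzzy label such as "good sales" (qualitative), to a linguistic quantifier over several aggregates ("most regions had good sales"). The membership functions, thresholds, and figures are made up and not taken from the paper.

```python
# Toy illustration of the three summarization layers over sales aggregates:
# quantitative (raw totals) -> qualitative (fuzzy label) -> quantifier.
def good_sales_membership(total_items):
    """Degree (0..1) to which a sales total counts as 'good' (hypothetical ramp)."""
    low, high = 500, 2000
    return min(max((total_items - low) / (high - low), 0.0), 1.0)

def quantifier_most(degrees):
    """'Most' as a fuzzy quantifier: truth rises with the average membership."""
    avg = sum(degrees) / len(degrees)
    return min(max((avg - 0.3) / 0.5, 0.0), 1.0)

regional_totals = {"north": 1800, "south": 1000, "east": 400, "west": 2500}
degrees = [good_sales_membership(t) for t in regional_totals.values()]
print({r: round(good_sales_membership(t), 2) for r, t in regional_totals.items()})
print("truth('most regions had good sales') =", round(quantifier_most(degrees), 2))
```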

4 citations


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years
Year: Papers
2023: 74
2022: 160
2021: 52
2020: 61
2019: 47
2018: 52