Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.

Papers published on a yearly basis

Papers

Journal Article•DOI•

Multi-document summarization using closed patterns

Jipeng Qiang¹, Ping Chen¹, Wei Ding¹, Fei Xie², Xindong Wu³•Institutions (3)

University of Massachusetts Boston¹, Hefei Normal University², University of Vermont³

01 May 2016-Knowledge Based Systems

TL;DR: A pattern-based model for generic multi-document summarization is presented, which exploits closed patterns to extract the most salient sentences from a document collection and reduce redundancy in the summary.

Abstract: There are two main categories of multi-document summarization: term-based and ontology-based methods. A term-based method cannot deal with the problems of polysemy and synonymy. An ontology-based approach addresses such problems by taking the semantic information of document content into account, but the construction of an ontology requires lots of manpower. To overcome these open problems, this paper presents a pattern-based model for generic multi-document summarization, which exploits closed patterns to extract the most salient sentences from a document collection and reduce redundancy in the summary. Our method calculates the weight of each sentence of a document collection by accumulating the weights of its covering closed patterns with respect to this sentence, and iteratively selects one sentence that owns the highest weight and less similarity to the previously selected sentences, until reaching the length limitation. The sentence weight calculation by patterns reduces the dimension and captures more relevant information. Our method combines the advantages of the term-based and ontology-based models while avoiding their weaknesses. Empirical studies on the benchmark DUC2004 datasets demonstrate that our pattern-based method significantly outperforms the state-of-the-art methods. Multi-document summarization can be used to extract a particular individual's opinions in the form of closed patterns, from this individual's documents shared in social networks, and hence provides a useful tool for further analyzing the individual's behavior and influence in group activities.
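The selection loop described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the closed patterns and their weights are assumed to be mined already and passed in as `pattern_weights`, a pattern is treated as covering a sentence when all of its terms occur in that sentence, and the Jaccard similarity, the `weight * (1 - redundancy)` score, and the names `sentence_weight` and `greedy_summary` are hypothetical choices made for this example.

```python
from typing import Dict, FrozenSet, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two term sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sentence_weight(terms: Set[str],
                    pattern_weights: Dict[FrozenSet[str], float]) -> float:
    """Sum the weights of all patterns whose terms all occur in the sentence."""
    return sum(w for pattern, w in pattern_weights.items() if pattern <= terms)


def greedy_summary(sentences: List[str],
                   pattern_weights: Dict[FrozenSet[str], float],
                   max_words: int) -> List[str]:
    """Pick sentences one at a time until no candidate fits the word budget."""
    term_sets = [set(s.lower().split()) for s in sentences]
    weights = [sentence_weight(t, pattern_weights) for t in term_sets]
    selected: List[int] = []
    used_words = 0
    while True:
        best, best_score = None, float("-inf")
        for i in range(len(sentences)):
            if i in selected:
                continue
            if used_words + len(sentences[i].split()) > max_words:
                continue  # this sentence would exceed the length limit
            # Redundancy: highest similarity to any already selected sentence.
            redundancy = max(
                (jaccard(term_sets[i], term_sets[j]) for j in selected),
                default=0.0,
            )
            # Assumed scoring rule: discount the weight by the redundancy.
            score = weights[i] * (1.0 - redundancy)
            if score > best_score:
                best, best_score = i, score
        if best is None:
            break
        selected.append(best)
        used_words += len(sentences[best].split())
    return [sentences[i] for i in selected]
```

Calling `greedy_summary(sentences, pattern_weights, max_words=100)` returns the chosen sentences in selection order. A near-duplicate of an already selected sentence keeps its pattern weight but is discounted heavily, so a less similar sentence with a lower weight can be chosen ahead of it.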

54 citations

Proceedings Article•DOI•

Fast and quasi-natural language search for gigabytes of Chinese texts

Lee-Feng Chien¹•Institutions (1)

Academia Sinica¹

01 Jul 1995

TL;DR: The proposed approach is an integrated and efficient text access method, which performs well both in exact match searching of Boolean queries and best match searching (ranking) of quasi-natural language queries, which is capable of retrieving gigabytes of Chinese texts very efficiently and intelligently.

Abstract: This paper presents an efficient signature file approach for fast and intelligent retrieval of large Chinese full-text document databases. The proposed approach is an integrated and efficient text access method, which performs well both in exact match searching of Boolean queries and best match searching (ranking) of quasi-natural language queries. Using this approach, the inherent difficulties of Chinese word segmentation and proper noun identification can be effectively reduced, queries can be expressed with non-controlled vocabulary, and the ranking function can be easily implemented without demanding extra space overhead or affecting retrieval efficiency. The experimental results show that the proposed approach achieves good performance in many ways, especially in the reduction of false drops and space overhead, the speedup of retrieval time, and the capability of best match searching using quasi-natural language queries. In conclusion, the proposed approach is capable of retrieving gigabytes of Chinese texts very efficiently and intelligently.
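The abstract does not spell out the signature scheme, so the following is only a generic sketch of a signature file with superimposed coding, not the paper's method. Character bigrams as the indexing unit, the 64-bit signature width, three bits per bigram, and all function names are assumptions made for illustration. The final substring check is what removes false drops, that is, documents whose signature matches the query only by coincidence.

```python
import hashlib
from typing import List, Set

SIG_BITS = 64       # assumed signature width
BITS_PER_TERM = 3   # assumed number of bits set per bigram


def bigrams(text: str) -> Set[str]:
    """Overlapping character bigrams; no word segmentation is needed."""
    chars = [c for c in text if not c.isspace()]
    return {"".join(chars[i:i + 2]) for i in range(len(chars) - 1)}


def term_signature(term: str) -> int:
    """Hash one bigram to a bit pattern with up to BITS_PER_TERM bits set."""
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    sig = 0
    for k in range(BITS_PER_TERM):
        pos = int.from_bytes(digest[2 * k:2 * k + 2], "big") % SIG_BITS
        sig |= 1 << pos
    return sig


def text_signature(text: str) -> int:
    """Superimpose (bitwise OR) the signatures of all bigrams in the text."""
    sig = 0
    for gram in bigrams(text):
        sig |= term_signature(gram)
    return sig


def search(query: str, docs: List[str], signatures: List[int]) -> List[str]:
    """Filter by signature first, then verify to discard false drops."""
    query_sig = text_signature(query)
    hits = []
    for doc, sig in zip(docs, signatures):
        # A document can match only if it has every bit of the query set.
        if (sig & query_sig) == query_sig and query in doc:
            hits.append(doc)
    return hits
```

In use, the signatures are computed once with `[text_signature(d) for d in docs]` and stored alongside the documents. The signature test is a cheap bitwise filter that never misses a true match, so the full text is scanned only for the candidates that pass it.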

53 citations

Proceedings Article•

A Study on Position Information in Document Summarization

You Ouyang¹, Wenjie Li¹, Qin Lu¹, Renxian Zhang¹•Institutions (1)

Hong Kong Polytechnic University¹

23 Aug 2010

TL;DR: An extractive summarization model is proposed to provide an evaluation framework for the position information and results show that word position information is more effective and adaptive than sentence position information.

Abstract: Position information has been proved to be very effective in document summarization, especially in generic summarization. Existing approaches mostly consider the information of sentence positions in a document, based on a sentence position hypothesis that the importance of a sentence decreases with its distance from the beginning of the document. In this paper, we consider another kind of position information, i.e., the word position information, which is based on the ordinal positions of word appearances instead of sentence positions. An extractive summarization model is proposed to provide an evaluation framework for the position information. The resulting systems are evaluated on various data sets to demonstrate the effectiveness of the position information in different summarization tasks. Experimental results show that word position information is more effective and adaptive than sentence position information.
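The contrast between the two kinds of position information can be illustrated with a small sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: a sentence at index i gets the sentence-position score 1/(i+1), the k-th appearance of a word in the document gets the word-position weight 1/k, and a sentence's word-position score is the average of its word weights. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def sentence_position_scores(sentences: List[str]) -> List[float]:
    """Score decays with the sentence's distance from the document start."""
    return [1.0 / (i + 1) for i in range(len(sentences))]


def word_position_scores(sentences: List[str]) -> List[float]:
    """Score each sentence by the ordinal appearance of the words it contains."""
    seen: Dict[str, int] = defaultdict(int)
    scores = []
    for sentence in sentences:
        words = sentence.lower().split()
        total = 0.0
        for word in words:
            seen[word] += 1            # this is the k-th appearance of the word
            total += 1.0 / seen[word]  # earlier appearances weigh more
        scores.append(total / len(words) if words else 0.0)
    return scores
```

Under these assumptions, a late sentence that introduces many new words can still score highly on word position even though its sentence-position score is low, which is the kind of difference the evaluation framework is meant to expose.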

53 citations

Journal Article•DOI•

An R&D knowledge management method for patent document summarization

Amy J.C. Trappey¹, Charles V. Trappey•Institutions (1)

National Tsing Hua University¹

21 Mar 2008-Industrial Management and Data Systems

TL;DR: An automatic patent summarization method for accurate knowledge abstraction and effective R&D knowledge management combining the concepts of key phrase recognition and significant information density is developed.

Abstract: Purpose – In an era of rapidly expanding digital content, the number of e‐documents and the amount of knowledge frequently overwhelm the R&D teams and often impede intellectual property management. The purpose of this paper is to develop an automatic patent summarization method for accurate knowledge abstraction and effective R&D knowledge management. Design/methodology/approach – This paper develops an integrated approach for automatic patent summary generation combining the concepts of key phrase recognition and significant information density. Significant information density is defined based on the domain‐specific key concepts/phrases, relevant phrases, title phrases, indicator phrases and topic sentences of a given patent document. Findings – The document compression ratio and the knowledge retention ratio are used to measure both quantitative and qualitative outcomes of the new summarization methodology. Both measurements indicate the significant benefits and superior results of the method. Research lim...

53 citations

Posted Content•

Summary Transfer: Exemplar-based Subset Selection for Video Summarization

Ke Zhang¹, Wei-Lun Chao¹, Fei Sha¹, Kristen Grauman²•Institutions (2)

University of Southern California¹, University of Texas at Austin²

10 Mar 2016-arXiv: Computer Vision and Pattern Recognition

TL;DR: A novel subset selection technique that leverages supervision in the form of human-created summaries to perform automatic keyframe-based video summarization, and shows how to extend the method to exploit semantic side information about the video's category/genre to guide the transfer process by those training videos semantically consistent with the test input.

Abstract: Video summarization has unprecedented importance to help us digest, browse, and search today's ever-growing video collections. We propose a novel subset selection technique that leverages supervision in the form of human-created summaries to perform automatic keyframe-based video summarization. The main idea is to nonparametrically transfer summary structures from annotated videos to unseen test videos. We show how to extend our method to exploit semantic side information about the video's category/genre to guide the transfer process by those training videos semantically consistent with the test input. We also show how to generalize our method to subshot-based summarization, which not only reduces computational costs but also provides more flexible ways of defining visual similarity across subshots spanning several frames. We conduct extensive evaluation on several benchmarks and demonstrate promising results, outperforming existing methods in several settings.

53 citations

Network Information

Performance

Metrics

Papers: 2,507

Citations: 81,726

No. of papers in the topic in previous years
Year	Papers
2023	74
2022	160
2021	52
2020	61
2019	47
2018	52
