Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have appeared within this topic, receiving 71,850 citations.


Papers
01 Dec 2015
TL;DR: This study evaluated existing methods of automatic document summarization, proposed two approaches for English documents based on latent semantic analysis, and compared their performance with existing systems in the literature.
Abstract: In this study we evaluate existing methods of automatic document summarization and propose two approaches for English documents that are based on latent semantic analysis. For summary selection, four existing and two proposed methods are used. The existing methods evaluated are those of Gong and Liu; Steinberger and Jezek; Murray, Renals and Carletta; and the Cross approach. The proposed methods are avesvd and ravesvd. Latent semantic analysis (LSA) is a technique based on vectorial semantics that analyzes relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA brings out latent relationships within a collection of documents rather than looking at each document in isolation: it considers all the documents as a whole, together with the terms they contain, to identify relationships between them. We compare the performance of our systems with existing document summarization systems in the literature. The document sets used for evaluation are the Document Understanding Conferences (DUC) corpora DUC-2002 and DUC-2004. The evaluation and comparison of the summaries are performed with ROUGE-L.

4 citations
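The LSA-based selection described in the abstract above can be sketched roughly as follows. This is a minimal illustration in the style of the Gong and Liu method, not the authors' avesvd or ravesvd systems. Several choices here are assumptions: raw term counts as matrix entries, a regex tokenizer, and power iteration with deflation on the Gram matrix A^T A (swapped in for a library SVD so that the example needs only the standard library). The function names are hypothetical.

```python
import math
import re


def term_sentence_matrix(sentences):
    """Rows are terms, columns are sentences; entries are raw term counts."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(vocab)}
    A = [[0.0] * len(sentences) for _ in vocab]
    for j, toks in enumerate(tokenized):
        for t in toks:
            A[index[t]][j] += 1.0
    return A


def dominant_eigvec(M, iters=500):
    """Power iteration on a symmetric PSD matrix; returns (eigenvalue, unit vector)."""
    n = len(M)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # nothing left to extract
            break
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(M[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v


def lsa_select(sentences, k):
    """Pick one sentence index per latent concept, for the k strongest concepts."""
    A = term_sentence_matrix(sentences)
    n = len(sentences)
    # Eigenvectors of A^T A are the right singular vectors of A.
    G = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    chosen = []
    for _ in range(min(k, n)):
        lam, v = dominant_eigvec(G)
        # Absolute values are used because the sign of a singular vector is arbitrary.
        ranked = sorted(range(n), key=lambda i: -abs(v[i]))
        chosen.append(next(i for i in ranked if i not in chosen))
        # Deflate so the next pass finds the next-strongest concept.
        G = [[G[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return chosen


if __name__ == "__main__":
    print(lsa_select(["Cat sat.", "Cat sat mat.", "Dog."], 2))
```

Each pass selects the sentence that loads most heavily on one latent concept, so the summary covers distinct concepts rather than repeating the dominant one.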

Proceedings Article
13 Jul 2008
TL;DR: This system, NetSum, is the first system to produce highlights of an article and significantly outperform the baseline, and uses novel information sources to exploit human interest for highlight extraction.
Abstract: As the amount of information on the Web grows, the ability to retrieve relevant information quickly and easily is necessary. The combination of ample news sources on the Web, little time to browse news, and smaller mobile devices motivates the development of automatic highlight extraction from single news articles. Our system, NetSum, is the first system to produce highlights of an article and significantly outperform the baseline. Our approach uses novel information sources to exploit human interest for highlight extraction. In this paper, we briefly describe the novelties of NetSum, originally presented at EMNLP 2007, and embed our work in the AI context.

4 citations

01 Jan 2012
TL;DR: A text summarization technique is designed for documents with a fixed format; it generates the summary by analyzing all the different parts of the documents.
Abstract: The rapid growth of online information has encumbered users with a colossal amount of information, and it is difficult to access such large amounts of data. This problem has increased research in the field of automatic text summarization. Automatic text summarization is a technique in which text is input to the computer, which returns a clipped and concise extract of the original text that preserves its overall meaning and main information content. In this paper, a text summarization technique is designed for documents having a fixed format. The proposed system generates the summary of fixed-format documents by analyzing all the different parts of the documents. The system consists of five stages. In the first stage, each sentence is partitioned into a list of tokens and stop words are removed. In the second stage, the usage frequency of each word is counted. In the third stage, a POS tag is assigned to each weighted term and word sense disambiguation is performed. In the fourth stage, pragmatic analysis is performed. After pragmatic analysis, the summarized sentences are stored in a database.

4 citations
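The first two stages described in the abstract above (tokenization with stop-word removal, then word frequency counting) might look something like the sketch below. The stop-word list, the regex tokenizer, and the final frequency-based sentence scoring are all assumptions added for illustration; the abstract does not specify them, and the POS tagging, word sense disambiguation, and pragmatic analysis stages are not covered.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "in", "is", "it", "of", "the", "to"}


def tokenize(sentence):
    """Stage 1: split a sentence into lowercase tokens and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", sentence.lower()) if t not in STOP_WORDS]


def word_frequencies(sentences):
    """Stage 2: count how often each remaining word is used across all sentences."""
    counts = Counter()
    for s in sentences:
        counts.update(tokenize(s))
    return counts


def top_sentences(sentences, n):
    """Assumed scoring step: rank sentences by average word frequency, keep the top n."""
    freq = word_frequencies(sentences)

    def score(s):
        toks = tokenize(s)
        return sum(freq[t] for t in toks) / len(toks) if toks else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(ranked[:n])]


if __name__ == "__main__":
    doc = ["The cat sat.", "The cat ate the fish.", "A dog barked."]
    print(word_frequencies(doc))
    print(top_sentences(doc, 1))
```

Averaging by sentence length is one possible design choice here; it keeps long sentences from winning simply because they contain more words.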

Proceedings Article
01 Dec 2013
TL;DR: The proposed system stitches together a video summary, based on the summary time span, from top-ranked shots that are semantically relevant to the user's preferences, so that the summary is tailored to the user's interests.
Abstract: Although several automatic video summarization systems have been proposed in the past, a generic summary based only on low-level features will not satisfy every user. Because users' needs and preferences for the summary of the same video differ vastly, a personalized and customized video summarization system has become an urgent need. To address this need, this paper proposes a novel system for generating unique, semantically meaningful summaries of the same video that are tailored to the preferences or interests of the users. The proposed system stitches the video summary based on the summary time span and top-ranked shots that are semantically relevant to the user's preferences. The experimental results on the performance of the proposed video summarization system are encouraging.

4 citations

Proceedings Article
01 Jan 2017
TL;DR: This work first formulates the task as multidocument summarization and question-answering tasks, given a set of aspects of the review based on an investigation of system summary tables of NLP tasks, and presents a method to address the former type of task.
Abstract: A synthesis matrix is a table that summarizes various aspects of multiple documents. In our work, we specifically examine a problem of automatically generating a synthesis matrix for scientific literature review. As described in this paper, we first formulate the task as multidocument summarization and question-answering tasks given a set of aspects of the review based on an investigation of system summary tables of NLP tasks. Next, we present a method to address the former type of task. Our system consists of two steps: sentence ranking and sentence selection. In the sentence ranking step, the system ranks sentences in the input papers by regarding aspects as queries. We use LexRank and also incorporate query expansion and word embedding to compensate for tersely expressed queries. In the sentence selection step, the system selects sentences that remain in the final output. Specifically emphasizing the summarization type aspects, we regard this step as an integer linear programming problem with a special type of constraint imposed to make summaries comparable. We evaluated our system using a dataset we created from the ACL Anthology. The results of manual evaluation demonstrated that our selection method using comparability improved

4 citations
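The sentence ranking step in the abstract above, where aspects are treated as queries, could be approximated with a query-biased LexRank-style random walk. The sketch below is only an illustration under several assumptions: plain bag-of-words cosine similarity instead of TF-IDF, no query expansion or word embeddings, and no integer linear programming selection step. The function names and the `bias_weight` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words vector as a Counter of lowercase tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def query_lexrank(sentences, query, bias_weight=0.15, iters=100):
    """Score sentences by a random walk on the similarity graph, biased toward the query."""
    vecs = [bow(s) for s in sentences]
    n = len(vecs)
    q = bow(query)
    # Query relevance, normalised to a distribution (uniform if nothing matches).
    rel = [cosine(v, q) for v in vecs]
    total = sum(rel)
    rel = [r / total for r in rel] if total else [1.0 / n] * n
    # Row-normalised sentence-to-sentence similarities form the transition matrix.
    trans = []
    for i in range(n):
        row = [cosine(vecs[i], vecs[j]) for j in range(n)]
        s = sum(row)
        trans.append([x / s for x in row] if s else [1.0 / n] * n)
    # Power iteration toward the stationary distribution of the biased walk.
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [
            bias_weight * rel[j]
            + (1 - bias_weight) * sum(p[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return p


if __name__ == "__main__":
    papers = [
        "We use LexRank for sentence ranking.",
        "The dataset comes from the ACL Anthology.",
        "Sentence ranking treats aspects as queries.",
    ]
    scores = query_lexrank(papers, "dataset")
    for s, sc in sorted(zip(papers, scores), key=lambda x: -x[1]):
        print(f"{sc:.3f}  {s}")
```

Tersely worded queries such as a single aspect name share few words with the sentences, which is presumably why the abstract mentions query expansion and word embeddings; this sketch does not attempt either.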


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  74
2022  160
2021  52
2020  61
2019  47
2018  52