Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have appeared on this topic, receiving 71,850 citations.


Papers
Proceedings Article
31 Jul 2000
TL;DR: The proposed method is based on spreading activation over documents syntactically and semantically annotated with GDA (Global Document Annotation) tags; it extracts important documents and important parts therein, and creates a network consisting of important entities and relations among them.
Abstract: Summarization of multiple documents featuring multiple topics is discussed. The example treated here consists of fifty articles about the Peru hostage incident from December 1996 through April 1997. They cover many topics, such as opening, negotiation, and ending. The method proposed in this paper is based on spreading activation over documents syntactically and semantically annotated with GDA (Global Document Annotation) tags. The method extracts important documents and important parts therein, and creates a network consisting of important entities and relations among them. It also identifies cross-document coreferences to replace expressions with more concrete ones. The method is essentially multilingual due to the language-independence of the GDA tagset. This tagset can provide a standard format for the study of the transformation and/or generation stage of the summarization process, among other natural language processing tasks.
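
As a rough illustration of spreading activation in general (not the paper's GDA-based implementation), the sketch below propagates activation from a seed entity over a small entity graph. The graph contents, the decay factor, the iteration count, and the function name `spread_activation` are all assumptions made for this example.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, iterations=3):
    """Propagate activation from seed nodes over a weighted graph.

    graph: dict mapping node -> dict of neighbor -> link weight (hypothetical data)
    seeds: dict mapping node -> initial activation
    Returns a dict mapping each reached node to its accumulated activation.
    """
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(iterations):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            neighbors = graph.get(node, {})
            total = sum(neighbors.values())
            if total == 0:
                continue
            for neighbor, weight in neighbors.items():
                # Each neighbor receives a decayed, weight-proportional share.
                next_frontier[neighbor] += decay * energy * weight / total
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Hypothetical entity graph; the node names are illustrative only.
graph = {
    "hostage": {"embassy": 1.0, "negotiation": 1.0},
    "embassy": {"hostage": 1.0},
    "negotiation": {"hostage": 1.0, "government": 1.0},
    "government": {"negotiation": 1.0},
    "weather": {},
}
scores = spread_activation(graph, {"hostage": 1.0})
ranked = sorted(scores, key=scores.get, reverse=True)
```

Nodes that accumulate the most activation would be treated as the important entities; nodes never reached from the seeds (here `weather`) receive no score at all.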

3 citations

Proceedings Article
TL;DR: This paper proposes an automatic video summary evaluation algorithm adapted from the text summarization domain and addresses the topics of rigorous summary evaluation methods and a clear understanding of user needs, obtained through user-centered design.
Abstract: Automatic video summarization has become an active research topic in content-based video processing. However, not much emphasis has been placed on developing rigorous summary evaluation methods and developing summarization systems based on a clear understanding of user needs, obtained through user-centered design. In this paper we address these two topics and propose an automatic video summary evaluation algorithm adapted from the text summarization domain.

3 citations

Book Chapter
26 Oct 2012
TL;DR: A Chinese summarization method is proposed, based on Affinity Propagation (AP) clustering and latent semantic analysis (LSA), a natural language processing technique for analyzing relationships between a set of sentences.
Abstract: With the rapid development of the internet, we can collect more and more information. This also means we need the ability to quickly find the information that is really useful to us within that mass of information. Automatic summarization is useful for handling the huge amount of text information on the Web. This paper proposes a Chinese summarization method based on Affinity Propagation (AP) clustering and latent semantic analysis (LSA). AP is a new clustering algorithm proposed by B. J. Frey in Science in 2007 that takes as input measures of similarity between pairs of data points and simultaneously considers all data points as potential exemplars. LSA is a technique in natural language processing, in particular in vectorial semantics, for analyzing relationships between a set of sentences. Experimental results show that our method produces more comprehensive, higher-quality summaries.

3 citations

01 Jan 2004
TL;DR: For summarizing multiple documents translated by a machine translator, this work identifies important sentences and detects redundancy using an improved term-weighting method that assigns weights to words using syntactic information.
Abstract: We try to summarize multiple documents translated from Japanese to Korean in TSC3. For summarizing multiple documents translated by a machine translator, we identify important sentences and detect redundancy using an improved term-weighting method. It assigns weights to words using syntactic information. We choose sentences according to the scores of the extracted sentences, and map them to the Japanese sentences in the original documents. Finally, we arrange them in chronological order and report them as the result of our system. We submitted both a short and a long summary, and the evaluation of our results showed the possibility of cross-language multi-document summarization.
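
The select, filter, and order pipeline described in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions: a plain IDF-style weight stands in for the paper's syntax-aware term weighting, redundancy is detected with cosine similarity against an arbitrary threshold, and the mapping back to the Japanese originals is omitted. The function names, the threshold, and the sample sentences are hypothetical.

```python
import math
from collections import Counter
from datetime import date


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def summarize(sentences, max_sentences=3, redundancy_threshold=0.6):
    """sentences: list of (date, text) pairs. Returns the chosen pairs by date."""
    tokenized = [text.lower().split() for _, text in sentences]
    n = len(sentences)
    # Sentence-level document frequency, used for a simple IDF-style weight.
    df = Counter(t for tokens in tokenized for t in set(tokens))
    vectors = [
        {t: c * math.log(1 + n / df[t]) for t, c in Counter(tokens).items()}
        for tokens in tokenized
    ]
    # Sentence score: sum of its term weights (assumed scoring rule).
    scores = [sum(v.values()) for v in vectors]
    chosen = []
    for i in sorted(range(n), key=lambda i: scores[i], reverse=True):
        if len(chosen) == max_sentences:
            break
        # Skip a sentence that is too similar to one already selected.
        if all(cosine(vectors[i], vectors[j]) < redundancy_threshold for j in chosen):
            chosen.append(i)
    chosen.sort(key=lambda i: sentences[i][0])  # chronological order
    return [sentences[i] for i in chosen]


# Hypothetical input sentences with publication dates.
sentences = [
    (date(2004, 3, 2), "rescue teams reached the flooded village"),
    (date(2004, 3, 3), "rescue teams reached the flooded village today"),
    (date(2004, 3, 1), "heavy rain caused the river to overflow"),
]
summary = summarize(sentences)
```

With this input, the two near-duplicate sentences compete for one slot, so the summary contains two sentences, ordered by date, even though up to three are allowed.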

3 citations

Proceedings Article
01 Dec 2007
TL;DR: This paper investigates the use of information from relevant documents retrieved from a contemporary text collection for each sentence of a spoken document to be summarized in a probabilistic generative framework for extractive spoken document summarization.
Abstract: Extractive summarization usually automatically selects indicative sentences from a document according to a certain target summarization ratio, and then sequences them to form a summary. In this paper, we investigate the use of information from relevant documents retrieved from a contemporary text collection for each sentence of a spoken document to be summarized in a probabilistic generative framework for extractive spoken document summarization. In the proposed methods, the probability of a document being generated by a sentence is modeled by a hidden Markov model (HMM), while the retrieved relevant text documents are used to estimate the HMM's parameters and the sentence's prior probability. The results of experiments on Chinese broadcast news compiled in Taiwan show that the new methods outperform the previous HMM approach.
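
The ranking idea in this abstract, scoring each sentence S by how likely it is to generate the document D, combined with a sentence prior, can be sketched as below. The abstract does not give the model's exact form, so the interpolated unigram model used here (sentence model mixed with a background model via a hypothetical weight `lam`), the uniform default prior, and the use of the document itself as the background are all assumptions. The paper instead estimates its parameters and priors from retrieved relevant documents.

```python
import math
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def rank_sentences(sentences, lam=0.7, priors=None):
    """Rank tokenized sentences by log P(D|S) + log P(S), best first.

    P(w|S) is assumed to be lam * P_ml(w|S) + (1 - lam) * P(w|background),
    where the background model is estimated from the whole document.
    """
    doc_tokens = [w for s in sentences for w in s]
    background = unigram(doc_tokens)
    doc_counts = Counter(doc_tokens)
    n = len(sentences)
    priors = priors or [1.0 / n] * n  # uniform prior unless one is supplied
    scores = []
    for tokens, prior in zip(sentences, priors):
        model = unigram(tokens)
        log_likelihood = sum(
            count * math.log(lam * model.get(w, 0.0) + (1 - lam) * background[w])
            for w, count in doc_counts.items()
        )
        scores.append(log_likelihood + math.log(prior))
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


# Hypothetical tokenized transcript of a short spoken document.
sentences = [
    "the typhoon hit the coast".split(),
    "thanks for watching".split(),
    "the typhoon closed schools along the coast".split(),
]
ranking = rank_sentences(sentences)
```

Sentences that share the document's frequent words rank higher, while the sign-off sentence ranks last; an extractive summary would take sentences from the top of `ranking` until the target summarization ratio is reached.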

3 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52