Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Book Chapter DOI
27 Nov 2006
TL;DR: The results show that hierarchical summarization of multiple documents organized in a hierarchical structure outperforms multi-document summarization systems that do not use the hierarchical structure.
Abstract: Hierarchical summarization summarizes a large document based on its hierarchical structure and salient features. Previous work has shown that hierarchical summarization is a promising technique that can effectively extract the most important information from the source document. Hierarchical summarization has been extended to the summarization of multiple documents, and three hierarchical structures have been proposed to organize a set of related documents. This paper investigates the impact of document structure on hierarchical summarization. The results show that hierarchical summarization of multiple documents organized in a hierarchical structure outperforms other multi-document summarization systems that do not use the hierarchical structure. Moreover, hierarchical summarization by event topics extracts a set of sentences significantly different from those of the other hierarchical structures and performs best when the summary is highly compressed.

5 citations
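The chapter reports results rather than code; purely as an illustration of the general idea of budgeting an extractive summary across a document hierarchy (not the authors' actual algorithm, and with a simple word-frequency salience score assumed in place of their features), a minimal sketch might look like this:

```python
from collections import Counter
import re

def word_freqs(texts):
    """Content-word frequencies used as a stand-in salience model."""
    words = re.findall(r"[a-z]+", " ".join(texts).lower())
    return Counter(w for w in words if len(w) > 3)

def sentence_score(sentence, freqs):
    """Score a sentence by the normalized frequency of its words."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(freqs[w] for w in words) / (len(words) + 1)

def hierarchical_summary(topic_tree, budget=6):
    """topic_tree maps a (hypothetical) topic node to the sentences of the
    documents grouped under it. The sentence budget is split across topics
    in proportion to their total salience, and the top-scoring sentences of
    each topic are extracted."""
    freqs = word_freqs([s for sents in topic_tree.values() for s in sents])
    topic_salience = {t: sum(sentence_score(s, freqs) for s in sents)
                      for t, sents in topic_tree.items()}
    total = sum(topic_salience.values()) or 1.0
    summary = []
    for topic, sents in topic_tree.items():
        quota = max(1, round(budget * topic_salience[topic] / total))
        ranked = sorted(sents, key=lambda s: sentence_score(s, freqs), reverse=True)
        summary.extend(ranked[:quota])
    return summary[:budget]
```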

Proceedings Article DOI
26 Jun 2016
TL;DR: The proposed mining system is able to cope with news written in multiple languages, generates multi-level summaries covering specific and high-level concepts in separate sections for users with different skill levels, and ranks the summary content based on both objective and subjective quality indices.
Abstract: In today's world, plenty of textual news on stock markets, written in different languages, is available to traders, financial promoters, and private investors. However, its potential for supporting trading in multiple foreign markets is limited by the large volume of the textual corpora, which is practically unmanageable for manual inspection. Although text mining and information retrieval techniques allow the automatic generation of interesting summaries from document collections, the study and application of multilingual summarization algorithms to financial news is still an open research problem. This paper addresses the summarization of collections of financial documents written in different languages to enhance financial actors' awareness of foreign markets. Specifically, the proposed mining system (i) is able to cope with news written in multiple languages, (ii) generates multi-level summaries covering specific and high-level concepts in separate sections for users with different skill levels, and (iii) ranks the summary content based on both objective and subjective quality indices. These features are taking an increasingly important role in financial data summarization. As a case study, a preliminary implementation of the proposed system is presented and validated on real multilingual news covering stocks of different markets. The preliminary results show the effectiveness and usability of the proposed approach.

5 citations
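As a rough sketch of the kind of pipeline described, assuming news sentences are already grouped by language and using plain word-frequency salience in place of the paper's objective and subjective quality indices (both assumptions, not the authors' system):

```python
from collections import Counter
import re

def rank_sentences(sentences):
    """Rank the sentences of one language by normalized word-frequency salience."""
    freqs = Counter(re.findall(r"\w+", " ".join(sentences).lower()))
    def score(s):
        toks = re.findall(r"\w+", s.lower())
        return sum(freqs[t] for t in toks) / (len(toks) + 1)
    return sorted(sentences, key=score, reverse=True)

def multilevel_summary(news_by_language, high_level_k=2, detailed_k=6):
    """news_by_language: {"en": [...sentences...], "it": [...], ...}.
    Returns, per language, a short high-level section and a longer detailed
    section, mirroring the multi-level summaries described above."""
    summary = {}
    for lang, sentences in news_by_language.items():
        ranked = rank_sentences(sentences)
        summary[lang] = {"high_level": ranked[:high_level_k],
                         "detailed": ranked[:detailed_k]}
    return summary
```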

01 Jan 2004
TL;DR: A multiple-document summarization system with user interaction that summarizes multiple documents into a single document and shows the k best keywords, according to the system's scoring, to a user on the screen.
Abstract: We propose a multiple-document summarization system with user interaction that summarizes multiple documents into a single document. Our system extracts keywords from the sets of documents to be summarized and shows the k best keywords, with respect to scoring by our system, to a user on the screen. From the shown keywords, the user selects those reflecting the user's summarization need. Our system controls the produced summary by using these selected keywords. For evaluation of our method, we participated in TSC3 of the NTCIR4 workshop by letting our system select all k

5 citations
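The abstract describes an interaction loop: extract keywords, show the k best to the user, and steer the summary with the keywords the user selects. A minimal sketch of that loop, with plain term frequency standing in for the paper's unspecified keyword scoring (an assumption), could look like this:

```python
from collections import Counter
import re

def top_keywords(documents, k=10):
    """Return the k most frequent content words across the document set."""
    words = re.findall(r"[a-z]+", " ".join(documents).lower())
    counts = Counter(w for w in words if len(w) > 3)
    return [w for w, _ in counts.most_common(k)]

def keyword_guided_summary(documents, selected_keywords, max_sentences=5):
    """Rank sentences by how many user-selected keywords they contain
    (ties broken toward shorter sentences) and return the top ones."""
    sentences = [s.strip()
                 for d in documents
                 for s in re.split(r"(?<=[.!?])\s+", d) if s.strip()]
    chosen = set(selected_keywords)
    def overlap(s):
        return len(set(re.findall(r"[a-z]+", s.lower())) & chosen)
    return sorted(sentences, key=lambda s: (-overlap(s), len(s)))[:max_sentences]

# Hypothetical usage: display top_keywords(docs) to the user, collect their
# selections, then call keyword_guided_summary(docs, selections).
```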

Journal Article DOI
TL;DR: A deep submodular network (DSN) is introduced, which is a deep network meeting submodularity characteristics; it lets modular and submodular features participate in constructing a tailored model that best fits a given problem.
Abstract: Employing deep learning makes it possible to learn high-level features from raw data, resulting in more precise models. On the other hand, submodularity makes the solution scalable and provides the means to guarantee a lower bound on its performance. In this paper, a deep submodular network (DSN) is introduced, which is a deep network meeting submodularity characteristics. DSN lets modular and submodular features participate in constructing a tailored model that best fits a given problem. Various properties of DSN are examined and its learning method is presented. By proving that the cost function used in the learning process is convex, it is concluded that minimization can be done in polynomial time; moreover, by choosing a suitable learning rate and performing enough iterations, a lower empirical error can be ensured. Finally, in order to demonstrate the applicability of DSN to real-world problems, automatic multi-document summarization is considered and a summarizer called DSNSum is introduced. The performance of DSNSum is then compared with state-of-the-art summarizers on the DUC 2004 and CNN/DailyMail corpora. The experimental results show that the performance of the proposed summarizer is comparable with state-of-the-art methods.

5 citations
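The DSN itself is a learned deep network and is not reproduced here. As background for why submodularity matters for summarization, here is a minimal sketch of the classic greedy procedure for maximizing a monotone submodular coverage objective under a word budget, the kind of building block such summarizers rely on; the objective and budget handling below are simplifications, not the paper's model:

```python
import re

def coverage(selected, vocabulary):
    """Monotone submodular objective: number of distinct words covered."""
    covered = set()
    for s in selected:
        covered |= set(re.findall(r"[a-z]+", s.lower()))
    return len(covered & vocabulary)

def greedy_submodular_summary(sentences, budget_words=100):
    """Repeatedly add the sentence with the largest marginal coverage gain
    that still fits within the word budget."""
    vocabulary = set(re.findall(r"[a-z]+", " ".join(sentences).lower()))
    summary, used = [], 0
    remaining = list(sentences)
    while remaining:
        base = coverage(summary, vocabulary)
        best, best_gain = None, 0
        for s in remaining:
            if used + len(s.split()) > budget_words:
                continue
            gain = coverage(summary + [s], vocabulary) - base
            if gain > best_gain:
                best, best_gain = s, gain
        if best is None:
            break
        summary.append(best)
        used += len(best.split())
        remaining.remove(best)
    return summary
```

Because the objective is monotone and submodular, greedy selection of this kind comes with the well-known (1 - 1/e)-style approximation guarantee in the cardinality-constrained case, which is the scalability and lower-bound argument the abstract alludes to.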

25 Jul 2011
TL;DR: The first steps toward improving summarization of scientific documents through citation analysis and parsing are presented; it is demonstrated that confidence scores from the Stanford NLP Parser are significantly improved and that Trimmer, a sentence-compression tool, is able to generate higher-quality candidates.
Abstract: In this paper we present the first steps toward improving summarization of scientific documents through citation analysis and parsing. Prior work (Mohammad et al., 2009) argues that citation texts (sentences that cite other papers) play a crucial role in automatic summarization of a topical area, but did not take into account the noise introduced by the citations themselves. We demonstrate that it is possible to improve summarization output through careful handling of these citations. We base our experiments on the application of an improved trimming approach to summarization of citation texts extracted from Question-Answering and Dependency-Parsing documents. We demonstrate that confidence scores from the Stanford NLP Parser (Klein and Manning, 2003) are significantly improved, and that Trimmer (Zajic et al., 2007), a sentence-compression tool, is able to generate higher-quality candidates. Our summarization output is currently used as part of a larger system, Action Science Explorer (ASE) (Gove, 2011).

4 citations
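The abstract attributes the improvement to careful handling of citation markers before parsing and compression. The regular expressions below are illustrative assumptions about what such cleanup could look like, not the authors' actual rules:

```python
import re

# Illustrative patterns for common inline citation forms such as
# "(Klein and Manning, 2003)", "(Mohammad et al., 2009)", or "[12]".
PAREN_CITATION = re.compile(r"\(\s*[A-Z][^()]*?\b\d{4}[a-z]?\s*\)")
NUMERIC_CITATION = re.compile(r"\[\d+(?:\s*,\s*\d+)*\]")

def strip_citations(sentence):
    """Remove inline citation markers so the parser and the sentence
    compressor see cleaner input, then normalize leftover whitespace."""
    cleaned = PAREN_CITATION.sub("", sentence)
    cleaned = NUMERIC_CITATION.sub("", cleaned)
    cleaned = re.sub(r"\s+([.,;])", r"\1", cleaned)  # drop space left before punctuation
    return re.sub(r"\s{2,}", " ", cleaned).strip()

# Hypothetical usage:
# strip_citations("Citation texts play a crucial role (Mohammad et al., 2009).")
# -> "Citation texts play a crucial role."
```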


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52