Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published on this topic, receiving 71,850 citations.


Papers
Journal Article
TL;DR: Methods for news sentence extraction, sentence importance calculation, and redundancy reduction are introduced, and experimental results show that the summarization is good enough for practical application.
Abstract: Special news topics on Web sites contain a large number of pages. Automatic multi-document summarization lets people grasp the main information quickly. A method that uses time stamps to improve sentence extraction quality is presented. Methods for news sentence extraction, sentence importance calculation, and redundancy reduction are introduced. Experimental results show that the summarization is good enough for practical application.
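
The redundancy reduction step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how redundancy is measured or how time stamps enter the score, so everything here is an assumption: the Jaccard word-overlap similarity, the `threshold` value, and the function names `jaccard` and `select_sentences` are hypothetical, and the importance scores are taken as given.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences (assumed measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def select_sentences(scored, max_sentences=3, threshold=0.5):
    """Greedily pick high-scoring sentences, skipping near-duplicates.

    scored: list of (sentence, importance_score) pairs; the scores are
    assumed to come from a separate importance calculation.
    """
    selected = []
    for sentence, _score in sorted(scored, key=lambda x: x[1], reverse=True):
        # Keep the sentence only if it is not too similar to any chosen one.
        if all(jaccard(sentence, s) < threshold for s in selected):
            selected.append(sentence)
        if len(selected) == max_sentences:
            break
    return selected


candidates = [
    ("the flood hit the city on monday", 0.9),
    ("the flood hit the city monday", 0.8),
    ("rescue teams arrived quickly", 0.7),
]
summary = select_sentences(candidates)
```

In this example the second candidate overlaps heavily with the first and is dropped, so the summary keeps the first and third sentences.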

1 citation

Journal Article
TL;DR: Chinese multi-document summarization based on a topic model is a new attempt, and experimental results show that it clearly outperforms the traditional method under the proposed evaluation scheme.
Abstract: Multi-document summarization can help people access information automatically and quickly. Chinese multi-document summarization based on a topic model is a new attempt. The LDA (Latent Dirichlet Allocation) model is a multi-level generative probabilistic model that can detect the topic distribution of a document. The method models the documents using LDA, then calculates the distance between each sentence and the given document set via their topic probability distributions and uses it as the weight of the sentence. Sentences are then extracted according to their weights. Experimental results show that the method clearly outperforms the traditional method under the proposed evaluation scheme.
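
The sentence weighting described in this abstract can be sketched as follows. The abstract does not name the distance measure, so Jensen-Shannon divergence is used here as an assumption, and the topic distributions are hard-coded hypothetical values standing in for the output of a trained LDA model. The names `js_divergence` and `rank_sentences` are illustrative only.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two topic distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def rank_sentences(sentence_topics, collection_topics):
    """Order sentences from closest to farthest from the document set."""
    scored = [
        (js_divergence(dist, collection_topics), sentence)
        for sentence, dist in sentence_topics.items()
    ]
    scored.sort(key=lambda pair: pair[0])
    return [sentence for _, sentence in scored]


# Hypothetical topic distributions over three topics.
collection = [0.6, 0.3, 0.1]
sentences = {
    "sentence A": [0.7, 0.2, 0.1],
    "sentence B": [0.1, 0.1, 0.8],
}
ranking = rank_sentences(sentences, collection)
```

Here a smaller distance is treated as a higher weight, so sentence A, whose topic mix resembles the document set, ranks first.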

1 citation

01 Jan 1998
TL;DR: This research will test an evaluation method and metric to compare human assessments with machine output of newstext multiple document summaries and uncover useful metrics and evaluation variables that can be used by other research efforts in this area.
Abstract: This paper describes an ongoing research effort to produce multiple document summaries in response to information requests. Given the absence of tools to evaluate multiple document summaries, this research will test an evaluation method and metric to compare human assessments with machine output of newstext multiple document summaries. Using the DR-LINK information retrieval and analysis system, components of documents and metadata generated during document processing become candidates for use in multiple document summaries. This research is sponsored by the U.S. Government through the Tipster Phase III Text Summarization project. TextWise is a participant in the Tipster Phase III Text Summarization project funded by the U.S. Government. Our research objective is to produce high quality multiple document summaries. An established set of metrics to evaluate the performance of our production of multiple document summaries is not available at present. Therefore, this research effort is also concerned with developing a procedure to evaluate the summaries we create. We hope that we will uncover useful metrics and evaluation variables that can be used by other research efforts in this area. The lack of automatic summarization evaluation tools is directly connected to the need for a comprehensive description of the different types of summaries possible. Automatic text summarization can mean many different things. The summary may address a need of an information seeker (query dependent summary) or it may be independent of any specified information need (generic summary). The summary may represent a single document (single document summary) or a group of documents (multiple document summary). The summary may be an extract of sentences or pieces of text from a document (extract summary) or it may not use any of the actual wording from the source documents (generated text summary). Finally, the summary may provide a general overview of document contents (indicative summary), or it may act as a substitute for the actual document (informative summary). This terminology will be used throughout this report in an attempt to clarify and define the various possible outcomes of automatic text summarization.

1 citation

Proceedings ArticleDOI
04 Jun 2007
TL;DR: The summarization method reported in this paper attempts to use the relationships between query terms to produce an extractive summary and a graphical overview, in the form of a conceptual graph, of the document content with respect to a standing query.
Abstract: An investigation into the usefulness of query-based document summarization in the context of information retrieval has been made. The summarization method reported in this paper attempts to use the relationships between query terms to produce an extractive summary and a graphical overview, in the form of a conceptual graph, of the document content with respect to a standing query. The graphical summary will be quite useful for short queries, e.g. queries consisting only of person names, which are quite common in the newswire domain. The conceptual graph produced by the system reflects how query terms appearing in a retrieved document are related to other terms. The evaluation scheme focuses on measuring the effectiveness of the summary in supporting the user's relevance decision. Subject-based evaluation is proposed for the system. The initial investigation suggests that our approach is more indicative of the source's content than the lead-based approach.
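
One simple way to picture a graph that relates query terms to other terms in a document is a sentence-level co-occurrence graph. This is only an illustrative stand-in: the abstract does not describe how the paper's conceptual graph is built, and the function name `query_term_graph`, the whitespace tokenization, and the co-occurrence counting are all assumptions.

```python
from collections import defaultdict


def query_term_graph(sentences, query_terms):
    """Map each query term to the terms it co-occurs with, with counts.

    Co-occurrence within a sentence is an assumed proxy for a relation
    between terms; it is not the method from the paper.
    """
    query = {term.lower() for term in query_terms}
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        for q in query & tokens:
            for other in tokens - {q}:
                graph[q][other] += 1
    # Convert nested defaultdicts to plain dicts for easy inspection.
    return {q: dict(neighbours) for q, neighbours in graph.items()}


document = ["Smith met Jones", "Smith visited Paris"]
graph = query_term_graph(document, ["Smith"])
```

For a query consisting only of a person name, the resulting edges list the terms that appear alongside that name in the retrieved document.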

1 citation

01 Jan 2008
TL;DR: This paper presents the goals, results and conclusions from an experiment where several shallow text summarization methods have been applied to news articles written in Polish and focused on various techniques of salient sentence selection.
Abstract: This paper presents the goals, results and conclusions from an experiment where several shallow text summarization methods have been applied to news articles written in Polish. Specifically, we focused on various techniques of salient sentence selection, as these algorithms are the most popular in the English-speaking world and are highly efficient in practice. The quality of automatically generated summaries was evaluated by comparing them against a reference set of human-made summaries. This reference set of summaries is a valuable resource in its own right, as it comes from an unrestricted survey in which user groups from different backgrounds were asked to select the “most appropriate” sentences to form a summary.
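
Comparing an extractive summary against a reference selection of sentences can be sketched with set overlap. The abstract does not state which measure the authors used, so precision, recall, and F1 over sentence indices are an assumption here, and `sentence_overlap_scores` is a hypothetical name.

```python
def sentence_overlap_scores(system, reference):
    """Precision, recall, and F1 of selected sentence indices.

    system: indices chosen by the summarizer.
    reference: indices chosen by human annotators (assumed available).
    """
    system, reference = set(system), set(reference)
    overlap = len(system & reference)
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# The summarizer picked sentences 0, 2, 5; annotators picked 0, 1, 2, 7.
precision, recall, f1 = sentence_overlap_scores([0, 2, 5], [0, 1, 2, 7])
```

With several annotators, one option is to score against each reference selection separately and average the results.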

1 citation


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  74
2022  160
2021  52
2020  61
2019  47
2018  52