Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published on this topic, receiving 71,850 citations.


Papers
Journal Article
TL;DR: MDF simplifies the traditional multi-document representation of cross structure theory and also supplies information on the change and distribution of event topics, which cannot be obtained in information fusion theory.
Abstract: A Multiple Documents Framework (MDF) is proposed for the multi-document automatic summarization task. By representing the interrelationships between text units at different levels of granularity, and the occurrence and change of events along the time dimension, the framework achieves information fusion across multiple documents while preserving the original information of the set of related documents. MDF simplifies the traditional multi-document representation of cross structure theory and, at the same time, supplies information on the change and distribution of event topics that cannot be obtained in information fusion theory. Concretely, a series of algorithms is proposed, covering MDF construction, MDF-based multi-document information fusion, and summary generation. MDF's ability to fuse multiple knowledge sources concurrently is tested in experiments on 32 different sets of web documents, with good results.

7 citations

Book Chapter
23 Jan 2016
TL;DR: A novel method for automatic extractive multi-document summarization, based on bicliques in the bipartite word-sentence occurrence graph, which is particularly suited to collections of very short, independently written texts (often single sentences) with many repeated phrases.
Abstract: With vast amounts of text available in electronic form, such as news and social media, automatic multi-document summarization can help extract the most important information. We present and evaluate a novel method for automatic extractive multi-document summarization. The method is purely combinatorial, based on bicliques in the bipartite word-sentence occurrence graph. It is particularly suited to collections of very short, independently written texts (often single sentences) with many repeated phrases, such as customer reviews of products. The method can run in subquadratic time in the number of documents, which matters when it is applied to large document collections.

7 citations
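
The biclique idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes whitespace tokenization, seeds candidate bicliques from pairs of sentences (which is quadratic, unlike the subquadratic method the abstract reports), and scores a biclique by the product of its word count and sentence count. The function names are hypothetical.

```python
from itertools import combinations


def sentence_words(sentences):
    """Turn each sentence into a set of lowercase whitespace tokens."""
    return [set(s.lower().split()) for s in sentences]


def bicliques_from_pairs(sentences):
    """Map each shared word set to the sentences that contain all of it.

    Every (word set, sentence set) pair returned is a biclique in the
    bipartite word-sentence occurrence graph: each word in the set
    occurs in each sentence in the set.
    """
    words = sentence_words(sentences)
    found = {}
    for i, j in combinations(range(len(words)), 2):
        shared = frozenset(words[i] & words[j])
        if not shared:
            continue
        # All sentences containing every shared word support this biclique.
        support = frozenset(k for k, w in enumerate(words) if shared <= w)
        found[shared] = support
    return found


def largest_biclique(sentences):
    """Return the (words, sentence ids) pair with the largest area."""
    found = bicliques_from_pairs(sentences)
    if not found:
        return frozenset(), frozenset()
    # Assumed scoring: number of words times number of sentences.
    return max(found.items(), key=lambda kv: len(kv[0]) * len(kv[1]))


reviews = [
    "battery life is great",
    "great battery life overall",
    "screen is dim",
    "battery life could be better",
    "battery life drains fast",
]
phrase, support = largest_biclique(reviews)
print(sorted(phrase), sorted(support))
```

On these made-up reviews, the repeated phrase shared by the most sentences wins, which is the kind of signal an extractive summarizer for short review texts could use to pick representative sentences.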

Book Chapter
22 Jun 2006
TL;DR: This paper focuses on obtaining diverse views of a single event using a multi-camera system and, from this perspective, addresses the problem of summarizing event sequences collected in an office environment; experiments confirm that the proposed method yields acceptable results.
Abstract: Recently, the summarization of video data has been widely studied owing to the proliferation of user-created content. The use of multiple cameras to collect video data has also been increasing, but most such work uses the multi-camera system either to cover a wide area or to track moving objects. This paper focuses on obtaining diverse views of a single event using a multi-camera system and, from this perspective, addresses the problem of summarizing event sequences collected in an office environment. Summarization consists of camera view selection and event sequence summarization. View selection builds a single event sequence from multiple event sequences by selecting the optimal view at each point in time, using a domain ontology based on the elements of an office environment together with rules derived from questionnaire surveys. Summarization generates a summarized sequence from the whole sequence, using a fuzzy rule-based system to approximate human decision making. Degrees of interest supplied by users are used in both parts. Finally, summarization experiments confirm that the proposed method yields acceptable results.

7 citations

01 Jan 2016
TL;DR: This work employs several sentiment lexicons, such as SentiWordNet and SenticNet, with tabulation-based approaches to identify the important query-based iUnits for ranking and summarization.
Abstract: The NTCIR-12 MobileClick task was designed to rank and summarize English queries. Its primary aim was to develop a system capable of minimizing the interaction between human users and mobile phones while extracting relevant data for given queries. The organizers provided the data as information units (iUnits). Each iUnit describes a pertinent query together with other information such as type or category, relevance, sense, and knowledge-based relations [1] [2] [4]. The task is divided into two subtasks, ranking and summarization. The ranking subtask focuses on identifying the important iUnits related to a query. In the summarization subtask, the output has to be designed as a two-layered model in which the first layer identifies the important iUnits and the second layer compiles those iUnits into a summarized output for the query. In the present work, we employ several sentiment lexicons, such as SentiWordNet and SenticNet, with tabulation-based approaches to identify the important query-based iUnits for ranking and summarization. Our sense-based system achieved a mean Q-measure of 0.8859 for the ranking subtask and a mean M-measure of 11.7033 for the summarization subtask.

7 citations
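
The two-layered model described in the abstract above (first identify the important iUnits, then compile them into an output) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' system: the `lexicon` dictionary with per-word weights is hypothetical and stands in for a real resource such as SentiWordNet, and the character budget `max_chars` is an assumed constraint.

```python
def score_iunit(text, lexicon):
    """Sum the (hypothetical) lexicon weights of the tokens in an iUnit."""
    return sum(lexicon.get(tok, 0.0) for tok in text.lower().split())


def rank_iunits(iunits, lexicon):
    """Layer 1: order iUnits from most to least important.

    sorted() is stable, so iUnits with equal scores keep their input order.
    """
    return sorted(iunits, key=lambda u: score_iunit(u, lexicon), reverse=True)


def summarize(iunits, lexicon, max_chars):
    """Layer 2: greedily compile top-ranked iUnits within a character budget."""
    chosen = []
    used = 0
    for unit in rank_iunits(iunits, lexicon):
        if used + len(unit) <= max_chars:
            chosen.append(unit)
            used += len(unit)
    return chosen


# Made-up weights and iUnits, for illustration only.
lexicon = {"open": 1.0, "hours": 2.0, "price": 1.5}
iunits = ["located downtown", "ticket price 10 usd", "open hours 9 to 5"]
ranked = rank_iunits(iunits, lexicon)
summary = summarize(iunits, lexicon, max_chars=40)
print(ranked)
print(summary)
```

Keeping ranking and compilation as separate functions mirrors the two subtasks: the ranking layer can be evaluated on its own, and the compilation layer can swap in a different budget or selection rule without touching the scoring.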


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  74
2022  160
2021  52
2020  61
2019  47
2018  52