Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have appeared within this topic, receiving 71,850 citations.


Papers
Proceedings Article
08 Dec 2016
TL;DR: Presents an open-source framework for creating an extendable multilingual corpus for abstractive single-document summarization, and describes a tool consisting of a scalable crawler and a centralized key-value store database that constructs a corpus of arbitrary size using a news aggregator service.
Abstract: Abstractive single-document summarization is considered a challenging problem in the field of artificial intelligence and natural language processing. Meanwhile, and specifically in the last two years, several deep learning summarization approaches have been proposed that once again attracted the attention of researchers to this field. It is a well-known issue that deep learning approaches do not work well with small amounts of data. With some exceptions, this is, unfortunately, the case for most of the datasets available for the summarization task. Besides this problem, the phonetic, morphological, semantic and syntactic features of a language change constantly over time, and unfortunately most summarization corpora are constructed from old resources. Another problem is the language of the corpora. Not only in the summarization field, but also in other fields of natural language processing, most corpora are available only in English. In addition to the above problems, the license terms and fees of the corpora are obstacles that prevent many academics, and specifically non-academics, from accessing these data. This work describes an open-source framework to create an extendable multilingual corpus for abstractive single-document summarization that addresses the above-mentioned problems. We describe a tool consisting of a scalable crawler and a centralized key-value store database to construct a corpus of arbitrary size using a news aggregator service.

2 citations

01 Jan 2002
TL;DR: Proposes an automatic navigation method that follows users’ requirements for Web sites; it detects the essential pages in a site and generates an ordered list of pages that present results by applying page linkage information.
Abstract: In recent years, the WWW has expanded rapidly, and it is not always easy to retrieve information. Search engines are very useful, but they treat each page as an independent document. In this paper, we propose an automatic navigation method that follows users’ requirements for Web sites. This method detects the essential pages in the site and generates an ordered list of pages that present results by applying page linkage information. Furthermore, we have designed and implemented a prototype system based on our proposed method.

2 citations

Proceedings Article
27 May 2014
TL;DR: Introduces an unsupervised graph-based ranking model for text summarization that can benefit readers with vision difficulties while keeping them updated.
Abstract: Automated text summarization can be applied as an assistive tool for people with vision deficiency as well as with language understanding or attention deficit disorders. In this paper, we introduce an unsupervised graph-based ranking model for text summarization. Our model builds a graph by collecting words and their lexical relationships from the document. We apply a handful of available semantic information about words (definition, sentimental polarity) to enhance the edge weights (interconnectivity) between nodes (words). After applying a polarity-based ranking algorithm over the graph, we collect a subset of high-ranked and low-ranked words and designate those as keywords. We then extract sentences that possess a higher rank as defined by the rank vector of keywords. Sentences extracted in this manner correlate with each other and express the summary of the document quite successfully. Summaries formed by our model can benefit readers with vision difficulties while keeping them updated.
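The pipeline outlined in this abstract (word graph, ranking, keyword selection, sentence extraction) can be illustrated with a minimal sketch. It is not the authors' model: plain co-occurrence counts stand in for the semantically enhanced edge weights, a weighted PageRank stands in for the polarity-based ranking, and only high-ranked words are kept as keywords. All function names and parameters (`window`, `damping`, `n_keywords`, `top_k`) are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def build_word_graph(sentences, window=2):
    """Link words that co-occur within `window` positions.

    Assumption: raw co-occurrence counts replace the paper's
    definition- and polarity-based edge weights.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = tokenize(sentence)
        for i, w in enumerate(words):
            for v in words[i + 1 : i + 1 + window]:
                if v != w:
                    graph[w][v] += 1.0
                    graph[v][w] += 1.0
    return graph


def rank_words(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration.

    Assumption: this is a stand-in for the polarity-based ranking
    algorithm, which the abstract does not specify in detail.
    """
    nodes = list(graph)
    if not nodes:
        return {}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    out_weight = {n: sum(graph[n].values()) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            incoming = sum(rank[m] * graph[m][n] / out_weight[m] for m in graph[n])
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def summarize(sentences, top_k=2, n_keywords=5):
    """Return the `top_k` sentences with the highest keyword rank, in document order."""
    rank = rank_words(build_word_graph(sentences))
    keywords = set(sorted(rank, key=rank.get, reverse=True)[:n_keywords])

    def score(sentence):
        return sum(rank[w] for w in set(tokenize(sentence)) if w in keywords)

    best = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]
```

Calling `summarize(list_of_sentences, top_k=2)` returns two of the input sentences in their original order. Swapping in richer edge weights or a different ranking step only requires changing `build_word_graph` or `rank_words`.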

2 citations

01 Jan 2010
TL;DR: The proposed approach performed competitively better than the state-of-the-art MEAD summarizer at focused compression ratios, and an investigation carried out on an average of 22 documents shows that the system is promising.
Abstract: Multi-document summarization has had a great impact on the research community since the growth of online information and its availability. Selecting the most important sentences from such a huge repository of data is quite a tricky and challenging task. While the multi-document setting poses some additional overhead in sentence selection, generating summaries for each individual document and merging the sentences in a coherent order would add greater strength. The proposed approach performed competitively better than the state-of-the-art MEAD summarizer at focused compression ratios. This paper focuses on three different studies, namely (i) finding the performance of a multi-document summarizer from a single document cluster (using MEAD); (ii) comparing our approach with MEAD performance for the dataset considered; and (iii) extracting sentences for multi-document summarization at a 30% compression rate to obtain 100% efficiency using a 7-point summary sheet. An investigation carried out on an average of 22 documents shows that our system is promising.

2 citations

Journal Article
TL;DR: This paper describes a novel ontology construction approach based on Natural Language Processing (NLP) and Knowledge Representation techniques to facilitate finding important sentences for automatic summarization; the approach is more flexible and less domain dependent than traditional statistical-based ontology construction methods.
Abstract: The increased presence of unconventional data sources on the Internet in many fields has led to an exponential increase in the amount of information available. Even though search engines are used to filter the search results, a multitude of unwanted results remain. Automatic summarization has become important in speeding up the information retrieval process. However, automatic summarization usually needs a good ontology, which is not always available for all domains. This paper describes a novel ontology construction approach based on Natural Language Processing (NLP) and Knowledge Representation techniques to facilitate finding important sentences for automatic summarization. This approach is more flexible and less domain dependent than other traditional statistical-based ontology construction methods. We outline an ontology construction that improves the results of automatic summarization.

2 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52