Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published on this topic, receiving 71,850 citations.

Papers

Journal Article•DOI•

Generating extractive sentiment summaries for natural language user queries on products

Siqi Gao, Yiu-Kai Dennis Ng

01 Jun 2022 - ACM SIGAPP Applied Computing Review

TL;DR: A fully-automated summarizer that compiles comprehensive reviews by extracting important facets and sentiment information based on various sentence features rather than applying complex machine learning algorithms is proposed.

Abstract: Multi-document sentiment analysis is an important natural language processing problem. Summaries generated by these analyzers can greatly reduce the time necessary to read a collection of topically-related documents to locate the desired information needs of a user. With the ever-increasing globalization and technology of the modern day, analysis of online user reviews on different products is an especially pertinent application of the aforementioned problem. At present there are way too many user reviews on popular products for potential buyers to spend adequate time to read and extract the most salient product details and opinions of previous buyers. In solving this problem, we propose a fully-automated summarizer to reduce the workload of online customers. The proposed system takes a user query and extracts the most relevant and essential comments made by individual reviewers. As opposed to existing multi-document summarization approaches, our summarizer compiles comprehensive reviews by extracting important facets and sentiment information based on various sentence features rather than applying complex machine learning algorithms. The design of our summarizer is easy to understand and implement, without the required massive training data and excessive training time. The conducted empirical study shows that the proposed summarization system outperforms current state-of-the-art multi-document sentiment summarization approaches.

3 citations

Proceedings Article•DOI•

Nutri-bullets Hybrid: Consensual Multi-document Summarization

Darsh J Shah¹, Lili Yu¹, Tao Lei¹, Regina Barzilay¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Jun 2021

TL;DR: Compared to conventional methods, the hybrid generation approach inspired by traditional concept-to-text systems leads to more faithful, relevant and aggregation-sensitive summarization – while being equally fluent.

Abstract: We present a method for generating comparative summaries that highlight similarities and contradictions in input documents. The key challenge in creating such summaries is the lack of large parallel training data required for training typical summarization systems. To this end, we introduce a hybrid generation approach inspired by traditional concept-to-text systems. To enable accurate comparison between different sources, the model first learns to extract pertinent relations from input documents. The content planning component uses deterministic operators to aggregate these relations after identifying a subset for inclusion into a summary. The surface realization component lexicalizes this information using a text-infilling language model. By separately modeling content selection and realization, we can effectively train them with limited annotations. We implemented and tested the model in the domain of nutrition and health – rife with inconsistencies. Compared to conventional methods, our framework leads to more faithful, relevant and aggregation-sensitive summarization – while being equally fluent.

3 citations

Journal Article•DOI•

DR-LINK in TIPSTER III

Elizabeth D. Liddy¹, Ted Diamond¹, Mary McKenna²•Institutions (2)

Syracuse University¹, University of Rochester²

01 Dec 2000 - Information Retrieval

TL;DR: Experimental results show that there is potential for improving retrieval through query-specific fusion and that analysts found the Detailed Multiple Document Summary to be extremely useful for almost every query, while the Thumbnail sketch was useful in approximately 50% of the queries.

Abstract: A Natural Language Processing based Information Retrieval System that was one of the original systems developed in Phase I of TIPSTER, was the basis of research in TIPSTER III the goal of which was to add two extended capabilities to the core system. Following a description of the multiple levels of linguistic processing that were developed for the original DR-LINK System, details are provided on research into query-specific data fusion and query-specific cross-document summarization. Experimental results show that there is potential for improving retrieval through query-specific fusion and that analysts found the Detailed Multiple Document Summary to be extremely useful for almost every query, while the Thumbnail sketch was useful in approximately 50% of the queries.

3 citations

Journal Article•DOI•

Web document segmentation using frequent term sets for summarization

Chitra Pasupathi¹, Baskaran Ramachandran², Sarukesi Karunakaran³•Institutions (3)

RMK Engineering College¹, Anna University², Hindustan University³

19 Dec 2012 - Journal of Computer Science

TL;DR: The performance of this query-sensitive summarization system is more promising than that of measures such as cosine similarity and the Jaccard measure, which rely on sparse term-frequency vectors, because the most frequent term sets are used to measure relevance.

Abstract: Query-sensitive summarization aims at extracting the query-relevant contents from web documents. Web page segmentation focuses on reducing the run-time overhead of summarization systems by grouping the related contents of a web page into segments. At query time, the query-relevant segments of the web page are identified, and important sentences from these segments are extracted to compose the summary. The DOM tree structures of the web documents are used to segment the contents. Leaf nodes of the DOM trees are merged to form segments according to statistical and linguistic similarity measures. The proposed system has been evaluated with an intrinsic approach using a user satisfaction index. The performance of the system is compared with summarization that does not use preprocessed segments. The performance of this system is more promising than that of measures such as cosine similarity and the Jaccard measure, which rely on sparse term-frequency vectors, because the most frequent term sets are used to measure relevance. Only the relevant segments need to be processed at run time for summarization, which reduces the time complexity of the summarization process.

3 citations

Book Chapter•DOI•

Investigating summarization techniques for geo-tagged image indexing

Ahmet Aker¹, Xin Fan², Mark Sanderson³, Robert Gaizauskas¹•Institutions (3)

University of Sheffield¹, Yahoo!², RMIT University³

01 Apr 2012

TL;DR: This paper investigates how various summarization techniques affect image retrieval performance and shows significant improvements can be obtained when using the summaries for indexing.

Abstract: Images with geo-tagging information are increasingly available on the Web. However, such images need to be annotated with additional textual information if they are to be retrievable, since users do not search by geo-coordinates. We propose to automatically generate such textual information by (1) generating toponyms from the geo-tagging information (2) retrieving Web documents using toponyms as queries (3) summarizing the retrieved documents. The summaries are then used to index the images. In this paper we investigate how various summarization techniques affect image retrieval performance and show significant improvements can be obtained when using the summaries for indexing.

3 citations

Performance

Metrics

Papers: 2,507

Citations: 81,726

No. of papers in the topic in previous years
Year	Papers
2023	74
2022	160
2021	52
2020	61
2019	47
2018	52
