Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Patent
27 Jun 2008
TL;DR: A multi-method summarization program, including instructions for summarizing data for a transaction processing system, is presented; at least one functional aspect of the system for which a summarization of a subset of the data is desired is determined.
Abstract: A method of summarizing data includes providing a multi-method summarization program including instructions for summarizing data for a transaction processing system. At least one functional aspect of the transaction processing system for which a summarization of a subset of the data is desired is determined. The functional subset is exposed to a user as a light summarization program. The dependencies of the functional subset can be enforced at runtime, allowing packaging flexibility. Also described is a method for efficient parallel processing involving requests for help that are not necessarily filled.

7 citations

Journal Article
TL;DR: The state of the art in automatic systems techniques is briefly explored, along with a comparison with human summarization activity.
Abstract: Focused Multi-Document Summarization (MDS) is concerned with summarizing documents in a collection with a concentration toward a particular external request (i.e. query, question, topic, etc.), or focus. Although the current state-of-the-art provides somewhat decent performance for DUC/TAC-like evaluations (i.e. government and news concerns), other considerations need to be explored. This paper not only briefly explores the state-of-the-art in automatic systems techniques, but also a comparison with human summarization activity.

7 citations

Proceedings Article
23 Aug 2008
TL;DR: Results show that summaries generated using this approach are considerably better than those produced by an up-to-date latent semantic analysis (LSA) summarization method and suggest that humans prefer summaries restricted to the information conveyed in the input source.
Abstract: Speech-to-text summarization systems usually take as input the output of an automatic speech recognition (ASR) system that is affected by issues like speech recognition errors, disfluencies, or difficulties in the accurate identification of sentence boundaries. We propose the inclusion of related, solid background information to cope with the difficulties of summarizing spoken language and the use of multi-document summarization techniques in single document speech-to-text summarization. In this work, we explore the possibilities offered by phonetic information to select the background information and conduct a perceptual evaluation to better assess the relevance of the inclusion of that information. Results show that summaries generated using this approach are considerably better than those produced by an up-to-date latent semantic analysis (LSA) summarization method and suggest that humans prefer summaries restricted to the information conveyed in the input source.

7 citations

Journal Article
TL;DR: A series of algorithms is proposed, covering MRS construction, MRS-based multi-document information fusion, and summary generation; experiments indicate that MRS strategies can fuse multiple knowledge sources concurrently with good results.
Abstract: A Multi-document Rhetorical Structure (MRS) is proposed for the multi-document automatic summarization task. In this structure, the interrelationships between text units, including the correlation between units calculated by a hierarchical topic tree, the rhetorical relationships, and the temporal relationships, are represented at different levels of granularity. MRS simplifies the traditional multi-document representation of cross structure theory and supplements it with information on the change and distribution of event topics, which cannot be obtained from information fusion theory. Concretely, a series of algorithms is proposed, covering MRS construction, MRS-based multi-document information fusion, and summary generation. The capability of MRS strategies to fuse multiple knowledge sources concurrently is verified by sets of experiments, which show good results.

7 citations

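Journal Article
TL;DR: A multi-document summarization method is proposed that introduces interaction with the user to reflect the user's summarization request: keywords extracted from the documents are presented to the user, and the generated summary changes according to the keywords the user selects.
Abstract: This paper proposes a multi-document summarization method that introduces interaction with the user in order to reflect the user's summarization request. Conventionally, automatic document summarization has mainly been a technique for automatically generating one summary from one document. However, for humans to carry out intellectual activities, a technique for generating one summary from multiple documents related to a certain matter (that is, multi-document summarization) is more important than generating a summary of a single document. This is because, given the limited information-processing capacity of humans, reading the many related documents obtained from search results and similar sources takes a great deal of time, and consolidating them into one summary can greatly reduce the reading time. In general, however, the information of interest differs from user to user, so the information needed also differs by user. This paper therefore defines the information a user wants to know in multi-document summarization as the "summarization request" and proposes a multi-document summarization method that takes the user's summarization request into account and can generate a summary suited to it. Specifically, keywords related to a given matter are extracted from the multiple documents about that matter that are to be summarized, and are presented to the user. The user selects, from the presented keywords, those that suit the summarization request. The generated summary changes according to the selected keywords. To evaluate the proposed method, we participated in the summarization task (TSC3) of NTCIR4, a workshop for the evaluation of retrieval and summarization organized by the National Institute of Informatics. As a result, we obtained good results in the multi-document summarization task. For TSC3, the system participated as a multi-document summarization system without user interaction, modified so that the top 12 keywords scored by the system are always selected. We also evaluated the effect of user interaction on multi-document summarization and confirmed the effectiveness of the proposed method.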

6 citations


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52