
Showing papers on "Multi-document summarization published in 2013"


Proceedings ArticleDOI
23 Jun 2013
TL;DR: This work applies a novel insight to develop a summarization algorithm that uses web-image-based prior information in an unsupervised manner, and proposes a framework that relies on multiple summaries obtained through crowdsourcing to automatically evaluate summarization algorithms on a large scale.
Abstract: Given the enormous growth in user-generated videos, it is becoming increasingly important to be able to navigate them efficiently. As these videos are generally of poor quality, summarization methods designed for well-produced videos do not generalize to them. To address this challenge, we propose to use web-images as a prior to facilitate summarization of user-generated videos. Our main intuition is that people tend to take pictures of objects to capture them in a maximally informative way. Such images could therefore be used as prior information to summarize videos containing a similar set of objects. In this work, we apply our novel insight to develop a summarization algorithm that uses the web-image based prior information in an unsupervised manner. Moreover, to automatically evaluate summarization algorithms on a large scale, we propose a framework that relies on multiple summaries obtained through crowdsourcing. We demonstrate the effectiveness of our evaluation framework by comparing its performance to that of multiple human evaluators. Finally, we present results for our framework tested on hundreds of user-generated videos.

318 citations


Journal ArticleDOI
TL;DR: A quantitative and qualitative assessment of 15 algorithms for sentence scoring available in the literature is described, and directions to improve the sentence extraction results obtained are suggested.
Abstract: Text summarization is the process of automatically creating a shorter version of one or more text documents. It is an important way of finding relevant information in large text libraries or on the Internet. Essentially, text summarization techniques are classified as Extractive and Abstractive. Extractive techniques perform text summarization by selecting sentences of documents according to some criteria. Abstractive summaries attempt to improve the coherence among sentences by eliminating redundancies and clarifying the context of sentences. Sentence scoring is the technique most used for extractive text summarization. This paper describes and performs a quantitative and qualitative assessment of 15 algorithms for sentence scoring available in the literature. Three different datasets (News, Blogs and Article contexts) were evaluated. In addition, directions to improve the sentence extraction results obtained are suggested.
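As a minimal illustration of the sentence-scoring paradigm surveyed in this paper (not any specific algorithm it evaluates), a Luhn-style frequency scorer can be sketched in a few lines; the sentence-splitting regex and the normalized-frequency scoring function are simplifying assumptions:

```python
import re
from collections import Counter

def summarize(text, n=2):
    """Score each sentence by the average corpus frequency of its words
    (a classic Luhn-style criterion) and return the top-n sentences in
    their original order."""
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    freq = Counter(re.findall(r'[a-z]+', text.lower()))

    def score(sentence):
        tokens = re.findall(r'[a-z]+', sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    ranked = sorted(sentences, key=score, reverse=True)[:n]
    # preserve document order in the extract
    return [s for s in sentences if s in ranked]
```

Real systems surveyed by the paper add many further signals (sentence position, cue words, title overlap), but they share this score-then-select skeleton.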

278 citations


Journal ArticleDOI
01 Nov 2013
TL;DR: In this paper, the authors presented a framework for summarizing a corpus of evaluative documents about a single entity by a natural language summary, which can be used to generate summaries tailored to a model of the user preferences.
Abstract: In many decision-making scenarios, people can benefit from knowing what other people's opinions are. As more and more evaluative documents are posted on the Web, summarizing these useful resources becomes a critical task for many organizations and individuals. This paper presents a framework for summarizing a corpus of evaluative documents about a single entity by a natural language summary. We propose two summarizers: an extractive summarizer and an abstractive one. As an additional contribution, we show how our abstractive summarizer can be modified to generate summaries tailored to a model of the user preferences that is solidly grounded in decision theory and can be effectively elicited from users. We have tested our framework in three user studies. In the first one, we compared the two summarizers. They performed equally well relative to each other quantitatively, while significantly outperforming a baseline standard approach to multidocument summarization. Trends in the results as well as qualitative comments from participants suggest that the summarizers have different strengths and weaknesses. After this initial user study, we realized that the diversity of opinions expressed in the corpus (i.e., its controversiality) might play a critical role in comparing abstraction versus extraction. To clearly pinpoint the role of controversiality, we ran a second user study in which we controlled for the degree of controversiality of the corpora that were summarized for the participants. The outcome of this study indicates that for evaluative text abstraction tends to be more effective than extraction, particularly when the corpus is controversial. In the third user study we assessed the effectiveness of our user tailoring strategy. The results of this experiment confirm that user tailored summaries are more informative than untailored ones.

202 citations


Book ChapterDOI
01 Jan 2013
TL;DR: This paper gives a short overview of summarization methods and evaluation, motivated by the insufficient quality of automatic summaries and the number of interesting summarization topics being proposed in different contexts by end users.
Abstract: Automatic text summarization, the computer-based production of condensed versions of documents, is an important technology for the information society. Without summaries it would be practically impossible for human beings to get access to the ever growing mass of information available online. Although research in text summarization is over 50 years old, some efforts are still needed given the insufficient quality of automatic summaries and the number of interesting summarization topics being proposed in different contexts by end users (“domain-specific summaries”, “opinion-oriented summaries”, “update summaries”, etc.). This paper gives a short overview of summarization methods and evaluation.

157 citations


Patent
31 Jan 2013
TL;DR: In this article, a search head is associated with one or more indexers containing event records, and queries directed towards summarizing and reporting on event records may be received at the search head.
Abstract: Embodiments are directed towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one or more indexers containing event records. The search head may forward the query to the indexers that can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

153 citations


Proceedings ArticleDOI
20 May 2013
TL;DR: A new topic modeling based approach to source code summarization is proposed, and via a study of 14 developers, source code summaries generated using the proposed technique are evaluated.
Abstract: During software evolution a developer must investigate source code to locate and then understand the entities that must be modified to complete a change task. To help developers in this task, Haiduc et al. proposed text summarization based approaches to the automatic generation of class and method summaries, and via a study of four developers, they evaluated source code summaries generated using their techniques. In this paper we propose a new topic modeling based approach to source code summarization, and via a study of 14 developers, we evaluate source code summaries generated using the proposed technique. Our study partially replicates the original study by Haiduc et al. in that it uses the objects, the instruments, and a subset of the summaries from the original study, but it also expands the original study in that it includes more subjects and new summaries. The results of our study both support the findings of the original and provide new insights into the processes and criteria that developers use to evaluate source code summaries. Based on our results, we suggest future directions for research on source code summarization.

123 citations


Proceedings Article
Lu Wang, Hema Raghavan, Vittorio Castelli, Radu Florian, Claire Cardie
01 Aug 2013
TL;DR: This work considers the problem of using sentence compression techniques to facilitate query-focused multi-document summarization, presents a sentence-compression-based framework, and designs a series of learning-based compression models built on parse trees.
Abstract: We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization. We present a sentence-compression-based framework for the task, and design a series of learning-based compression models built on parse trees. An innovative beam search decoder is proposed to efficiently find highly probable compressions. Under this framework, we show how to integrate various indicative metrics such as linguistic motivation and query relevance into the compression process by deriving a novel formulation of a compression scoring function. Our best model achieves statistically significant improvement over the state-of-the-art systems on several metrics (e.g., 8.0% and 5.4% improvements in ROUGE-2 on the DUC 2006 and 2007 summarization tasks, respectively).
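The beam-search decoding idea can be illustrated with a toy word-deletion compressor. The paper's models operate on parse trees with learned scoring functions; the `score` function below is a hypothetical stand-in supplied by the caller:

```python
def beam_search_compress(words, score, beam_width=3):
    """Toy beam-search decoder for sentence compression: each word is either
    kept or dropped, and partial hypotheses are pruned to the top
    `beam_width` by cumulative score."""
    beams = [([], 0.0)]  # (kept words so far, cumulative score)
    for w in words:
        candidates = []
        for kept, s in beams:
            candidates.append((kept + [w], s + score(w, keep=True)))   # keep w
            candidates.append((kept, s + score(w, keep=False)))        # drop w
        # prune to the highest-scoring partial compressions
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]  # best complete compression
```

For example, with a score function that rewards keeping content words and dropping stopwords, the decoder returns just the content words of the sentence.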

123 citations


Proceedings Article
01 Jun 2013
TL;DR: G-FLOW is evaluated on Mechanical Turk, and it is found that it generates dramatically better summaries than an extractive summarizer based on a pipeline of state-of-the-art sentence selection and reordering components, underscoring the value of the joint model.
Abstract: This paper presents G-FLOW, a novel system for coherent extractive multi-document summarization (MDS). Where previous work on MDS considered sentence selection and ordering separately, G-FLOW introduces a joint model for selection and ordering that balances coherence and salience. G-FLOW’s core representation is a graph that approximates the discourse relations across sentences based on indicators including discourse cues, deverbal nouns, co-reference, and more. This graph enables G-FLOW to estimate the coherence of a candidate summary. We evaluate G-FLOW on Mechanical Turk, and find that it generates dramatically better summaries than an extractive summarizer based on a pipeline of state-of-the-art sentence selection and reordering components, underscoring the value of our joint model.

112 citations


Proceedings Article
28 Jun 2013
TL;DR: This work proposes a search and summarization framework to extract relevant representative tweets from a time-ordered sample of tweets to generate a coherent and concise summary of an event.
Abstract: Social media services such as Twitter generate phenomenal volume of content for most real-world events on a daily basis. Digging through the noise and redundancy to understand the important aspects of the content is a very challenging task. We propose a search and summarization framework to extract relevant representative tweets from a time-ordered sample of tweets to generate a coherent and concise summary of an event. We introduce two topic models that take advantage of temporal correlation in the data to extract relevant tweets for summarization. The summarization framework has been evaluated using Twitter data on four real-world events. Evaluations are performed using Wikipedia articles on the events as well as using Amazon Mechanical Turk (MTurk) with human readers (MTurkers). Both experiments show that the proposed models outperform traditional LDA and lead to informative summaries.

104 citations


Journal ArticleDOI
TL;DR: A novel and general-purpose graph-based summarizer is presented, namely GraphSum (Graph-based Summarizer), which discovers and exploits association rules to represent the correlations among multiple terms that have been neglected by previous approaches.

103 citations


Journal ArticleDOI
TL;DR: Experimental results provide strong evidence that the proposed optimization-based approach is a viable method for document summarization and an improved differential evolution algorithm is created to solve the optimization problem.
Abstract: This paper proposes an optimization-based model for generic document summarization. The model generates a summary by extracting salient sentences from documents. This approach uses the sentence-to-document collection, the summary-to-document collection and the sentence-to-sentence relations to select salient sentences from a given document collection and reduce redundancy in the summary. An improved differential evolution algorithm has been created to solve the optimization problem. The algorithm can adjust its crossover rate adaptively according to the fitness of individuals. We implemented the proposed model on the multi-document summarization task. Experiments have been performed on the DUC2002 and DUC2004 data sets. The experimental results provide strong evidence that the proposed optimization-based approach is a viable method for document summarization.
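A minimal DE/rand/1 sketch with a fitness-adaptive crossover rate conveys the flavor of such an algorithm. The adaptation rule, bounds, and parameters here are illustrative assumptions, not the paper's exact scheme, and it is shown minimizing a generic test function rather than the summarization objective:

```python
import random

def differential_evolution(f, dim, pop_size=20, gens=100, F=0.5):
    """Minimal DE/rand/1 sketch that minimizes f over [-5, 5]^dim.
    The crossover rate CR is adapted per individual from its fitness rank:
    better individuals get a lower CR (change less), worse ones a higher CR."""
    pop = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # three distinct individuals other than the target
            a, b, c = random.sample([j for j in range(pop_size) if j != i], 3)
            # fitness-adaptive crossover rate in [0.1, 0.9]
            rank = sorted(fit).index(fit[i]) / (pop_size - 1)
            CR = 0.1 + 0.8 * rank
            # mutate-and-crossover: mix the donor vector into the target
            trial = [pop[a][k] + F * (pop[b][k] - pop[c][k])
                     if random.random() < CR else pop[i][k]
                     for k in range(dim)]
            ft = f(trial)
            if ft < fit[i]:  # greedy selection
                pop[i], fit[i] = trial, ft
    best = min(range(pop_size), key=lambda i: fit[i])
    return pop[best], fit[best]
```

In the paper's setting the candidate vectors would encode sentence selections and f would combine coverage and redundancy terms; here f is left abstract.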

Proceedings ArticleDOI
03 Dec 2013
TL;DR: A novel solution to target-oriented sentiment summarization and SA of short informal texts, with a main focus on Twitter posts known as "tweets", is introduced; the hybrid polarity detection system not only outperforms the unigram state-of-the-art baseline, but also offers an advantage over other methods when used as part of a sentiment summarization system.
Abstract: Sentiment Analysis (SA) and summarization have recently become the focus of many researchers, because analysis of online text is beneficial and in demand in many different applications. One such application is product-based sentiment summarization of multi-documents with the purpose of informing users about the pros and cons of various products. This paper introduces a novel solution to target-oriented (i.e. aspect-based) sentiment summarization and SA of short informal texts, with a main focus on Twitter posts known as "tweets". We compare different algorithms and methods for SA polarity detection and sentiment summarization. We show that our hybrid polarity detection system not only outperforms the unigram state-of-the-art baseline, but also offers an advantage over other methods when used as part of a sentiment summarization system. Additionally, we illustrate that our SA and summarization system exhibits high performance with various useful functionalities and features.

Journal ArticleDOI
TL;DR: SumView is developed, a Web-based review summarization system, to automatically extract the most representative expressions and customer opinions in the reviews on various product features by selecting the most representative review sentences for each extracted product feature.
Abstract: In this paper, we develop SumView, a Web-based review summarization system, to automatically extract the most representative expressions and customer opinions in the reviews on various product features. Different from existing review analysis which makes more efforts on sentiment classification and opinion mining, our system mainly focuses on summarization, i.e., delivering the majority of information contained in the review documents by selecting the most representative review sentences for each extracted product feature. Comprehensive case studies and experiments demonstrate the effectiveness of our system, and the user study shows users' satisfaction.

01 Jan 2013
TL;DR: This article presents a bottom-up approach to arrange sentences extracted for multi-document summarization, where chronology, topical-closeness, precedence, and succession are integrated into a single criterion by a supervised learning approach.
Abstract: Ordering information is a difficult but important task for applications generating natural language texts such as multi-document summarization, question answering, and concept-to-text generation. In multi-document summarization, information is selected from a set of source documents. However, improper ordering of information in a summary can confuse the reader and deteriorate the readability of the summary. Therefore, it is vital to properly order the information in multi-document summarization. We present a bottom-up approach to arrange sentences extracted for multi-document summarization. To capture the association and order of two textual segments (e.g. sentences), we define four criteria: chronology, topical-closeness, precedence, and succession. These criteria are integrated into a criterion by a supervised learning approach. We repeatedly concatenate two textual segments into one segment based on the criterion, until we obtain the overall segment with all sentences arranged. We evaluate the sentence orderings produced by the proposed method and numerous baselines using subjective gradings as well as automatic evaluation measures. We introduce the average continuity, an automatic evaluation measure of sentence ordering in a summary, and investigate its appropriateness for this task.
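The bottom-up concatenation procedure can be caricatured as agglomerative merging driven by an association score; the `assoc` function below is a hypothetical stand-in for the paper's learned combination of the four criteria:

```python
def bottom_up_order(sentences, assoc):
    """Toy bottom-up ordering: repeatedly concatenate the ordered pair of
    segments whose association score assoc(left, right) is highest, until a
    single fully ordered segment remains."""
    segs = [[s] for s in sentences]  # start with one segment per sentence
    while len(segs) > 1:
        # pick the best ordered pair (i precedes j)
        i, j = max(((i, j) for i in range(len(segs)) for j in range(len(segs))
                    if i != j),
                   key=lambda ij: assoc(segs[ij[0]], segs[ij[1]]))
        merged = segs[i] + segs[j]
        segs = [s for k, s in enumerate(segs) if k not in (i, j)] + [merged]
    return segs[0]
```

In the paper, `assoc` would be trained to reflect chronology, topical-closeness, precedence, and succession between the two segments.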

Journal ArticleDOI
TL;DR: A two-stage scene-based movie summarization method based on mining the relationship between role-communities, which achieves better subjective performance than attention-based and role-based summarization methods in terms of semantic content preservation for a movie summary.
Abstract: Video summarization techniques aim at condensing a full-length video to a significantly shortened version that still preserves the major semantic content of the original video. Movie summarization, being a special class of video summarization, is particularly challenging since a large variety of movie scenarios and film styles complicate the problem. In this paper, we propose a two-stage scene-based movie summarization method based on mining the relationship between role-communities, since the role-communities in earlier scenes are usually used to develop the role relationship in later scenes. In the analysis stage, we construct a social network to characterize the interactions between role-communities. As a result, the social power of each role-community is evaluated by the community's centrality value and the role communities are clustered into relevant groups based on the centrality values. In the summarization stage, a set of feasible summary combinations of scenes is identified and an information-rich summary is selected from these candidates based on social power preservation. Our evaluation results show that in most test cases the proposed method achieves better subjective performance than attention-based and role-based summarization methods in terms of semantic content preservation for a movie summary.

Proceedings ArticleDOI
04 Feb 2013
TL;DR: This work studies how user influence models, which project user interaction information onto a Twitter context tree, can help Twitter context summarization within a supervised learning framework and shows that pairwise user influence signals can significantly improve the task performance.
Abstract: Twitter has become one of the most popular platforms for users to share information in real time. However, as an individual tweet is short and lacks sufficient contextual information, users cannot effectively understand or consume information on Twitter, which can either make users less engaged or even detached from using Twitter. In order to provide informative context to a Twitter user, we propose the task of Twitter context summarization, which generates a succinct summary from a large but noisy Twitter context tree. Traditional summarization techniques only consider text information, which is insufficient for Twitter context summarization task, since text information on Twitter is very sparse. Given that there are rich user interactions in Twitter, we thus study how to improve summarization methods by leveraging such signals. In particular, we study how user influence models, which project user interaction information onto a Twitter context tree, can help Twitter context summarization within a supervised learning framework. To evaluate our methods, we construct a data set by asking human editors to manually select the most informative tweets as a summary. Our experimental results based on this editorial data set show that Twitter context summarization is a promising research topic and pairwise user influence signals can significantly improve the task performance.

Journal ArticleDOI
TL;DR: Evaluated on two 100-topic datasets, the summaries generated by the novel speech act-guided summarization approach outperform two kinds of representative extractive summaries and rival human-written summaries in terms of explanatoriness and informativeness.
Abstract: With the growth of the social media service of Twitter, automatic summarization of Twitter messages (tweets) is in urgent need for efficient processing of the massive tweeted information. Unlike multi-document summarization in general, Twitter topic summarization must handle the numerous, short, dissimilar, and noisy nature of tweets. To address this challenge, we propose a novel speech act-guided summarization approach in this work. Speech acts characterize tweeters' communicative behavior and provide an organized view of their messages. Speech act recognition is a multi-class classification problem, which we solve by using word-based and symbol-based features that capture both the linguistic features of speech acts and the particularities of Twitter text. The recognized speech acts in tweets are then used to direct the extraction of key words and phrases to fill in templates designed for speech acts. Leveraging high-ranking words and phrases as well as topic information for major speech acts, we propose a round-robin algorithm to generate template-based summaries. Different from the extractive method adopted in most previous works, our summarization method is abstractive. Evaluated on two 100-topic datasets, the summaries generated by our method outperform two kinds of representative extractive summaries and rival human-written summaries in terms of explanatoriness and informativeness.

Journal ArticleDOI
TL;DR: A novel summarizer, namely Yago-based Summarizer, that relies on an ontology-based evaluation and selection of the document sentences and an established entity recognition and disambiguation step based on the Yago ontology is integrated into the summarization process.
Abstract: Sentence-based multi-document summarization is the task of generating a succinct summary of a document collection, which consists of the most salient document sentences. In recent years, the increasing availability of semantics-based models (e.g., ontologies and taxonomies) has prompted researchers to investigate their usefulness for improving summarizer performance. However, semantics-based document analysis is often applied as a preprocessing step, rather than integrating the discovered knowledge into the summarization process. This paper proposes a novel summarizer, namely Yago-based Summarizer, that relies on an ontology-based evaluation and selection of the document sentences. To capture the actual meaning and context of the document sentences and generate sound document summaries, an established entity recognition and disambiguation step based on the Yago ontology is integrated into the summarization process. The experimental results, which were achieved on the DUC'04 benchmark collections, demonstrate the effectiveness of the proposed approach compared to a large number of competitors as well as the qualitative soundness of the generated summaries.

Journal ArticleDOI
01 Nov 2013
TL;DR: Results show that extractive and abstractive-oriented summaries perform similarly as far as the information they contain, so both approaches are able to keep the relevant information of the source documents, but the latter is more appropriate from a human perspective, when a user satisfaction assessment is carried out.
Abstract: This article analyzes the appropriateness of a text summarization system, COMPENDIUM, for generating abstracts of biomedical papers. Two approaches are suggested: an extractive one (COMPENDIUM-E), which only selects and extracts the most relevant sentences of the documents, and an abstractive-oriented one (COMPENDIUM-E-A), thus also facing the challenge of abstractive summarization. This novel strategy combines extractive information with some pieces of information from the article that have been previously compressed or fused. Specifically, in this article, we want to study: i) whether COMPENDIUM produces good summaries in the biomedical domain; ii) which summarization approach is more suitable; and iii) the opinion of real users towards automatic summaries. Therefore, two types of evaluation were performed: quantitative and qualitative, evaluating both the information contained in the summaries and the user satisfaction. Results show that extractive and abstractive-oriented summaries perform similarly as far as the information they contain, so both approaches are able to keep the relevant information of the source documents, but the latter is more appropriate from a human perspective when a user satisfaction assessment is carried out. This also confirms the suitability of our suggested approach for generating summaries following an abstractive-oriented paradigm.

Book ChapterDOI
Qi Guo, Fernando Diaz, Elad Yom-Tov
24 Mar 2013
TL;DR: This work presents the problem of updating users about time critical news events, and proposes a solution which incorporates techniques from information retrieval and multi-document summarization and introduces an evaluation method which is significantly less expensive than traditional approaches to temporal summarization.
Abstract: During unexpected events such as natural disasters, individuals rely on the information generated by news outlets to form their understanding of these events. This information, while often voluminous, is frequently degraded by the inclusion of unimportant, duplicate, or wrong information. It is important to be able to present users with only the novel, important information about these events as they develop. We present the problem of updating users about time critical news events, and focus on the task of deciding which information to select for updating users as an event develops. We propose a solution to this problem which incorporates techniques from information retrieval and multi-document summarization and evaluate this approach on a set of historic events using a large stream of news documents. We also introduce an evaluation method which is significantly less expensive than traditional approaches to temporal summarization.

Journal ArticleDOI
TL;DR: The findings of this work indicate that properly summarized learning content is not only able to satisfy learning achievements, but also able to align content size with the unique characteristics and affordances of mobile devices.
Abstract: Mobile learning benefits from the unique merits of mobile devices and mobile technology to give learners capability to access information anywhere and anytime. However, mobile learning also has many challenges, especially in the processing and delivery of learning content. With the aim of making the learning content suitable for the mobile environment, this study investigates automatic text summarization to provide a tool set that reduces the quantity of textual content for mobile learning support. Text summarization is used to condense texts into the most important ideas. However, reducing the amount of content transmitted may negatively impact the meaning conveyed within. Although many solutions of text summarization have been applied by intelligent tutoring systems for learning support, few of them have been quantitatively investigated for learning achievements of learners, especially in mobile learning context. This study focuses on a methodology for investigating the effectiveness of automatic text summarization used in mobile learning context. The experimental results demonstrate that our proposed summarization approach is able to generate summaries effectively, and those generated summaries are perceived as helpful to support mobile learning. The findings of this work indicate that properly summarized learning content is not only able to satisfy learning achievements, but also able to align content size with the unique characteristics and affordances of mobile devices.

Journal ArticleDOI
TL;DR: A novel approach that directly generates clusters integrated with ranking is proposed; its effectiveness is demonstrated by both the cluster quality analysis and the summarization evaluation conducted on the DUC 2004-2007 datasets.
Abstract: Multi-document summarization aims to create a condensed summary while retaining the main characteristics of the original set of documents. Under such background, sentence ranking has hitherto been the issue of most concern. Since documents often cover a number of topic themes, with each theme represented by a cluster of highly related sentences, sentence clustering has been explored in the literature in order to provide more informative summaries. For each topic theme, the rank of terms conditional on this topic theme should be very distinct, and quite different from the rank of terms in other topic themes. Existing cluster-based summarization approaches apply clustering and ranking in isolation, which leads to incomplete, or sometimes rather biased, analytical results. A newly emerged framework uses sentence clustering results to improve or refine the sentence ranking results. Under this framework, we propose in this paper a novel approach that directly generates clusters integrated with ranking. The basic idea of the approach is that the ranking distribution of sentences in each cluster should be quite different from the others, which may serve as a feature of clusters, and new clustering measures of sentences can be calculated accordingly. Meanwhile, better clustering results can achieve better ranking results. As a result, ranking and clustering mutually and simultaneously update each other so that the performance of both can be improved. The effectiveness of the proposed approach is demonstrated by both the cluster quality analysis and the summarization evaluation conducted on the DUC 2004-2007 datasets.

Proceedings ArticleDOI
16 Dec 2013
TL;DR: This paper intends to investigate techniques and methods used by researchers for automatic text summarization, with special attention paid to Bio-inspired methods for text summarizing.
Abstract: The existence of the World Wide Web has caused an information explosion. Readers are overloaded with lengthy text documents where a shorter version would suffice. All computer users, be it professionals or novice users, are particularly affected by this predicament. There exists an urgent need for the discovery of knowledge embedded in digital documents. This paper intends to investigate techniques and methods used by researchers for automatic text summarization. Special attention is paid to Bio-inspired methods for text summarization.

Journal ArticleDOI
TL;DR: This paper reports a study of researchers' preferences in selecting information from cited papers to include in a literature review, and of the kinds of transformations and editing applied to the selected information.
Abstract: Purpose – This paper aims to report a study of researchers' preferences in selecting information from cited papers to include in a literature review, and the kinds of transformations and editing applied to the selected information.Design/methodology/approach – This is a part of a larger project to develop an automatic summarization method that emulates human literature review writing behaviour. Research questions were: how are literature reviews written – where do authors select information from, what types of information do they select and how do they transform it? What is the relationship between styles of literature review (integrative and descriptive) and each of these variables (source sections, types of information and types of transformation)? The authors analysed the literature review sections of 20 articles from the Journal of the American Society for Information Science and Technology, 2001‐2008, to answer these questions. Referencing sentences were mapped to 279 source papers to determine the s...

Journal ArticleDOI
TL;DR: The proposed method of text summarization chooses a subset of sentences from a document that maximizes the important concepts in the final summary and outperforms the existing systems to which it is compared.
Abstract: Many previous research studies on extractive text summarization consider a subset of words in a document as keywords and use a sentence ranking function that ranks sentences based on their similarities with the list of extracted keywords. But the use of key concepts in the automatic text summarization task has received less attention in the summarization literature. The proposed work uses key concepts identified from a document for creating a summary of the document. We view single-word or multi-word keyphrases of a document as the important concepts that a document elaborates on. Our work is based on the hypothesis that an extract is an elaboration of the important concepts to some permissible extent, controlled by the given summary length restriction. In other words, our method of text summarization chooses a subset of sentences from a document that maximizes the important concepts in the final summary. To allow diverse information in the summary, for each important concept, we select one sentence that is the best possible elaboration of the concept. Accordingly, the most important concept contributes to the summary first, then the second-best concept, and so on. To prove the effectiveness of our proposed summarization method, we have compared it to some state-of-the-art summarization systems and the results show that the proposed method outperforms the existing systems to which it is compared.
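The concept-driven selection described above can be sketched as a simple greedy loop: for each key concept, ranked by importance, pick the sentence that best elaborates it, until the length budget is exhausted. The names and the concept-density scoring below are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of concept-driven extractive summarization: one sentence per
# key concept, most important concept first, under a word budget.

def summarize(sentences, concepts, max_words):
    """sentences: list of str in document order; concepts: ranked by importance."""
    chosen, used_words = [], 0
    for concept in concepts:                       # most important concept first
        # candidate sentences that mention the concept and are not yet chosen
        candidates = [s for s in sentences
                      if concept.lower() in s.lower() and s not in chosen]
        if not candidates:
            continue
        # "best elaboration" approximated here by concept-term density
        best = max(candidates,
                   key=lambda s: s.lower().count(concept.lower()) / len(s.split()))
        if used_words + len(best.split()) > max_words:
            continue
        chosen.append(best)
        used_words += len(best.split())
    # restore original document order for readability
    return [s for s in sentences if s in chosen]
```

Because each concept contributes at most one sentence, the summary stays diverse, matching the intent described in the abstract.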

Proceedings Article
01 Aug 2013
TL;DR: A series of studies comparing human-written model summaries to system summaries at the semantic level of caseframes suggest that substantial improvements are unlikely to result from better optimizing centrality-based criteria, but rather more domain knowledge is needed.
Abstract: In automatic summarization, centrality is the notion that a summary should contain the core parts of the source text. Current systems use centrality, along with redundancy avoidance and some sentence compression, to produce mostly extractive summaries. In this paper, we investigate how summarization can advance past this paradigm towards robust abstraction by making greater use of the domain of the source text. We conduct a series of studies comparing human-written model summaries to system summaries at the semantic level of caseframes. We show that model summaries (1) are more abstractive and make use of more sentence aggregation, (2) do not contain as many topical caseframes as system summaries, and (3) cannot be reconstructed solely from the source text, but can be if texts from in-domain documents are added. These results suggest that substantial improvements are unlikely to result from better optimizing centrality-based criteria, but rather more domain knowledge is needed.

Journal Article
TL;DR: Thanks to the World Wide Web, the corpus of online information is gigantic in volume; search engines have been developed to retrieve specific information from this huge amount of data, but their output alone often fails to meet users' expectations, motivating automatic text summarization.
Abstract: Thanks to the World Wide Web, the corpus of online information is gigantic in volume. Search engines such as Google, AltaVista, and Yahoo have been developed to retrieve specific information from this huge amount of data. But search-engine output often fails to provide the expected result, as the quantity of information increases enormously day by day and the findings are abundant. So automatic text summarization is in demand for salient information retrieval. Automatic text summarization is a system of summarizing text by computer, where a text is given to the computer as input and the output is a shorter, less redundant form of the original text. An informative pr
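The definition above — text in, shorter and less redundant text out — is commonly realized by scoring sentences on the frequency of their content words and keeping the top-scoring ones in document order. The following is a toy sketch of that idea under assumed tokenization and a tiny stopword list, not any particular published system.

```python
# Minimal frequency-based extractive summarizer: score each sentence by the
# average corpus frequency of its content words, keep the top-k in document order.
import math
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in", "it", "for"}

def word_freq_summary(text, k=2):
    # split on sentence-final punctuation followed by whitespace
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence):
        tokens = [w for w in re.findall(r"[a-z']+", sentence.lower())
                  if w not in STOPWORDS]
        return sum(freq[w] for w in tokens) / (len(tokens) or 1)

    top = sorted(sentences, key=score, reverse=True)[:k]
    return [s for s in sentences if s in top]   # keep original order
```

Averaging over sentence length keeps long sentences from winning on word count alone.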

Journal ArticleDOI
TL;DR: This paper presents the design and evaluation of an extractive summarization approach to assist learners with reading difficulties; the results show a significant improvement in readability for the target audience when using the assistive summaries.

Journal ArticleDOI
TL;DR: A new multi-document summarization framework which combines rhetorical roles and corpus-based semantic analysis is proposed which is able to capture the semantic and rhetorical relationships between sentences so as to combine them to produce coherent summaries.
Abstract: In this paper, a new multi-document summarization framework which combines rhetorical roles and corpus-based semantic analysis is proposed. The approach is able to capture the semantic and rhetorical relationships between sentences so as to combine them to produce coherent summaries. Experiments were conducted on datasets extracted from web-based news using standard evaluation methods. Results show the promise of our proposed model as compared to state-of-the-art approaches.
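A framework like the one above needs a measure of semantic relatedness between sentences before rhetorical and semantic signals can be combined. A common corpus-based starting point is cosine similarity over term-frequency vectors; the sketch below is an illustrative baseline, not the paper's actual semantic analysis.

```python
# Cosine similarity over term-frequency vectors: a simple corpus-based
# measure of how semantically related two sentences are.
import math
import re
from collections import Counter

def tf_vector(sentence):
    """Bag-of-words term-frequency vector for one sentence."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)          # Counter returns 0 for missing keys
    norm = (math.sqrt(sum(v * v for v in a.values())) *
            math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0
```

Richer corpus-based methods (e.g., latent semantic analysis) replace the raw term vectors with vectors in a learned semantic space, but the cosine step stays the same.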

Journal ArticleDOI
TL;DR: A novel Probabilistic-modeling Relevance, Coverage, and Novelty (PRCN) framework is proposed, which exploits a reference topic model incorporating the user query for dependent relevance measurement; topic coverage is also modeled under this framework.
Abstract: Summarization plays an increasingly important role with the exponential document growth on the Web. Specifically, for query-focused summarization, there exist three challenges: (1) how to retrieve query-relevant sentences; (2) how to concisely cover the main aspects (i.e., topics) in the document; and (3) how to balance these two requests. Especially for the relevance issue, many traditional summarization techniques assume that there is independent relevance between sentences, which may not hold in reality. In this paper, we go beyond this assumption and propose a novel Probabilistic-modeling Relevance, Coverage, and Novelty (PRCN) framework, which exploits a reference topic model incorporating the user query for dependent relevance measurement. Along this line, topic coverage is also modeled under our framework. To further address the issues above, various sentence features regarding relevance and novelty are constructed, while moderate topic coverage is maintained through a greedy algorithm for topic balance. Finally, experiments on DUC2005 and DUC2006 datasets validate the effectiveness of the proposed method.
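The greedy topic-balance step mentioned above can be sketched as marginal-gain selection under a concave coverage function: at each step, add the sentence that most increases total topic coverage, where diminishing returns discourage piling onto an already-covered topic. This is an illustrative submodular-coverage sketch under assumed per-sentence topic scores, not the paper's exact PRCN formulation.

```python
# Hedged sketch of greedy topic-coverage selection with diminishing returns.
import math

def greedy_topic_cover(sent_topic_scores, budget):
    """sent_topic_scores: per-sentence lists of topic relevance scores."""
    n_topics = len(sent_topic_scores[0])
    covered = [0.0] * n_topics

    def coverage(vec):
        # sqrt is concave: re-covering a topic yields smaller and smaller gains
        return sum(math.sqrt(v) for v in vec)

    selected = []
    while len(selected) < budget:
        best_gain, best_i = 0.0, None
        for i, scores in enumerate(sent_topic_scores):
            if i in selected:
                continue
            trial = [c + s for c, s in zip(covered, scores)]
            gain = coverage(trial) - coverage(covered)
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None:       # no sentence adds coverage
            break
        selected.append(best_i)
        covered = [c + s for c, s in zip(covered, sent_topic_scores[best_i])]
    return selected
```

With two sentences on topic A and one on topic B, the second pick goes to topic B: the concave coverage function makes a new topic worth more than more of an old one, which is exactly the "moderate topic coverage" behavior the abstract describes.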