Automatic summarization

About: Automatic summarization is a research topic. Over its lifetime, 16,442 publications have been published within this topic, receiving 404,404 citations. The topic is also known as text summarization or summarization.


Open access | Journal Article | DOI: 10.1093/BIOINFORMATICS/BTT656
Yang Liao, Gordon K. Smyth, Wei Shi
01 Apr 2014 | Bioinformatics
Abstract: MOTIVATION: Next-generation sequencing technologies generate millions of short sequence reads, which are usually aligned to a reference genome. In many applications, the key information required for downstream analysis is the number of reads mapping to each genomic feature, for example to each exon or each gene. The process of counting reads is called read summarization. Read summarization is required for a great variety of genomic analyses but has so far received relatively little attention in the literature. RESULTS: We present featureCounts, a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments. featureCounts implements highly efficient chromosome hashing and feature blocking techniques. It is considerably faster than existing methods (by an order of magnitude for gene-level summarization) and requires far less computer memory. It works with either single or paired-end reads and provides a wide range of options appropriate for different sequencing applications. AVAILABILITY AND IMPLEMENTATION: featureCounts is available under GNU General Public License as part of the Subread or Rsubread software packages.
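As a rough illustration of what read summarization means in the abstract above, the following Python sketch counts how many aligned reads overlap each genomic feature. It is not the featureCounts implementation: the `Feature` tuple, the `count_reads` function, the toy coordinates, and the rule that any one-base overlap counts are all assumptions made for this example. Features are grouped per chromosome in a dictionary, which only loosely echoes the chromosome hashing idea; feature blocking is not reproduced.

```python
from collections import defaultdict
from typing import NamedTuple


class Feature(NamedTuple):
    """Hypothetical genomic feature (e.g. an exon), inclusive coordinates."""
    name: str
    chrom: str
    start: int
    end: int


def count_reads(features, reads):
    """Count reads overlapping each feature.

    `reads` is an iterable of (chrom, start, end) tuples with inclusive
    coordinates. A read is counted for every feature it overlaps by at
    least one base (an assumption made for this sketch).
    """
    # Group features by chromosome so each read is compared only
    # against features on its own chromosome.
    by_chrom = defaultdict(list)
    for feat in features:
        by_chrom[feat.chrom].append(feat)

    counts = {feat.name: 0 for feat in features}
    for chrom, start, end in reads:
        for feat in by_chrom.get(chrom, []):
            # Two closed intervals overlap when each starts before the other ends.
            if start <= feat.end and end >= feat.start:
                counts[feat.name] += 1
    return counts


# Toy data, invented for illustration.
features = [
    Feature("geneA_exon1", "chr1", 100, 200),
    Feature("geneA_exon2", "chr1", 300, 400),
    Feature("geneB_exon1", "chr2", 50, 150),
]
reads = [
    ("chr1", 150, 199),  # inside geneA_exon1
    ("chr1", 190, 310),  # spans geneA_exon1 and geneA_exon2
    ("chr2", 10, 40),    # overlaps no feature
    ("chr2", 140, 180),  # overlaps geneB_exon1
]
counts = count_reads(features, reads)
print(counts)
```

The inner loop scans every feature on the chromosome, which is fine for a toy example but would be slow on real annotation files; a production tool would use an indexed lookup instead.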

8,495 Citations

Open access | Book
Bo Pang, Lillian Lee
08 Jul 2008
Abstract: An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.

7,180 Citations

Open access | Proceedings Article
Chin-Yew Lin
25 Jul 2004
Abstract: ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation. It includes measures to automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. The measures count the number of overlapping units such as n-gram, word sequences, and word pairs between the computer-generated summary to be evaluated and the ideal summaries created by humans. This paper introduces four different ROUGE measures: ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S included in the ROUGE summarization evaluation package and their evaluations. Three of them have been used in the Document Understanding Conference (DUC) 2004, a large-scale summarization evaluation sponsored by NIST.
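To make the idea of counting overlapping units concrete, here is a minimal Python sketch of recall-style ROUGE-N (n-gram overlap) and ROUGE-L (longest common subsequence) against a single reference summary. It is a simplification under stated assumptions, not the ROUGE package itself: tokenization is plain lowercased whitespace splitting, there is no stemming or stopword handling, only one reference is supported, and the function names are invented for this example. ROUGE-W and ROUGE-S are omitted.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also appear in the candidate."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each match at the number of times the n-gram occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """LCS length divided by the reference length."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not ref:
        return 0.0
    return lcs_length(cand, ref) / len(ref)


# Toy sentences, invented for illustration.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
r1 = rouge_n_recall(candidate, reference, n=1)
r2 = rouge_n_recall(candidate, reference, n=2)
rl = rouge_l_recall(candidate, reference)
print(r1, r2, rl)
```

In this toy pair, five of the six reference unigrams and three of the five reference bigrams are matched, and the longest common subsequence has five words, so the unigram and LCS scores coincide while the bigram score is lower.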

6,572 Citations

Open access | Proceedings Article | DOI: 10.1145/1014052.1014073
Minqing Hu, Bing Liu
22 Aug 2004
Abstract: Merchants selling products on the Web often ask their customers to review the products that they have purchased and the associated services. As e-commerce is becoming more and more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in hundreds or even thousands. This makes it difficult for a potential customer to read them to make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track and to manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the same product and the manufacturer normally produces many kinds of products. In this research, we aim to mine and to summarize all the customer reviews of a product. This summarization task is different from traditional text summarization because we only mine the features of the product on which the customers have expressed their opinions and whether the opinions are positive or negative. We do not summarize the reviews by selecting a subset or rewrite some of the original sentences from the reviews to capture the main points as in the classic text summarization. Our task is performed in three steps: (1) mining product features that have been commented on by customers; (2) identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative; (3) summarizing the results. This paper proposes several novel techniques to perform these tasks. Our experimental results using reviews of a number of products sold online demonstrate the effectiveness of the techniques.
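The three steps named in the abstract can be sketched in a few lines of Python. This is only a toy under loud assumptions and does not reproduce the paper's techniques: the candidate feature terms and the opinion lexicon (`CANDIDATE_FEATURES`, `OPINION_WORDS`) are hypothetical and supplied by hand, features are kept by a simple sentence-frequency threshold, and sentence orientation is just the sign of summed word polarities.

```python
from collections import Counter, defaultdict

# Hypothetical inputs for this sketch: a hand-made opinion lexicon and a
# hand-made set of candidate product features.
OPINION_WORDS = {"great": 1, "sharp": 1, "good": 1,
                 "poor": -1, "short": -1, "blurry": -1}
CANDIDATE_FEATURES = {"battery", "picture", "lens", "strap"}


def tokenize(sentence):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [w.strip(".,!?").lower() for w in sentence.split()]


def summarize_reviews(sentences, min_support=2):
    tokenized = [tokenize(s) for s in sentences]

    # Step 1: keep candidate features mentioned in at least
    # `min_support` sentences.
    support = Counter(
        f for toks in tokenized for f in CANDIDATE_FEATURES & set(toks)
    )
    features = {f for f, c in support.items() if c >= min_support}

    # Step 2: score each sentence with the lexicon; sentences with a
    # zero score (no opinion words, or balanced ones) are skipped.
    summary = defaultdict(lambda: {"positive": 0, "negative": 0})
    for toks in tokenized:
        score = sum(OPINION_WORDS.get(t, 0) for t in toks)
        if score == 0:
            continue
        label = "positive" if score > 0 else "negative"
        # Step 3: aggregate opinion counts per kept feature.
        for f in features & set(toks):
            summary[f][label] += 1
    return dict(summary)


# Toy review sentences, invented for illustration.
reviews = [
    "The picture quality is great.",
    "Battery life is short.",
    "The battery is poor and drains fast.",
    "Sharp picture even in low light.",
    "The strap is good.",
]
result = summarize_reviews(reviews)
print(result)
```

The output is a per-feature tally of positive and negative opinion sentences rather than a set of selected sentences, which mirrors the contrast the abstract draws with classic text summarization.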

  • Figure 6: Infrequent feature extraction
  • Figure 7: Predicting the orientations of opinion sentences
  • Table 1: Recall and precision at each step of feature generation
  • Table 2: Recall and precision of FASTR
  • Table 3: Results of opinion sentence extraction and sentence orientation prediction

6,565 Citations

Open access | Journal Article | DOI: 10.1037/0033-295X.85.5.363
Abstract: The semantic structure of texts can be described both at the local microlevel and at a more global macrolevel. A model for text comprehension based on this notion accounts for the formation of a coherent semantic text base in terms of a cyclical process constrained by limitations of working memory. Furthermore, the model includes macro-operators, whose purpose is to reduce the information in a text base to its gist, that is, the theoretical macrostructure. These operations are under the control of a schema, which is a theoretical formulation of the comprehender's goals. The macroprocesses are predictable only when the control schema can be made explicit. On the production side, the model is concerned with the generation of recall and summarization protocols. This process is partly reproductive and partly constructive, involving the inverse operation of the macro-operators. The model is applied to a paragraph from a psychological research report, and methods for the empirical testing of the model are developed.

4,629 Citations

Top Attributes

Topic's top 5 most impactful authors

  • Wenjie Li: 76 papers, 2.2K citations
  • Dragomir R. Radev: 60 papers, 4K citations
  • Kathleen R. McKeown: 53 papers, 4.6K citations
  • Xiaojun Wan: 50 papers, 2.4K citations
  • Mirella Lapata: 44 papers, 3.7K citations

Network Information
Related Topics (5)

  • [topic name not captured]: 18.7K papers, 434K citations, 88% related
  • Question answering: 14K papers, 375.4K citations, 87% related
  • Information extraction: 14.3K papers, 295.1K citations, 87% related
  • Query expansion: 17.5K papers, 452.7K citations, 87% related
  • Sentiment analysis: 22.1K papers, 460.8K citations, 86% related