Author

Jade Goldstein

Bio: Jade Goldstein is an academic researcher from Carnegie Mellon University. The author has contributed to research on automatic summarization and multi-document summarization, has an h-index of 14, and has co-authored 17 publications receiving 5,205 citations.

Papers

Journal Article•DOI•

The use of MMR, diversity-based reranking for reordering documents and producing summaries

Jaime Carbonell¹, Jade Goldstein¹•Institutions (1)

Carnegie Mellon University¹

01 Aug 1998

TL;DR: A method for combining query-relevance with information-novelty in the context of text retrieval and summarization is presented; preliminary results indicate some benefits for MMR diversity ranking in document retrieval and in single document summarization.

Abstract: This paper presents a method for combining query-relevance with information-novelty in the context of text retrieval and summarization. The Maximal Marginal Relevance (MMR) criterion strives to reduce redundancy while maintaining query relevance in re-ranking retrieved documents and in selecting appropriate passages for text summarization. Preliminary results indicate some benefits for MMR diversity ranking in document retrieval and in single document summarization. The latter are borne out by the recent results of the SUMMAC conference in the evaluation of summarization systems. However, the clearest advantage is demonstrated in constructing non-redundant multi-document summaries, where MMR results are clearly superior to non-MMR passage selection.
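
As an illustration of the MMR criterion described in this abstract, the following is a minimal sketch of greedy re-ranking: each step picks the candidate that maximizes λ·Sim(candidate, query) − (1 − λ)·max Sim(candidate, already selected). The function names, the whitespace tokenizer, the bag-of-words cosine similarity, and the default λ are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rank(query, candidates, lam=0.7, k=3):
    """Greedily select up to k candidate indices by marginal relevance.

    lam=1.0 reduces to plain relevance ranking; lower values penalize
    candidates that resemble something already selected.
    """
    q = Counter(query.lower().split())  # assumption: whitespace tokenization
    vecs = [Counter(c.lower().split()) for c in candidates]
    selected = []
    remaining = list(range(len(candidates)))

    def score(i):
        # Redundancy is the highest similarity to any already-selected item.
        redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
        return lam * cosine(vecs[i], q) - (1 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With two identical passages and one distinct passage, a lower λ should push the duplicate down the ranking, while λ = 1 ignores redundancy entirely.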

2,365 citations

Proceedings Article•DOI•

The Use of MMR and Diversity-Based Reranking for Reordering Documents and Producing Summaries

Jaime G. Carbonell¹, Jade Goldstein¹•Institutions (1)

Carnegie Mellon University¹

01 Jan 1998

TL;DR: The Maximal Marginal Relevance (MMR) criterion presented in this paper aims to reduce redundancy while maintaining query relevance in re-ranking retrieved documents and in selecting appropriate passages for text summarization.

Abstract: This paper presents a method for combining query-relevance with information-novelty in the context of text retrieval and summarization. The Maximal Marginal Relevance (MMR) criterion strives to reduce redundancy while maintaining query relevance in re-ranking retrieved documents and in selecting appropriate passages for text summarization. Preliminary results indicate some benefits for MMR diversity ranking in document retrieval and in single document summarization. The latter are borne out by the recent results of the SUMMAC conference in the evaluation of summarization systems. However, the clearest advantage is demonstrated in constructing non-redundant multi-document summaries, where MMR results are clearly superior to non-MMR passage selection.

1,479 citations

Proceedings Article•DOI•

Summarizing text documents: sentence selection and evaluation metrics

Jade Goldstein¹, Mark Kantrowitz², Vibhu Mittal², Jaime G. Carbonell¹•Institutions (2)

Carnegie Mellon University¹, Jordan University of Science and Technology²

01 Aug 1999

TL;DR: An analysis of news-article summaries generated by sentence selection, using a normalized version of precision-recall curves with a baseline of random sentence selection to evaluate features. Empirical results show the importance of corpus-dependent baseline summarization standards, compression ratios and carefully crafted long queries.

Abstract: Human-quality text summarization systems are difficult to design, and even more difficult to evaluate, in part because documents can differ along several dimensions, such as length, writing style and lexical usage. Nevertheless, certain cues can often help suggest the selection of sentences for inclusion in a summary. This paper presents our analysis of news-article summaries generated by sentence selection. Sentences are ranked for potential inclusion in the summary using a weighted combination of statistical and linguistic features. The statistical features were adapted from standard IR methods. The potential linguistic ones were derived from an analysis of news-wire summaries. To evaluate these features we use a normalized version of precision-recall curves, with a baseline of random sentence selection, as well as analyze the properties of such a baseline. We illustrate our discussions with empirical results showing the importance of corpus-dependent baseline summarization standards, compression ratios and carefully crafted long queries.

546 citations

Proceedings Article•DOI•

Multi-document summarization by sentence extraction

Jade Goldstein¹, Vibhu Mittal², Jaime G. Carbonell¹, Mark Kantrowitz²•Institutions (2)

Carnegie Mellon University¹, Jordan University of Science and Technology²

30 Apr 2000

TL;DR: This paper discusses a text extraction approach to multi-document summarization that builds on single-document summarization methods by using additional, available information about the document set as a whole and the relationships between the documents.

Abstract: This paper discusses a text extraction approach to multi-document summarization that builds on single-document summarization methods by using additional, available information about the document set as a whole and the relationships between the documents. Multi-document summarization differs from single in that the issues of compression, speed, redundancy and passage selection are critical in the formation of useful summaries. Our approach addresses these issues by using domain-independent techniques based mainly on fast, statistical processing, a metric for reducing redundancy and maximizing diversity in the selected passages, and a modular framework to allow easy parameterization for different genres, corpora characteristics and user requirements.

408 citations

Proceedings Article•DOI•

Interactive graphic design using automatic presentation knowledge

Steven F. Roth¹, John Kolojejchick¹, Joe Mattis¹, Jade Goldstein¹•Institutions (1)

Carnegie Mellon University¹

24 Apr 1994

TL;DR: SAGE is a knowledge-based presentation system that automatically designs graphics and also interprets a user's specifications conveyed with the other tools, and enhances user-directed design by completing partial specifications.

Abstract: We present three novel tools for creating data graphics: (1) SageBrush, for assembling graphics from primitive objects like bars, lines and axes, (2) SageBook, for browsing previously created graphics relevant to current needs, and (3) SAGE, a knowledge-based presentation system that automatically designs graphics and also interprets a user's specifications conveyed with the other tools. The combination of these tools supports two complementary processes in a single environment: design as a constructive process of selecting and arranging graphical elements, and design as a process of browsing and customizing previous cases. SAGE enhances user-directed design by completing partial specifications, by retrieving previously created graphics based on their appearance and data content, by creating the novel displays that users specify, and by designing alternatives when users request them. Our approach was to propose interfaces employing styles of interaction that appear to support graphic design. Knowledge-based techniques were then applied to enable the interfaces and enhance their usability.

202 citations

Cited by

Proceedings Article•DOI•

Mining and summarizing customer reviews

Minqing Hu¹, Bing Liu¹•Institutions (1)

University of Illinois at Chicago¹

22 Aug 2004

TL;DR: This research aims to mine and to summarize all the customer reviews of a product, and proposes several novel techniques to perform these tasks.

Abstract: Merchants selling products on the Web often ask their customers to review the products that they have purchased and the associated services. As e-commerce is becoming more and more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in hundreds or even thousands. This makes it difficult for a potential customer to read them to make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track and to manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the same product and the manufacturer normally produces many kinds of products. In this research, we aim to mine and to summarize all the customer reviews of a product. This summarization task is different from traditional text summarization because we only mine the features of the product on which the customers have expressed their opinions and whether the opinions are positive or negative. We do not summarize the reviews by selecting a subset or rewrite some of the original sentences from the reviews to capture the main points as in the classic text summarization. Our task is performed in three steps: (1) mining product features that have been commented on by customers; (2) identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative; (3) summarizing the results. This paper proposes several novel techniques to perform these tasks. Our experimental results using reviews of a number of products sold online demonstrate the effectiveness of the techniques.

7,330 citations

Proceedings Article•DOI•

The eyes have it: a task by data type taxonomy for information visualizations

Ben Shneiderman¹•Institutions (1)

University of Maryland, College Park¹

03 Sep 1996

TL;DR: A task by data type taxonomy with seven data types and seven tasks (overview, zoom, filter, details-on-demand, relate, history, and extracts) is offered.

Abstract: A useful starting point for designing advanced graphical user interfaces is the visual information seeking Mantra: overview first, zoom and filter, then details on demand. But this is only a starting point in trying to understand the rich and varied set of information visualizations that have been proposed in recent years. The paper offers a task by data type taxonomy with seven data types (one, two, three dimensional data, temporal and multi dimensional data, and tree and network data) and seven tasks (overview, zoom, filter, details-on-demand, relate, history, and extracts).

5,290 citations

Journal Article•DOI•

LexRank: graph-based lexical centrality as salience in text summarization

Gunes Erkan¹, Dragomir R. Radev¹•Institutions (1)

University of Michigan¹

01 Jul 2004-Journal of Artificial Intelligence Research

TL;DR: LexRank is a stochastic graph-based method for computing the relative importance of textual units in Natural Language Processing (NLP), based on the concept of eigenvector centrality.

Abstract: We introduce a stochastic graph-based method for computing relative importance of textual units for Natural Language Processing. We test the technique on the problem of Text Summarization (TS). Extractive TS relies on the concept of sentence salience to identify the most important sentences in a document or set of documents. Salience is typically defined in terms of the presence of particular important words or in terms of similarity to a centroid pseudo-sentence. We consider a new approach, LexRank, for computing sentence importance based on the concept of eigenvector centrality in a graph representation of sentences. In this model, a connectivity matrix based on intra-sentence cosine similarity is used as the adjacency matrix of the graph representation of sentences. Our system, based on LexRank ranked in first place in more than one task in the recent DUC 2004 evaluation. In this paper we present a detailed analysis of our approach and apply it to a larger data set including data from earlier DUC evaluations. We discuss several methods to compute centrality using the similarity graph. The results show that degree-based methods (including LexRank) outperform both centroid-based methods and other systems participating in DUC in most of the cases. Furthermore, the LexRank with threshold method outperforms the other degree-based techniques including continuous LexRank. We also show that our approach is quite insensitive to the noise in the data that may result from an imperfect topical clustering of documents.
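
The thresholded variant mentioned in this abstract can be sketched roughly as follows: build a sentence graph whose edges connect sentences with cosine similarity above a threshold, then run a damped power iteration to approximate eigenvector centrality. This is a simplified illustration under stated assumptions: it uses plain term-frequency cosine on whitespace tokens, a PageRank-style damping convention, and hypothetical default values for `threshold`, `damping`, and `iters`; none of these specifics are taken from the abstract.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexrank(sentences, threshold=0.1, damping=0.85, iters=50):
    """Return one centrality score per sentence (scores sum to about 1)."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    n = len(vecs)
    if n == 0:
        return []
    # Unweighted adjacency: an edge wherever similarity exceeds the threshold.
    # A non-empty sentence always links to itself because cosine(s, s) == 1.
    adj = [[1.0 if cosine(vecs[i], vecs[j]) > threshold else 0.0
            for j in range(n)] for i in range(n)]
    # Row-normalize into a stochastic matrix; rows with no edges become uniform.
    rows = []
    for row in adj:
        total = sum(row)
        rows.append([v / total for v in row] if total else [1.0 / n] * n)
    scores = [1.0 / n] * n
    for _ in range(iters):
        # Damped power iteration toward the stationary distribution.
        scores = [(1 - damping) / n
                  + damping * sum(scores[j] * rows[j][i] for j in range(n))
                  for i in range(n)]
    return scores
```

A sentence that shares vocabulary with several others ends up with the highest score, which is the sense in which centrality stands in for salience here.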

2,367 citations

Proceedings Article•DOI•

A Diversity-Promoting Objective Function for Neural Conversation Models

Jiwei Li¹, Michel Galley², Chris Brockett³, Jianfeng Gao³, Bill Dolan³•Institutions (3)

Stanford University¹, Carnegie Mellon University², Microsoft³

01 Mar 2016

TL;DR: The authors proposed using Maximum Mutual Information (MMI) as the objective function in neural models to generate more diverse, interesting, and appropriate responses, yielding substantive gains in BLEU scores on two conversational datasets.

Abstract: Sequence-to-sequence neural network models for generation of conversational responses tend to generate safe, commonplace responses (e.g., I don’t know) regardless of the input. We suggest that the traditional objective function, i.e., the likelihood of output (response) given input (message) is unsuited to response generation tasks. Instead we propose using Maximum Mutual Information (MMI) as the objective function in neural models. Experimental results demonstrate that the proposed MMI models produce more diverse, interesting, and appropriate responses, yielding substantive gains in BLEU scores on two conversational datasets and in human evaluations.
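
One way to picture the idea in this abstract is as a re-ranking step: instead of ordering candidate responses by log p(response | message) alone, subtract a weighted log p(response) so that generic, high-prior responses are penalized. The sketch below assumes the log-probabilities have already been produced by some model; the function name, the tuple layout, the weight `lam`, and the numbers in the usage example are all hypothetical.

```python
def mmi_rerank(candidates, lam=0.5):
    """Sort candidates by log p(T|S) - lam * log p(T), best first.

    candidates: list of (text, log_p_given_message, log_p_prior) tuples,
    where both log-probabilities are assumed to come from external models.
    """
    return sorted(candidates, key=lambda c: c[1] - lam * c[2], reverse=True)


# Usage with made-up scores: the generic reply is likelier given the
# message, but it is also far likelier a priori, so the penalty demotes it.
candidates = [
    ("I don't know", -2.0, -1.0),
    ("It opens at nine", -2.5, -6.0),
]
best_text = mmi_rerank(candidates, lam=0.5)[0][0]
```

Setting `lam` to 0 recovers ordinary likelihood ranking, which makes the effect of the penalty easy to compare.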

1,812 citations

Journal Article•DOI•

Information visualization and visual data mining

Daniel A. Keim¹•Institutions (1)

AT&T¹

01 Jan 2002-IEEE Transactions on Visualization and Computer Graphics

TL;DR: This paper proposes a classification of information visualization and visual data mining techniques which is based on the data type to be visualized, the visualization technique, and the interaction and distortion technique.

Abstract: Never before in history has data been generated at such high volumes as it is today. Exploring and analyzing the vast volumes of data is becoming increasingly difficult. Information visualization and visual data mining can help to deal with the flood of information. The advantage of visual data exploration is that the user is directly involved in the data mining process. There are a large number of information visualization techniques which have been developed over the last decade to support the exploration of large data sets. In this paper, we propose a classification of information visualization and visual data mining techniques which is based on the data type to be visualized, the visualization technique, and the interaction and distortion technique. We exemplify the classification using a few examples, most of them referring to techniques and systems presented in this special section.

1,759 citations
