Author

Bruce G. Lindsay

Bio: Bruce G. Lindsay is an academic researcher from IBM. The author has contributed to research in topics including database design and relational database management systems. The author has an h-index of 48 and has co-authored 123 publications receiving 8,925 citations. Previous affiliations of Bruce G. Lindsay include GlobalFoundries.

Papers published on a yearly basis

[Chart: number of papers per year; years shown span 1972 to 2015]

Papers

Journal Article•DOI•

ARIES: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging

Chandrasekaran Mohan¹, Don Haderle¹, Bruce G. Lindsay¹, Hamid Pirahesh¹, Peter Schwarz¹

IBM¹

01 Mar 1992-ACM Transactions on Database Systems

TL;DR: ARIES is a transaction recovery method, based on write-ahead logging, that is applicable not only to database management systems but also to persistent object-oriented languages, recoverable file systems and transaction-based operating systems.

Abstract: DB2™, IMS, and Tandem™ systems. ARIES is applicable not only to database management systems but also to persistent object-oriented languages, recoverable file systems and transaction-based operating systems. ARIES has been implemented, to varying degrees, in IBM's OS/2™ Extended Edition Database Manager, DB2, Workstation Data Save Facility/VM, Starburst and QuickSilver, and in the University of Wisconsin's EXODUS and Gamma database machine.

1,083 citations

Journal Article•DOI•

The Recovery Manager of the System R Database Manager

Jim Gray, Paul McJones¹, M. W. Blasgen², Bruce G. Lindsay², Raymond A. Lorie², T. G. Price², Franco Putzolu², Irving L. Traiger²

Xerox¹, IBM²

01 Jun 1981-ACM Computing Surveys

TL;DR: The recovery subsystem of an experimental data management system is described and evaluated. Its DO-UNDO-REDO protocol allows new recoverable types and operations to be added to the recovery system.

Abstract: The recovery subsystem of an experimental data management system is described and evaluated. The transaction concept allows application programs to commit, abort, or partially undo their effects. The DO-UNDO-REDO protocol allows new recoverable types and operations to be added to the recovery system. Application programs can record data in the transaction log to facilitate application-specific recovery. Transaction undo and redo are based on records kept in a transaction log. The checkpoint mechanism is based on differential files (shadows). The recovery log is recorded on disk rather than tape.

575 citations
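
The abstract above says that transaction undo and redo are based on records kept in a transaction log. A minimal sketch of that idea follows, under stated assumptions: the names `LogRecord` and `TinyStore` are hypothetical, the store is an in-memory dictionary, and nothing here reproduces System R's actual record formats, checkpointing, or shadow files.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set


@dataclass
class LogRecord:
    """One log entry (hypothetical layout, not System R's format)."""
    txn: str
    key: str
    old: Optional[Any]  # value before the update, used by UNDO (None = key was absent)
    new: Any            # value after the update, used by REDO


@dataclass
class TinyStore:
    data: Dict[str, Any] = field(default_factory=dict)
    log: List[LogRecord] = field(default_factory=list)

    def do(self, txn: str, key: str, value: Any) -> None:
        """DO: append a record that can undo or redo the update, then apply it."""
        self.log.append(LogRecord(txn, key, self.data.get(key), value))
        self.data[key] = value

    def undo(self, txn: str) -> None:
        """UNDO: walk the log backwards, restoring old values for one transaction."""
        for rec in reversed(self.log):
            if rec.txn != txn:
                continue
            if rec.old is None:
                self.data.pop(rec.key, None)
            else:
                self.data[rec.key] = rec.old

    def redo(self, committed: Set[str]) -> Dict[str, Any]:
        """REDO: rebuild state from the log, replaying only committed transactions."""
        state: Dict[str, Any] = {}
        for rec in self.log:
            if rec.txn in committed:
                state[rec.key] = rec.new
        return state
```

In this sketch `None` doubles as "key was absent", so storing `None` as a real value is not supported; a fuller version would use a distinct sentinel.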

Journal Article•DOI•

Efficiently Publishing Relational Data as XML Documents

Jayavel Shanmugasundaram¹, Eugene J. Shekita¹, Rimon Barr², Michael J. Carey, Bruce G. Lindsay¹, Hamid Pirahesh¹, Berthold Reinwald²

IBM¹, Cornell University²

01 Sep 2001

TL;DR: An experimental study shows that constructing XML documents inside the relational engine can have a significant performance benefit, and demonstrates the superiority of having the relational engine use what the authors call an “outer union plan” to generate the content of an XML document.

Abstract: XML is rapidly emerging as a standard for exchanging business data on the World Wide Web. For the foreseeable future, however, most business data will continue to be stored in relational database systems. Consequently, if XML is to fulfill its potential, some mechanism is needed to publish relational data as XML documents. Towards that goal, one of the major challenges is finding a way to efficiently structure and tag data from one or more tables as a hierarchical XML document. Different alternatives are possible depending on when this processing takes place and how much of it is done inside the relational engine. In this paper, we characterize and study the performance of these alternatives. Among other things, we explore the use of new scalar and aggregate functions in SQL for constructing complex XML documents directly in the relational engine. We also explore different execution plans for generating the content of an XML document. The results of an experimental study show that constructing XML documents inside the relational engine can have a significant performance benefit. Our results also show the superiority of having the relational engine use what we call an “outer union plan” to generate the content of an XML document.

365 citations

Proceedings Article•DOI•

Approximate medians and other quantiles in one pass and with limited memory

Gurmeet Singh Manku¹, Sridhar Rajagopalan¹, Bruce G. Lindsay¹

IBM¹

01 Jun 1998

TL;DR: New algorithms for computing approximate quantiles of large datasets in a single pass are presented; their main memory requirements are smaller than those reported earlier by an order of magnitude.

Abstract: We present new algorithms for computing approximate quantiles of large datasets in a single pass. The approximation guarantees are explicit, and apply for arbitrary value distributions and arrival distributions of the dataset. The main memory requirements are smaller than those reported earlier by an order of magnitude. We also discuss methods that couple the approximation algorithms with random sampling to further reduce memory requirements. With sampling, the approximation guarantees are explicit but probabilistic, i.e. they apply with respect to a (user controlled) confidence parameter. We present the algorithms, their theoretical analysis and simulation results on different datasets.

340 citations

Journal Article•DOI•

Transaction management in the R* distributed database management system

Chandrasekaran Mohan¹, Bruce G. Lindsay¹, Ronald Lester Obermarck

IBM¹

01 Dec 1986-ACM Transactions on Database Systems

TL;DR: This paper concentrates primarily on the description of the R* commit protocols, Presumed Abort (PA) and Presumed Commit (PC), which are extensions of the well-known, two-phase (2P) commit protocol.

Abstract: This paper deals with the transaction management aspects of the R* distributed database system. It concentrates primarily on the description of the R* commit protocols, Presumed Abort (PA) and Presumed Commit (PC). PA and PC are extensions of the well-known, two-phase (2P) commit protocol. PA is optimized for read-only transactions and a class of multisite update transactions, and PC is optimized for other classes of multisite update transactions. The optimizations result in reduced intersite message traffic and log writes, and, consequently, a better response time. The paper also discusses R*'s approach toward distributed deadlock detection and resolution.

318 citations
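
The abstract above describes Presumed Abort (PA) as a two-phase commit extension that reduces message traffic and log writes, especially for read-only work. The sketch below illustrates only the core presumption, under stated assumptions: the coordinator remembers commit decisions alone and answers "abort" for any transaction it has no record of, and read-only voters are left out of the second phase. The names `Vote` and `Coordinator` are hypothetical, and the code omits acknowledgements, timeouts, and forced log writes.

```python
from enum import Enum
from typing import Dict, List, Set, Tuple


class Vote(Enum):
    YES = "yes"              # participant made updates and is prepared to commit
    NO = "no"                # participant cannot commit
    READ_ONLY = "read-only"  # participant made no updates and needs no outcome


class Coordinator:
    """Toy coordinator illustrating the presumption only (not R*'s implementation)."""

    def __init__(self) -> None:
        # Only commit decisions are remembered; aborts leave no trace.
        self.commit_log: Set[str] = set()

    def run(self, txn: str, votes: Dict[str, Vote]) -> Tuple[str, List[str]]:
        """Collect phase-1 votes; return the decision and who must hear it in phase 2."""
        updaters = sorted(p for p, v in votes.items() if v is Vote.YES)
        if any(v is Vote.NO for v in votes.values()):
            # Abort: nothing is recorded, since "no record" already means abort.
            return "abort", updaters
        if updaters:
            self.commit_log.add(txn)
        # Read-only participants drop out after phase 1.
        return "commit", updaters

    def inquire(self, txn: str) -> str:
        """Answer a recovering participant: no information is presumed to mean abort."""
        return "commit" if txn in self.commit_log else "abort"
```

A transaction whose participants are all read-only records nothing in this sketch, which is safe here because no participant will ever ask for its outcome.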

Cited by

Proceedings Article•

Aristides Gionis¹, Piotr Indyk¹, Rajeev Motwani¹

Stanford University¹

07 Sep 1999

TL;DR: A novel hashing-based scheme for approximate similarity search is examined; experimental results indicate that it scales well even for a relatively large number of dimensions and improves running time over other methods for searching in high-dimensional spaces based on hierarchical tree decomposition.

Abstract: The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image databases, document collections, time-series databases, and genome databases. Unfortunately, all known techniques for solving this problem fall prey to the "curse of dimensionality." That is, the data structures scale poorly with data dimensionality; in fact, if the number of dimensions exceeds 10 to 20, searching in k-d trees and related structures involves the inspection of a large fraction of the database, thereby doing no better than brute-force linear search. It has been suggested that since the selection of features and the choice of a distance metric in typical applications is rather heuristic, determining an approximate nearest neighbor should suffice for most practical purposes. In this paper, we examine a novel scheme for approximate similarity search based on hashing. The basic idea is to hash the points from the database so as to ensure that the probability of collision is much higher for objects that are close to each other than for those that are far apart. We provide experimental evidence that our method gives significant improvement in running time over other methods for searching in high-dimensional spaces based on hierarchical tree decomposition. Experimental results also indicate that our scheme scales well even for a relatively large number of dimensions (more than 50).

Proceedings of the 25th VLDB Conference, Edinburgh, Scotland, 1999.

Supported by NAVY N00014-96-1-1221 grant and NSF Grant IIS-9811904. Supported by Stanford Graduate Fellowship and NSF NYI Award CCR-9357849. Supported by ARO MURI Grant DAAH04-96-1-0007, NSF Grant IIS-9811904, and NSF Young Investigator Award CCR-9357849, with matching funds from IBM, Mitsubishi, Schlumberger Foundation, Shell Foundation, and Xerox Corporation.

3,705 citations
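
The abstract above states the basic idea: hash the points so that nearby objects collide with much higher probability than distant ones. The sketch below illustrates that idea with one simple hash family, bit sampling over binary vectors compared by Hamming distance. That choice is an assumption made for illustration, as are the names `BitSamplingIndex`, `make_hash`, and `hamming` and the parameters `k` and `tables`; the abstract does not specify the paper's construction or tuning.

```python
import random
from collections import defaultdict
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Tuple[int, ...]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def make_hash(dim: int, k: int, rng: random.Random) -> Callable[[Sequence[int]], Vector]:
    """Sample k coordinates; vectors that agree on most bits usually agree on these."""
    positions = [rng.randrange(dim) for _ in range(k)]
    return lambda v: tuple(v[i] for i in positions)


class BitSamplingIndex:
    """Several hash tables, each keyed by a different random sample of bit positions."""

    def __init__(self, dim: int, k: int = 8, tables: int = 10, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.hashes = [make_hash(dim, k, rng) for _ in range(tables)]
        self.tables = [defaultdict(list) for _ in range(tables)]
        self.points: List[Vector] = []

    def add(self, v: Sequence[int]) -> None:
        idx = len(self.points)
        self.points.append(tuple(v))
        for h, table in zip(self.hashes, self.tables):
            table[h(v)].append(idx)

    def query(self, q: Sequence[int]) -> Optional[Vector]:
        """Return the closest point among those sharing a bucket with q, or None."""
        candidates = set()
        for h, table in zip(self.hashes, self.tables):
            candidates.update(table.get(h(q), ()))
        if not candidates:
            return None
        return min((self.points[i] for i in candidates), key=lambda p: hamming(p, q))
```

Only the points that share at least one bucket with the query are compared exactly, which is where the saving over a linear scan would come from. The answer is approximate: a true nearest neighbor that collides with the query in no table is missed.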

Proceedings Article•DOI•

Models and issues in data stream systems

Brian Babcock¹, Shivnath Babu¹, Mayur Datar¹, Rajeev Motwani¹, Jennifer Widom¹

Stanford University¹

03 Jun 2002

TL;DR: This overview paper motivates the need for, and the research issues arising from, a new model of data processing in which data does not take the form of persistent relations but rather arrives in multiple, continuous, rapid, time-varying data streams.

Abstract: In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives in multiple, continuous, rapid, time-varying data streams. In addition to reviewing past work relevant to data stream systems and current projects in the area, the paper explores topics in stream query languages, new requirements and challenges in query processing, and algorithmic issues.

2,933 citations

Proceedings Article•DOI•

Storing and querying ordered XML using a relational database system

Igor Tatarinov¹, Stratis D. Viglas², Kevin Scott Beyer³, Jayavel Shanmugasundaram⁴, Eugene J. Shekita³, Chun Zhang²

University of Washington¹, University of Wisconsin-Madison², IBM³, Cornell University⁴

03 Jun 2002

TL;DR: This paper shows that XML's ordered data model can indeed be efficiently supported by a relational database system, and proposes three order encoding methods that can be used to represent XML order in the relational data model, and also proposes algorithms for translating ordered XPath expressions into SQL using these encoding methods.

Abstract: XML is quickly becoming the de facto standard for data exchange over the Internet. This is creating a new set of data management requirements involving XML, such as the need to store and query XML documents. Researchers have proposed using relational database systems to satisfy these requirements by devising ways to "shred" XML documents into relations, and translate XML queries into SQL queries over these relations. However, a key issue with such an approach, which has largely been ignored in the research literature, is how (and whether) the ordered XML data model can be efficiently supported by the unordered relational data model. This paper shows that XML's ordered data model can indeed be efficiently supported by a relational database system. This is accomplished by encoding order as a data value. We propose three order encoding methods that can be used to represent XML order in the relational data model, and also propose algorithms for translating ordered XPath expressions into SQL using these encoding methods. Finally, we report the results of an experimental study that investigates the performance of the proposed order encoding methods on a workload of ordered XML queries and updates.

2,402 citations
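
The abstract above says that XML order can be supported in an unordered relational model by encoding order as a data value. The sketch below shows one possible encoding, assumed for illustration: each element is stored as a row whose `pos` column holds its position in document order, so a positional query becomes an `ORDER BY` on that column. The table layout and the names `shred` and `nth_child` are hypothetical, and the abstract does not say whether this matches any of the paper's three methods.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import Optional


def shred(xml_text: str) -> sqlite3.Connection:
    """Store each element as a row; pos is its position in document (preorder) order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (pos INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, text TEXT)"
    )
    counter = 0

    def visit(elem: ET.Element, parent_pos: Optional[int]) -> None:
        nonlocal counter
        counter += 1
        pos = counter
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?)",
            (pos, parent_pos, elem.tag, (elem.text or "").strip()),
        )
        for child in elem:
            visit(child, pos)

    visit(ET.fromstring(xml_text), None)
    return conn


def nth_child(conn: sqlite3.Connection, parent_tag: str, child_tag: str, n: int) -> Optional[str]:
    """Rough analogue of the XPath step parent/child[n], assuming one matching parent."""
    row = conn.execute(
        "SELECT c.text FROM node AS c JOIN node AS p ON c.parent = p.pos "
        "WHERE p.tag = ? AND c.tag = ? ORDER BY c.pos LIMIT 1 OFFSET ?",
        (parent_tag, child_tag, n - 1),
    ).fetchone()
    return row[0] if row else None
```

A single document-wide position makes ordered reads simple, but inserting a node in the middle of the document would require renumbering later rows. The abstract notes that the paper's experimental study covers updates as well as queries.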

Book•

Principles of Distributed Database Systems

M. Tamer Özsu¹, Patrick Valduriez²

University of Alberta¹, French Institute for Research in Computer Science and Automation²

01 Aug 1990

TL;DR: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels and concentrates on fundamental theories as well as techniques and algorithms in distributed data management.

Abstract: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels. The material concentrates on fundamental theories as well as techniques and algorithms. The advent of the Internet and the World Wide Web, and, more recently, the emergence of cloud computing and streaming data applications, has forced a renewal of interest in distributed and parallel data management, while, at the same time, requiring a rethinking of some of the traditional techniques. This book covers the breadth and depth of this re-emerging field. The coverage consists of two parts. The first part discusses the fundamental principles of distributed data management and includes distribution design, data integration, distributed query processing and optimization, distributed transaction management, and replication. The second part focuses on more advanced topics and includes discussion of parallel database systems, distributed object management, peer-to-peer data management, web data management, data stream systems, and cloud computing. New in this edition: new chapters covering database replication, database integration, multidatabase query processing, peer-to-peer data management, and web data management; coverage of emerging topics such as data streams and cloud computing; and extensive revisions and updates based on years of class testing and feedback. Ancillary teaching materials are available.

2,395 citations

Proceedings Article•DOI•

QBIC project: querying images by content, using color, texture, and shape

Carlton Wayne Niblack¹, R. Barber¹, Will Equitz¹, Myron D. Flickner¹, Eduardo H. Glasman¹, Dragutin Petkovic¹, Peter Cornelius Yanker¹, Christos Faloutsos¹, Gabriel Taubin¹

IBM¹

14 Apr 1993-Storage and Retrieval for Image and Video Databases

TL;DR: The main algorithms used in the QBIC project for color, texture, shape and sketch queries are presented, along with example query results and a discussion of future directions.

Abstract: In the query by image content (QBIC) project we are studying methods to query large on-line image databases using the images' content as the basis of the queries. Examples of the content we use include color, texture, and shape of image objects and regions. Potential applications include medical ('Give me other images that contain a tumor with a texture like this one'), photo-journalism ('Give me images that have blue at the top and red at the bottom'), and many others in art, fashion, cataloging, retailing, and industry. Key issues include derivation and computation of attributes of images and objects that provide useful query functionality, retrieval methods based on similarity as opposed to exact match, query by image example or user drawn image, the user interfaces, query refinement and navigation, high dimensional database indexing, and automatic and semi-automatic database population. We currently have a prototype system written in X/Motif and C running on an RS/6000 that allows a variety of queries, and a test database of over 1000 images and 1000 objects populated from commercially available photo clip art images. In this paper we present the main algorithms for color, texture, shape and sketch query that we use, show example query results, and discuss future directions. © (1993) SPIE, The International Society for Optical Engineering.

2,127 citations
