Topic

Distributed database

About: Distributed database is a research topic. Over its lifetime, 11,788 publications have been published within this topic, receiving 210,562 citations.


Papers

Proceedings Article
03 May 2010
TL;DR: The architecture of HDFS is described and experience using HDFS to manage 25 petabytes of enterprise data at Yahoo! is reported on.


Abstract: The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications. In a large cluster, thousands of servers both host directly attached storage and execute user application tasks. By distributing storage and computation across many servers, the resource can grow with demand while remaining economical at every size. We describe the architecture of HDFS and report on experience using HDFS to manage 25 petabytes of enterprise data at Yahoo!.


4,572 citations


Journal Article
Amit P. Sheth, James A. Larson
Abstract: A federated database system (FDBS) is a collection of cooperating database systems that are autonomous and possibly heterogeneous. In this paper, we define a reference architecture for distributed database management systems from system and schema viewpoints and show how various FDBS architectures can be developed. We then define a methodology for developing one of the popular architectures of an FDBS. Finally, we discuss critical issues related to developing and operating an FDBS.


2,352 citations


Book
M. Tamer Özsu, Patrick Valduriez
01 Aug 1990
TL;DR: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels and concentrates on fundamental theories as well as techniques and algorithms in distributed data management.


Abstract: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels. The material concentrates on fundamental theories as well as techniques and algorithms. The advent of the Internet and the World Wide Web, and, more recently, the emergence of cloud computing and streaming data applications, has forced a renewal of interest in distributed and parallel data management, while, at the same time, requiring a rethinking of some of the traditional techniques. This book covers the breadth and depth of this re-emerging field. The coverage consists of two parts. The first part discusses the fundamental principles of distributed data management and includes distribution design, data integration, distributed query processing and optimization, distributed transaction management, and replication. The second part focuses on more advanced topics and includes discussion of parallel database systems, distributed object management, peer-to-peer data management, web data management, data stream systems, and cloud computing. New in this edition: new chapters covering database replication, database integration, multidatabase query processing, peer-to-peer data management, and web data management; coverage of emerging topics such as data streams and cloud computing; and extensive revisions and updates based on years of class testing and feedback. Ancillary teaching materials are available.


2,328 citations


Journal Article
David Jefferson
TL;DR: Virtual time is a new paradigm for organizing and synchronizing distributed systems which can be applied to such problems as distributed discrete event simulation and distributed database concurrency control.


Abstract: Virtual time is a new paradigm for organizing and synchronizing distributed systems which can be applied to such problems as distributed discrete event simulation and distributed database concurrency control. Virtual time provides a flexible abstraction of real time in much the same way that virtual memory provides an abstraction of real memory. It is implemented using the Time Warp mechanism, a synchronization protocol distinguished by its reliance on lookahead-rollback, and by its implementation of rollback via antimessages.
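The rollback-with-antimessages idea can be illustrated with a toy sketch. This is not the paper's implementation: it models a single hypothetical `LogicalProcess` whose state is a running sum, with no network and no global virtual time. The class name, the message tuples, and the one-output-per-event rule are all assumptions made for illustration. A message that arrives with a timestamp earlier than the local virtual time (a "straggler") triggers a rollback: the state saved before the first later event is restored, the outputs sent after that point are cancelled with antimessages, and the later events are re-executed.

```python
import bisect


class LogicalProcess:
    """Toy Time Warp process; its state is a running sum of event values."""

    def __init__(self):
        self.lvt = 0            # local virtual time
        self.state = 0
        self.processed = []     # (timestamp, value) pairs, in timestamp order
        self.snapshots = []     # state saved just before each processed event
        self.sent = []          # one output message per processed event
        self.antimessages = []  # cancellations emitted during rollbacks

    def _apply(self, ts, value):
        self.snapshots.append(self.state)
        self.state += value
        self.lvt = ts
        self.processed.append((ts, value))
        self.sent.append(("sum", ts, self.state))

    def receive(self, ts, value):
        if ts >= self.lvt:
            # In-order message: process it optimistically.
            self._apply(ts, value)
            return
        # Straggler: find the first processed event later than ts.
        times = [t for t, _ in self.processed]
        k = bisect.bisect_right(times, ts)
        redo = self.processed[k:]
        # Restore the state saved before that event.
        self.state = self.snapshots[k]
        # Cancel every output produced after the rollback point.
        self.antimessages.extend(("anti",) + m for m in self.sent[k:])
        del self.processed[k:]
        del self.snapshots[k:]
        del self.sent[k:]
        self.lvt = self.processed[-1][0] if self.processed else 0
        # Process the straggler, then re-execute the undone events.
        self._apply(ts, value)
        for t, v in redo:
            self._apply(t, v)
```

For example, after receiving events at virtual times 10, 20, and 30, a straggler at time 15 undoes the events at 20 and 30, emits two antimessages, and replays them after the straggler.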


2,238 citations


Journal Article
01 Nov 1998 - Journal of the ACM
TL;DR: This work describes schemes that enable a user to access k replicated copies of a database and privately retrieve information stored in the database, so that each individual server gets no information on the identity of the item retrieved by the user.


Abstract: Publicly accessible databases are an indispensable resource for retrieving up-to-date information. But they also pose a significant risk to the privacy of the user, since a curious database operator can follow the user's queries and infer what the user is after. Indeed, in cases where the users' intentions are to be kept secret, users are often cautious about accessing the database. It can be shown that when accessing a single database, to completely guarantee the privacy of the user, the whole database should be downloaded; namely, n bits should be communicated (where n is the number of bits in the database). In this work, we investigate whether by replicating the database, more efficient solutions to the private retrieval problem can be obtained. We describe schemes that enable a user to access k replicated copies of a database (k≥2) and privately retrieve information stored in the database. This means that each individual server (holding a replicated copy of the database) gets no information on the identity of the item retrieved by the user. Our schemes use the replication to gain substantial savings. In particular, we present a two-server scheme with communication complexity $O(n^{1/3})$.
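To make the replication idea concrete, here is a minimal sketch of the simplest two-server approach, assuming two non-colluding servers that hold identical copies of an n-bit database. It is not the $O(n^{1/3})$ scheme from the abstract: each query here is a subset of indices, so communication stays linear in n. The function names are hypothetical. The user sends a uniformly random index subset to one server and the same subset with the membership of index i toggled to the other; each subset on its own is uniformly random, so neither server learns i, yet XORing the two answers yields bit i.

```python
import secrets


def make_queries(n, i):
    """Return two index subsets that differ only in the membership of i."""
    s1 = {j for j in range(n) if secrets.randbits(1)}
    s2 = s1 ^ {i}  # symmetric difference toggles index i
    return s1, s2


def server_answer(db_bits, subset):
    """A server XORs together the bits at the requested indices."""
    ans = 0
    for j in subset:
        ans ^= db_bits[j]
    return ans


def retrieve(db_bits, i):
    """Simulate both servers locally and recover bit i from their answers."""
    s1, s2 = make_queries(len(db_bits), i)
    # Bits requested from both servers cancel; only db_bits[i] survives.
    return server_answer(db_bits, s1) ^ server_answer(db_bits, s2)
```

Calling `retrieve(db, i)` on any list of bits `db` returns `db[i]`, whatever random subset is drawn.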


1,856 citations


Network Information
Related Topics (5)
Server

79.5K papers, 1.4M citations

93% related
Distributed Computing Environment

3.7K papers, 51.2K citations

93% related
Query optimization

17.6K papers, 474.4K citations

92% related
Web query classification

11.9K papers, 339.3K citations

92% related
Load balancing (computing)

27.3K papers, 415.5K citations

92% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2022    2
2021    350
2020    429
2019    480
2018    535
2017    599

Top Attributes


Topic's top 5 most impactful authors

Bharat Bhargava

30 papers, 421 citations

Patrick Valduriez

25 papers, 4.5K citations

Philip S. Yu

17 papers, 378 citations

Sang H. Son

17 papers, 191 citations

Sushil Jajodia

16 papers, 741 citations