Topic

Distributed database

About: Distributed database is a research topic. Over its lifetime, 11,788 publications have been published within this topic, receiving 210,562 citations.


Papers
Journal Article
TL;DR: In this paper, the authors introduce design principles for a data management architecture called the data grid, and describe two basic services that are fundamental to the design of a data grid: storage systems and metadata management.

1,198 citations

Journal Article
01 Aug 2008
TL;DR: PNUTS provides data storage organized as hashed or ordered tables, low latency for large numbers of concurrent requests including updates and queries, and novel per-record consistency guarantees and utilizes automated load-balancing and failover to reduce operational complexity.
Abstract: We describe PNUTS, a massively parallel and geographically distributed database system for Yahoo!'s web applications. PNUTS provides data storage organized as hashed or ordered tables, low latency for large numbers of concurrent requests including updates and queries, and novel per-record consistency guarantees. It is a hosted, centrally managed, and geographically distributed service, and utilizes automated load-balancing and failover to reduce operational complexity. The first version of the system is currently serving in production. We describe the motivation for PNUTS and the design and implementation of its table storage and replication layers, and then present experimental results.

1,142 citations

Journal Article
TL;DR: A “majority consensus” algorithm which represents a new solution to the update synchronization problem for multiple copy databases is presented and can function effectively in the presence of communication and database site outages.
Abstract: A “majority consensus” algorithm which represents a new solution to the update synchronization problem for multiple copy databases is presented. The algorithm embodies distributed control and can function effectively in the presence of communication and database site outages. The correctness of the algorithm is demonstrated and the cost of using it is analyzed. Several examples that illustrate aspects of the algorithm operation are included in the Appendix.

1,136 citations
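
The idea of accepting an update only when a majority of database copies agree can be illustrated with a small sketch. This is a simplified timestamp-and-vote model written for illustration, not the algorithm from the paper above; the names `Replica`, `vote`, `apply`, and `propose_update` are hypothetical, and the rule that a replica votes yes only when the proposer's base timestamp is at least as new as its own is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """One database copy: a value plus the timestamp of its last applied update."""
    name: str
    value: object = None
    timestamp: int = 0
    up: bool = True  # False models a site outage

    def vote(self, base_timestamp: int) -> bool:
        # Assumed rule: vote yes only if the proposer read a version at least
        # as new as the one this replica holds. A site that is down casts no vote.
        return self.up and base_timestamp >= self.timestamp

    def apply(self, value: object, new_timestamp: int) -> None:
        # Apply the update only if it is newer than what is stored.
        if self.up and new_timestamp > self.timestamp:
            self.value = value
            self.timestamp = new_timestamp


def propose_update(replicas, value, base_timestamp, new_timestamp) -> bool:
    """Commit the update only if a strict majority of all replicas votes yes."""
    votes = sum(1 for r in replicas if r.vote(base_timestamp))
    if votes * 2 <= len(replicas):
        return False  # no majority: reject the update
    for r in replicas:
        r.apply(value, new_timestamp)
    return True
```

With three replicas and one of them down, an update based on the current version still gathers two of three votes and commits, while a later update based on a stale version is rejected by the replicas that are up.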

Journal Article
17 May 2002-Science
TL;DR: A model is presented that offers an explanation of social network searchability in terms of recognizable personal identities: sets of characteristics measured along a number of social dimensions that may be applicable to many network search problems.
Abstract: Social networks have the surprising property of being "searchable": Ordinary people are capable of directing messages through their network of acquaintances to reach a specific but distant target person in only a few steps. We present a model that offers an explanation of social network searchability in terms of recognizable personal identities: sets of characteristics measured along a number of social dimensions. Our model defines a class of searchable networks and a method for searching them that may be applicable to many network search problems, including the location of data files in peer-to-peer networks, pages on the World Wide Web, and information in distributed databases.

1,015 citations

Journal Article
TL;DR: The paper presents the “textbook” architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems, and discusses different kinds of distributed systems such as client-server, middleware (multitier), and heterogeneous database systems and shows how query processing works in these systems.
Abstract: Distributed data processing is becoming a reality. Businesses want to do it for many reasons, and they often must do it in order to stay competitive. While much of the infrastructure for distributed data processing is already there (e.g., modern network technology), a number of issues make distributed data processing still a complex undertaking: (1) distributed systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system; (3) legacy systems need to be integrated; such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment. This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the “textbook” architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intraquery parallelism, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multitier), and heterogeneous database systems, and shows how query processing works in these systems.

980 citations
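
One classic example of a join technique that reduces communication cost is semijoin reduction: the local site ships only its distinct join keys, and the remote site returns only the rows that match. The sketch below illustrates that idea under stated assumptions. Whether it matches the paper's own treatment is not confirmed by the abstract, and the function names and the row layout (lists of dictionaries) are hypothetical.

```python
def semijoin_reduce(local_rows, remote_rows, key):
    """Return the join keys that would be shipped and the remote rows that survive."""
    # Step 1: project the distinct join keys at the local site.
    keys = {row[key] for row in local_rows}
    # Step 2: the remote site keeps only rows whose key appears in that set.
    reduced = [row for row in remote_rows if row[key] in keys]
    return keys, reduced


def hash_join(local_rows, remote_rows, key):
    """Plain in-memory hash join used to combine the two inputs."""
    index = {}
    for r in remote_rows:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in local_rows for r in index.get(l[key], [])]


# Hypothetical data: orders live at one site, customers at another.
orders = [
    {"cust": 1, "amount": 10},
    {"cust": 1, "amount": 5},
    {"cust": 3, "amount": 7},
]
customers = [{"cust": i, "name": f"c{i}"} for i in range(1, 6)]

keys, reduced = semijoin_reduce(orders, customers, "cust")
result = hash_join(orders, reduced, "cust")
```

In this example two keys travel to the remote site and two customer rows come back instead of five, while the join result is the same as joining against the full remote table.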


Network Information
Related Topics (5)
Server
79.5K papers, 1.4M citations
93% related
Wireless sensor network
142K papers, 2.4M citations
90% related
Network packet
159.7K papers, 2.2M citations
88% related
Wireless network
122.5K papers, 2.1M citations
88% related
Scheduling (computing)
78.6K papers, 1.3M citations
87% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    39
2022    74
2021    351
2020    430
2019    480
2018    536