Proceedings Article
Dynamo: Amazon's highly available key-value store
Giuseppe DeCandia, Deniz Hastorun, Madan Mohan Rao Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Sven Vosshall, Werner Vogels
- Vol. 41, Iss: 6, pp 205-220
TLDR
Dynamo is a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience. It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.
Abstract:
Reliability at massive scale is one of the biggest challenges we face at Amazon.com, one of the largest e-commerce operations in the world; even the slightest outage has significant financial consequences and impacts customer trust. The Amazon.com platform, which provides services for many web sites worldwide, is implemented on top of an infrastructure of tens of thousands of servers and network components located in many datacenters around the world. At this scale, small and large components fail continuously and the way persistent state is managed in the face of these failures drives the reliability and scalability of the software systems.
This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience. To achieve this level of availability, Dynamo sacrifices consistency under certain failure scenarios. It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.
Citations
Patent
Method, server and system for managing content in content delivery network
TL;DR: In this paper, a method for managing content in a content distribution network (CDN) is provided, which includes executing the following steps at the main controlling server: monitoring whether used storage space of the edge node exceeds a predetermined threshold value; acquiring a list of protected files (U1) from the indexing system; performing directory traversal for edge node to get a list (U0) whose difference between latest modified time and the current time exceeds the predetermined time range; getting a list to be deleted U2=U0−U1; and deleting each of the
Book Chapter
Data integration over NoSQL stores using access path based mappings
TL;DR: This paper proposes an access path based mapping solution that takes advantage of the design choices of each data source, and presents a prototype implementation in which the target schema is represented as a set of relations and which enables the integration of two of the most popular NoSQL database models, namely document and column-family stores.
Proceedings Article
D4M 2.0 Schema: A General Purpose High Performance Schema for the Accumulo Database
Jeremy Kepner, Christian C. Anderson, William Arcand, David Bestor, Bill Bergeron, Chansup Byun, Matthew Hubbell, Peter Michaleas, Julie Mullen, David O'Gwynn, Andrew Prout, Albert Reuther, Antonio Rosa, Charles Yee
TL;DR: This paper presents the D4M 2.0 Schema, a general purpose schema that can be used to fully index and rapidly query every unique string in a dataset, which has been applied with little or no customization to cyber, bioinformatics, scientific citation, free text, and social media data.
Proceedings Article
A GPU accelerated storage system
TL;DR: The design of a storage system prototype that uses GPU offloading to accelerate a number of computationally intensive primitives based on hashing is presented and the results show that this technique can bring tangible performance gains without negatively impacting the performance of concurrently running applications.
Proceedings Article
An analysis of network-partitioning failures in cloud systems
TL;DR: A comprehensive study of 136 system failures attributed to network-partitioning faults from 25 widely used distributed systems found that the majority of the failures led to catastrophic effects, such as data loss, reappearance of deleted data, broken locks, and system crashes.
References
Proceedings Article
Chord: A scalable peer-to-peer lookup service for internet applications
TL;DR: Results from theoretical analysis, simulations, and experiments show that Chord is scalable, with communication cost and the state maintained by each node scaling logarithmically with the number of Chord nodes.
Book Chapter
Time, clocks, and the ordering of events in a distributed system
TL;DR: In this paper, the concept of one event happening before another in a distributed system is examined, and a distributed algorithm is given for synchronizing a system of logical clocks which can be used to totally order the events.
Book Chapter
Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems
Antony Rowstron, Peter Druschel
TL;DR: Pastry is a scalable, distributed object location and routing substrate for wide-area peer-to-peer applications, which performs application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet.
Journal Article
Time, clocks, and the ordering of events in a distributed system
TL;DR: In this article, the concept of one event happening before another in a distributed system is examined, and a distributed algorithm is given for synchronizing a system of logical clocks which can be used to totally order the events.
Journal Article
The Google file system
TL;DR: This paper presents file system interface extensions designed to support distributed applications, discusses many aspects of the design, and reports measurements from both micro-benchmarks and real world use.