A Scalable Data Platform for a Large Number of Small Applications.

Open Access · Proceedings Article

TLDR

This work explores a new point in the design space whereby commodity hardware and free software are used to scale to a large number of applications while still supporting full SQL functionality, transactional guarantees, high availability and Service Level Agreements (SLAs).

Abstract:

As a growing number of websites open up their APIs to external application developers (e.g., Facebook, Yahoo! Widgets, Google Gadgets), these websites are facing an intriguing scalability problem: while each user-generated application is by itself quite small (in terms of size and throughput requirements), there are many, many such applications. Unfortunately, existing data-management solutions are not designed to handle this form of scalability in a cost-effective, manageable and/or flexible manner. For instance, large installations of commercial database systems such as Oracle, DB2 and SQL Server are usually very expensive and difficult to manage. At the other extreme, low-cost hosted data-management solutions such as Amazon's SimpleDB do not support sophisticated data-manipulation primitives such as joins that are necessary for developing most Web applications. To address this issue, we explore a new point in the design space whereby we use commodity hardware and free software (MySQL) to scale to a large number of applications while still supporting full SQL functionality, transactional guarantees, high availability and Service Level Agreements (SLAs). We do so by exploiting the key property that each application is "small" and can fit in a single machine (which can possibly be shared with other applications). Using this property, we design replication strategies, data migration techniques and load balancing operations that automate the tasks that would otherwise contribute to the operational and management complexity of dealing with a large number of applications. Our experiments based on the TPC-W benchmark suggest that the proposed system can scale to a large number of small applications.
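To make the key property concrete, here is a minimal sketch of how small applications might be packed onto shared machines with replication. It is not the paper's algorithm: the first-fit, largest-first heuristic, the `Machine` class, the `place` function, the integer load units and the default of two replicas are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """A hypothetical commodity server with a fixed load capacity."""
    name: str
    capacity: int
    apps: dict = field(default_factory=dict)  # app name -> load units

    def free(self) -> int:
        return self.capacity - sum(self.apps.values())


def place(apps, machines, replicas=2):
    """Assign every app to `replicas` distinct machines.

    Apps are handled largest first and each copy goes to the first
    machine with enough spare capacity (first-fit decreasing).
    """
    placement = {}
    for app, load in sorted(apps.items(), key=lambda kv: -kv[1]):
        chosen = []
        for m in machines:
            if len(chosen) == replicas:
                break
            if m.free() >= load:
                chosen.append(m)
        if len(chosen) < replicas:
            raise RuntimeError(f"not enough capacity to place {app}")
        for m in chosen:
            m.apps[app] = load
        placement[app] = [m.name for m in chosen]
    return placement


if __name__ == "__main__":
    servers = [Machine(f"m{i}", capacity=100) for i in range(1, 5)]
    print(place({"a": 60, "b": 50, "c": 30}, servers))
```

Because each application fits on one machine, placement reduces to a bin-packing style decision per application, and no single application's data ever has to be partitioned across servers.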
