Showing papers on "Concurrency control published in 2017"


Journal ArticleDOI
01 Mar 2017
TL;DR: Conducts an extensive study of MVCC's four key design decisions (concurrency control protocol, version storage, garbage collection, and index management) and identifies the fundamental bottlenecks of each design choice.
Abstract: Multi-version concurrency control (MVCC) is currently the most popular transaction management scheme in modern database management systems (DBMSs). Although MVCC was discovered in the late 1970s, it is used in almost every major relational DBMS released in the last decade. Maintaining multiple versions of data potentially increases parallelism without sacrificing serializability when processing transactions. But scaling MVCC in a multi-core and in-memory setting is non-trivial: when there are a large number of threads running in parallel, the synchronization overhead can outweigh the benefits of multi-versioning. To understand how MVCC performs when processing transactions in modern hardware settings, we conduct an extensive study of the scheme's four key design decisions: concurrency control protocol, version storage, garbage collection, and index management. We implemented state-of-the-art variants of all of these in an in-memory DBMS and evaluated them using OLTP workloads. Our analysis identifies the fundamental bottlenecks of each design choice.
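
To make the version-storage and visibility aspects concrete, here is a minimal sketch (not the paper's implementation) of an append-only version chain with begin/end timestamps; names such as VersionChain and read_visible are illustrative only.

```python
import threading

class Version:
    """One version of a tuple in an append-only chain (newest first)."""
    def __init__(self, value, begin_ts, end_ts=float("inf")):
        self.value = value
        self.begin_ts = begin_ts   # commit timestamp of the creating transaction
        self.end_ts = end_ts       # set when a newer version supersedes this one
        self.next = None           # pointer to the older version

class VersionChain:
    """Append-only (newest-to-oldest) version storage for a single key."""
    def __init__(self):
        self.head = None
        self.latch = threading.Lock()

    def install(self, value, commit_ts):
        """Install a new version; the previous head stops being visible at commit_ts."""
        with self.latch:
            v = Version(value, commit_ts)
            if self.head is not None:
                self.head.end_ts = commit_ts
                v.next = self.head
            self.head = v

    def read_visible(self, read_ts):
        """Return the newest version whose lifetime [begin_ts, end_ts) contains read_ts."""
        v = self.head
        while v is not None:
            if v.begin_ts <= read_ts < v.end_ts:
                return v.value
            v = v.next
        return None

# Usage: a reader at timestamp 15 still sees the old value after a later write.
chain = VersionChain()
chain.install("v1", commit_ts=10)
chain.install("v2", commit_ts=20)
assert chain.read_visible(15) == "v1"
assert chain.read_visible(25) == "v2"
```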

140 citations


Proceedings ArticleDOI
09 May 2017
TL;DR: Cicada is a single-node multi-core in-memory transactional database with serializability that reduces overhead and contention at several levels of the system by leveraging optimistic and multi-version concurrency control schemes and multiple loosely synchronized clocks while mitigating their drawbacks.
Abstract: Multi-core in-memory databases promise high-speed online transaction processing. However, the performance of individual designs suffers when the workload characteristics miss their small sweet spot of a desired contention level, read-write ratio, record size, processing rate, and so forth. Cicada is a single-node multi-core in-memory transactional database with serializability. To provide high performance under diverse workloads, Cicada reduces overhead and contention at several levels of the system by leveraging optimistic and multi-version concurrency control schemes and multiple loosely synchronized clocks while mitigating their drawbacks. On the TPC-C and YCSB benchmarks, Cicada outperforms Silo, TicToc, FOEDUS, MOCC, two-phase locking, Hekaton, and ERMIA in most scenarios, achieving up to 3X higher throughput than the next fastest design. It handles up to 2.07 M TPC-C transactions per second and 56.5 M YCSB transactions per second, and scans up to 356 M records per second on a single 28-core machine.
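
The "multiple loosely synchronized clocks" idea can be pictured with a small sketch: each worker allocates timestamps from a thread-local clock that only moves forward and is bumped whenever it observes a larger timestamp from another worker. This is an illustrative simplification, not Cicada's actual clock algorithm, and the names below are invented for the example.

```python
class LocalClock:
    """Thread-local timestamp allocator: never hands out a timestamp that
    orders before anything it has already observed from other workers."""
    def __init__(self, worker_id, num_workers):
        self.worker_id = worker_id
        self.num_workers = num_workers
        self.last = 0

    def observe(self, remote_epoch):
        # Bump the local clock when a larger epoch is seen on a committed version.
        if remote_epoch > self.last:
            self.last = remote_epoch

    def next_ts(self):
        # Advance locally; the worker id breaks ties so timestamps stay unique.
        self.last += 1
        return self.last * self.num_workers + self.worker_id

clock_a = LocalClock(worker_id=0, num_workers=2)
clock_b = LocalClock(worker_id=1, num_workers=2)
t1 = clock_a.next_ts()                       # epoch 1 on worker A
clock_b.observe(t1 // clock_b.num_workers)   # worker B reads a version stamped by A
t2 = clock_b.next_ts()                       # epoch 2 on worker B
assert t2 > t1                               # later allocation orders after what it observed
```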

126 citations


Journal ArticleDOI
01 Jan 2017
TL;DR: To achieve truly scalable operation, distributed concurrency control solutions must seek a tighter coupling with either novel network hardware or applications (via data modeling and semantically-aware execution), or both.
Abstract: Increasing transaction volumes have led to a resurgence of interest in distributed transaction processing. In particular, partitioning data across several servers can improve throughput by allowing servers to process transactions in parallel. But executing transactions across servers limits the scalability and performance of these systems. In this paper, we quantify the effects of distribution on concurrency control protocols in a distributed environment. We evaluate six classic and modern protocols in an in-memory distributed database evaluation framework called Deneva, providing an apples-to-apples comparison between each. Our results expose severe limitations of distributed transaction processing engines. Moreover, in our analysis, we identify several protocol-specific scalability bottlenecks. We conclude that to achieve truly scalable operation, distributed concurrency control solutions must seek a tighter coupling with either novel network hardware (in the local area) or applications (via data modeling and semantically-aware execution), or both.
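
As background for why cross-server transactions are expensive, the sketch below shows the classic two-phase commit pattern that distributed protocols like those evaluated here must layer their concurrency control on. It is a minimal illustration (in-process "shards", no failure handling), not Deneva's code, and all names are invented for the example.

```python
class Participant:
    """One shard that can tentatively prepare a write set and later commit or abort it."""
    def __init__(self, name):
        self.name = name
        self.store = {}
        self.staged = {}

    def prepare(self, txn_id, writes):
        # Vote yes only if the writes can be staged (locked/logged) here.
        self.staged[txn_id] = writes
        return True

    def commit(self, txn_id):
        self.store.update(self.staged.pop(txn_id))

    def abort(self, txn_id):
        self.staged.pop(txn_id, None)

def two_phase_commit(txn_id, writes_per_shard, shards):
    """Phase 1: every shard must vote yes; Phase 2: commit everywhere, else abort everywhere."""
    votes = [s.prepare(txn_id, writes_per_shard.get(s.name, {})) for s in shards]
    if all(votes):
        for s in shards:
            s.commit(txn_id)
        return "committed"
    for s in shards:
        s.abort(txn_id)
    return "aborted"

shards = [Participant("s1"), Participant("s2")]
print(two_phase_commit("t1", {"s1": {"x": 1}, "s2": {"y": 2}}, shards))  # committed
```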

105 citations


Proceedings ArticleDOI
14 Oct 2017
TL;DR: Eris can process a large class of distributed transactions in a single round-trip from the client to the storage system without any explicit coordination between shards or replicas in the normal case, providing atomicity, consistency, and fault tolerance with less than 10% overhead.
Abstract: Distributed storage systems aim to provide strong consistency and isolation guarantees on an architecture that is partitioned across multiple shards for scalability and replicated for fault tolerance. Traditionally, achieving all of these goals has required an expensive combination of atomic commitment and replication protocols -- introducing extensive coordination overhead. Our system, Eris, takes a different approach. It moves a core piece of concurrency control functionality, which we term multi-sequencing, into the datacenter network itself. This network primitive takes on the responsibility for consistently ordering transactions, and a new lightweight transaction protocol ensures atomicity. The end result is that Eris avoids both replication and transaction coordination overhead: we show that it can process a large class of distributed transactions in a single round-trip from the client to the storage system without any explicit coordination between shards or replicas in the normal case. It provides atomicity, consistency, and fault tolerance with less than 10% overhead -- achieving throughput 3.6-35x higher and latency 72-80% lower than a conventional design on standard benchmarks.
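
The multi-sequencing primitive can be pictured as a sequencer that stamps each transaction with one sequence number per shard it touches, so every shard can independently detect drops and reorderings. The sketch below is an illustrative reconstruction from the abstract, not the in-network implementation; class names are invented.

```python
from collections import defaultdict

class MultiSequencer:
    """Assigns each transaction one sequence number per destination shard,
    so every shard observes a gap-free, consistently ordered stream."""
    def __init__(self):
        self.counters = defaultdict(int)

    def stamp(self, txn_id, shards):
        stamps = {}
        for shard in shards:
            self.counters[shard] += 1
            stamps[shard] = self.counters[shard]
        return {"txn": txn_id, "stamps": stamps}

class Shard:
    """Executes transactions strictly in sequence-number order; a gap means a lost message."""
    def __init__(self, name):
        self.name = name
        self.expected = 1
        self.log = []

    def deliver(self, msg):
        seq = msg["stamps"][self.name]
        if seq != self.expected:
            raise RuntimeError(f"{self.name}: missing sequence number {self.expected}")
        self.log.append(msg["txn"])
        self.expected += 1

seq = MultiSequencer()
a, b = Shard("A"), Shard("B")
m1 = seq.stamp("t1", ["A", "B"])   # touches both shards
m2 = seq.stamp("t2", ["A"])        # touches only shard A
for shard, msg in ((a, m1), (b, m1), (a, m2)):
    shard.deliver(msg)
print(a.log, b.log)                # ['t1', 't2'] ['t1']
```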

99 citations


Journal ArticleDOI
01 Jan 2017
TL;DR: This paper designs a new serializable concurrency control protocol, piece-wise visibility (PWV), with the explicit goal of enabling early write visibility, and finds that PWV can outperform serializable protocols by an order of magnitude and read committed by 3X on high contention workloads.
Abstract: In order to guarantee recoverable transaction execution, database systems permit a transaction's writes to be observable only at the end of its execution. As a consequence, there is generally a delay between the time a transaction performs a write and the time later transactions are permitted to read it. This delayed write visibility can significantly impact the performance of serializable database systems by reducing concurrency among conflicting transactions. This paper makes the observation that delayed write visibility stems from the fact that database systems can arbitrarily abort transactions at any point during their execution. Accordingly, we make the case for database systems which only abort transactions under a restricted set of conditions, thereby enabling a new recoverability mechanism, early write visibility, which safely makes transactions' writes visible prior to the end of their execution. We design a new serializable concurrency control protocol, piece-wise visibility (PWV), with the explicit goal of enabling early write visibility. We evaluate PWV against state-of-the-art serializable protocols and a highly optimized implementation of read committed, and find that PWV can outperform serializable protocols by an order of magnitude and read committed by 3X on high contention workloads.
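
A hedged illustration of the piece-wise idea: the transaction is decomposed into pieces, and each piece's writes become visible to other transactions as soon as the piece finishes rather than at transaction commit, which is only safe because the system promises not to abort the transaction after it starts publishing. The names below are illustrative, not from the paper.

```python
class Store:
    def __init__(self):
        self.data = {}

    def publish(self, writes):
        # Early write visibility: writes become readable before the whole
        # transaction finishes, because aborts are restricted by design.
        self.data.update(writes)

def run_piecewise(store, pieces):
    """Run a transaction as a list of pieces; each piece reads the current
    store state and returns the writes it publishes immediately."""
    for piece in pieces:
        writes = piece(store.data)
        store.publish(writes)

store = Store()
store.data["balance_a"] = 100
store.data["balance_b"] = 0

transfer_pieces = [
    lambda d: {"balance_a": d["balance_a"] - 10},   # piece 1: debit
    lambda d: {"balance_b": d["balance_b"] + 10},   # piece 2: credit (already sees piece 1's write)
]
run_piecewise(store, transfer_pieces)
print(store.data)   # {'balance_a': 90, 'balance_b': 10}
```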

93 citations


Journal ArticleDOI
TL;DR: A hybrid system architecture that enables a team of mobile robots to complete a task in a complex environment by self-organizing into a multihop ad hoc network and solving the concurrent communication and mobility problem is developed.
Abstract: We develop a hybrid system architecture that enables a team of mobile robots to complete a task in a complex environment by self-organizing into a multihop ad hoc network and solving the concurrent communication and mobility problem. The proposed system consists of a two-layer feedback loop. An outer loop performs infrequent global coordination and a local inner loop determines motion and communication variables. This system provides the lightweight coordination and responsiveness of decentralized systems while avoiding local minima. This allows a team to complete a task in complex environments while maintaining desired end-to-end data rates. The behavior of the system is evaluated in experiments that demonstrate: 1) successful task completion in complex environments; 2) achievement of equal or greater end-to-end data rates as compared to a centralized system; and 3) robustness to unexpected events such as motion restriction.

61 citations


Proceedings ArticleDOI
04 Apr 2017
TL;DR: To build DCatch, the authors design a set of happens-before rules that model a wide variety of communication and concurrency mechanisms in real-world distributed cloud systems, along with tools to help prune false positives and trigger DCbugs.
Abstract: In big data and cloud computing era, reliability of distributed systems is extremely important. Unfortunately, distributed concurrency bugs, referred to as DCbugs, widely exist. They hide in the large state space of distributed cloud systems and manifest non-deterministically depending on the timing of distributed computation and communication. Effective techniques to detect DCbugs are desired. This paper presents a pilot solution, DCatch, in the world of DCbug detection. DCatch predicts DCbugs by analyzing correct execution of distributed systems. To build DCatch, we design a set of happens-before rules that model a wide variety of communication and concurrency mechanisms in real-world distributed cloud systems. We then build runtime tracing and trace analysis tools to effectively identify concurrent conflicting memory accesses in these systems. Finally, we design tools to help prune false positives and trigger DCbugs. We have evaluated DCatch on four representative open-source distributed cloud systems, Cassandra, Hadoop MapReduce, HBase, and ZooKeeper. By monitoring correct execution of seven workloads on these systems, DCatch reports 32 DCbugs, with 20 of them being truly harmful.
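
One standard way to encode happens-before rules of this kind (for example, a message send happens before its receive) is with vector clocks; the sketch below is a generic illustration of that encoding, not DCatch's actual rule set, and all names are invented.

```python
class Node:
    """A process with a vector clock; send/receive establish happens-before edges."""
    def __init__(self, index, num_nodes):
        self.index = index
        self.clock = [0] * num_nodes

    def local_event(self):
        self.clock[self.index] += 1
        return list(self.clock)

    def send(self):
        self.clock[self.index] += 1
        return list(self.clock)          # timestamp carried on the message

    def receive(self, msg_clock):
        self.clock = [max(a, b) for a, b in zip(self.clock, msg_clock)]
        self.clock[self.index] += 1
        return list(self.clock)

def happens_before(c1, c2):
    """c1 happens-before c2 iff c1 <= c2 component-wise and c1 != c2."""
    return all(a <= b for a, b in zip(c1, c2)) and c1 != c2

n0, n1 = Node(0, 2), Node(1, 2)
e_send = n0.send()                 # event on node 0
e_local = n1.local_event()         # concurrent event on node 1
e_recv = n1.receive(e_send)        # receiving orders node 1 after the send
print(happens_before(e_send, e_recv))   # True: send -> receive is ordered
print(happens_before(e_send, e_local) or happens_before(e_local, e_send))
# False: the two events are concurrent, so accesses they make to shared state are candidate DCbugs
```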

46 citations


Journal ArticleDOI
01 Aug 2017
TL;DR: Asynchronous Parallel Table Replication (ATR) employs a novel optimistic lock-free parallel log replay scheme which exploits characteristics of multi-version concurrency control (MVCC) in order to enable real-time reporting by minimizing the propagation delay between the primary and replicas.
Abstract: Modern in-memory database systems are facing the need of efficiently supporting mixed workloads of OLTP and OLAP. A conventional approach to this requirement is to rely on ETL-style, application-driven data replication between two very different OLTP and OLAP systems, sacrificing real-time reporting on operational data. An alternative approach is to run OLTP and OLAP workloads in a single machine, which eventually limits the maximum scalability of OLAP query performance. In order to tackle this challenging problem, we propose a novel database replication architecture called Asynchronous Parallel Table Replication (ATR). ATR supports OLTP workloads in one primary machine, while it supports heavy OLAP workloads in replicas. Here, row-store formats can be used for OLTP transactions at the primary, while column-store formats are used for OLAP analytical queries at the replicas. ATR is designed to support elastic scalability of OLAP query performance while it minimizes the overhead for transaction processing at the primary and minimizes CPU consumption for replayed transactions at the replicas. ATR employs a novel optimistic lock-free parallel log replay scheme which exploits characteristics of multi-version concurrency control (MVCC) in order to enable real-time reporting by minimizing the propagation delay between the primary and replicas. Through extensive experiments with a concrete implementation available in a commercial database system, we demonstrate that ATR achieves sub-second visibility delay even for update-intensive workloads, providing scalable OLAP performance without notable overhead to the primary.
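
The lock-free parallel replay idea can be sketched as follows: replayers apply log records for different records in parallel, and a record's log entry is applied only once the version it was based on is already installed at the replica, so no ordering is needed across different records. This is an illustrative simplification based on the abstract, not ATR's replay algorithm.

```python
import random

def parallel_style_replay(log_entries, replica):
    """Apply log entries that may arrive out of order: an entry is applied only
    when the replica already holds the version the entry was based on."""
    pending = list(log_entries)
    random.shuffle(pending)                 # simulate out-of-order arrival across replayers
    while pending:
        still_pending = []
        for key, prev_ver, new_ver, value in pending:
            if replica.get(key, (0, None))[0] == prev_ver:
                replica[key] = (new_ver, value)            # per-record ordering is enough
            else:
                still_pending.append((key, prev_ver, new_ver, value))  # predecessor not applied yet
        pending = still_pending
    return replica

# Two independent keys, two versions each; cross-key replay order does not matter.
log = [("x", 0, 1, "x1"), ("x", 1, 2, "x2"), ("y", 0, 1, "y1"), ("y", 1, 2, "y2")]
print(parallel_style_replay(log, {}))   # {'x': (2, 'x2'), 'y': (2, 'y2')}
```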

31 citations


Journal ArticleDOI
01 Oct 2017
TL;DR: An extensive experimental study is presented to understand the impact of each system architecture on overall scalability, the interaction between system architecture and concurrency control protocols, and the pros and cons of new architectures that have been proposed recently to explicitly deal with high-contention workloads.
Abstract: Main-memory OLTP engines are being increasingly deployed on multicore servers that provide abundant thread-level parallelism. However, recent research has shown that even the state-of-the-art OLTP engines are unable to exploit available parallelism for high contention workloads. While previous studies have shown the lack of scalability of all popular concurrency control protocols, they consider only one system architecture---a non-partitioned, shared everything one where transactions can be scheduled to run on any core and can access any data or metadata stored in shared memory. In this paper, we perform a thorough analysis of the impact of other architectural alternatives (Data-oriented transaction execution, Partitioned Serial Execution, and Delegation) on scalability under high contention scenarios. In doing so, we present Trireme, a main-memory OLTP engine testbed that implements four system architectures and several popular concurrency control protocols in a single code base. Using Trireme, we present an extensive experimental study to understand i) the impact of each system architecture on overall scalability, ii) the interaction between system architecture and concurrency control protocols, and iii) the pros and cons of new architectures that have been proposed recently to explicitly deal with high-contention workloads.

30 citations


Proceedings ArticleDOI
03 Apr 2017
TL;DR: This paper uses a recently proposed isolation level, called Non-Monotonic Snapshot Isolation, to achieve ACID transactions with low latency, and presents Blotter, a geo-replicated system that leverages these semantics in the design of a new concurrency control protocol that leaves a small amount of local state during reads to make commits more efficient.
Abstract: Most geo-replicated storage systems use weak consistency to avoid the performance penalty of coordinating replicas in different data centers. This departure from strong semantics poses problems to application programmers, who need to address the anomalies enabled by weak consistency. In this paper we use a recently proposed isolation level, called Non-Monotonic Snapshot Isolation, to achieve ACID transactions with low latency. To this end, we present Blotter, a geo-replicated system that leverages these semantics in the design of a new concurrency control protocol that leaves a small amount of local state during reads to make commits more efficient, which is combined with a configuration of Paxos that is tailored for good performance in wide area settings. Read operations always run on the local data center, and update transactions complete in a small number of message steps to a subset of the replicas. We implemented Blotter as an extension to Cassandra. Our experimental evaluation shows that Blotter has a small overhead at the data center scale, and performs better across data centers when compared with our implementations of the core Spanner protocol and of Snapshot Isolation on the same codebase.

29 citations


Proceedings ArticleDOI
09 May 2017
TL;DR: It is argued that the real issues that will have the most impact are not easily solved by more "clever" algorithms; instead, in many cases they can only be solved by hardware improvements and artificial intelligence.
Abstract: Most of the academic papers on concurrency control published in the last five years have assumed the following two design decisions: (1) applications execute transactions with serializable isolation and (2) applications execute most (if not all) of their transactions using stored procedures. But results from a recent survey of database administrators indicate that these assumptions are not realistic. This survey includes both legacy deployments where the cost of changing the application to use either serializable isolation or stored procedures is not feasible, as well as new "greenfield" projects that are not encumbered by prior constraints. As such, the research produced by our community is not helping people with their real-world systems and thus is essentially irrelevant. I know this because I am guilty of writing these papers too. In this talk/denouncement, I will descend from my ivory tower and argue that we need to rethink our agenda for concurrency control research. Recent trends focus on asking the wrong questions and solving the wrong problems. I contend that the real issues that will have the most impact are not easily solved by more "clever" algorithms. Instead, in many cases, they can only be solved by hardware improvements and artificial intelligence.

Proceedings ArticleDOI
09 May 2017
TL;DR: This paper proposes a novel approach for conflict resolution in MVCC for in-memory databases that maximizes the reuse of the computations done in the initial execution round, and increases the transaction processing throughput.
Abstract: The optimistic variants of Multi-Version Concurrency Control (MVCC) avoid blocking concurrent transactions at the cost of having a validation phase. Upon failure in the validation phase, the transaction is usually aborted and restarted from scratch. The "abort and restart" approach becomes a performance bottleneck for use cases with high contention objects or long running transactions. In addition, restarting from scratch creates a negative feedback loop in the system, because the system incurs additional overhead that may create even more conflicts. In this paper, we propose a novel approach for conflict resolution in MVCC for in-memory databases. This low overhead approach summarizes the transaction programs in the form of a dependency graph. The dependency graph also contains the constructs used in the validation phase of the MVCC algorithm. Then, when encountering conflicts among transactions, our mechanism quickly detects the conflict locations in the program and partially re-executes the conflicting transactions. This approach maximizes the reuse of the computations done in the initial execution round, and increases the transaction processing throughput.
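
A hedged sketch of the partial re-execution idea: the transaction program is represented as a dependency graph of pieces, and when validation detects a conflict on a particular read, only the piece that made that read and the pieces downstream of it are re-executed while earlier results are reused. The graph structure and names here are illustrative, not the paper's representation.

```python
class Piece:
    """A unit of a transaction program with explicit data dependencies."""
    def __init__(self, name, reads, compute, deps=()):
        self.name = name
        self.reads = set(reads)     # database keys this piece reads
        self.compute = compute      # fn(db, upstream_results) -> result
        self.deps = list(deps)      # upstream pieces whose results it consumes
        self.result = None

def execute(pieces, db):
    for p in pieces:                # pieces are listed in topological order
        p.result = p.compute(db, {d.name: d.result for d in p.deps})

def reexecute_conflicting(pieces, db, dirty_keys):
    """Re-run only pieces that read a changed key, plus everything downstream of them."""
    tainted = set()
    for p in pieces:
        if p.reads & dirty_keys or any(d in tainted for d in p.deps):
            p.result = p.compute(db, {d.name: d.result for d in p.deps})
            tainted.add(p)
    return {p.name: p.result for p in pieces}

db = {"price": 10, "qty": 3}
p1 = Piece("lookup_price", ["price"], lambda db, up: db["price"])
p2 = Piece("lookup_qty",   ["qty"],   lambda db, up: db["qty"])
p3 = Piece("total", [], lambda db, up: up["lookup_price"] * up["lookup_qty"], deps=(p1, p2))
pieces = [p1, p2, p3]
execute(pieces, db)

db["price"] = 12                    # validation detected a conflicting write on "price"
print(reexecute_conflicting(pieces, db, {"price"}))
# lookup_qty is reused; lookup_price and total are recomputed -> total becomes 36
```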

Journal ArticleDOI
TL;DR: A novel prototyping technique for concurrent control systems implemented in field programmable gate array (FPGA) devices is proposed in the paper, which allows for dynamic modification of the implemented system.
Abstract: A novel prototyping technique for concurrent control systems implemented in field programmable gate array (FPGA) devices is proposed in the paper. The method allows for dynamic modification of the implemented system. It means that the functionality of a part of the controller can be changed, while the rest of the system is still running. The approach applies to unified modeling language state machine diagrams as a specification of the system. Contrary to other methods, the presented concept requires neither major changes to the design, nor the application of external, specialized tools. The proposed idea has been experimentally verified with the use of Xilinx FPGAs.

Journal ArticleDOI
TL;DR: This paper presents an empirical study of the differences and similarities between concurrency bugs and other bugs, as well as the differences among various concurrency bug types, in terms of their severity, fixing time, and reproducibility.
Abstract: Concurrent programming puts demands on software debugging and testing, as concurrent software may exhibit problems not present in sequential software, e.g., deadlocks and race conditions. In aiming to increase efficiency and effectiveness of debugging and bug-fixing for concurrent software, a deep understanding of concurrency bugs, their frequency, and their fixing times would be helpful. Similarly, to design effective tools and techniques for testing and debugging concurrent software, understanding the differences between non-concurrency and concurrency bugs in real-world software would be useful. This paper presents an empirical study focusing on understanding the differences and similarities between concurrency bugs and other bugs, as well as the differences among various concurrency bug types in terms of their severity, fixing time, and reproducibility. Our basis is a comprehensive analysis of bug reports covering several generations of five open source software projects. The analysis involves a total of 11,860 bug reports from the last decade, including 351 reports related to concurrency bugs. We found that concurrency bugs are different from other bugs in terms of their fixing time and severity while they are similar in terms of reproducibility. Our findings shed light on concurrency bugs and could thereby influence future design and development of concurrent software, their debugging and testing, as well as related tools.

Proceedings Article
01 Jan 2017
TL;DR: Adaptive concurrency control (ACC) is presented, which dynamically clusters data and chooses the optimal concurrency control protocol for each cluster, addressing three key challenges: how to cluster data to minimize cross-cluster access while maintaining load balancing, how to model workloads and perform protocol selection accordingly, and how to support mixed concurrency control protocols running simultaneously.
Abstract: Use of transactional multicore main-memory databases is growing due to dramatic increases in memory size and CPU cores available for a single machine. To leverage these resources, recent concurrency control protocols have been proposed for main-memory databases, but are largely optimized for specific workloads. Due to shifting and unknown access patterns, workloads may change and one specific algorithm cannot dynamically fit all varied workloads. Thus, it is desirable to choose the right concurrency control protocol for a given workload. To address this issue, we present adaptive concurrency control (ACC), which dynamically clusters data and chooses the optimal concurrency control protocol for each cluster. ACC addresses three key challenges: i) how to cluster data to minimize cross-cluster access and maintain load-balancing, ii) how to model workloads and perform protocol selection accordingly, and iii) how to support mixed concurrency control protocols running simultaneously. In this paper, we outline these challenges and present preliminary results.
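
A minimal sketch of the "mixed protocols" direction: records are grouped into clusters, each cluster is tagged with the concurrency control protocol chosen for its observed workload, and a router dispatches each access to that cluster's protocol. The selection rule below is a stand-in placeholder, not ACC's workload model, and all names are invented.

```python
class TwoPhaseLocking:
    name = "2PL"

class OptimisticCC:
    name = "OCC"

def choose_protocol(stats):
    """Toy selection rule: locking for high-conflict clusters, OCC otherwise.
    (ACC would use a learned, workload-driven model instead.)"""
    return TwoPhaseLocking() if stats["conflict_rate"] > 0.1 else OptimisticCC()

class AdaptiveRouter:
    """Maps each key to its cluster and each cluster to its chosen protocol."""
    def __init__(self, cluster_of_key, cluster_stats):
        self.cluster_of_key = cluster_of_key
        self.protocols = {c: choose_protocol(s) for c, s in cluster_stats.items()}

    def protocol_for(self, key):
        return self.protocols[self.cluster_of_key[key]]

router = AdaptiveRouter(
    cluster_of_key={"hot_counter": "hot", "order:42": "cold"},
    cluster_stats={"hot": {"conflict_rate": 0.4}, "cold": {"conflict_rate": 0.01}},
)
print(router.protocol_for("hot_counter").name)  # 2PL
print(router.protocol_for("order:42").name)     # OCC
```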

Journal ArticleDOI
01 Aug 2017
TL;DR: The serial safety net (SSN) as discussed by the authors is a serializability-enforcing certifier which can be applied on top of various CC schemes that offer higher performance but admit anomalies, such as snapshot isolation and read committed.
Abstract: Concurrency control (CC) algorithms must trade off strictness for performance. In particular, serializable CC schemes generally pay higher cost to prevent anomalies, both in runtime overhead such as the maintenance of lock tables and in efforts wasted by aborting transactions. We propose the serial safety net (SSN), a serializability-enforcing certifier which can be applied on top of various CC schemes that offer higher performance but admit anomalies, such as snapshot isolation and read committed. The underlying CC mechanism retains control of scheduling and transactional accesses, while SSN tracks the resulting dependencies. At commit time, SSN performs a validation test by examining only direct dependencies of the committing transaction to determine whether it can commit safely or must abort to avoid a potential dependency cycle. SSN performs robustly for a variety of workloads. It maintains the characteristics of the underlying CC without biasing toward a certain type of transactions, though the underlying CC scheme might. Besides traditional OLTP workloads, SSN also efficiently handles heterogeneous workloads which include a significant portion of long, read-mostly transactions. SSN can avoid tracking the vast majority of reads (thus reducing the overhead of serializability certification) and still produce serializable executions with little overhead. The dependency tracking and validation tests can be done efficiently, fully parallel and latch-free, for multi-version systems on modern hardware with substantial core count and large main memory. We demonstrate the efficiency, accuracy and robustness of SSN using extensive simulations and an implementation that overlays snapshot isolation in ERMIA, a memory-optimized OLTP engine that supports multiple CC schemes. Evaluation results confirm that SSN is a promising approach to serializability with robust performance and low overhead for various workloads.
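
The flavor of the commit-time test can be sketched as follows: each transaction tracks the latest commit stamp among its direct predecessors (transactions it depends on) and the earliest stamp among its direct successors (transactions that depend on it), and it must abort if the predecessor bound reaches the successor bound, since that indicates a possible dependency cycle. This is a simplified reading of the abstract, not the precise SSN definitions; all names are illustrative.

```python
import math

class Txn:
    def __init__(self, name):
        self.name = name
        self.cstamp = None              # commit timestamp, assigned at commit time
        self.pstamp = 0                 # latest commit stamp among direct predecessors
        self.sstamp = math.inf          # earliest stamp among direct successors

    def observe_predecessor(self, other):
        # e.g. this transaction read a version that `other` created
        if other.cstamp is not None:
            self.pstamp = max(self.pstamp, other.cstamp)

    def observe_successor(self, other):
        # e.g. `other` overwrote a version this transaction read
        if other.cstamp is not None:
            self.sstamp = min(self.sstamp, other.cstamp)

    def try_commit(self, clock):
        self.cstamp = clock
        # Exclusion test (simplified): a predecessor committing at or after a
        # successor would close a dependency cycle, so the transaction must abort.
        return self.pstamp < min(self.sstamp, self.cstamp)

t_old, t_new, t = Txn("old"), Txn("new"), Txn("reader")
t_old.cstamp = 5
t_new.cstamp = 9
t.observe_predecessor(t_old)    # t read something t_old wrote
t.observe_successor(t_new)      # t_new overwrote something t read
print(t.try_commit(clock=10))   # True: pstamp=5 < min(sstamp=9, cstamp=10), so commit is safe
```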

Proceedings ArticleDOI
09 May 2017
TL;DR: Tebaldi partitions conflicts at a fine granularity and matches them to specialized CCs within a hierarchical framework that is modular, extensible, and able to support a wide variety of concurrency control techniques, from single-version to multiversion and from lock-based to timestamp-based.
Abstract: This paper presents Tebaldi, a distributed key-value store that explores new ways to harness the performance opportunity of combining different specialized concurrency control mechanisms (CCs) within the same database. Tebaldi partitions conflicts at a fine granularity and matches them to specialized CCs within a hierarchical framework that is modular, extensible, and able to support a wide variety of concurrency control techniques, from single-version to multiversion and from lock-based to timestamp-based. When running the TPC-C benchmark, Tebaldi yields more than 20× the throughput of the basic two-phase locking protocol, and over 3.7× the throughput of Callas, a recent system that, like Tebaldi, aims to combine different CCs.

Proceedings ArticleDOI
26 Jan 2017
TL;DR: Evaluation using key-value store benchmarks on a 20-core HTM-capable multi-core machine shows that Eunomia leads to 5X-11X speedup under high contention, while incurring small overhead under low contention.
Abstract: While hardware transactional memory (HTM) has recently been adopted to construct efficient concurrent search tree structures, such designs fail to deliver scalable performance under contention. In this paper, we first conduct a detailed analysis on an HTM-based concurrent B+Tree, which uncovers several reasons for excessive HTM aborts induced by both false and true conflicts under contention. Based on the analysis, we advocate Eunomia, a design pattern for search trees which contains several principles to reduce HTM aborts, including splitting HTM regions with version-based concurrency control to reduce HTM working sets, partitioned data layout to reduce false conflicts, proactively detecting and avoiding true conflicts, and adaptive concurrency control. To validate their effectiveness, we apply such designs to construct a scalable concurrent B+Tree using HTM. Evaluation using key-value store benchmarks on a 20-core HTM-capable multi-core machine shows that Eunomia leads to 5X-11X speedup under high contention, while incurring small overhead under low contention.

Proceedings ArticleDOI
01 Jan 2017
TL;DR: This paper presents the first formal verification of a pessimistic software TM algorithm, namely, an algorithm proposed by Matveev and Shavit, and proves that this pessimistic TM is a refinement of an intermediate opaque I/O-automaton, known as TMS2.
Abstract: Transactional Memory (TM) is a high-level programming abstraction for concurrency control that provides programmers with the illusion of atomically executing blocks of code, called transactions. TMs come in two categories, optimistic and pessimistic, where in the latter transactions never abort. While this simplifies the programming model, high-performing pessimistic TMs can be complex. In this paper, we present the first formal verification of a pessimistic software TM algorithm, namely, an algorithm proposed by Matveev and Shavit. The correctness criterion used is opacity, formalising the transactional atomicity guarantees. We prove that this pessimistic TM is a refinement of an intermediate opaque I/O-automaton, known as TMS2. To this end, we develop a rely-guarantee approach for reducing the complexity of the proof. Proofs are mechanised in the interactive prover Isabelle.

Journal ArticleDOI
23 Apr 2017-Symmetry
TL;DR: In order to generate a valid, unique, and symmetric queue among collaborative sites, a set of correlated mechanisms is presented in this paper, in which all Co-CAD sites maintain symmetric and consistent operating procedures.
Abstract: One basic issue with collaborative computer aided design (Co-CAD) is how to maintain valid and consistent modeling results across all design sites. Moreover, modeling history is important in parametric CAD modeling. Therefore, different from a typical co-editing approach, this paper proposes a novel method for Co-CAD synchronization, in which all Co-CAD sites maintain symmetric and consistent operating procedures. Consequently, the consistency of both modeling results and history can be achieved. In order to generate a valid, unique, and symmetric queue among collaborative sites, a set of correlated mechanisms is presented in this paper. Firstly, the causal relationship of operations is maintained. Secondly, the operation queue is reconstructed for partial concurrency operation, and the concurrent operation can be retrieved. Thirdly, a symmetric, concurrent operation control strategy is proposed to determine the order of operations and resolve possible conflicts. Compared with existing Co-CAD consistency methods, the proposed method is convenient and flexible in supporting collaborative design. The experiment performed based on the collaborative modeling procedure demonstrates the correctness and applicability of this work.

Patent
10 May 2017
TL;DR: A high-concurrency data transmission method based on RDMA (Remote Direct Memory Access) is proposed. A graded buffer area is established before a client accesses the data transmission system; during remote reads and writes, the client actively transfers data among the user buffer area, the graded buffer area, and a remote memory area; and the server sets a lock field at the head of each independent data block of the remote memory area for concurrency control, so that when multiple clients read/write data concurrently, control is achieved through a distributed lock protocol in which the server locks locally and the clients unlock remotely.
Abstract: The invention discloses a high-concurrency data transmission method based on RDMA (Remote Direct Memory Access). The method comprises the steps of establishing a graded buffer area before a client accesses a data transmission system; actively carrying out data transfer among a user buffer area, the graded buffer area and a remote memory area by the client when remote data read/write is carried out; and setting a lock field at the head part of each independent data block of the remote memory area by a server, wherein the lock field is used for concurrency control: when a plurality of clients concurrently read/write data, concurrency control is carried out through a distributed lock protocol in which the server locks locally and the clients unlock remotely. The method has the advantages that data replication is reduced when a file is read/written, the processing pressure on the server is reduced, and efficient concurrency control is provided.

Journal ArticleDOI
TL;DR: The implementation reveals that HiperTM guarantees 0% out-of-order optimistic deliveries and performance up to 3.5× better than the atomic-broadcast-based competitor (PaxosSTM) using the standard configuration of the TPC-C benchmark.

Proceedings ArticleDOI
24 Sep 2017
TL;DR: This paper proposes a simpler and leaner protocol for serializable read-only write-only transactions, which uses only one round trip to commit a transaction in the absence of failures irrespective of contention, and integrates this protocol into ALOHA-KV, a scalable distributed key-value store for read- only write- only transactions.
Abstract: There is a trend in recent database research to pursue coordination avoidance and weaker transaction isolation under a long-standing assumption: concurrent serializable transactions under read-write or write-write conflicts require costly synchronization, and thus may incur a steep price in terms of performance. In particular, distributed transactions, which access multiple data items atomically, are considered inherently costly. They require concurrency control for transaction isolation since both read-write and write-write conflicts are possible, and they rely on distributed commitment protocols to ensure atomicity in the presence of failures. This paper presents serializable read-only and write-only distributed transactions as a counterexample to show that concurrent transactions can be processed in parallel with low-overhead despite conflicts. Inspired by the slotted ALOHA network protocol, we propose a simpler and leaner protocol for serializable read-only write-only transactions, which uses only one round trip to commit a transaction in the absence of failures irrespective of contention. Our design is centered around an epoch-based concurrency control (ECC) mechanism that minimizes synchronization conflicts and uses a small number of additional messages whose cost is amortized across many transactions. We integrate this protocol into ALOHA-KV, a scalable distributed key-value store for read-only write-only transactions, and demonstrate that the system can process close to 15 million read/write operations per second per server when each transaction batches together thousands of such operations.
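
A hedged sketch of epoch-based grouping: write-only transactions arriving during the open epoch are buffered and become visible atomically when the epoch closes, while read-only transactions read the snapshot formed by all closed epochs, so reads and writes never interleave within an epoch and per-operation synchronization is avoided. This illustrates the general ECC idea under simplified assumptions, not ALOHA-KV's protocol; names are invented.

```python
class EpochStore:
    """Write-only transactions buffer into the current epoch; read-only
    transactions see only data from epochs that have already closed."""
    def __init__(self):
        self.visible = {}        # state as of the last closed epoch
        self.pending = []        # write-only transactions buffered in the open epoch
        self.epoch = 0

    def write_txn(self, writes):
        # A write-only transaction: just append its write set; no reads, no locks.
        self.pending.append(dict(writes))

    def read_txn(self, keys):
        # A read-only transaction: a consistent snapshot over all closed epochs.
        return {k: self.visible.get(k) for k in keys}

    def advance_epoch(self):
        # Closing the epoch makes every buffered write-only transaction visible atomically.
        for writes in self.pending:
            self.visible.update(writes)
        self.pending.clear()
        self.epoch += 1

store = EpochStore()
store.write_txn({"a": 1, "b": 2})
print(store.read_txn(["a", "b"]))   # {'a': None, 'b': None}: epoch not closed yet
store.advance_epoch()
print(store.read_txn(["a", "b"]))   # {'a': 1, 'b': 2}: the whole transaction is visible
```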

Posted Content
TL;DR: AnKerDB as discussed by the authors is a heterogeneous transaction processing system that outsources OLAP transactions to run on separate (virtual) snapshots while OLTP transactions run on the most recent representation of the database.
Abstract: Efficient transactional management is a delicate task. As systems face transactions of inherently different types, ranging from point updates to long running analytical computations, it is hard to satisfy their individual requirements with a single processing component. Unfortunately, most systems nowadays rely on such a single component that implements its parallelism using multi-version concurrency control (MVCC). While MVCC parallelizes short-running OLTP transactions very well, it struggles in the presence of mixed workloads containing long-running scan-centric OLAP queries, as scans have to work their way through large amounts of versioned data. To overcome this problem, we propose a system which reintroduces the concept of heterogeneous transaction processing: OLAP transactions are outsourced to run on separate (virtual) snapshots while OLTP transactions run on the most recent representation of the database. Inside both components, MVCC ensures a high degree of concurrency. The biggest challenge of such a heterogeneous approach is to generate the snapshots at a high frequency. Previous approaches heavily suffered from the tremendous cost of snapshot creation. In our system, we overcome the restrictions of the OS by introducing a custom system call vm_snapshot that is hand-tailored to our precise needs: it allows fine-granular snapshot creation at very high frequencies, rendering the snapshot creation phase orders of magnitude faster than state-of-the-art approaches. Our experimental evaluation on a heterogeneous workload based on TPC-H transactions and handcrafted OLTP transactions shows that our system enables significantly higher analytical transaction throughputs on mixed workloads than homogeneous approaches. In this sense, we introduce a system that accelerates Analytical processing by introducing custom Kernel functionalities: AnKerDB.
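
The "separate (virtual) snapshots" idea goes back to fork-based copy-on-write snapshotting, the conventional and slower mechanism that the proposed vm_snapshot call is designed to replace. The sketch below illustrates that baseline on POSIX systems (it relies on os.fork); it is not the paper's custom system call, and the function names are illustrative.

```python
import os

def run_olap_on_snapshot(database, olap_query):
    """Fork the process: the child sees a copy-on-write snapshot of the database
    and runs the long OLAP query, while the parent keeps applying OLTP updates."""
    pid = os.fork()
    if pid == 0:                              # child: frozen virtual snapshot
        result = olap_query(database)
        print("OLAP result on snapshot:", result)
        os._exit(0)
    # parent: OLTP continues against the live data, unaffected by the long scan
    database["balance"] += 100
    os.waitpid(pid, 0)

db = {"balance": 1000}
run_olap_on_snapshot(db, lambda snapshot: sum(snapshot.values()))  # sees 1000
print("OLTP view after update:", db)                               # {'balance': 1100}
```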

Journal ArticleDOI
TL;DR: This article presents Transactional Correctness tool for Abstract Data Types (TxC-ADT), the first tool that can check the correctness of transactional data structures and presents a technique for defining correctness as a happens-before relation, an essential aspect for checking correctness of transactions that synchronize only for high-level semantic conflicts.
Abstract: Transactional memory simplifies multiprocessor programming by providing the guarantee that a sequential block of code in the form of a transaction will exhibit atomicity and isolation. Transactional data structures offer the same guarantee to concurrent data structures by enabling the atomic execution of a composition of operations. The concurrency control of transactional memory systems preserves atomicity and isolation by detecting read/write conflicts among multiple concurrent transactions. State-of-the-art transactional data structures improve on this concurrency control protocol by providing explicit transaction-level synchronization for only non-commutative operations. Since read/write conflicts are handled by thread-level concurrency control, the correctness of transactional data structures cannot be evaluated according to the read/write histories. This presents a challenge for existing correctness verification techniques for transactional memory, because correctness is determined according to the transitions taken by the transactions in the presence of read/write conflicts. In this article, we present Transactional Correctness tool for Abstract Data Types (TxC-ADT), the first tool that can check the correctness of transactional data structures. TxC-ADT elevates the standard definitions of transactional correctness to be in terms of an abstract data type, an essential aspect for checking correctness of transactions that synchronize only for high-level semantic conflicts. To accommodate a diverse assortment of transactional correctness conditions, we present a technique for defining correctness as a happens-before relation. Defining a correctness condition in this manner enables an automated approach in which correctness is evaluated by generating and analyzing a transactional happens-before graph during model checking. A transactional happens-before graph is maintained on a per-thread basis, making our approach applicable to transactional correctness conditions that do not enforce a total order on a transactional execution. We demonstrate the practical applications of TxC-ADT by checking Lock Free Transactional Transformation and Transactional Data Structure Libraries for serializability, strict serializability, opacity, and causal consistency.

Patent
10 May 2017
TL;DR: Disclosed as discussed by the authors is a programming model for the definition of services to be operated on large sets of data with numerous responsibilities, the programming model comprising program units in a tree topology for high performance and implicit concurrency control, where each program unit definition comprises responsibilities defined in behaviors and configurations.
Abstract: Disclosed is a programming model utilized for the definition of services to be operated on large sets of data with numerous responsibilities, the programming model comprising program units in a tree topology for high performance and implicit concurrency control, where each program unit definition comprises responsibilities defined in behaviors and configurations. A runtime environment may be utilized to provide implicit concurrency, parallelization, and concurrency control for operations executed on program unit instances.

Journal ArticleDOI
27 Dec 2017
TL;DR: In this paper, the authors present a program logic that enables compositional reasoning about the behavior of concurrently executing weakly-isolated transactions, and they also describe an inference procedure based on this foundation that ascertains the weakest isolation level that still guarantees the safety of high-level consistency assertions associated with such transactions.
Abstract: Serializability is a well-understood correctness criterion that simplifies reasoning about the behavior of concurrent transactions by ensuring they are isolated from each other while they execute. However, enforcing serializable isolation comes at a steep cost in performance because it necessarily restricts opportunities to exploit concurrency even when such opportunities would not violate application-specific invariants. As a result, database systems in practice support, and often encourage, developers to implement transactions using weaker alternatives. These alternatives break the strong isolation guarantees offered by serializable transactions to permit greater concurrency. Unfortunately, the semantics of weak isolation is poorly understood, and usually explained only informally in terms of low-level implementation artifacts. Consequently, verifying high-level correctness properties in such environments remains a challenging problem. To address this issue, we present a novel program logic that enables compositional reasoning about the behavior of concurrently executing weakly-isolated transactions. Recognizing that the proof burden necessary to use this logic may dissuade application developers, we also describe an inference procedure based on this foundation that ascertains the weakest isolation level that still guarantees the safety of high-level consistency assertions associated with such transactions. The key to effective inference is the observation that weakly-isolated transactions can be viewed as functional (monadic) computations over an abstract database state, allowing us to treat their operations as state transformers over the database. This interpretation enables automated verification using off-the-shelf SMT solvers. Our development is parametric over a transaction’s specific isolation semantics, allowing it to be applicable over a range of concurrency control mechanisms. Case studies and experiments on real-world applications (written in an embedded DSL in OCaml) demonstrate the utility of our approach, and provide strong evidence that automated verification of weakly-isolated transactions can be placed on the same formal footing as their strongly-isolated serializable counterparts.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: A new priority heuristic method, based on a deadline computed by considering write-only operations, is proposed for wireless environments and shows better results than earlier priority heuristics.
Abstract: Priority scheduling among running transactions is one of the most important issues in the design of mobile distributed real-time database systems (MDRTDBS). In MDRTDBS, several priority heuristics are used with different concurrency control methods to perform correct transaction scheduling and minimize the transaction abort rate. Priority heuristic approaches deal with the problem of assigning priorities among transactions so that the concurrency control (CC) mechanism can meet the typical time constraints. In recent years, the performance of CC protocols for distributed real-time database systems (DRTDBS) has been examined using different priority heuristic methods. However, very few approaches have been proposed on priority heuristics for wireless environments. Hence, a new priority heuristic method based on a deadline computed by considering write-only operations is proposed for wireless environments. This improved priority heuristic approach shows better results than earlier priority heuristics. In recent years, researchers have classified transactions into two types, called read-only transactions (ROT) and update transactions. The new priority heuristic for mobile environments considers ROT and update transactions separately. Further, a study has also been done to examine the impact of these priority heuristics compared with number-of-locks and mixed-method approaches.
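
As an illustration of deadline-driven priority assignment, the sketch below derives a transaction's deadline from an estimate of its write-only work and dispatches the earliest deadline first. The cost model and constants are placeholders invented for the example, not the heuristic proposed in the paper.

```python
import heapq

def deadline(arrival_time, write_ops, per_write_cost=2.0, slack_factor=3.0):
    """Toy deadline: arrival time plus slack proportional to estimated write work.
    (The paper computes deadlines from write-only operations; the constants here
    are placeholders.)"""
    return arrival_time + slack_factor * per_write_cost * write_ops

class PriorityScheduler:
    """Earliest-deadline-first dispatch of transactions."""
    def __init__(self):
        self.queue = []

    def submit(self, txn_id, arrival_time, write_ops):
        heapq.heappush(self.queue, (deadline(arrival_time, write_ops), txn_id))

    def next_transaction(self):
        return heapq.heappop(self.queue)[1]

sched = PriorityScheduler()
sched.submit("update_heavy", arrival_time=0, write_ops=10)   # deadline 60
sched.submit("small_update", arrival_time=5, write_ops=1)    # deadline 11
print(sched.next_transaction())   # small_update is dispatched first
```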

Journal ArticleDOI
TL;DR: A situationally adaptive scheduler (SAS) is proposed that learns architectural choices offline using synthetically generated graphs and achieves performance comparable to the optimal setup that optimizes both algorithmic and architectural choices.
Abstract: Situational dynamic changes in graph analytic algorithm implementations give rise to efficiency challenges in concurrent hardware, such as GPUs and large-scale multicores. These performance variations stem from input dependence, such as the density and degree of the graph being processed. Consequently, concurrency control becomes challenging, because the complex data-dependent behavior in these workloads exhibits a range of plausible algorithmic and architectural choices. This article addresses the question of how to efficiently harness the multidimensional search space of such choices for graph analytic workloads in a real-time execution environment. A key insight is that architectural choices are sufficient to yield a concurrency control setting that is comparable to the optimal setup that optimizes both algorithmic and architectural choices. The authors propose a situationally adaptive scheduler (SAS) that learns the architectural choices offline using synthetically generated graphs. SAS-assisted execution in a real-time setup provides geometric performance gains of 40 percent for a large-scale GPU (Nvidia GTX-970), 35 percent for a smaller GPU (Nvidia GTX-750Ti), and 30 percent for a large-scale multicore (Intel Xeon Phi).

Proceedings ArticleDOI
20 Sep 2017
TL;DR: This paper designs a set of experiments that allow them to shed lights on the internal mechanisms used in TSX to manage conflicts among transactions and to track their readsets and writesets, and builds an analytical model of TSX focused on capturing the impact on performance of two key mechanisms.
Abstract: This paper investigates the problem of deriving a white box performance model of Hardware Transactional Memory (HTM) systems. The proposed model targets TSX, a popular implementation of HTM integrated in Intel processors starting with the Haswell family in 2013.An inherent difficulty with building white-box models of commercially available HTM systems is that their internals are either vaguely documented or undisclosed by their manufacturers. We tackle this challenge by designing a set of experiments that allow us to shed lights on the internal mechanisms used in TSX to manage conflicts among transactions and to track their readsets and writesets.We exploit the information inferred from this experimental study to build an analytical model of TSX focused on capturing the impact on performance of two key mechanisms: the concurrency control scheme and the management of transactional meta-data in the processor's caches. We validate the proposed model by means of an extensive experimental study encompassing a broad range of workloads executed on a real system.