
Showing papers on "Concurrency control" published in 2014


Journal ArticleDOI
01 Nov 2014
TL;DR: In this article, the authors evaluate concurrency control for on-line transaction processing (OLTP) workloads on many-core chips and show that the complexity of coordinating competing accesses to data will likely diminish the gains from increased core counts.
Abstract: Computer architectures are moving towards an era dominated by many-core machines with dozens or even hundreds of cores on a single chip. This unprecedented level of on-chip parallelism introduces a new dimension to scalability that current database management systems (DBMSs) were not designed for. In particular, as the number of cores increases, the problem of concurrency control becomes extremely challenging. With hundreds of threads running in parallel, the complexity of coordinating competing accesses to data will likely diminish the gains from increased core counts. To better understand just how unprepared current DBMSs are for future CPU architectures, we performed an evaluation of concurrency control for on-line transaction processing (OLTP) workloads on many-core chips. We implemented seven concurrency control algorithms on a main-memory DBMS and, using computer simulations, scaled our system to 1024 cores. Our analysis shows that all algorithms fail to scale to this magnitude, but for different reasons. In each case, we identify fundamental bottlenecks that are independent of the particular database implementation and argue that even state-of-the-art DBMSs suffer from these limitations. We conclude that rather than pursuing incremental solutions, many-core chips may require a completely redesigned DBMS architecture that is built from the ground up and tightly coupled with the hardware.
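To make the coordination bottleneck concrete, here is a minimal, hypothetical sketch (ours, not code from the paper) of a centralized timestamp allocator, the kind of single shared structure that serializes otherwise independent transactions as core counts grow:

```python
import threading

# Hypothetical sketch: a centralized timestamp allocator. Every transaction,
# no matter how disjoint its data accesses, must pass through this one
# critical section, so it becomes a serialization point as cores multiply.
class TimestampAllocator:
    def __init__(self):
        self._lock = threading.Lock()
        self._next = 0

    def allocate(self) -> int:
        # At hundreds or thousands of cores, this single lock caps
        # throughput regardless of how well the rest of the DBMS scales.
        with self._lock:
            ts = self._next
            self._next += 1
            return ts
```

Even if every transaction touches disjoint data, all of them queue on this one structure, which illustrates why the authors argue for rethinking the architecture rather than tuning individual algorithms.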

239 citations


Proceedings ArticleDOI
06 Oct 2014
TL;DR: This paper presents ROCOCO, a novel concurrency control protocol for distributed transactions that outperforms 2PL and OCC by allowing more concurrency; it is evaluated against OCC and 2PL using a scaled TPC-C benchmark.
Abstract: Distributed storage systems run transactions across machines to ensure serializability. Traditional protocols for distributed transactions are based on two-phase locking (2PL) or optimistic concurrency control (OCC). 2PL serializes transactions as soon as they conflict, and OCC resorts to aborts, leaving many opportunities for concurrency on the table. This paper presents ROCOCO, a novel concurrency control protocol for distributed transactions that outperforms 2PL and OCC by allowing more concurrency. ROCOCO executes a transaction as a collection of atomic pieces, each of which commonly involves only a single server. Servers first track dependencies between concurrent transactions without actually executing them. At commit time, a transaction's dependency information is sent to all servers so they can re-order conflicting pieces and execute them in a serializable order. We compare ROCOCO to OCC and 2PL using a scaled TPC-C benchmark. ROCOCO outperforms 2PL and OCC in workloads with varying degrees of contention. When contention is high, ROCOCO's throughput is 130% and 347% higher than that of 2PL and OCC, respectively.
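The following is a hedged Python sketch of the two-phase idea the abstract describes: buffer transaction pieces and record dependencies first, then execute in a dependency-consistent order at commit. Class and method names are ours, not ROCOCO's:

```python
from collections import defaultdict

# Illustrative sketch of deferred execution with dependency tracking.
class Server:
    def __init__(self):
        self.store = {}
        self.buffered = {}              # txn_id -> list of (key, fn) pieces
        self.deps = defaultdict(set)    # txn_id -> txn_ids it conflicts with

    def start_piece(self, txn_id, key, fn):
        # Phase 1: do not execute; record the piece and note conflicts
        # with other buffered transactions that touch the same key.
        for other, pieces in self.buffered.items():
            if other != txn_id and any(k == key for k, _ in pieces):
                self.deps[txn_id].add(other)
        self.buffered.setdefault(txn_id, []).append((key, fn))

    def commit(self, order):
        # Phase 2: execute pieces in a serializable order consistent with
        # the collected dependencies (computed globally in the real protocol).
        for txn_id in order:
            for key, fn in self.buffered.pop(txn_id, []):
                self.store[key] = fn(self.store.get(key))
            self.deps.pop(txn_id, None)
```

Because conflicting pieces are re-ordered rather than blocked or aborted, contended workloads keep making progress, which is the intuition behind the reported throughput gains.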

132 citations


Proceedings ArticleDOI
06 Oct 2014
TL;DR: Doppel, as discussed by the authors, introduces phase reconciliation, a new concurrency control technique that addresses the collapse of multicore main-memory database performance when many transactions contend on the same data.
Abstract: Multicore main-memory database performance can collapse when many transactions contend on the same data. Contending transactions are executed serially--either by locks or by optimistic concurrency control aborts--in order to ensure that they have serializable effects. This leaves many cores idle and performance poor. We introduce a new concurrency control technique, phase reconciliation, that solves this problem for many important workloads. Doppel, our phase reconciliation database, repeatedly cycles through joined, split, and reconciliation phases. Joined phases use traditional concurrency control and allow any transaction to execute. When workload contention causes unnecessary serial execution, Doppel switches to a split phase. There, updates to contended items modify per-core state, and thus proceed in parallel on different cores. Not all transactions can execute in a split phase; for example, all modifications to a contended item must commute. A reconciliation phase merges these per-core states into the global store, producing a complete database ready for joined-phase transactions. A key aspect of this design is determining which items to split, and which operations to allow on split items. Phase reconciliation helps most when there are many updates to a few popular database records. Its throughput is up to 38x higher than conventional concurrency control protocols on microbenchmarks, and up to 3x on a larger application, at the cost of increased latency for some transactions.
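A minimal sketch of the split-phase idea for a commutative counter (names and structure are illustrative, not Doppel's code):

```python
# Split-phase sketch: commutative updates to a contended counter go to
# per-core (here, per-slot) state during a split phase; a reconciliation
# phase merges the per-core deltas back into the global value.
class SplittableCounter:
    def __init__(self, ncores):
        self.global_value = 0
        self.per_core = [0] * ncores
        self.split = False

    def enter_split(self):
        self.split = True

    def add(self, core_id, delta):
        if self.split:
            self.per_core[core_id] += delta   # no cross-core coordination
        else:
            self.global_value += delta        # joined phase: ordinary CC here

    def reconcile(self):
        # Addition commutes, so per-core deltas can be merged in any order.
        self.global_value += sum(self.per_core)
        self.per_core = [0] * len(self.per_core)
        self.split = False
```

Reading the counter's total would not commute with the pending increments, which is one reason only certain operations are allowed on split items.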

90 citations


Posted Content
TL;DR: Bohm, as mentioned in this paper, is a concurrency control protocol for main-memory multi-versioned database systems that guarantees serializable execution while ensuring that reads never block writes and does not require reads to perform any book-keeping whatsoever.
Abstract: Multi-versioned database systems have the potential to significantly increase the amount of concurrency in transaction processing because they can avoid read-write conflicts. Unfortunately, the increase in concurrency usually comes at the cost of transaction serializability. If a database user requests full serializability, modern multi-versioned systems significantly constrain read-write concurrency among conflicting transactions and employ expensive synchronization patterns in their design. In main-memory multi-core settings, these additional constraints are so burdensome that multi-versioned systems are often significantly outperformed by single-version systems. We propose Bohm, a new concurrency control protocol for main-memory multi-versioned database systems. Bohm guarantees serializable execution while ensuring that reads never block writes. In addition, Bohm does not require reads to perform any book-keeping whatsoever, thereby avoiding the overhead of tracking reads via contended writes to shared memory. This leads to excellent scalability and performance in multi-core settings. Bohm has all the above characteristics without performing validation-based concurrency control. Instead, it is pessimistic, and is therefore not prone to excessive aborts in the presence of contention. An experimental evaluation shows that Bohm performs well in both high contention and low contention settings, and is able to dramatically outperform state-of-the-art multi-versioned systems despite maintaining the full set of serializability guarantees.
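A toy sketch of the multi-version read path this implies, where a reader consults only the record's version list and writes no shared metadata (the data structure is ours, not Bohm's):

```python
import bisect

# Each record keeps (timestamp, value) versions; a reader with snapshot
# timestamp `ts` returns the latest version written before ts, touching
# no shared bookkeeping state at all.
class MVRecord:
    def __init__(self):
        self.versions = []   # sorted list of (write_ts, value)

    def write(self, write_ts, value):
        bisect.insort(self.versions, (write_ts, value))

    def read(self, ts):
        i = bisect.bisect_left(self.versions, (ts,))   # first write_ts >= ts
        return self.versions[i - 1][1] if i > 0 else None  # no bookkeeping
```

Because `read` only inspects the version list, readers never block writers and never perform contended writes to shared memory, matching the properties claimed in the abstract.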

84 citations


Journal ArticleDOI
TL;DR: By utilizing the proposed high-performance concurrent error detection scheme, more reliable and robust hardware implementations for the newly-standardized SHA-3 are realized.
Abstract: The secure hash algorithm (SHA)-3 was selected in 2012 and will be used to provide security to any application which requires hashing, pseudo-random number generation, and integrity checking. This algorithm was selected based on various benchmarks such as security, performance, and complexity. In this paper, in order to provide reliable architectures for this algorithm, an efficient concurrent error detection scheme for the selected SHA-3 algorithm, i.e., Keccak, is proposed. To the best of our knowledge, effective countermeasures for potential reliability issues in the hardware implementations of this algorithm have not been presented to date. In proposing the error detection approach, our aim is to have acceptable complexity and performance overheads while maintaining high error coverage. In this regard, we present a low-complexity recomputing with rotated operands-based scheme which is a step forward toward reducing the hardware overhead of the proposed error detection approach. Moreover, we perform injection-based fault simulations and show that error coverage of close to 100% is achieved. Furthermore, we have designed the proposed scheme and, through ASIC analysis, shown that acceptable complexity and performance overheads are reached. By utilizing the proposed high-performance concurrent error detection scheme, more reliable and robust hardware implementations for the newly standardized SHA-3 are realized.
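Conceptually, recomputing with rotated operands works because bitwise operations of the kind used in Keccak's rounds commute with bit rotation. The actual scheme is a hardware design; the following Python sketch only illustrates the check on a stand-in bitwise step:

```python
# Software analogy of recomputing with rotated operands (RERO): for a
# bitwise function f that commutes with rotation, recompute on rotated
# inputs, un-rotate, and compare; a mismatch flags a transient fault.
W = 64
MASK = (1 << W) - 1

def rotl(x, r): return ((x << r) | (x >> (W - r))) & MASK
def rotr(x, r): return ((x >> r) | (x << (W - r))) & MASK

def xor_step(a, b):          # stand-in for a bitwise round operation
    return a ^ b

def checked_xor_step(a, b, r=1):
    normal = xor_step(a, b)
    # rotl(a) ^ rotl(b) == rotl(a ^ b), so un-rotating recovers the result.
    redundant = rotr(xor_step(rotl(a, r), rotl(b, r)), r)
    if normal != redundant:
        raise RuntimeError("transient fault detected")
    return normal
```

Because the redundant pass processes rotated bits, a fault that is stuck at one bit position affects the two computations differently, which is what makes the comparison an effective detector.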

56 citations


Journal ArticleDOI
TL;DR: The experimental results show that Magiclock is significantly more scalable and efficient than existing dynamic detectors in analyzing and detecting potential deadlocks in execution traces of large-scale multithreaded programs.
Abstract: We present Magiclock, a novel technique for detecting potential deadlocks by analyzing execution traces (containing no deadlock occurrences) of large-scale multithreaded programs. Magiclock iteratively eliminates removable lock dependencies before potential deadlock localization. It divides lock dependencies into thread-specific partitions, consolidates equivalent lock dependencies, and searches over the set of lock dependency chains without the need to examine any duplicated permutations of the same lock dependency chains. We validate Magiclock through a suite of real-world, large-scale multithreaded programs. The experimental results show that Magiclock is significantly more scalable and efficient than existing dynamic detectors in analyzing and detecting potential deadlocks in execution traces of large-scale multithreaded programs.
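To illustrate the underlying idea of lock-dependency analysis (a textbook-style sketch, not Magiclock's optimized algorithm):

```python
from collections import defaultdict

# From a deadlock-free trace, record "held -> acquired" lock edges per
# thread, then search the combined graph for cycles whose edges come from
# distinct threads: each such cycle is a potential deadlock.
def potential_deadlocks(trace):
    # trace: list of (thread_id, locks_already_held, lock_being_acquired)
    edges = defaultdict(set)          # lock -> {(next_lock, thread)}
    for tid, held, acq in trace:
        for h in held:
            edges[h].add((acq, tid))

    cycles = []
    def dfs(start, node, path, threads):
        for nxt, tid in edges.get(node, ()):
            if tid in threads:        # a thread cannot deadlock with itself
                continue
            if nxt == start:
                cycles.append(path + [nxt])
            elif nxt not in path:
                dfs(start, nxt, path + [nxt], threads | {tid})
    for lock in list(edges):
        dfs(lock, lock, [lock], frozenset())
    return cycles

# Classic lock-order reversal: thread 1 takes A then B, thread 2 takes B then A.
print(potential_deadlocks([(1, {"A"}, "B"), (2, {"B"}, "A")]))
# -> [['A', 'B', 'A'], ['B', 'A', 'B']] (the same cycle, found from each lock)
```

Magiclock's contribution is making this kind of search scale: pruning removable dependencies and avoiding duplicated permutations of the same chains, which the naive search above does not attempt.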

49 citations


Journal ArticleDOI
01 Dec 2014
TL;DR: The results of comprehensive performance experiments show that the Incremental approach significantly outperforms any other known method from the literature and requires no a priori knowledge of which nodes of a distributed system are involved in executing a transaction.
Abstract: Modern database systems employ Snapshot Isolation to implement concurrency control and isolation because it promises superior query performance compared to lock-based alternatives. Furthermore, Snapshot Isolation never blocks readers, which is an important property for modern information systems, which have mixed workloads of heavy OLAP queries and short update transactions. This paper revisits the problem of implementing Snapshot Isolation in a distributed database system and makes three important contributions. First, a complete definition of Distributed Snapshot Isolation is given, thereby extending existing definitions from the literature. Based on this definition, a set of criteria is proposed to efficiently implement Snapshot Isolation in a distributed system. Second, the design space of alternative methods to implement Distributed Snapshot Isolation is presented based on this set of criteria. Third, a new approach to implement Distributed Snapshot Isolation is devised; we refer to this approach as Incremental. The results of comprehensive performance experiments with the TPC-C benchmark show that the Incremental approach significantly outperforms any other known method from the literature. Furthermore, the Incremental approach requires no a priori knowledge of which nodes of a distributed system are involved in executing a transaction. Also, the Incremental approach can execute transactions that involve data from a single node only with the same efficiency as a centralized database system. This way, the Incremental approach takes advantage of sharding or other ways to improve data locality. The cost of synchronizing transactions in a distributed system is only paid by transactions that actually involve data from several nodes. All these properties make the Incremental approach more practical than related methods proposed in the literature.
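For readers unfamiliar with Snapshot Isolation's commit rule, a centralized, first-committer-wins sketch follows; the paper's contribution is realizing this efficiently across distributed nodes, which this toy deliberately ignores, and the field names are ours:

```python
# Baseline Snapshot Isolation commit check (first-committer-wins),
# shown centralized to fix ideas.
class SIManager:
    def __init__(self):
        self.clock = 0
        self.commit_log = []   # list of (commit_ts, write_set)

    def begin(self):
        return self.clock      # snapshot: all commits with ts <= begin_ts

    def commit(self, begin_ts, write_set):
        # Abort if any key we wrote was also committed by a transaction
        # that overlapped us, i.e. committed after our snapshot was taken.
        for commit_ts, ws in self.commit_log:
            if commit_ts > begin_ts and ws & write_set:
                return None    # write-write conflict: first committer wins
        self.clock += 1
        self.commit_log.append((self.clock, set(write_set)))
        return self.clock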

49 citations


Book ChapterDOI
04 Jan 2014
TL;DR: This paper presents a time-stamp based multiversion STM system that satisfies opacity and is easy to implement, and formally proves the correctness of the proposed STM system.
Abstract: Software Transactional Memory (STM) systems are a promising alternative for concurrency control in shared memory systems. Multiversion STM systems maintain multiple versions for each t-object. The advantage of storing multiple versions is that it facilitates successful execution of a higher number of read operations than otherwise. Multi-version permissiveness (mv-permissiveness) is a progress condition for multi-version STMs which states that a read-only transaction never aborts. Recently an STM system was proposed that maintains only a single version but is mv-permissive. This raises a natural question: how much concurrency can be achieved by multi-version STMs? We show that fewer transactions are aborted in multi-version STMs than in single-version systems. We also show that any STM system that is permissive w.r.t. opacity must maintain at least as many versions as the number of live transactions. A direct implication of this result is that no single-version STM can be permissive w.r.t. opacity. In this paper we present a time-stamp based multiversion STM system that satisfies opacity and is easy to implement. We formally prove the correctness of the proposed STM system. Although many multi-version STM systems that satisfy opacity have been proposed in the literature, to the best of our knowledge none of them has been formally proved to be opaque. We also present a garbage collection procedure which deletes unwanted versions of transaction objects. We show that with garbage collection the number of versions maintained is bounded by the number of live transactions.
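The garbage-collection bound can be illustrated with a small sketch (ours, not the paper's): for each live transaction, only the newest version it can read must be retained, plus the globally newest version:

```python
# Which versions of a t-object are reclaimable, given the timestamps of
# its versions and the read timestamps of live transactions?
def reclaimable_versions(version_ts, live_ts):
    version_ts = sorted(version_ts)
    keep = {version_ts[-1]}                  # newest version is always kept
    for t in live_ts:
        visible = [v for v in version_ts if v <= t]
        if visible:
            keep.add(visible[-1])            # newest version readable at t
    return [v for v in version_ts if v not in keep]

# Versions written at ts 1, 3, 5, 9 with live readers at ts 4 and 6:
print(reclaimable_versions([1, 3, 5, 9], [4, 6]))  # -> [1]
```

The kept set holds at most one version per live transaction plus the newest one, which mirrors the bound stated in the abstract.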

46 citations


Journal ArticleDOI
01 Aug 2014
TL;DR: A novel kV-Indirection structure is described to enable efficient (parallelizable) optimistic and pessimistic multi-version concurrency control by utilizing the old versions of records to provide direct access to the recent changes of records without the need for temporal indexes.
Abstract: In multi-version databases, updates and deletions of records by transactions require appending a new record to tables rather than performing in-place updates. This mechanism incurs non-negligible performance overhead in the presence of multiple indexes on a table, where changes need to be propagated to all indexes. Additionally, an uncommitted record update will block other active transactions from using the index to fetch the most recently committed values for the updated record. In general, in order to support snapshot isolation and/or multi-version concurrency, either each active transaction is forced to search a database temporary area (e.g., roll-back segments) to fetch old values of desired records, or each transaction is forced to scan the entire table to find the older versions of the record in a multi-version database (in the absence of specialized temporal indexes). In this work, we describe a novel kV-Indirection structure to enable efficient (parallelizable) optimistic and pessimistic multi-version concurrency control by utilizing the old versions of records (at most two versions of each record) to provide direct access to the recent changes of records without the need for temporal indexes. As a result, our technique results in a higher degree of concurrency by reducing the clashes between readers and writers of data and avoiding extended lock delays. We have a working prototype of our concurrency model and kV-Indirection structure in a commercial database and conducted an extensive evaluation to demonstrate the benefits of our multi-version concurrency control, obtaining orders of magnitude speedup over single-version concurrency control.
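A toy rendering of the indirection idea, keeping at most two versions per record (field names are ours, not the paper's):

```python
# One fixed indirection slot per record points at its two most recent
# versions, so readers reach the latest committed value directly, without
# probing temporal indexes or scanning for old versions.
class IndirectionEntry:
    def __init__(self, committed):
        self.committed = committed   # latest committed version
        self.pending = None          # at most one uncommitted version

    def read_committed(self):
        return self.committed        # readers never see pending writes

    def write(self, version):
        self.pending = version       # stage the new version

    def commit(self):
        # Promote the pending version; the old committed one is dropped.
        self.committed, self.pending = self.pending, None
```

Because readers and the single in-flight writer use separate slots, reader-writer clashes shrink, which is the source of the concurrency gain the abstract describes.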

44 citations


Journal ArticleDOI
22 Jan 2014
TL;DR: It is shown that the design and implementation of adaptive indexing rigorously separates index structures from index contents; this relaxes constraints and requirements during adaptive indexing compared to those of traditional index updates.
Abstract: Adaptive indexing initializes and optimizes indexes incrementally, as a side effect of query processing. The goal is to achieve the benefits of indexes while hiding or minimizing the costs of index creation. However, index-optimizing side effects seem to turn read-only queries into update transactions that might, for example, create lock contention. This paper studies concurrency control and recovery in the context of adaptive indexing. We show that the design and implementation of adaptive indexing rigorously separates index structures from index contents; this relaxes constraints and requirements during adaptive indexing compared to those of traditional index updates. Our design adapts to the fact that an adaptive index is refined continuously and exploits any concurrency opportunities in a dynamic way. A detailed experimental analysis demonstrates that (a) adaptive indexing maintains its adaptive properties even when running concurrent queries, (b) adaptive indexing can exploit the opportunity for parallelism due to concurrent queries, (c) the number of concurrency conflicts and any concurrency administration overheads follow an adaptive behavior, decreasing as the workload evolves and adapting to the workload needs.

32 citations


Journal ArticleDOI
01 Jul 2014
TL;DR: The proposed SMARTSPACE, a hybrid multiagent-based distributed platform for efficient semantic service discovery utilizing reactive agents to model services, users' requests, and registry management middleware, is able to achieve fast, scalable, parallel, and concurrent service finding within a systemic environment that can be highly dynamic, asynchronous, and concurrent.
Abstract: Service discovery is an integral issue in the area of service oriented computing (SOC). A centralized platform based service discovery suffers from major drawbacks such as scalability and a single point of failure. A P2P based design incurs high maintenance overhead for a distributed service registry and querying task. In this paper, we have proposed SMARTSPACE—a hybrid multiagent based distributed platform for efficient semantic service discovery. By utilizing reactive agents in modeling services, users' requests, and registry management middleware, the proposed service discovery algorithm, SmartDiscover, is able to achieve fast, scalable, parallel, and concurrent service finding within a systemic environment that can be highly dynamic, asynchronous, and concurrent. We have conducted the SmartDiscover experiments within the JADE 3.7 agent framework on top of both IBM Cloud Cluster and NetLogo simulation environments. The results showed promising positive outcomes in terms of average query response time and the number of message exchanges to maintain the distributed registry. The accuracy of SmartDiscover was measured and compared with the widely accepted benchmark OWL-S MX approach.

Proceedings ArticleDOI
01 Jan 2014
TL;DR: Diff-Index is proposed to support a spectrum of index maintenance schemes to suit different objectives in index consistency and performance, and experiments demonstrate that it offers a range of performance/consistency trade-offs and satisfactory scalability while avoiding global coordination.
Abstract: Log-Structured Merge (LSM) Tree has gained much attention recently because of its superior performance in write-intensive workloads. LSM Tree uses an append-only structure in memory to achieve low write latency; at memory capacity, in-memory data are flushed to other storage media (e.g. disk). Consequently, read access is slower compared with write access. These specific features of LSM, including no in-place updates and asymmetric read/write performance, raise unique challenges in index maintenance for LSM. The structural difference between LSM and B-Tree also prevents mature B-Tree based approaches from being directly applied. To address the issues of index maintenance for LSM, we propose Diff-Index to support a spectrum of index maintenance schemes to suit different objectives in index consistency and performance. The schemes consist of sync-full, sync-insert, async-simple and async-session. Experiments on our HBase implementation quantitatively demonstrate that Diff-Index offers a range of performance/consistency trade-offs and satisfactory scalability while avoiding global coordination. Sync-insert and async-simple can reduce 60%-80% of the overall index update latency when compared to the baseline sync-full; async-simple can achieve superior index update performance with an acceptable inconsistency. Diff-Index exploits LSM features such as versioning and the flush-compact process to achieve the goals of concurrency control and failure recovery with low complexity and overhead. Diff-Index is included in IBM InfoSphere BigInsights, an IBM big data offering.
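To illustrate the difference between two of the named schemes, here is a hedged sketch in which plain dicts stand in for LSM tables (this is not the HBase implementation, and the read-repair detail is our simplification):

```python
# sync-full maintains the index fully on each put (insert the new entry
# and delete the stale one); sync-insert defers stale-entry cleanup,
# trading consistency for lower update latency.
class IndexedStore:
    def __init__(self, scheme="sync-full"):
        self.base = {}      # key -> value
        self.index = {}     # value -> set of keys
        self.scheme = scheme

    def put(self, key, value):
        old = self.base.get(key)
        self.base[key] = value
        self.index.setdefault(value, set()).add(key)  # both schemes insert
        if self.scheme == "sync-full" and old is not None and old != value:
            self.index[old].discard(key)   # sync-full also deletes stale entry
        # sync-insert: the stale entry stays until later cleanup/compaction

    def lookup(self, value):
        # Under sync-insert, verify index hits against the base table.
        return {k for k in self.index.get(value, ()) if self.base.get(k) == value}
```

Skipping the synchronous delete is what buys sync-insert its latency reduction; the cost is that lookups must tolerate (or repair) stale index entries.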

01 Jan 2014
TL;DR: Since 2011, engineers at Amazon Web Services have been using formal specification and model checking to help solve difficult design problems in critical systems, finding that subtle bugs can hide in complex concurrent fault-tolerant systems.
Abstract: Since 2011, engineers at Amazon Web Services (AWS) have been using formal specification and model checking to help solve difficult design problems in critical systems. This paper describes our motivation and experience, what has worked well in our problem domain, and what has not. When discussing personal experiences we refer to authors by their initials. At AWS we strive to build services that are simple for customers to use. That external simplicity is built on a hidden substrate of complex distributed systems. Such complex internals are required to achieve high availability while running on cost-efficient infrastructure, and also to cope with relentless rapid business growth. As an example of this growth: in 2006 we launched S3, our Simple Storage Service. In the 6 years after launch, S3 grew to store 1 trillion objects [1]. Less than a year later it had grown to 2 trillion objects, and was regularly handling 1.1 million requests per second [2]. S3 is just one of tens of AWS services that store and process data that our customers have entrusted to us. To safeguard that data, the core of each service relies on fault-tolerant distributed algorithms for replication, consistency, concurrency control, auto-scaling, load balancing, and other coordination tasks. There are many such algorithms in the literature, but combining them into a cohesive system is a major challenge, as the algorithms must usually be modified in order to interact properly in a real-world system. In addition, we have found it necessary to invent algorithms of our own. We work hard to avoid unnecessary complexity, but the essential complexity of the task remains high. High complexity increases the probability of human error in design, code, and operations. Errors in the core of the system could cause loss or corruption of data, or violate other interface contracts on which our customers depend. So, before launching such a service, we need to reach extremely high confidence that the core of the system is correct. We have found that the standard verification techniques in industry are necessary but not sufficient. We use deep design reviews, code reviews, static code analysis, stress testing, fault-injection testing, and many other techniques, but we still find that subtle bugs can hide in complex concurrent fault-tolerant systems. One reason for this problem is that human intuition is poor at estimating the true probability of supposedly ‘extremely rare’ combinations of events in systems operating at a scale of millions of requests per second.

Proceedings ArticleDOI
12 Feb 2014
TL;DR: This paper presents Pabble, a parameterised protocol description language which can guarantee safety and progress in a large class of practical, complex parameterised message-passing programs through static checking, and which guarantees the termination of endpoint projection and type checking.
Abstract: Many parallel and distributed message-passing programs are written in a parametric way over available resources, in particular the number of nodes and their topologies, so that a single parallel program can scale over different environments. This paper presents a parameterised protocol description language, Pabble, which can guarantee safety and progress in a large class of practical, complex parameterised message-passing programs through static checking. Pabble can describe an overall interaction topology, using a concise and expressive notation, designed for a variable number of participants arranged in multiple dimensions. These parameterised protocols in turn automatically generate local protocols for type checking parameterised MPI programs for communication safety and deadlock freedom. In spite of undecidability of endpoint projection and type checking in the underlying parameterised session type theory, our method guarantees the termination of endpoint projection and type checking.

Journal ArticleDOI
TL;DR: This model has been implemented in a lightweight software transactional memory system, TinySTM, and has been evaluated on a set of benchmarks, obtaining a significant performance improvement over the baseline TM system.
Abstract: This article describes a transactional memory execution model intended to exploit maximum parallelism from sequential and multithreaded programs. A program code section is partitioned into chunks that will be mapped onto threads and executed transactionally. These transactions run concurrently and out of order, trying to exploit maximum parallelism but managed by a specific fully distributed commit control to meet data dependencies. To accomplish correct parallel execution, a partial precedence order relation is derived from the program code section and/or defined by the programmer. When a conflict between chunks is eagerly detected, the precedence order relation is used to determine the best policy to solve the conflict that preserves the precedence order while maximizing concurrency. The model defines a new transactional state called executed but not committed. This state allows exploiting concurrency on two levels: intrathread and interthread. Intrathread concurrency is improved by having pending uncommitted transactions while executing a new one in the same thread. The new state improves interthread concurrency because it permits out-of-order transaction commits with respect to the precedence order. Our model has been implemented in a lightweight software transactional memory system, TinySTM, and has been evaluated on a set of benchmarks, obtaining a significant performance improvement over the baseline TM system.

Journal ArticleDOI
TL;DR: An extension of Discrete-Event Systems theory to handle dynamically instantiated and terminated threads is proposed, which makes use of elements of Dynamic DES theory to model threads whose lifetimes can be arbitrary.

Book ChapterDOI
04 Jan 2014
TL;DR: This work presents HiperTM, a high performance active replication protocol for fault-tolerant distributed transactional memory that guarantees no out-of-order optimistic deliveries and performance up to 1.2× better than the atomic broadcast-based competitor PaxosSTM.
Abstract: We present HiperTM, a high performance active replication protocol for fault-tolerant distributed transactional memory. The active replication paradigm allows transactions to execute locally, costing them only a single network communication step during transaction execution. Shared objects are replicated across all sites, avoiding remote object accesses. Replica consistency is ensured by (a) OS-Paxos, an optimistic atomic broadcast layer that total-orders transactional requests, and (b) SCC, a local multi-version concurrency control protocol that enforces a commit order equivalent to transactions' delivery order. SCC executes write transactions serially without incurring any synchronization overhead, and runs read-only transactions in parallel to write transactions with non-blocking execution and abort-freedom. Our implementation reveals that HiperTM guarantees 0% out-of-order optimistic deliveries and performance up to 1.2× better than the atomic broadcast-based competitor PaxosSTM.

Proceedings ArticleDOI
19 May 2014
TL;DR: A novel dynamic concurrency control technique is proposed that can significantly improve performance as well as resource utilization for Software Transactional Memory applications at higher core counts, and can actually improve upon the performance of the oracle-chosen specification by more than 10% for certain applications through dynamic adaptation to available parallelism.
Abstract: Software Transactional Memory (STM) systems provide an easy-to-use programming model for concurrent code and have been found suitable for parallelizing many applications, providing performance gains with minimal programmer effort. With increasing core counts on modern processors one would expect increasing benefits. However, we observe that running STM applications on higher core counts is sometimes, in fact, detrimental to performance. This is due to the larger number of conflicts that arise with a larger number of parallel cores. As the number of cores available on processors steadily rises, a larger number of applications are beginning to exhibit these characteristics. In this paper we propose a novel dynamic concurrency control technique which can significantly improve performance (up to 50%) as well as resource utilization (up to 85%) for these applications at higher core counts. Our technique uses ideas borrowed from TCP's network congestion control algorithm and uses self-induced concurrency fluctuations to dynamically monitor and match varying concurrency levels in applications while minimizing global synchronization. Our flux-based, feedback-driven concurrency control technique is capable of fully recovering the performance of the best statically chosen concurrency specification (as chosen by an oracle) regardless of the initial specification for several real-world applications. Further, our technique can actually improve upon the performance of the oracle-chosen specification by more than 10% for certain applications through dynamic adaptation to available parallelism. We demonstrate our approach on the STAMP benchmark suite while reporting significant performance and resource utilization benefits. We also demonstrate significantly better performance when comparing against state-of-the-art concurrency control and scheduling techniques. Further, our technique is programmer friendly as it requires no changes to application code and no offline phases.
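A sketch of what such a TCP-inspired controller could look like; the additive-increase/multiplicative-decrease structure follows the abstract's analogy, while the threshold and step sizes are invented for illustration:

```python
# AIMD-style concurrency controller: grow the allowed thread count
# additively while aborts are rare, cut it multiplicatively when
# contention spikes. Epoch statistics come from the STM runtime.
class AIMDConcurrencyController:
    def __init__(self, max_threads, abort_threshold=0.3):
        self.max_threads = max_threads
        self.level = 1                    # current allowed thread count
        self.abort_threshold = abort_threshold

    def on_epoch(self, commits, aborts):
        total = commits + aborts
        abort_rate = aborts / total if total else 0.0
        if abort_rate > self.abort_threshold:
            self.level = max(1, self.level // 2)   # multiplicative decrease
        else:
            self.level = min(self.max_threads, self.level + 1)  # additive increase
        return self.level
```

As with TCP, the controller deliberately oscillates around the sweet spot, probing for more parallelism and backing off when conflicts (the analogue of packet loss) rise.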

Patent
23 Jul 2014
TL;DR: In this paper, a memory database OLTP and OLAP concurrency query optimization method is proposed, which includes the steps that (1) by means of two query processing engines, independent storage engines are adopted for a dimension table and a fact table; (2) the dimension table is updated through an embedded concurrency control mechanism of the independent storage engine, and (3) the fact table is equivalent to multiple continuous arrays in logic and maintains two dynamic data structures, namely a read record pointer and a write record pointer.
Abstract: The invention relates to a memory database OLTP and OLAP concurrency query optimization method. The method includes the steps that (1) by means of two query processing engines, independent storage engines are adopted for a dimension table and a fact table; (2) the dimension table is updated through an embedded concurrency control mechanism of the independent storage engines; the fact table is equivalent to multiple continuous arrays in logic and maintains two dynamic data structures, namely a read record pointer and a write record pointer; the read record pointer records the position of the last record queried through OLAP currently, and the write record pointer records the insert position of a new record; (3) the OLTP transactional queue and the OLAP transactional queue are executed independently with the read pointer and the write pointer as boundaries; the fact table adopts a column storage horizontal fragmentation model based on a fixed number of columns, N columns of storage records serve as an independent column storage container, and each column storage container adopts an independent data compression mechanism; (4) an access function on compressed data or non-compressed data is provided through the access interfaces of the column storage containers when an OLAP query accesses the column storage containers.
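A minimal sketch of the pointer scheme described in the claim (class and method names are ours, not the patent's):

```python
import threading

# OLTP transactions append at the write pointer; OLAP queries scan only up
# to the read pointer, so the two workloads never touch the same slots.
class FactTable:
    def __init__(self):
        self.rows = []
        self.read_ptr = 0              # OLAP sees rows[:read_ptr]
        self._lock = threading.Lock()

    def oltp_append(self, row):
        with self._lock:
            self.rows.append(row)      # write pointer == len(self.rows)

    def advance_read_ptr(self):
        with self._lock:
            self.read_ptr = len(self.rows)   # expose committed inserts

    def olap_scan(self):
        return self.rows[:self.read_ptr]     # stable prefix for analytics
```

Because the OLAP scan only ever reads the frozen prefix, long analytical queries and short update transactions proceed concurrently without locking each other out.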

Patent
Allan Henry Vermeulen1
26 Jun 2014
TL;DR: In this article, a transaction request is received at a log-based transaction manager, indicating a conflict check delimiter and a read set descriptor indicative of one or more locations from which data is read during the requested transaction.
Abstract: A transaction request is received at a log-based transaction manager, indicating a conflict check delimiter and a read set descriptor indicative of one or more locations from which data is read during the requested transaction. Using the conflict check delimiter, a subset of transaction records stored in a particular persistent log to be examined for conflicts prior to committing the requested transaction is identified. In response to determining that none of the read locations of the requested transaction correspond to a write location indicated in the subset of transaction records, a new transaction record is stored in the particular persistent log indicating that the requested transaction has been committed.
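The commit check described here can be sketched in a few lines (field names are illustrative, not the patent's):

```python
# Scan only the log suffix newer than the conflict check delimiter; commit
# iff no record in that suffix wrote a location the transaction read.
def try_commit(log, delimiter_seq, read_set, write_set):
    # log: list of dicts like {"seq": int, "writes": set_of_locations}
    for rec in log:
        if rec["seq"] > delimiter_seq and rec["writes"] & read_set:
            return False                       # read-write conflict: abort
    log.append({"seq": log[-1]["seq"] + 1 if log else 1,
                "writes": set(write_set)})     # commit: append new record
    return True
```

The delimiter bounds how much of the log must be examined, so the cost of the conflict check tracks concurrent activity rather than total log size.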

Proceedings ArticleDOI
03 Nov 2014
TL;DR: This paper verifies that the formula protocol and the staged grid architecture provide a satisfactory solution to one of the important challenges in database systems: to develop a highly scalable database management system that supports various consistency levels from ACID to BASE.
Abstract: This paper proposes a new formula protocol for distributed concurrency control, and specifies a staged grid architecture for highly scalable database management systems. The paper also describes novel implementation techniques of Rubato DB based on the proposed protocol and architecture. We have conducted extensive experiments which clearly show that Rubato DB is highly scalable with efficient performance under both the TPC-C and YCSB benchmarks. Our paper verifies that the formula protocol and the staged grid architecture provide a satisfactory solution to one of the important challenges in database systems: to develop a highly scalable database management system that supports various consistency levels from ACID to BASE.

Book ChapterDOI
03 Jun 2014
TL;DR: This paper addresses the problem of how to migrate HDFS' metadata to a relational model, so that it can support larger amounts of storage on a shared-nothing, in-memory, distributed database, and guarantees freedom from deadlocks.
Abstract: The Hadoop Distributed File System (HDFS) scales to store tens of petabytes of data despite the fact that the entire file system's metadata must fit on the heap of a single Java virtual machine. The size of HDFS' metadata is limited to under 100 GB in production, as garbage collection events in bigger clusters result in heartbeats timing out to the metadata server (NameNode). In this paper, we address the problem of how to migrate HDFS' metadata to a relational model, so that we can support larger amounts of storage on a shared-nothing, in-memory, distributed database. Our main contribution is that we show how to provide at least as strong consistency semantics as HDFS while adding support for a multiple-writer, multiple-reader concurrency model. We guarantee freedom from deadlocks by logically organizing inodes and their constituent blocks and replicas into a hierarchy and having all metadata operations agree on a global order for acquiring both explicit locks and implicit locks on subtrees in the hierarchy. We use transactions with pessimistic concurrency control to ensure the safety and progress of metadata operations. Finally, we show how to improve performance of our solution by introducing a snapshotting mechanism at NameNodes that minimizes the number of roundtrips to the database.
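The deadlock-freedom argument rests on a classic rule: acquire all locks in one agreed global order. A minimal sketch, using sorted inode ids as that order (a simplification of the paper's subtree hierarchy; names are ours):

```python
import threading

# Every metadata operation acquires the locks it needs in one global
# order (here: ascending inode id), so no cyclic wait can ever form.
locks = {}   # inode_id -> threading.Lock

def with_inodes_locked(inode_ids, op):
    ordered = sorted(set(inode_ids))          # the global acquisition order
    acquired = []
    try:
        for i in ordered:
            lock = locks.setdefault(i, threading.Lock())
            lock.acquire()
            acquired.append(lock)
        return op()                           # run the metadata operation
    finally:
        for lock in reversed(acquired):
            lock.release()
```

Two operations that need inodes {3, 7} and {7, 3} both lock 3 before 7, so the circular-wait condition for deadlock can never arise.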

Proceedings ArticleDOI
30 Jun 2014
TL;DR: This work attempts to solve this problem by automating the partitioning process, choosing the correct transactional primitive, and routing transactions appropriately.
Abstract: Modern transactional processing systems need to be fast and scalable, but this means many such systems have settled for weak consistency models. It is, however, possible to achieve all of strong consistency, high scalability and high performance by using fine-grained partitions and light-weight concurrency control that avoids superfluous synchronization and other overheads such as lock management. Independent transactions are one such mechanism, relying on good partitions and appropriately defined transactions. On the downside, it is not usually straightforward to determine optimal partitioning schemes, especially when dealing with non-trivial amounts of data. Our work attempts to solve this problem by automating the partitioning process, choosing the correct transactional primitive, and routing transactions appropriately.

Proceedings ArticleDOI
17 May 2014
TL;DR: This work defines an architecture, a set of workflows, a set of policies and an implementation for the distributed enforcement of continuous usage control of multiple copies of data objects in distributed systems.
Abstract: This paper deals with the problem of continuous usage control of multiple copies of data objects in distributed systems. This work defines an architecture, a set of workflows, a set of policies and an implementation for the distributed enforcement. The policies, besides including access and usage rules, also specify the parties that will be involved in the decision process. Indeed, the enforcement requires collaboration of several entities because the access decision might be evaluated on one site, enforced on another, and the attributes needed for the policy evaluation might be stored in many distributed locations.

Proceedings ArticleDOI
12 Feb 2014
TL;DR: A modified implementation of Bowtie2, one of the fastest (concurrent, Pthread-based) state-of-the-art non-GPU alignment tools, is presented, in which the concurrency structure has been changed.
Abstract: The implementation of DNA alignment tools for Bioinformatics leads to different problems that hurt performance. A single alignment takes an amount of time that is not predictable, and different factors can affect performance; for example, the length of sequences can determine the computational grain of the task, and mismatches or insertions/deletions (indels) increase the amount of time needed to complete an alignment. Moreover, alignment is a strongly memory-bound problem because of irregular memory access patterns and limitations in memory bandwidth. Over the years, many alignment tools have been implemented. A concrete example is Bowtie2, one of the fastest (concurrent, Pthread-based) state-of-the-art non-GPU alignment tools. Bowtie2 exploits concurrency by instantiating a pool of threads, which have access to a global input dataset, share the reference genome, and have access to different objects for collecting alignment results. In this paper a modified implementation of Bowtie2 is presented, in which the concurrency structure has been changed. The proposed implementation exploits the task-farm skeleton pattern implemented as a Master-Worker. The Master-Worker pattern delegates dataset reading to the Master thread alone and makes private to each Worker the data structures that are shared in the original version. Only the reference genome is left shared. As a further optimisation, the Master and each Worker were pinned on cores, and the reference genome was allocated interleaved among memory nodes. The proposed implementation is able to gain up to 10 speedup points over the original implementation.
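A schematic of the Master-Worker restructuring described above (a generic sketch, not Bowtie2's actual code; `align` is a placeholder for the alignment routine):

```python
import threading, queue

# Only the master reads the input dataset and feeds a task queue; each
# worker keeps a private result buffer, so nothing mutable is shared.
def run(reads, align, n_workers=4):
    tasks = queue.Queue(maxsize=1024)
    buffers = [[] for _ in range(n_workers)]   # per-worker private results

    def worker(wid):
        while True:
            read = tasks.get()
            if read is None:                   # poison pill: shut down
                break
            buffers[wid].append(align(read))   # no shared mutable state

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads: t.start()
    for r in reads:                            # master: sole reader of input
        tasks.put(r)
    for _ in threads: tasks.put(None)
    for t in threads: t.join()
    return [res for b in buffers for res in b]
```

Making the result buffers worker-private removes the synchronization on shared output objects that the original thread-pool design paid for; only the read-only reference remains shared.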

Proceedings ArticleDOI
12 Feb 2014
TL;DR: This paper presents a solution that dynamically shrinks or enlarges the set of input features to be exploited by the machine-learner, which allows for tuning the concurrency level while also minimizing the overhead of input-feature sampling.
Abstract: In this paper we explore machine-learning approaches for dynamically selecting the well-suited amount of concurrent threads in applications relying on Software Transactional Memory (STM). Specifically, we present a solution that dynamically shrinks or enlarges the set of input features to be exploited by the machine-learner. This allows for tuning the concurrency level while also minimizing the overhead of input-feature sampling, given that the cardinality of the input-feature set is always tuned to the minimum value that still guarantees reliability of workload characterization. We also present a fully fledged implementation of our proposal within the TinySTM open source framework, and provide the results of an experimental study relying on the STAMP benchmark suite, which show a significant reduction of the response time with respect to proposals based on static feature selection.

Proceedings ArticleDOI
24 Mar 2014
TL;DR: A new debug approach for MPSoCs that combines dynamic analysis and the benefits of virtual platforms is proposed; it enables automatic exploration of SW behavior, identifies problematic concurrent interactions, provokes possibly erroneous executions, and detects concurrency bugs.
Abstract: Writing correct parallel software for modern multi-processor systems-on-chip (MPSoCs) is a complicated task. Programmers can rarely anticipate all possible external and internal interactions in complex concurrent systems. Concurrency bugs originating from races and improper synchronization are difficult to understand and reproduce. Furthermore, traditional debug and verification practices for embedded systems lack support to address this issue efficiently. For instance, programmers still need to step through several executions until finding a buggy state or analyze complex traces, which results in productivity losses. This paper proposes a new debug approach for MPSoCs that combines dynamic analysis and the benefits of virtual platforms. All in all, it (i) enables automatic exploration of SW behavior, (ii) identifies problematic concurrent interactions, (iii) provokes possibly erroneous executions and, ultimately, (iv) detects concurrency bugs. The approach is demonstrated on an industrial-strength virtual platform with a full Linux operating system and real-world parallel benchmarks.

Patent
17 Dec 2014
TL;DR: In this article, the authors propose a concurrency control in log-structured merge (LSM) data stores, where a call is received from a thread for writing a value to a key of LSM components.
Abstract: The present teaching relates to concurrency control in log-structured merge (LSM) data stores. In one example, a call is received from a thread for writing a value to a key of LSM components. A shared mode lock is set on the LSM components in response to the call. The value is written to the key once the shared mode lock is set on the LSM components. The shared mode lock is released from the LSM components after the value is written to the key.
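A sketch of the shared/exclusive locking pattern this describes; Python's standard library has no reader-writer lock, so a minimal one is inlined (all names are illustrative, not the patent's):

```python
import threading

# Ordinary writes take the lock in *shared* mode, so many writers proceed
# concurrently on the in-memory component; structural operations such as
# flush/merge would take it exclusively.
class SharedExclusiveLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._shared = 0
        self._exclusive = False

    def acquire_shared(self):
        with self._cond:
            while self._exclusive:
                self._cond.wait()
            self._shared += 1

    def release_shared(self):
        with self._cond:
            self._shared -= 1
            self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._exclusive or self._shared:
                self._cond.wait()
            self._exclusive = True

    def release_exclusive(self):
        with self._cond:
            self._exclusive = False
            self._cond.notify_all()

class MemTable:
    def __init__(self):
        self.lock = SharedExclusiveLock()
        self.data = {}

    def put(self, key, value):
        self.lock.acquire_shared()      # writes share the lock
        try:
            self.data[key] = value
        finally:
            self.lock.release_shared()
```

Using shared mode for writes inverts the usual reader-writer convention, but it matches the LSM setting: appends to the memtable do not conflict with each other, only with exclusive structural changes.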

Proceedings ArticleDOI
06 Nov 2014
TL;DR: Roboscoop is presented, a new robotics framework based on Simple Concurrent Object Oriented Programming (SCOOP), which excludes data races by construction, thereby eliminating a major class of concurrent programming errors.
Abstract: Concurrency is inherent to robots, and using concurrency in robotics can greatly enhance performance of the robotics applications. So far, however, the use of concurrency in robotics has been limited and cumbersome. This paper presents Roboscoop, a new robotics framework based on Simple Concurrent Object Oriented Programming (SCOOP). SCOOP excludes data races by construction, thereby eliminating a major class of concurrent programming errors. Roboscoop utilizes SCOOP’s concurrency and synchronization mechanisms for coordination in robotics applications. We demonstrate Roboscoop’s simplicity by comparing Roboscoop to existing middlewares and evaluate Roboscoop’s usability by employing it in education.

01 Jan 2014
TL;DR: A sound program logic called synchronization object logic (SOL) is presented that supports reasoning about the execution order and linearization orders of method calls, as well as the properties of common synchronization object types.
Abstract: Transactional Memory (TM) provides programmers with a high-level and composable concurrency control abstraction. The correct execution of client programs using TM is directly dependent on the correctness of the TM algorithms. In return for the simpler programming model, designing a correct TM algorithm is an art. This dissertation presents techniques to prove the correctness or incorrectness of TM algorithms. In particular, it contributes to the specification, safety criterion, testing and verification of TM algorithms. We introduce a language for architecture-independent specification of synchronization algorithms. An algorithm specification captures two abstract properties of the algorithm, namely the type of the used synchronization objects and the pairs of method calls that should preserve their program order in the relaxed execution. We introduce the markability correctness condition as the conjunction of intuitive invariants: write-observation and read-preservation. We prove the equivalence of the markability and opacity correctness conditions. Decomposition of the correctness condition supports modular and scalable verification. We identify two pitfalls that lead to violation of opacity: the write-skew and write-exposure anomalies. We present a tool called Samand that automatically finds traces of such bug patterns. Using Samand, we show that the DSTM and McRT algorithms suffer from the write-skew and write-exposure anomalies, respectively. We present a sound program logic called synchronization object logic (SOL) that supports reasoning about the execution order and linearization orders of method calls. It provides inference rules that axiomatize the properties and the interdependence of these orders, as well as the properties of common synchronization object types. We present the derivation of markability in SOL as a sound syntactic proof technique for opacity. We use SOL to prove the markability, and hence opacity, of the TL2 algorithm in PVS.
Abstract: Transactional Memory (TM) provides programmers with a high-level and composable concurrency control abstraction. The correct execution of client programs using TM is directly dependent on the correctness of the TM algorithms. In return for the simpler programming model, designing a correct TM algorithm is an art. This dissertation presents techniques to prove the correctness or incorrectness of TM algorithms. In particular, it contributes to the specification, safety criterion, testing and verification of TM algorithms.We introduce a language for architecture-independent specification of synchronization algorithms. An algorithm specification captures two abstract properties of the algorithm namely the type of the used synchronization objects and the pairs of method calls that should preserve their program order in the relaxed execution.We introduce the markability correctness condition as the conjunction of intuitive invariants: write-observation and read-preservation. We prove the equivalence of markability and opacity correctness conditions. Decomposition of the correctness condition supports modular and scalable verification.We identify two pitfalls that lead to violation of opacity: the write-skew and write-exposure anomalies. We present a tool called Samand that automatically finds traces of such bug patterns. Using Samand, we show that the DSTM and McRT algorithms suffer from the write-skew and write-exposure anomalies respectively.We present a sound program logic called synchronization object logic (SOL) that supports reasoning about the execution order and linearization orders of method calls. It provides inference rules that axiomatize the properties and the interdependence of these orders and also the properties of common synchronization object types. We present the derivation of markability in SOL as a sound syntactic proof technique for opacity. We use SOL to prove the markability and hence opacity of the TL2 algorithm in PVS.