Author

Qian Lin

Bio: Qian Lin is an academic researcher from the National University of Singapore. The author has contributed to research topics including scalability and virtualization, has an h-index of 15, and has co-authored 52 publications receiving 776 citations. Previous affiliations of Qian Lin include the Chinese Academy of Sciences and Shanghai Jiao Tong University.


Papers
Proceedings ArticleDOI
25 Jun 2019
TL;DR: In this article, the authors proposed a principled approach to apply sharding to blockchain systems in order to improve their transaction throughput at scale, which is challenging due to the fundamental difference in failure models between databases and blockchain.
Abstract: Existing blockchain systems scale poorly because of their distributed consensus protocols. Current attempts at improving blockchain scalability are limited to cryptocurrency. Scaling blockchain systems under general workloads (i.e., non-cryptocurrency applications) remains an open question. This work takes a principled approach to apply sharding to blockchain systems in order to improve their transaction throughput at scale. This is challenging, however, due to the fundamental difference in failure models between databases and blockchain. To achieve our goal, we first enhance the performance of Byzantine consensus protocols, improving individual shards' throughput. Next, we design an efficient shard formation protocol that securely assigns nodes into shards. We rely on trusted hardware, namely Intel SGX, to achieve high performance for both consensus and shard formation protocol. Third, we design a general distributed transaction protocol that ensures safety and liveness even when transaction coordinators are malicious. Finally, we conduct an extensive evaluation of our design both on a local cluster and on Google Cloud Platform. The results show that our consensus and shard formation protocols outperform state-of-the-art solutions at scale. More importantly, our sharded blockchain reaches a high throughput that can handle Visa-level workloads, and is the largest ever reported in a realistic environment.
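The shard formation step described above can be illustrated with a toy sketch: a shared random beacon deterministically maps node IDs to shards, so every honest node derives the same assignment without a coordinator. This is only a plain-Python illustration of the idea, not the paper's actual SGX-backed protocol; the function and parameter names are invented for the example.

```python
import hashlib

def assign_shards(node_ids, beacon, num_shards):
    """Deterministically assign nodes to shards using a shared random beacon.

    Every honest node computes the same assignment locally, so no
    coordinator is needed; security rests on the beacon value being
    unbiased and unpredictable before it is published.
    """
    assignment = {s: [] for s in range(num_shards)}
    for node in node_ids:
        digest = hashlib.sha256(f"{beacon}:{node}".encode()).digest()
        shard = int.from_bytes(digest[:8], "big") % num_shards
        assignment[shard].append(node)
    return assignment

nodes = [f"node-{i}" for i in range(12)]
shards = assign_shards(nodes, beacon="epoch-42-randomness", num_shards=3)
```

Because the mapping is keyed by the beacon, an adversary cannot predict or bias which shard a node lands in before the beacon value is revealed.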

238 citations

Posted Content
TL;DR: This work takes a principled approach to apply sharding to blockchain systems in order to improve their transaction throughput at scale, and achieves a high throughput that can handle Visa-level workloads, and is the largest ever reported in a realistic environment.
Abstract: Existing blockchain systems scale poorly because of their distributed consensus protocols. Current attempts at improving blockchain scalability are limited to cryptocurrency. Scaling blockchain systems under general workloads (i.e., non-cryptocurrency applications) remains an open question. In this work, we take a principled approach to apply sharding, which is a well-studied and proven technique to scale out databases, to blockchain systems in order to improve their transaction throughput at scale. This is challenging, however, due to the fundamental difference in failure models between databases and blockchain. To achieve our goal, we first enhance the performance of Byzantine consensus protocols, thereby improving individual shards' throughput. Next, we design an efficient shard formation protocol that leverages a trusted random beacon to securely assign nodes into shards. We rely on trusted hardware, namely Intel SGX, to achieve high performance for both the consensus and shard formation protocols. Third, we design a general distributed transaction protocol that ensures safety and liveness even when transaction coordinators are malicious. Finally, we conduct an extensive evaluation of our design both on a local cluster and on Google Cloud Platform. The results show that our consensus and shard formation protocols outperform state-of-the-art solutions at scale. More importantly, our sharded blockchain reaches a high throughput that can handle Visa-level workloads, and is the largest ever reported in a realistic environment.

128 citations

Journal ArticleDOI
01 Jun 2018
TL;DR: ForkBase is presented, a storage engine specifically designed to provide efficient support for blockchain and forkable applications; by integrating core application properties into the storage, it achieves superior performance while significantly lowering development cost.
Abstract: Existing data storage systems offer a wide range of functionalities to accommodate an equally diverse range of applications. However, new classes of applications have emerged, e.g., blockchain and collaborative analytics, featuring data versioning, fork semantics, tamper-evidence or any combination thereof. They present new opportunities for storage systems to efficiently support such applications by embedding the above requirements into the storage. In this paper, we present ForkBase, a storage engine designed for blockchain and forkable applications. By integrating core application properties into the storage, ForkBase not only delivers high performance but also reduces development effort. The storage manages multiversion data and supports two variants of fork semantics which enable different fork workflows. ForkBase is fast and space efficient, due to a novel index class that supports efficient queries as well as effective detection of duplicate content across data objects, branches and versions. We demonstrate ForkBase's performance using three applications: a blockchain platform, a wiki engine and a collaborative analytics application. We conduct extensive experimental evaluation against respective state-of-the-art solutions. The results show that ForkBase achieves superior performance while significantly lowering the development effort.
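As a rough sketch of the fork semantics and content deduplication that ForkBase embeds in the storage layer, the toy store below keeps one copy of each distinct value (addressed by its hash) and represents branches as cheap maps of pointers. The class and method names are invented for illustration and do not reflect ForkBase's real API.

```python
import hashlib
import json

class ForkableStore:
    """Toy content-addressed store with branches (fork semantics) and
    automatic deduplication: identical values share one stored object."""

    def __init__(self):
        self.objects = {}                # content hash -> value (deduplicated)
        self.branches = {"master": {}}   # branch name -> {key: content hash}

    def put(self, branch, key, value):
        blob = json.dumps(value, sort_keys=True).encode()
        h = hashlib.sha256(blob).hexdigest()
        self.objects[h] = value          # no-op if this content is already stored
        self.branches[branch][key] = h
        return h

    def get(self, branch, key):
        return self.objects[self.branches[branch][key]]

    def fork(self, src, dst):
        # Copies only the key -> hash pointers; payloads stay shared.
        self.branches[dst] = dict(self.branches[src])

store = ForkableStore()
store.put("master", "acct", {"balance": 100})
store.fork("master", "feature")
store.put("feature", "acct", {"balance": 50})   # diverges from master
store.put("feature", "other", {"balance": 100})  # deduplicated with master's object
```

Forking is cheap because only pointers are copied, and writing an already-seen value stores nothing new; both properties mirror, in miniature, what a versioned deduplicating index provides.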

100 citations

Proceedings ArticleDOI
27 May 2015
TL;DR: A novel stream join model, called join-biclique, organizes a large cluster as a complete bipartite graph and is designed to support efficient full-history joins, window-based joins, and online data aggregation; it is developed into a scalable distributed stream join system, BiStream.
Abstract: Efficient and scalable stream joins play an important role in performing real-time analytics for many cloud applications. However, like in conventional database processing, online theta-joins over data streams are computationally expensive and moreover, being memory-based processing, they impose high memory requirement on the system. In this paper, we propose a novel stream join model, called join-biclique, which organizes a large cluster as a complete bipartite graph. Join-biclique has several strengths over state-of-the-art techniques, including memory-efficiency, elasticity and scalability. These features are essential for building efficient and scalable streaming systems. Based on join-biclique, we develop a scalable distributed stream join system, BiStream, over a large-scale commodity cluster. Specifically, BiStream is designed to support efficient full-history joins, window-based joins and online data aggregation. BiStream also supports adaptive resource management to dynamically scale out and down the system according to its application workloads. We provide both theoretical cost analysis and extensive experimental evaluations to evaluate the efficiency, elasticity and scalability of BiStream.
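The join-biclique idea can be sketched in a few lines: nodes are split into two sides of a bipartite graph, and each arriving tuple is stored on one node of its own side after being probed against the stored tuples of every node on the opposite side, so each (r, s) pair is evaluated exactly once. This single-process toy, with invented names, only mimics the routing logic, not BiStream's distributed implementation, windowing, or elasticity.

```python
from itertools import cycle

class BicliqueJoinSketch:
    """Toy join-biclique router: stream R tuples are stored on R-side nodes
    and probed against every S-side node, and symmetrically for S, so each
    (r, s) pair meets exactly once somewhere in the biclique."""

    def __init__(self, r_nodes=2, s_nodes=2, predicate=lambda r, s: r == s):
        self.r_store = [[] for _ in range(r_nodes)]   # one list per R-side node
        self.s_store = [[] for _ in range(s_nodes)]   # one list per S-side node
        self.r_rr = cycle(range(r_nodes))             # round-robin placement
        self.s_rr = cycle(range(s_nodes))
        self.predicate = predicate                    # theta-join condition
        self.results = []

    def on_r(self, r):
        # Probe every S-side partition first, then store r on one R-side node.
        for part in self.s_store:
            self.results += [(r, s) for s in part if self.predicate(r, s)]
        self.r_store[next(self.r_rr)].append(r)

    def on_s(self, s):
        for part in self.r_store:
            self.results += [(r, s) for r in part if self.predicate(r, s)]
        self.s_store[next(self.s_rr)].append(s)

join = BicliqueJoinSketch()
for r in [1, 2]:
    join.on_r(r)
for s in [1, 2, 3]:
    join.on_s(s)
```

Because each tuple is stored exactly once and probed against the whole opposite side, memory use grows with one copy per tuple rather than one copy per candidate node, which is the memory-efficiency argument behind the model.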

89 citations

Proceedings ArticleDOI
21 Mar 2011
TL;DR: A hybrid virtualization approach runs the paravirtualized guest inside a hardware-assisted virtual machine container to take advantage of both: hardware-assisted virtualization is superior for CPU and memory virtualization, yet paravirtualization remains valuable in some aspects as it can shorten the I/O virtualization path.
Abstract: It is crucial to minimize virtualization overhead for virtual machine deployment. The conventional x86 CPU is incapable of classical trap-and-emulate virtualization, which formerly made paravirtualization the optimal virtualization strategy. Since architectural extensions were introduced to support classical virtualization, hardware-assisted virtualization has become a competitive alternative. Hardware-assisted virtualization is superior for CPU and memory virtualization, yet paravirtualization is still valuable in some aspects as it can shorten the I/O virtualization path. Thus we propose hybrid virtualization, which runs the paravirtualized guest inside a hardware-assisted virtual machine container to take advantage of both. Experimental results indicate that our hybrid solution outperforms the original paravirtualization by nearly 30% in memory-intensive tests and 50% in microbenchmarks. Meanwhile, compared with the original hardware-assisted virtual machine, the hybrid guest achieves over a 16% improvement in I/O-intensive workloads.

76 citations


Cited by
Journal ArticleDOI
TL;DR: This paper conducts a comprehensive evaluation of three major blockchain systems based on BLOCKBENCH, namely Ethereum, Parity, and Hyperledger Fabric, and discusses several research directions for bringing blockchain performance closer to the realm of databases.
Abstract: Blockchain technologies are gaining massive momentum in the last few years. Blockchains are distributed ledgers that enable parties who do not fully trust each other to maintain a set of global states. The parties agree on the existence, values, and histories of the states. As the technology landscape is expanding rapidly, it is both important and challenging to have a firm grasp of what the core technologies have to offer, especially with respect to their data processing capabilities. In this paper, we first survey the state of the art, focusing on private blockchains (in which parties are authenticated). We analyze both in-production and research systems in four dimensions: distributed ledger, cryptography, consensus protocol, and smart contract. We then present BLOCKBENCH, a benchmarking framework for understanding performance of private blockchains against data processing workloads. We conduct a comprehensive evaluation of three major blockchain systems based on BLOCKBENCH, namely Ethereum, Parity, and Hyperledger Fabric. The results demonstrate several trade-offs in the design space, as well as big performance gaps between blockchain and database systems. Drawing from design principles of database systems, we discuss several research directions for bringing blockchain performance closer to the realm of databases.

769 citations

Proceedings ArticleDOI
09 May 2017
TL;DR: BLOCKBENCH is an evaluation framework for analyzing private blockchains; it can be used to assess blockchains' viability as another distributed data processing platform, while helping developers identify bottlenecks and improve their platforms accordingly.
Abstract: Blockchain technologies are taking the world by storm. Public blockchains, such as Bitcoin and Ethereum, enable secure peer-to-peer applications like crypto-currency or smart contracts. Their security and performance are well studied. This paper concerns recent private blockchain systems designed with stronger security (trust) assumptions and performance requirements. These systems target applications that have so far been implemented on top of database systems, for example banking, finance, and trading applications, and aim to disrupt them. Multiple platforms for private blockchains are being actively developed and fine-tuned. However, there is a clear lack of a systematic framework with which different systems can be analyzed and compared against each other. Such a framework can be used to assess blockchains' viability as another distributed data processing platform, while helping developers to identify bottlenecks and accordingly improve their platforms. In this paper, we first describe BLOCKBENCH, the first evaluation framework for analyzing private blockchains. It serves as a fair means of comparison for different platforms and enables deeper understanding of different system design choices. Any private blockchain can be integrated into BLOCKBENCH via simple APIs and benchmarked against workloads that are based on real and synthetic smart contracts. BLOCKBENCH measures overall and component-wise performance in terms of throughput, latency, scalability and fault-tolerance. Next, we use BLOCKBENCH to conduct a comprehensive evaluation of three major private blockchains: Ethereum, Parity and Hyperledger Fabric. The results demonstrate that these systems are still far from displacing current database systems in traditional data processing workloads. Furthermore, there are gaps in performance among the three systems which are attributed to the design choices at different layers of the blockchain's software stack. We have released BLOCKBENCH for public use.
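In the spirit of BLOCKBENCH's throughput and latency measurements, the harness below drives an arbitrary transaction-submission callable and reports aggregate statistics. It is a minimal stand-in, not BLOCKBENCH's real API; the function names and the stub workload are invented for illustration.

```python
import statistics
import time

def run_benchmark(submit_txn, num_txns=1000):
    """Minimal throughput/latency harness: drive `submit_txn` (any callable
    wrapping a blockchain client) and report aggregate statistics."""
    latencies = []
    start = time.perf_counter()
    for i in range(num_txns):
        t0 = time.perf_counter()
        submit_txn(i)                       # one transaction against the system
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "throughput_tps": num_txns / elapsed,
        "latency_avg_s": statistics.mean(latencies),
        "latency_p99_s": latencies[int(0.99 * len(latencies))],
    }

# Stub workload standing in for a real client such as an Ethereum or
# Hyperledger Fabric connector.
stats = run_benchmark(lambda i: None, num_txns=100)
```

A real harness would additionally separate load generation from confirmation tracking, since on a blockchain a transaction's latency runs until it is committed in a block, not merely accepted by a peer.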

731 citations

Journal ArticleDOI
TL;DR: The survey highlights that blockchain's structure and modern cloud- and edge-computing paradigms are crucial in enabling widespread adoption and development of blockchain technologies for new players in today's unprecedentedly vibrant global market.
Abstract: Blockchain technologies have grown in prominence in recent years, with many experts citing the potential applications of the technology in regard to different aspects of any industry, market, agency, or governmental organization. In the brief history of blockchain, an incredible number of achievements have been made regarding how blockchain can be utilized and the impacts it might have on several industries. The sheer number and complexity of these aspects can make it difficult to address blockchain's potential and complexities, especially when trying to address its purpose and fitness for a specific task. In this survey, we provide a comprehensive review of applying blockchain as a service for applications within today's information systems. The survey gives the reader a deeper perspective on how blockchain helps to secure and manage today's information systems. The survey contains comprehensive reporting on different instances of blockchain studies and applications proposed by the research community and their respective impacts on blockchain and its use across other applications or scenarios. Some of the most important findings this survey highlights include the fact that blockchain's structure and modern cloud- and edge-computing paradigms are crucial in enabling widespread adoption and development of blockchain technologies for new players in today's unprecedentedly vibrant global market. Ensuring that blockchain is widely available through public and open-source code libraries and tools will help to ensure that the full potential of the technology is reached and that further developments can be made concerning the long-term goals of blockchain enthusiasts.

291 citations