Operating Systems Design and Implementation

About: Operating Systems Design and Implementation is an academic conference. The conference publishes mainly in the areas of file systems and servers. Over its lifetime, it has published 516 papers, which have received 128,856 citations.

Topics: File system, Server, Scheduling (computing)

Journal Article. DOI: 10.21276/IJRE.2018.5.5.4
Jeffrey Dean, Sanjay Ghemawat
06 Dec 2004
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. Our implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use: hundreds of MapReduce programs have been implemented and upwards of one thousand MapReduce jobs are executed on Google's clusters every day.

19,629 Citations
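
The map and reduce functions described in the MapReduce abstract above can be illustrated with a small word-count sketch. This is a single-process toy under stated assumptions, not the paper's implementation: the names `map_word_count`, `reduce_word_count`, and `run_mapreduce` are hypothetical, and the partitioning, scheduling, and failure handling that the run-time system provides are left out entirely.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_word_count(key: str, value: str) -> Iterator[Tuple[str, int]]:
    """Map: take one (document name, text) pair and emit (word, 1) pairs."""
    for word in value.split():
        yield word.lower(), 1


def reduce_word_count(key: str, values: Iterable[int]) -> int:
    """Reduce: merge all intermediate values that share the same key."""
    return sum(values)


def run_mapreduce(
    inputs: Iterable[Tuple[str, str]],
    map_fn: Callable[[str, str], Iterator[Tuple[str, int]]],
    reduce_fn: Callable[[str, Iterable[int]], int],
) -> Dict[str, int]:
    """Hypothetical sequential driver: map, group by key, then reduce."""
    intermediate: Dict[str, List[int]] = defaultdict(list)
    for key, value in inputs:
        for ikey, ivalue in map_fn(key, value):
            # Group intermediate values by their intermediate key.
            intermediate[ikey].append(ivalue)
    return {ikey: reduce_fn(ikey, ivalues)
            for ikey, ivalues in sorted(intermediate.items())}


# Illustrative input, not data from the paper.
documents = [
    ("doc1", "the quick brown fox"),
    ("doc2", "the lazy dog and the fox"),
]
counts = run_mapreduce(documents, map_word_count, reduce_word_count)
```

In a real deployment the grouping step would be a distributed shuffle across machines; here it is an in-memory dictionary so that the two user-supplied functions stay in focus.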

Open access. Proceedings Article. DOI: 10.5555/3026877.3026899
Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, and 18 more authors
02 Nov 2016
Abstract: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with a focus on training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.

Topics: Dataflow (62%), Computational learning theory (58%), Deep learning (55%)

10,880 Citations
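
The idea in the TensorFlow abstract above, that a dataflow graph can represent computation, shared state, and the operations that mutate that state, can be sketched with a toy graph in plain Python. This deliberately does not use the TensorFlow API: `Variable`, `Node`, and `run` are hypothetical names invented for illustration, and device placement across machines is not modeled.

```python
from typing import Callable, Dict, Optional, Sequence


class Variable:
    """Hypothetical holder for mutable state shared by graph nodes."""

    def __init__(self, value: float) -> None:
        self.value = value


class Node:
    """Hypothetical graph node: a function applied to its input nodes."""

    def __init__(self, name: str, fn: Callable[..., float],
                 inputs: Sequence["Node"] = ()) -> None:
        self.name = name
        self.fn = fn
        self.inputs = list(inputs)


def run(node: Node, cache: Optional[Dict[str, float]] = None) -> float:
    """Evaluate a node after its dependencies, once per run."""
    if cache is None:
        cache = {}
    if node.name not in cache:
        args = [run(dep, cache) for dep in node.inputs]
        cache[node.name] = node.fn(*args)
    return cache[node.name]


w = Variable(1.0)  # shared state


def assign_add(delta: float) -> float:
    # A mutating operation expressed as an ordinary graph node.
    w.value += delta
    return w.value


read_w = Node("read_w", lambda: w.value)
x = Node("x", lambda: 3.0)
y = Node("mul", lambda a, b: a * b, [read_w, x])
update = Node("assign_add", assign_add, [y])

before = run(y)   # reads the initial state
run(update)       # mutates w using the value of y
after = run(y)    # a later run observes the mutated state
```

Because the update is itself a node, an optimization or training rule in this toy is just another subgraph, which is the flexibility the abstract contrasts with state management that is built into the system.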

Open access. Proceedings Article
01 Jan 2006
Abstract: Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this article, we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.

Topics: Distributed data store (60%)

4,677 Citations

Journal Article. DOI: 10.1145/844128.844142
09 Dec 2002
Abstract: We present the Tiny AGgregation (TAG) service for aggregation in low-power, distributed, wireless environments. TAG allows users to express simple, declarative queries and have them distributed and executed efficiently in networks of low-power, wireless sensors. We discuss various generic properties of aggregates, and show how those properties affect the performance of our in-network approach. We include a performance study demonstrating the advantages of our approach over traditional centralized, out-of-network methods, and discuss a variety of optimizations for improving the performance and fault tolerance of the basic solution.

3,137 Citations
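
The contrast the TAG abstract above draws between in-network aggregation and centralized collection can be sketched as follows. This is a minimal illustration under stated assumptions, not the TAG implementation: the `SensorNode` class, the tree layout, and the choice of a (sum, count) partial record for an average are all hypothetical, and radio communication, query dissemination, and fault tolerance are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SensorNode:
    """Hypothetical sensor with one reading and a list of child nodes."""
    reading: float
    children: List["SensorNode"] = field(default_factory=list)


def partial_average(node: SensorNode) -> Tuple[float, int]:
    """Merge the children's partial records with this node's own reading.

    Each node forwards a single (sum, count) record to its parent instead
    of forwarding every raw reading in its subtree.
    """
    total, count = node.reading, 1
    for child in node.children:
        child_total, child_count = partial_average(child)
        total += child_total
        count += child_count
    return total, count


def in_network_average(root: SensorNode) -> float:
    """Evaluate the final aggregate at the root of the routing tree."""
    total, count = partial_average(root)
    return total / count


# Illustrative five-node tree with made-up readings.
relay = SensorNode(24.0, [SensorNode(20.0), SensorNode(22.0)])
root = SensorNode(26.0, [relay, SensorNode(28.0)])
average = in_network_average(root)
```

The average needs the (sum, count) pair because averages of subtrees cannot be merged directly; an aggregate such as a maximum would need only a single value per record, which is one way the properties of an aggregate affect how much each node must transmit.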

Open access. Proceedings Article. DOI: 10.5555/296806.296824
Miguel Castro, Barbara Liskov
22 Feb 1999
Abstract: This paper describes a new replication algorithm that is able to tolerate Byzantine faults. We believe that Byzantine-fault-tolerant algorithms will be increasingly important in the future because malicious attacks and software errors are increasingly common and can cause faulty nodes to exhibit arbitrary behavior. Whereas previous algorithms assumed a synchronous system or were too slow to be used in practice, the algorithm described in this paper is practical: it works in asynchronous environments like the Internet and incorporates several important optimizations that improve the response time of previous algorithms by more than an order of magnitude. We implemented a Byzantine-fault-tolerant NFS service using our algorithm and measured its performance. The results show that our service is only 3% slower than a standard unreplicated NFS.

2,919 Citations

Top Attributes

Conference's top 5 most impactful authors

Nickolai Zeldovich: 14 papers, 1.4K citations
Andrea C. Arpaci-Dusseau: 10 papers, 403 citations
Jason Flinn: 9 papers, 731 citations
M. Frans Kaashoek: 9 papers, 1.1K citations
Larry L. Peterson: 8 papers, 1.4K citations

Network Information
Related Conferences (5)
USENIX Annual Technical Conference: 1.2K papers, 85.1K citations, 95% related
Networked Systems Design and Implementation: 697 papers, 88.7K citations, 93% related
Symposium on Operating Systems Principles: 862 papers, 139.1K citations, 93% related
Workshop on Hot Topics in Operating Systems: 146 papers, 6.5K citations, 92% related
European Conference on Computer Systems: 824 papers, 48.5K citations, 91% related