Journal•

arXiv: Distributed, Parallel, and Cluster Computing

About: arXiv: Distributed, Parallel, and Cluster Computing is an academic journal. The journal publishes mainly in the areas of Cloud computing and Scalability. Over its lifetime, 9308 publications have been published, receiving 91411 citations.

Topics: Cloud computing, Scalability, Scheduling (computing), Speedup, Server

Posted Content•

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

[...]

01 Jan 2015-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: The TensorFlow interface and an implementation of that interface that is built at Google are described, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.

Abstract: TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.

10,447 citations

Posted Content•

TensorFlow: A system for large-scale machine learning

Martín Abadi¹, Paul Barham¹, Jianmin Chen¹, Zhifeng Chen¹, Andy Davis¹, Jeffrey Dean¹, Matthieu Devin¹, Sanjay Ghemawat¹, Geoffrey Irving¹, Michael Isard¹, Manjunath Kudlur¹, Josh Levenberg¹, Rajat Monga¹, Sherry Moore¹, Derek G. Murray¹, Benoit Steiner¹, Paul A. Tucker¹, Vijay K. Vasudevan¹, Pete Warden¹, Martin Wicke¹, Yuan Yu¹, Xiaoqiang Zheng¹•Institutions (1)

Google¹

27 May 2016-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: The TensorFlow dataflow model is described and the compelling performance that TensorFlow achieves for several real-world applications is demonstrated.

Abstract: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with particularly strong support for training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model in contrast to existing systems, and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.

5,542 citations

Posted Content•

MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems

Tianqi Chen, Mu Li¹, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, Zheng Zhang•Institutions (1)

Carnegie Mellon University¹

03 Dec 2015-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: The API design and the system implementation of MXNet are described, and it is explained how embedding of both symbolic expression and tensor operation is handled in a unified fashion.

Abstract: MXNet is a multi-language machine learning (ML) library to ease the development of ML algorithms, especially for deep neural networks. Embedded in the host language, it blends declarative symbolic expression with imperative tensor computation. It offers auto differentiation to derive gradients. MXNet is computation and memory efficient and runs on various heterogeneous systems, ranging from mobile devices to distributed GPU clusters. This paper describes both the API design and the system implementation of MXNet, and explains how embedding of both symbolic expression and tensor operation is handled in a unified fashion. Our preliminary experiments reveal promising results on large scale deep neural network applications using multiple GPU machines.

2,153 citations

Proceedings Article•DOI•

Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities

Rajkumar Buyya¹, Chee Shin Yeo¹, Srikumar Venugopal¹•Institutions (1)

University of Melbourne¹

26 Aug 2008-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: In this article, the authors present a 21st century vision of computing, identify various computing paradigms promising to deliver the vision of cloud utilities, define cloud computing and provide the architecture for creating market-oriented clouds by leveraging technologies such as VMs.

Abstract: This keynote paper: presents a 21st century vision of computing; identifies various computing paradigms promising to deliver the vision of computing utilities; defines Cloud computing and provides the architecture for creating market-oriented Clouds by leveraging technologies such as VMs; provides thoughts on market-based resource management strategies that encompass both customer-driven service management and computational risk management to sustain SLA-oriented resource allocation; presents some representative Cloud platforms especially those developed in industries along with our current work towards realising market-oriented resource allocation of Clouds by leveraging the 3rd generation Aneka enterprise Grid technology; reveals our early thoughts on interconnecting Clouds for dynamically creating an atmospheric computing environment along with pointers to future community research; and concludes with the need for convergence of competing IT paradigms for delivering our 21st century vision.

1,437 citations

Posted Content•

Internet of Things (IoT): A Vision, Architectural Elements, and Future Directions

Jayavardhana Gubbi¹, Rajkumar Buyya¹, Slaven Marusic¹, Marimuthu Palaniswami¹•Institutions (1)

University of Melbourne¹

01 Jul 2012-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: This paper presents a Cloud centric vision for worldwide implementation of Internet of Things, and expands on the need for convergence of WSN, the Internet and distributed computing directed at technological research community.

Abstract: Ubiquitous sensing enabled by Wireless Sensor Network (WSN) technologies cuts across many areas of modern day living. This offers the ability to measure, infer and understand environmental indicators, from delicate ecologies and natural resources to urban environments. The proliferation of these devices in a communicating-actuating network creates the Internet of Things (IoT), wherein, sensors and actuators blend seamlessly with the environment around us, and the information is shared across platforms in order to develop a common operating picture (COP). Fuelled by the recent adaptation of a variety of enabling device technologies such as RFID tags and readers, near field communication (NFC) devices and embedded sensor and actuator nodes, the IoT has stepped out of its infancy and is the next revolutionary technology in transforming the Internet into a fully integrated Future Internet. As we move from www (static pages web) to web2 (social networking web) to web3 (ubiquitous computing web), the need for data-on-demand using sophisticated intuitive queries increases significantly. This paper presents a cloud centric vision for worldwide implementation of Internet of Things. The key enabling technologies and application domains that are likely to drive IoT research in the near future are discussed. A cloud implementation using Aneka, which is based on interaction of private and public clouds is presented. We conclude our IoT vision by expanding on the need for convergence of WSN, the Internet and distributed computing directed at technological research community.

1,372 citations

Network Information

Related Journals (5)

Journal	Papers	Citations	Related
IEEE Transactions on Parallel and Distributed Systems	5.2K	237.8K	90%
Future Generation Computer Systems	6.6K	246.8K	86%
ACM Computing Surveys	2.4K	395.7K	85%
arXiv: Learning	45K	837.1K	84%
SIAM Journal on Computing	3.5K	327.5K	84%

Performance

Metrics
Papers	9,308
Citations	119,879

No. of papers from the Journal in previous years
Year	Papers
2022	1
2021	1,208
2020	1,389
2019	1,231
2018	1,030
2017	855