Coding for Distributed Fog Computing
TL;DR: In this paper, the authors demonstrate the transformational role of coding in fog computing: leveraging the abundant redundancy in fog networks to substantially reduce the bandwidth consumption and latency of computing. They also discuss two recently proposed coding concepts, minimum bandwidth codes and minimum latency codes.
Abstract: Redundancy is abundant in fog networks (i.e., many computing and storage points) and grows linearly with network size. We demonstrate the transformational role of coding in fog computing for leveraging such redundancy to substantially reduce the bandwidth consumption and latency of computing. In particular, we discuss two recently proposed coding concepts, minimum bandwidth codes and minimum latency codes, and illustrate their impacts on fog computing. We also review a unified coding framework that includes the above two coding techniques as special cases, and enables a trade-off between computation latency and communication load to optimize system performance. Finally, we discuss several open problems and future research directions.
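The latency side of this trade-off is easiest to see with coded matrix multiplication, where a maximum distance separable (MDS) code lets a master recover the result from the fastest subset of workers and ignore stragglers. Below is a minimal numpy sketch of this minimum-latency idea; the block count, the random generator matrix, and the particular set of finishing workers are illustrative assumptions, not the paper's exact construction.

import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 3                 # 5 workers; any 3 returned results suffice (assumed parameters)
rows, cols = 6, 4           # rows must be divisible by k in this sketch
A = rng.standard_normal((rows, cols))
x = rng.standard_normal(cols)

# Encode: split A row-wise into k blocks and mix them with a random generator matrix
# (any k of its rows are invertible with high probability, mimicking an MDS code).
blocks = np.split(A, k)                       # k blocks of shape (rows/k, cols)
G = rng.standard_normal((n, k))
coded_blocks = [sum(G[i, j] * blocks[j] for j in range(k)) for i in range(n)]

# Each worker i computes its coded partial product; suppose only workers in S finish.
partials = {i: coded_blocks[i] @ x for i in range(n)}
S = [0, 2, 4]                                 # the k fastest workers; 1 and 3 straggle

# Decode: stack the k received partials and invert the corresponding rows of G.
R = np.stack([partials[i] for i in S])        # shape (k, rows/k)
Y = np.linalg.solve(G[S], R)                  # recover the k uncoded partial products
y = Y.reshape(-1)

assert np.allclose(y, A @ x)                  # result matches the uncoded computation

Because any k of the n partial results are decodable, the job completes as soon as the k fastest workers respond, which is what cuts the computation latency.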
Citations
TL;DR: This paper provides a tutorial on fog computing and its related computing paradigms, including their similarities and differences, and provides a taxonomy of research topics in fog computing.
783 citations
TL;DR: Fog computing is not a substitute for cloud computing but a powerful complement: it enables processing at the edge while still offering the possibility to interact with the cloud, addressing challenges such as the distance between the cloud and the end devices for latency-sensitive applications.
Abstract: Cloud computing with its three key facets (i.e., Infrastructure-as-a-Service, Platform-as-a-Service, and Software-as-a-Service) and its inherent advantages (e.g., elasticity and scalability) still faces several challenges. The distance between the cloud and the end devices might be an issue for latency-sensitive applications such as disaster management and content delivery applications. Service level agreements (SLAs) may also impose processing at locations where the cloud provider does not have data centers. Fog computing is a novel paradigm to address such issues. It enables provisioning resources and services outside the cloud, at the edge of the network, closer to end devices, or eventually, at locations stipulated by SLAs. Fog computing is not a substitute for cloud computing but a powerful complement. It enables processing at the edge while still offering the possibility to interact with the cloud. This paper presents a comprehensive survey on fog computing. It critically reviews the state of the art in the light of a concise set of evaluation criteria. We cover both the architectures and the algorithms that make fog systems. Challenges and research directions are also introduced. In addition, the lessons learned are reviewed and the prospects are discussed in terms of the key role fog is likely to play in emerging technologies such as tactile Internet.
598 citations
TL;DR: A coded scheme, named “coded distributed computing” (CDC), is proposed to demonstrate that increasing the computation load of the Map functions by a factor of r can create novel coding opportunities that reduce the communication load by the same factor.
Abstract: How can we optimally trade extra computing power to reduce the communication load in distributed computing? We answer this question by characterizing a fundamental tradeoff between computation and communication in distributed computing, i.e., the two are inversely proportional to each other. More specifically, a general distributed computing framework, motivated by commonly used structures like MapReduce, is considered, where the overall computation is decomposed into computing a set of “Map” and “Reduce” functions distributedly across multiple computing nodes. A coded scheme, named “coded distributed computing” (CDC), is proposed to demonstrate that increasing the computation load of the Map functions by a factor of r (i.e., evaluating each function at r carefully chosen nodes) can create novel coding opportunities that reduce the communication load by the same factor. An information-theoretic lower bound on the communication load is also provided, which matches the communication load achieved by the CDC scheme. As a result, the optimal computation-communication tradeoff in distributed computing is exactly characterized. Finally, the coding techniques of CDC are applied to the Hadoop TeraSort benchmark to develop a novel CodedTeraSort algorithm, which is empirically demonstrated to speed up the overall job execution by 1.97x to 3.39x for typical settings of interest.
399 citations
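To make the stated factor-of-r gain concrete: for K nodes and computation load r, the normalized communication load characterized in this line of work is 1 - r/K without coding and (1/r)(1 - r/K) with CDC. The short sketch below simply tabulates the two expressions; the choice K = 10 is an illustrative assumption.

def load_uncoded(r, K):
    # Uncoded scheme: each node must fetch the fraction of data it does not map.
    return 1 - r / K

def load_cdc(r, K):
    # CDC scheme: coded multicasting cuts the uncoded load by a further factor of r.
    return (1 - r / K) / r

K = 10  # assumed number of computing nodes
for r in range(1, K + 1):
    gain = load_uncoded(r, K) / load_cdc(r, K) if r < K else float("nan")
    print(f"r={r:2d}  uncoded={load_uncoded(r, K):.3f}  "
          f"CDC={load_cdc(r, K):.3f}  gain={gain:.1f}x")

The printed gain column equals r, matching the abstract's claim that an r-fold increase in computation load buys an r-fold reduction in communication load.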
TL;DR: A novel caching scheme is proposed, which strictly improves the state of the art by exploiting commonality among user demands and fully characterize the rate-memory tradeoff for a decentralized setting, in which users fill out their cache content without any coordination.
Abstract: We consider a basic cache network, in which a single server is connected to multiple users via a shared bottleneck link. The server has a database of files (content). Each user has an isolated memory that can be used to cache content in a prefetching phase. In a following delivery phase, each user requests a file from the database, and the server needs to deliver users’ demands as efficiently as possible by taking into account their cache contents. We focus on an important and commonly used class of prefetching schemes, where the caches are filled with uncoded data. We provide the exact characterization of the rate-memory tradeoff for this problem, by deriving both the minimum average rate (for a uniform file popularity) and the minimum peak rate required on the bottleneck link for a given cache size available at each user. In particular, we propose a novel caching scheme, which strictly improves the state of the art by exploiting commonality among user demands. We then demonstrate the exact optimality of our proposed scheme through a matching converse, by dividing the set of all demands into types, and showing that the placement phase in the proposed caching scheme is universally optimal for all types. Using these techniques, we also fully characterize the rate-memory tradeoff for a decentralized setting, in which users fill out their cache content without any coordination.
378 citations
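For the centralized setting with uncoded prefetching, the exact peak rate can be evaluated in closed form. The sketch below assumes the expression R = [C(K, t+1) - C(K-Ne, t+1)] / C(K, t), with t = KM/N an integer and Ne = min(K, N) distinct demandable files; treat the formula and parameter names as a paraphrase of this line of work rather than a verbatim statement of the paper.

from math import comb

def peak_rate(K, N, M):
    # K users, N files, cache size M files per user; assumes t = K*M/N is an integer.
    t = K * M // N          # cache replication factor
    Ne = min(K, N)          # at most Ne distinct files can be demanded
    return (comb(K, t + 1) - comb(K - Ne, t + 1)) / comb(K, t)

K = N = 4  # assumed toy parameters
for M in range(N + 1):
    print(f"M={M}: peak rate R = {peak_rate(K, N, M):.3f}")
# M=0 gives R = 4 (every demand sent uncoded); M=N gives R = 0 (everything cached).

For K = N = 4 and M = 1 this yields R = 1.5, consistent with the classic coded-caching rate K(1 - M/N) / (1 + KM/N) at integer t.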
References
06 Dec 2004
TL;DR: This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets, which runs on large clusters of commodity machines and is highly scalable.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper.
Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system.
Our implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use: hundreds of MapReduce programs have been implemented and upwards of one thousand MapReduce jobs are executed on Google's clusters every day.
20,309 citations
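The map/reduce decomposition described in this abstract fits in a few lines of Python. The sketch below is an illustrative single-machine word count, not Google's implementation: map_fn, reduce_fn, and the in-memory shuffle are hypothetical stand-ins for the user-specified functions and the distributed runtime.

from collections import defaultdict

def map_fn(doc):
    # Map: process one input record into intermediate (key, value) pairs.
    return [(word, 1) for word in doc.split()]

def reduce_fn(key, values):
    # Reduce: merge all intermediate values associated with the same key.
    return key, sum(values)

def mapreduce(docs):
    groups = defaultdict(list)
    for doc in docs:                    # map phase
        for k, v in map_fn(doc):
            groups[k].append(v)         # shuffle: group values by intermediate key
    return dict(reduce_fn(k, vs) for k, vs in groups.items())  # reduce phase

print(mapreduce(["the fog the cloud", "the edge"]))
# {'the': 3, 'fog': 1, 'cloud': 1, 'edge': 1}

In the real system the map and reduce invocations run on different machines, and the shuffle is the inter-machine communication step that the CDC work above targets.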
TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day.
17,663 citations
22 Jun 2010
TL;DR: Spark can outperform Hadoop by 10x in iterative machine learning jobs, and can be used to interactively query a 39 GB dataset with sub-second response time.
Abstract: MapReduce and its variants have been highly successful in implementing large-scale data-intensive applications on commodity clusters. However, most of these systems are built around an acyclic data flow model that is not suitable for other popular applications. This paper focuses on one such class of applications: those that reuse a working set of data across multiple parallel operations. This includes many iterative machine learning algorithms, as well as interactive data analysis tools. We propose a new framework called Spark that supports these applications while retaining the scalability and fault tolerance of MapReduce. To achieve these goals, Spark introduces an abstraction called resilient distributed datasets (RDDs). An RDD is a read-only collection of objects partitioned across a set of machines that can be rebuilt if a partition is lost. Spark can outperform Hadoop by 10x in iterative machine learning jobs, and can be used to interactively query a 39 GB dataset with sub-second response time.
4,959 citations
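A minimal PySpark sketch of the RDD abstraction described above, assuming a local pyspark installation; the toy dataset and operations are illustrative. The point is that a cached RDD is reused by multiple parallel operations without being recomputed from the input, which is what acyclic MapReduce-style data flow cannot express.

from pyspark import SparkContext

sc = SparkContext("local", "rdd-sketch")

# Build an RDD and mark it as cached: the working set is kept in memory
# across operations instead of being re-read and re-derived each time.
points = sc.parallelize(range(100_000)).map(lambda i: (i % 10, i)).cache()

# Two separate parallel operations reuse the same cached partitions.
total = points.map(lambda kv: kv[1]).reduce(lambda a, b: a + b)
per_key = points.countByKey()

print(total, dict(per_key))
sc.stop()

If a partition of the cached RDD is lost, Spark rebuilds it from its lineage (the parallelize-then-map recipe above) rather than from a replica, which is how RDDs retain MapReduce-style fault tolerance.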
17 Aug 2012
TL;DR: This paper argues that the above characteristics make the Fog the appropriate platform for a number of critical Internet of Things services and applications, namely, Connected Vehicle, Smart Grid, Smart Cities, and, in general, Wireless Sensors and Actuators Networks (WSANs).
Abstract: Fog Computing extends the Cloud Computing paradigm to the edge of the network, thus enabling a new breed of applications and services. Defining characteristics of the Fog are: a) Low latency and location awareness; b) Wide-spread geographical distribution; c) Mobility; d) Very large number of nodes, e) Predominant role of wireless access, f) Strong presence of streaming and real time applications, g) Heterogeneity. In this paper we argue that the above characteristics make the Fog the appropriate platform for a number of critical Internet of Things (IoT) services and applications, namely, Connected Vehicle, Smart Grid, Smart Cities, and, in general, Wireless Sensors and Actuators Networks (WSANs).
4,440 citations