Open Access Journal Article

A Comprehensive Survey on Parallelization and Elasticity in Stream Processing

TLDR
This article surveys the state of the art in stream processing parallelization and elasticity and develops a classification of current methods, in order to consolidate the state of the research and to plan future research directions on this basis.
Abstract
Stream Processing (SP) has evolved as the leading paradigm to process and gain value from the high volume of streaming data produced, e.g., in the domain of the Internet of Things. An SP system is a middleware that deploys a network of operators between data sources, such as sensors, and the consuming applications. SP systems typically face intense and highly dynamic data streams. Parallelization and elasticity enable SP systems to process these streams with continuous high quality of service. The current research landscape provides a broad spectrum of methods for parallelization and elasticity in SP. Each method makes specific assumptions and focuses on particular aspects. However, the literature lacks a comprehensive overview and categorization of the state of the art in SP parallelization and elasticity, which is necessary to consolidate the state of the research and to plan future research directions on this basis. Therefore, in this survey, we study the literature and develop a classification of current methods for both parallelization and elasticity in SP systems.


Citations
Journal Article

Orchestrating the Development Lifecycle of Machine Learning-based IoT Applications: A Taxonomy and Survey

Abstract: Machine Learning (ML) and Internet of Things (IoT) are complementary advances: ML techniques unlock the potential of IoT with intelligence, and IoT applications increasingly feed data collected by sensors into ML models, thereby employing results to improve their business processes and services. Hence, orchestrating ML pipelines that encompass model training and implication involved in the holistic development lifecycle of an IoT application often leads to complex system integration. This article provides a comprehensive and systematic survey of the development lifecycle of ML-based IoT applications. We outline the core roadmap and taxonomy and subsequently assess and compare existing standard techniques used at individual stages.
Journal Article

Scalable Deep Learning on Distributed Infrastructures: Challenges, Techniques, and Tools

TL;DR: This survey performs a broad and thorough investigation on challenges, techniques and tools for scalable DL on distributed infrastructures, and highlights future research trends in DL systems that deserve further research.
Journal Article

Elastic Scheduling for Microservice Applications in Clouds

TL;DR: The task scheduling problem of microservices is defined as a cost optimization problem with deadline constraints. A statistics-based strategy to determine the configuration of containers under a streaming workload and an urgency-based workflow scheduling algorithm are proposed.
Journal Article

A Survey on Automatic Parameter Tuning for Big Data Processing Systems

TL;DR: This work investigates existing approaches to parameter tuning for both batch and stream data processing systems and classifies them into six categories: rule-based, cost modeling, simulation-based, experiment-driven, machine learning, and adaptive tuning.
References
Journal Article

MapReduce: simplified data processing on large clusters

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Journal Article

A view of cloud computing

TL;DR: Clearing the clouds away from the true potential and obstacles posed by this computing capability.
Journal Article

The vision of autonomic computing

TL;DR: A 2001 IBM manifesto noted the almost impossible difficulty of managing current and planned computing systems, which require integrating several heterogeneous environments into corporate-wide computing systems that extend into the Internet.
Proceedings Article

Fog computing and its role in the internet of things

TL;DR: This paper argues that the above characteristics make the Fog the appropriate platform for a number of critical Internet of Things services and applications, namely, Connected Vehicle, Smart Grid, Smart Cities, and, in general, Wireless Sensors and Actuators Networks (WSANs).
Journal Article

The many faces of publish/subscribe

TL;DR: This paper factors out the common denominator underlying these variants: full decoupling of the communicating entities in time, space, and synchronization to better identify commonalities and divergences with traditional interaction paradigms.