Proceedings Article

Elastic scaling of data parallel operators in stream processing

TLDR
The paper describes an approach to elastically scaling the performance of a data analytics operator within a streaming application. The approach dynamically adjusts the amount of computation the operator can carry out in response to changes in incoming workload and the availability of processing cycles.
Abstract
We describe an approach to elastically scale the performance of a data analytics operator that is part of a streaming application. Our techniques focus on dynamically adjusting the amount of computation an operator can carry out in response to changes in incoming workload and the availability of processing cycles. We show that our elastic approach is beneficial in light of the dynamic aspects of streaming workloads and stream processing environments. Addressing another recent trend, we show the importance of our approach as a means of providing computational elasticity in multicore processor-based environments, such that operators can automatically find their best operating point. Finally, we present experiments driven by synthetic workloads, which show the space where the optimizing efforts are most beneficial, and by a radioastronomy imaging application, where we observe substantial improvements in its performance-critical section.
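
To make the idea of an operator "finding its best operating point" concrete, the following is a minimal sketch of one possible control loop. It is not the algorithm from the paper, which the abstract does not specify. It assumes a simple hill-climbing rule over the number of worker threads, and the names `ElasticController`, `update`, and `tolerance` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ElasticController:
    """Hypothetical hill-climbing controller for an operator's thread count.

    Assumption: the operator can report its throughput (e.g. tuples/second)
    for the period it ran with the current number of threads.
    """
    min_threads: int = 1
    max_threads: int = 8
    threads: int = 1             # current level of parallelism
    direction: int = 1           # +1 = try one more thread, -1 = try one fewer
    last_throughput: float = 0.0
    tolerance: float = 0.05      # relative gain below this is treated as noise

    def update(self, throughput: float) -> int:
        """Take the throughput measured at the current thread count
        and return the thread count to use for the next period."""
        improved = throughput >= self.last_throughput * (1.0 + self.tolerance)
        if not improved:
            # The last adjustment did not pay off, so probe the other way.
            self.direction = -self.direction
        self.last_throughput = throughput
        proposed = self.threads + self.direction
        # Clamp to the allowed range of threads.
        self.threads = max(self.min_threads, min(self.max_threads, proposed))
        return self.threads


if __name__ == "__main__":
    # Toy workload (an assumption): throughput scales linearly up to 4 threads,
    # then saturates, as if only 4 cores had spare cycles.
    controller = ElasticController(max_threads=8)
    for period in range(10):
        measured = min(controller.threads, 4) * 100.0
        next_threads = controller.update(measured)
        print(f"period {period}: measured {measured:.0f}, next threads {next_threads}")
```

With the toy workload above, the controller climbs to the saturation point and then hovers between 4 and 5 threads. A production scheme would also need to damp this oscillation and react to changes in the input rate.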

