Matthias J. Sax
Researcher at Humboldt University of Berlin
Publications - 7
Citations - 784
Matthias J. Sax is an academic researcher from Humboldt University of Berlin. The author has contributed to research on the topics of query optimization and big data. The author has an h-index of 6 and has co-authored 7 publications receiving 703 citations. Previous affiliations of Matthias J. Sax include Humboldt State University.
Papers
Journal Article
The Stratosphere platform for big data analytics
Alexander Alexandrov, Rico Bergmann, Stephan Ewen, Johann-Christoph Freytag, Fabian Hueske, Arvid Heise, Odej Kao, Marcus Leich, Ulf Leser, Volker Markl, Felix Naumann, Mathias Peters, Astrid Rheinländer, Matthias J. Sax, Sebastian Schelter, Mareike Hoger, Kostas Tzoumas, Daniel Warneke
TL;DR: The paper presents the overall system architecture and design decisions, introduces Stratosphere through example queries, and dives into the internal workings of the system's components that relate to extensibility, programming model, optimization, and query execution.
Posted Content
Opening the Black Boxes in Data Flow Optimization
Fabian Hueske, Mathias Peters, Matthias J. Sax, Astrid Rheinländer, Rico Bergmann, Aljoscha Krettek, Kostas Tzoumas
TL;DR: This paper addresses data flow optimization at a level of abstraction where the semantics of operators are not known, by statically analyzing the general-purpose code of their user-defined functions.
Journal Article
Opening the black boxes in data flow optimization
Fabian Hueske, Mathias Peters, Matthias J. Sax, Astrid Rheinländer, Rico Bergmann, Aljoscha Krettek, Kostas Tzoumas
TL;DR: This work designs and implements an optimizer for parallel data flows that does not assume knowledge of the semantics or algebraic properties of operators and can optimize the operator order of nonrelational data flows, a unique feature among today's systems.
Proceedings Article
Streams and Tables: Two Sides of the Same Coin
TL;DR: The model presents the result of an operator as a stream of successive updates, which induces a duality between results and streams. This duality provides a natural way to cope with inconsistencies between the physical and logical order of streaming data in a continuous manner, without explicit buffering and reordering.
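The duality described in this summary can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `Update`, `running_count`, and `stream_to_table` are hypothetical, and the per-key count is only an assumed example operator. The sketch shows an operator result emitted as a stream of successive updates, and the same result recovered as a table by applying those updates in order.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical update record: (key, new value); None marks a deletion.
Update = Tuple[str, Optional[int]]


def running_count(events: Iterable[str]) -> Iterator[Update]:
    """Emit the result of a per-key count as a stream of successive updates."""
    counts: Dict[str, int] = {}
    for key in events:
        counts[key] = counts.get(key, 0) + 1
        yield key, counts[key]


def stream_to_table(updates: Iterable[Update]) -> Dict[str, int]:
    """Materialize a table by applying each update in arrival order."""
    table: Dict[str, int] = {}
    for key, value in updates:
        if value is None:
            table.pop(key, None)  # a deletion removes the row
        else:
            table[key] = value    # a later update overwrites the earlier one
    return table


events = ["a", "b", "a", "a"]
changelog = list(running_count(events))  # the result viewed as a stream
table = stream_to_table(changelog)       # the same result viewed as a table
```

Because every intermediate result is already visible as an update, a consumer can read the latest value at any time instead of waiting for a final answer.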
Proceedings Article
Consistency and Completeness: Rethinking Distributed Stream Processing in Apache Kafka
Guozhang Wang, Lei Chen, Ayusman Dikshit, Jason Gustafson, Boyang Chen, Matthias J. Sax, John Roesler, Sophie Blee-Goldman, Bruno Cadonna, Apurva Mehta, Varun Madan, Jun Rao
TL;DR: Kafka Streams is a scalable stream processing client library in Apache Kafka that defines the processing logic as read-process-write cycles in which all processing state updates and result outputs are captured as log appends.
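A read-process-write cycle of this kind can be sketched in plain Python. This is only an illustration under stated assumptions, not the Kafka Streams API: the in-memory lists `input_log`, `changelog`, `output_log`, and `offset_log` stand in for logs, the functions `run_cycle` and `restore_state` are hypothetical, and the sketch does not model how the appends of one cycle are made atomic.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-ins for append-only logs.
input_log: List[Tuple[str, int]] = [("a", 1), ("b", 5), ("a", 2)]
changelog: List[Tuple[str, int]] = []   # state updates, captured as appends
output_log: List[Tuple[str, int]] = []  # results, captured as appends
offset_log: List[int] = []              # input positions processed so far


def run_cycle(position: int, state: Dict[str, int]) -> int:
    """Run one read-process-write cycle and return the next input position."""
    key, value = input_log[position]   # read one input record
    total = state.get(key, 0) + value  # process: running sum per key
    state[key] = total
    changelog.append((key, total))     # write: the state update
    output_log.append((key, total))    # write: the result
    offset_log.append(position + 1)    # write: the progress marker
    return position + 1


def restore_state() -> Dict[str, int]:
    """Rebuild the processing state by replaying the changelog."""
    state: Dict[str, int] = {}
    for key, total in changelog:
        state[key] = total
    return state


state: Dict[str, int] = {}
position = 0
while position < len(input_log):
    position = run_cycle(position, state)
```

Since the state only ever changes through appends, replaying `changelog` with `restore_state()` reproduces the same state after a restart.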