Conference

Conference on Innovative Data Systems Research

About: The Conference on Innovative Data Systems Research is an academic conference. It publishes mainly in the areas of computer science and data management. Over its lifetime, the conference has published 542 papers, which have received 27,193 citations.

Topics: Computer science, Data management, Query optimization, Analytics, SQL

Papers

Proceedings Article

The Design of the Borealis Stream Processing Engine

Daniel J. Abadi¹, Yanif Ahmad², Magdalena Balazinska¹, Mitch Cherniack³, Jeong-Hyon Hwang², Wolfgang Lindner¹, Anurag S. Maskey³, Alexander Rasin², Esther Ryvkina³, Nesime Tatbul², Ying Xing², Stan Zdonik²

Massachusetts Institute of Technology¹, Brown University², Brandeis University³

01 Jan 2005

TL;DR: This paper outlines the basic design and functionality of Borealis, and presents a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.

Abstract: Borealis is a second-generation distributed stream processing engine that is being developed at Brandeis University, Brown University, and MIT. Borealis inherits core stream processing functionality from Aurora [14] and distribution functionality from Medusa [51]. Borealis modifies and extends both systems in non-trivial and critical ways to provide advanced capabilities that are commonly required by newly-emerging stream processing applications. In this paper, we outline the basic design and functionality of Borealis. Through sample real-world applications, we motivate the need for dynamically revising query results and modifying query specifications. We then describe how Borealis addresses these challenges through an innovative set of features, including revision records, time travel, and control lines. Finally, we present a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.

1,533 citations

Proceedings Article

TelegraphCQ: Continuous Dataflow Processing for an Uncertain World.

Sirish Chandrasekaran, Owen Cooper, Amol Deshpande, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong, Sailesh Krishnamurthy, Samuel Madden, Vijayshankar Raman, Frederick Reiss, Mehul A. Shah

01 Jan 2003

TL;DR: The next generation Telegraph system, called TelegraphCQ, is focused on meeting the challenges that arise in handling large streams of continuous queries over high-volume, highly-variable data streams and leverages the PostgreSQL open source code base.

Abstract: Increasingly pervasive networks are leading towards a world where data is constantly in motion. In such a world, conventional techniques for query processing, which were developed under the assumption of a far more static and predictable computational environment, will not be sufficient. Instead, query processors based on adaptive dataflow will be necessary. The Telegraph project has developed a suite of novel technologies for continuously adaptive query processing. The next generation Telegraph system, called TelegraphCQ, is focused on meeting the challenges that arise in handling large streams of continuous queries over high-volume, highly-variable data streams. In this paper, we describe the system architecture and its underlying technology, and report on our ongoing implementation effort, which leverages the PostgreSQL open source code base. We also discuss open issues and our research agenda.

1,248 citations

Proceedings Article

Megastore: Providing Scalable, Highly Available Storage for Interactive Services

Jason Baker¹, Christopher N. Bond¹, James C. Corbett¹, J. J. Furman¹, Andrey Khorlin¹, James Larson¹, Jean-Michel Leon¹, Yawei Li¹, Alexander Lloyd¹, Vadim Yushprakh¹

Google¹

01 Jan 2011

TL;DR: Megastore provides fully serializable ACID semantics within fine-grained partitions of data, which allows us to synchronously replicate each write across a wide area network with reasonable latency and support seamless failover between datacenters.

Abstract: Megastore is a storage system developed to meet the requirements of today’s interactive online services. Megastore blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS in a novel way, and provides both strong consistency guarantees and high availability. We provide fully serializable ACID semantics within fine-grained partitions of data. This partitioning allows us to synchronously replicate each write across a wide area network with reasonable latency and support seamless failover between datacenters. This paper describes Megastore’s semantics and replication algorithm. It also describes our experience supporting a wide range of Google production services built with Megastore.

802 citations

Proceedings Article

YAGO3: A Knowledge Base from Multilingual Wikipedias

Farzaneh Mahdisoltani¹, Joanna Biega¹, Fabian M. Suchanek

Max Planck Society¹

01 Jan 2014

TL;DR: This work fuses the multilingual information with the English WordNet to build one coherent knowledge base that combines the information from the Wikipedias in multiple languages, and enlarges YAGO by 1m new entities and 7m new facts.

Abstract: We present YAGO3, an extension of the YAGO knowledge base that combines the information from the Wikipedias in multiple languages. Our technique fuses the multilingual information with the English WordNet to build one coherent knowledge base. We make use of the categories, the infoboxes, and Wikidata, and learn the meaning of infobox attributes across languages. We run our method on 10 different languages, and achieve a precision of 95%-100% in the attribute mapping. Our technique enlarges YAGO by 1m new entities and 7m new facts.

695 citations

Proceedings Article

Starfish: A Self-tuning System for Big Data Analytics.

Herodotos Herodotou¹, Harold Lim¹, Gang Luo, Nedyalko Borisov¹, Liang Dong, Fatma Bilgen Cetin, Shivnath Babu¹

Duke University¹

01 Jan 2011

TL;DR: Starfish is introduced, a self-tuning system for big data analytics that builds on Hadoop while adapting to user needs and system workloads to provide good performance automatically, without any need for users to understand and manipulate the many tuning knobs in Hadoop.

Abstract: Timely and cost-effective analytics over “Big Data” is now a key ingredient for success in many businesses, scientific and engineering disciplines, and government endeavors. The Hadoop software stack, which consists of an extensible MapReduce execution engine, pluggable distributed storage engines, and a range of procedural to declarative interfaces, is a popular choice for big data analytics. Most practitioners of big data analytics (like computational scientists, systems researchers, and business analysts) lack the expertise to tune the system to get good performance. Unfortunately, Hadoop’s performance out of the box leaves much to be desired, leading to suboptimal use of resources, time, and money (in pay-as-you-go clouds). We introduce Starfish, a self-tuning system for big data analytics. Starfish builds on Hadoop while adapting to user needs and system workloads to provide good performance automatically, without any need for users to understand and manipulate the many tuning knobs in Hadoop. While Starfish’s system architecture is guided by work on self-tuning database systems, we discuss how new analysis practices over big data pose new challenges, leading us to different design choices in Starfish.

663 citations

Performance

Metrics

Papers: 542

Citations: 27,193

No. of papers from the Conference in previous years
Year	Papers
2023	26
2022	40
2021	36
2020	44
2019	46
2018	6