Alan Gates
Researcher at Yahoo!
Publications - 6
Citations - 670
Alan Gates is an academic researcher at Yahoo!. The author has contributed to research on the topics of big data and data warehouses. The author has an h-index of 6 and has co-authored 6 publications receiving 628 citations.
Papers
Journal ArticleDOI
Building a high-level dataflow system on top of Map-Reduce: the Pig experience
Alan Gates,Olga Natkovich,Shubham Chopra,Pradeep Kamath,Shravan Narayanamurthy,Christopher Olston,Benjamin Reed,Santhosh Srinivasan,Utkarsh Srivastava +8 more
TL;DR: Pig is a high-level dataflow system that aims at a sweet spot between SQL and Map-Reduce, and performance comparisons between Pig execution and raw Map-Reduce execution are reported.
Proceedings ArticleDOI
Major technical advancements in Apache Hive
Yin Huai,Ashutosh Chauhan,Alan Gates,Günther Hagleitner,Eric N. Hanson,Owen O'Malley,Jitendra Pandey,Yuan Yuan,Rubao Lee,Xiaodong Zhang +9 more
TL;DR: A community-based effort on technical advancements in Hive provides significant improvements on storage efficiency and query execution performance and shows how academic research lays a foundation for Hive to improve its daily operations.
Patent
Clustered query support for a database query engine
TL;DR: A plurality of queries is transformed into a plurality of parse trees, and a determination is made whether the queries operate on at least the same portion of the same table.
Proceedings ArticleDOI
Apache Hive: From MapReduce to Enterprise-grade Big Data Warehousing
Jesús Camacho-Rodríguez,Ashutosh Chauhan,Alan Gates,Eugene Koifman,Owen O'Malley,Vineet Garg,Zoltan Haindrich,Sergey Shelukhin,Prasanth Jayachandran,Siddharth Seth,Deepak Jaiswal,Slim Bouguerra,Nishant Bangarwa,Sankar Hariappan,Anishek Agarwal,Jason Dere,Daniel Dai,Thejas Nair,Nita Dembla,Gopal Vijayaraghavan,Günther Hagleitner +20 more
TL;DR: Apache Hive is an open-source relational database system for analytic big-data workloads that combines traditional MPP techniques with more recent big data and cloud concepts to achieve the scale and performance required by today's analytic applications.
Book
Programming Pig: Dataflow Scripting with Hadoop
Alan Gates,Daniel Dai +1 more
TL;DR: This second edition of the Apache Pig scripting platform guide is the ideal learning tool for new and experienced users alike, with comprehensive coverage of key features such as the Pig Latin scripting language and the Grunt shell.