Matei Zaharia
Researcher at Stanford University
Publications - 222
Citations - 49762
Matei Zaharia is an academic researcher at Stanford University. The author has contributed to research on topics including computer science and computer clusters. The author has an h-index of 53 and has co-authored 194 publications that have received 44123 citations. Previous affiliations of Matei Zaharia include the University of California, Berkeley and Microsoft.
Papers
Journal Article
A view of cloud computing
Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy H. Katz, Andy Konwinski, Gunho Lee, David A. Patterson, Ariel Rabkin, Ion Stoica, Matei Zaharia
TL;DR: Clearing the clouds away from the true potential and obstacles posed by this computing capability.
Journal Article
Above the Clouds: A Berkeley View of Cloud Computing
Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy H. Katz, Andy Konwinski, Gunho Lee, David A. Patterson, Ariel Rabkin, Ion Stoica, Matei Zaharia
TL;DR: This work focuses on SaaS Providers (Cloud Users) and Cloud Providers, which have received less attention than SaaS Users, and uses the term Private Cloud to refer to internal datacenters of a business or other organization that are not made available to the general public.
Proceedings Article
Spark: cluster computing with working sets
TL;DR: Spark can outperform Hadoop by 10x in iterative machine learning jobs, and can be used to interactively query a 39 GB dataset with sub-second response time.
Proceedings Article
Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing
Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, Ion Stoica
TL;DR: This work presents Resilient Distributed Datasets (RDDs), a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner. RDDs are implemented in a system called Spark, which is evaluated through a variety of user applications and benchmarks.
Proceedings Article
Improving MapReduce performance in heterogeneous environments
TL;DR: This work proposes a new scheduling algorithm, Longest Approximate Time to End (LATE), that is highly robust to heterogeneity and can improve Hadoop response times by a factor of 2 in clusters of 200 virtual machines on EC2.