Evan R. Sparks
Researcher at University of California, Berkeley
Publications - 27
Citations - 2896
Evan R. Sparks is an academic researcher from the University of California, Berkeley. The author has contributed to research in topics: Scalability & Spark (mathematics). The author has an h-index of 18 and has co-authored 27 publications receiving 2638 citations. Previous affiliations of Evan R. Sparks include Dartmouth College.
Papers
Journal Article
MLlib: Machine Learning in Apache Spark
Xiangrui Meng, Joseph K. Bradley, Burak Yavuz, Evan R. Sparks, Shivaram Venkataraman, Davies Liu, Jeremy Freeman, DB Tsai, Manish Amde, Sean Owen, Doris Xin, Reynold Xin, Michael J. Franklin, Reza Bosagh Zadeh, Matei Zaharia, Ameet Talwalkar
TL;DR: MLlib is an open-source distributed machine learning library for Apache Spark that provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives.
Posted Content
MLI: An API for Distributed Machine Learning
Evan R. Sparks, Ameet Talwalkar, Virginia Smith, Jey Kottalam, Xinghao Pan, Joseph E. Gonzalez, Michael J. Franklin, Michael I. Jordan, Tim Kraska
TL;DR: The initial results show that this interface can be used to build distributed implementations of a wide variety of common Machine Learning algorithms with minimal complexity and highly competitive performance and scalability.
Proceedings Article
Paleo: A Performance Model for Deep Neural Networks
TL;DR: This work introduces an analytical performance model called PALEO, which can efficiently and accurately model the expected scalability and performance of a putative deep learning system and is robust to the choice of network architecture, hardware, software, communication schemes, and parallelization strategies.
Proceedings Article
Automating model search for large scale machine learning
TL;DR: An architecture for automatic machine learning at scale comprised of a cost-based cluster resource allocation estimator, advanced hyper-parameter tuning techniques, bandit resource allocation via runtime algorithm introspection, and physical optimization via batching and optimal resource allocation is proposed.
Proceedings Article
KeystoneML: Optimizing Pipelines for Large-Scale Advanced Analytics
TL;DR: This paper presents KeystoneML, a system that captures and optimizes end-to-end large-scale machine learning applications for high-throughput training in a distributed environment, with a high-level API that offers greater ease of use and higher performance than existing systems for large-scale learning.