Evan R. Sparks

Researcher at University of California, Berkeley

Publications - 27

Citations - 2896

Evan R. Sparks is an academic researcher from University of California, Berkeley. The author has contributed to research on topics including Scalability and Spark (mathematics). The author has an h-index of 18 and has co-authored 27 publications receiving 2638 citations. Previous affiliations of Evan R. Sparks include Dartmouth College.

Papers

Journal Article

MLlib: Machine Learning in Apache Spark

Xiangrui Meng, +15 more

01 Jan 2016

Journal of Machine Learning Research

TL;DR: MLlib is an open-source distributed machine learning library for Apache Spark that provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives.

Posted Content

MLI: An API for Distributed Machine Learning

Evan R. Sparks, +8 more

21 Oct 2013

arXiv: Learning

TL;DR: The initial results show that this interface can be used to build distributed implementations of a wide variety of common Machine Learning algorithms with minimal complexity and highly competitive performance and scalability.

Proceedings Article

Paleo: A Performance Model for Deep Neural Networks

Hang Qi, +2 more

TL;DR: This work introduces an analytical performance model called PALEO, which can efficiently and accurately model the expected scalability and performance of a putative deep learning system and is robust to the choice of network architecture, hardware, software, communication schemes, and parallelization strategies.

Proceedings Article

Automating model search for large scale machine learning

Evan R. Sparks, +5 more

TL;DR: This work proposes an architecture for automatic machine learning at scale comprising a cost-based cluster resource allocation estimator, advanced hyper-parameter tuning techniques, bandit resource allocation via runtime algorithm introspection, and physical optimization via batching and optimal resource allocation.

Proceedings Article

KeystoneML: Optimizing Pipelines for Large-Scale Advanced Analytics

Evan R. Sparks, +4 more

TL;DR: This work presents KeystoneML, a system that captures and optimizes end-to-end large-scale machine learning applications for high-throughput training in a distributed environment, with a high-level API that offers greater ease of use and higher performance than existing systems for large-scale learning.
