API design for machine learning software: experiences from the scikit-learn project

Open Access Proceedings Article

TLDR

Scikit-learn is a machine learning library written in Python, designed to be simple and efficient, accessible to non-experts, and reusable in various contexts.

Abstract:

Scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.
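
The shared interface and the composition it enables can be sketched in plain Python. This is an illustration only, not scikit-learn's code: `MeanCenterer`, `ToyCentroidClassifier`, and `ToyPipeline` are hypothetical names invented here, and the toy data is made up. The sketch assumes the convention that every unit learns from data in `fit`, that processing units expose `transform`, and that learning units expose `predict`.

```python
# Illustrative sketch only: hypothetical stand-in classes,
# not scikit-learn's implementation.

class MeanCenterer:
    """Processing unit: learns column means in fit, subtracts them in transform."""

    def fit(self, X, y=None):
        n = len(X)
        # Learned state is stored with a trailing underscore.
        self.means_ = [sum(col) / n for col in zip(*X)]
        return self  # returning self allows call chaining

    def transform(self, X):
        return [[v - m for v, m in zip(row, self.means_)] for row in X]


class ToyCentroidClassifier:
    """Learning unit: learns one centroid per class, predicts the closest one."""

    def fit(self, X, y):
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        self.centroids_ = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: sq_dist(row, self.centroids_[c]))
            for row in X
        ]


class ToyPipeline:
    """Composition: a chain of units that itself exposes fit and predict."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, X, y):
        # Fit each processing step, then pass its output to the next step.
        for step in self.steps[:-1]:
            X = step.fit(X, y).transform(X)
        self.steps[-1].fit(X, y)
        return self

    def predict(self, X):
        for step in self.steps[:-1]:
            X = step.transform(X)
        return self.steps[-1].predict(X)


# Made-up toy data: two well-separated classes.
X_train = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [4.8, 5.2]]
y_train = ["a", "a", "b", "b"]

model = ToyPipeline([MeanCenterer(), ToyCentroidClassifier()]).fit(X_train, y_train)
predictions = model.predict([[1.1, 0.9], [5.1, 4.9]])
print(predictions)
```

Because the composed object exposes the same `fit` and `predict` methods as its parts, it can be used anywhere a single unit can, which is the reusability argument in the abstract. In scikit-learn itself the corresponding pieces are estimators with `fit`, `transform`, and `predict` methods and the `sklearn.pipeline.Pipeline` class.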

Citations

Journal Article

Robust Smartphone App Identification via Encrypted Network Traffic Analysis

Vincent F. Taylor, +3 more

- 01 Jan 2018 -

IEEE Transactions on Information Forensi...

TL;DR: This paper shows that a passive eavesdropper can feasibly identify smartphone apps by fingerprinting the network traffic that they send. The apps a person uses can reveal much information about them, such as their medical conditions, sexual orientation, or religious beliefs.

Posted Content

Anti-Money Laundering in Bitcoin: Experimenting with Graph Convolutional Networks for Financial Forensics.

Mark Weber, +6 more

- 31 Jul 2019 -

arXiv: Social and Information Networks

TL;DR: This workshop tutorial motivates the opportunity to reconcile the cause of safety with that of financial inclusion, and offers a simple prototype capable of navigating the graph and observing model performance on illicit activity over time.

Posted Content

How Does Mixup Help With Robustness and Generalization

Linjun Zhang, +4 more

- 09 Oct 2020 -

arXiv: Learning

TL;DR: It is shown that minimizing the Mixup loss corresponds to approximately minimizing an upper bound of the adversarial loss, which explains why models obtained by Mixup training exhibit robustness to several kinds of adversarial attacks, such as the Fast Gradient Sign Method.

Posted Content

Provably efficient machine learning for quantum many-body problems.

Hsin-Yuan Huang, +4 more

- 25 Jun 2021 -

arXiv: Quantum Physics

TL;DR: It is proved that classical ML algorithms can efficiently predict ground state properties of gapped Hamiltonians in finite spatial dimensions, after learning from data obtained by measuring other Hamiltonians in the same quantum phase of matter.

Proceedings Article

Word embeddings for Arabic sentiment analysis

A. Aziz Altowayan, +1 more

TL;DR: This paper relies on word embeddings as the main source of features for opinion mining in Arabic text such as tweets, consumer reviews, and news articles and achieves a slightly better accuracy than the top hand-crafted methods.
