
Showing papers on "Active learning (machine learning) published in 2019"


Journal ArticleDOI
08 Aug 2019
TL;DR: A comprehensive overview and analysis of the most recent research in machine learning principles, algorithms, descriptors, and databases in materials science, proposing solutions and future research paths for various challenges in computational materials science.
Abstract: One of the most exciting tools that have entered the materials science toolbox in recent years is machine learning. This collection of statistical methods has already proved capable of considerably speeding up both fundamental and applied research. At present, we are witnessing an explosion of works that develop and apply machine learning to solid-state systems. We provide a comprehensive overview and analysis of the most recent research in this topic. As a starting point, we introduce machine learning principles, algorithms, descriptors, and databases in materials science. We continue with the description of different machine learning approaches for the discovery of stable materials and the prediction of their crystal structure. Then we discuss research in numerous quantitative structure–property relationships and various approaches for the replacement of first-principles methods by machine learning. We review how active learning and surrogate-based optimization can be applied to improve the rational design process, along with related examples of applications. Two major recurring questions are the interpretability of, and the physical understanding gained from, machine learning models; we therefore consider the different facets of interpretability and their importance in materials science. Finally, we propose solutions and future research paths for various challenges in computational materials science.

1,301 citations


Proceedings ArticleDOI
Donggeun Yoo, In So Kweon
15 Jun 2019
TL;DR: In this article, the authors propose a novel active learning method that is simple but task-agnostic and works efficiently with deep networks: a small parametric module, named ``loss prediction module,'' is attached to a target network and learned to predict the target losses of unlabeled inputs.
Abstract: The performance of deep neural networks improves with more annotated data. The problem is that the budget for annotation is limited. One solution to this is active learning, where a model asks a human to annotate data that it perceives as uncertain. A variety of recent methods have been proposed to apply active learning to deep networks, but most of them are either designed specifically for their target tasks or computationally inefficient for large networks. In this paper, we propose a novel active learning method that is simple but task-agnostic and works efficiently with deep networks. We attach a small parametric module, named ``loss prediction module,'' to a target network and learn it to predict the target losses of unlabeled inputs. This module can then suggest data on which the target model is likely to produce a wrong prediction. The method is task-agnostic because networks are learned from a single loss regardless of the target task. We rigorously validate our method on image classification, object detection, and human pose estimation with recent network architectures. The results demonstrate that our method consistently outperforms previous methods across these tasks.
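To make the mechanism concrete, here is a minimal PyTorch-style sketch of a loss prediction module and the resulting query rule. It follows the paper's description only loosely; the branch architecture, pooling, and function names are our assumptions, not the authors' released code.

import torch
import torch.nn as nn

class LossPredictionModule(nn.Module):
    """Maps intermediate features of a target network to a scalar
    predicted loss (illustrative sketch, not the paper's exact code)."""
    def __init__(self, feature_dims, hidden=128):
        super().__init__()
        self.branches = nn.ModuleList([nn.Linear(d, hidden) for d in feature_dims])
        self.out = nn.Linear(hidden * len(feature_dims), 1)

    def forward(self, features):
        # features: list of globally pooled activations, one per tapped layer
        h = [torch.relu(fc(f)) for fc, f in zip(self.branches, features)]
        return self.out(torch.cat(h, dim=1)).squeeze(1)

@torch.no_grad()
def select_for_annotation(loss_module, feature_batches, k):
    """Rank unlabeled samples by predicted loss and query the top-k."""
    scores = torch.cat([loss_module(f) for f in feature_batches])
    return torch.topk(scores, k).indices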

429 citations


Journal ArticleDOI
TL;DR: In this article, a materials design strategy combining a machine learning (ML) surrogate model with experimental design algorithms to search for high entropy alloys (HEAs) with large hardness in a model Al-Co-Cr-Cu-Fe-Ni system was proposed.

387 citations


Journal ArticleDOI
18 Feb 2019
TL;DR: Reviews how methods from the information sciences enable us to accelerate the search and discovery of new materials; in particular, active learning allows us to effectively navigate the search space iteratively to identify promising candidates for guiding experiments and computations.
Abstract: One of the main challenges in materials discovery is efficiently exploring the vast search space for targeted properties as approaches that rely on trial-and-error are impractical. We review how methods from the information sciences enable us to accelerate the search and discovery of new materials. In particular, active learning allows us to effectively navigate the search space iteratively to identify promising candidates for guiding experiments and computations. The approach relies on the use of uncertainties and making predictions from a surrogate model together with a utility function that prioritizes the decision making process on unexplored data. We discuss several utility functions and demonstrate their use in materials science applications, impacting both experimental and computational research. We summarize by indicating generalizations to multiple properties and multifidelity data, and identify challenges, future directions and opportunities in the emerging field of materials informatics.
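As one concrete example of such a utility function, the snippet below computes expected improvement from a surrogate's predictive mean and uncertainty. The review discusses several such functions; the Gaussian predictive distribution here is our simplifying assumption.

import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_so_far):
    """Expected improvement of each candidate over the current best
    (maximization), given a surrogate's predictive mean and std."""
    sigma = np.maximum(sigma, 1e-12)   # guard against zero uncertainty
    z = (mu - best_so_far) / sigma
    return (mu - best_so_far) * norm.cdf(z) + sigma * norm.pdf(z)

# Typical use with a Gaussian process surrogate (hypothetical objects):
# mu, sigma = surrogate.predict(candidates, return_std=True)
# next_experiment = candidates[np.argmax(expected_improvement(mu, sigma, best))]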

297 citations


Journal ArticleDOI
Linfeng Zhang, Deye Lin, Han Wang, Roberto Car, Weinan E
TL;DR: Application to the sample systems of Al, Mg and Al-Mg alloys demonstrates that DP-GEN can produce uniformly accurate PES models with a minimal number of reference data.
Abstract: An active learning procedure called deep potential generator (DP-GEN) is proposed for the construction of accurate and transferable machine learning-based models of the potential energy surface (PES) for the molecular modeling of materials. This procedure consists of three main components: exploration, generation of accurate reference data, and training. Application to the sample systems of Al, Mg, and Al-Mg alloys demonstrates that DP-GEN can produce uniformly accurate PES models with a minimal number of reference data.
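The exploration–labeling–training cycle can be summarized schematically as below. Every name here (the callables, the deviation band) is a hypothetical paraphrase of the procedure described in the abstract, not DP-GEN's actual interface.

import numpy as np

def concurrent_learning_loop(initial_data, train_ensemble, explore,
                             reference_label, n_iterations,
                             lo=0.05, hi=0.20):
    """Schematic loop in the spirit of DP-GEN: train an ensemble of PES
    models, explore new configurations, and send only configurations on
    which the ensemble disagrees (within a band) to the reference solver."""
    data = list(initial_data)
    for _ in range(n_iterations):
        models = train_ensemble(data)          # several independently seeded models
        configs = explore(models[0])           # e.g. molecular dynamics sampling
        preds = np.array([[m(c) for c in configs] for m in models])
        deviation = preds.std(axis=0)          # ensemble spread per configuration
        picked = [c for c, d in zip(configs, deviation) if lo < d < hi]
        data.extend(reference_label(picked))   # e.g. DFT reference calculations
    return data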

282 citations


Posted Content
TL;DR: BADGE as discussed by the authors samples groups of points that are disparate and high-magnitude when represented in a hallucinated gradient space, a strategy designed to incorporate both predictive uncertainty and sample diversity into every selected batch.
Abstract: We design a new algorithm for batch active learning with deep neural network models. Our algorithm, Batch Active learning by Diverse Gradient Embeddings (BADGE), samples groups of points that are disparate and high-magnitude when represented in a hallucinated gradient space, a strategy designed to incorporate both predictive uncertainty and sample diversity into every selected batch. Crucially, BADGE trades off between diversity and uncertainty without requiring any hand-tuned hyperparameters. We show that while other approaches sometimes succeed for particular batch sizes or architectures, BADGE consistently performs as well or better, making it a versatile option for practical active learning problems.
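A minimal sketch of the two steps: hallucinated-gradient embeddings (the gradient of the cross-entropy loss with respect to the last linear layer, under the model's own predicted label) followed by k-means++ seeding, which favors batches that are both high-magnitude and mutually distant. The embedding form follows the paper's construction; the surrounding plumbing is our assumption.

import numpy as np

def gradient_embeddings(probs, feats):
    """BADGE-style embedding: gradient of the cross-entropy loss w.r.t. the
    last linear layer, using the model's own argmax as a pseudo-label.
    probs: (n, classes) softmax outputs; feats: (n, d) penultimate features."""
    g = probs.copy()
    g[np.arange(len(g)), probs.argmax(axis=1)] -= 1.0   # p - one_hot(y_hat)
    return (g[:, :, None] * feats[:, None, :]).reshape(len(g), -1)

def kmeanspp_batch(emb, k, seed=0):
    """k-means++ seeding over the embeddings doubles as batch selection."""
    rng = np.random.default_rng(seed)
    idx = [int(rng.integers(len(emb)))]
    d2 = ((emb - emb[idx[0]]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        nxt = int(rng.choice(len(emb), p=d2 / d2.sum()))
        idx.append(nxt)
        d2 = np.minimum(d2, ((emb - emb[nxt]) ** 2).sum(axis=1))
    return idx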

262 citations


Proceedings ArticleDOI
31 Mar 2019
TL;DR: In this article, a pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner is proposed, where the VAE tries to trick the adversarial network into predicting that all data points are from the labeled pool.
Abstract: Active learning aims to develop label-efficient algorithms by sampling the most representative queries to be labeled by an oracle. We describe a pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner. Our method learns a latent space using a variational autoencoder (VAE) and an adversarial network trained to discriminate between unlabeled and labeled data. The mini-max game between the VAE and the adversarial network is played such that while the VAE tries to trick the adversarial network into predicting that all data points are from the labeled pool, the adversarial network learns how to discriminate between dissimilarities in the latent space. We extensively evaluate our method on various image classification and semantic segmentation benchmark datasets and establish a new state of the art on CIFAR10/100, Caltech-256, ImageNet, Cityscapes, and BDD100K. Our results demonstrate that our adversarial approach learns an effective low dimensional latent space in large-scale settings and provides for a computationally efficient sampling method. Our code is available at https://github.com/sinhasam/vaal.
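Once training ends, sampling reduces to scoring unlabeled points by the discriminator. A sketch of that selection step, assuming hypothetical vae.encode and discriminator interfaces rather than the released code:

import torch

@torch.no_grad()
def vaal_select(vae, discriminator, unlabeled_batches, budget):
    """Query the unlabeled points the discriminator is least convinced
    belong to the labeled pool (sketch; interfaces are assumptions)."""
    scores = []
    for x in unlabeled_batches:
        mu, _ = vae.encode(x)                        # latent mean per input
        scores.append(discriminator(mu).squeeze(1))  # P(point is labeled)
    scores = torch.cat(scores)
    return torch.topk(-scores, budget).indices       # least labeled-looking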

254 citations


Posted Content
TL;DR: BatchBALD as discussed by the authors is a tractable approximation to the mutual information between a batch of points and model parameters, which is used as an acquisition function to select multiple informative points jointly for the task of deep Bayesian active learning.
Abstract: We develop BatchBALD, a tractable approximation to the mutual information between a batch of points and model parameters, which we use as an acquisition function to select multiple informative points jointly for the task of deep Bayesian active learning. BatchBALD is a greedy linear-time $1 - \frac{1}{e}$-approximate algorithm amenable to dynamic programming and efficient caching. We compare BatchBALD to the commonly used approach for batch data acquisition and find that the current approach acquires similar and redundant points, sometimes performing worse than randomly acquiring data. We finish by showing that, using BatchBALD to consider dependencies within an acquisition batch, we achieve new state of the art performance on standard benchmarks, providing substantial data efficiency improvements in batch acquisition.
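The greedy selection can be illustrated with an exact, and therefore exponential-in-batch-size, toy version of the joint mutual information. The paper's linear-time algorithm replaces the exact joint with Monte Carlo estimates and caching, so treat this purely as a small-scale illustration.

import numpy as np

def batchbald_greedy(probs, batch_size):
    """Exact small-batch BatchBALD illustration. probs: (K, N, C) class
    probabilities from K posterior samples (e.g. MC dropout) for N points.
    Greedily adds the point maximizing the joint mutual information between
    the batch's labels and the model parameters."""
    K, N, C = probs.shape
    # Expected conditional entropy E_w[H(y_i | w)] for every point.
    cond_ent = -(probs * np.log(probs + 1e-12)).sum(-1).mean(0)
    chosen, joint = [], np.ones((K, 1))   # running prod_i p_w(y_i), per sample w
    for _ in range(batch_size):
        best_i, best_score, best_joint = None, -np.inf, None
        for i in range(N):
            if i in chosen:
                continue
            cand = (joint[:, :, None] * probs[:, i, None, :]).reshape(K, -1)
            marginal = cand.mean(0)       # P(y_1..y_b), marginalized over w
            joint_ent = -(marginal * np.log(marginal + 1e-12)).sum()
            score = joint_ent - cond_ent[chosen + [i]].sum()
            if score > best_score:
                best_i, best_score, best_joint = i, score, cand
        chosen.append(best_i)
        joint = best_joint
    return chosen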

238 citations


Posted Content
TL;DR: A pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner that learns an effective low dimensional latent space in large-scale settings and provides for a computationally efficient sampling method.
Abstract: Active learning aims to develop label-efficient algorithms by sampling the most representative queries to be labeled by an oracle. We describe a pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner. Unlike conventional active learning algorithms, our approach is task agnostic, i.e., it does not depend on the performance of the task for which we are trying to acquire labeled data. Our method learns a latent space using a variational autoencoder (VAE) and an adversarial network trained to discriminate between unlabeled and labeled data. The mini-max game between the VAE and the adversarial network is played such that while the VAE tries to trick the adversarial network into predicting that all data points are from the labeled pool, the adversarial network learns how to discriminate between dissimilarities in the latent space. We extensively evaluate our method on various image classification and semantic segmentation benchmark datasets and establish a new state of the art on CIFAR10/100, Caltech-256, ImageNet, Cityscapes, and BDD100K. Our results demonstrate that our adversarial approach learns an effective low dimensional latent space in large-scale settings and provides for a computationally efficient sampling method. Our code is available at this https URL.

194 citations


Journal ArticleDOI
TL;DR: The results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting.
Abstract: For many applications the collection of labeled data is expensive and laborious. Exploitation of unlabeled data during training is thus a long-pursued objective of machine learning. Self-supervised learning addresses this by positing an auxiliary task (different, but related to the supervised task) for which data is abundantly available. In this paper, we show how ranking can be used as a proxy task for some regression problems. As another contribution, we propose an efficient backpropagation technique for Siamese networks which prevents the redundant computation introduced by the multi-branch network architecture. We apply our framework to two regression problems: Image Quality Assessment (IQA) and Crowd Counting. For both we show how to automatically generate ranked image sets from unlabeled data. Our results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting. In addition, we show that measuring network uncertainty on the self-supervised proxy task is a good measure of informativeness of unlabeled data. This can be used to drive an algorithm for active learning and we show that this reduces labeling effort by up to 50 percent.

161 citations


Proceedings ArticleDOI
01 Oct 2019
TL;DR: This paper proposes a novel noisy label detection approach, named O2U-net, for deep neural networks without human annotations, which only requires adjusting the hyper-parameters of the deep network to make its status transfer from overfitting to underfitting (O2U) cyclically.
Abstract: This paper proposes a novel noisy label detection approach, named O2U-net, for deep neural networks without human annotations. Different from prior work, which requires specifically designed noise-robust loss functions or networks, O2U-net is easy to implement yet effective. It only requires adjusting the hyper-parameters of the deep network to make its status transfer from overfitting to underfitting (O2U) cyclically. The losses of each sample are recorded during iterations. The higher the normalized average loss of a sample, the higher the probability of it being a noisy label. O2U-net is naturally compatible with active learning and other human annotation approaches, which introduces extra flexibility for learning with noisy labels. We conduct extensive experiments on multiple datasets in various settings. The experimental results demonstrate the state-of-the-art performance of O2U-net.
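The bookkeeping behind that ranking fits in a few lines. A hedged sketch with one plausible cyclic learning rate schedule; the specific schedule and normalization below are our assumptions based on the abstract, not the paper's exact settings.

import numpy as np

def cyclic_lr(step, r1=0.01, r2=0.001, cycle_len=10):
    """Learning rate that linearly decays within each cycle, repeatedly
    pushing the network from underfitting toward overfitting and back."""
    t = (step % cycle_len) / cycle_len
    return r1 + (r2 - r1) * t

def rank_noisy_labels(losses_per_epoch):
    """losses_per_epoch: (epochs, n_samples) per-sample losses recorded
    across the cycles. Returns indices sorted from most to least likely
    noisy: a high normalized average loss suggests a noisy label."""
    normalized = losses_per_epoch - losses_per_epoch.mean(axis=1, keepdims=True)
    return np.argsort(-normalized.mean(axis=0))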

Proceedings Article
19 Jun 2019
TL;DR: BatchBALD is a tractable approximation to the mutual information between a batch of points and model parameters, which is used as an acquisition function to select multiple informative points jointly for the task of deep Bayesian active learning.
Abstract: We develop BatchBALD, a tractable approximation to the mutual information between a batch of points and model parameters, which we use as an acquisition function to select multiple informative points jointly for the task of deep Bayesian active learning. BatchBALD is a greedy linear-time $1 - \frac{1}{e}$-approximate algorithm amenable to dynamic programming and efficient caching. We compare BatchBALD to the commonly used approach for batch data acquisition and find that the current approach acquires similar and redundant points, sometimes performing worse than randomly acquiring data. We finish by showing that, using BatchBALD to consider dependencies within an acquisition batch, we achieve new state of the art performance on standard benchmarks, providing substantial data efficiency improvements in batch acquisition.

Journal ArticleDOI
TL;DR: In this paper, the authors introduce the distance to available data in the latent space of a neural network ML model as a low-cost, quantitative uncertainty metric that works for both inorganic and organic chemistry.
Abstract: Machine learning (ML) models, such as artificial neural networks, have emerged as a complement to high-throughput screening, enabling characterization of new compounds in seconds instead of hours. The promise of ML models to enable large-scale chemical space exploration can only be realized if it is straightforward to identify when molecules and materials are outside the model's domain of applicability. Established uncertainty metrics for neural network models are either costly to obtain (e.g., ensemble models) or rely on feature engineering (e.g., feature space distances), and each has limitations in estimating prediction errors for chemical space exploration. We introduce the distance to available data in the latent space of a neural network ML model as a low-cost, quantitative uncertainty metric that works for both inorganic and organic chemistry. The calibrated performance of this approach exceeds widely used uncertainty metrics and is readily applied to models of increasing complexity at no additional cost. Tightening latent distance cutoffs systematically drives down predicted model errors below training errors, thus enabling predictive error control in chemical discovery or identification of useful data points for active learning.
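The metric itself is inexpensive to compute once latent representations are in hand. A minimal sketch, assuming Euclidean distance and an average over the k nearest training points (both our choices):

import numpy as np

def latent_distance(latent_train, latent_query, k=1):
    """Average Euclidean distance from each query point to its k nearest
    training points in the model's latent space."""
    d = np.linalg.norm(latent_query[:, None, :] - latent_train[None, :, :],
                       axis=-1)                       # (n_query, n_train)
    return np.sort(d, axis=1)[:, :k].mean(axis=1)

# Queries beyond a calibrated cutoff are flagged as outside the domain of
# applicability, or queued for labeling in an active learning loop:
# uncertain = latent_distance(Z_train, Z_query) > cutoff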

Proceedings ArticleDOI
27 Jan 2019
TL;DR: This work proposes the novel framework of explanatory interactive learning where, in each step, the learner explains its query to the user, and the user interacts by both answering the query and correcting the explanation.
Abstract: Although interactive learning puts the user into the loop, the learner remains mostly a black box for the user. Understanding the reasons behind predictions and queries is important when assessing how the learner works and, in turn, trust. Consequently, we propose the novel framework of explanatory interactive learning where, in each step, the learner explains its query to the user, and the user interacts by both answering the query and correcting the explanation. We demonstrate that this can boost the predictive and explanatory powers of, and the trust into, the learned model, using text (e.g. SVMs) and image classification (e.g. neural networks) experiments as well as a user study.

Posted Content
TL;DR: In this article, an ensemble of dynamics models is used to incentivize the agent to explore such that the disagreement of those ensembles is maximized, which results in a sample-efficient exploration.
Abstract: Efficient exploration is a long-standing problem in sensorimotor learning. Major advances have been demonstrated in noise-free, non-stochastic domains such as video games and simulation. However, most of these formulations either get stuck in environments with stochastic dynamics or are too inefficient to be scalable to real robotics setups. In this paper, we propose a formulation for exploration inspired by the work in active learning literature. Specifically, we train an ensemble of dynamics models and incentivize the agent to explore such that the disagreement of those ensembles is maximized. This allows the agent to learn skills by exploring in a self-supervised manner without any external reward. Notably, we further leverage the disagreement objective to optimize the agent's policy in a differentiable manner, without using reinforcement learning, which results in a sample-efficient exploration. We demonstrate the efficacy of this formulation across a variety of benchmark environments including stochastic-Atari, Mujoco and Unity. Finally, we implement our differentiable exploration on a real robot which learns to interact with objects completely from scratch. Project videos and code are at this https URL
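The disagreement objective itself is compact. A sketch of the intrinsic reward and its differentiable use, with hypothetical ensemble and policy objects standing in for the paper's implementation:

import torch

def disagreement_reward(next_state_preds):
    """Intrinsic reward: per-sample variance across an ensemble's
    next-state predictions, shaped (n_models, batch, state_dim)."""
    return next_state_preds.var(dim=0).mean(dim=-1)   # (batch,)

# Differentiable policy optimization against the reward (sketch):
# action = policy(state)                              # reparameterized action
# preds = torch.stack([m(state, action) for m in ensemble])
# (-disagreement_reward(preds).mean()).backward()     # ascend disagreement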

Journal ArticleDOI
TL;DR: The results demonstrate that the proposed model can prevent overfitting effectively and aggregate the labeled samples to train the parameters of the deep computation model with crowdsourcing for industrial IoT big data feature learning.
Abstract: Deep computation, as an advanced machine learning model, has achieved the state-of-the-art performance for feature learning on big data in industrial Internet of Things (IoT). However, the current deep computation model usually suffers from overfitting due to the lack of publicly available labeled training samples, limiting its performance for big data feature learning. Motivated by the idea of active learning, an adaptive dropout deep computation model (ADDCM) with crowdsourcing to cloud is proposed for industrial IoT big data feature learning in this paper. First, a distribution function is designed to set the dropout rate for each hidden layer to prevent overfitting for the deep computation model. Furthermore, the outsourcing selection algorithm based on the maximum entropy is employed to choose appropriate samples from the training set to crowdsource on the cloud platform. Finally, an improved supervised learning from multiple experts scheme is presented to aggregate answers given by human workers and to update the parameters of the ADDCM simultaneously. Extensive experiments are conducted to evaluate the performance of the presented model by comparing with the dropout deep computation model and other state-of-the-art crowdsourcing algorithms. The results demonstrate that the proposed model can prevent overfitting effectively and aggregate the labeled samples to train the parameters of the deep computation model with crowdsourcing for industrial IoT big data feature learning.

Journal ArticleDOI
27 Jun 2019
TL;DR: It is demonstrated that it is possible to significantly reduce human labeling effort without compromising final model performance by using a semitrained CNN model (i.e., trained with limited labeled data) to perform synthetic annotation.
Abstract: The yield of cereal crops such as sorghum (Sorghum bicolor L. Moench) depends on the distribution of crop-heads in varying branching arrangements. Therefore, counting the head number per unit area is critical for plant breeders to correlate with the genotypic variation in a specific breeding field. However, measuring such phenotypic traits manually is an extremely labor-intensive process and suffers from low efficiency and human errors. Moreover, the process is almost infeasible for large-scale breeding plantations or experiments. Machine learning-based approaches like deep convolutional neural network (CNN) based object detectors are promising tools for efficient object detection and counting. However, a significant limitation of such deep learning-based approaches is that they typically require a massive amount of hand-labeled images for training, which is still a tedious process. Here, we propose an active learning inspired weakly supervised deep learning framework for sorghum head detection and counting from UAV-based images. We demonstrate that it is possible to significantly reduce human labeling effort without compromising final model performance (the agreement between human count and machine count is 0.88) by using a semitrained CNN model (i.e., trained with limited labeled data) to perform synthetic annotation. In addition, we also visualize key features that the network learns. This improves trustworthiness by enabling users to better understand and trust the decisions that the trained deep learning model makes.

Proceedings Article
24 May 2019
TL;DR: This paper proposes a formulation for exploration inspired by the work in active learning literature and trains an ensemble of dynamics models and incentivizes the agent to explore such that the disagreement of those ensembles is maximized, which results in a sample-efficient exploration.
Abstract: Efficient exploration is a long-standing problem in sensorimotor learning. Major advances have been demonstrated in noise-free, non-stochastic domains such as video games and simulation. However, most of these formulations either get stuck in environments with stochastic dynamics or are too inefficient to be scalable to real robotics setups. In this paper, we propose a formulation for exploration inspired by the work in active learning literature. Specifically, we train an ensemble of dynamics models and incentivize the agent to explore such that the disagreement of those ensembles is maximized. This allows the agent to learn skills by exploring in a self-supervised manner without any external reward. Notably, we further leverage the disagreement objective to optimize the agent's policy in a differentiable manner, without using reinforcement learning, which results in a sample-efficient exploration. We demonstrate the efficacy of this formulation across a variety of benchmark environments including stochastic-Atari, Mujoco and Unity. Finally, we implement our differentiable exploration on a real robot which learns to interact with objects completely from scratch. Project videos and code are at this https URL

Journal ArticleDOI
TL;DR: In this article, an active learning strategy for robotic systems that takes into account task information, enables fast learning, and allows control to be readily synthesized by taking advantage of the Koopman operator representation is presented.
Abstract: This paper presents an active learning strategy for robotic systems that takes into account task information, enables fast learning, and allows control to be readily synthesized by taking advantage of the Koopman operator representation. We first motivate the use of representing nonlinear systems as linear Koopman operator systems by illustrating the improved model-based control performance with an actuated Van der Pol system. Information-theoretic methods are then applied to the Koopman operator formulation of dynamical systems where we derive a controller for active learning of robot dynamics. The active learning controller is shown to increase the rate of information about the Koopman operator. In addition, our active learning controller can readily incorporate policies built on the Koopman dynamics, enabling the benefits of fast active learning and improved control. Results using a quadcopter illustrate single-execution active learning and stabilization capabilities during free fall. The results for active learning are extended for automating Koopman observables and we implement our method on real robotic systems.

Journal ArticleDOI
TL;DR: This work proposes a failure-pursuing sampling framework, which is able to adopt various surrogate models or active learning strategies, and takes into account the joint probability density function of random variables, the individual information at candidate points and the improvement of the accuracy of predicted failure probability.

Posted Content
TL;DR: Experimental results show the proposed batch mode active learning algorithm, Discriminative Active Learning, to be on par with state of the art methods in medium and large query batch sizes, while being simple to implement and also extend to other domains besides classification tasks.
Abstract: We propose a new batch mode active learning algorithm designed for neural networks and large query batch sizes. The method, Discriminative Active Learning (DAL), poses active learning as a binary classification task, attempting to choose examples to label in such a way as to make the labeled set and the unlabeled pool indistinguishable. Experimenting on image classification tasks, we empirically show our method to be on par with state of the art methods in medium and large query batch sizes, while being simple to implement and also extend to other domains besides classification tasks. Our experiments also show that none of the state of the art methods of today are clearly better than uncertainty sampling when the batch size is relatively large, negating some of the reported results in the recent literature.
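A compact sketch of the idea, framing selection as labeled-vs-unlabeled classification over fixed feature representations; the classifier choice and single-shot (rather than iterative) selection are our simplifications.

import numpy as np
from sklearn.neural_network import MLPClassifier

def discriminative_active_learning(feats_labeled, feats_unlabeled, budget):
    """DAL-style sketch: train a binary classifier to separate labeled from
    unlabeled representations, then query the unlabeled points it is most
    confident are 'unlabeled-like' (i.e., least like the labeled set)."""
    X = np.vstack([feats_labeled, feats_unlabeled])
    y = np.concatenate([np.zeros(len(feats_labeled)),
                        np.ones(len(feats_unlabeled))])
    clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300).fit(X, y)
    p_unlabeled = clf.predict_proba(feats_unlabeled)[:, 1]
    return np.argsort(-p_unlabeled)[:budget]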

Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed a unified deep network, combined with active transfer learning (TL) that can be well-trained for hyperspectral images classification using only minimally labeled training data.
Abstract: Deep learning has recently attracted significant attention in the field of hyperspectral images (HSIs) classification. However, the construction of an efficient deep neural network mostly relies on a large number of labeled samples being available. To address this problem, this paper proposes a unified deep network, combined with active transfer learning (TL) that can be well-trained for HSIs classification using only minimally labeled training data. More specifically, deep joint spectral–spatial feature is first extracted through hierarchical stacked sparse autoencoder (SSAE) networks. Active TL is then exploited to transfer the pretrained SSAE network and the limited training samples from the source domain to the target domain, where the SSAE network is subsequently fine-tuned using the limited labeled samples selected from both source and target domains by the corresponding active learning (AL) strategies. The advantages of our proposed method are threefold: 1) the network can be effectively trained using only limited labeled samples with the help of novel AL strategies; 2) the network is flexible and scalable enough to function across various transfer situations, including cross data set and intraimage; and 3) the learned deep joint spectral–spatial feature representation is more generic and robust than many joint spectral–spatial feature representations. Extensive comparative evaluations demonstrate that our proposed method significantly outperforms many state-of-the-art approaches, including both traditional and deep network-based methods, on three popular data sets.

Journal ArticleDOI
TL;DR: In this paper, Bayesian semi-supervised graph convolutional neural networks are used to estimate uncertainty in a statistically principled way through sampling from the posterior distribution, which disentangles representation learning and regression, keeping uncertainty estimates accurate in the low data limit.
Abstract: Predicting bioactivity and physical properties of small molecules is a central challenge in drug discovery. Deep learning is becoming the method of choice but studies to date focus on mean accuracy as the main metric. However, to replace costly and mission-critical experiments by models, a high mean accuracy is not enough: outliers can derail a discovery campaign, thus models need to reliably predict when it will fail, even when the training data is biased; experiments are expensive, thus models need to be data-efficient and suggest informative training sets using active learning. We show that uncertainty quantification and active learning can be achieved by Bayesian semi-supervised graph convolutional neural networks. The Bayesian approach estimates uncertainty in a statistically principled way through sampling from the posterior distribution. Semi-supervised learning disentangles representation learning and regression, keeping uncertainty estimates accurate in the low data limit and allowing the model to start active learning from a small initial pool of training data. Our study highlights the promise of Bayesian deep learning for chemistry.

Posted Content
TL;DR: This work shows that the computational efficiency of data selection in deep learning can be significantly improved by using a much smaller proxy model to perform data selection for tasks that will eventually require a large target model (e.g., selecting data points to label for active learning).
Abstract: Data selection methods, such as active learning and core-set selection, are useful tools for machine learning on large datasets. However, they can be prohibitively expensive to apply in deep learning because they depend on feature representations that need to be learned. In this work, we show that we can greatly improve the computational efficiency by using a small proxy model to perform data selection (e.g., selecting data points to label for active learning). By removing hidden layers from the target model, using smaller architectures, and training for fewer epochs, we create proxies that are an order of magnitude faster to train. Although these small proxy models have higher error rates, we find that they empirically provide useful signals for data selection. We evaluate this "selection via proxy" (SVP) approach on several data selection tasks across five datasets: CIFAR10, CIFAR100, ImageNet, Amazon Review Polarity, and Amazon Review Full. For active learning, applying SVP can give an order of magnitude improvement in data selection runtime (i.e., the time it takes to repeatedly train and select points) without significantly increasing the final error (often within 0.1%). For core-set selection on CIFAR10, proxies that are over 10x faster to train than their larger, more accurate targets can remove up to 50% of the data without harming the final accuracy of the target, leading to a 1.6x end-to-end training time improvement.
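The selection step is deliberately simple once a cheap proxy exists. A minimal sketch with predictive entropy as the uncertainty score; the proxy_predict_proba callable is hypothetical.

import numpy as np

def predictive_entropy(probs):
    """Uncertainty score from the proxy's class probabilities."""
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

def select_via_proxy(proxy_predict_proba, unlabeled_X, budget):
    """Rank unlabeled points by the cheap proxy's uncertainty and return
    the indices to send for labeling; the large target model never runs
    during selection, which is the source of the speedup."""
    scores = predictive_entropy(proxy_predict_proba(unlabeled_X))
    return np.argsort(-scores)[:budget]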

Journal ArticleDOI
TL;DR: In this article, various machine learning (ML) implementations and plans to make the maximal use of the large data set by taking advantage of the temporal nature of the data, and further combining it with other data sets.
Abstract: The Zwicky Transient Facility is a large optical survey in multiple filters producing hundreds of thousands of transient alerts per night. We describe here various machine learning (ML) implementations and plans to make the maximal use of the large data set by taking advantage of the temporal nature of the data, and further combining it with other data sets. We start with the initial steps of separating bogus candidates from real ones, separating stars and galaxies, and go on to the classification of real objects into various classes. Besides the usual methods (e.g., based on features extracted from light curves) we also describe early plans for alternate methods including the use of domain adaptation, and deep learning. In a similar fashion we describe efforts to detect fast moving asteroids. We also describe the use of the Zooniverse platform for helping with classifications through the creation of training samples, and active learning. Finally we mention the synergistic aspects of ZTF and LSST from the ML perspective.

Posted Content
Donggeun Yoo, In So Kweon
TL;DR: A novel active learning method that is simple but task-agnostic and works efficiently with deep networks, obtained by attaching a small parametric module, named "loss prediction module," to a target network and learning it to predict the target losses of unlabeled inputs.
Abstract: The performance of deep neural networks improves with more annotated data. The problem is that the budget for annotation is limited. One solution to this is active learning, where a model asks a human to annotate data that it perceives as uncertain. A variety of recent methods have been proposed to apply active learning to deep networks, but most of them are either designed specifically for their target tasks or computationally inefficient for large networks. In this paper, we propose a novel active learning method that is simple but task-agnostic and works efficiently with deep networks. We attach a small parametric module, named "loss prediction module," to a target network and learn it to predict the target losses of unlabeled inputs. This module can then suggest data on which the target model is likely to produce a wrong prediction. The method is task-agnostic because networks are learned from a single loss regardless of the target task. We rigorously validate our method on image classification, object detection, and human pose estimation with recent network architectures. The results demonstrate that our method consistently outperforms previous methods across these tasks.

Proceedings Article
01 Jan 2019
TL;DR: A novel Bayesian batch active learning approach that mitigates standard greedy procedures for large-scale regression and classification tasks and derive interpretable closed-form solutions akin to existing active learning procedures for linear models, and generalize to arbitrary models using random projections.
Abstract: Leveraging the wealth of unlabeled data produced in recent years provides great potential for improving supervised models. When the cost of acquiring labels is high, probabilistic active learning methods can be used to greedily select the most informative data points to be labeled. However, for many large-scale problems standard greedy procedures become computationally infeasible and suffer from negligible model change. In this paper, we introduce a novel Bayesian batch active learning approach that mitigates these issues. Our approach is motivated by approximating the complete data posterior of the model parameters. While naive batch construction methods result in correlated queries, our algorithm produces diverse batches that enable efficient active learning at scale. We derive interpretable closed-form solutions akin to existing active learning procedures for linear models, and generalize to arbitrary models using random projections. We demonstrate the benefits of our approach on several large-scale regression and classification tasks.

Posted Content
TL;DR: This work studies the problem of reducing the amount of labeled training data required to train supervised classification models by leveraging Active Learning, through sequential selection of examples which benefit the model most, and considers the mini-batch Active Learning setting, where several examples are selected at once.
Abstract: We study the problem of reducing the amount of labeled training data required to train supervised classification models. We approach it by leveraging Active Learning, through sequential selection of examples which benefit the model most. Selecting examples one by one is not practical for the amount of training examples required by the modern Deep Learning models. We consider the mini-batch Active Learning setting, where several examples are selected at once. We present an approach which takes into account both informativeness of the examples for the model, as well as the diversity of the examples in a mini-batch. By using the well studied K-means clustering algorithm, this approach scales better than the previously proposed approaches, and achieves comparable or better performance.
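A sketch of the weighted-clustering variant of this idea: prefilter by informativeness, cluster with K-means using informativeness as sample weights, and query the point nearest each centroid. The prefiltering factor and distance details are our assumptions.

import numpy as np
from sklearn.cluster import KMeans

def diverse_minibatch(feats, informativeness, k, beta=10):
    """Select a mini-batch of k points balancing informativeness and
    diversity via informativeness-weighted K-means clustering."""
    top = np.argsort(-informativeness)[: beta * k]   # most informative pool
    km = KMeans(n_clusters=k, n_init=10).fit(
        feats[top], sample_weight=informativeness[top])
    dists = np.linalg.norm(feats[top][:, None, :] -
                           km.cluster_centers_[None, :, :], axis=-1)
    chosen = []
    for j in range(k):                               # nearest unchosen point
        for idx in np.argsort(dists[:, j]):
            if idx not in chosen:
                chosen.append(int(idx))
                break
    return top[chosen]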

Journal ArticleDOI
TL;DR: A novel importance learning method (ILM) is proposed on the basis of an active learning technique using a Kriging metamodel, which builds the Kriging model accurately and efficiently by considering the influence of the most concerned point.
Abstract: With the time-consuming computations incurred by the nested double-loop strategy and multiple performance functions, the enhancement of computational efficiency for non-probabilistic reliability estimation and optimization is a challenging problem in the assessment of structural safety. In this study, a novel importance learning method (ILM) is proposed on the basis of an active learning technique using a Kriging metamodel, which builds the Kriging model accurately and efficiently by considering the influence of the most concerned point. To further accelerate the convergence rate of non-probabilistic reliability analysis, a new stopping criterion is constructed to ensure accuracy of the Kriging model. For solving non-probabilistic reliability-based design optimization (NRBDO) problems with multiple non-probabilistic constraints, a new active learning function is further developed based upon the ILM for dealing with this problem efficiently. The proposed ILM is verified by two non-probabilistic reliability estimation examples and three NRBDO examples. Compared with existing active learning methods, the optimal results calculated by the proposed ILM show high performance in terms of efficiency and accuracy.
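For context, a standard acquisition step in this family of Kriging-based reliability methods is the classic U learning function, which is not the paper's ILM (the ILM additionally weights the most concerned point); a minimal sketch:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def next_training_point(gp: GaussianProcessRegressor, candidates):
    """Pick the candidate whose sign of the performance function is most
    uncertain: minimal U(x) = |mu(x)| / sigma(x) under the Kriging model."""
    mu, sigma = gp.predict(candidates, return_std=True)
    u = np.abs(mu) / np.maximum(sigma, 1e-12)
    return int(np.argmin(u))   # index of the next point to evaluate exactly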

Posted Content
Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, Lucian Popa
TL;DR: This paper develops a deep learning-based method that targets low-resource settings for ER through a novel combination of transfer learning and active learning and designs an architecture that allows us to learn a transferable model from a high-resource setting to a low- resource one.
Abstract: Entity resolution (ER) is the task of identifying different representations of the same real-world entities across databases. It is a key step for knowledge base creation and text mining. Recent adaptation of deep learning methods for ER mitigates the need for dataset-specific feature engineering by constructing distributed representations of entity records. While these methods achieve state-of-the-art performance over benchmark data, they require large amounts of labeled data, which are typically unavailable in realistic ER applications. In this paper, we develop a deep learning-based method that targets low-resource settings for ER through a novel combination of transfer learning and active learning. We design an architecture that allows us to learn a transferable model from a high-resource setting to a low-resource one. To further adapt to the target dataset, we incorporate active learning that carefully selects a few informative examples to fine-tune the transferred model. Empirical evaluation demonstrates that our method achieves comparable, if not better, performance compared to state-of-the-art learning-based methods while using an order of magnitude fewer labels.