
Showing papers on "Active learning (machine learning)" published in 2020


Posted Content
TL;DR: A formal classification method for the existing work in deep active learning is provided, along with a comprehensive and systematic overview, to investigate whether AL can be used to reduce the cost of sample annotation while retaining the powerful learning capabilities of DL.
Abstract: Active learning (AL) attempts to maximize a model's performance gain while labeling as few samples as possible. Deep learning (DL) is greedy for data and requires a large supply of it to optimize its massive number of parameters and learn to extract high-quality features. In recent years, the rapid development of internet technology has placed us in an era of information torrents with massive amounts of data, so DL has aroused strong interest among researchers and developed rapidly. Compared with DL, researchers have shown relatively little interest in AL, mainly because, before the rise of DL, traditional machine learning required relatively few labeled samples; early AL therefore struggled to demonstrate the value it deserves. Although DL has made breakthroughs in various fields, most of this success relies on the availability of large annotated datasets, and acquiring large numbers of high-quality annotated datasets consumes a great deal of manpower. This is not feasible in fields that require high expertise, such as speech recognition, information extraction, and medical imaging. AL has therefore gradually received the attention it is due. A natural idea is to use AL to reduce the cost of sample annotation while retaining the powerful learning capabilities of DL; from this idea, deep active learning (DAL) has emerged. Although the related research is quite abundant, a comprehensive survey of DAL has been lacking. This article fills that gap: we provide a formal classification of the existing work along with a comprehensive and systematic overview. In addition, we analyze and summarize the development of DAL from the perspective of applications. Finally, we discuss open problems in DAL and suggest possible directions for its future development.

372 citations


Journal ArticleDOI
TL;DR: To detect COVID-19, AI-driven tools are expected to have active learning-based cross-population train/test models that employ multitudinal and multimodal data, which is the primary purpose of the paper.
Abstract: The novel coronavirus (COVID-19) outbreak, which was identified in late 2019, requires special attention because of its future epidemics and possible global threats. Besides clinical procedures and treatments, Artificial Intelligence (AI) promises a new paradigm for healthcare, and several AI tools built upon Machine Learning (ML) algorithms are employed for analyzing data and supporting decision-making processes. This means that AI-driven tools can help identify COVID-19 outbreaks and forecast the nature of their spread across the globe. However, unlike other healthcare issues, to detect COVID-19, AI-driven tools are expected to have active learning-based cross-population train/test models that employ multitudinal and multimodal data, which is the primary purpose of this paper.

265 citations


Journal ArticleDOI
TL;DR: A physics-informed neural network for cardiac activation mapping that accounts for the underlying wave propagation dynamics and quantifies the epistemic uncertainty associated with these predictions to open the door toward physics-based electro-anatomic mapping.
Abstract: A critical procedure in diagnosing atrial fibrillation is the creation of electro-anatomic activation maps. Current methods generate these mappings from interpolation using a few sparse data points recorded inside the atria; they neither include prior knowledge of the underlying physics nor uncertainty of these recordings. Here we propose a physics-informed neural network for cardiac activation mapping that accounts for the underlying wave propagation dynamics and we quantify the epistemic uncertainty associated with these predictions. These uncertainty estimates not only allow us to quantify the predictive error of the neural network, but also help to reduce it by judiciously selecting new informative measurement locations via active learning. We illustrate the potential of our approach using a synthetic benchmark problem and a personalized electrophysiology model of the left atrium. We show that our new method outperforms linear interpolation and Gaussian process regression for the benchmark problem and linear interpolation at clinical densities for the left atrium. In both cases, the active learning algorithm achieves lower error levels than random allocation. Our findings open the door towards physics-based electro-anatomic mapping with the ultimate goals to reduce procedural time and improve diagnostic predictability for patients affected by atrial fibrillation. Open source code is available at https://github.com/fsahli/EikonalNet.
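The selection step described above lends itself to a simple ensemble-based illustration. The sketch below is not the authors' EikonalNet implementation (their open-source code is linked above); it only shows the generic idea of picking the next measurement site where an ensemble of surrogate regressors disagrees most, with a bootstrap ensemble standing in for the paper's physics-informed uncertainty estimate.

import numpy as np
from sklearn.neural_network import MLPRegressor

def select_next_measurement(X_measured, t_measured, X_candidates, n_models=10, seed=0):
    """Pick the candidate location where ensemble activation-time predictions disagree most."""
    rng = np.random.default_rng(seed)
    preds = []
    for k in range(n_models):
        # Bootstrap the sparse recordings to obtain a crude epistemic-uncertainty proxy.
        idx = rng.integers(0, len(X_measured), len(X_measured))
        reg = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=k)
        reg.fit(X_measured[idx], t_measured[idx])
        preds.append(reg.predict(X_candidates))
    variance = np.stack(preds).var(axis=0)      # spread across ensemble members
    return X_candidates[np.argmax(variance)], variance

In the paper the surrogate additionally encodes the eikonal wave-propagation physics; that term is omitted here for brevity.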

218 citations


Journal ArticleDOI
TL;DR: This article presents an active deep learning approach for HSI classification, which integrates both active learning and deep learning into a unified framework and achieves better performance on three benchmark HSI data sets with significantly fewer labeled samples.
Abstract: Deep neural networks have been extensively applied to hyperspectral image (HSI) classification recently. However, their success is greatly attributed to numerous labeled samples, whose acquisition costs a large amount of time and money. In order to improve the classification performance while reducing the labeling cost, this article presents an active deep learning approach for HSI classification, which integrates both active learning and deep learning into a unified framework. First, we train a convolutional neural network (CNN) with a limited number of labeled pixels. Next, we actively select the most informative pixels from the candidate pool for labeling. Then, the CNN is fine-tuned with the new training set constructed by incorporating the newly labeled pixels. These two steps are conducted iteratively. Finally, a Markov random field (MRF) is utilized to enforce class label smoothness to further boost the classification performance. Compared with other state-of-the-art traditional and deep learning-based HSI classification methods, our proposed approach achieves better performance on three benchmark HSI data sets with significantly fewer labeled samples.
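As a rough illustration of the iterative procedure described above (train, select, label, fine-tune), the sketch below uses a breaking-ties margin criterion to pick the most ambiguous pixels. The callables train_cnn, fine_tune, predict_proba, and oracle_label are hypothetical placeholders for the user's own network and annotation routines; the paper's MRF smoothing step is omitted.

import numpy as np

def margin_scores(probs):
    """Breaking-ties margin: a small gap between the top two class probabilities marks an informative pixel."""
    part = np.sort(probs, axis=1)
    return part[:, -1] - part[:, -2]

def active_hsi_loop(labeled, pool_pixels, oracle_label, train_cnn, fine_tune,
                    predict_proba, rounds=10, batch_size=50):
    model = train_cnn(labeled)
    for _ in range(rounds):
        probs = predict_proba(model, pool_pixels)
        pick = np.argsort(margin_scores(probs))[:batch_size]       # smallest margins first
        labeled["pixels"] = np.concatenate([labeled["pixels"], pool_pixels[pick]])
        labeled["labels"] = np.concatenate([labeled["labels"], oracle_label(pool_pixels[pick])])
        pool_pixels = np.delete(pool_pixels, pick, axis=0)
        model = fine_tune(model, labeled)                           # fine-tune on the enlarged set
    return model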

203 citations


Journal ArticleDOI
TL;DR: In this paper, a genetic algorithm was used to select the ML model and materials descriptors from a huge number of alternatives, and its efficiency was demonstrated on two phase formation problems in high entropy alloys (HEAs).

188 citations


Journal ArticleDOI
18 Mar 2020
TL;DR: In this paper, an adaptive Bayesian inference method for automating the training of interpretable, low-dimensional, and multi-element interatomic force fields using structures drawn on the fly from molecular dynamics simulations is presented.
Abstract: Machine learned force fields typically require manual construction of training sets consisting of thousands of first principles calculations, which can result in low training efficiency and unpredictable errors when applied to structures not represented in the training set of the model. This severely limits the practical application of these models in systems with dynamics governed by important rare events, such as chemical reactions and diffusion. We present an adaptive Bayesian inference method for automating the training of interpretable, low-dimensional, and multi-element interatomic force fields using structures drawn on the fly from molecular dynamics simulations. Within an active learning framework, the internal uncertainty of a Gaussian process regression model is used to decide whether to accept the model prediction or to perform a first principles calculation to augment the training set of the model. The method is applied to a range of single- and multi-element systems and shown to achieve a favorable balance of accuracy and computational efficiency, while requiring a minimal amount of ab initio training data. We provide a fully open-source implementation of our method, as well as a procedure to map trained models to computationally efficient tabulated force fields.
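The decision rule at the heart of the method (accept the model's forces, or fall back to first principles and grow the training set) can be summarized in a few lines. This is only a hedged skeleton, not the authors' open-source package: gp, dft_calc, and md_step are placeholder callables, and sigma_tol stands for whatever uncertainty threshold the user chooses.

def on_the_fly_md(gp, dft_calc, md_step, structure, n_steps, sigma_tol):
    """Uncertainty-gated on-the-fly training loop (all callables are placeholders)."""
    for _ in range(n_steps):
        forces, stds = gp.predict_forces(structure)   # GP mean forces and per-atom uncertainties
        if stds.max() > sigma_tol:
            # Model is unsure: call first principles and augment the training set on the fly.
            forces = dft_calc(structure)
            gp.update(structure, forces)
        structure = md_step(structure, forces)
    return gp, structure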

183 citations


Journal ArticleDOI
TL;DR: The inverse design approach is compared with conventional gradient‐based topology optimization and gradient‐free genetic algorithms and the pros and cons of each method are discussed when applied to materials discovery and design problems.
Abstract: In recent years, machine learning (ML) techniques are seen to be promising tools to discover and design novel materials. However, the lack of robust inverse design approaches to identify promising candidate materials without exploring the entire design space causes a fundamental bottleneck. A general-purpose inverse design approach is presented using generative inverse design networks. This ML-based inverse design approach uses backpropagation to calculate the analytical gradients of an objective function with respect to design variables. This inverse design approach is capable of overcoming local minima traps by using backpropagation to provide rapid calculations of gradient information and running millions of optimizations with different initial values. Furthermore, an active learning strategy is adopted in the inverse design approach to improve the performance of candidate materials and reduce the amount of training data needed to do so. Compared to passive learning, the active learning strategy is capable of generating better designs and reducing the amount of training data by at least an order-of-magnitude in the case study on composite materials. The inverse design approach is compared with conventional gradient-based topology optimization and gradient-free genetic algorithms and the pros and cons of each method are discussed when applied to materials discovery and design problems.
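The active learning strategy described above can be pictured as a loop: refit the surrogate, push many candidate designs uphill by backpropagating the objective through the frozen network, then evaluate only the best candidates with the expensive solver and add them to the training data. The PyTorch sketch below is a hedged illustration under those assumptions; simulate is a hypothetical stand-in for the costly evaluation, the surrogate is assumed to map a design vector to a scalar performance, and designs are assumed to live in a unit box.

import torch
import torch.nn as nn

def gidn_active_round(surrogate, X_train, y_train, simulate,
                      n_candidates=1000, n_add=10, design_dim=16,
                      fit_steps=500, opt_steps=200):
    # 1) Refit the surrogate on the current training data.
    for p in surrogate.parameters():
        p.requires_grad_(True)
    opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
    for _ in range(fit_steps):
        opt.zero_grad()
        nn.functional.mse_loss(surrogate(X_train), y_train).backward()
        opt.step()

    # 2) Optimize random starting designs through the frozen surrogate via backpropagation.
    for p in surrogate.parameters():
        p.requires_grad_(False)
    designs = torch.rand(n_candidates, design_dim, requires_grad=True)
    d_opt = torch.optim.Adam([designs], lr=1e-2)
    for _ in range(opt_steps):
        d_opt.zero_grad()
        (-surrogate(designs)).sum().backward()        # ascend the predicted objective
        d_opt.step()
        with torch.no_grad():
            designs.clamp_(0.0, 1.0)                  # keep designs in the valid box

    # 3) Evaluate only the top predicted designs and grow the training set.
    with torch.no_grad():
        best = designs[surrogate(designs).squeeze().topk(n_add).indices]
    return torch.cat([X_train, best]), torch.cat([y_train, simulate(best)])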

160 citations


Journal ArticleDOI
TL;DR: An autonomous materials discovery methodology for functional inorganic compounds is demonstrated which allows scientists to fail smarter, learn faster, and spend fewer resources in their studies, while simultaneously improving trust in scientific results and machine learning tools.
Abstract: Active learning—the field of machine learning (ML) dedicated to optimal experiment design—has played a part in science as far back as the 18th century, when Laplace used it to guide his discovery of celestial mechanics. In this work, we focus a closed-loop, active learning-driven autonomous system on another major challenge, the discovery of advanced materials against the exceedingly complex synthesis-processes-structure-property landscape. We demonstrate an autonomous materials discovery methodology for functional inorganic compounds which allows scientists to fail smarter, learn faster, and spend fewer resources in their studies, while simultaneously improving trust in scientific results and machine learning tools. This robot science enables science-over-the-network, reducing the economic impact of scientists being physically separated from their labs. The real-time closed-loop, autonomous system for materials exploration and optimization (CAMEO) is implemented at the synchrotron beamline to accelerate the interconnected tasks of phase mapping and property optimization, with each cycle taking seconds to minutes. We also demonstrate an embodiment of human-machine interaction, where a human-in-the-loop is called on to play a contributing role within each cycle. This work has resulted in the discovery of a novel epitaxial nanocomposite phase-change memory material. Machine-learning-driven research holds great promise for accelerating materials discovery. Here the authors demonstrate CAMEO, which integrates active learning Bayesian optimization with practical experiment execution, for the discovery of new phase-change materials using X-ray diffraction experiments.

155 citations


Journal ArticleDOI
TL;DR: The data annotation bottleneck is identified as one of the key obstacles to machine learning approaches in clinical NLP, and future research in this field would benefit from alternatives such as data augmentation and transfer learning, or unsupervised learning, which do not require data annotation.
Abstract: Background: Clinical narratives represent the main form of communication within healthcare, providing a personalized account of patient history and assessments and offering rich information for clinical decision making. Natural language processing (NLP) has repeatedly demonstrated its feasibility to unlock evidence buried in clinical narratives. Machine learning can facilitate rapid development of NLP tools by leveraging large amounts of text data. Objective: The main aim of this study is to provide systematic evidence on the properties of text data used to train machine learning approaches to clinical NLP. We also investigate the types of NLP tasks that have been supported by machine learning and how they can be applied in clinical practice. Methods: Our methodology was based on the guidelines for performing systematic reviews. In August 2018, we used PubMed, a multi-faceted interface, to perform a literature search against MEDLINE. We identified a total of 110 relevant studies and extracted information about the text data used to support machine learning, the NLP tasks supported and their clinical applications. The data properties considered included their size, provenance, collection methods, annotation and any relevant statistics. Results: The vast majority of datasets used to train machine learning models included only hundreds or thousands of documents. Only 10 studies used tens of thousands of documents, with a handful of studies utilizing more. Relatively small datasets were utilized for training even when much larger datasets were available. The main reason for such poor data utilization is the annotation bottleneck faced by supervised machine learning algorithms. Active learning was explored to iteratively sample a subset of data for manual annotation as a strategy for minimizing the annotation effort while maximizing the predictive performance of the model. Supervised learning was successfully used where clinical codes integrated with free-text notes in electronic health records were utilized as class labels. Similarly, distant supervision was used to utilize an existing knowledge base to automatically annotate raw text. Where manual annotation was unavoidable, crowdsourcing was explored, but it remains unsuitable because of the sensitive nature of the data considered. Besides the small volume, training data were typically sourced from a small number of institutions, thus offering no hard evidence about the transferability of machine learning models. The vast majority of studies focused on the task of text classification. Most commonly, the classification results were used to support phenotyping, prognosis, care improvement, resource management and surveillance. Conclusions: We identified the data annotation bottleneck as one of the key obstacles to machine learning approaches in clinical NLP. Active learning and distant supervision were explored as ways of saving annotation effort. Future research in this field would benefit from alternatives such as data augmentation, transfer learning, or unsupervised learning, which do not require data annotation.

149 citations


Journal ArticleDOI
TL;DR: This mini-review highlights some recent efforts to connect the ML and nanoscience communities focusing on three types of interaction: (1) using ML to analyze and extract new information from large nanoscience data sets, (2) applying ML to accelerate materials discovery, including the use of active learning to guide experimental design, and (3) the nanoscience of memristive devices to realize hardware tailored for ML.
Abstract: Recent advances in machine learning (ML) offer new tools to extract new insights from large data sets and to acquire small data sets more effectively. Researchers in nanoscience are experimenting with these tools to tackle challenges in many fields. In addition to ML's advancement of nanoscience, nanoscience provides the foundation for neuromorphic computing hardware to expand the implementation of ML algorithms. In this Mini Review, we highlight some recent efforts to connect the ML and nanoscience communities by focusing on three types of interaction: (1) using ML to analyze and extract new insights from large nanoscience data sets, (2) applying ML to accelerate material discovery, including the use of active learning to guide experimental design, and (3) the nanoscience of memristive devices to realize hardware tailored for ML. We conclude with a discussion of challenges and opportunities for future interactions between nanoscience and ML researchers.

123 citations


Proceedings Article
30 Apr 2020
TL;DR: This work designs a new algorithm for batch active learning with deep neural network models that samples groups of points that are disparate and high-magnitude when represented in a hallucinated gradient space, and shows that while other approaches sometimes succeed for particular batch sizes or architectures, BADGE consistently performs as well or better, making it a versatile option for practical active learning problems.
Abstract: We design a new algorithm for batch active learning with deep neural network models. Our algorithm, Batch Active learning by Diverse Gradient Embeddings (BADGE), samples groups of points that are disparate and high-magnitude when represented in a hallucinated gradient space, a strategy designed to incorporate both predictive uncertainty and sample diversity into every selected batch. Crucially, BADGE trades off between diversity and uncertainty without requiring any hand-tuned hyperparameters. While other approaches sometimes succeed for particular batch sizes or architectures, BADGE consistently performs as well or better, making it a useful option for real world active learning problems.
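For concreteness, here is a compact NumPy sketch of the two ingredients the abstract names: (i) a gradient embedding for each unlabeled point, computed from the last layer under the model's own hallucinated (argmax) label, and (ii) k-means++-style seeding over those embeddings, which jointly favors high-magnitude (uncertain) and mutually distant (diverse) points. Array shapes and the uniformly random first seed are simplifying assumptions, not the authors' reference implementation.

import numpy as np

def badge_embeddings(probs, penultimate):
    """probs: (n, C) softmax outputs; penultimate: (n, d) features feeding the last layer.
    Returns (n, C*d) gradient embeddings of the cross-entropy loss at the hallucinated label."""
    n = probs.shape[0]
    grad_logits = probs.copy()
    grad_logits[np.arange(n), probs.argmax(axis=1)] -= 1.0   # d(loss)/d(logits) with y = argmax prediction
    return (grad_logits[:, :, None] * penultimate[:, None, :]).reshape(n, -1)

def kmeanspp_select(emb, k, seed=0):
    """k-means++ seeding: each new point is drawn with probability proportional to its
    squared distance from the points already chosen, so large-norm, far-apart embeddings win."""
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(emb)))]
    d2 = ((emb - emb[chosen[0]]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        nxt = int(rng.choice(len(emb), p=d2 / d2.sum()))
        chosen.append(nxt)
        d2 = np.minimum(d2, ((emb - emb[nxt]) ** 2).sum(axis=1))
    return chosen

A batch is then queried as kmeanspp_select(badge_embeddings(probs, penultimate), k=batch_size).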

Proceedings ArticleDOI
30 Oct 2020
TL;DR: This paper proposes the first framework, called APIGraph, to enhance state-of-the-art malware classifiers with the similarity information among evolved Android malware in terms of semantically-equivalent or similar API usages, thus naturally slowing down classifier aging.
Abstract: Machine learning (ML) classifiers have been widely deployed to detect Android malware, but the application of ML classifiers also faces an emerging problem. The performance of such classifiers degrades, or ages, significantly over time as malware evolves. Prior works have proposed to use retraining or active learning to reverse and improve aged models. However, the underlying classifier itself is still blind, unaware of malware evolution. Unsurprisingly, such evolution-insensitive retraining or active learning comes at a price, i.e., the labeling of tens of thousands of malware samples and the cost of significant human effort. In this paper, we propose the first framework, called APIGraph, to enhance state-of-the-art malware classifiers with the similarity information among evolved Android malware in terms of semantically-equivalent or similar API usages, thus naturally slowing down classifier aging. Our evaluation shows that because of the slow-down of classifier aging, APIGraph saves significant amounts of the human effort required by active learning in labeling new malware samples.

Journal ArticleDOI
TL;DR: The results indicate that SALK can locally approximate the limit-state surfaces around the final SRBDO solution and efficiently reduce the computational cost of refining the region far from the final SRBDO solution.

Journal ArticleDOI
23 Jul 2020
TL;DR: An active learning scheme for automatically sampling a minimum number of uncorrelated configurations for fitting the Gaussian Approximation Potential (GAP) model is proposed and shown to be able to extract a configuration that reaches the required energy fit tolerance.
Abstract: We propose an active learning scheme for automatically sampling a minimum number of uncorrelated configurations for fitting the Gaussian Approximation Potential (GAP). Our active learning scheme consists of an unsupervised machine learning (ML) scheme coupled with a Bayesian optimization technique that evaluates the GAP model. We apply this scheme to a Hafnium dioxide (HfO2) dataset generated from a “melt-quench” ab initio molecular dynamics (AIMD) protocol. Our results show that the active learning scheme, with no prior knowledge of the dataset, is able to extract a configuration that reaches the required energy fit tolerance. Further, molecular dynamics (MD) simulations performed using this active learned GAP model on 6144 atom systems of amorphous and liquid state elucidate the structural properties of HfO2 with near ab initio precision and quench rates (i.e., 1.0 K/ps) not accessible via AIMD. The melt and amorphous X-ray structural factors generated from our simulation are in good agreement with experiment. In addition, the calculated diffusion constants are in good agreement with previous ab initio studies.

Journal ArticleDOI
TL;DR: In this paper, Wang et al. design useful user selection criteria based on items' attributes and users' rating history, combine the criteria in an optimization framework for selecting users, and then generate accurate rating predictions for the other unselected users.
Abstract: In recommender systems, cold-start issues are situations where no previous events, e.g., ratings, are known for certain users or items. In this paper, we focus on the item cold-start problem. Both content information (e.g., item attributes) and initial user ratings are valuable for seizing users’ preferences on a new item. However, previous methods for the item cold-start problem either (1) incorporate content information into collaborative filtering to perform hybrid recommendation, or (2) actively select users to rate the new item without considering content information and then do collaborative filtering. In this paper, we propose a novel recommendation scheme for the item cold-start problem by leveraging both active learning and items’ attribute information. Specifically, we design useful user selection criteria based on items’ attributes and users’ rating history, and combine the criteria in an optimization framework for selecting users. By exploiting the feedback ratings, users’ previous ratings and items’ attributes, we then generate accurate rating predictions for the other unselected users. Experimental results on two real-world datasets show the superiority of our proposed method over traditional methods.

Journal ArticleDOI
TL;DR: By combining human and machine intelligence, Galaxy Zoo will be able to classify surveys of any conceivable scale on a timescale of weeks, providing massive and detailed morphology catalogues to support research into galaxy evolution.
Abstract: We use Bayesian convolutional neural networks and a novel generative model of Galaxy Zoo volunteer responses to infer posteriors for the visual morphology of galaxies. Bayesian CNNs can learn from galaxy images with uncertain labels and then, for previously unlabelled galaxies, predict the probability of each possible label. Our posteriors are well-calibrated (e.g. for predicting bars, we achieve coverage errors of 11.8 per cent within a vote fraction deviation of 0.2) and hence are reliable for practical use. Further, using our posteriors, we apply the active learning strategy BALD to request volunteer responses for the subset of galaxies which, if labelled, would be most informative for training our network. We show that training our Bayesian CNNs using active learning requires up to 35–60 per cent fewer labelled galaxies, depending on the morphological feature being classified. By combining human and machine intelligence, Galaxy Zoo will be able to classify surveys of any conceivable scale on a time-scale of weeks, providing massive and detailed morphology catalogues to support research into galaxy evolution.
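The BALD score used to decide which galaxies to send to volunteers is short enough to write out. A minimal sketch, assuming T stochastic (e.g., Monte Carlo dropout) forward passes have already been collected for each galaxy; this is the standard BALD formula rather than the authors' exact pipeline.

import numpy as np

def bald_scores(mc_probs, eps=1e-12):
    """mc_probs: (T, N, C) class probabilities from T stochastic forward passes.
    BALD = predictive entropy minus expected entropy, i.e. the mutual information between
    the label and the network weights; high values mark the most informative galaxies."""
    mean_p = mc_probs.mean(axis=0)                                      # (N, C)
    predictive_entropy = -(mean_p * np.log(mean_p + eps)).sum(axis=1)
    expected_entropy = -(mc_probs * np.log(mc_probs + eps)).sum(axis=2).mean(axis=0)
    return predictive_entropy - expected_entropy

# query = np.argsort(-bald_scores(mc_probs))[:batch_size]   # galaxies to label next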

Proceedings ArticleDOI
01 Nov 2020
TL;DR: With BERT, a simple strategy based on the masked language modeling loss is developed that minimizes labeling costs for text classification and reaches higher accuracy with fewer sampling iterations and less computation time.
Abstract: Active learning strives to reduce annotation costs by choosing the most critical examples to label. Typically, the active learning strategy is contingent on the classification model. For instance, uncertainty sampling depends on poorly calibrated model confidence scores. In the cold-start setting, active learning is impractical because of model instability and data scarcity. Fortunately, modern NLP provides an additional source of information: pre-trained language models. The pre-training loss can find examples that surprise the model and should be labeled for efficient fine-tuning. Therefore, we treat the language modeling loss as a proxy for classification uncertainty. With BERT, we develop a simple strategy based on the masked language modeling loss that minimizes labeling costs for text classification. Compared to other baselines, our approach reaches higher accuracy with fewer sampling iterations and less computation time.
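The core of the strategy is simply ranking unlabeled texts by their masked-language-modeling loss. The sketch below uses the Hugging Face Transformers API for a single noisy estimate per text; the 15% masking rate and single pass are illustrative choices, not necessarily the paper's exact protocol.

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def mlm_loss(text, mask_prob=0.15, seed=0):
    """Average masked-LM loss for one text; a high loss marks a 'surprising' example worth labeling."""
    torch.manual_seed(seed)
    enc = tokenizer(text, return_tensors="pt", truncation=True)
    input_ids, labels = enc["input_ids"].clone(), enc["input_ids"].clone()
    special = torch.tensor(tokenizer.get_special_tokens_mask(
        enc["input_ids"][0].tolist(), already_has_special_tokens=True)).bool()
    maskable = (~special) & (torch.rand(input_ids.shape[1]) < mask_prob)
    if not maskable.any():                      # nothing masked: treat as uninformative
        return 0.0
    labels[0, ~maskable] = -100                 # positions ignored by the loss
    input_ids[0, maskable] = tokenizer.mask_token_id
    with torch.no_grad():
        out = model(input_ids=input_ids, attention_mask=enc["attention_mask"], labels=labels)
    return out.loss.item()

# to_label = sorted(unlabeled_texts, key=mlm_loss, reverse=True)[:budget]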

Proceedings ArticleDOI
14 Jun 2020
TL;DR: This paper proposes a state relabeling adversarial active learning model (SRAAL) that leverages both the annotation and the labeled/unlabeled state information for deriving the most informative unlabeled samples.
Abstract: Active learning is to design label-efficient algorithms by sampling the most representative samples to be labeled by an oracle. In this paper, we propose a state relabeling adversarial active learning model (SRAAL) that leverages both the annotation and the labeled/unlabeled state information for deriving the most informative unlabeled samples. The SRAAL consists of a representation generator and a state discriminator. The generator uses the complementary annotation information together with traditional reconstruction information to generate a unified representation of samples, which embeds the semantics into the whole data representation. Then, we design an online uncertainty indicator in the discriminator, which endows unlabeled samples with different importance. As a result, we can select the most informative samples based on the discriminator's predicted state. We also design an algorithm to initialize the labeled pool, which makes subsequent sampling more efficient. Experiments conducted on various datasets show that our model outperforms previous state-of-the-art active learning methods and that our initial sampling algorithm achieves better performance.

Journal ArticleDOI
TL;DR: This paper emphasizes the importance of reasoning for building interpretable and knowledge-driven neural NLP models that can handle complex tasks.

Journal ArticleDOI
TL;DR: This work trains a model on just 72 compounds to make predictions over a 10,833-compound library, identifying and experimentally validating compounds with nanomolar affinity for diverse kinases and whole-cell growth inhibition of Mycobacterium tuberculosis.
Abstract: Machine learning that generates biological hypotheses has transformative potential, but most learning algorithms are susceptible to pathological failure when exploring regimes beyond the training data distribution. A solution to address this issue is to quantify prediction uncertainty so that algorithms can gracefully handle novel phenomena that confound standard methods. Here, we demonstrate the broad utility of robust uncertainty prediction in biological discovery. By leveraging Gaussian process-based uncertainty prediction on modern pre-trained features, we train a model on just 72 compounds to make predictions over a 10,833-compound library, identifying and experimentally validating compounds with nanomolar affinity for diverse kinases and whole-cell growth inhibition of Mycobacterium tuberculosis. Uncertainty facilitates a tight iterative loop between computation and experimentation and generalizes across biological domains as diverse as protein engineering and single-cell transcriptomics. More broadly, our work demonstrates that uncertainty should play a key role in the increasing adoption of machine learning algorithms into the experimental lifecycle.
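A minimal scikit-learn sketch of the screening loop described above: fit a Gaussian process on featurized training compounds, then rank the library by a score that rewards both high predicted activity and high uncertainty. The feature matrices, the kernel choice, and the exploration weight kappa are illustrative assumptions, not the authors' exact model.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def rank_library(X_train, y_train, X_library, kappa=1.0, top_k=100):
    """Return indices of the top_k library compounds by an upper-confidence-bound score."""
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(),
                                  normalize_y=True).fit(X_train, y_train)
    mean, std = gp.predict(X_library, return_std=True)
    score = mean + kappa * std          # trade predicted activity against model uncertainty
    return np.argsort(-score)[:top_k], mean, std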

Journal ArticleDOI
01 Jun 2020
TL;DR: It can be concluded that machine learning methods are very effective for spatial data handling and have wide application potential in the big data era.
Abstract: Most machine learning tasks can be categorized into classification or regression problems. Regression and classification models are normally used to extract useful geographic information from observed or measured spatial data, such as land cover classification, spatial interpolation, and quantitative parameter retrieval. This paper reviews the progress of four advanced machine learning methods for spatial data handling, namely, support vector machine (SVM)-based kernel learning, semi-supervised and active learning, ensemble learning, and deep learning. These four machine learning methods are representative because they improve learning performance from different perspectives, for example, feature space transformation and decision functions (SVM), optimized use of samples (semi-supervised and active learning), and enhanced learning models and capabilities (ensemble learning and deep learning). For spatial data handling via machine learning, the three key elements that these four methods can improve are the learning algorithms, the training samples, and the input features. To apply machine learning methods to spatial data handling successfully, a four-level strategy is suggested: experimenting and evaluating the applicability, extending the algorithms by embedding spatial properties, optimizing the parameters for better performance, and enhancing the algorithm by multiple means. First, advances in SVM are reviewed to demonstrate the merits of novel machine learning methods for spatial data, running from direct use and comparison with traditional classifiers to targeted improvements that address multi-class problems, optimize SVM parameters, and exploit spatial and spectral features. To overcome the limits of small training sets, semi-supervised learning and active learning methods are then utilized to deal with insufficient labeled samples, showing the potential of learning from small numbers of training samples. Furthermore, considering the poor generalization capacity and instability of machine learning algorithms, ensemble learning is introduced to integrate the advantages of multiple learners and to enhance generalization capacity. The typical research lines, including the combination of multiple classifiers, advanced ensemble classifiers, and spatial interpolation, are presented. Finally, deep learning, one of the most popular branches of machine learning, is reviewed with specific examples for scene classification and urban structural type recognition from high-resolution remote sensing images. From this review, it can be concluded that machine learning methods are very effective for spatial data handling and have wide application potential in the big data era.

Journal ArticleDOI
TL;DR: The on-the-fly generation of machine-learning force fields by active-learning schemes is demonstrated by presenting recent applications; overall, simulations are accelerated by several orders of magnitude while retaining almost first-principles accuracy.
Abstract: The on-the-fly generation of machine-learning force fields by active-learning schemes attracts a great deal of attention in the community of atomistic simulations. The algorithms allow the machine to self-learn an interatomic potential and construct machine-learned models on the fly during simulations. State-of-the-art query strategies allow the machine to judge whether new structures are outside the training data set or not. Only when the machine judges it necessary to update the data set with the new structures are first-principles calculations carried out. Otherwise, the currently available machine-learned model is used to update the atomic positions. In this manner, most of the first-principles calculations are bypassed during training, and overall, simulations are accelerated by several orders of magnitude while retaining almost first-principles accuracy. In this Perspective, after describing the essential components of the active-learning algorithms, we demonstrate the power of the schemes by presenting recent applications.

Journal ArticleDOI
TL;DR: The experimental results show that the proposed cost-sensitive active learning bidirectional gated recurrent unit (CSALBGRU) method achieves better performance in both binary fault diagnosis and multi-class fault diagnosis.

Journal ArticleDOI
TL;DR: An open-source machine learning-aided pipeline applying active learning, ASReview, is developed, and simulation studies demonstrate that ASReview can yield far more efficient reviewing than manual reviewing while providing high quality.
Abstract: To help researchers conduct a systematic review or meta-analysis as efficiently and transparently as possible, we designed a tool (ASReview) to accelerate the step of screening titles and abstracts. For many tasks - including but not limited to systematic reviews and meta-analyses - the scientific literature needs to be checked systematically. Currently, scholars and practitioners screen thousands of studies by hand to determine which studies to include in their review or meta-analysis. This is error prone and inefficient because of extremely imbalanced data: only a fraction of the screened studies is relevant. The future of systematic reviewing will be an interaction with machine learning algorithms to deal with the enormous increase of available text. We therefore developed an open source machine learning-aided pipeline applying active learning: ASReview. We demonstrate by means of simulation studies that ASReview can yield far more efficient reviewing than manual reviewing, while providing high quality. Furthermore, we describe the options of the free and open source research software and present the results from user experience tests. We invite the community to contribute to open source projects such as our own that provide measurable and reproducible improvements over current practice.
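The screening loop ASReview automates can be illustrated in miniature with a TF-IDF representation, a logistic-regression classifier, and certainty-based sampling (surface the records most likely to be relevant first). This toy sketch is not the ASReview codebase itself; review_record is a placeholder for the human decision, and the seed labels must contain at least one relevant and one irrelevant record.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def screen(abstracts, seed_labels, review_record, n_rounds=50, batch=10):
    """abstracts: list of strings; seed_labels: dict {index: 0 or 1}; returns all collected labels."""
    X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
    labels = dict(seed_labels)
    for _ in range(n_rounds):
        idx = np.array(sorted(labels))
        clf = LogisticRegression(max_iter=1000).fit(X[idx], [labels[i] for i in idx])
        unseen = np.array([i for i in range(len(abstracts)) if i not in labels])
        if unseen.size == 0:
            break
        p_relevant = clf.predict_proba(X[unseen])[:, 1]
        for i in unseen[np.argsort(-p_relevant)[:batch]]:    # show likely-relevant records first
            labels[int(i)] = review_record(int(i))           # the human keeps the final say
    return labels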

Proceedings Article
03 Jun 2020
TL;DR: A unified and principled method for both the querying and training processes in deep batch active learning is proposed, providing theoretical insights from the intuition of modeling the interactive procedure in active learning as distribution matching by adopting the Wasserstein distance.
Abstract: In this paper, we are proposing a unified and principled method for both the querying and training processes in deep batch active learning. We are providing theoretical insights from the intuition of modeling the interactive procedure in active learning as distribution matching, by adopting the Wasserstein distance. As a consequence, we derived a new training loss from the theoretical analysis, which is decomposed into optimizing deep neural network parameters and batch query selection through alternative optimization. In addition, the loss for training a deep neural network is naturally formulated as a min-max optimization problem through leveraging the unlabeled data information. Moreover, the proposed principles also indicate an explicit uncertainty-diversity trade-off in the query batch selection. Finally, we evaluate our proposed method on different benchmarks, consistently showing better empirical performance and a more time-efficient query strategy compared to the baselines.

Journal ArticleDOI
TL;DR: The combination of multi-objective optimisation based on machine learning methods (TSEMO algorithm) with self-optimising platforms for the optimisation of multi-step continuous reaction processes with respect to multiple objectives has the potential to make substantial savings in time and resources.

Journal ArticleDOI
TL;DR: A new adaptive approach is developed for reliability analysis by ensemble learning of multiple competitive surrogate models, including Kriging, polynomial chaos expansion and support vector regression, that is very efficient for estimating the failure probability (>10⁻⁴) of complex systems at lower computational cost than the traditional single surrogate model.

Journal ArticleDOI
TL;DR: An integrated detection framework for solder joint defects in the context of Automatic Optical Inspection (AOI) of Printed Circuit Boards (PCBs) is proposed, together with an active learning method that reduces the labeling workload when a large labeled training database is not readily available.

Posted Content
TL;DR: The proposed method extends variational adversarial active learning (VAAL), which considers the data distribution of both the labeled and unlabeled pools, by incorporating a learning loss prediction module and the RankCGAN concept into VAAL, modeling loss prediction as a ranker.
Abstract: Often, labeling large amounts of data is challenging because of the high labeling cost, which limits the application domains of deep learning techniques. Active learning (AL) tackles this by querying the most informative samples to be annotated from an unlabeled pool. Two promising directions for AL that have been explored recently are the task-agnostic approach, which selects data points that are far from the current labeled pool, and the task-aware approach, which relies on the perspective of the task model. Unfortunately, the former does not exploit structure from the tasks and the latter does not seem to utilize the overall data distribution well. Here, we propose task-aware variational adversarial AL (TA-VAAL), which modifies task-agnostic VAAL, a method that considers the data distribution of both the labeled and unlabeled pools, by relaxing task learning loss prediction to ranking loss prediction and by using a ranking conditional generative adversarial network to embed normalized ranking loss information into VAAL. Our proposed TA-VAAL outperforms the state of the art on various benchmark datasets for classification with balanced and imbalanced labels as well as for semantic segmentation, and its task-aware and task-agnostic AL properties were confirmed by our in-depth analyses.

Proceedings ArticleDOI
TL;DR: A novel framework called Active Semi-supervised Graph Neural Network (ASGN) is proposed that incorporates both labeled and unlabeled molecules and adopts a teacher-student framework to learn general representations that jointly exploit information from molecular structure and molecular distribution.
Abstract: Molecular property prediction (e.g., energy) is an essential problem in chemistry and biology. Unfortunately, many supervised learning methods usually suffer from the problem of scarce labeled molecules in the chemical space, where such property labels are generally obtained by Density Functional Theory (DFT) calculation, which is extremely computationally costly. An effective solution is to incorporate the unlabeled molecules in a semi-supervised fashion. However, learning semi-supervised representations for large amounts of molecules is challenging, including the joint representation of both molecular essence and structure, and the conflict between representation learning and property learning. Here we propose a novel framework called Active Semi-supervised Graph Neural Network (ASGN) that incorporates both labeled and unlabeled molecules. Specifically, ASGN adopts a teacher-student framework. In the teacher model, we propose a novel semi-supervised learning method to learn general representations that jointly exploit information from molecular structure and molecular distribution. Then, in the student model, we target the property prediction task to deal with the learning loss conflict. Finally, we propose a novel active learning strategy in terms of molecular diversity to select informative data during the whole framework learning. We conduct extensive experiments on several public datasets. Experimental results show the remarkable performance of our ASGN framework.