
Showing papers on "Active learning (machine learning)" published in 2017


Journal ArticleDOI
13 Sep 2017-Nature
TL;DR: The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers.
Abstract: Recent progress implies that a crossover between machine learning and quantum information processing benefits both fields. Traditional machine learning has dramatically improved the benchmarking an ...

2,162 citations


Proceedings ArticleDOI
27 Nov 2017
TL;DR: This paper develops an active learning framework for high dimensional data, a task which has been extremely challenging so far, with very sparse existing literature, and demonstrates its active learning techniques with image data, obtaining a significant improvement on existing active learning approaches.
Abstract: Even though active learning forms an important pillar of machine learning, deep learning tools are not prevalent within it. Deep learning poses several difficulties when used in an active learning setting. First, active learning (AL) methods generally rely on being able to learn and update models from small amounts of data. Recent advances in deep learning, on the other hand, are notorious for their dependence on large amounts of data. Second, many AL acquisition functions rely on model uncertainty, yet deep learning methods rarely represent such model uncertainty. In this paper we combine recent advances in Bayesian deep learning into the active learning framework in a practical way. We develop an active learning framework for high dimensional data, a task which has been extremely challenging so far, with very sparse existing literature. Taking advantage of specialised models such as Bayesian convolutional neural networks, we demonstrate our active learning techniques with image data, obtaining a significant improvement on existing active learning approaches. We demonstrate this on both the MNIST dataset, as well as for skin cancer diagnosis from lesion images (ISIC2016 task).

1,139 citations
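
The acquisition loop this abstract describes is compact enough to sketch. Below is a minimal, hypothetical PyTorch illustration of the core mechanism (Monte Carlo dropout as approximate Bayesian inference, with a max-entropy acquisition function); `model` and the pool tensor are placeholders, and the paper evaluates several acquisition functions beyond this one.

```python
import torch
import torch.nn.functional as F

def mc_dropout_entropy(model, x, T=20):
    """Predictive entropy from T stochastic forward passes with dropout on."""
    model.train()  # keeps dropout active at inference (a simplification: real code enables only the dropout layers)
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(x), dim=1) for _ in range(T)])
    mean_probs = probs.mean(dim=0)  # approximate predictive distribution
    return -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=1)

def select_queries(model, pool_x, n_queries=10):
    """Pick the pool points whose predictions are most uncertain."""
    scores = mc_dropout_entropy(model, pool_x)
    return scores.topk(n_queries).indices  # indices to send to the human labeller
```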


Journal ArticleDOI
TL;DR: This work presents a way of thinking about machine learning that gives it its own place in the econometric toolbox, and aims to make machine learning algorithms conceptually easier to use by providing a crisper understanding of how they work, where they excel, and where they can stumble.
Abstract: Machines are increasingly doing “intelligent” things. Face recognition algorithms use a large dataset of photos labeled as having a face or not to estimate a function that predicts the pre...

1,055 citations


Journal ArticleDOI
TL;DR: The goal is to assist the readers in refining the motivation, problem formulation, and methodology of powerful machine learning algorithms in the context of future networks in order to tap into hitherto unexplored applications and services.
Abstract: Next-generation wireless networks are expected to support extremely high data rates and radically new applications, which require a new wireless radio technology paradigm. The challenge is that of assisting the radio in intelligent adaptive learning and decision making, so that the diverse requirements of next-generation wireless networks can be satisfied. Machine learning is one of the most promising artificial intelligence tools, conceived to support smart radio terminals. Future smart 5G mobile terminals are expected to autonomously access the most meritorious spectral bands with the aid of sophisticated spectral efficiency learning and inference, in order to control the transmission power, while relying on energy efficiency learning/inference and simultaneously adjusting the transmission protocols with the aid of quality of service learning/inference. Hence we briefly review the rudimentary concepts of machine learning and propose their employment in the compelling applications of 5G networks, including cognitive radios, massive MIMOs, femto/small cells, heterogeneous networks, smart grid, energy harvesting, device-to-device communications, and so on. Our goal is to assist the readers in refining the motivation, problem formulation, and methodology of powerful machine learning algorithms in the context of future networks in order to tap into hitherto unexplored applications and services.

958 citations


Proceedings ArticleDOI
30 Oct 2017
TL;DR: In this article, the authors show that any privacy-preserving collaborative deep learning model is susceptible to a powerful attack that exploits the real-time nature of the learning process that allows the adversary to train a Generative Adversarial Network (GAN) that generates prototypical samples of the targeted training set that was meant to be private (the samples generated by the GAN are intended to come from the same distribution as the training data).
Abstract: Deep Learning has recently become hugely popular in machine learning for its ability to solve end-to-end learning systems, in which the features and the classifiers are learned simultaneously, providing significant improvements in classification accuracy in the presence of highly-structured and large databases. Its success is due to a combination of recent algorithmic breakthroughs, increasingly powerful computers, and access to significant amounts of data. Researchers have also considered privacy implications of deep learning. Models are typically trained in a centralized manner with all the data being processed by the same training algorithm. If the data is a collection of users' private data, including habits, personal pictures, geographical positions, interests, and more, the centralized server will have access to sensitive information that could potentially be mishandled. To tackle this problem, collaborative deep learning models have recently been proposed where parties locally train their deep learning structures and only share a subset of the parameters in the attempt to keep their respective training sets private. Parameters can also be obfuscated via differential privacy (DP) to make information extraction even more challenging, as proposed by Shokri and Shmatikov at CCS'15. Unfortunately, we show that any privacy-preserving collaborative deep learning is susceptible to a powerful attack that we devise in this paper. In particular, we show that a distributed, federated, or decentralized deep learning approach is fundamentally broken and does not protect the training sets of honest participants. The attack we developed exploits the real-time nature of the learning process that allows the adversary to train a Generative Adversarial Network (GAN) that generates prototypical samples of the targeted training set that was meant to be private (the samples generated by the GAN are intended to come from the same distribution as the training data). Interestingly, we show that record-level differential privacy applied to the shared parameters of the model, as suggested in previous work, is ineffective (i.e., record-level DP is not designed to address our attack).

832 citations
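
The attack's inner loop can be sketched compactly. The following is a heavily simplified, hypothetical PyTorch illustration, not the authors' code: the adversary treats the downloaded global model as the GAN discriminator and trains a generator toward a class only the victim holds. The paper additionally introduces an artificial "fake" class and injects the generated, mislabelled samples back into the collaborative training; `G`, `global_model`, and `target_class` are placeholders.

```python
import torch
import torch.nn.functional as F

def adversary_gan_step(G, global_model, target_class, opt_G, z_dim=100, batch=64):
    """One generator update: push samples toward the victim's private class."""
    global_model.eval()  # the shared model plays the discriminator's role
    z = torch.randn(batch, z_dim)
    fake = G(z)
    logits = global_model(fake)
    labels = torch.full((batch,), target_class, dtype=torch.long)
    loss = F.cross_entropy(logits, labels)
    opt_G.zero_grad()
    loss.backward()
    opt_G.step()
    return fake.detach()  # later injected, mislabelled, into the adversary's local updates
```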


Posted Content
TL;DR: A graph neural network architecture is defined that generalizes several of the recently proposed few-shot learning models and provides improved numerical performance, and is easily extended to variants of few-shot learning, such as semi-supervised or active learning, demonstrating the ability of graph-based models to operate well on 'relational' tasks.
Abstract: We propose to study the problem of few-shot learning with the prism of inference on a partially observed graphical model, constructed from a collection of input images whose label can be either observed or not. By assimilating generic message-passing inference algorithms with their neural-network counterparts, we define a graph neural network architecture that generalizes several of the recently proposed few-shot learning models. Besides providing improved numerical performance, our framework is easily extended to variants of few-shot learning, such as semi-supervised or active learning, demonstrating the ability of graph-based models to operate well on 'relational' tasks.

724 citations
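
A single layer of such a graph network is straightforward to sketch. The snippet below is an illustrative simplification, not the paper's architecture (shapes and module names are assumptions): edges are predicted from pairwise node-feature differences, and message passing over the learned adjacency propagates label information to the unlabelled nodes of a few-shot task.

```python
import torch
import torch.nn as nn

class GraphLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.edge_mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))
        self.node_lin = nn.Linear(dim, dim)

    def forward(self, h):  # h: (batch, n_nodes, dim) image embeddings + label encodings
        diff = (h.unsqueeze(2) - h.unsqueeze(1)).abs()               # pairwise |h_i - h_j|
        adj = torch.softmax(self.edge_mlp(diff).squeeze(-1), dim=-1)  # learned edge weights
        return torch.relu(self.node_lin(adj @ h))                     # aggregate neighbours
```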


Journal ArticleDOI
22 Mar 2017
TL;DR: The idea of using continuous dynamical systems to model general high-dimensional nonlinear functions used in machine learning and the connection with deep learning is discussed.
Abstract: We discuss the idea of using continuous dynamical systems to model general high-dimensional nonlinear functions used in machine learning. We also discuss the connection with deep learning.

608 citations
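
The connection can be stated in one line: a residual block x_{k+1} = x_k + h·f(x_k) is exactly one forward-Euler step of the ODE dx/dt = f(x), so a deep residual network is a discretized continuous-time dynamical system. A toy sketch (layer width, step size, and depth are arbitrary illustrations):

```python
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(8, 8), nn.Tanh(), nn.Linear(8, 8))  # the vector field

def deep_net_as_ode_solver(x, steps=10, h=0.1):
    for _ in range(steps):      # each "layer" is one Euler step of dx/dt = f(x)
        x = x + h * f(x)
    return x
```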


Journal ArticleDOI
TL;DR: This paper compiles, summarizes, and organizes machine learning challenges with Big Data, highlighting the cause–effect relationship by organizing challenges according to Big Data Vs or dimensions that instigated the issue: volume, velocity, variety, or veracity.
Abstract: The Big Data revolution promises to transform how we live, work, and think by enabling process optimization, empowering insight discovery and improving decision making. The realization of this grand potential relies on the ability to extract value from such massive data through data analytics; machine learning is at its core because of its ability to learn from data and provide data-driven insights, decisions, and predictions. However, traditional machine learning approaches were developed in a different era, and thus are based upon multiple assumptions, such as the data set fitting entirely into memory, which unfortunately no longer holds true in this new context. These broken assumptions, together with the Big Data characteristics, are creating obstacles for the traditional techniques. Consequently, this paper compiles, summarizes, and organizes machine learning challenges with Big Data. In contrast to other research that discusses challenges, this work highlights the cause–effect relationship by organizing challenges according to the Big Data Vs or dimensions that instigated the issue: volume, velocity, variety, or veracity. Moreover, emerging machine learning approaches and techniques are discussed in terms of how they are capable of handling the various challenges with the ultimate objective of helping practitioners select appropriate solutions for their use cases. Finally, a matrix relating the challenges and approaches is presented. Through this process, this paper provides a perspective on the domain, identifies research gaps and opportunities, and provides a strong foundation and encouragement for further research in the field of machine learning with Big Data.

592 citations


Journal ArticleDOI
TL;DR: This paper proposes a novel active learning (AL) framework, which is capable of building a competitive classifier with optimal feature representation via a limited amount of labeled training instances in an incremental learning manner and incorporates deep convolutional neural networks into AL.
Abstract: Recent successes in learning-based image classification heavily rely on a large number of annotated training samples, which may require considerable human effort. In this paper, we propose a novel active learning (AL) framework, which is capable of building a competitive classifier with optimal feature representation via a limited amount of labeled training instances in an incremental learning manner. Our approach advances the existing AL methods in two aspects. First, we incorporate deep convolutional neural networks into AL. Through the properly designed framework, the feature representation and the classifier can be simultaneously updated with progressively annotated informative samples. Second, we present a cost-effective sample selection strategy to improve the classification performance with less manual annotation. Unlike traditional methods focusing on only the uncertain samples of low prediction confidence, we especially discover the large amount of high-confidence samples from the unlabeled set for feature learning. Specifically, these high-confidence samples are automatically selected and iteratively assigned pseudolabels. We thus call our framework cost-effective AL (CEAL), standing for these two advantages. Extensive experiments demonstrate that the proposed CEAL framework can achieve promising results on two challenging image classification data sets: face recognition on the Cross-Age Celebrity Dataset and object categorization on Caltech-256.

581 citations
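
CEAL's selection step combines two streams, which a short sketch makes concrete. The snippet below is an illustrative simplification (the paper thresholds entropy and decays the threshold over iterations; plain softmax confidence stands in here), assuming `probs` holds the current CNN's softmax outputs on the unlabelled pool:

```python
import numpy as np

def ceal_select(probs, k_uncertain=100, confidence_thresh=0.99):
    confidence = probs.max(axis=1)
    uncertain_idx = np.argsort(confidence)[:k_uncertain]      # send to the human oracle
    pseudo_idx = np.where(confidence > confidence_thresh)[0]  # auto pseudo-label, no human check
    pseudo_labels = probs[pseudo_idx].argmax(axis=1)
    return uncertain_idx, pseudo_idx, pseudo_labels
```

Both streams then feed the next fine-tuning round: the uncertain samples with their new human labels, the high-confidence samples with their pseudo-labels.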


Journal ArticleDOI
TL;DR: This article surveys imitation learning methods and presents design options in different steps of the learning process, and extensively discusses combining imitation learning approaches using different sources and methods, as well as incorporating other motion learning methods to enhance imitation.
Abstract: Imitation learning techniques aim to mimic human behavior in a given task. An agent (a learning machine) is trained to perform a task from demonstrations by learning a mapping between observations and actions. The idea of teaching by imitation has been around for many years; however, the field is gaining attention recently due to advances in computing and sensing as well as rising demand for intelligent applications. The paradigm of learning by imitation is gaining popularity because it facilitates teaching complex tasks with minimal expert knowledge of the tasks. Generic imitation learning methods could potentially reduce the problem of teaching a task to that of providing demonstrations, without the need for explicit programming or designing reward functions specific to the task. Modern sensors are able to collect and transmit high volumes of data rapidly, and processors with high computational power allow fast processing that maps the sensory data to actions in a timely manner. This opens the door for many potential AI applications that require real-time perception and reaction, such as humanoid robots, self-driving vehicles, human-computer interaction, and computer games, to name a few. However, specialized algorithms are needed to effectively and robustly learn models, as learning by imitation poses its own set of challenges. In this article, we survey imitation learning methods and present design options in different steps of the learning process. We introduce a background and motivation for the field as well as highlight challenges specific to the imitation problem. Methods for designing and evaluating imitation learning tasks are categorized and reviewed. Special attention is given to learning methods in robotics and games as these domains are the most popular in the literature and provide a wide array of problems and methodologies. We extensively discuss combining imitation learning approaches using different sources and methods, as well as incorporating other motion learning methods to enhance imitation. We also discuss the potential impact on industry, present major applications, and highlight current and future research directions.

535 citations


Book ChapterDOI
Lin Yang, Yizhe Zhang, Jianxu Chen, Siyuan Zhang, Danny Z. Chen
10 Sep 2017
TL;DR: A deep active learning framework that combines fully convolutional network (FCN) and active learning to significantly reduce annotation effort by making judicious suggestions on the most effective annotation areas is presented.
Abstract: Image segmentation is a fundamental problem in biomedical image analysis. Recent advances in deep learning have achieved promising results on many biomedical image segmentation benchmarks. However, due to large variations in biomedical images (different modalities, image settings, objects, noise, etc.), to utilize deep learning on a new application, it usually needs a new set of training data. This can incur a great deal of annotation effort and cost, because only biomedical experts can annotate effectively, and often there are too many instances in images (e.g., cells) to annotate. In this paper, we aim to address the following question: With limited effort (e.g., time) for annotation, what instances should be annotated in order to attain the best performance? We present a deep active learning framework that combines fully convolutional network (FCN) and active learning to significantly reduce annotation effort by making judicious suggestions on the most effective annotation areas. We utilize uncertainty and similarity information provided by FCN and formulate a generalized version of the maximum set cover problem to determine the most representative and uncertain areas for annotation. Extensive experiments using the 2015 MICCAI Gland Challenge dataset and a lymph node ultrasound image segmentation dataset show that, using annotation suggestions by our method, state-of-the-art segmentation performance can be achieved by using only 50% of training data.
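
The suggestion step can be sketched as uncertainty filtering followed by greedy representative selection. This is a rough illustration under assumptions (`uncertainty` comes from an FCN ensemble's disagreement, `sim` is a pairwise cosine-similarity matrix of FCN image descriptors), with a greedy heuristic standing in for the paper's generalized maximum set cover formulation:

```python
import numpy as np

def suggest_annotations(uncertainty, sim, n_candidates=40, n_select=8):
    cand = np.argsort(-uncertainty)[:n_candidates]  # most uncertain images first
    chosen, covered = [], np.zeros(sim.shape[0])
    for _ in range(n_select):
        # pick the candidate that most increases how well the whole pool is "covered"
        gains = [np.maximum(covered, sim[c]).sum() for c in cand]
        best = cand[int(np.argmax(gains))]
        covered = np.maximum(covered, sim[best])
        chosen.append(int(best))
        cand = cand[cand != best]
    return chosen  # areas suggested to the expert for annotation
```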

Journal ArticleDOI
TL;DR: It is argued that, by employing model-based reinforcement learning, the—now limited—adaptability characteristics of robotic systems can be expanded, and that model-based reinforcement learning exhibits advantages that make it more applicable to real-life use cases than model-free methods.
Abstract: Reinforcement learning is an appealing approach for allowing robots to learn new tasks. Relevant literature reveals a plethora of methods, but at the same time makes clear the lack of implementations for dealing with real-life challenges. Current expectations raise the demand for adaptable robots. We argue that, by employing model-based reinforcement learning, the currently limited adaptability characteristics of robotic systems can be expanded. Also, model-based reinforcement learning exhibits advantages that make it more applicable to real-life use cases than model-free methods. Thus, in this survey, model-based methods that have been applied in robotics are covered. We categorize them based on the derivation of an optimal policy, the definition of the returns function, the type of the transition model, and the learned task. Finally, we discuss the applicability of model-based reinforcement learning approaches in new applications, taking into consideration the state of the art in both algorithms and hardware.

Journal ArticleDOI
TL;DR: It is shown that the proposed active learning approach to the fitting of machine learning interatomic potentials is highly efficient in training potentials on the fly, ensuring that no extrapolation is attempted and leading to a completely reliable atomistic simulation without any significant decrease in accuracy.

Journal ArticleDOI
TL;DR: Smart augmentation works by creating a network that learns how to generate augmented data during the training process of a target network in a way that reduces that network's loss, allowing augmentations that minimize the error of that network to be learned.
Abstract: A recurring problem faced when training neural networks is that there is typically not enough data to maximize the generalization capability of deep neural networks. There are many techniques to address this, including data augmentation, dropout, and transfer learning. In this paper, we introduce an additional method, which we call smart augmentation, and we show how to use it to increase the accuracy and reduce overfitting on a target network. Smart augmentation works by creating a network that learns how to generate augmented data during the training process of a target network in a way that reduces that network's loss. This allows us to learn augmentations that minimize the error of that network. Smart augmentation has shown the potential to increase accuracy by demonstrably significant measures on all data sets tested. In addition, it has shown potential to achieve similar or improved performance levels with significantly smaller network sizes in a number of tested cases.
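
The mechanism is easy to sketch end to end. Below is a condensed, hypothetical illustration (flattened 28x28 inputs and all layer sizes are arbitrary; the paper also adds a similarity loss between the generated sample and a real one): an augmenter network A merges two same-class samples and is trained through the target network's loss, so A learns augmentations that actually help.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

A = nn.Sequential(nn.Linear(2 * 784, 512), nn.ReLU(), nn.Linear(512, 784))   # augmenter
target = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))   # task network
opt = torch.optim.Adam(list(A.parameters()) + list(target.parameters()), lr=1e-3)

def smart_aug_step(x1, x2, y):
    """x1, x2: two flattened images of the same class y."""
    x_aug = A(torch.cat([x1, x2], dim=1))     # generated training sample
    loss = F.cross_entropy(target(x_aug), y)  # gradient flows into A as well
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```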

Proceedings ArticleDOI
12 Jul 2017
TL;DR: This work builds on work in label ranking and proposes to learn from preferences (or comparisons) instead: the person provides the system a relative preference between two trajectories. It also takes an active learning approach, in which the system decides what preference queries to make.
Abstract: Our goal is to efficiently learn reward functions encoding a human's preferences for how a dynamical system should act. There are two challenges with this. First, in many problems it is difficult for people to provide demonstrations of the desired system trajectory (like a high-DOF robot arm motion or an aggressive driving maneuver), or to even assign how much numerical reward an action or trajectory should get. We build on work in label ranking and propose to learn from preferences (or comparisons) instead: the person provides the system a relative preference between two trajectories. Second, the learned reward function strongly depends on what environments and trajectories were experienced during the training phase. We thus take an active learning approach, in which the system decides on what preference queries to make. A novel aspect of our work is the complexity and continuous nature of the queries: continuous trajectories of a dynamical system in environments with other moving agents (humans or robots). We contribute a method for actively synthesizing queries that satisfy the dynamics of the system. Further, we learn the reward function from a continuous hypothesis space by maximizing the volume removed from the hypothesis space by each query. We assign weights to the hypothesis space in the form of a log-concave distribution and provide a bound on the number of iterations required to converge. We show that our algorithm converges faster to the desired reward compared to approaches that are not active or that do not synthesize queries in an autonomous driving domain. We then run a user study to put our method to the test with real people.
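
The core update lends itself to a small numeric sketch. Assuming (as an illustration, not the paper's exact formulation) rewards linear in trajectory features, R(xi) = w·phi(xi), and a particle set `W` of sampled weight vectors, a hard-cut version of the volume-removal criterion looks like this; the paper instead uses a soft, log-concave update and synthesizes dynamically feasible queries:

```python
import numpy as np

def volume_removed(W, phi_a, phi_b):
    """Fraction of weight samples removed in the worst case, whichever answer comes back."""
    prefers_a = W @ (phi_a - phi_b) > 0
    return min(prefers_a.mean(), (~prefers_a).mean())

def best_query(W, candidate_pairs):
    """Ask the comparison that is guaranteed to shrink the hypothesis space the most."""
    return max(candidate_pairs, key=lambda p: volume_removed(W, p[0], p[1]))

def update(W, phi_a, phi_b, human_prefers_a):
    keep = (W @ (phi_a - phi_b) > 0) == human_prefers_a
    return W[keep]  # only hypotheses consistent with the answer survive
```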

Proceedings ArticleDOI
29 May 2017
TL;DR: A recurrent deep neural network learns patterns from sequences of network traffic and traces network attack activities, reducing the error rate compared with conventional machine learning methods on the larger data set.
Abstract: Distributed Denial of Service (DDoS) attacks grow rapidly and have become one of the fatal threats to the Internet. Automatically detecting DDoS attack packets is one of the main defense mechanisms. Conventional solutions monitor network traffic and identify attack activities from legitimate network traffic based on statistical divergence. Machine learning is another method to improve identification performance based on statistical features. However, conventional machine learning techniques are limited by the shallow representation models. In this paper, we propose a deep learning based DDoS attack detection approach (DeepDefense). The deep learning approach can automatically extract high-level features from low-level ones and gain powerful representation and inference. We design a recurrent deep neural network to learn patterns from sequences of network traffic and trace network attack activities. The experimental results demonstrate a better performance of our model compared with conventional machine learning models. We reduce the error rate from 7.517% to 2.103% compared with the conventional machine learning method on the larger data set.
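
A hedged sketch of the kind of recurrent detector described (the paper evaluates several recurrent variants; the feature count and window length below are placeholder assumptions): an LSTM reads a window of per-packet feature vectors and emits an attack/benign decision.

```python
import torch
import torch.nn as nn

class DDoSDetector(nn.Module):
    def __init__(self, n_features=20, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)  # attack vs. legitimate

    def forward(self, x):                 # x: (batch, window_len, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])      # classify from the last time step
```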

Journal ArticleDOI
TL;DR: This paper contributes a comprehensive survey and analysis of current methods designed for performing heterogeneous transfer learning tasks to provide an updated, centralized outlook into current methodologies.
Abstract: Transfer learning has been demonstrated to be effective for many real-world applications as it exploits knowledge present in labeled training data from a source domain to enhance a model’s performance in a target domain, which has little or no labeled target training data. Utilizing a labeled source, or auxiliary, domain for aiding a target task can greatly reduce the cost and effort of collecting sufficient training labels to create an effective model in the new target distribution. Currently, most transfer learning methods assume the source and target domains consist of the same feature spaces which greatly limits their applications. This is because it may be difficult to collect auxiliary labeled source domain data that shares the same feature space as the target domain. Recently, heterogeneous transfer learning methods have been developed to address such limitations. This, in effect, expands the application of transfer learning to many other real-world tasks such as cross-language text categorization, text-to-image classification, and many others. Heterogeneous transfer learning is characterized by the source and target domains having differing feature spaces, but may also be combined with other issues such as differing data distributions and label spaces. These can present significant challenges, as one must develop a method to bridge the feature spaces, data distributions, and other gaps which may be present in these cross-domain learning tasks. This paper contributes a comprehensive survey and analysis of current methods designed for performing heterogeneous transfer learning tasks to provide an updated, centralized outlook into current methodologies.

Journal ArticleDOI
01 Dec 2017
TL;DR: The support vector machine can perform better than the stacked auto-encoder in some circumstances, and active learning schemes can be used to achieve high classification accuracy with both methods.
Abstract: With constant advancements in remote sensing technologies resulting in higher image resolution, there is a corresponding need to be able to mine useful data and information from remote sensing images. In this paper, we study the stacked auto-encoder (SAE) and the support vector machine (SVM), and to examine their sensitivity we include an additional number of training samples using the active learning framework. We then conduct a comparative evaluation. When classifying remote sensing images, SVM can perform better than SAE in some circumstances, and active learning schemes can be used to achieve high classification accuracy in both methods.

Journal ArticleDOI
TL;DR: The proposed active learning algorithm, based on weighted incremental dictionary learning, trains a deep network efficiently by actively selecting training samples at each iteration, and is shown to be efficient and effective in classifying hyperspectral images.
Abstract: Active deep learning classification of hyperspectral images is considered in this paper. Deep learning has achieved success in many applications, but good-quality labeled samples are needed to construct a deep learning network. Getting good labeled samples in hyperspectral images for remote sensing applications is expensive. An active learning algorithm based on a weighted incremental dictionary learning is proposed for such applications. The proposed algorithm selects training samples that maximize two selection criteria, namely representativeness and uncertainty. The algorithm trains a deep network efficiently by actively selecting training samples at each iteration. The proposed algorithm is applied to the classification of hyperspectral images and compared with other classification algorithms employing active learning. It is shown that the proposed algorithm is efficient and effective in classifying hyperspectral images.

Proceedings ArticleDOI
01 Aug 2017
TL;DR: In this article, the authors combine deep learning with active learning and show that the combination can outperform classical methods even with a significantly smaller amount of training data, without requiring a large public dataset or a large budget for manually labeling data.
Abstract: Deep neural networks have advanced the state of the art in named entity recognition. However, under typical training procedures, advantages over classical methods emerge only with large datasets. As a result, deep learning is employed only when large public datasets or a large budget for manually labeling data is available. In this work, we show otherwise: by combining deep learning with active learning, we can outperform classical methods even with a significantly smaller amount of training data.

Proceedings ArticleDOI
24 May 2017
TL;DR: This paper deals with the field of computer vision, mainly the application of deep learning to the object detection task; a new dataset is built from commonly used datasets.
Abstract: This paper deals with the field of computer vision, mainly the application of deep learning to the object detection task. On the one hand, it briefly summarizes the datasets and deep learning algorithms commonly used in computer vision. On the other hand, a new dataset is built from those commonly used datasets, and one of the networks, Faster R-CNN, is chosen to work on this new dataset. The experiments strengthen the understanding of these networks, and the analysis of the results shows the importance of deep learning technology and of the dataset for deep learning.

Journal ArticleDOI
TL;DR: This paper investigates the classification of the quality of wood boards based on their images, comparing deep learning, particularly convolutional neural networks, with the combination of texture-based feature extraction techniques and traditional techniques: decision tree induction algorithms, neural networks, nearest neighbors, and support vector machines.
Abstract: A number of industries use human inspection to visually classify the quality of their products and the raw materials used in the production process; this process could be done automatically through digital image processing. Industries are not always interested in the most accurate technique for a given problem, but in the most appropriate one for the expected results; there must be a balance between accuracy and computational cost. This paper investigates the classification of the quality of wood boards based on their images. For such, it compares the use of deep learning, particularly convolutional neural networks, with the combination of texture-based feature extraction techniques and traditional techniques: decision tree induction algorithms, neural networks, nearest neighbors, and support vector machines. Reported studies show that deep learning techniques applied to image processing tasks have achieved predictive performance superior to traditional classification techniques, mainly in highly complex scenarios. One of the reasons pointed out is their embedded feature extraction mechanism. Deep learning techniques directly identify and extract features, considered by them to be relevant, in a given image dataset. However, empirical results for this image dataset have shown that the texture descriptor method proposed, regardless of the strategy employed, is very competitive when compared with the convolutional neural network for all the performed experiments. The better performance of the texture descriptor method could be caused by the nature of the image dataset. Finally, some perspectives on future developments with the application of active learning and semi-supervised methods are pointed out.
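
The texture-descriptor baseline discussed here can be sketched with standard tools; the descriptor below (uniform local binary patterns fed to an SVM) is an illustrative stand-in for the paper's proposed descriptor, not a reproduction of it, and all parameters are assumptions.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(gray_image, P=8, R=1.0):
    """Uniform LBP histogram as a compact texture descriptor."""
    lbp = local_binary_pattern(gray_image, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def train_texture_classifier(images, labels):
    feats = np.array([lbp_histogram(im) for im in images])
    return SVC(kernel="rbf").fit(feats, labels)
```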

Posted Content
TL;DR: Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information that may improve future performance.
Abstract: Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information that may improve future performance. The algorithm addresses a broad range of problems in a computationally efficient manner and is therefore enjoying wide use. This tutorial covers the algorithm and its application, illustrating concepts through a range of examples, including Bernoulli bandit problems, shortest path problems, product recommendation, assortment, active learning with neural networks, and reinforcement learning in Markov decision processes. Most of these problems involve complex information structures, where information revealed by taking an action informs beliefs about other actions. We will also discuss when and why Thompson sampling is or is not effective and relations to alternative algorithms.
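
The tutorial's canonical first example, the Bernoulli bandit, fits in a few lines (the arm probabilities below are arbitrary): sample one model of the world from the posterior, act greedily with respect to that sample, then update the posterior with the observed reward.

```python
import numpy as np

def thompson_bernoulli(true_probs, n_rounds=1000, rng=np.random.default_rng(0)):
    k = len(true_probs)
    wins, losses = np.ones(k), np.ones(k)   # Beta(1, 1) priors on each arm
    for _ in range(n_rounds):
        theta = rng.beta(wins, losses)      # sample success probabilities
        arm = int(np.argmax(theta))         # greedy w.r.t. the sampled model
        reward = rng.random() < true_probs[arm]
        wins[arm] += reward
        losses[arm] += 1 - reward
    return wins / (wins + losses)           # posterior mean per arm

print(thompson_bernoulli([0.2, 0.5, 0.7]))  # mass concentrates on the best arm
```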

Journal ArticleDOI
TL;DR: It is found that record-wise CV often massively overestimates the prediction accuracy of the algorithms, and this overly optimistic method was used by almost half of the retrieved studies that used accelerometers, wearable sensors, or smartphones to predict clinical outcomes.
Abstract: The availability of smartphone and wearable sensor technology is leading to a rapid accumulation of human subject data, and machine learning is emerging as a technique to map those data into clinical predictions. As machine learning algorithms are increasingly used to support clinical decision making, it is vital to reliably quantify their prediction accuracy. Cross-validation (CV) is the standard approach where the accuracy of such algorithms is evaluated on part of the data the algorithm has not seen during training. However, for this procedure to be meaningful, the relationship between the training and the validation set should mimic the relationship between the training set and the dataset expected for the clinical use. Here we compared two popular CV methods: record-wise and subject-wise. While the subject-wise method mirrors the clinically relevant use-case scenario of diagnosis in newly recruited subjects, the record-wise strategy has no such interpretation. Using both a publicly available dataset and a simulation, we found that record-wise CV often massively overestimates the prediction accuracy of the algorithms. We also conducted a systematic review of the relevant literature, and found that this overly optimistic method was used by almost half of the retrieved studies that used accelerometers, wearable sensors, or smartphones to predict clinical outcomes. As we move towards an era of machine learning-based diagnosis and treatment, using proper methods to evaluate their accuracy is crucial, as inaccurate results can mislead both clinicians and data scientists.
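
The paper's comparison is easy to reproduce in miniature with scikit-learn (the classifier choice and fold counts are illustrative): record-wise CV splits individual records, so records from one subject can land in both training and validation folds, while subject-wise CV (GroupKFold) keeps each subject's records together. With subject-specific signal in the data, the record-wise score comes out optimistically biased.

```python
from sklearn.model_selection import cross_val_score, KFold, GroupKFold
from sklearn.ensemble import RandomForestClassifier

def compare_cv(X, y, subject_ids):
    clf = RandomForestClassifier(random_state=0)
    record_wise = cross_val_score(clf, X, y, cv=KFold(5, shuffle=True, random_state=0))
    subject_wise = cross_val_score(clf, X, y, groups=subject_ids, cv=GroupKFold(5))
    return record_wise.mean(), subject_wise.mean()  # the first is typically inflated
```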

Journal ArticleDOI
TL;DR: A stacked ELM architecture in the CNN framework is proposed using an extreme learning machine (ELM), and the backpropagation algorithm is modified to find the targets of hidden layers and to effectively learn network weights while maintaining performance.

Posted Content
TL;DR: A novel data-driven approach to active learning is suggested to train a regressor that predicts the expected error reduction for a candidate sample in a particular learning state by formulating the query selection procedure as a regression problem.
Abstract: In this paper, we suggest a novel data-driven approach to active learning (AL). The key idea is to train a regressor that predicts the expected error reduction for a candidate sample in a particular learning state. By formulating the query selection procedure as a regression problem we are not restricted to working with existing AL heuristics; instead, we learn strategies based on experience from previous AL outcomes. We show that a strategy can be learnt either from simple synthetic 2D datasets or from a subset of domain-specific data. Our method yields strategies that work well on real data from a wide range of domains.
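
The proposal reduces to "regress the expected error reduction, then query its argmax", which a short sketch captures. Everything here is an illustrative assumption; the feature construction and the offline collection of training episodes are the substance of the paper and are omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

lal = RandomForestRegressor(random_state=0)

def fit_strategy(state_features, error_reductions):
    """Learn the query strategy from logged (state, observed gain) pairs."""
    lal.fit(state_features, error_reductions)

def next_query(candidate_features):
    """Score the unlabelled pool and pick the sample expected to help most."""
    predicted_gain = lal.predict(candidate_features)
    return int(np.argmax(predicted_gain))
```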

Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper introduces a deep transfer learning scheme, called selective joint fine-tuning, for improving the performance of deep learning tasks with insufficient training data, and can improve the classification accuracy by 2% - 10% using a single model.
Abstract: Deep neural networks require a large amount of labeled training data during supervised learning. However, collecting and labeling so much data might be infeasible in many cases. In this paper, we introduce a deep transfer learning scheme, called selective joint fine-tuning, for improving the performance of deep learning tasks with insufficient training data. In this scheme, a target learning task with insufficient training data is carried out simultaneously with another source learning task with abundant training data. However, the source learning task does not use all existing training data. Our core idea is to identify and use a subset of training images from the original source learning task whose low-level characteristics are similar to those from the target learning task, and jointly fine-tune shared convolutional layers for both tasks. Specifically, we compute descriptors from linear or nonlinear filter bank responses on training images from both tasks, and use such descriptors to search for a desired subset of training samples for the source learning task. Experiments demonstrate that our deep transfer learning scheme achieves state-of-the-art performance on multiple visual classification tasks with insufficient training data for deep learning. Such tasks include Caltech 256, MIT Indoor 67, and fine-grained classification problems (Oxford Flowers 102 and Stanford Dogs 120). In comparison to fine-tuning without a source domain, the proposed method can improve the classification accuracy by 2% - 10% using a single model. Codes and models are available at https://github.com/ZYYSzj/Selective-Joint-Fine-tuning.
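
The source-selection step can be sketched as a nearest-neighbour search over low-level descriptors (the descriptor arrays below are placeholders; the paper builds them from filter bank responses on the training images):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def select_source_subset(source_desc, target_desc, k=50):
    """desc arrays: (n_images, d) low-level descriptors per image."""
    nn = NearestNeighbors(n_neighbors=k).fit(source_desc)
    _, idx = nn.kneighbors(target_desc)   # k nearest source images per target image
    return np.unique(idx.ravel())         # their union is kept for joint fine-tuning
```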

Journal ArticleDOI
TL;DR: This paper proposes an evolutionary cost-sensitive ELM which, to the best of the authors' knowledge, is the first proposal of ELM in an evolutionary cost-sensitive classification scenario, and well addresses the open issue of how to define the cost matrix in cost-sensitive learning tasks.
Abstract: Conventional extreme learning machines (ELMs) solve a Moore–Penrose generalized inverse of hidden layer activated matrix and analytically determine the output weights to achieve generalized performance, by assuming the same loss from different types of misclassification. The assumption may not hold in cost-sensitive recognition tasks, such as face recognition-based access control system, where misclassifying a stranger as a family member may result in more serious disaster than misclassifying a family member as a stranger. Though recent cost-sensitive learning can reduce the total loss with a given cost matrix that quantifies how severe one type of mistake against another, in many realistic cases, the cost matrix is unknown to users. Motivated by these concerns, this paper proposes an evolutionary cost-sensitive ELM, with the following merits: 1) to the best of our knowledge, it is the first proposal of ELM in evolutionary cost-sensitive classification scenario; 2) it well addresses the open issue of how to define the cost matrix in cost-sensitive learning tasks; and 3) an evolutionary backtracking search algorithm is induced for adaptive cost matrix optimization. Experiments in a variety of cost-sensitive tasks well demonstrate the effectiveness of the proposed approaches, with about 5%–10% improvements.
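
The cost-sensitive half of the idea can be sketched in plain numpy: an ELM's output weights come from a closed-form least-squares solve, and per-sample costs enter as weights in that solve. The cost values here are assumed given, whereas the paper's contribution is to optimize the cost matrix itself with an evolutionary backtracking search.

```python
import numpy as np

def cost_weighted_elm(X, T, sample_costs, n_hidden=100, rng=np.random.default_rng(0)):
    """X: (n, d) inputs; T: (n, c) one-hot targets; sample_costs: (n,) misclassification weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))   # random, untrained hidden weights
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    s = np.sqrt(sample_costs)[:, None]
    beta = np.linalg.pinv(s * H) @ (s * T)        # weighted least-squares output weights
    return lambda X_new: np.tanh(X_new @ W + b) @ beta
```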