Open Access Journal Article (DOI)

A Bayesian/Information Theoretic Model of Learning to Learn via Multiple Task Sampling

TLDR
It is argued that for many common machine learning problems, although in general the authors do not know the true (objective) prior for the problem, they do have some idea of a set of possible priors to which the true prior belongs.
Abstract
A Bayesian model of learning to learn by sampling from multiple tasks is presented. The multiple tasks are themselves generated by sampling from a distribution over an environment of related tasks. Such an environment is shown to be naturally modelled within a Bayesian context by the concept of an objective prior distribution. It is argued that for many common machine learning problems, although in general we do not know the true (objective) prior for the problem, we do have some idea of a set of possible priors to which the true prior belongs. It is shown that under these circumstances a learner can use Bayesian inference to learn the true prior by learning sufficiently many tasks from the environment. In addition, bounds are given on the amount of information required to learn a task when it is simultaneously learnt with several other tasks. The bounds show that if the learner has little knowledge of the true prior, but the dimensionality of the true prior is small, then sampling multiple tasks is highly advantageous. The theory is applied to the problem of learning a common feature set, or equivalently a low-dimensional representation (LDR), for an environment of related tasks.
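To make the LDR idea concrete, here is a minimal Python/NumPy sketch of jointly learning a shared low-dimensional feature map from several sampled tasks by alternating least squares. This is not the paper's algorithm; all names (B, W, fit_heads) and sizes (T, n, d, k) are illustrative assumptions.

```python
# Sketch: T related linear tasks share a low-dimensional representation,
# y = w_t . (B x) + noise, with B the shared k x d feature map.
import numpy as np

rng = np.random.default_rng(0)
T, n, d, k = 8, 50, 20, 3          # tasks, samples/task, input dim, feature dim

B_true = rng.standard_normal((k, d))                 # the environment's shared features
X = [rng.standard_normal((n, d)) for _ in range(T)]
W_true = [rng.standard_normal(k) for _ in range(T)]
Y = [X[t] @ B_true.T @ W_true[t] + 0.1 * rng.standard_normal(n) for t in range(T)]

def fit_heads(B):
    # Ridge regression per task in the shared feature space F = X B^T.
    heads = []
    for t in range(T):
        F = X[t] @ B.T
        heads.append(np.linalg.solve(F.T @ F + 1e-3 * np.eye(k), F.T @ Y[t]))
    return heads

B = rng.standard_normal((k, d))
for _ in range(50):
    W = fit_heads(B)
    # Refit vec(B) by least squares over the pooled data from all tasks.
    A = 1e-3 * np.eye(k * d)
    b = np.zeros(k * d)
    for t in range(T):
        # Row i of Z is outer(w_t, x_i) flattened, so Z @ vec(B) gives predictions.
        Z = np.einsum('j,im->ijm', W[t], X[t]).reshape(n, k * d)
        A += Z.T @ Z
        b += Z.T @ Y[t]
    B = np.linalg.solve(A, b).reshape(k, d)

W = fit_heads(B)
mse = np.mean([np.mean((X[t] @ B.T @ W[t] - Y[t]) ** 2) for t in range(T)])
print(f"mean per-task MSE: {mse:.4f}")
```

The point mirrors the paper's bounds: the shared map B is estimated from all T tasks at once, so each task contributes data toward learning the environment's common structure rather than bearing the cost alone.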



Citations
Book

Learning Deep Architectures for AI

TL;DR: The motivations and principles regarding learning algorithms for deep architectures are discussed, in particular those exploiting as building blocks unsupervised learning of single-layer models, such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.
Posted Content

An Overview of Multi-Task Learning in Deep Neural Networks

Sebastian Ruder
15 Jun 2017
TL;DR: This article seeks to help ML practitioners apply MTL by shedding light on how MTL works and providing guidelines for choosing appropriate auxiliary tasks, particularly in deep neural networks.
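As a concrete illustration of hard parameter sharing, the MTL pattern this overview surveys, here is a minimal PyTorch sketch. The layer sizes, task count, and random regression targets are illustrative assumptions, not taken from the article.

```python
# Hard parameter sharing: one shared trunk feeding a small head per task.
import torch
import torch.nn as nn

class HardSharingNet(nn.Module):
    def __init__(self, in_dim=16, hidden=32, n_tasks=2):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(hidden, 1) for _ in range(n_tasks))

    def forward(self, x):
        h = self.trunk(x)                        # representation shared by all tasks
        return [head(h) for head in self.heads]  # one prediction per task

net = HardSharingNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x = torch.randn(64, 16)
targets = [torch.randn(64, 1), torch.randn(64, 1)]  # one target tensor per task

for step in range(200):
    opt.zero_grad()
    preds = net(x)
    # Unweighted sum of per-task losses; how to weight this sum is itself
    # a key practical choice the overview discusses.
    loss = sum(nn.functional.mse_loss(p, t) for p, t in zip(preds, targets))
    loss.backward()
    opt.step()
```

Soft parameter sharing, the other pattern the overview covers, would instead give each task its own trunk and penalize the distance between corresponding parameters.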
Proceedings Article (DOI)

Regularized multi-task learning

TL;DR: An approach to multi-task learning is presented, based on the minimization of regularization functionals similar to existing ones, such as the one for Support Vector Machines, that have been successfully used in the past for single-task learning.
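For reference, one common form of such a regularization functional in this line of work (notation illustrative) splits each task's weight vector into a shared component w_0 and a task-specific offset v_t:

```latex
\min_{w_0,\,\{v_t\}} \;
  \sum_{t=1}^{T} \sum_{i=1}^{n_t}
    \ell\!\left(y_{ti},\, (w_0 + v_t)^\top x_{ti}\right)
  \;+\; \lambda_1 \sum_{t=1}^{T} \lVert v_t \rVert^2
  \;+\; \lambda_2 \lVert w_0 \rVert^2
```

The ratio of the two regularization constants controls how strongly the tasks are coupled: a large penalty on the offsets v_t forces all tasks toward a single shared model.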
Posted Content

Practical recommendations for gradient-based training of deep architectures

TL;DR: Overall, this chapter describes elements of the practice used to successfully and efficiently train and debug large-scale and often deep multi-layer neural networks and closes with open questions about the training difficulties observed with deeper architectures.
Book Chapter (DOI)

Practical recommendations for gradient-based training of deep architectures

TL;DR: In this article, the authors present a practical guide with recommendations for some of the most commonly used hyperparameters, in particular in the context of learning algorithms based on back-propagated gradient and gradient-based optimization.
References
Book

Elements of information theory

TL;DR: The text examines the roles of entropy, inequalities, and randomness in the design and construction of codes.
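Two standard results from the text, stated here in their textbook form, underpin information-theoretic analyses like the one in this paper: the entropy of a discrete source and the source-coding lower bound on the expected length L of any uniquely decodable code.

```latex
H(X) = -\sum_{x} p(x) \log_2 p(x),
\qquad
\mathbb{E}[L] \;\ge\; H(X)
```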
Journal Article (DOI)

Approximation capabilities of multilayer feedforward networks

TL;DR: It is shown that standard multilayer feedforward networks with as few as a single hidden layer and an arbitrary bounded and nonconstant activation function are universal approximators with respect to L^p(μ) performance criteria, for arbitrary finite input environment measures μ.
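The shape of the result, in illustrative notation: single-hidden-layer networks g with a bounded, nonconstant activation σ are dense in L^p(μ).

```latex
g(x) = \sum_{j=1}^{m} \beta_j \,\sigma\!\left(w_j^\top x + b_j\right),
\qquad
\forall f \in L^p(\mu),\ \forall \varepsilon > 0:\;
\exists g \text{ with } \lVert f - g \rVert_{L^p(\mu)} < \varepsilon
```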
Book

Statistical Decision Theory and Bayesian Analysis

TL;DR: An overview of statistical decision theory, emphasizing the use and application of its philosophical ideas and mathematical structure.
Journal Article (DOI)

Bayesian interpolation

TL;DR: The Bayesian approach to regularization and model-comparison is demonstrated by studying the inference problem of interpolating noisy data by examining the posterior probability distribution of regularizing constants and noise levels.
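In outline, the evidence framework this TL;DR refers to has the standard form below (notation illustrative): the regularization constant α and noise level β are inferred from their posterior, and candidate models H_i are compared by their evidence P(D | H_i).

```latex
P(\alpha, \beta \mid D, \mathcal{H}_i)
  \propto P(D \mid \alpha, \beta, \mathcal{H}_i)\, P(\alpha, \beta \mid \mathcal{H}_i),
\qquad
P(D \mid \mathcal{H}_i)
  = \int P(D \mid \alpha, \beta, \mathcal{H}_i)\,
    P(\alpha, \beta \mid \mathcal{H}_i)\, d\alpha\, d\beta
```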