Journal ArticleDOI

Learning a generative model of images by factoring appearance and shape

TLDR
This work introduces a basic model, the masked RBM, which explicitly models occlusion boundaries in image patches by factoring the appearance of any patch region from its shape, and proposes a generative model of larger images using a field of such RBMs.
Abstract
Computer vision has grown tremendously in the past two decades. Despite all efforts, existing attempts at matching parts of the human visual system's extraordinary ability to understand visual scenes lack either scope or power. By combining the advantages of general low-level generative models and powerful layer-based and hierarchical models, this work aims at being a first step toward richer, more flexible models of images. After comparing various types of restricted Boltzmann machines (RBMs) able to model continuous-valued data, we introduce our basic model, the masked RBM, which explicitly models occlusion boundaries in image patches by factoring the appearance of any patch region from its shape. We then propose a generative model of larger images using a field of such RBMs. Finally, we discuss how masked RBMs could be stacked to form a deep model able to generate more complicated structures and suitable for various tasks such as segmentation or object recognition.
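The factoring described in the abstract can be illustrated with a toy compositing step: a binary shape mask determines which pixels of a patch take their appearance from the foreground and which from the occluded background. A minimal NumPy sketch of this idea (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def compose_patch(fg_appearance, bg_appearance, shape_mask):
    """Compose an image patch from a foreground appearance, a background
    appearance, and a binary occlusion mask (1 = foreground visible)."""
    return shape_mask * fg_appearance + (1.0 - shape_mask) * bg_appearance

# Toy 4x4 patch: a bright foreground occludes a dark background on the left half.
fg = np.full((4, 4), 0.9)
bg = np.full((4, 4), 0.1)
mask = np.zeros((4, 4))
mask[:, :2] = 1.0          # left half belongs to the foreground shape

patch = compose_patch(fg, bg, mask)
print(patch[0])  # → [0.9 0.9 0.1 0.1]
```

In the masked RBM itself, separate RBMs model the appearance and the shape (mask) variables; this sketch only shows how a mask factors a patch into independent regions.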



Citations
Proceedings ArticleDOI

From learning models of natural image patches to whole image restoration

TL;DR: This work proposes a generic framework for whole-image restoration using any patch-based prior for which a MAP (or approximate MAP) estimate can be calculated, and presents a generic, surprisingly simple Gaussian mixture prior learned from a set of natural images.
Book ChapterDOI

Attribute2Image: Conditional Image Generation from Visual Attributes

TL;DR: In this paper, a variational auto-encoder is used to generate images from visual attributes, where the image is modeled as a composite of foreground and background and a layered generative model with disentangled latent variables is developed.
Book ChapterDOI

An Introduction to Restricted Boltzmann Machines

TL;DR: This tutorial introduces RBMs as undirected graphical models, shows how they serve as building blocks of multi-layer learning systems called deep belief networks, and reviews training procedures based on Markov chain Monte Carlo methods.
Journal ArticleDOI

Training restricted Boltzmann machines

TL;DR: This tutorial introduces RBMs from the viewpoint of Markov random fields, starting with the required concepts of undirected graphical models and reviewing the state-of-the-art in training restricted Boltzmann machines from the perspective of graphical models.
Journal ArticleDOI

A probabilistic model for component-based shape synthesis

TL;DR: A new generative model of component-based shape structure is presented, which represents probabilistic relationships between properties of shape components, and relates them to learned underlying causes of structural variability within the domain.
References
Journal ArticleDOI

A fast learning algorithm for deep belief nets

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Proceedings Article

Rectified Linear Units Improve Restricted Boltzmann Machines

TL;DR: Restricted Boltzmann machines were developed with binary stochastic hidden units; replacing these with rectified linear units yields features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.
Journal ArticleDOI

Emergence of simple-cell receptive field properties by learning a sparse code for natural images

TL;DR: It is shown that a learning algorithm that attempts to find sparse linear codes for natural scenes will develop a complete family of localized, oriented, bandpass receptive fields, similar to those found in the primary visual cortex.
Journal ArticleDOI

Training products of experts by minimizing contrastive divergence

TL;DR: A product of experts (PoE) is an interesting candidate for a perceptual system in which rapid inference is vital and generation is unnecessary; because it is hard even to approximate the derivatives of the renormalization term in the combination rule, a PoE is trained by minimizing contrastive divergence rather than by maximizing likelihood directly.
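Contrastive divergence is the standard way the RBMs discussed above are trained. A toy CD-1 update for a binary RBM, sketched under common textbook conventions (this is an illustrative implementation, not code from any of the cited papers):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b, c, v0, lr=0.1):
    """One CD-1 step for a binary RBM with weights W,
    visible biases b, and hidden biases c."""
    # Positive phase: hidden probabilities given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one step of Gibbs sampling.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # Approximate log-likelihood gradient: data statistics
    # minus one-step reconstruction statistics.
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / v0.shape[0]
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c

# Toy data: eight 6-dimensional binary vectors, 3 hidden units.
v = rng.integers(0, 2, size=(8, 6)).astype(float)
W = 0.01 * rng.standard_normal((6, 3))
b = np.zeros(6)
c = np.zeros(3)
W, b, c = cd1_update(W, b, c, v)
```

Running more steps of Gibbs sampling before collecting the negative-phase statistics gives CD-k, which approximates the likelihood gradient more closely at higher cost.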