Open Access Proceedings Article

How Many Samples are Needed to Learn a Convolutional Neural Network

TLDR
The study of rigorously characterizing the sample complexity of estimating CNNs is initiated, showing that for an $m$-dimensional convolutional filter with linear activation acting on a $d$-dimensional input, the sample complexity of achieving population prediction error of $\epsilon$ is $\widetilde{O}(m/\epsilon^2)$, whereas the sample complexity for its FNN counterpart is lower bounded by $\Omega(d/\epsilon^2)$.
Abstract
A widespread folklore explanation for the success of convolutional neural networks (CNNs) is that a CNN is a more compact representation than a fully connected neural network (FNN) and thus requires fewer samples for learning. We initiate the study of rigorously characterizing the sample complexity of learning convolutional neural networks. We show that for learning an m-dimensional convolutional filter with linear activation acting on a d-dimensional input, the sample complexity of achieving population prediction error ϵ is Õ(m/ϵ²), whereas its FNN counterpart needs at least Ω(d/ϵ²) samples. Since m ≪ d, this result demonstrates the advantage of using a CNN. We further consider the sample complexity of learning a one-hidden-layer CNN with linear activation, where both the m-dimensional convolutional filter and the r-dimensional output weights are unknown. For this model, we show the sample complexity is Õ((m+r)/ϵ²) when the ratio between the stride size and the filter size is a constant. For both models, we also present lower bounds showing our sample complexities are tight up to logarithmic factors. Our main tools for deriving these results are a localized empirical process and a new lemma characterizing the convolutional structure. We believe these tools may inspire further developments in understanding CNNs.
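For concreteness, the two predictors described in the abstract can be written as follows. The patch notation P_j (the j-th length-m patch of x, taken with stride s) is our own shorthand, and the paper's exact normalization (sum versus average over patches) may differ.

% Model 1: a single m-dimensional convolutional filter w applied to an input x in R^d.
\[ y \;=\; \sum_{j} w^\top P_j x, \qquad w \in \mathbb{R}^m,\ x \in \mathbb{R}^d. \]
% Sample complexity for population prediction error ϵ: Õ(m/ϵ²) for the CNN,
% versus a lower bound of Ω(d/ϵ²) for its FNN counterpart.

% Model 2: a one-hidden-layer CNN in which the output weights a in R^r are also unknown.
\[ y \;=\; \sum_{j=1}^{r} a_j \, w^\top P_j x, \qquad a \in \mathbb{R}^r, \]
% with sample complexity Õ((m+r)/ϵ²) when the stride-to-filter-size ratio is a constant.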



Citations
Proceedings Article

Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks

TL;DR: In this paper, a simple two-layer ReLU network with random initialization is analyzed, and a generalization bound that is independent of the network size is established.
Posted Content

Generalization bounds for deep convolutional neural networks

TL;DR: Bounds on the generalization error of convolutional networks are proved in terms of the training loss, the number of parameters, the Lipschitz constant of the loss, and the distance from the weights to the initial weights.
Journal Article

Using convolutional neural network for predicting cyanobacteria concentrations in river water

TL;DR: This study successfully demonstrated the capability of the CNN model for cyanobacterial bloom prediction using high temporal frequency images and characterized its performance variations across the studied river reach.
Posted Content

Size-free generalization bounds for convolutional neural networks

TL;DR: In this article, the authors prove bounds on the generalization error of convolutional networks in terms of the training loss, the number of parameters, the Lipschitz constant of the loss and the distance from the weights to the initial weights.
Posted Content

Why Are Convolutional Nets More Sample-Efficient than Fully-Connected Nets?

TL;DR: This work describes a natural task on which a provable sample complexity gap can be shown for standard training algorithms, and exhibits a single target function such that learning it on all possible distributions leads to an $O(1)$ vs. $\Omega(d^2/\varepsilon)$ gap.
References
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: As discussed by the authors, state-of-the-art performance was achieved with a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully connected layers with a final 1000-way softmax.
Journal Article

ImageNet classification with deep convolutional neural networks

TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
Journal Article

Mastering the game of Go with deep neural networks and tree search

TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.
Book Chapter

Probability Inequalities for sums of Bounded Random Variables

TL;DR: In this article, upper bounds are derived for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt; bounds are also given for certain sums of dependent random variables such as U-statistics.
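For reference, the classical form of this bound: if X_1, …, X_n are independent with a_i ≤ X_i ≤ b_i and S = X_1 + … + X_n, then for every t > 0

% Hoeffding's inequality for the upper tail of S about its mean ES.
\[ \Pr\{S - \mathbb{E}S \ge nt\} \;\le\; \exp\!\left( -\,\frac{2 n^2 t^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \right). \]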