Open Access Book Chapter

NAS-Count: Counting-by-Density with Neural Architecture Search.

TLDR
This work automates the design of counting models with Neural Architecture Search (NAS) and introduces an end-to-end searched encoder-decoder architecture, Automatic Multi-Scale Network (AMSNet), utilizing a counting-specific two-level search space.
Abstract
Most of the recent advances in crowd counting have evolved from hand-designed density estimation networks, where multi-scale features are leveraged to address the scale variation problem, but at the expense of demanding design efforts. In this work, we automate the design of counting models with Neural Architecture Search (NAS) and introduce an end-to-end searched encoder-decoder architecture, Automatic Multi-Scale Network (AMSNet). Specifically, we utilize a counting-specific two-level search space. The encoder and decoder in AMSNet are composed of different cells discovered from micro-level search, while the multi-path architecture is explored through macro-level search. To solve the pixel-level isolation issue in MSE loss, AMSNet is optimized with an auto-searched Scale Pyramid Pooling Loss (SPPLoss) that supervises the multi-scale structural information. Extensive experiments on four datasets show AMSNet produces state-of-the-art results that outperform hand-designed models, fully demonstrating the efficacy of NAS-Count.
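The pixel-level isolation issue mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the authors' auto-searched SPPLoss; it is a hand-written, fixed-scale pooling loss, and the names `avg_pool`, `pyramid_pooling_loss`, and the `scales` tuple are hypothetical choices made for illustration. It assumes density maps are 2-D arrays whose sum equals the object count.

```python
import numpy as np

def avg_pool(x, k):
    """Non-overlapping k x k average pooling; crops edges that do not divide evenly."""
    h, w = x.shape
    h2, w2 = h - h % k, w - w % k
    return x[:h2, :w2].reshape(h2 // k, k, w2 // k, k).mean(axis=(1, 3))

def mse(pred, target):
    """Plain pixel-wise mean squared error between two density maps."""
    return float(np.mean((pred - target) ** 2))

def pyramid_pooling_loss(pred, target, scales=(1, 2, 4)):
    """Sum of MSE terms on density maps pooled at several scales.

    The scales are fixed by hand here (an assumption); the paper searches
    its loss configuration automatically.
    """
    return sum(mse(avg_pool(pred, k), avg_pool(target, k)) for k in scales)

def point_density(shape, row, col):
    """Toy density map with one unit of mass at (row, col), so the count is 1."""
    d = np.zeros(shape)
    d[row, col] = 1.0
    return d

target = point_density((8, 8), 3, 3)
pred_near = point_density((8, 8), 3, 2)  # off by one pixel
pred_far = point_density((8, 8), 7, 7)   # far from the true location

# Pixel-wise MSE treats every pixel independently, so both errors cost the same.
print(mse(pred_near, target), mse(pred_far, target))

# The pooled loss compares regional sums, so the distant error costs more.
print(pyramid_pooling_loss(pred_near, target),
      pyramid_pooling_loss(pred_far, target))
```

In this toy case all three maps integrate to a count of 1, yet only the pooled loss distinguishes a one-pixel shift from a prediction on the wrong side of the image, which is the kind of structural information a multi-scale loss is meant to capture.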


Citations
Proceedings Article

A Generalized Loss Function for Crowd Counting and Localization

TL;DR: This article proposes a generalized loss function for learning density maps for crowd counting and localization. It outperforms other losses on four large-scale counting datasets and achieves the best localization performance on NWPU-Crowd and UCF-QNRF.
Proceedings Article

Cross-View Cross-Scene Multi-View Crowd Counting

TL;DR: In this paper, a cross-view cross-scene (CVCS) multi-view counting paradigm is proposed, where the training and testing occur on different scenes with arbitrary camera layouts, to dynamically handle the challenge of optimal view fusion under scene and camera layout change and non-correspondence noise due to camera calibration errors or erroneous features.
Proceedings Article

To Choose or to Fuse? Scale Selection for Crowd Counting

TL;DR: SASNet is a scale-adaptive selection network that automatically learns the internal correspondence between the scales and the feature levels to mitigate the gap between discrete feature levels and continuous scale variation.
Book Chapter

An End-to-End Transformer Model for Crowd Localization

TL;DR: This work proposes an end-to-end transformer model for crowd localization that treats localization as a direct set prediction problem, taking extracted features and trainable embeddings as input to the transformer decoder.
Journal Article

A Survey of Crowd Counting and Density Estimation based on Convolutional Neural Network

TL;DR: A comprehensive review of the recent research advancement on crowd counting and density estimation can be found in this article, where the authors introduce the background of counting and crowd density estimation and summarize the traditional crowd counting methods.
References
Proceedings Article

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Book Chapter

U-Net: Convolutional Networks for Biomedical Image Segmentation

TL;DR: Ronneberger et al. propose a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Journal Article

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

TL;DR: This work addresses the task of semantic image segmentation with deep learning. It proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.
Journal Article

Random search for hyper-parameter optimization

TL;DR: This paper shows empirically and theoretically that randomly chosen trials are more efficient for hyper-parameter optimization than trials on a grid, and that random search is a natural baseline against which to judge progress in the development of adaptive (sequential) hyper-parameter optimization algorithms.
Proceedings Article

Learning Transferable Architectures for Scalable Image Recognition

TL;DR: NASNet searches for an architectural building block on a small dataset and then transfers the block to a larger dataset, achieving state-of-the-art performance.