
Showing papers on "Robustness (computer science)" published in 2016


Proceedings ArticleDOI
27 Jun 2016
TL;DR: DeepFool, as discussed by the authors, is an algorithm that efficiently computes perturbations that fool deep networks and thus reliably quantifies the robustness of these classifiers.
Abstract: State-of-the-art deep neural networks have achieved impressive results on many image classification tasks. However, these same architectures have been shown to be unstable to small, well sought, perturbations of the images. Despite the importance of this phenomenon, no effective methods have been proposed to accurately compute the robustness of state-of-the-art deep classifiers to such perturbations on large-scale datasets. In this paper, we fill this gap and propose the DeepFool algorithm to efficiently compute perturbations that fool deep networks, and thus reliably quantify the robustness of these classifiers. Extensive experimental results show that our approach outperforms recent methods in the task of computing adversarial perturbations and making classifiers more robust.

4,505 citations
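
As a rough illustration of the idea, here is a minimal sketch of DeepFool for the binary case: the classifier is linearized at each step and the input is pushed across the linearized decision boundary with the closed-form minimal step. The function and parameter names are illustrative, not the authors' reference implementation.

```python
import torch

def deepfool_binary(model, x, max_iter=50, overshoot=0.02):
    """Minimal DeepFool sketch for a binary classifier whose decision boundary
    is f(x) = 0 and whose output is a 0-d scalar logit. At each step the boundary
    is linearized and the closed-form minimal perturbation
    r = -f(x) * grad f / ||grad f||^2 is applied."""
    x0 = x.clone().detach()
    x_adv = x0.clone().requires_grad_(True)
    orig_sign = torch.sign(model(x_adv).squeeze()).item()
    r_total = torch.zeros_like(x0)

    for _ in range(max_iter):
        f = model(x_adv).squeeze()
        if torch.sign(f).item() != orig_sign:          # label has flipped: done
            break
        grad, = torch.autograd.grad(f, x_adv)
        r = -(f.item() / (grad.norm() ** 2 + 1e-12)) * grad
        r_total = r_total + r.detach()
        # a small overshoot pushes the point just past the boundary
        x_adv = (x0 + (1 + overshoot) * r_total).requires_grad_(True)

    return x_adv.detach(), r_total
```

The norm of r_total, averaged over a test set, is the kind of robustness measure the paper advocates.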


Proceedings Article
03 Nov 2016
TL;DR: This article showed that adversarial training confers robustness to single-step attack methods, that multi-step attack methods transfer somewhat less well between models than single-step methods, and that single-step attacks are therefore the best choice for mounting black-box attacks.
Abstract: Adversarial examples are malicious inputs designed to fool machine learning models. They often transfer from one model to another, allowing attackers to mount black box attacks without knowledge of the target model's parameters. Adversarial training is the process of explicitly training a model on adversarial examples, in order to make it more robust to attack or to reduce its test error on clean inputs. So far, adversarial training has primarily been applied to small problems. In this research, we apply adversarial training to ImageNet. Our contributions include: (1) recommendations for how to successfully scale adversarial training to large models and datasets, (2) the observation that adversarial training confers robustness to single-step attack methods, (3) the finding that multi-step attack methods are somewhat less transferable than single-step attack methods, so single-step attacks are the best for mounting black-box attacks, and (4) resolution of a "label leaking" effect that causes adversarially trained models to perform better on adversarial examples than on clean examples, because the adversarial example construction process uses the true label and the model can learn to exploit regularities in the construction process.

1,769 citations
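
The single-step regime described above can be sketched as follows; this is a hedged, simplified training step using FGSM, with the loss weighting, epsilon, and labeling choice picked for illustration rather than taken from the paper.

```python
import torch
import torch.nn.functional as F

def fgsm_adversarial_step(model, optimizer, x, y, eps=8 / 255, adv_weight=0.5):
    """Sketch of one single-step (FGSM) adversarial training iteration:
    craft adversarial copies of the batch, then minimize a mix of the clean
    and adversarial cross-entropy losses. Values here are illustrative."""
    # single-step attack crafted with the true labels; the paper's "label
    # leaking" observation is one reason to prefer predicted labels instead
    x_req = x.clone().detach().requires_grad_(True)
    grad = torch.autograd.grad(F.cross_entropy(model(x_req), y), x_req)[0]
    x_adv = (x + eps * grad.sign()).clamp(0, 1).detach()

    optimizer.zero_grad()
    loss = (1 - adv_weight) * F.cross_entropy(model(x), y) \
           + adv_weight * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```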


Proceedings ArticleDOI
27 Jun 2016
TL;DR: It is shown that a simple tracker combining complementary cues in a ridge regression framework can operate faster than 80 FPS and outperform not only all entries in the popular VOT14 competition, but also recent and far more sophisticated trackers according to multiple benchmarks.
Abstract: Correlation Filter-based trackers have recently achieved excellent performance, showing great robustness to challenging situations exhibiting motion blur and illumination changes. However, since the model that they learn depends strongly on the spatial layout of the tracked object, they are notoriously sensitive to deformation. Models based on colour statistics have complementary traits: they cope well with variation in shape, but suffer when illumination is not consistent throughout a sequence. Moreover, colour distributions alone can be insufficiently discriminative. In this paper, we show that a simple tracker combining complementary cues in a ridge regression framework can operate faster than 80 FPS and outperform not only all entries in the popular VOT14 competition, but also recent and far more sophisticated trackers according to multiple benchmarks.

1,285 citations
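
The cue-combination idea can be illustrated with a toy sketch: two dense response maps, one from a spatially-aware template (correlation-filter) model and one from a colour-histogram model, are fused by a convex combination before the peak is taken. The merge factor is illustrative; the actual tracker learns both models in a ridge regression framework.

```python
import numpy as np

def merge_responses(template_response, hist_response, merge_factor=0.3):
    """Fuse two complementary per-pixel response maps and locate the target
    as the peak of the combined map. merge_factor is a hypothetical weight."""
    response = (1 - merge_factor) * template_response + merge_factor * hist_response
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    return response, (dy, dx)
```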


Book ChapterDOI
08 Oct 2016
TL;DR: This paper introduces a new gating mechanism within the LSTM to learn the reliability of the sequential input data and accordingly adjust its effect on updating the long-term context information stored in the memory cell, and proposes a more powerful tree-structure-based traversal method.
Abstract: 3D action recognition – analysis of human actions based on 3D skeleton data – has recently become popular due to its succinctness, robustness, and view-invariant representation. Recent attempts at this problem suggested developing RNN-based learning methods to model the contextual dependency in the temporal domain. In this paper, we extend this idea to spatio-temporal domains to analyze the hidden sources of action-related information within the input data over both domains concurrently. Inspired by the graphical structure of the human skeleton, we further propose a more powerful tree-structure-based traversal method. To handle the noise and occlusion in 3D skeleton data, we introduce a new gating mechanism within the LSTM to learn the reliability of the sequential input data and accordingly adjust its effect on updating the long-term context information stored in the memory cell. Our method achieves state-of-the-art performance on 4 challenging benchmark datasets for 3D human action analysis.

1,230 citations


Posted Content
TL;DR: In this paper, a factorized convolution operator is introduced to reduce the number of parameters in the discriminative correlation filter (DCF) model, together with a compact generative model of the training sample distribution that significantly reduces memory and time complexity while providing better diversity of samples.
Abstract: In recent years, Discriminative Correlation Filter (DCF) based methods have significantly advanced the state-of-the-art in tracking. However, in the pursuit of ever-increasing tracking performance, their characteristic speed and real-time capability have gradually faded. Further, the increasingly complex models, with a massive number of trainable parameters, have introduced the risk of severe over-fitting. In this work, we tackle the key causes behind the problems of computational complexity and over-fitting, with the aim of simultaneously improving both speed and performance. We revisit the core DCF formulation and introduce: (i) a factorized convolution operator, which drastically reduces the number of parameters in the model; (ii) a compact generative model of the training sample distribution, that significantly reduces memory and time complexity, while providing better diversity of samples; (iii) a conservative model update strategy with improved robustness and reduced complexity. We perform comprehensive experiments on four benchmarks: VOT2016, UAV123, OTB-2015, and TempleColor. When using expensive deep features, our tracker provides a 20-fold speedup and achieves a 13.0% relative gain in Expected Average Overlap compared to the top ranked method in the VOT2016 challenge. Moreover, our fast variant, using hand-crafted features, operates at 60 Hz on a single CPU, while obtaining 65.0% AUC on OTB-2015.

1,069 citations


Journal ArticleDOI
TL;DR: In this paper, the truncation of k-space is interpreted as a convolution of the underlying image with a sinc function, and the image is re-interpolated using local subvoxel shifts so that the ringing pattern is sampled at the zero-crossings of the oscillating sinc function.
Abstract: Purpose: To develop a fast and stable method for correcting the Gibbs-ringing artifact. Methods: Gibbs-ringing is a well-known artifact which manifests itself as spurious oscillations in the vicinity of sharp image gradients at tissue boundaries. The origin can be seen in the truncation of k-space during MRI data-acquisition. Correction techniques like Gegenbauer reconstruction or extrapolation methods aim at recovering these missing data. Here, we present a simple and robust method which exploits a different view on the Gibbs-phenomenon: The truncation in k-space can be interpreted as a convolution of the underlying image with a sinc-function. As the image is reconstructed on a discretized grid, the severity of the ringing artifacts depends on how this grid is located with respect to the edge and the oscillation pattern of the function. We propose to reinterpolate the image based on local, subvoxel-shifts to sample the ringing pattern at the zero-crossings of the oscillating sinc-function. Results: With the proposed method, the artifact can simply, effectively, and robustly be removed with a minimal amount of image smoothing. Conclusions: The robustness of the method suggests it as a suitable candidate for an implementation in the standard image processing pipeline in clinical routine. Magn Reson Med 76:1574-1581, 2016. © 2015 International Society for Magnetic Resonance in Medicine.

838 citations
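
A hedged 1-D sketch of the subvoxel-shift idea follows: candidate shifts are implemented as linear phase ramps in k-space, and for each sample the shift with the least local oscillation (total variation) is kept, approximating sampling at the sinc zero-crossings. The reference method additionally interpolates back to the original grid and processes 2-D images axis by axis; this sketch omits those steps, and all parameters are illustrative.

```python
import numpy as np

def unring_1d(signal, num_shifts=20, window=3):
    """Resample the signal at several subvoxel shifts (phase ramps in k-space)
    and, per sample, keep the shift that minimizes local total variation."""
    n = len(signal)
    k = np.fft.fftfreq(n)
    shifts = np.linspace(-0.5, 0.5, num_shifts)
    spec = np.fft.fft(signal)
    # signal resampled at every candidate subvoxel shift: (num_shifts, n)
    shifted = np.stack([
        np.real(np.fft.ifft(spec * np.exp(2j * np.pi * k * s))) for s in shifts
    ])
    out = signal.astype(float)
    for i in range(window, n - window):
        # local oscillation measure around sample i for each candidate shift
        tv = np.abs(np.diff(shifted[:, i - window:i + window + 1], axis=1)).sum(axis=1)
        out[i] = shifted[np.argmin(tv), i]
    return out
```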


Posted Content
TL;DR: It is shown that models trained with the VIB objective outperform those that are trained with other forms of regularization, in terms of generalization performance and robustness to adversarial attack.
Abstract: We present a variational approximation to the information bottleneck of Tishby et al. (1999). This variational approach allows us to parameterize the information bottleneck model using a neural network and leverage the reparameterization trick for efficient training. We call this method "Deep Variational Information Bottleneck", or Deep VIB. We show that models trained with the VIB objective outperform those that are trained with other forms of regularization, in terms of generalization performance and robustness to adversarial attack.

757 citations


Proceedings Article
01 Dec 2016
TL;DR: Deep Variational Information Bottleneck (Deep VIB) as discussed by the authors is a variational approximation to the information bottleneck of Tishby et al. This variational approach allows the information bottleneck model to be parameterized using a neural network, leveraging the reparameterization trick for efficient training.
Abstract: We present a variational approximation to the information bottleneck of Tishby et al. (1999). This variational approach allows us to parameterize the information bottleneck model using a neural network and leverage the reparameterization trick for efficient training. We call this method "Deep Variational Information Bottleneck", or Deep VIB. We show that models trained with the VIB objective outperform those that are trained with other forms of regularization, in terms of generalization performance and robustness to adversarial attack.

610 citations
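
A minimal sketch of the VIB objective, assuming an encoder that outputs the mean and log-variance of the stochastic code and a decoder that classifies a reparameterized sample; beta and the tensor names are illustrative.

```python
import torch
import torch.nn.functional as F

def vib_loss(mu, logvar, logits, labels, beta=1e-3):
    """Deep VIB-style objective: cross-entropy on the decoder output plus
    beta * KL(q(z|x) || N(0, I)), where q(z|x) = N(mu, diag(exp(logvar)))."""
    ce = F.cross_entropy(logits, labels)
    # closed-form KL divergence between a diagonal Gaussian and the standard normal
    kl = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
    return ce + beta * kl

# Reparameterization trick used to draw the sample fed to the decoder:
#   z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
```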


Journal ArticleDOI
TL;DR: It is shown that a novel approach to real-time dense visual simultaneous localisation and mapping enables more realistic augmented reality rendering, a richer understanding of the scene beyond pure geometry, and more accurate and robust photometric tracking.
Abstract: We present a novel approach to real-time dense visual simultaneous localisation and mapping. Our system is capable of capturing comprehensive dense globally consistent surfel-based maps of room scale environments and beyond explored using an RGB-D camera in an incremental online fashion, without pose graph optimization or any post-processing steps. This is accomplished by using dense frame-to-model camera tracking and windowed surfel-based fusion coupled with frequent model refinement through non-rigid surface deformations. Our approach applies local model-to-model surface loop closure optimizations as often as possible to stay close to the mode of the map distribution, while utilizing global loop closure to recover from arbitrary drift and maintain global consistency. In the spirit of improving map quality as well as tracking accuracy and robustness, we furthermore explore a novel approach to real-time discrete light source detection. This technique is capable of detecting numerous light sources in indoor...

600 citations


Proceedings ArticleDOI
27 Jun 2016
TL;DR: A large traffic-sign benchmark from 100000 Tencent Street View panoramas is created, going beyond previous benchmarks, and it is demonstrated how a robust end-to-end convolutional neural network (CNN) can simultaneously detect and classify traffic-signs.
Abstract: Although promising results have been achieved in the areas of traffic-sign detection and classification, few works have provided simultaneous solutions to these two tasks for realistic real world images. We make two contributions to this problem. Firstly, we have created a large traffic-sign benchmark from 100000 Tencent Street View panoramas, going beyond previous benchmarks. It provides 100000 images containing 30000 traffic-sign instances. These images cover large variations in illuminance and weather conditions. Each traffic-sign in the benchmark is annotated with a class label, its bounding box and pixel mask. We call this benchmark Tsinghua-Tencent 100K. Secondly, we demonstrate how a robust end-to-end convolutional neural network (CNN) can simultaneously detect and classify traffic-signs. Most previous CNN image processing solutions target objects that occupy a large proportion of an image, and such networks do not work well for target objects occupying only a small fraction of an image like the traffic-signs here. Experimental results show the robustness of our network and its superiority to alternatives. The benchmark, source code and the CNN model introduced in this paper are publicly available.

587 citations


Journal ArticleDOI
TL;DR: Compared with traditional neural networks, the SAE-based DNN can achieve superior performance for feature learning and classification in the field of induction motor fault diagnosis.

Proceedings ArticleDOI
15 Apr 2016
TL;DR: The authors proposed a general stability training method to stabilize deep networks against small input distortions that result from various types of common image processing, such as compression, rescaling, and cropping.
Abstract: In this paper we address the issue of output instability of deep neural networks: small perturbations in the visual input can significantly distort the feature embeddings and output of a neural network. Such instability affects many deep architectures with state-of-the-art performance on a wide range of computer vision tasks. We present a general stability training method to stabilize deep networks against small input distortions that result from various types of common image processing, such as compression, rescaling, and cropping. We validate our method by stabilizing the state-of-the-art Inception architecture [11] against these types of distortions. In addition, we demonstrate that our stabilized model gives robust state-of-the-art performance on large-scale near-duplicate detection, similar-image ranking, and classification on noisy datasets.
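
A hedged sketch of a stability-training loss of this kind is given below: the task loss on the clean input is combined with a term that penalizes divergence between the model's outputs on the clean input and on a perturbed copy. Gaussian pixel noise is used here as the stand-in perturbation, and noise_std and alpha are illustrative hyperparameters.

```python
import torch
import torch.nn.functional as F

def stability_training_loss(model, x, y, noise_std=0.04, alpha=0.01):
    """Task loss on the clean input plus an output-stability penalty between
    the clean and perturbed predictions (classification case)."""
    logits_clean = model(x)
    x_noisy = x + noise_std * torch.randn_like(x)
    logits_noisy = model(x_noisy)

    task_loss = F.cross_entropy(logits_clean, y)
    # KL divergence between the clean and perturbed output distributions
    stability = F.kl_div(F.log_softmax(logits_noisy, dim=1),
                         F.softmax(logits_clean, dim=1),
                         reduction="batchmean")
    return task_loss + alpha * stability
```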

Journal ArticleDOI
TL;DR: A generalized correntropy that adopts the generalized Gaussian density (GGD) function as the kernel is proposed and some important properties are presented; an adaptive algorithm is derived and shown to be very stable, achieving zero probability of divergence (POD).
Abstract: As a robust nonlinear similarity measure in kernel space, correntropy has received increasing attention in domains of machine learning and signal processing. In particular, the maximum correntropy criterion (MCC) has recently been successfully applied in robust regression and filtering. The default kernel function in correntropy is the Gaussian kernel, which is, of course, not always the best choice. In this paper, we propose a generalized correntropy that adopts the generalized Gaussian density (GGD) function as the kernel, and present some important properties. We further propose the generalized maximum correntropy criterion (GMCC) and apply it to adaptive filtering. An adaptive algorithm, called the GMCC algorithm, is derived, and the stability problem and steady-state performance are studied. We show that the proposed algorithm is very stable and can achieve zero probability of divergence (POD). Simulation results confirm the theoretical expectations and demonstrate the desirable performance of the new algorithm.
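
The GMCC idea can be sketched as a stochastic-gradient adaptive filter in which each weight update is scaled by the generalized Gaussian kernel of the instantaneous error, exp(-|e/beta|^alpha); the parameters below (alpha, beta, step size, filter order) are illustrative rather than tuned values from the paper.

```python
import numpy as np

def gmcc_filter(x, d, order=8, mu=0.05, alpha=4.0, beta=1.0):
    """Adaptive filter sketch driven by the generalized maximum correntropy
    criterion: the update is scaled by exp(-lam*|e|^alpha) * |e|^(alpha-1) * sign(e),
    with lam = 1/beta^alpha, which suppresses the influence of large outlier errors."""
    lam = 1.0 / beta ** alpha
    w = np.zeros(order)
    e_hist = np.zeros(len(x))
    for n in range(order, len(x)):
        u = x[n - order:n][::-1]              # regressor (most recent sample first)
        e = d[n] - w @ u                      # prediction error
        # stochastic gradient ascent on the generalized correntropy of the error
        w = w + mu * alpha * lam * np.exp(-lam * abs(e) ** alpha) \
                * abs(e) ** (alpha - 1) * np.sign(e) * u
        e_hist[n] = e
    return w, e_hist
```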

Journal ArticleDOI
TL;DR: This paper compares ten popular local feature descriptors in the contexts of 3D object recognition, 3D shape retrieval, and 3D modeling and presents the performance results of these descriptors when combined with different 3D keypoint detection methods.
Abstract: A number of 3D local feature descriptors have been proposed in the literature. It is, however, unclear which descriptors are more appropriate for a particular application. A good descriptor should be descriptive, compact, and robust to a set of nuisances. This paper compares ten popular local feature descriptors in the contexts of 3D object recognition, 3D shape retrieval, and 3D modeling. We first evaluate the descriptiveness of these descriptors on eight popular datasets which were acquired using different techniques. We then analyze their compactness using the recall of feature matching per each float value in the descriptor. We also test the robustness of the selected descriptors with respect to support radius variations, Gaussian noise, shot noise, varying mesh resolution, distance to the mesh boundary, keypoint localization error, occlusion, clutter, and dataset size. Moreover, we present the performance results of these descriptors when combined with different 3D keypoint detection methods. We finally analyze the computational efficiency for generating each descriptor.

Journal ArticleDOI
TL;DR: This study gives the mathematical model of a quadrotor unmanned aerial vehicle (UAV) and then proposes a robust nonlinear controller that combines the sliding-mode and backstepping control techniques and is designed to achieve Cartesian position trajectory tracking capability.
Abstract: This study gives the mathematical model of a quadrotor unmanned aerial vehicle (UAV) and then proposes a robust nonlinear controller which combines the sliding-mode control technique and the backstepping control technique. To achieve Cartesian position trajectory tracking capability, the construction of the controller can be divided into two stages: a regular SMC controller for the attitude subsystem (inner loop) is first developed to guarantee fast convergence of the Euler angles, and the backstepping technique is then applied to the position loop to obtain the desired attitudes and, finally, the ultimate control laws. The stability of the closed-loop system is guaranteed by stabilizing each of the subsystems step by step, and the robustness of the controller against model uncertainty and external disturbances is investigated. In addition, an adaptive observer-based fault estimation scheme is also considered for the take-off mode. Simulations are conducted to demonstrate the effectiveness of the designed robust nonlinear controller and the fault estimation scheme.

Proceedings Article
30 Apr 2016
TL;DR: CatGAN as discussed by the authors is based on an objective function that trades off mutual information between observed examples and their predicted categorical class distribution against robustness of the classifier to an adversarial generative model.
Abstract: In this paper we present a method for learning a discriminative classifier from unlabeled or partially labeled data. Our approach is based on an objective function that trades-off mutual information between observed examples and their predicted categorical class distribution, against robustness of the classifier to an adversarial generative model. The resulting algorithm can either be interpreted as a natural generalization of the generative adversarial networks (GAN) framework or as an extension of the regularized information maximization (RIM) framework to robust classification against an optimal adversary. We empirically evaluate our method - which we dub categorical generative adversarial networks (or CatGAN) - on synthetic data as well as on challenging image classification tasks, demonstrating the robustness of the learned classifiers. We further qualitatively assess the fidelity of samples generated by the adversarial generator that is learned alongside the discriminative classifier, and identify links between the CatGAN objective and discriminative clustering algorithms (such as RIM).
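
A hedged sketch of the unsupervised CatGAN objectives: the discriminator seeks confident class assignments on real samples, uncertain assignments on generated samples, and uniform marginal class usage, while the generator seeks the opposite on its samples. Weighting terms and the semi-supervised cross-entropy term are omitted for brevity.

```python
import torch
import torch.nn.functional as F

def entropy(p, eps=1e-8):
    """Mean per-sample entropy of a batch of categorical distributions."""
    return -(p * torch.log(p + eps)).sum(dim=1).mean()

def marginal_entropy(p, eps=1e-8):
    """Entropy of the batch-averaged (marginal) class distribution."""
    p_bar = p.mean(dim=0)
    return -(p_bar * torch.log(p_bar + eps)).sum()

def catgan_losses(logits_real, logits_fake):
    """CatGAN-style objectives built from conditional and marginal entropies."""
    p_real = F.softmax(logits_real, dim=1)
    p_fake = F.softmax(logits_fake, dim=1)
    d_loss = entropy(p_real) - entropy(p_fake) - marginal_entropy(p_real)
    g_loss = entropy(p_fake) - marginal_entropy(p_fake)
    return d_loss, g_loss
```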

Journal ArticleDOI
TL;DR: A novel tasks-constrained deep model is formulated, which not only learns the inter-task correlation but also employs dynamic task coefficients to facilitate the optimization convergence when learning multiple complex tasks.
Abstract: In this study, we show that the landmark detection, or face alignment, task is not a single and independent problem. Instead, its robustness can be greatly improved with auxiliary information. Specifically, we jointly optimize landmark detection together with the recognition of heterogeneous but subtly correlated facial attributes, such as gender, expression, and appearance attributes. This is non-trivial since different attribute inference tasks have different learning difficulties and convergence rates. To address this problem, we formulate a novel tasks-constrained deep model, which not only learns the inter-task correlation but also employs dynamic task coefficients to facilitate the optimization convergence when learning multiple complex tasks. Extensive evaluations show that the proposed task-constrained learning (i) outperforms existing face alignment methods, especially in dealing with faces with severe occlusion and pose variation, and (ii) reduces model complexity drastically compared to the state-of-the-art methods based on a cascaded deep model.

Journal ArticleDOI
TL;DR: The fault detection filtering problem is solved for nonlinear switched stochastic systems in the T-S fuzzy framework, and fuzzy-parameter-dependent fault detection filters are designed that guarantee the resulting error system is mean-square exponentially stable with a weighted H∞ error performance.
Abstract: In this note, the fault detection filtering problem is solved for nonlinear switched stochastic systems in the T-S fuzzy framework. Our attention is concentrated on the construction of a robust fault detection technique for the nonlinear switched system with Brownian motion. Based on an observer-based fault detection fuzzy filter as a residual generator, the proposed fault detection is formulated as a fuzzy filtering problem. By utilizing the average dwell time technique and the piecewise Lyapunov function technique, the fuzzy-parameter-dependent fault detection filters are designed that guarantee the resulting error system to be mean-square exponentially stable with a weighted ${\mathcal H}_{\infty}$ error performance. Then, the corresponding solvability condition for the fault detection fuzzy filter is also established by the linearization procedure technique. Finally, a simulation is presented to show the effectiveness of the proposed fault detection technique.

Posted Content
TL;DR: In this article, a verification framework for feed-forward multi-layer neural networks based on Satisfiability Modulo Theory (SMT) is proposed to guarantee the safety of image classification decisions with respect to image manipulations, such as scratches or changes to camera angle or lighting conditions.
Abstract: Deep neural networks have achieved impressive experimental results in image classification, but can surprisingly be unstable with respect to adversarial perturbations, that is, minimal changes to the input image that cause the network to misclassify it. With potential applications including perception modules and end-to-end controllers for self-driving cars, this raises concerns about their safety. We develop a novel automated verification framework for feed-forward multi-layer neural networks based on Satisfiability Modulo Theory (SMT). We focus on safety of image classification decisions with respect to image manipulations, such as scratches or changes to camera angle or lighting conditions that would result in the same class being assigned by a human, and define safety for an individual decision in terms of invariance of the classification within a small neighbourhood of the original image. We enable exhaustive search of the region by employing discretisation, and propagate the analysis layer by layer. Our method works directly with the network code and, in contrast to existing methods, can guarantee that adversarial examples, if they exist, are found for the given region and family of manipulations. If found, adversarial examples can be shown to human testers and/or used to fine-tune the network. We implement the techniques using Z3 and evaluate them on state-of-the-art networks, including regularised and deep learning networks. We also compare against existing techniques to search for adversarial examples and estimate network robustness.
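
To make the SMT-based search concrete, here is a toy illustration using the Z3 solver (not the paper's tool, encoding, or discretisation scheme): a tiny hand-coded ReLU network is encoded symbolically, and the solver is asked whether any input inside an L-infinity ball around a point changes the predicted class. All weights and values are made up for the example.

```python
from z3 import Real, RealVal, Solver, If, And, sat

def relu(v):
    return If(v > 0, v, RealVal(0))

# toy 2-2-2 ReLU network with hand-picked weights; class 0 is predicted at x0
x0, eps = [0.5, -0.2], 0.1
x = [Real("x0"), Real("x1")]
h = [relu(1.0 * x[0] - 2.0 * x[1] + 0.5), relu(-1.5 * x[0] + 1.0 * x[1])]
out = [1.0 * h[0] - 1.0 * h[1], -0.5 * h[0] + 2.0 * h[1]]   # class scores

s = Solver()
# constrain the input to the L-infinity ball of radius eps around x0
s.add(And(*[And(x[i] >= x0[i] - eps, x[i] <= x0[i] + eps) for i in range(2)]))
s.add(out[1] >= out[0])                 # ask for a decision flip to class 1
if s.check() == sat:
    print("adversarial point found:", s.model())
else:
    print("no class change within the region (for this toy net)")
```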

Journal ArticleDOI
TL;DR: Two Mixed Integer Linear Programming (MILP) approaches to generate configurable square-based fiducial marker dictionaries maximizing their inter-marker distance are proposed.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a stochastic quasi-Newton method that is efficient, robust, and scalable based on the observation that it is beneficial to collect curvature information pointwise, and at spaced intervals.
Abstract: The question of how to incorporate curvature information into stochastic approximation methods is challenging. The direct application of classical quasi-Newton updating techniques for deterministic optimization leads to noisy curvature estimates that have harmful effects on the robustness of the iteration. In this paper, we propose a stochastic quasi-Newton method that is efficient, robust, and scalable. It employs the classical BFGS update formula in its limited memory form, and is based on the observation that it is beneficial to collect curvature information pointwise, and at spaced intervals. One way to do this is through (subsampled) Hessian-vector products. This technique differs from the classical approach that would compute differences of gradients at every iteration, and where controlling the quality of the curvature estimates can be difficult. We present numerical results on problems arising in machine learning that suggest that the proposed method shows much promise.
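
A hedged sketch of the method's two key ingredients is given below: a limited-memory BFGS (two-loop) search direction, and curvature pairs collected only every L iterations from a subsampled Hessian-vector product evaluated at averaged iterates rather than from per-step gradient differences. grad(w, batch) and hvp(w, v, batch) are user-supplied callables; all other names and constants are illustrative.

```python
import numpy as np
from collections import deque

def two_loop(grad, s_list, y_list):
    """Standard L-BFGS two-loop recursion: approximates H^{-1} grad."""
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    if s_list:
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)                  # initial Hessian scaling
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return q

def sqn(w0, grad, hvp, sample_batch, steps=1000, step_size=0.01, L=20, memory=10):
    """Stochastic quasi-Newton sketch: SGD-like steps preconditioned by L-BFGS,
    with curvature pairs gathered only at spaced intervals via subsampled
    Hessian-vector products on averaged iterates."""
    w = w0.copy()
    s_list, y_list = deque(maxlen=memory), deque(maxlen=memory)
    w_bar, w_bar_prev = np.zeros_like(w), None
    for t in range(1, steps + 1):
        batch = sample_batch()
        g = grad(w, batch)
        direction = two_loop(g, list(s_list), list(y_list)) if s_list else g
        w -= step_size * direction
        w_bar += w / L
        if t % L == 0:                          # spaced curvature collection
            if w_bar_prev is not None:
                s = w_bar - w_bar_prev
                y = hvp(w_bar, s, sample_batch())   # subsampled Hessian-vector product
                if s @ y > 1e-10:               # keep only positive-curvature pairs
                    s_list.append(s.copy()); y_list.append(y.copy())
            w_bar_prev, w_bar = w_bar, np.zeros_like(w)
    return w
```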

Proceedings Article
05 Dec 2016
TL;DR: This work presents a general approach for using flexible parametric models (neural networks) for Bayesian optimization, staying as close to a truly Bayesian treatment as possible and obtaining scalability through stochastic gradient Hamiltonian Monte Carlo, whose robustness is improved via a scale adaptation.
Abstract: Bayesian optimization is a prominent method for optimizing expensive-to-evaluate black-box functions that is widely applied to tuning the hyperparameters of machine learning algorithms. Despite its successes, the prototypical Bayesian optimization approach - using Gaussian process models - does not scale well to either many hyperparameters or many function evaluations. Attacking this lack of scalability and flexibility is thus one of the key challenges of the field. We present a general approach for using flexible parametric models (neural networks) for Bayesian optimization, staying as close to a truly Bayesian treatment as possible. We obtain scalability through stochastic gradient Hamiltonian Monte Carlo, whose robustness we improve via a scale adaptation. Experiments including multi-task Bayesian optimization with 21 tasks, parallel optimization of deep neural networks and deep reinforcement learning show the power and flexibility of this approach.

Journal ArticleDOI
TL;DR: It is demonstrated that new millimeter-wave (mm-wave) technology, under investigation for 5G communications systems, will be able to provide centimeter (cm)-accuracy indoor localization in a robust manner, ideally suited for AL.
Abstract: Assisted living (AL) technologies, enabled by technical advances such as the advent of the Internet of Things, are increasingly gaining importance in our aging society. This article discusses the potential of future high-accuracy localization systems as a key component of AL applications. Accurate location information can be tremendously useful to realize, e.g., behavioral monitoring, fall detection, and real-time assistance. Such services are expected to provide older adults and people with disabilities with more independence and thus to reduce the cost of caretaking. Total cost of ownership and ease of installation are paramount to make sensor systems for AL viable. In case of a radio-based indoor localization system, this implies that a conventional solution is unlikely to gain widespread adoption because of its requirement to install multiple fixed nodes (anchors) in each room. This article therefore places its focus on 1) discussing radiolocalization methods that reduce the required infrastructure by exploiting information from reflected multipath components (MPCs) and 2) showing that knowledge about the propagation environment enables localization with high accuracy and robustness. It is demonstrated that new millimeter-wave (mm-wave) technology, under investigation for 5G communications systems, will be able to provide centimeter (cm)-accuracy indoor localization in a robust manner, ideally suited for AL.

Journal ArticleDOI
TL;DR: It is shown that, even without offline training with a large amount of auxiliary data, simple two-layer convolutional networks can be powerful enough to learn robust representations for visual tracking.
Abstract: Deep networks have been successfully applied to visual tracking by learning a generic representation offline from numerous training images. However, the offline training is time-consuming and the learned generic representation may be less discriminative for tracking specific objects. In this paper, we present that, even without offline training with a large amount of auxiliary data, simple two-layer convolutional networks can be powerful enough to learn robust representations for visual tracking. In the first frame, we extract a set of normalized patches from the target region as fixed filters, which integrate a series of adaptive contextual filters surrounding the target to define a set of feature maps in the subsequent frames. These maps measure similarities between each filter and useful local intensity patterns across the target, thereby encoding its local structural information. Furthermore, all the maps together form a global representation, via which the inner geometric layout of the target is also preserved. A simple soft shrinkage method that suppresses noisy values below an adaptive threshold is employed to de-noise the global representation. Our convolutional networks have a lightweight structure and perform favorably against several state-of-the-art methods on the recent tracking benchmark data set with 50 challenging videos.

Book ChapterDOI
08 Oct 2016
TL;DR: A new method for blind motion deblurring uses a neural network, trained to compute estimates of sharp image patches from observations blurred by an unknown motion kernel, to predict the complex Fourier coefficients of a deconvolution filter to be applied to the input patch for restoration.
Abstract: We present a new method for blind motion deblurring that uses a neural network trained to compute estimates of sharp image patches from observations that are blurred by an unknown motion kernel. Instead of regressing directly to patch intensities, this network learns to predict the complex Fourier coefficients of a deconvolution filter to be applied to the input patch for restoration. For inference, we apply the network independently to all overlapping patches in the observed image, and average its outputs to form an initial estimate of the sharp image. We then explicitly estimate a single global blur kernel by relating this estimate to the observed image, and finally perform non-blind deconvolution with this kernel. Our method exhibits accuracy and robustness close to state-of-the-art iterative methods, while being much faster when parallelized on GPU hardware.
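
The frequency-domain restoration step can be illustrated as below; filter_coeffs stands in for the network's predicted complex Fourier coefficients, and the windowed averaging of overlapping patches described in the paper is omitted.

```python
import numpy as np

def restore_patch(blurred_patch, filter_coeffs):
    """Apply a predicted frequency-domain deconvolution filter to one patch:
    multiply the patch spectrum by the complex coefficients and invert."""
    spectrum = np.fft.fft2(blurred_patch)
    restored = np.fft.ifft2(spectrum * filter_coeffs)
    return np.real(restored)
```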

Journal ArticleDOI
TL;DR: Experimental results indicate that DCP outperforms the state-of-the-art local descriptors for both face identification and face verification tasks and the best performance is achieved on the challenging LFW and FRGC 2.0 databases by deploying MDML-DCPs in a simple recognition scheme.
Abstract: To perform unconstrained face recognition robust to variations in illumination, pose and expression, this paper presents a new scheme to extract “Multi-Directional Multi-Level Dual-Cross Patterns” (MDML-DCPs) from face images. Specifically, the MDML-DCPs scheme exploits the first derivative of Gaussian operator to reduce the impact of differences in illumination and then computes the DCP feature at both the holistic and component levels. DCP is a novel face image descriptor inspired by the unique textural structure of human faces. It is computationally efficient and only doubles the cost of computing local binary patterns, yet is extremely robust to pose and expression variations. MDML-DCPs comprehensively yet efficiently encodes the invariant characteristics of a face image from multiple levels into patterns that are highly discriminative of inter-personal differences but robust to intra-personal variations. Experimental results on the FERET, CAS-PEAL-R1, FRGC 2.0, and LFW databases indicate that DCP outperforms the state-of-the-art local descriptors (e.g., LBP, LTP, LPQ, POEM, tLBP, and LGXP) for both face identification and face verification tasks. More impressively, the best performance is achieved on the challenging LFW and FRGC 2.0 databases by deploying MDML-DCPs in a simple recognition scheme.

Journal ArticleDOI
TL;DR: An R package, robustlmm, is introduced, designed to robustly fit linear mixed-effects models, to provide estimates where contamination has only little influence and to detect and flag contamination.
Abstract: As any real-life data, data modeled by linear mixed-effects models often contain outliers or other contamination. Even little contamination can drive the classic estimates far away from what they would be without the contamination. At the same time, datasets that require mixed-effects modeling are often complex and large. This makes it difficult to spot contamination. Robust estimation methods aim to solve both problems: to provide estimates where contamination has only little influence and to detect and flag contamination. We introduce an R package, robustlmm, to robustly fit linear mixed-effects models. The package's functions and methods are designed to closely equal those offered by lme4, the R package that implements classic linear mixed-effects model estimation in R. The robust estimation method in robustlmm is based on the random effects contamination model and the central contamination model. Contamination can be detected at all levels of the data. The estimation method does not make any assumption on the data's grouping structure except that the model parameters are estimable. robustlmm supports hierarchical and non-hierarchical (e.g., crossed) grouping structures. The robustness of the estimates and their asymptotic efficiency is fully controlled through the function interface. Individual parts (e.g., fixed effects and variance components) can be tuned independently. In this tutorial, we show how to fit robust linear mixed-effects models using robustlmm, how to assess the model fit, how to detect outliers, and how to compare different fits.

Book ChapterDOI
TL;DR: The intent is to provide image-processing methods that can be deployed in algorithms that analyze biomedical images with improved rotation invariance and high directional sensitivity, and address the problem of matching directional patterns by proposing steerable filters.
Abstract: We give a methodology-oriented perspective on directional image analysis and rotation-invariant processing. We review the state of the art in the field and make connections with recent mathematical developments in functional analysis and wavelet theory. We unify our perspective within a common framework using operators. The intent is to provide image-processing methods that can be deployed in algorithms that analyze biomedical images with improved rotation invariance and high directional sensitivity. We start our survey with classical methods such as directional-gradient and the structure tensor. Then, we discuss how these methods can be improved with respect to robustness, invariance to geometric transformations (with a particular interest in scaling), and computation cost. To address robustness against noise, we move forward to higher degrees of directional selectivity and discuss Hessian-based detection schemes. To present multiscale approaches, we explain the differences between Fourier filters, directional wavelets, curvelets, and shearlets. To reduce the computational cost, we address the problem of matching directional patterns by proposing steerable filters, where one might perform arbitrary rotations and optimizations without discretizing the orientation. We define the property of steerability and give an introduction to the design of steerable filters. We cover the spectrum from simple steerable filters through pyramid schemes up to steerable wavelets. We also present illustrations on the design of steerable wavelets and their application to pattern recognition.
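
As a small illustration of first-order steerability, the sketch below synthesizes the derivative-of-Gaussian response at an arbitrary angle from just two axis-aligned basis responses, so orientations never need to be discretized; sigma and the function name are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def steered_gradient(image, theta, sigma=2.0):
    """Derivative-of-Gaussian response steered to angle theta as an exact
    linear combination of the two axis-aligned basis responses:
    cos(theta) * Ix + sin(theta) * Iy."""
    Ix = gaussian_filter(image, sigma, order=(0, 1))   # d/dx (columns)
    Iy = gaussian_filter(image, sigma, order=(1, 0))   # d/dy (rows)
    return np.cos(theta) * Ix + np.sin(theta) * Iy
```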

Book ChapterDOI
08 Oct 2016
TL;DR: This paper employs CNNs and incorporates two significant improvements to state-of-the-art methods, layered boosting and selective sampling, to increase counting accuracy and to reduce processing time.
Abstract: In this paper, we address the task of object counting in images. We follow modern learning approaches in which a density map is estimated directly from the input image. We employ CNNs and incorporate two significant improvements to state-of-the-art methods: layered boosting and selective sampling. As a result, we manage both to increase the counting accuracy and to reduce processing time. Moreover, we show that the proposed method is effective, even in the presence of labeling errors. Extensive experiments on five different datasets demonstrate the efficacy and robustness of our approach. The mean absolute error was reduced by 20% to 35%. At the same time, the training time of each CNN has been reduced by 50%.

Journal ArticleDOI
TL;DR: A learning-based framework for robust and automatic nucleus segmentation with shape preservation is proposed that is applicable to different staining histopathology images and general enough to perform well across multiple scenarios.
Abstract: Computer-aided image analysis of histopathology specimens could potentially provide support for early detection and improved characterization of diseases such as brain tumor, pancreatic neuroendocrine tumor (NET), and breast cancer. Automated nucleus segmentation is a prerequisite for various quantitative analyses including automatic morphological feature computation. However, it remains a challenging problem due to the complex nature of histopathology images. In this paper, we propose a learning-based framework for robust and automatic nucleus segmentation with shape preservation. Given a nucleus image, it begins with a deep convolutional neural network (CNN) model to generate a probability map, on which an iterative region merging approach is performed for shape initializations. Next, a novel segmentation algorithm is exploited to separate individual nuclei combining a robust selection-based sparse shape model and a local repulsive deformable model. One of the significant benefits of the proposed framework is that it is applicable to different staining histopathology images. Due to the feature learning characteristic of the deep CNN and the high level shape prior modeling, the proposed method is general enough to perform well across multiple scenarios. We have tested the proposed algorithm on three large-scale pathology image datasets using a range of different tissue and stain preparations, and comparative experiments with recent state-of-the-art methods demonstrate the superior performance of the proposed approach.