
Showing papers on "Robustness (computer science)" published in 2015


Journal ArticleDOI
TL;DR: An extensive evaluation of the state-of-the-art online object-tracking algorithms with various evaluation criteria is carried out to identify effective approaches for robust tracking and provide potential future research directions in this field.
Abstract: Object tracking has been one of the most important and active research areas in the field of computer vision. A large number of tracking algorithms have been proposed in recent years with demonstrated success. However, the set of sequences used for evaluation is often not sufficient or is sometimes biased for certain types of algorithms. Many datasets do not have common ground-truth object positions or extents, and this makes comparisons among the reported quantitative results difficult. In addition, the initial conditions or parameters of the evaluated tracking algorithms are not the same, and thus, the quantitative results reported in literature are incomparable or sometimes contradictory. To address these issues, we carry out an extensive evaluation of the state-of-the-art online object-tracking algorithms with various evaluation criteria to understand how these methods perform within the same framework. In this work, we first construct a large dataset with ground-truth object positions and extents for tracking and introduce the sequence attributes for the performance analysis. Second, we integrate most of the publicly available trackers into one code library with uniform input and output formats to facilitate large-scale performance evaluation. Third, we extensively evaluate the performance of 31 algorithms on 100 sequences with different initialization settings. By analyzing the quantitative results, we identify effective approaches for robust tracking and provide potential future research directions in this field.

2,974 citations
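
The benchmark's headline numbers come from simple per-frame overlap statistics. As a rough sketch (not the authors' released toolkit), the success plot can be computed from intersection-over-union scores like this:

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb = min(box_a[0] + box_a[2], box_b[0] + box_b[2])
    yb = min(box_a[1] + box_a[3], box_b[1] + box_b[3])
    inter = max(0, xb - xa) * max(0, yb - ya)
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def success_curve(pred_boxes, gt_boxes, thresholds=np.linspace(0, 1, 21)):
    """Fraction of frames whose overlap exceeds each threshold (success plot)."""
    overlaps = np.array([iou(p, g) for p, g in zip(pred_boxes, gt_boxes)])
    return np.array([(overlaps > t).mean() for t in thresholds])
```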


Proceedings ArticleDOI
01 Sep 2015
TL;DR: A monocular visual-inertial odometry algorithm which achieves accurate tracking performance while exhibiting a very high level of robustness by directly using pixel intensity errors of image patches, leading to a truly power-up-and-go state estimation system.
Abstract: In this paper, we present a monocular visual-inertial odometry algorithm which, by directly using pixel intensity errors of image patches, achieves accurate tracking performance while exhibiting a very high level of robustness. After detection, the tracking of the multilevel patch features is closely coupled to the underlying extended Kalman filter (EKF) by directly using the intensity errors as the innovation term during the update step. We follow a purely robocentric approach where the locations of 3D landmarks are always estimated with respect to the current camera pose. Furthermore, we decompose landmark positions into a bearing vector and a distance parametrization, whereby we employ a minimal representation of differences on a corresponding σ-algebra in order to achieve better consistency and to improve the computational performance. Due to the robocentric, inverse-distance landmark parametrization, the framework does not require any initialization procedure, leading to a truly power-up-and-go state estimation system. The presented approach is successfully evaluated in a set of highly dynamic hand-held experiments as well as directly employed in the control loop of a multirotor unmanned aerial vehicle (UAV).

665 citations


Proceedings ArticleDOI
17 Dec 2015
TL;DR: This paper leverages recent progress on Convolutional Neural Networks (CNNs) and proposes a novel RGB-D architecture for object recognition that is composed of two separate CNN processing streams - one for each modality - which are consecutively combined with a late fusion network.
Abstract: Robust object recognition is a crucial ingredient of many, if not all, real-world robotics applications. This paper leverages recent progress on Convolutional Neural Networks (CNNs) and proposes a novel RGB-D architecture for object recognition. Our architecture is composed of two separate CNN processing streams - one for each modality - which are consecutively combined with a late fusion network. We focus on learning with imperfect sensor data, a typical problem in real-world robotics tasks. For accurate learning, we introduce a multi-stage training methodology and two crucial ingredients for handling depth data with CNNs. The first is an effective encoding of depth information for CNNs that enables learning without the need for large depth datasets. The second is a data augmentation scheme for robust learning with depth images by corrupting them with realistic noise patterns. We present state-of-the-art results on the RGB-D object dataset [15] and show recognition in challenging RGB-D real-world noisy settings.

629 citations
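
The depth-encoding idea is to render a single-channel depth map as a three-channel image so that networks pretrained on RGB data apply unchanged. A minimal sketch of one such colorization, assuming a jet colormap (the exact rendering in the paper may differ):

```python
import numpy as np
from matplotlib import cm

def colorize_depth(depth, d_min=None, d_max=None):
    """Map a single-channel depth image to a 3-channel 'jet' image so a CNN
    pretrained on RGB data can process it without architectural changes."""
    d_min = np.nanmin(depth) if d_min is None else d_min
    d_max = np.nanmax(depth) if d_max is None else d_max
    norm = np.clip((depth - d_min) / (d_max - d_min + 1e-8), 0.0, 1.0)
    rgb = cm.jet(norm)[..., :3]            # drop the alpha channel
    return (rgb * 255).astype(np.uint8)
```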


Journal ArticleDOI
TL;DR: By incorporating the fundamental and harmonic SSVEP components in target identification, the proposed FBCCA method significantly improves the performance of the SSVEP-based BCI, and thereby facilitates its practical applications such as high-speed spelling.
Abstract: Objective. Recently, canonical correlation analysis (CCA) has been widely used in steady-state visual evoked potential (SSVEP)-based brain–computer interfaces (BCIs) due to its high efficiency, robustness, and simple implementation. However, a method with which to make use of harmonic SSVEP components to enhance the CCA-based frequency detection has not been well established. Approach. This study proposed a filter bank canonical correlation analysis (FBCCA) method to incorporate fundamental and harmonic frequency components to improve the detection of SSVEPs. A 40-target BCI speller based on frequency coding (frequency range: 8–15.8 Hz, frequency interval: 0.2 Hz) was used for performance evaluation. To optimize the filter bank design, three methods (M1: sub-bands with equally spaced bandwidths; M2: sub-bands corresponding to individual harmonic frequency bands; M3: sub-bands covering multiple harmonic frequency bands) were proposed for comparison. Classification accuracy and information transfer rate (ITR) of the three FBCCA methods and the standard CCA method were estimated using an offline dataset from 12 subjects. Furthermore, an online BCI speller adopting the optimal FBCCA method was tested with a group of 10 subjects. Main results. The FBCCA methods significantly outperformed the standard CCA method. The method M3 achieved the highest classification performance. At a spelling rate of ~33.3 characters/min, the online BCI speller obtained an average ITR of 151.18 ± 20.34 bits min−1. Significance. By incorporating the fundamental and harmonic SSVEP components in target identification, the proposed FBCCA method significantly improves the performance of the SSVEP-based BCI, and thereby facilitates its practical applications such as high-speed spelling.

471 citations
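
A hedged sketch of the FBCCA scoring step: band-pass the EEG into sub-bands, run CCA against sinusoidal references at the candidate frequency and its harmonics, and combine the squared correlations with decaying weights. The sub-band layout and the weight constants below are illustrative choices patterned on the paper, not a verified reimplementation:

```python
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.cross_decomposition import CCA

def reference_signals(freq, n_harmonics, fs, n_samples):
    """Sine/cosine references at the stimulus frequency and its harmonics."""
    t = np.arange(n_samples) / fs
    return np.vstack([f(2 * np.pi * h * freq * t)
                      for h in range(1, n_harmonics + 1)
                      for f in (np.sin, np.cos)]).T

def fbcca_score(eeg, freq, fs, n_bands=5, n_harmonics=4, a=1.25, b=0.25):
    """Weighted sum of squared sub-band canonical correlations.
    eeg: (n_samples, n_channels); assumes fs around 250 Hz as in the paper."""
    Y = reference_signals(freq, n_harmonics, fs, eeg.shape[0])
    score = 0.0
    for n in range(1, n_bands + 1):
        low = 8.0 * n                     # sub-band n spans roughly [8n, 88] Hz
        bb, ab = butter(4, [low / (fs / 2), 88.0 / (fs / 2)], btype="band")
        Xn = filtfilt(bb, ab, eeg, axis=0)
        cca = CCA(n_components=1).fit(Xn, Y)
        u, v = cca.transform(Xn, Y)
        rho = np.corrcoef(u[:, 0], v[:, 0])[0, 1]
        score += (n ** -a + b) * rho ** 2 # decaying sub-band weights
    return score                          # classify by argmax over candidate freqs
```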


Journal ArticleDOI
TL;DR: In this paper, a hybrid model for fault detection and classification of motor bearing is presented, where the permutation entropy (PE) of the vibration signal is calculated to detect the malfunctions of the bearing.

453 citations


Journal ArticleDOI
TL;DR: This paper introduces a new generalized hierarchical FCM (GHFCM), which is more robust to image noise thanks to spatial constraints imposed through the generalized mean, and introduces a more flexible distance function that treats the distance computation itself as a sub-FCM.
Abstract: Fuzzy c-means (FCM) has been considered an effective algorithm for image segmentation. However, it still suffers from two problems: one is insufficient robustness to image noise, and the other is the Euclidean distance used in FCM, which is sensitive to outliers. In this paper, we propose two new algorithms, generalized FCM (GFCM) and hierarchical FCM (HFCM), to solve these two problems. From its mathematical formula, traditional FCM can be viewed as a linear combination of membership and distance. GFCM is generated by applying a generalized mean to these two terms: we impose the generalized mean on membership to incorporate local spatial information and cluster information, and on the distance function to incorporate local spatial information and image intensity values. Thus, with the generalized mean acting as a spatial constraint, our GFCM is more robust to image noise. To solve the second problem, caused by the Euclidean distance (l2 norm), we introduce a more flexible distance function that treats the distance computation itself as a sub-FCM. Furthermore, the sub-FCM distance function in HFCM is general and flexible enough to deal with non-Euclidean data. Finally, we combine these two algorithms to introduce a new generalized hierarchical FCM (GHFCM). Experimental results demonstrate the improved robustness and effectiveness of the proposed algorithm.

434 citations
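
For orientation, the baseline the paper generalizes is standard FCM, which alternates membership and centroid updates; GFCM replaces the linear membership/distance combination with a generalized (power) mean, and HFCM swaps the Euclidean distance for a sub-FCM. A sketch of the baseline only:

```python
import numpy as np

def fcm(X, k, m=2.0, n_iter=100, seed=0):
    """Baseline fuzzy c-means: alternate centroid and membership updates.
    GFCM would replace the linear membership/distance combination with a
    generalized (power) mean; HFCM would swap the Euclidean distance below
    for a sub-FCM better suited to non-Euclidean data."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(k), size=len(X))     # memberships, rows sum to 1
    for _ in range(n_iter):
        W = U ** m                                 # fuzzified memberships
        C = (W.T @ X) / W.sum(axis=0)[:, None]     # centroid update
        D = np.linalg.norm(X[:, None, :] - C[None], axis=2) + 1e-10
        inv = 1.0 / D ** (2.0 / (m - 1.0))         # standard FCM membership rule
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, C
```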


Journal ArticleDOI
TL;DR: This paper describes the latest progress of ELM in recent years, including the models and specific applications of ELM, and finally points out the research and development prospects of ELM in the future.
Abstract: Extreme learning machine (ELM) is a new learning algorithm for single-hidden-layer feedforward neural networks. Compared with conventional neural network learning algorithms, it overcomes the slow training speed and over-fitting problems. ELM is based on empirical risk minimization theory and its learning process needs only a single iteration. The algorithm avoids multiple iterations and local minimization. It has been used in various fields and applications because of its better generalization ability, robustness, controllability, and fast learning rate. In this paper, we review the latest research progress on ELM algorithms, theory, and applications: we first analyze the theory and algorithmic ideas of ELM, then describe the latest progress of ELM in recent years, including models and specific applications of ELM, and finally point out the research and development prospects of ELM in the future.

429 citations
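
The single-iteration training that the review highlights reduces to one linear solve: input weights are random and fixed, and only the output weights are fit, typically via a pseudoinverse. A minimal sketch:

```python
import numpy as np

def elm_train(X, T, n_hidden=100, seed=0):
    """Train an extreme learning machine: random hidden layer, analytic output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # fixed random input weights
    b = rng.standard_normal(n_hidden)                # fixed random biases
    H = np.tanh(X @ W + b)                           # hidden-layer activations
    beta = np.linalg.pinv(H) @ T                     # single-step least-squares solve
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```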


Posted Content
TL;DR: This article presents a didactic example of Structural Equation Modeling using the software SmartPLS 2.0 M3, which uses the method of Partial Least Squares to address situations frequently observed in marketing research.
Abstract: The objective of this article is to present a didactic example of Structural Equation Modeling using the software SmartPLS 2.0 M3. The program uses the method of Partial Least Squares and seeks to address the following situations frequently observed in marketing research: absence of symmetric distributions among measured variables, theories still in an early phase of development or with little “consolidation”, formative models, and/or a limited amount of data. The growing use of SmartPLS has demonstrated its robustness and the applicability of the model in the areas being studied.

419 citations


Posted Content
TL;DR: CatGAN as mentioned in this paper is based on an objective function that trades-off mutual information between observed examples and their predicted categorical class distribution, against robustness of the classifier to an adversarial generative model.
Abstract: In this paper we present a method for learning a discriminative classifier from unlabeled or partially labeled data. Our approach is based on an objective function that trades-off mutual information between observed examples and their predicted categorical class distribution, against robustness of the classifier to an adversarial generative model. The resulting algorithm can either be interpreted as a natural generalization of the generative adversarial networks (GAN) framework or as an extension of the regularized information maximization (RIM) framework to robust classification against an optimal adversary. We empirically evaluate our method - which we dub categorical generative adversarial networks (or CatGAN) - on synthetic data as well as on challenging image classification tasks, demonstrating the robustness of the learned classifiers. We further qualitatively assess the fidelity of samples generated by the adversarial generator that is learned alongside the discriminative classifier, and identify links between the CatGAN objective and discriminative clustering algorithms (such as RIM).

407 citations
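
The trade-off in the CatGAN objective can be written down directly with entropy terms: the discriminator should be confident on real data, uncertain on generated data, and use all classes equally, while the generator pushes the other way. A PyTorch paraphrase of the objective (my sketch, not the authors' code):

```python
import torch
import torch.nn.functional as F

def conditional_entropy(logits):
    """Mean per-example entropy H[p(y|x)] of the predicted class distribution."""
    p = F.softmax(logits, dim=1)
    return -(p * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

def marginal_entropy(logits):
    """Entropy H[p(y)] of the batch-averaged class distribution."""
    p = F.softmax(logits, dim=1).mean(dim=0)
    return -(p * torch.log(p + 1e-12)).sum()

def discriminator_loss(logits_real, logits_fake):
    # Be certain on real data, uncertain on samples, use all classes equally.
    return (conditional_entropy(logits_real)
            - conditional_entropy(logits_fake)
            - marginal_entropy(logits_real))

def generator_loss(logits_fake):
    # Fool the discriminator into confident, class-balanced predictions.
    return conditional_entropy(logits_fake) - marginal_entropy(logits_fake)
```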


Proceedings Article
01 Jan 2015
TL;DR: This article proposed a generic way to handle noisy and incomplete labeling by augmenting the prediction objective with a notion of consistency, where the notion of similarity is between deep network features computed from the input data.
Abstract: Current state-of-the-art deep learning systems for visual object recognition and detection use purely supervised training with regularization such as dropout to avoid overfitting. The performance depends critically on the amount of labeled examples, and in current practice the labels are assumed to be unambiguous and accurate. However, this assumption often does not hold; e.g., in recognition, class labels may be missing; in detection, objects in the image may not be localized; and in general, the labeling may be subjective. In this work we propose a generic way to handle noisy and incomplete labeling by augmenting the prediction objective with a notion of consistency. We consider a prediction consistent if the same prediction is made given similar percepts, where the notion of similarity is between deep network features computed from the input data. In experiments we demonstrate that our approach yields substantial robustness to label noise on several datasets. On MNIST handwritten digits, we show that our model is robust to label corruption. On the Toronto Face Database, we show that our model handles well the case of subjective labels in emotion recognition, achieving state-of-the-art results, and can also benefit from unlabeled face images with no modification to our method. On the ILSVRC2014 detection challenge data, we show that our approach extends to very deep networks, high resolution images and structured outputs, and results in improved scalable detection.

377 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed the Multi-view Intact Space Learning (MISL) algorithm, which integrates the encoded complementary information in multiple views to discover a latent intact representation of the data.
Abstract: It is practical to assume that an individual view is unlikely to be sufficient for effective multi-view learning. Therefore, integration of multi-view information is both valuable and necessary. In this paper, we propose the Multi-view Intact Space Learning (MISL) algorithm, which integrates the encoded complementary information in multiple views to discover a latent intact representation of the data. Even though each view on its own is insufficient, we show theoretically that by combining multiple views we can obtain abundant information for latent intact space learning. Employing the Cauchy loss (a technique used in statistical learning) as the error measurement strengthens robustness to outliers. We propose a new definition of multi-view stability and then derive the generalization error bound based on multi-view stability and Rademacher complexity, and show that the complementarity between multiple views is beneficial for the stability and generalization. MISL is efficiently optimized using a novel Iteratively Reweighted Residuals (IRR) technique, whose convergence is theoretically analyzed. Experiments on synthetic data and real-world datasets demonstrate that MISL is an effective and promising algorithm for practical applications.
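
The robustness claim rests on the Cauchy loss, which grows only logarithmically in the residual, and on the IRR optimizer, which repeatedly solves a weighted least-squares problem whose weights shrink for large residuals. A sketch of both ingredients (the scale parameter c is an illustrative choice):

```python
import numpy as np

def cauchy_loss(residuals, c=1.0):
    """Cauchy (Lorentzian) loss: log growth makes outliers far less
    influential than under squared error."""
    return np.log(1.0 + (residuals / c) ** 2).sum()

def irr_weights(residuals, c=1.0):
    """Weights for one iteratively-reweighted least-squares step: large
    residuals (likely outliers) receive small weights."""
    return 1.0 / (1.0 + (residuals / c) ** 2)
```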

Proceedings ArticleDOI
07 Jun 2015
TL;DR: This paper proposes a novel approach to correlation filter estimation that takes advantage of inherent computational redundancies in the frequency domain, dramatically reduces boundary effects, and is able to implicitly exploit all possible patches densely extracted from training examples during learning process.
Abstract: Correlation filters take advantage of specific properties in the Fourier domain allowing them to be estimated efficiently: O(ND log D) in the frequency domain, versus O(D^3 + ND^2) spatially, where D is the signal length and N is the number of signals. Recent extensions to correlation filters, such as MOSSE, have reignited interest in their use in the vision community due to their robustness and attractive computational properties. In this paper we demonstrate, however, that this computational efficiency comes at a cost. Specifically, we demonstrate that only a 1/D proportion of shifted examples are unaffected by boundary effects, which has a dramatic effect on detection/tracking performance. In this paper, we propose a novel approach to correlation filter estimation that: (i) takes advantage of inherent computational redundancies in the frequency domain, (ii) dramatically reduces boundary effects, and (iii) is able to implicitly exploit all possible patches densely extracted from training examples during the learning process. Impressive object tracking and detection results are presented in terms of both accuracy and computational efficiency.
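
For context, the MOSSE-style filter the paper builds on is estimated entirely in the Fourier domain with a closed form; the paper's contribution adds boundary-effect handling on top of a formulation like this sketch:

```python
import numpy as np

def train_mosse(patches, target, lam=1e-2):
    """Closed-form correlation filter in the Fourier domain (MOSSE).
    patches: list of HxW training patches; target: HxW desired response
    (e.g. a Gaussian peak centered on the object)."""
    G = np.fft.fft2(target)
    num = np.zeros_like(G)
    den = np.full_like(G, lam)            # regularizer avoids division by zero
    for p in patches:
        F_ = np.fft.fft2(p)
        num += G * np.conj(F_)
        den += F_ * np.conj(F_)
    return num / den                      # conjugate filter H*, Fourier domain

def respond(H_conj, patch):
    """Correlation response; its peak locates the target."""
    return np.real(np.fft.ifft2(H_conj * np.fft.fft2(patch)))
```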

Posted Content
TL;DR: A novel Aggregated Local Flow Descriptor (ALFD) that encodes the relative motion pattern between a pair of temporally distant detections using long term interest point trajectories (IPTs) and ablative analysis verifies the superiority of the ALFD metric over the other conventional affinity metrics.
Abstract: In this paper, we focus on the two key aspects of the multiple target tracking problem: 1) designing an accurate affinity measure to associate detections and 2) implementing an efficient and accurate (near) online multiple target tracking algorithm. As the first contribution, we introduce a novel Aggregated Local Flow Descriptor (ALFD) that encodes the relative motion pattern between a pair of temporally distant detections using long term interest point trajectories (IPTs). Leveraging the IPTs, the ALFD provides a robust affinity measure for estimating the likelihood of matching detections regardless of the application scenarios. As another contribution, we present a Near-Online Multi-target Tracking (NOMT) algorithm. The tracking problem is formulated as a data-association between targets and detections in a temporal window, which is performed repeatedly at every frame. While being efficient, NOMT achieves robustness via integrating multiple cues including the ALFD metric, target dynamics, appearance similarity, and long term trajectory regularization into the model. Our ablative analysis verifies the superiority of the ALFD metric over the other conventional affinity metrics. We run a comprehensive experimental evaluation on two challenging tracking datasets, the KITTI and MOT datasets. The NOMT method combined with the ALFD metric achieves the best accuracy in both datasets with significant margins (about 10% higher MOTA) over the state of the art.

Proceedings ArticleDOI
07 Dec 2015
TL;DR: In this paper, a novel Aggregated Local Flow Descriptor (ALFD) is proposed to encode the relative motion pattern between a pair of temporally distant detections using long term interest point trajectories (IPTs).
Abstract: In this paper, we tackle two key aspects of multiple target tracking problem: 1) designing an accurate affinity measure to associate detections and 2) implementing an efficient and accurate (near) online multiple target tracking algorithm. As for the first contribution, we introduce a novel Aggregated Local Flow Descriptor (ALFD) that encodes the relative motion pattern between a pair of temporally distant detections using long term interest point trajectories (IPTs). Leveraging on the IPTs, the ALFD provides a robust affinity measure for estimating the likelihood of matching detections regardless of the application scenarios. As for another contribution, we present a Near-Online Multi-target Tracking (NOMT) algorithm. The tracking problem is formulated as a data-association between targets and detections in a temporal window, that is performed repeatedly at every frame. While being efficient, NOMT achieves robustness via integrating multiple cues including ALFD metric, target dynamics, appearance similarity, and long term trajectory regularization into the model. Our ablative analysis verifies the superiority of the ALFD metric over the other conventional affinity metrics. We run a comprehensive experimental evaluation on two challenging tracking datasets, KITTI [16] and MOT [2] datasets. The NOMT method combined with ALFD metric achieves the best accuracy in both datasets with significant margins (about 10% higher MOTA) over the state-of-the-art.

Journal ArticleDOI
TL;DR: A new algorithm for the accurate detection and localization of copy-move forgeries, based on rotation-invariant features computed densely on the image, is proposed, using a fast approximate nearest-neighbor search algorithm, PatchMatch, especially suited for the computation of dense fields over images.
Abstract: We propose a new algorithm for the accurate detection and localization of copy–move forgeries, based on rotation-invariant features computed densely on the image. Dense-field techniques proposed in the literature guarantee a superior performance with respect to their keypoint-based counterparts, at the price of a much higher processing time, mostly due to the feature matching phase. To overcome this limitation, we resort here to a fast approximate nearest-neighbor search algorithm, PatchMatch, especially suited for the computation of dense fields over images. We adapt the matching algorithm to deal efficiently with invariant features, so as to achieve higher robustness with respect to rotations and scale changes. Moreover, leveraging on the smoothness of the output field, we implement a simplified and reliable postprocessing procedure. The experimental analysis, conducted on databases available online, proves the proposed technique to be at least as accurate, generally more robust, and typically much faster than the state-of-the-art dense-field references.

Journal ArticleDOI
Jun Zhang, Xiao Chen, Yang Xiang, Wanlei Zhou, Jie Wu
TL;DR: The proposed RTC scheme has the capability of identifying the traffic of zero-day applications as well as accurately discriminating predefined application classes and is significantly better than four state-of-the-art methods.
Abstract: As a fundamental tool for network management and security, traffic classification has attracted increasing attention in recent years. A significant challenge to the robustness of classification performance comes from zero-day applications previously unknown in traffic classification systems. In this paper, we propose a new scheme of Robust statistical Traffic Classification (RTC) by combining supervised and unsupervised machine learning techniques to meet this challenge. The proposed RTC scheme has the capability of identifying the traffic of zero-day applications as well as accurately discriminating predefined application classes. In addition, we develop a new method for automating the RTC scheme parameters optimization process. The empirical study on real-world traffic data confirms the effectiveness of the proposed scheme. When zero-day applications are present, the classification performance of the new scheme is significantly better than four state-of-the-art methods: random forest, correlation-based classification, semi-supervised clustering, and one-class SVM.
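
A compressed, hypothetical paraphrase of the supervised/unsupervised combination (not the authors' code): cluster all flows, let clusters dominated by labeled flows of one known class inherit that label, and flag unsupported clusters as candidate zero-day traffic (integer-coded class labels assumed):

```python
import numpy as np
from sklearn.cluster import KMeans

def flag_zero_day(X_unlabeled, X_labeled, y_labeled, n_clusters=50, purity=0.8):
    """Clusters dominated by labeled flows of one known class inherit that
    label; clusters with no labeled support are flagged as candidate zero-day."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    km.fit(np.vstack([X_unlabeled, X_labeled]))
    labeled_assign = km.predict(X_labeled)
    cluster_label = {}
    for c in range(n_clusters):
        ys = y_labeled[labeled_assign == c]
        if len(ys) and (np.bincount(ys).max() / len(ys)) >= purity:
            cluster_label[c] = np.bincount(ys).argmax()
    unknown = [c for c in range(n_clusters) if c not in cluster_label]
    return cluster_label, unknown   # 'unknown' clusters hold suspected zero-day flows
```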

Proceedings ArticleDOI
07 Sep 2015
TL;DR: This paper presents DeepEar -- the first mobile audio sensing framework built from coupled Deep Neural Networks (DNNs) that simultaneously perform common audio sensing tasks and shows DeepEar is feasible for smartphones by building a cloud-free DSP-based prototype that runs continuously, using only 6% of the smartphone's battery daily.
Abstract: Microphones are remarkably powerful sensors of human behavior and context. However, audio sensing is highly susceptible to wild fluctuations in accuracy when used in the diverse acoustic environments (such as bedrooms, vehicles, or cafes) that users encounter on a daily basis. Towards addressing this challenge, we turn to the field of deep learning, an area of machine learning that has radically changed related audio modeling domains like speech recognition. In this paper, we present DeepEar -- the first mobile audio sensing framework built from coupled Deep Neural Networks (DNNs) that simultaneously perform common audio sensing tasks. We train DeepEar with a large-scale dataset including unlabeled data from 168 place visits. The resulting learned model, involving 2.3M parameters, enables DeepEar to significantly increase inference robustness to background noise beyond conventional approaches present in mobile devices. Finally, we show DeepEar is feasible for smartphones by building a cloud-free DSP-based prototype that runs continuously, using only 6% of the smartphone's battery daily.

Journal ArticleDOI
TL;DR: In this article, the authors propose an adaptive robust optimization model for multi-period economic dispatch, introducing dynamic uncertainty sets and methods to construct them, in order to model the temporal and spatial correlations of highly intermittent and uncertain wind power.
Abstract: The exceptional benefits of wind power as an environmentally responsible renewable energy resource have led to an increasing penetration of wind energy in today's power systems. This trend has started to reshape the paradigms of power system operations, as dealing with uncertainty caused by the highly intermittent and uncertain wind power becomes a significant issue. Motivated by this, we present a new framework using adaptive robust optimization for the economic dispatch of power systems with high level of wind penetration. In particular, we propose an adaptive robust optimization model for multi-period economic dispatch, and introduce the concept of dynamic uncertainty sets and methods to construct such sets to model temporal and spatial correlations of uncertainty. We also develop a simulation platform which combines the proposed robust economic dispatch model with statistical prediction tools in a rolling horizon framework. We have conducted extensive computational experiments on this platform using real wind data. The results are promising and demonstrate the benefits of our approach in terms of cost and reliability over existing robust optimization models as well as recent look-ahead dispatch models.

Posted Content
TL;DR: A new and simple way of finding adversarial examples is presented and experimentally shown to be efficient and greatly improves the robustness of the classification models produced.
Abstract: The robustness of neural networks to intended perturbations has recently attracted significant attention. In this paper, we propose a new method, "learning with a strong adversary", that learns robust classifiers from supervised data. The proposed method takes finding adversarial examples as an intermediate step. A new and simple way of finding adversarial examples is presented and experimentally shown to be efficient. Experimental results demonstrate that the resulting learning method greatly improves the robustness of the classification models produced.
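
The paper's specific procedure for constructing adversarial examples differs in detail, but the loop it describes (find a perturbation that raises the loss, then train on the perturbed input) can be sketched with a gradient-sign perturbation (PyTorch; the epsilon below is an illustrative choice):

```python
import torch

def adversarial_step(model, loss_fn, x, y, optimizer, eps=0.1):
    """One step of adversarial training: craft a perturbation that increases
    the loss, then minimize the loss on the perturbed input."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        x_adv = x + eps * x_adv.grad.sign()  # strongest linear move in an L-inf ball
    optimizer.zero_grad()
    adv_loss = loss_fn(model(x_adv), y)      # train against the adversarial example
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```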

Journal ArticleDOI
TL;DR: It is shown that when used in a non-conventional regime, STT-MTJs can additionally act as a stochastic memristive device, appropriate to implement a “synaptic” function in robust, low power, cognitive-type systems.
Abstract: Spin-transfer torque magnetic memory (STT-MRAM) is currently under intense academic and industrial development, since it features non-volatility, high write and read speed and high endurance. In this work, we show that when used in a non-conventional regime, it can additionally act as a stochastic memristive device, appropriate to implement a “synaptic” function. We introduce basic concepts relating to spin-transfer torque magnetic tunnel junction (STT-MTJ, the STT-MRAM cell) behavior and its possible use to implement learning-capable synapses. Three programming regimes (low, intermediate and high current) are identified and compared. System-level simulations on a task of vehicle counting highlight the potential of the technology for learning systems. Monte Carlo simulations show its robustness to device variations. The simulations also allow comparing system operation when the different programming regimes of STT-MTJs are used. In comparison to the high and low current regimes, the intermediate current regime allows minimization of energy consumption, while retaining a high robustness to device variations. These results open the way for unexplored applications of STT-MTJs in robust, low power, cognitive-type systems.

Journal ArticleDOI
TL;DR: A novel Robust Structured Subspace Learning (RSSL) algorithm by integrating image understanding and feature learning into a joint learning framework is proposed, and the learned subspace is adopted as an intermediate space to reduce the semantic gap between the low-level visual features and the high-level semantics.
Abstract: To uncover an appropriate latent subspace for data representation, in this paper we propose a novel Robust Structured Subspace Learning (RSSL) algorithm by integrating image understanding and feature learning into a joint learning framework. The learned subspace is adopted as an intermediate space to reduce the semantic gap between the low-level visual features and the high-level semantics. To guarantee that the subspace is compact and discriminative, the intrinsic geometric structure of the data, and the local and global structural consistencies over labels, are exploited simultaneously in the proposed algorithm. Besides, we adopt the ℓ2,1-norm for the formulations of the loss function and regularization respectively, to make our algorithm robust to outliers and noise. An efficient algorithm is designed to solve the proposed optimization problem. It is noted that the proposed framework is a general one which can leverage several well-known algorithms as special cases and elucidate their intrinsic relationships. To validate the effectiveness of the proposed method, extensive experiments are conducted on diverse datasets for different image understanding tasks, i.e., image tagging, clustering, and classification, and more encouraging results are achieved compared with some state-of-the-art approaches.
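
The ℓ2,1-norm that gives RSSL its robustness sums the ℓ2 norms of matrix rows, so minimizing it drives entire rows (whole samples or features) toward zero rather than individual entries. A one-function sketch:

```python
import numpy as np

def l21_norm(M):
    """Sum of row-wise l2 norms: outlier rows contribute linearly rather than
    quadratically, and minimization encourages entire rows to vanish
    (structured sparsity)."""
    return np.linalg.norm(M, axis=1).sum()
```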

Journal ArticleDOI
TL;DR: Based on eCMP, VRP, and DCM, methods for real-time planning and tracking control of DCM trajectories in 3-D are presented and the robustness of the proposed control framework is examined.
Abstract: In this paper, the concept of divergent component of motion (DCM, also called “Capture Point”) is extended to 3-D. We introduce the “Enhanced Centroidal Moment Pivot point” (eCMP) and the “Virtual Repellent Point” (VRP), which allow for the encoding of both direction and magnitude of the external forces and the total force (i.e., external plus gravitational forces) acting on the robot. Based on eCMP, VRP, and DCM, we present methods for real-time planning and tracking control of DCM trajectories in 3-D. The basic DCM trajectory generator is extended to produce continuous leg force profiles and to facilitate the use of toe-off motion during double support. The robustness of the proposed control framework is thoroughly examined, and its capabilities are verified both in simulations and experiments.
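
Assuming the standard definitions from the linear-inverted-pendulum literature that this line of work builds on, the DCM is xi = x + xdot/omega with omega = sqrt(g/dz), and it is "repelled" by the VRP via xi_dot = omega * (xi - vrp). A small sketch of these quantities:

```python
import numpy as np

G = 9.81  # gravitational acceleration [m/s^2]

def dcm(com_pos, com_vel, delta_z):
    """Divergent Component of Motion: xi = x + x_dot / omega, omega = sqrt(g/dz)."""
    omega = np.sqrt(G / delta_z)
    return com_pos + com_vel / omega

def dcm_rate(xi, vrp, delta_z):
    """DCM dynamics: xi_dot = omega * (xi - vrp). The VRP 'repels' the DCM,
    which is what makes preview planning of DCM trajectories natural."""
    omega = np.sqrt(G / delta_z)
    return omega * (xi - vrp)
```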

Journal Article
TL;DR: The empirical results show that the CRM objective implemented in POEM provides improved robustness and generalization performance compared to the state-of-the-art, and a decomposition of the POEM objective that enables efficient stochastic gradient optimization is presented.
Abstract: We develop a learning principle and an efficient algorithm for batch learning from logged bandit feedback. This learning setting is ubiquitous in online systems (e.g., ad placement, web search, recommendation), where an algorithm makes a prediction (e.g., ad ranking) for a given input (e.g., query) and observes bandit feedback (e.g., user clicks on presented ads). We first address the counterfactual nature of the learning problem (Bottou et al., 2013) through propensity scoring. Next, we prove generalization error bounds that account for the variance of the propensity-weighted empirical risk estimator. In analogy to the Structural Risk Minimization principle of Vapnik and Chervonenkis (1979), these constructive bounds give rise to the Counterfactual Risk Minimization (CRM) principle. We show how CRM can be used to derive a new learning method--called Policy Optimizer for Exponential Models (POEM)--for learning stochastic linear rules for structured output prediction. We present a decomposition of the POEM objective that enables efficient stochastic gradient optimization. The effectiveness and efficiency of POEM is evaluated on several simulated multi-label classification problems, as well as on a real-world information retrieval problem. The empirical results show that the CRM objective implemented in POEM provides improved robustness and generalization performance compared to the state-of-the-art.
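
The CRM objective itself is compact: a clipped, propensity-weighted empirical risk plus a variance penalty motivated by the generalization bound. A sketch of the estimator (the trade-off constant and clipping value are illustrative):

```python
import numpy as np

def crm_objective(losses, propensities, new_probs, lam=0.5, clip=100.0):
    """Counterfactual Risk Minimization estimate: importance-weighted risk
    from logged bandit feedback plus a penalty on the estimator's variance."""
    w = np.minimum(new_probs / propensities, clip)  # clipped importance weights
    r = losses * w                                  # per-example counterfactual risk
    return r.mean() + lam * np.sqrt(r.var(ddof=1) / len(r))
```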

Proceedings ArticleDOI
19 Apr 2015
TL;DR: A learning-based approach that can learn from a large amount of simulated noisy and reverberant microphone array inputs for robust DOA estimation and uses a multilayer perceptron neural network to learn the nonlinear mapping from such features to the DOA.
Abstract: This paper presents a learning-based approach to the task of direction of arrival (DOA) estimation from microphone array input. Traditional signal processing methods such as the classic least square (LS) method rely on strong assumptions on signal models and accurate estimations of time delay of arrival (TDOA). They only work well in relatively clean conditions, but suffer from noise and reverberation distortions. In this paper, we propose a learning-based approach that can learn from a large amount of simulated noisy and reverberant microphone array inputs for robust DOA estimation. Specifically, we extract features from the generalised cross correlation (GCC) vectors and use a multilayer perceptron neural network to learn the nonlinear mapping from such features to the DOA. One advantage of the learning-based method is that as more and more training data becomes available, the DOA estimation will become more and more accurate. Experimental results on simulated data show that the proposed learning-based method produces much better results than the state-of-the-art LS method. The testing results on real data recorded in meeting rooms show improved root-mean-square error (RMSE) compared to the LS method.
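
The GCC features feeding the network are typically computed with the phase transform, which whitens the cross-spectrum so that only delay information remains. A sketch for one microphone pair (the number of retained delay bins is an assumption):

```python
import numpy as np

def gcc_phat(sig_a, sig_b, n_bins=51):
    """GCC-PHAT for one microphone pair: whiten the cross-spectrum so only
    phase (i.e., the time delay) remains, then keep the central delay bins."""
    n = len(sig_a) + len(sig_b)
    A, B = np.fft.rfft(sig_a, n), np.fft.rfft(sig_b, n)
    cross = A * np.conj(B)
    cc = np.fft.irfft(cross / (np.abs(cross) + 1e-12), n)
    cc = np.concatenate((cc[-(n_bins // 2):], cc[:n_bins // 2 + 1]))  # center zero delay
    return cc  # one such vector per mic pair is concatenated and fed to the MLP
```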

Journal ArticleDOI
TL;DR: Tackling software data issues, including redundancy, correlation, feature irrelevance and missing samples, with the proposed combined learning model resulted in remarkable classification performance paving the way for successful quality control.
Abstract: Context: Several issues affect software defect data, including redundancy, correlation, feature irrelevance, and missing samples. It is also hard to ensure balanced distribution between data pertaining to defective and non-defective software. In most experimental cases, data related to the latter software class is dominantly present in the dataset. Objective: The objectives of this paper are to demonstrate the positive effects of combining feature selection and ensemble learning on the performance of defect classification. Along with efficient feature selection, a new two-variant (with and without feature selection) ensemble learning algorithm is proposed to provide robustness to both data imbalance and feature redundancy. Method: We carefully combine selected ensemble learning models with efficient feature selection to address these issues and mitigate their effects on the defect classification performance. Results: Forward selection showed that only a few features contribute to high area under the receiver-operating curve (AUC). On the tested datasets, the greedy forward selection (GFS) method outperformed other feature selection techniques such as Pearson's correlation. This suggests that features are highly unstable. However, ensemble learners like random forests and the proposed algorithm, average probability ensemble (APE), are not as affected by poor features as weighted support vector machines (W-SVMs). Moreover, the APE model combined with greedy forward selection (enhanced APE) achieved AUC values of approximately 1.0 for the NASA datasets PC2, PC4, and MC1. Conclusion: This paper shows that the features of a software dataset must be carefully selected for accurate classification of defective components. Furthermore, tackling the software data issues mentioned above with the proposed combined learning model resulted in remarkable classification performance, paving the way for successful quality control.
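
Greedy forward selection as used here repeatedly adds the single feature that most improves validation AUC and stops when nothing helps. A compact scikit-learn sketch (the random-forest scorer stands in for the paper's classifiers):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def greedy_forward_selection(X, y, max_features=10):
    """Add, one at a time, the feature that most improves validation AUC."""
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
    chosen, best_auc = [], 0.0
    for _ in range(max_features):
        scores = {}
        for f in range(X.shape[1]):
            if f in chosen:
                continue
            cols = chosen + [f]
            clf = RandomForestClassifier(n_estimators=50, random_state=0)
            clf.fit(X_tr[:, cols], y_tr)
            scores[f] = roc_auc_score(y_va, clf.predict_proba(X_va[:, cols])[:, 1])
        f_best = max(scores, key=scores.get)
        if scores[f_best] <= best_auc:   # stop when no remaining feature helps
            break
        chosen.append(f_best)
        best_auc = scores[f_best]
    return chosen, best_auc
```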

Journal ArticleDOI
TL;DR: The rPPG method developed in this study has a performance that is very close to that of the contact-based sensor under realistic situations, while its computational efficiency allows real-time processing on an off-the-shelf computer.
Abstract: Remote photoplethysmography (rPPG) techniques can measure cardiac activity by detecting pulse-induced color variations on human skin using an RGB camera. State-of-the-art rPPG methods are sensitive to subject body motions (e.g., motion-induced color distortions). This study proposes a novel framework to improve the motion robustness of rPPG. The basic idea of this paper originates from the observation that a camera can simultaneously sample multiple skin regions in parallel, and each of them can be treated as an independent sensor for pulse measurement. The spatial redundancy of an image sensor can thus be exploited to distinguish the pulse signal from motion-induced noise. To this end, the pixel-based rPPG sensors are constructed to estimate a robust pulse signal using motion-compensated pixel-to-pixel pulse extraction, spatial pruning, and temporal filtering. The evaluation of this strategy is not based on a full clinical trial, but on 36 challenging benchmark videos consisting of subjects that differ in gender, skin types, and performed motion categories. Experimental results show that the proposed method improves the SNR of the state-of-the-art rPPG technique from 3.34 to 6.76 dB, and the agreement (±1.96σ) with instantaneous reference pulse rate from 55% to 80% correct. ANOVA with post hoc comparison shows that the improvement on motion robustness is significant. The rPPG method developed in this study has a performance that is very close to that of the contact-based sensor under realistic situations, while its computational efficiency allows real-time processing on an off-the-shelf computer.

Journal ArticleDOI
TL;DR: This paper proposes a taxonomy of robustness frameworks to compare and contrast these approaches based on their methods of alternative generation, sampling of states of the world, quantification of robustness measures, and sensitivity analysis to identify important uncertainties.
Abstract: Water systems planners have long recognized the need for robust solutions capable of withstanding deviations from the conditions for which they were designed. Robustness analyses have shifted from expected utility to exploratory bottom-up approaches which identify vulnerable scenarios prior to assigning likelihoods. Examples include Robust Decision Making (RDM), Decision Scaling, Info-Gap, and Many-Objective Robust Decision Making (MORDM). We propose a taxonomy of robustness frameworks to compare and contrast these approaches based on their methods of (1) alternative generation, (2) sampling of states of the world, (3) quantification of robustness measures, and (4) sensitivity analysis to identify important uncertainties. Building from the proposed taxonomy, we use a regional urban water supply case study in the Research Triangle region of North Carolina to illustrate the decision-relevant consequences that emerge from each of these choices. Results indicate that the methodological choices in the ...

Journal ArticleDOI
TL;DR: This paper provides an efficient EMR-SLRA optimization procedure to obtain the output feature embedding; experiments on pattern recognition applications confirm the effectiveness of the EMR-SLRA algorithm compared with other multiview feature dimensionality reduction approaches.

Journal ArticleDOI
TL;DR: In this article, the authors compare and classify multiple Fourier ptychography inverse algorithms in terms of experimental robustness and find that the main sources of error are noise, aberrations and mis-calibration (i.e. model mis-match).
Abstract: Fourier ptychography is a new computational microscopy technique that provides gigapixel-scale intensity and phase images with both wide field-of-view and high resolution. By capturing a stack of low-resolution images under different illumination angles, an inverse algorithm can be used to computationally reconstruct the high-resolution complex field. Here, we compare and classify multiple proposed inverse algorithms in terms of experimental robustness. We find that the main sources of error are noise, aberrations and mis-calibration (i.e. model mis-match). Using simulations and experiments, we demonstrate that the choice of cost function plays a critical role, with amplitude-based cost functions performing better than intensity-based ones. The reason for this is that Fourier ptychography datasets consist of images from both brightfield and darkfield illumination, representing a large range of measured intensities. Both noise (e.g. Poisson noise) and model mis-match errors are shown to scale with intensity. Hence, algorithms that use an appropriate cost function will be more tolerant to both noise and model mis-match. Given these insights, we propose a global Newton’s method algorithm which is robust and accurate. Finally, we discuss the impact of procedures for algorithmic correction of aberrations and mis-calibration.
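
The cost-function finding is easy to state concretely: with measured intensities I and a forward-model field estimate psi, the amplitude cost compares square roots while the intensity cost compares raw intensities, so the latter lets bright brightfield frames dominate the fit. A sketch:

```python
import numpy as np

def amplitude_cost(I_measured, psi_estimate):
    """Amplitude-based cost: errors scale with sqrt(I), tolerating the huge
    brightfield/darkfield intensity range better."""
    return np.sum((np.sqrt(I_measured) - np.abs(psi_estimate)) ** 2)

def intensity_cost(I_measured, psi_estimate):
    """Intensity-based cost: errors scale with I, so bright images dominate."""
    return np.sum((I_measured - np.abs(psi_estimate) ** 2) ** 2)
```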

Journal ArticleDOI
TL;DR: A novel RL-based robust adaptive control algorithm is developed for a class of continuous-time uncertain nonlinear systems subject to input constraints; the robust control problem is converted to a constrained optimal control problem by appropriately selecting value functions for the nominal system.
Abstract: The design of a stabilizing controller for uncertain nonlinear systems with control constraints is a challenging problem. The constrained input, coupled with the inability to accurately identify the uncertainties, motivates the design of a stabilizing controller based on reinforcement-learning (RL) methods. In this paper, a novel RL-based robust adaptive control algorithm is developed for a class of continuous-time uncertain nonlinear systems subject to input constraints. The robust control problem is converted to the constrained optimal control problem with appropriately selected value functions for the nominal system. Distinct from the typical actor-critic dual networks employed in RL, only one critic neural network (NN) is constructed to derive the approximate optimal control. Meanwhile, unlike the initial stabilizing control often indispensable in RL, there is no special requirement imposed on the initial control. By utilizing Lyapunov's direct method, the closed-loop optimal control system and the estimated weights of the critic NN are proved to be uniformly ultimately bounded. In addition, the derived approximate optimal control is verified to guarantee that the uncertain nonlinear system is stable in the sense of uniform ultimate boundedness. Two simulation examples are provided to illustrate the effectiveness and applicability of the present approach.