
Showing papers on "Statistical learning theory published in 2019"


Journal ArticleDOI
TL;DR: In this article, the authors reformulate coarse-graining as a supervised machine learning problem and use statistical learning theory to decompose the coarse-graining error and cross-validation to select and compare the performance of different models.
Abstract: Atomistic or ab initio molecular dynamics simulations are widely used to predict thermodynamics and kinetics and relate them to molecular structure. A common approach to go beyond the time- and length-scales accessible with such computationally expensive simulations is the definition of coarse-grained molecular models. Existing coarse-graining approaches define an effective interaction potential to match defined properties of high-resolution models or experimental data. In this paper, we reformulate coarse-graining as a supervised machine learning problem. We use statistical learning theory to decompose the coarse-graining error and cross-validation to select and compare the performance of different models. We introduce CGnets, a deep learning approach, that learns coarse-grained free energy functions and can be trained by a force-matching scheme. CGnets maintain all physically relevant invariances and allow one to incorporate prior physics knowledge to avoid sampling of unphysical structures. We show tha...
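The force-matching idea behind this training scheme can be sketched in a few lines; the quadratic toy potential, the synthetic data, and the closed-form fit below are illustrative assumptions, not the paper's CGnet model:

```python
import numpy as np

# Toy force-matching fit: learn a coarse-grained force field f_k(x) = -k*x
# (i.e. U(x) = 0.5*k*x^2) from noisy reference forces. The quadratic model
# and the synthetic data are illustrative, not the CGnet setup.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=200)          # coarse-grained coordinates
f_ref = -4.0 * x + rng.normal(0.0, 0.1, 200)  # reference (all-atom) forces

# Force matching = least squares on forces; for this model it is closed-form.
k = -np.sum(f_ref * x) / np.sum(x * x)
fm_loss = np.mean(((-k * x) - f_ref) ** 2)    # mean squared force error
```

The fitted stiffness k recovers the value used to generate the data, and the residual force-matching loss reduces to the noise level.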

298 citations


Journal ArticleDOI
TL;DR: Ban and Rudin take an innovative machine-learning approach to a classic problem solved by almost every newsvendor, the Big Data Newsvendor problem.
Abstract: In Ban and Rudin’s (2018) “The Big Data Newsvendor: Practical Insights from Machine Learning,” the authors take an innovative machine-learning approach to a classic problem solved by almost every c...

222 citations


Journal ArticleDOI
TL;DR: The current thrust of development in machine learning and artificial intelligence, fueled by advances in statistical learning theory over the last 20 years and commercial successes by leading big data companies, is introduced.

184 citations


Proceedings ArticleDOI
20 May 2019
TL;DR: This work addresses the problem of ε-approximate database reconstruction (ε-ADR) from range query leakage, giving attacks whose query cost scales only with the relative error ε, and is independent of the size of the database, or the number N of possible values of data items.
Abstract: We show that the problem of reconstructing encrypted databases from access pattern leakage is closely related to statistical learning theory. This new viewpoint enables us to develop broader attacks that are supported by streamlined performance analyses. First, we address the problem of ε-approximate database reconstruction (ε-ADR) from range query leakage, giving attacks whose query cost scales only with the relative error ε and is independent of the size of the database or the number N of possible values of data items. This already goes significantly beyond the state-of-the-art for such attacks, as represented by Kellaris et al. (ACM CCS 2016) and Lacharite et al. (IEEE S&P). Using real data, we show that devastatingly small numbers of queries are needed to attain very accurate database reconstruction. Finally, we generalize from ranges to consider what learning theory tells us about the impact of access pattern leakage for other classes of queries, focusing on prefix and suffix queries. We illustrate this with both concrete attacks for prefix queries and with a general lower bound for all query classes. We also show a very general reduction from reconstruction with known or chosen queries to PAC learning.

97 citations



Journal ArticleDOI
TL;DR: A novel scalable PU learning algorithm that is theoretically proven to provide the optimal solution, while showing superior computational and memory performance, is proposed and successfully applied to a large variety of real-world problems involving PU learning.
Abstract: Positive unlabeled (PU) learning is useful in various practical situations, where there is a need to learn a classifier for a class of interest from an unlabeled data set, which may contain anomalies as well as samples from unknown classes. The learning task can be formulated as an optimization problem under the framework of statistical learning theory. Recent studies have theoretically analyzed its properties and generalization performance, nevertheless, little effort has been made to consider the problem of scalability, especially when large sets of unlabeled data are available. In this work we propose a novel scalable PU learning algorithm that is theoretically proven to provide the optimal solution, while showing superior computational and memory performance. Experimental evaluation confirms the theoretical evidence and shows that the proposed method can be successfully applied to a large variety of real-world problems involving PU learning.
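The abstract does not give the paper's scalable algorithm, but the optimization problem PU learning poses can be illustrated with the standard non-negative PU risk estimator from the literature; the known class prior and the synthetic decision values below are assumptions for illustration:

```python
import numpy as np

def sigmoid_loss(z):
    # Smooth surrogate for the 0-1 loss.
    return 1.0 / (1.0 + np.exp(z))

def nn_pu_risk(g_pos, g_unl, pi):
    """Non-negative PU risk estimate for decision values g(x).
    g_pos: decision values on labelled positives
    g_unl: decision values on unlabelled data
    pi:    (assumed known) class prior of the positive class
    """
    r_pos = np.mean(sigmoid_loss(g_pos))        # positives scored positive
    r_pos_neg = np.mean(sigmoid_loss(-g_pos))   # positives scored negative
    r_unl_neg = np.mean(sigmoid_loss(-g_unl))   # unlabelled scored negative
    # Clamping at 0 keeps the negative-class risk estimate non-negative.
    return pi * r_pos + max(0.0, r_unl_neg - pi * r_pos_neg)

rng = np.random.default_rng(1)
g_pos = rng.normal(2.0, 1.0, 500)               # classifier scores positives high
g_unl = np.concatenate([rng.normal(2.0, 1.0, 300), rng.normal(-2.0, 1.0, 700)])
risk = nn_pu_risk(g_pos, g_unl, pi=0.3)
```

Minimizing such a risk over a model class is the optimization problem the statistical-learning-theory analysis studies; the paper's contribution is making that minimization scale.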

58 citations


Journal ArticleDOI
TL;DR: The Algorithmic Stability framework is relied on to prove learning bounds for unsupervised concept drift detection on data streams, and the Plover algorithm is designed to detect drifts using different measure functions, such as Statistical Moments and the Power Spectrum.
Abstract: Motivated by the Statistical Learning Theory (SLT), which provides a theoretical framework to ensure when supervised learning algorithms generalize input data, this manuscript relies on the Algorithmic Stability framework to prove learning bounds for the unsupervised concept drift detection on data streams. Based on such proof, we also designed the Plover algorithm to detect drifts using different measure functions, such as Statistical Moments and the Power Spectrum. In this way, the criterion for issuing data changes can also be adapted to better address the target task. From synthetic and real-world scenarios, we observed that each data stream may require a different measure function to identify concept drifts, according to the underlying characteristics of the corresponding application domain. In addition, we discuss the differences between our approach and others from the literature, and show illustrative results confirming the usefulness of our proposal.
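A deliberately simple moment-based detector in the spirit of Plover's measure functions can be sketched as follows; the window size, the z-score criterion, and the threshold are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

def moment_drift(stream, window=100, threshold=3.0):
    """Flag window starts whose mean deviates from the first (reference)
    window's mean by more than `threshold` standard errors. A simple
    stand-in for moment-based measure functions."""
    ref = stream[:window]
    mu = ref.mean()
    se = ref.std(ddof=1) / np.sqrt(window)
    flagged = []
    for start in range(window, len(stream) - window + 1, window):
        w = stream[start:start + window]
        if abs(w.mean() - mu) / se > threshold:
            flagged.append(start)
    return flagged

rng = np.random.default_rng(2)
# Mean shift halfway through the stream: N(0,1) -> N(1.5,1).
stream = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(1.5, 1.0, 500)])
flagged = moment_drift(stream)
```

Windows drawn from the shifted half of the stream are flagged; swapping the mean statistic for higher moments or a power-spectrum summary changes which kinds of drift the detector is sensitive to.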

53 citations


Journal ArticleDOI
TL;DR: By accounting for the trade-off between model’s complexity and fitting ability, the proposed approach avoids the problem of over-fitting and enhances the generalization ability to non-reference alternatives and belongs to the family of preference disaggregation approaches.

51 citations


Journal ArticleDOI
TL;DR: In the LUSI paradigm, in order to construct the desired classification function, a learning machine computes statistical invariants that are specific to the problem, and then minimizes the expected error in a way that preserves these invariants; it is thus both data- and invariant-driven learning.
Abstract: This paper introduces a new learning paradigm, called Learning Using Statistical Invariants (LUSI), which is different from the classical one. In a classical paradigm, the learning machine constructs a classification rule that minimizes the probability of expected error; it is a data-driven model of learning. In the LUSI paradigm, in order to construct the desired classification function, a learning machine computes statistical invariants that are specific for the problem, and then minimizes the expected error in a way that preserves these invariants; it is thus both data- and invariant-driven learning. From a mathematical point of view, methods of the classical paradigm employ mechanisms of strong convergence of approximations to the desired function, whereas methods of the new paradigm employ both strong and weak convergence mechanisms. This can significantly increase the rate of convergence.

48 citations


Posted Content
TL;DR: A new framework, termed Bayes-Stability, is developed for proving algorithm-dependent generalization error bounds for learning general non-convex objectives and it is demonstrated that the data-dependent bounds can distinguish randomly labelled data from normal data.
Abstract: Generalization error (also known as the out-of-sample error) measures how well the hypothesis learned from training data generalizes to previously unseen data. Proving tight generalization error bounds is a central question in statistical learning theory. In this paper, we obtain generalization error bounds for learning general non-convex objectives, which has attracted significant attention in recent years. We develop a new framework, termed Bayes-Stability, for proving algorithm-dependent generalization error bounds. The new framework combines ideas from both the PAC-Bayesian theory and the notion of algorithmic stability. Applying the Bayes-Stability method, we obtain new data-dependent generalization bounds for stochastic gradient Langevin dynamics (SGLD) and several other noisy gradient methods (e.g., with momentum, mini-batch and acceleration, Entropy-SGD). Our result recovers (and is typically tighter than) a recent result in Mou et al. (2018) and improves upon the results in Pensia et al. (2018). Our experiments demonstrate that our data-dependent bounds can distinguish randomly labelled data from normal data, which provides an explanation to the intriguing phenomena observed in Zhang et al. (2017a). We also study the setting where the total loss is the sum of a bounded loss and an additional \ell_2 regularization term. We obtain new generalization bounds for the continuous Langevin dynamic in this setting by developing a new Log-Sobolev inequality for the parameter distribution at any time. Our new bounds are more desirable when the noisy level of the process is not small, and do not become vacuous even when T tends to infinity.
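The SGLD dynamics analyzed in the paper can be sketched on a toy quadratic loss; the step size, inverse temperature, and the loss itself are illustrative choices, and the full gradient stands in for a minibatch:

```python
import numpy as np

# Stochastic gradient Langevin dynamics (SGLD) on L(w) = 0.5*||w - w_star||^2.
# The injected Gaussian noise is the ingredient the stability-based analyses
# exploit; all constants here are illustrative.
rng = np.random.default_rng(3)
w_star = np.array([1.0, -2.0])
w = np.zeros(2)
eta, beta = 0.1, 1e4                  # step size, inverse temperature

for _ in range(500):
    grad = w - w_star                 # exact gradient stands in for a minibatch
    w = w - eta * grad + rng.normal(size=2) * np.sqrt(2.0 * eta / beta)
```

At stationarity the iterates concentrate around the minimizer with spread controlled by beta; the paper's bounds relate this noise level to generalization.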

45 citations


Journal ArticleDOI
TL;DR: A Semi-Supervised Metric Transfer Learning framework called SSMT is proposed that reduces the distribution gap between domains both statistically and geometrically by learning the instance weights, while a regularized distance metric is learned to minimize the within-class covariance and maximize the between-class covariance simultaneously for the target domain.
Abstract: A common assumption of statistical learning theory is that the training and testing data are drawn from the same distribution. However, in many real-world applications, this assumption does not hold true. Hence, a realistic strategy, Cross Domain Adaptation (DA) or Transfer Learning (TL), can be used to employ previously labelled source domain data to boost the task in the new target domain. Previous Cross Domain Adaptation methods have focused on re-weighting the instances or aligning the cross-domain distributions. However, these methods face two significant challenges: (1) there is no proper consideration of the unlabelled data of the target task, even though in the real world an abundant amount of unlabelled data is available; (2) the use of a normal Euclidean distance function fails to capture the appropriate similarity or dissimilarity between samples. To deal with these issues, we propose a Semi-Supervised Metric Transfer Learning framework called SSMT that reduces the distribution gap between domains both statistically and geometrically by learning the instance weights, while a regularized distance metric is learned to minimize the within-class covariance and maximize the between-class covariance simultaneously for the target domain. Compared with previous works, where the Mahalanobis distance metric and instance weights are learned using the labelled data or in a pipelined framework that leads to a decrease in performance, our proposed SSMT learns a regularized distance metric and instance weights by considering unlabelled data in a parallel framework. Experimental evaluation on three cross-domain visual data sets, e.g., PIE Face, handwriting digit recognition on MNIST–USPS, and object recognition, demonstrates the effectiveness of our designed approach in facilitating the unlabelled target task learning, compared to current state-of-the-art domain adaptation approaches.

Journal ArticleDOI
TL;DR: A support vector regression (a nonparametric machine learning approach)-based pitch curve is presented and its application to anomaly detection explored for wind turbine condition monitoring.
Abstract: The unexpected failure of wind turbine components leads to significant downtime and loss of revenue. To prevent this, supervisory control and data acquisition (SCADA) based condition monitoring is considered a cost-effective approach. In several studies, the wind turbine power curve has been used as a critical indicator for power performance assessment. In contrast, the application of the blade pitch angle curve has hardly been explored for wind turbine condition monitoring purposes. The blade pitch angle curve describes the nonlinear relationship between pitch angle and hub height wind speed and can be used for the detection of faults. A support vector machine (SVM) is an improved version of artificial neural networks (ANNs) and is widely used for classification- and regression-related problems. Support vector regression is a data-driven approach based on statistical learning theory and a structural risk minimization principle which provides useful nonlinear system modeling. In this paper, a support vector regression (a nonparametric machine learning approach)-based pitch curve is presented and its application to anomaly detection explored for wind turbine condition monitoring. A radial basis function (RBF) was used as the kernel function for effective SVR blade pitch curve modeling. This approach is then compared with a binned pitch curve in the identification of operational anomalies. The paper will outline the advantages and limitations of these techniques.
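The binned pitch curve baseline that the SVR approach is compared against is easy to sketch; the bin count, the 3-sigma threshold, and the synthetic pitch-vs-wind-speed curve below are assumptions for illustration:

```python
import numpy as np

def binned_pitch_curve(wind, pitch, n_bins=10):
    # Reference curve: mean and std of pitch angle per wind-speed bin.
    edges = np.linspace(wind.min(), wind.max(), n_bins + 1)
    idx = np.clip(np.digitize(wind, edges) - 1, 0, n_bins - 1)
    mean = np.array([pitch[idx == b].mean() for b in range(n_bins)])
    std = np.array([pitch[idx == b].std(ddof=1) for b in range(n_bins)])
    return edges, mean, std

def is_anomaly(wind, pitch, edges, mean, std, k=3.0):
    # Flag points further than k bin standard deviations from the bin mean.
    idx = np.clip(np.digitize(wind, edges) - 1, 0, len(mean) - 1)
    return np.abs(pitch - mean[idx]) > k * std[idx]

rng = np.random.default_rng(4)
wind = rng.uniform(4.0, 25.0, 1000)                               # m/s
pitch = np.clip(wind - 12.0, 0.0, None) + rng.normal(0.0, 0.3, 1000)
edges, mean, std = binned_pitch_curve(wind, pitch)
# A stuck blade reporting 0 deg pitch at 20 m/s should be flagged:
faulty = is_anomaly(np.array([20.0]), np.array([0.0]), edges, mean, std)
```

An SVR-based curve plays the same role as the binned reference, but models the nonlinear relationship smoothly instead of piecewise per bin.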

Posted Content
TL;DR: This work derives a procedure that allows for learning from all available sources, yet automatically suppresses irrelevant or corrupted data, and shows that this method provides significant improvements over alternative approaches from robust statistics and distributed optimization.
Abstract: Modern machine learning methods often require more data for training than a single expert can provide. Therefore, it has become a standard procedure to collect data from external sources, e.g. via crowdsourcing. Unfortunately, the quality of these sources is not always guaranteed. As additional complications, the data might be stored in a distributed way, or might even have to remain private. In this work, we address the question of how to learn robustly in such scenarios. Studying the problem through the lens of statistical learning theory, we derive a procedure that allows for learning from all available sources, yet automatically suppresses irrelevant or corrupted data. We show by extensive experiments that our method provides significant improvements over alternative approaches from robust statistics and distributed optimization.

Journal ArticleDOI
TL;DR: A regression obtained through a particular class of machine learners, based on statistical learning theory and its Bayesian variants, is proposed and applied to data from the Sanctuary of Vicoforte, which was dynamically monitored over a period of four months and modelled with finite elements to simulate structural damage.

Proceedings ArticleDOI
23 Jun 2019
TL;DR: This paper settles the sample complexity of single-parameter revenue maximization by showing matching upper and lower bounds, up to a poly-logarithmic factor, for all families of value distributions that have been considered in the literature.
Abstract: This paper settles the sample complexity of single-parameter revenue maximization by showing matching upper and lower bounds, up to a poly-logarithmic factor, for all families of value distributions that have been considered in the literature. The upper bounds are unified under a novel framework, which builds on the strong revenue monotonicity by Devanur, Huang, and Psomas (STOC 2016), and an information theoretic argument. This is fundamentally different from the previous approaches that rely on either constructing an ε-net of the mechanism space, explicitly or implicitly via statistical learning theory, or learning an approximately accurate version of the virtual values. To our knowledge, it is the first time information theoretical arguments are used to show sample complexity upper bounds, instead of lower bounds. Our lower bounds are also unified under a meta construction of hard instances.

Proceedings Article
24 May 2019
TL;DR: In this paper, the authors address the question of how to learn robustly in such scenarios, and derive a procedure that allows for learning from all available sources, yet automatically suppresses irrelevant or corrupted data.
Abstract: Modern machine learning methods often require more data for training than a single expert can provide. Therefore, it has become a standard procedure to collect data from external sources, e.g. via crowdsourcing. Unfortunately, the quality of these sources is not always guaranteed. As additional complications, the data might be stored in a distributed way, or might even have to remain private. In this work, we address the question of how to learn robustly in such scenarios. Studying the problem through the lens of statistical learning theory, we derive a procedure that allows for learning from all available sources, yet automatically suppresses irrelevant or corrupted data. We show by extensive experiments that our method provides significant improvements over alternative approaches from robust statistics and distributed optimization.

Journal ArticleDOI
TL;DR: Granular support vector machine (GSVM) is a novel machine learning model based on granular computing and statistical learning theory, and it can solve the low efficiency learning problem that exists in the traditional SVM and obtain satisfactory generalization performance, as well.
Abstract: The time complexity of the traditional support vector machine (SVM) is O(l^3), where l is the training sample size, so it cannot solve large-scale problems. Granular support vector machine (GSVM) is a novel machine learning model based on granular computing and statistical learning theory, and it can solve the low-efficiency learning problem that exists in the traditional SVM while obtaining satisfactory generalization performance as well. This paper primarily reviews the past (rudiment), present (basic model) and future (development directions) of GSVM. Firstly, we briefly introduce the basic theory of SVM and GSVM. Secondly, we describe the related research works conducted before GSVM was proposed. Next, the latest thoughts, models, algorithms and applications of GSVM are described. Finally, we note the research and development prospects of GSVM.

Journal ArticleDOI
TL;DR: The input and weight Hessians are used to quantify a network's ability to generalize to unseen data, and to show how the generalization capability of the network can be controlled during training using the learning rate, batch size and number of training iterations as controls.

Proceedings ArticleDOI
01 Oct 2019
TL;DR: Results computed on multiple UCI benchmark datasets clearly indicate the effectiveness and applicability of the proposed ISPTSVM compared to pinball support vector machine (Pin-SVM), twin bounded support vector machine (TBSVM) and SPTSVM.
Abstract: In this paper, we propose an improved version of sparse pinball twin support vector machine (SPTSVM) [1], called improved sparse pinball twin support vector machine (ISPTSVM). SPTSVM implements empirical risk minimization principle and the matrices appearing in the formulation of SPTSVM are positive semi-definite. Here, we reformulate the primal problems of SPTSVM by introducing extra regularization term to the objective function of SPTSVM. Unlike SPTSVM, structural risk minimization (SRM) principle is implemented in the proposed ISPTSVM which embodies the marrow of statistical learning theory. Also, the matrices that appear in the dual formulation of the proposed ISPTSVM are positive definite. Results computed on multiple UCI benchmark datasets clearly indicate the effectiveness and applicability of the proposed ISPTSVM compared to pinball support vector machine (Pin-SVM), twin bounded support vector machine (TBSVM) and SPTSVM.

Posted Content
TL;DR: Finite sample upper bounds are derived for the generalization error committed by specific families of reservoir computing systems when processing discrete-time inputs under various hypotheses on their dependence structure in the framework of statistical learning theory.
Abstract: We analyze the practices of reservoir computing in the framework of statistical learning theory. In particular, we derive finite sample upper bounds for the generalization error committed by specific families of reservoir computing systems when processing discrete-time inputs under various hypotheses on their dependence structure. Non-asymptotic bounds are explicitly written down in terms of the multivariate Rademacher complexities of the reservoir systems and the weak dependence structure of the signals that are being handled. This allows, in particular, to determine the minimal number of observations needed in order to guarantee a prescribed estimation accuracy with high probability for a given reservoir family. At the same time, the asymptotic behavior of the devised bounds guarantees the consistency of the empirical risk minimization procedure for various hypothesis classes of reservoir functionals.
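For intuition, the (scalar) empirical Rademacher complexity of a finite hypothesis class can be estimated by Monte Carlo; this is a simplification of the multivariate reservoir-family complexities the paper works with, and the two toy classes below are invented for illustration:

```python
import numpy as np

def empirical_rademacher(preds, n_draws=2000, seed=0):
    # preds[h, i] = h(x_i): a finite hypothesis class evaluated on n points.
    rng = np.random.default_rng(seed)
    n = preds.shape[1]
    total = 0.0
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)   # Rademacher signs
        total += np.max(preds @ sigma) / n        # sup over the class
    return total / n_draws

rng = np.random.default_rng(5)
r_small = empirical_rademacher(np.ones((1, 50)))                    # 1 hypothesis
r_large = empirical_rademacher(rng.choice([-1.0, 1.0], (200, 50)))  # 200 hypotheses
```

A single constant hypothesis cannot correlate with random signs (complexity near zero), while a large class can; the paper's generalization bounds degrade with exactly this kind of quantity.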

Proceedings ArticleDOI
25 Jul 2019
TL;DR: The results of the experimental evaluation show that SPuManTE allows the discovery of statistically significant patterns while properly accounting for uncertainties in patterns' frequencies due to the data generation process.
Abstract: We present SPuManTE, an efficient algorithm for mining significant patterns from a transactional dataset. SPuManTE controls the Family-wise Error Rate: it ensures that the probability of reporting one or more false discoveries is less than a user-specified threshold. A key ingredient of SPuManTE is UT, our novel unconditional statistical test for evaluating the significance of a pattern, that requires fewer assumptions on the data generation process and is more appropriate for a knowledge discovery setting than classical conditional tests, such as the widely used Fisher's exact test. Computational requirements have limited the use of unconditional tests in significant pattern discovery, but UT overcomes this issue by obtaining the required probabilities in a novel efficient way. SPuManTE combines UT with recent results on the supremum of the deviations of pattern frequencies from their expectations, founded in statistical learning theory. This combination allows SPuManTE to be very efficient, while also enjoying high statistical power. The results of our experimental evaluation show that SPuManTE allows the discovery of statistically significant patterns while properly accounting for uncertainties in patterns' frequencies due to the data generation process.
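As a crude point of comparison for the frequency-deviation ingredient, a union-bound-plus-Hoeffding control of the supremum deviation looks like this; the learning-theoretic bounds SPuManTE uses are sharper, and the numbers below are illustrative:

```python
import math

def max_freq_deviation(n, n_patterns, delta=0.05):
    # Union bound + Hoeffding: with probability >= 1 - delta, every one of
    # n_patterns pattern frequencies over n transactions is within this
    # epsilon of its expectation.
    return math.sqrt(math.log(2 * n_patterns / delta) / (2 * n))

eps_small = max_freq_deviation(n=10_000, n_patterns=1_000)
eps_large = max_freq_deviation(n=1_000_000, n_patterns=1_000)
```

The guaranteed deviation shrinks like 1/sqrt(n) in the number of transactions, which is what lets larger datasets support tighter significance decisions.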

Posted Content
24 Oct 2019
TL;DR: This work proposes a framework for safe reinforcement learning that can handle stochastic nonlinear dynamical systems, and proposes to use a tube-based robust nonlinear model predictive controller (NMPC) as the backup controller.
Abstract: This paper proposes a framework for safe reinforcement learning that can handle stochastic nonlinear dynamical systems. We focus on the setting where the nominal dynamics are known, and are subject to additive stochastic disturbances with known distribution. Our goal is to ensure the safety of a control policy trained using reinforcement learning, e.g., in a simulated environment. We build on the idea of model predictive shielding (MPS), where a backup controller is used to override the learned policy as needed to ensure safety. The key challenge is how to compute a backup policy in the context of stochastic dynamics. We propose to use a tube-based robust NMPC controller as the backup controller. We estimate the tubes using sampled trajectories, leveraging ideas from statistical learning theory to obtain high-probability guarantees. We empirically demonstrate that our approach can ensure safety in stochastic systems, including cart-pole and a non-holonomic particle with random obstacles.
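The model predictive shielding loop can be sketched on a one-dimensional toy system; the dynamics, bounds, and the deliberately reckless "learned" policy are all invented for illustration, and a simple saturating brake stands in for the paper's tube-based robust NMPC backup:

```python
# MPS on a 1-D toy: x' = x + u + w with |u| <= 1, disturbance |w| <= 0.1,
# and safe set |x| <= 5.
W_MAX, X_SAFE = 0.1, 5.0

def backup(x):
    return max(-1.0, min(1.0, -x))        # saturating brake toward 0

def recoverable(x, horizon=20):
    # Can the backup keep |x| <= X_SAFE under worst-case disturbances?
    for _ in range(horizon):
        if abs(x) > X_SAFE:
            return False
        x = x + backup(x) + (W_MAX if x >= 0 else -W_MAX)
    return True

def shielded(x, u_learned):
    # Use the learned action only if its worst-case successor is recoverable;
    # otherwise override with the backup controller.
    return u_learned if recoverable(x + u_learned + W_MAX) else backup(x)

x, trace = 0.0, []
for _ in range(100):
    u = shielded(x, 1.0)                  # learned policy: always push right
    x = x + u + W_MAX                     # worst-case disturbance realized
    trace.append(x)
safe = all(abs(v) <= X_SAFE for v in trace)
```

Even with a policy that always pushes toward the boundary and worst-case disturbances on every step, the shield keeps the state inside the safe set.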

Proceedings ArticleDOI
01 Dec 2019
TL;DR: This work considers the problem of designing control laws for stochastic jump linear systems where the disturbances are drawn randomly from a finite sample space according to an unknown distribution, and adopts a distributionally robust approach to compute a mean-square stabilizing feedback gain with a given probability.
Abstract: We consider the problem of designing control laws for stochastic jump linear systems where the disturbances are drawn randomly from a finite sample space according to an unknown distribution, which is estimated from a finite sample of i.i.d. observations. We adopt a distributionally robust approach to compute a mean-square stabilizing feedback gain with a given probability. The larger the sample size, the less conservative the controller, yet our methodology gives stability guarantees with high probability, for any number of samples. Using tools from statistical learning theory, we estimate confidence regions for the unknown probability distributions (ambiguity sets) which have the shape of total variation balls centered around the empirical distribution. We use these confidence regions in the design of appropriate distributionally robust controllers and show that the associated stability conditions can be cast as a tractable linear matrix inequality (LMI) by using conjugate duality. The resulting design procedure scales gracefully with the size of the probability space and the system dimensions. Through a numerical example, we illustrate the superior sample complexity of the proposed methodology over the stochastic approach.

Journal ArticleDOI
TL;DR: This study introduces distributed statistical computing (DSC) into the design of secure multiparty protocols, which allows a linear regression model to be securely calculated over multiple datasets, limiting communication to the final step and reducing complexity.
Abstract: Background: Biomedical research often requires large cohorts and necessitates the sharing of biomedical data with researchers around the world, which raises many privacy, ethical, and legal concerns. In the face of these concerns, privacy experts are trying to explore approaches to analyzing the distributed data while protecting its privacy. Many of these approaches are based on secure multiparty computations (SMCs). SMC is an attractive approach allowing multiple parties to collectively carry out calculations on their datasets without having to reveal their own raw data; however, it incurs heavy computation time and requires extensive communication between the involved parties. Objective: This study aimed to develop usable and efficient SMC applications that meet the needs of the potential end-users and to raise general awareness about SMC as a tool that supports data sharing. Methods: We have introduced distributed statistical computing (DSC) into the design of secure multiparty protocols, which allows us to conduct computations on each of the parties’ sites independently and then combine these computations to form one estimator for the collective dataset, thus limiting communication to the final step and reducing complexity. The effectiveness of our privacy-preserving model is demonstrated through a linear regression application. Results: Our secure linear regression algorithm was tested for accuracy and performance using real and synthetic datasets. The results showed no loss of accuracy (over nonsecure regression) and very good performance (20 min for 100 million records). Conclusions: We used DSC to securely calculate a linear regression model over multiple datasets. Our experiments showed very good performance (in terms of the number of records it can handle). We plan to extend our method to other estimators such as logistic regression.
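The aggregation idea behind DSC for linear regression can be sketched by combining per-party sufficient statistics; this shows only the statistics, not the cryptographic SMC layer, and the three-party synthetic data are an assumption for illustration:

```python
import numpy as np

def local_stats(X, y):
    # Each party computes its sufficient statistics (X^T X, X^T y) locally
    # and never ships raw records.
    return X.T @ X, X.T @ y

def combine(stats):
    # One communication round: sum the statistics, then solve the normal
    # equations for the collective estimator.
    XtX = sum(s[0] for s in stats)
    Xty = sum(s[1] for s in stats)
    return np.linalg.solve(XtX, Xty)

rng = np.random.default_rng(6)
beta_true = np.array([2.0, -1.0, 0.5])
parties = []
for _ in range(3):                            # three parties, disjoint data
    X = rng.normal(size=(200, 3))
    y = X @ beta_true + rng.normal(0.0, 0.1, 200)
    parties.append((X, y))

beta_dist = combine([local_stats(X, y) for X, y in parties])
# Same answer as pooling all the raw data centrally:
X_all = np.vstack([X for X, _ in parties])
y_all = np.concatenate([y for _, y in parties])
beta_pooled = np.linalg.lstsq(X_all, y_all, rcond=None)[0]
```

Because the normal equations are additive in the data, the distributed estimator matches the pooled one exactly, which is why no accuracy is lost relative to nonsecure regression.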

Posted Content
TL;DR: These notes gather recent results on robust statistical learning theory and stress the main principles underlying the construction and theoretical analysis of these estimators rather than provide an exhaustive account on this rapidly growing field.
Abstract: These notes gather recent results on robust statistical learning theory. The goal is to stress the main principles underlying the construction and theoretical analysis of these estimators rather than provide an exhaustive account on this rapidly growing field. The notes are the basis of lectures given at the conference StatMathAppli 2019.

Proceedings Article
01 Jan 2019
TL;DR: In this article, the authors propose a theoretical framework to deal with part-based data from a general perspective and study a novel method within the setting of statistical learning theory, which explicitly quantifies the benefits of leveraging the partbased structure of a problem on the learning rates of the proposed estimator.
Abstract: Key to structured prediction is exploiting the problem's structure to simplify the learning process. A major challenge arises when data exhibit a local structure (i.e., are made "by parts") that can be leveraged to better approximate the relation between (parts of) the input and (parts of) the output. Recent literature on signal processing, and in particular computer vision, shows that capturing these aspects is indeed essential to achieve state-of-the-art performance. However, in this context algorithms are typically derived on a case-by-case basis. In this work we propose the first theoretical framework to deal with part-based data from a general perspective and study a novel method within the setting of statistical learning theory. Our analysis is novel in that it explicitly quantifies the benefits of leveraging the part-based structure of a problem on the learning rates of the proposed estimator.

Proceedings Article
Dixian Zhu, Zhe Li, Xiaoyu Wang, Boqing Gong, Tianbao Yang
11 Apr 2019
TL;DR: A novel robust zero-sum game framework for pool-based active learning grounded on advanced statistical learning theory that avoids the issues of many previous algorithms such as inefficiency, sampling bias and sensitivity to imbalanced data distribution is presented.
Abstract: In this paper, we present a novel robust zero-sum game framework for pool-based active learning grounded on advanced statistical learning theory. Pool-based active learning usually consists of two components, namely, learning of a classifier given labeled data and querying of unlabeled data for labeling. Most previous studies on active learning consider these as two separate tasks and propose various heuristics for selecting important unlabeled data for labeling, which may render the selection of unlabeled examples sub-optimal for minimizing the classification error. In contrast, the present work formulates active learning as a unified optimization framework for learning the classifier, i.e., the querying of labels and the learning of models are unified to minimize a common objective for statistical learning. In addition, the proposed method avoids the issues of many previous algorithms such as inefficiency, sampling bias and sensitivity to imbalanced data distribution. Besides theoretical analysis, we conduct extensive experiments on benchmark datasets and demonstrate the superior performance of the proposed active learning method compared with the state-of-the-art methods.

Posted Content
TL;DR: The standard complexity measures of Gaussian and Rademacher complexities and VC dimension are sufficient measures of complexity for the purposes of bounding the generalization error and learning rates of hypothesis classes in this setting.
Abstract: Statistical learning theory has largely focused on learning and generalization given independent and identically distributed (i.i.d.) samples. Motivated by applications involving time-series data, there has been a growing literature on learning and generalization in settings where data is sampled from an ergodic process. This work has also developed complexity measures, which appropriately extend the notion of Rademacher complexity to bound the generalization error and learning rates of hypothesis classes in this setting. Rather than time-series data, our work is motivated by settings where data is sampled on a network or a spatial domain, and thus do not fit well within the framework of prior work. We provide learning and generalization bounds for data that are complexly dependent, yet their distribution satisfies the standard Dobrushin's condition. Indeed, we show that the standard complexity measures of Gaussian and Rademacher complexities and VC dimension are sufficient measures of complexity for the purposes of bounding the generalization error and learning rates of hypothesis classes in our setting. Moreover, our generalization bounds only degrade by constant factors compared to their i.i.d. analogs, and our learnability bounds degrade by log factors in the size of the training set.
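For concreteness, the empirical Rademacher complexity that this line of work builds on can be estimated by Monte Carlo for a small finite hypothesis class. The threshold classifiers and uniform sample below are illustrative stand-ins; the dependent-data setting under Dobrushin's condition is not simulated here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical finite class: threshold classifiers h_t(x) = sign(x - t)
# evaluated on n sample points; each row of H is one hypothesis's +/-1 outputs.
n = 50
x = np.sort(rng.uniform(0, 1, n))
thresholds = np.linspace(0, 1, 21)
H = np.sign(x[None, :] - thresholds[:, None])
H[H == 0] = 1

def empirical_rademacher(H, n_draws=2000):
    # Monte Carlo estimate of E_sigma[ sup_h (1/n) sum_i sigma_i h(x_i) ],
    # with sigma_i independent uniform +/-1 (Rademacher) signs.
    n = H.shape[1]
    total = 0.0
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)
        total += np.max(H @ sigma) / n
    return total / n_draws

rad = empirical_rademacher(H)
print(f"estimated empirical Rademacher complexity: {rad:.3f}")
```

By Massart's finite-class lemma the value should fall below sqrt(2 ln|H| / n), roughly 0.35 for these sizes; the paper's contribution is that this same quantity remains a meaningful complexity measure when the sample is graph- or spatially dependent rather than i.i.d.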

Proceedings Article
21 Jun 2019
TL;DR: In this article, the authors show that the standard complexity measures of Gaussian and Rademacher complexities and VC dimension are sufficient measures of complexity for the purposes of bounding the generalization error and learning rates of hypothesis classes in this setting.

Proceedings Article
01 Jan 2019
TL;DR: This work proves novel McDiarmid-type concentration inequalities for Lipschitz functions of graph-dependent random variables and demonstrates that for many types of dependent data, the forest complexity is small and thus implies good concentration.
Abstract: A crucial assumption in most statistical learning theory is that samples are independently and identically distributed (i.i.d.). However, for many real applications, the i.i.d. assumption does not hold. We consider learning problems in which examples are dependent and their dependency relation is characterized by a graph. To establish algorithm-dependent generalization theory for learning with non-i.i.d. data, we first prove novel McDiarmid-type concentration inequalities for Lipschitz functions of graph-dependent random variables. We show that concentration relies on the forest complexity of the graph, which characterizes the strength of the dependency. We demonstrate that for many types of dependent data, the forest complexity is small and thus implies good concentration. Based on our new inequalities we are able to build stability bounds for learning from graph-dependent data.
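The classical i.i.d. baseline that this abstract generalizes can be checked numerically. For the sample mean of n variables in [0, 1], each coordinate has bounded difference c_i = 1/n, so McDiarmid's inequality gives P(|f - E f| >= t) <= 2 exp(-2 n t^2). The sketch below verifies only this i.i.d. case; the paper's graph-dependent version, where the exponent is controlled by the forest complexity, is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(2)

# Sample mean of n uniforms on [0, 1]: bounded differences c_i = 1/n,
# so McDiarmid gives P(|mean - 0.5| >= t) <= 2 exp(-2 n t^2).
n, t, trials = 100, 0.1, 5000
mcdiarmid_bound = 2 * np.exp(-2 * n * t**2)    # = 2 e^{-2}, about 0.271

samples = rng.uniform(0, 1, size=(trials, n))
deviations = np.abs(samples.mean(axis=1) - 0.5)
empirical = float(np.mean(deviations >= t))

print(f"McDiarmid bound {mcdiarmid_bound:.3f} vs empirical rate {empirical:.4f}")
```

The empirical deviation rate is far below the bound here, as expected for a sub-Gaussian mean; under graph dependence the same style of bound holds with n effectively replaced by a forest-complexity term, which is the concentration result the abstract describes.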