Showing papers on "Upper and lower bounds" published in 2018


Journal ArticleDOI
TL;DR: In this paper, the interpretation of the UV/optical/infrared counterpart of GW170817 with kilonova models, combined with new numerical relativity results, implies a complementary lower bound on the tidal deformability parameter.
Abstract: Gravitational waves detected from the binary neutron star (NS) merger GW170817 constrained the NS equation of state by placing an upper bound on certain parameters describing the binary's tidal interactions. We show that the interpretation of the UV/optical/infrared counterpart of GW170817 with kilonova models, combined with new numerical relativity results, implies a complementary lower bound on the tidal deformability parameter. The joint constraints tentatively rule out both extremely stiff and soft NS equations of state.

503 citations


Posted Content
TL;DR: This work shows how a simple bounding technique, interval bound propagation (IBP), can be exploited to train large provably robust neural networks that beat the state of the art in verified accuracy, and to verify the largest model beyond vacuous bounds on a downscaled version of ImageNet.
Abstract: Recent work has shown that it is possible to train deep neural networks that are provably robust to norm-bounded adversarial perturbations. Most of these methods are based on minimizing an upper bound on the worst-case loss over all possible adversarial perturbations. While these techniques show promise, they often result in difficult optimization procedures that remain hard to scale to larger networks. Through a comprehensive analysis, we show how a simple bounding technique, interval bound propagation (IBP), can be exploited to train large provably robust neural networks that beat the state-of-the-art in verified accuracy. While the upper bound computed by IBP can be quite weak for general networks, we demonstrate that an appropriate loss and clever hyper-parameter schedule allow the network to adapt such that the IBP bound is tight. This results in a fast and stable learning algorithm that outperforms more sophisticated methods and achieves state-of-the-art results on MNIST, CIFAR-10 and SVHN. It also allows us to train the largest model to be verified beyond vacuous bounds on a downscaled version of ImageNet.
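To make the bounding technique concrete, here is a minimal sketch of the interval arithmetic at the heart of IBP for a fully connected ReLU network. It illustrates only the bound computation, not the authors' training procedure; the function name and layer setup are ours.

```python
import numpy as np

def interval_bound_propagation(weights, biases, x, eps):
    """Propagate the l_inf ball [x - eps, x + eps] through a ReLU MLP.

    Returns element-wise lower/upper bounds on the logits: every input
    in the ball produces outputs inside [lo, hi].
    """
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(zip(weights, biases)):
        mid = (hi + lo) / 2.0            # interval midpoint
        rad = (hi - lo) / 2.0            # interval radius
        mid = W @ mid + b                # the affine layer maps the midpoint,
        rad = np.abs(W) @ rad            # and |W| bounds the new radius
        lo, hi = mid - rad, mid + rad
        if i < len(weights) - 1:         # ReLU is monotone: apply to both ends
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi
```

Training then penalizes the worst-case logits implied by [lo, hi]; the paper's point is that a suitable loss and hyper-parameter schedule drive the network into a regime where this crude bound is tight.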

423 citations


Journal ArticleDOI
TL;DR: This paper is concerned with the security control problem with a quadratic cost criterion for a class of discrete-time stochastic nonlinear systems subject to deception attacks, and proposes an easily solvable version of the derived matrix inequalities that yields both the controller gain and the upper bound.
Abstract: This paper is concerned with the security control problem with a quadratic cost criterion for a class of discrete-time stochastic nonlinear systems subject to deception attacks. A definition of security in probability is adopted to account for the transient dynamics of controlled systems. The purpose of the problem under consideration is to design a dynamic output feedback controller such that the prescribed security in probability is guaranteed while obtaining an upper bound of the quadratic cost criterion. First, some sufficient conditions in the form of matrix inequalities are established in the framework of input-to-state stability in probability. Then, an easily solvable version of the above inequalities is derived by applying the well-known matrix inverse lemma, which yields both the controller gain and the upper bound. Furthermore, the main results are shown to be extendable to the case of discrete-time stochastic linear systems. Finally, two simulation examples are utilized to illustrate the usefulness of the proposed controller design scheme.

364 citations


Book
27 Feb 2018
TL;DR: In this paper, a general algorithm is presented for constructing upper and lower bounds on the true price of an American option using any approximation to the option price; the upper bound computation is made feasible by representing the American option price as the solution of a properly defined dual minimization problem.
Abstract: We develop a new method for pricing American options. The main practical contribution of this paper is a general algorithm for constructing upper and lower bounds on the true price of the option using any approximation to the option price. We show that our bounds are tight, so that if the initial approximation is close to the true price of the option, the bounds are also guaranteed to be close. We also explicitly characterize the worst-case performance of the pricing bounds. The computation of the lower bound is straightforward and relies on simulating the suboptimal exercise strategy implied by the approximate option price. The upper bound is also computed using Monte Carlo simulation. This is made feasible by the representation of the American option price as a solution of a properly defined dual minimization problem, which is the main theoretical result of this paper. Our algorithm proves to be accurate on a set of sample problems where we price call options on the maximum and the geometric mean of a collection of stocks. These numerical results suggest that our pricing method can be successfully applied to problems of practical interest.
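To illustrate the lower-bound half of the construction: because the true price optimizes over all exercise policies, simulating any fixed policy yields an unbiased estimate of a lower bound. Below is a minimal sketch for an American put under Black–Scholes dynamics; the `policy` callback stands in for the exercise rule implied by an approximate price, and all names and parameters are illustrative.

```python
import numpy as np

def mc_lower_bound_put(S0, K, r, sigma, T, steps, n_paths, policy, seed=0):
    """Lower bound on an American put: simulate GBM paths and exercise
    whenever `policy(S, t)` says so (and at maturity if in the money).
    The estimate underestimates the true price because the policy is
    suboptimal."""
    dt = T / steps
    rng = np.random.default_rng(seed)
    S = np.full(n_paths, float(S0))
    payoff = np.zeros(n_paths)
    alive = np.ones(n_paths, dtype=bool)
    for k in range(1, steps + 1):
        z = rng.standard_normal(n_paths)
        S *= np.exp((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)
        t = k * dt
        ex = alive & (policy(S, t) | (k == steps)) & (K - S > 0)
        payoff[ex] = np.exp(-r * t) * (K - S[ex])
        alive &= ~ex
    return payoff.mean()

# Example: exercise whenever the put is 5% in the money (a crude policy).
lb = mc_lower_bound_put(100, 100, 0.05, 0.2, 1.0, 50, 100_000,
                        lambda S, t: S < 95.0)
```

The dual upper bound is harder: it requires the martingale built from the same approximate price, which is the paper's main theoretical construction.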

335 citations


Proceedings Article
03 Jul 2018
TL;DR: This framework derives variational lower and upper bounds on the mutual information between the input and the latent variable, and uses these bounds to derive a rate-distortion curve that characterizes the tradeoff between compression and reconstruction accuracy.
Abstract: Recent work in unsupervised representation learning has focused on learning deep directed latent-variable models. Fitting these models by maximizing the marginal likelihood or evidence is typically intractable, thus a common approximation is to maximize the evidence lower bound (ELBO) instead. However, maximum likelihood training (whether exact or approximate) does not necessarily result in a good latent representation, as we demonstrate both theoretically and empirically. In particular, we derive variational lower and upper bounds on the mutual information between the input and the latent variable, and use these bounds to derive a rate-distortion curve that characterizes the tradeoff between compression and reconstruction accuracy. Using this framework, we demonstrate that there is a family of models with identical ELBO, but different quantitative and qualitative characteristics. Our framework also suggests a simple new method to ensure that latent variable models with powerful stochastic decoders do not ignore their latent code.
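In rough notation (following the paper's setup, with $e(z|x)$ the encoder, $d(x|z)$ the decoder and $m(z)$ the marginal over codes): the distortion $D = -\mathbb{E}_{p(x)}\mathbb{E}_{e(z|x)}[\log d(x|z)]$ and the rate $R = \mathbb{E}_{p(x)}\,\mathrm{KL}(e(z|x)\,\|\,m(z))$ satisfy $\mathrm{ELBO} = -(D + R)$. Since many $(D, R)$ pairs give the same sum, models with identical ELBO can sit at very different points along the line $D + R = \text{const}$, which is the paper's central observation.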

323 citations


Posted Content
TL;DR: A novel complexity measure based on unit-wise capacities is presented, yielding a tighter generalization bound for two-layer ReLU networks, together with a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks.
Abstract: Despite existing work on ensuring generalization of neural networks in terms of scale-sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization. In this work we suggest a novel complexity measure based on unit-wise capacities resulting in a tighter generalization bound for two-layer ReLU networks. Our capacity bound correlates with the behavior of test error with increasing network sizes, and could potentially explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks.

310 citations


Journal ArticleDOI
TL;DR: It is proved that one cannot approximate a general function $f \in \mathcal{E}^{\beta}(\mathbb{R}^d)$ using neural networks that are less complex than those produced by the construction, which partly explains the benefits of depth for ReLU networks by showing that deep networks are necessary to achieve efficient approximation of (piecewise) smooth functions.

307 citations


Proceedings Article
02 Mar 2018
TL;DR: The authors propose a principled importance sampling scheme that focuses computation on "informative" examples and reduces the variance of the stochastic gradients during training; they demonstrate it on image classification, CNN fine-tuning, and RNN training.
Abstract: Deep neural network training spends most of the computation on examples that are properly handled, and could be ignored. We propose to mitigate this phenomenon with a principled importance sampling scheme that focuses computation on "informative" examples, and reduces the variance of the stochastic gradients during training. Our contribution is twofold: first, we derive a tractable upper bound to the per-sample gradient norm, and second we derive an estimator of the variance reduction achieved with importance sampling, which enables us to switch it on when it will result in an actual speedup. The resulting scheme can be used by changing a few lines of code in a standard SGD procedure, and we demonstrate experimentally, on image classification, CNN fine-tuning, and RNN training, that for a fixed wall-clock time budget, it provides a reduction of the train losses of up to an order of magnitude and a relative improvement of test errors between 5% and 17%.
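Mechanically, the scheme is classical importance sampling over the training set: sample examples with probability proportional to (an upper bound on) their gradient norm, then reweight so the gradient estimate stays unbiased. A minimal sketch; `scores` stands in for the paper's tractable per-sample upper bound, and the names are ours.

```python
import numpy as np

def importance_sample(scores, batch_size, rng):
    """Pick a batch with probability proportional to `scores`, returning
    the unbiasedness weights 1 / (N * p_i) used to scale each sample's
    loss (or gradient) in the SGD step."""
    p = scores / scores.sum()
    idx = rng.choice(len(scores), size=batch_size, p=p)
    weights = 1.0 / (len(scores) * p[idx])
    return idx, weights

rng = np.random.default_rng(0)
scores = np.abs(rng.standard_normal(10_000)) + 1e-3  # placeholder bounds
idx, w = importance_sample(scores, 128, rng)
# The SGD step would then use the mean over the batch of w[i] * grad(loss_i).
```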

289 citations


Journal ArticleDOI
TL;DR: In this article, the authors provide a theoretical framework for analyzing the robustness of classifiers to adversarial perturbations, and show fundamental upper bounds on the adversarial robustness.
Abstract: The goal of this paper is to analyze the intriguing instability of classifiers to adversarial perturbations (Szegedy et al., in: International conference on learning representations (ICLR), 2014). We provide a theoretical framework for analyzing the robustness of classifiers to adversarial perturbations, and show fundamental upper bounds on the robustness of classifiers. Specifically, we establish a general upper bound on the robustness of classifiers to adversarial perturbations, and then illustrate the obtained upper bound on two practical classes of classifiers, namely the linear and quadratic classifiers. In both cases, our upper bound depends on a distinguishability measure that captures the notion of difficulty of the classification task. Our results for both classes imply that in tasks involving small distinguishability, no classifier in the considered set will be robust to adversarial perturbations, even if good accuracy is achieved. Our theoretical framework moreover suggests that the phenomenon of adversarial instability is due to the low flexibility of classifiers compared to the difficulty of the classification task (captured mathematically by the distinguishability measure). We further show the existence of a clear distinction between the robustness of a classifier to random noise and its robustness to adversarial perturbations. Specifically, the former is shown to be larger than the latter by a factor proportional to $\sqrt{d}$ (with d the signal dimension) for linear classifiers. This result gives a theoretical explanation for the discrepancy between the two robustness properties in high-dimensional problems, which was empirically observed by Szegedy et al. in the context of neural networks. We finally show experimental results on controlled and real-world data that confirm the theoretical analysis and extend its spirit to more complex classification schemes.
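For intuition on the $\sqrt{d}$ factor in the linear case (a standard computation, not a quote from the paper): for $f(x) = w^\top x + b$, the smallest perturbation flipping the sign of $f$ moves $x$ straight toward the decision boundary, so the adversarial robustness at $x$ is $|f(x)|/\|w\|_2$; an isotropic random perturbation of norm $\epsilon$, by contrast, has a component of only about $\epsilon/\sqrt{d}$ along $w$, so random noise must be roughly $\sqrt{d}$ times larger before it changes the sign of $f$.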

272 citations


Proceedings Article
25 Apr 2018
TL;DR: In this paper, the authors exploit the special structure of ReLU networks and provide two computationally efficient algorithms, Fast-Lin and Fast-Lip, that are able to certify non-trivial lower bounds on the minimum distortion by bounding the ReLU units with appropriate linear functions (Fast-Lin) or by bounding the local Lipschitz constant (Fast-Lip).
Abstract: Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17]. Although finding the exact minimum adversarial distortion is hard, giving a certified lower bound of the minimum distortion is possible. Currently available methods for computing such a bound are either time-consuming or deliver bounds that are too loose to be useful. In this paper, we exploit the special structure of ReLU networks and provide two computationally efficient algorithms, Fast-Lin and Fast-Lip, that are able to certify non-trivial lower bounds on minimum distortions, by bounding the ReLU units with appropriate linear functions (Fast-Lin) or by bounding the local Lipschitz constant (Fast-Lip). Experiments show that (1) our proposed methods deliver bounds close to the exact minimum distortion found by Reluplex in small MNIST networks (the gap is 2-3X) while our algorithms are more than 10,000 times faster; (2) our methods deliver similar quality of bounds (the gap is within 35% and usually around 10%; sometimes our bounds are even better) for larger networks compared to methods based on solving linear programming problems, but our algorithms are 33-14,000 times faster; (3) our method is capable of solving large MNIST and CIFAR networks up to 7 layers with more than 10,000 neurons within tens of seconds on a single CPU core. In addition, we show that, in fact, there is no polynomial time algorithm that can approximately find the minimum $\ell_1$ adversarial distortion of a ReLU network with a $0.99\ln n$ approximation ratio unless $\mathsf{NP}$=$\mathsf{P}$, where $n$ is the number of neurons in the network.
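The bounding step in Fast-Lin-style analyses is the standard "triangle" relaxation of an unstable ReLU: given pre-activation bounds $l < 0 < u$, the ReLU graph on $[l, u]$ sits between two linear functions. A small sketch of those bounds (the function name is ours):

```python
def relu_linear_bounds(l, u):
    """Linear bounds for ReLU(z) on [l, u] with l < 0 < u.

    Upper: the chord through (l, 0) and (u, u), i.e. s*z - s*l with
    s = u / (u - l).  Lower: a line a*z through the origin; a common
    adaptive choice is a = 1 if u >= -l else a = 0.
    """
    s = u / (u - l)
    upper = (s, -s * l)                     # (slope, intercept)
    lower = (1.0 if u >= -l else 0.0, 0.0)  # (slope, intercept)
    return upper, lower
```

Stable neurons need no relaxation: if $u \le 0$ the unit is identically zero, and if $l \ge 0$ it is the identity. Propagating these linear bounds layer by layer yields the certified distortion bound.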

267 citations


Journal ArticleDOI
TL;DR: In this paper, the authors examined various scenarios in which the Standard Model is extended by a light leptoquark state to solve for one or both $B$-physics anomalies, viz. $R_{D^{(\ast)}}$ and/or $R_{K^{(\ast)}}$.
Abstract: We examine various scenarios in which the Standard Model is extended by a light leptoquark state to solve for one or both $B$-physics anomalies, viz. $R_{D^{(\ast)}}^\mathrm{exp}> R_{D^{(\ast)}}^\mathrm{SM}$ and/or $R_{K^{(\ast)}}^\mathrm{exp}< R_{K^{(\ast)}}^\mathrm{SM}$. To do so we combine the constraints arising both from the low-energy observables and from direct searches at the LHC. We find that none of the scalar leptoquarks of mass $m_\mathrm{LQ} \simeq 1$ TeV can alone accommodate the above mentioned anomalies. The only single leptoquark scenario which can provide a viable solution for $m_\mathrm{LQ} \simeq 1$–$2$ TeV is a vector leptoquark, known as $U_1$, which we re-examine in its minimal form (letting only left-handed couplings to have non-zero values). We find that the limits deduced from direct searches are complementary to the low-energy physics constraints. In particular, we find a rather stable lower bound on the lepton flavor violating $b\to s\ell_1^\pm\ell_2^\mp$ modes, such as $\mathcal{B}(B\to K\mu\tau)$. Improving the experimental upper bound on $\mathcal{B}(B\to K\mu\tau)$ by two orders of magnitude could compromise the viability of the minimal $U_1$ model as well.

Journal ArticleDOI
TL;DR: A novel bounded real lemma (BRL) for the resultant error system is derived and is applied to present a method for designing suitable Luenberger estimators in terms of solutions of linear matrix inequalities with two tuning parameters.
Abstract: This brief is concerned with the problem of neural state estimation for static neural networks with time-varying delays. Notice that a Luenberger estimator can produce an estimation error irrespective of the neuron state trajectory. This brief provides a method for designing such an estimator for static neural networks with time-varying delays. First, in-depth analysis on a well-used reciprocally convex approach is made, leading to an improved reciprocally convex inequality. Second, the improved reciprocally convex inequality and some integral inequalities are employed to provide a tight upper bound on the time-derivative of some Lyapunov–Krasovskii functional. As a result, a novel bounded real lemma (BRL) for the resultant error system is derived. Third, the BRL is applied to present a method for designing suitable Luenberger estimators in terms of solutions of linear matrix inequalities with two tuning parameters. Finally, it is shown through a numerical example that the proposed method can derive less conservative results than some existing ones.
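Conditions given "in terms of solutions of linear matrix inequalities" are checked numerically with a semidefinite solver. As a toy illustration of that workflow only (not the paper's BRL, which also involves the delay bounds and the two tuning parameters), here is a discrete-time Lyapunov stability LMI in CVXPY:

```python
import cvxpy as cp
import numpy as np

A = np.array([[0.5, 0.2],
              [-0.1, 0.8]])   # example stable system (illustrative)
n = A.shape[0]

P = cp.Variable((n, n), symmetric=True)
eps = 1e-6
constraints = [P >> eps * np.eye(n),                       # P positive definite
               A.T @ P @ A - P << -eps * np.eye(n)]        # Lyapunov LMI
prob = cp.Problem(cp.Minimize(0), constraints)             # pure feasibility
prob.solve()
print(prob.status)  # "optimal" iff a certificate P exists
```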

Proceedings Article
01 Jan 2018
TL;DR: In this paper, a provably polynomial time algorithm that achieves sub-linear regret was proposed for adaptive control of the Linear Quadratic Regulator (LQR).
Abstract: We consider adaptive control of the Linear Quadratic Regulator (LQR), where an unknown linear system is controlled subject to quadratic costs. Leveraging recent developments in the estimation of linear systems and in robust controller synthesis, we present the first provably polynomial time algorithm that achieves sub-linear regret on this problem. We further study the interplay between regret minimization and parameter estimation by proving a lower bound on the expected regret in terms of the exploration schedule used by any algorithm. Finally, we conduct a numerical study comparing our robust adaptive algorithm to other methods from the adaptive LQR literature, and demonstrate the flexibility of our proposed method by extending it to a demand forecasting problem subject to state constraints.
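As a reference point for the adaptive setting, the non-adaptive LQR solution with known $(A, B)$ comes from the discrete algebraic Riccati equation; the certainty-equivalent baseline simply plugs estimated matrices into this computation (the paper's robust synthesis additionally accounts for estimation error). A sketch:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def lqr_gain(A, B, Q, R):
    """Infinite-horizon discrete LQR: u_t = -K x_t minimizing
    sum_t x'Qx + u'Ru subject to x_{t+1} = A x_t + B u_t."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```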

Journal ArticleDOI
TL;DR: The aim of this paper is to design a locally optimal time-varying estimator to simultaneously estimate both the system states and the fault signals such that, at each sampling instant, the covariance of the estimation error has an upper bound that is minimized by properly designing the estimator gain.

Journal ArticleDOI
TL;DR: In this paper, robust and adaptive nonsingular fast terminal sliding-mode (NFTSM) control schemes with known or unknown upper bound of the system uncertainty and external disturbances are proposed.
Abstract: In this paper, robust and adaptive nonsingular fast terminal sliding-mode (NFTSM) control schemes for the trajectory tracking problem are proposed for cases where the upper bound of the system uncertainty and external disturbances is either known or unknown. The developed controllers take advantage of the NFTSM theory to ensure a fast convergence rate, singularity avoidance, and robustness against uncertainties and external disturbances. First, a robust NFTSM controller is proposed which guarantees that the sliding surface and the equilibrium point can be reached in a short finite time from any initial state. Then, in order to cope with the unknown upper bound of the system uncertainty, which may occur in practical applications, a new adaptive NFTSM algorithm is developed. One feature of the proposed control laws is their adaptation technique, which requires no prior knowledge of the parameter uncertainties and disturbances; the adaptive tuning law can estimate the upper bound of these uncertainties using only position and velocity measurements. Moreover, the proposed controller eliminates the chattering effect without losing robustness or precision. Stability analysis is performed using the Lyapunov stability theory, and simulation studies are conducted to verify the effectiveness of the developed control schemes.

Journal ArticleDOI
TL;DR: A new barrier-function-based adaptive strategy is proposed for a first-order sliding-mode controller that can ensure the convergence of the output variable and maintain it in a predefined neighborhood of zero independently of the upper bound of the disturbance, without overestimating the control gain.

Proceedings Article
03 Jul 2018
TL;DR: In this paper, the authors show that the OLS estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory, using a generalization of Mendelson's small-ball method to dependent data, eschewing the use of standard mixing-time arguments.
Abstract: We prove that the ordinary least-squares (OLS) estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory. Our upper bound relies on a generalization of Mendelson's small-ball method to dependent data, eschewing the use of standard mixing-time arguments. Our lower bounds reveal that these upper bounds match up to logarithmic factors. In particular, we capture the correct signal-to-noise behavior of the problem, showing that more unstable linear systems are easier to estimate. This behavior is qualitatively different from arguments which rely on mixing-time calculations that suggest that unstable systems are more difficult to estimate. We generalize our technique to provide bounds for a more general class of linear response time-series.
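The estimator itself is as simple as the title suggests: regress each state on its predecessor along the single trajectory. A minimal sketch for $x_{t+1} = A x_t + w_t$:

```python
import numpy as np

def ols_system_id(X):
    """X has shape (T+1, n): one observed state trajectory.
    Returns A_hat = argmin_A sum_t ||x_{t+1} - A x_t||^2."""
    past, future = X[:-1], X[1:]
    M, *_ = np.linalg.lstsq(past, future, rcond=None)  # past @ M ~ future
    return M.T                                         # so A_hat = M^T
```

The paper's contribution is the analysis: high-probability error bounds for this estimator from a single trajectory, without mixing-time arguments.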

Proceedings ArticleDOI
11 Jun 2018
TL;DR: A pseudopolynomial time algorithm is presented for finding allocations that are EF1 and Pareto efficient, bypassing the NP-hardness of maximizing Nash social welfare and also yielding a polynomial-time 1.45-approximation to the Nash social welfare objective.
Abstract: We study the problem of allocating a set of indivisible goods among a set of agents in a fair and efficient manner. An allocation is said to be fair if it is envy-free up to one good (EF1), which means that each agent prefers its own bundle over the bundle of any other agent up to the removal of one good. In addition, an allocation is deemed efficient if it satisfies Pareto efficiency. While each of these well-studied properties is easy to achieve separately, achieving them together is far from obvious. Recently, Caragiannis et al. (2016) established the surprising result that when agents have additive valuations for the goods, there always exists an allocation that simultaneously satisfies these two seemingly incompatible properties. Specifically, they showed that an allocation that maximizes the Nash social welfare objective is both EF1 and Pareto efficient. However, the problem of maximizing Nash social welfare is NP-hard. As a result, this approach does not provide an efficient algorithm for finding a fair and efficient allocation. In this paper, we bypass this barrier, and develop a pseudopolynomial time algorithm for finding allocations that are EF1 and Pareto efficient; in particular, when the valuations are bounded, our algorithm finds such an allocation in polynomial time. Furthermore, we establish a stronger existence result compared to Caragiannis et al. (2016): For additive valuations, there always exists an allocation that is EF1 and fractionally Pareto efficient. Another key contribution of our work is to show that our algorithm provides a polynomial-time 1.45-approximation to the Nash social welfare objective. This improves upon the best known approximation ratio for this problem (namely, the 2-approximation algorithm of Cole et al., 2017), and also matches the lower bound on the integrality gap of the convex program of Cole et al. (2017). Unlike many of the existing approaches, our algorithm is completely combinatorial, and relies on constructing integral Fisher markets wherein specific equilibria are not only efficient, but also fair.
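For concreteness, EF1 is easy to verify for a given allocation under additive valuations: agent $i$ must not envy agent $j$ once $i$'s most-valued good is removed from $j$'s bundle. A small checker (names illustrative):

```python
def is_ef1(valuation, bundles):
    """valuation[i][g]: agent i's additive value for good g.
    bundles[i]: list of goods held by agent i."""
    agents = range(len(bundles))
    for i in agents:
        own = sum(valuation[i][g] for g in bundles[i])
        for j in agents:
            if i == j or not bundles[j]:
                continue
            other = sum(valuation[i][g] for g in bundles[j])
            best = max(valuation[i][g] for g in bundles[j])
            if own < other - best:  # envy persists after removing one good
                return False
    return True
```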

Posted Content
TL;DR: A lower bound and an upper bound, within a constant factor of one another, are proved for the total variation distance between two high-dimensional Gaussians.
Abstract: We prove a lower bound and an upper bound for the total variation distance between two high-dimensional Gaussians, which are within a constant factor of one another.
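A useful sanity check for such bounds is the equal-covariance case, where the total variation distance has a closed form in the Mahalanobis distance between the means (a standard formula, not the paper's general-case result):

```python
import numpy as np
from scipy.linalg import sqrtm, inv
from scipy.stats import norm

def tv_equal_covariance(mu1, mu2, Sigma):
    """TV(N(mu1, Sigma), N(mu2, Sigma)) = 2*Phi(delta/2) - 1,
    where delta is the Mahalanobis distance between the means."""
    root_inv = inv(np.real(sqrtm(Sigma)))
    delta = np.linalg.norm(root_inv @ (np.asarray(mu1) - np.asarray(mu2)))
    return 2.0 * norm.cdf(delta / 2.0) - 1.0
```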

Posted Content
TL;DR: A novel algorithmic framework for designing and analyzing model-based RL algorithms with theoretical guarantees is introduced and a meta-algorithm with a theoretical guarantee of monotone improvement to a local maximum of the expected reward is designed.
Abstract: Model-based reinforcement learning (RL) is considered to be a promising approach to reduce the sample complexity that hinders model-free RL. However, the theoretical understanding of such methods has been rather limited. This paper introduces a novel algorithmic framework for designing and analyzing model-based RL algorithms with theoretical guarantees. We design a meta-algorithm with a theoretical guarantee of monotone improvement to a local maximum of the expected reward. The meta-algorithm iteratively builds a lower bound of the expected reward based on the estimated dynamical model and sample trajectories, and then maximizes the lower bound jointly over the policy and the model. The framework extends the optimism-in-face-of-uncertainty principle to non-linear dynamical models in a way that requires no explicit uncertainty quantification. Instantiating our framework with simplifications gives a variant of model-based RL algorithms, Stochastic Lower Bound Optimization (SLBO). Experiments demonstrate that SLBO achieves state-of-the-art performance when one million or fewer samples are permitted on a range of continuous control benchmark tasks.

Journal ArticleDOI
TL;DR: This is the first result that gives different optimal rates for the left and right singular spaces under the same perturbation, and applications to low-rank matrix denoising and singular space estimation, high-dimensional clustering, and canonical correlation analysis are discussed.
Abstract: Perturbation bounds for singular spaces, in particular Wedin's $\sin\Theta$ theorem, are a fundamental tool in many fields including high-dimensional statistics, machine learning and applied mathematics. In this paper, we establish separate perturbation bounds, measured in both spectral and Frobenius $\sin\Theta$ distances, for the left and right singular subspaces. Lower bounds, which show that the individual perturbation bounds are rate-optimal, are also given. The new perturbation bounds are applicable to a wide range of problems. In this paper, we consider in detail applications to low-rank matrix denoising and singular space estimation, high-dimensional clustering and canonical correlation analysis (CCA). In particular, separate matching upper and lower bounds are obtained for estimating the left and right singular spaces. To the best of our knowledge, this is the first result that gives different optimal rates for the left and right singular spaces under the same perturbation.
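For reference, the $\sin\Theta$ distances are defined through the principal angles between subspaces: if $U$ and $\hat{U}$ have orthonormal columns, the angles $\theta_i$ satisfy $\cos\theta_i = \sigma_i(U^\top \hat{U})$, and the spectral and Frobenius distances are $\|\sin\Theta\| = \max_i \sin\theta_i$ and $\|\sin\Theta\|_F = (\sum_i \sin^2\theta_i)^{1/2}$.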

Posted Content
TL;DR: It is proved that any distribution-free high-confidence lower bound on mutual information estimated from N samples cannot be larger than $O(\ln N)$.
Abstract: Measuring mutual information from finite data is difficult. Recent work has considered variational methods maximizing a lower bound. In this paper, we prove that serious statistical limitations are inherent to any method of measuring mutual information. More specifically, we show that any distribution-free high-confidence lower bound on mutual information estimated from N samples cannot be larger than $O(\ln N)$.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a data driven distributionally robust chance constrained optimal power flow model (DRCC-OPF), which ensures that the worst-case probability of violating both the upper and lower limit of a line/bus capacity under a wide family of distributions is small.
Abstract: The uncertainty associated with renewable energy sources introduces significant challenges in optimal power flow (OPF) analysis. A variety of new approaches have been proposed that use chance constraints to limit line or bus overload risk in OPF models. Most existing formulations assume that the probability distributions associated with the uncertainty are known a priori or can be estimated accurately from empirical data, and/or use separate chance constraints for upper and lower line/bus limits. In this paper, we propose a data driven distributionally robust chance constrained optimal power flow model (DRCC-OPF), which ensures that the worst-case probability of violating both the upper and lower limit of a line/bus capacity under a wide family of distributions is small. Assuming that we can estimate the first and second moments of the underlying distributions based on empirical data, we propose an exact reformulation of DRCC-OPF as a tractable convex program. The key theoretical result behind this reformulation is a second-order cone programming (SOCP) reformulation of a general two-sided distributionally robust chance constrained set by lifting the set to a higher dimensional space. Our numerical study shows that the proposed SOCP formulation can be solved efficiently and that the results of our model are quite robust.
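The flavor of such moment-based reformulations can be seen in the one-sided case: among all distributions with mean $\mu$ and variance $\sigma^2$, the worst-case tail is given exactly by Cantelli's inequality, $\sup_{P} P(X \ge \mu + t) = \sigma^2/(\sigma^2 + t^2)$ for $t > 0$, so a chance constraint at level $\epsilon$ becomes the deterministic requirement $t \ge \sigma\sqrt{(1-\epsilon)/\epsilon}$. The paper's contribution is an exact SOCP treatment of the two-sided version, which cannot simply be split into two one-sided constraints.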

Proceedings Article
15 Feb 2018
TL;DR: The results indicate that a deep rectifier network can only have more linear regions than every shallow counterpart with the same number of neurons if that number exceeds the dimension of the input.
Abstract: We investigate the complexity of deep neural networks (DNN) that represent piecewise linear (PWL) functions. In particular, we study the number of linear regions, i.e. pieces, that a PWL function represented by a DNN can attain, both theoretically and empirically. We present (i) tighter upper and lower bounds for the maximum number of linear regions on rectifier networks, which are exact for inputs of dimension one; (ii) a first upper bound for multi-layer maxout networks; and (iii) a first method to perform exact enumeration or counting of the number of regions by modeling the DNN with a mixed-integer linear formulation. These bounds come from leveraging the dimension of the space defining each linear region. The results also indicate that a deep rectifier network can only have more linear regions than every shallow counterpart with the same number of neurons if that number exceeds the dimension of the input.
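An empirical (non-exact) way to count regions in one input dimension, for contrast with the paper's exact mixed-integer formulation: walk a fine grid along the input line and count distinct ReLU activation patterns, since each pattern corresponds to one linear piece. A sketch:

```python
import numpy as np

def count_patterns_1d(weights, biases, xs):
    """Count distinct activation patterns of a ReLU MLP along 1-D inputs xs.
    A lower bound on the number of linear regions (the grid may miss
    very small pieces)."""
    patterns = set()
    for x in xs:
        h = np.array([x], dtype=float)
        pattern = []
        for W, b in zip(weights, biases):
            z = W @ h + b
            pattern.append(tuple(z > 0))   # which units are active
            h = np.maximum(z, 0.0)
        patterns.add(tuple(pattern))
    return len(patterns)
```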

Proceedings ArticleDOI
13 Jul 2018
TL;DR: In this article, the authors study a generic reachability problem for feed-forward DNNs which, for a given set of inputs to the network and a Lipschitz-continuous function over its outputs, computes the lower and upper bound on the function values.
Abstract: Verifying correctness of deep neural networks (DNNs) is challenging. We study a generic reachability problem for feed-forward DNNs which, for a given set of inputs to the network and a Lipschitz-continuous function over its outputs, computes the lower and upper bound on the function values. Because the network and the function are Lipschitz continuous, all values in the interval between the lower and upper bound are reachable. We show how to obtain the safety verification problem, the output range analysis problem and a robustness measure by instantiating the reachability problem. We present a novel algorithm based on adaptive nested optimisation to solve the reachability problem. The technique has been implemented and evaluated on a range of DNNs, demonstrating its efficiency, scalability and ability to handle a broader class of networks than state-of-the-art verification approaches.
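The Lipschitz assumption is what makes reachability computable: from samples $f(x_i)$, every other value is pinned down to within $L\,\|x - x_i\|$, and refining where the bound is loosest yields convergent lower and upper bounds on the range. A one-dimensional sketch of this idea in the Piyavskii–Shubert style (the paper's adaptive nested scheme generalizes it to DNN outputs in higher dimensions):

```python
import numpy as np

def lipschitz_minimize(f, a, b, L, iters=200):
    """Globally minimize an L-Lipschitz f on [a, b] via sawtooth lower bounds.
    Returns (upper bound, lower bound) on the true minimum."""
    xs, ys = [a, b], [f(a), f(b)]
    for _ in range(iters):
        order = np.argsort(xs)
        xs = [xs[i] for i in order]
        ys = [ys[i] for i in order]
        # In each gap the two Lipschitz cones meet at the lowest reachable point.
        best_lb, best_x = np.inf, None
        for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
            x_new = 0.5 * (x0 + x1) + (y0 - y1) / (2.0 * L)
            lb = 0.5 * (y0 + y1) - 0.5 * L * (x1 - x0)
            if lb < best_lb:
                best_lb, best_x = lb, x_new
        xs.append(best_x)      # refine where the lower envelope is lowest
        ys.append(f(best_x))
    return min(ys), best_lb
```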

Journal ArticleDOI
TL;DR: The aim is to design an optimized slow-state feedback controller such that the stability of Markov jump singularly perturbed systems (MJSPSs) is guaranteed even in the faulty case, while the upper bound of the singular perturbation parameter (SPP) $\epsilon$ is improved simultaneously.

Book
09 Feb 2018
TL;DR: Nearly tight bounds are provided on the spatial complexity of oblivious $O(1)$-probe hash functions, which are defined to depend solely on their search key argument, establishing a significant gap between oblivious and nonoblivious search.
Abstract: The problem of constructing a dense static hash-based lookup table T for a set of n elements belonging to a universe $U = \{0, 1, 2, \ldots, m-1\}$ is considered. Nearly tight bounds on the spatial complexity of oblivious $O(1)$-probe hash functions, which are defined to depend solely on their search key argument, are provided. This establishes a significant gap between oblivious and nonoblivious search. In particular, the results include the following: • A lower bound showing that oblivious k-probe hash functions require a program size of $\Omega((n/k^{2})e^{-k}+\log \log m)$ bits, on average. • A probabilistic construction of a family of oblivious k-probe hash functions that can be specified in $O(n e^{-k}+\log \log m)$ bits, which nearly matches the above lower bound. • A variation of an explicit $O(1)$ time 1-probe (perfect) hash function family that can be specified in $O(n+\log \log m)$ bits, which is tight to within a constant factor of the lower bound.

Proceedings ArticleDOI
20 Jun 2018
TL;DR: Improved algorithms for independent component analysis and learning mixtures of Gaussians in the presence of outliers are developed, and a sharp upper bound is shown on the sum-of-squares norms for moment tensors of any distribution that satisfies the Poincaré inequality.
Abstract: We develop efficient algorithms for estimating low-degree moments of unknown distributions in the presence of adversarial outliers and design a new family of convex relaxations for k-means clustering based on the sum-of-squares method. As an immediate corollary, for any $\gamma > 0$, we obtain an efficient algorithm for learning the means of a mixture of k arbitrary distributions in $\mathbb{R}^d$ in time $d^{O(1/\gamma)}$ so long as the means have separation $\Omega(k^{\gamma})$. This in particular yields an algorithm for learning Gaussian mixtures with separation $\Omega(k^{\gamma})$, thus partially resolving an open problem of Regev and Vijayaraghavan (2017). The guarantees of our robust estimation algorithms improve in many cases significantly over the best previous ones, obtained in recent works. We also show that the guarantees of our algorithms match information-theoretic lower bounds for the class of distributions we consider. These improved guarantees allow us to give improved algorithms for independent component analysis and learning mixtures of Gaussians in the presence of outliers. We also show a sharp upper bound on the sum-of-squares norms for moment tensors of any distribution that satisfies the Poincaré inequality. The Poincaré inequality is a central inequality in probability theory, and a large class of distributions satisfy it, including Gaussians, product distributions, strongly log-concave distributions, and any sum or uniformly continuous transformation of such distributions. As a consequence, all of the above algorithmic improvements hold for distributions satisfying the Poincaré inequality.

Journal ArticleDOI
TL;DR: By constructing a proper Lyapunov–Krasovskii functional, global asymptotic stability of the neural network is analyzed for two types of time-varying delays, depending on whether or not the lower bound of the delay derivative is known.
Abstract: This paper is concerned with global asymptotic stability of a neural network with a time-varying delay, where the delay function is differentiable and uniformly bounded, with its derivative bounded from above. First, a general reciprocally convex inequality is presented by introducing some slack vectors with flexible dimensions. This inequality provides a tighter bound, in the form of a convex combination, than some existing ones. Second, by constructing a proper Lyapunov–Krasovskii functional, global asymptotic stability of the neural network is analyzed for two types of time-varying delays, depending on whether or not the lower bound of the delay derivative is known. Third, noticing that the sufficient stability conditions obtained from estimating the derivative of the Lyapunov–Krasovskii functional are affine in both the delay function and its derivative, allowable delay sets can be refined to produce less conservative stability criteria for the neural network under study. Finally, two numerical examples are given to substantiate the effectiveness of the proposed method.