
Showing papers on "Function (mathematics) published in 2013"


Posted Content
TL;DR: This work considers a small-scale version of conditional computation, where sparse stochastic units form a distributed representation of gaters that can turn off, in combinatorially many ways, large chunks of the computation performed in the rest of the neural network.
Abstract: Stochastic neurons and hard non-linearities can be useful for a number of reasons in deep learning models, but in many cases they pose a challenging problem: how to estimate the gradient of a loss function with respect to the input of such stochastic or non-smooth neurons? I.e., can we "back-propagate" through these stochastic neurons? We examine this question, existing approaches, and compare four families of solutions, applicable in different settings. One of them is the minimum variance unbiased gradient estimator for stochastic binary neurons (a special case of the REINFORCE algorithm). A second approach, introduced here, decomposes the operation of a binary stochastic neuron into a stochastic binary part and a smooth differentiable part, which approximates the expected effect of the pure stochastic binary neuron to first order. A third approach involves the injection of additive or multiplicative noise in a computational graph that is otherwise differentiable. A fourth approach heuristically copies the gradient with respect to the stochastic output directly as an estimator of the gradient with respect to the sigmoid argument (we call this the straight-through estimator). To explore a context where these estimators are useful, we consider a small-scale version of conditional computation, where sparse stochastic units form a distributed representation of gaters that can turn off, in combinatorially many ways, large chunks of the computation performed in the rest of the neural network. In this case, it is important that the gating units produce an actual 0 most of the time. The resulting sparsity can potentially be exploited to greatly reduce the computational cost of large deep networks for which conditional computation would be useful.
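
A minimal sketch of the straight-through estimator described above, written by us in PyTorch (the class name and toy loss are invented for illustration): the forward pass samples hard binary gates, and the backward pass copies the gradient straight back to the sigmoid argument.

```python
import torch

class StochasticBinary(torch.autograd.Function):
    """Stochastic binary neuron with the straight-through gradient estimator."""

    @staticmethod
    def forward(ctx, logits):
        p = torch.sigmoid(logits)          # firing probability
        return torch.bernoulli(p)          # hard 0/1 sample

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through: copy the gradient w.r.t. the binary output
        # directly back to the sigmoid argument (the logits).
        return grad_output

# Toy usage: sparse stochastic gates over a hidden layer.
h = torch.randn(4, 8, requires_grad=True)
gates = StochasticBinary.apply(h)          # mostly exact zeros and ones
loss = (gates * h).pow(2).mean()
loss.backward()                            # gradients flow "through" the sampling
print(h.grad.shape)
```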

2,178 citations


Journal ArticleDOI
TL;DR: A revised version of GRASP2K supports the earlier non-block and block versions of the codes, as well as a new block version in which the njgraf library module has been replaced by the librang angular package developed by Gaigalas, based on Dirac–Hartree–Fock theory.

494 citations


Journal ArticleDOI
TL;DR: The proposed adaptive fuzzy tracking controller guarantees that all signals in the closed-loop system are bounded in probability and the system output eventually converges to a small neighborhood of the desired reference signal in the sense of mean quartic value.
Abstract: This paper is concerned with the problem of adaptive fuzzy tracking control for a class of pure-feedback stochastic nonlinear systems with input saturation. To overcome the design difficulty arising from the nondifferentiable saturation nonlinearity, a smooth nonlinear function of the control input signal is first introduced to approximate the saturation function; then, an adaptive fuzzy tracking controller based on the mean-value theorem is constructed by using the backstepping technique. The proposed adaptive fuzzy controller guarantees that all signals in the closed-loop system are bounded in probability and that the system output eventually converges to a small neighborhood of the desired reference signal in the sense of mean quartic value. Simulation results further illustrate the effectiveness of the proposed control scheme.
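
The abstract does not specify the smooth approximation it introduces; a tanh surrogate is one common way to smooth a hard saturation, sketched here only as an illustration (the bound u_max and the surrogate form are our assumptions, not the paper's choice).

```python
import numpy as np

# Hypothetical smooth stand-in for a hard input saturation sat(v) = clip(v, -u_max, u_max).
# A tanh-based surrogate is one common choice; the paper's exact function may differ.
u_max = 2.0
v = np.linspace(-6, 6, 13)

sat_hard = np.clip(v, -u_max, u_max)
sat_smooth = u_max * np.tanh(v / u_max)       # differentiable everywhere

print(np.max(np.abs(sat_hard - sat_smooth)))  # bounded approximation error
```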

386 citations


Journal ArticleDOI
TL;DR: It is proposed that Type II models currently offer the best compromise between accuracy and practicability for full-scale engineering applications, and that the recommended approach is uncoupled models.

337 citations


Proceedings Article
17 Jan 2013
TL;DR: It is shown that the auto-encoder captures the score (derivative of the log-density with respect to the input), contradicting previous interpretations of reconstruction error as an energy function.
Abstract: What do auto-encoders learn about the underlying data generating distribution? Recent work suggests that some auto-encoder variants do a good job of capturing the local manifold structure of data. This paper clarifies some of these previous observations by showing that minimizing a particular form of regularized reconstruction error yields a reconstruction function that locally characterizes the shape of the data generating density. We show that the auto-encoder captures the score (derivative of the log-density with respect to the input), which contradicts previous interpretations of reconstruction error as an energy function. Unlike previous results, the theorems provided here are completely generic and do not depend on the parametrization of the auto-encoder: they show what the auto-encoder would tend to if given enough capacity and examples. These results are for a contractive training criterion that we show to be similar to the denoising auto-encoder training criterion with small corruption noise, but with contraction applied to the whole reconstruction function rather than just the encoder. Similarly to score matching, one can consider the proposed training criterion as a convenient alternative to maximum likelihood because it does not involve a partition function. Finally, we show how an approximate Metropolis-Hastings MCMC can be set up to recover samples from the estimated distribution, and this is confirmed in sampling experiments.
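
A quick numerical illustration of the score connection claimed above, under our own Gaussian assumption rather than a trained auto-encoder: for Gaussian data the optimal denoising reconstruction has a closed form, and (r(x) - x)/σ² approaches the true score as the corruption noise shrinks.

```python
import numpy as np

# For x ~ N(0, s^2) corrupted with Gaussian noise of variance sigma^2, the optimal
# denoising reconstruction is the posterior mean r(x) = s^2 / (s^2 + sigma^2) * x.
# The result above says (r(x) - x) / sigma^2 should approximate the score
# d/dx log p(x) = -x / s^2 as sigma -> 0.  Quick numerical check:
s2, sigma2 = 1.0, 1e-3
x = np.linspace(-3, 3, 7)

r = (s2 / (s2 + sigma2)) * x          # optimal reconstruction function
score_estimate = (r - x) / sigma2     # derived from the reconstruction
score_true = -x / s2                  # exact score of N(0, s^2)

print(np.max(np.abs(score_estimate - score_true)))   # small; shrinks with sigma^2
```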

318 citations


Journal ArticleDOI
TL;DR: It is found that when f is locally Lipschitz and semi-algebraic with bounded sublevel sets, the BFGS method with the inexact line search almost always generates sequences whose cluster points are Clarke stationary and with function values converging R-linearly to a Clarke stationary value.
Abstract: We investigate the behavior of quasi-Newton algorithms applied to minimize a nonsmooth function f, not necessarily convex. We introduce an inexact line search that generates a sequence of nested intervals containing a set of points of nonzero measure that satisfy the Armijo and Wolfe conditions if f is absolutely continuous along the line. Furthermore, the line search is guaranteed to terminate if f is semi-algebraic. It seems quite difficult to establish a convergence theorem for quasi-Newton methods applied to such general classes of functions, so we give a careful analysis of a special but illuminating case, the Euclidean norm, in one variable using the inexact line search and in two variables assuming that the line search is exact. In practice, we find that when f is locally Lipschitz and semi-algebraic with bounded sublevel sets, the BFGS (Broyden-Fletcher-Goldfarb-Shanno) method with the inexact line search almost always generates sequences whose cluster points are Clarke stationary and with function values converging R-linearly to a Clarke stationary value. We give references documenting the successful use of BFGS in a variety of nonsmooth applications, particularly the design of low-order controllers for linear dynamical systems. We conclude with a challenging open question.

311 citations


Journal ArticleDOI
TL;DR: A new web application for calculating the dark matter halo mass function (HMF) is presented—the frontend HMFcalc and the engine hmf, designed to be flexible, efficient and easy to use.

272 citations


Posted Content
TL;DR: This work considers the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk.
Abstract: We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk. We focus on problems without strong convexity, for which all previously known algorithms achieve a convergence rate for function values of O(1/n^{1/2}). We consider and analyze two algorithms that achieve a rate of O(1/n) for classical supervised learning problems. For least-squares regression, we show that averaged stochastic gradient descent with constant step-size achieves the desired rate. For logistic regression, this is achieved by a simple novel stochastic gradient algorithm that (a) constructs successive local quadratic approximations of the loss functions, while (b) preserving the same running time complexity as stochastic gradient descent. For these algorithms, we provide a non-asymptotic analysis of the generalization error (in expectation, and also in high probability for least-squares), and run extensive experiments on standard machine learning benchmarks showing that they often outperform existing approaches.
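
A bare-bones sketch of the least-squares case described above: averaged stochastic gradient descent with a constant step size on synthetic data. The problem sizes and the step-size heuristic are ours, not the constants analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem: y = <w_true, x> + noise.
n, d = 10_000, 20
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)

# Averaged SGD with a constant step size (Polyak-Ruppert averaging),
# the scheme the abstract credits with the O(1/n) rate for least squares.
w = np.zeros(d)
w_avg = np.zeros(d)
step = 1.0 / (4 * np.mean(np.sum(X**2, axis=1)))   # heuristic constant step

for t in range(n):
    i = rng.integers(n)
    grad = (X[i] @ w - y[i]) * X[i]                # unbiased gradient of 0.5*(x.w - y)^2
    w -= step * grad
    w_avg += (w - w_avg) / (t + 1)                 # running average of the iterates

print(np.linalg.norm(w_avg - w_true))              # small estimation error
```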

266 citations


Proceedings Article
05 Dec 2013
TL;DR: In this article, the authors consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk.
Abstract: We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk. We focus on problems without strong convexity, for which all previously known algorithms achieve a convergence rate for function values of O(1/√n) after n iterations. We consider and analyze two algorithms that achieve a rate of O(1/n) for classical supervised learning problems. For least-squares regression, we show that averaged stochastic gradient descent with constant step-size achieves the desired rate. For logistic regression, this is achieved by a simple novel stochastic gradient algorithm that (a) constructs successive local quadratic approximations of the loss functions, while (b) preserving the same running-time complexity as stochastic gradient descent. For these algorithms, we provide a non-asymptotic analysis of the generalization error (in expectation, and also in high probability for least-squares), and run extensive experiments showing that they often outperform existing approaches.

241 citations


Posted Content
13 Feb 2013
TL;DR: In this paper, the tradeoff between privacy guarantees and the utility of the resulting statistical estimators was studied under local privacy constraints, and lower and upper bounds on mutual information and Kullback-Leibler divergence were established.
Abstract: Working under a model of privacy in which data remains private even from the statistician, we study the tradeoff between privacy guarantees and the utility of the resulting statistical estimators. We prove bounds on information-theoretic quantities, including mutual information and Kullback-Leibler divergence, that influence estimation rates as a function of the amount of privacy preserved. When combined with standard minimax techniques such as Le Cam's and Fano's methods, these inequalities allow for a precise characterization of statistical rates under local privacy constraints. In this paper, we provide a complete treatment of three canonical problem families: mean estimation in location family models, parameter estimation in fixed-design regression, and convex risk minimization. For all of these families, we provide lower and upper bounds that match up to constant factors, giving privacy-preserving mechanisms and computationally efficient estimators that achieve the bounds.
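
To make the local-privacy setting concrete, here is a small sketch (ours, not the paper's minimax-optimal mechanism): each data holder releases only a Laplace-perturbed value, and the statistician averages the noisy releases.

```python
import numpy as np

rng = np.random.default_rng(1)

# Local privacy: each data holder perturbs its own value before release, so even the
# statistician never sees raw data.  A clipped Laplace mechanism for mean estimation,
# as an illustration of the setting only.
epsilon = 1.0          # local differential privacy parameter
B = 1.0                # data assumed to lie in [-B, B]
x = rng.uniform(-1, 1, size=5_000)

sensitivity = 2 * B
z = np.clip(x, -B, B) + rng.laplace(scale=sensitivity / epsilon, size=x.size)

print(x.mean(), z.mean())   # utility degrades as epsilon shrinks
```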

223 citations


Journal ArticleDOI
TL;DR: This work considers a class of image denoising models incorporating $\ell_p$-norm-based analysis priors using a fixed set of linear operators and devises semismooth Newton methods for solving the resulting nonsmooth bilevel optimization problems.
Abstract: In this work we consider the problem of parameter learning for variational image denoising models. The learning problem is formulated as a bilevel optimization problem, where the lower-level problem is given by the variational model and the higher-level problem is expressed by means of a loss function that penalizes errors between the solution of the lower-level problem and the ground truth data. We consider a class of image denoising models incorporating $\ell_p$-norm-based analysis priors using a fixed set of linear operators. We devise semismooth Newton methods for solving the resulting nonsmooth bilevel optimization problems and show that the optimized image denoising models can achieve state-of-the-art performance.

Journal ArticleDOI
TL;DR: This work provides a criterion on m that describes the needed amount of regularization to ensure that the least squares method is stable and that its accuracy, measured in L2(X,ρX), is comparable to the best approximation error of f by elements from Vm.
Abstract: We consider the problem of reconstructing an unknown function f on a domain X from samples of f at n randomly chosen points with respect to a given measure ρX. Given a sequence of linear spaces (Vm)m>0 with dim(Vm)=m≤n, we study the least squares approximations from the spaces Vm. It is well known that such approximations can be inaccurate when m is too close to n, even when the samples are noiseless. Our main result provides a criterion on m that describes the needed amount of regularization to ensure that the least squares method is stable and that its accuracy, measured in L2(X,ρX), is comparable to the best approximation error of f by elements from Vm. We illustrate this criterion for various approximation schemes, such as trigonometric polynomials, with ρX being the uniform measure, and algebraic polynomials, with ρX being either the uniform or Chebyshev measure. For such examples we also prove similar stability results using deterministic samples that are equispaced with respect to these measures.
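
A small illustration of the setting with a trigonometric basis and the uniform measure; the sample size, the number of frequencies, and the "keep m well below n" rule of thumb are only indicative of the paper's criterion, not a statement of it.

```python
import numpy as np

rng = np.random.default_rng(2)

# Recover a smooth f on [0, 1] from n random (uniform) samples by least squares in a
# trigonometric space of dimension m = 2K + 1.  The criterion in the paper roughly
# requires m of order n / log n for this case; here m is simply taken far below n.
f = lambda t: np.exp(np.sin(2 * np.pi * t))
n, K = 2_000, 20                      # m = 2K + 1 = 41 basis functions, far below n
t = rng.uniform(0, 1, size=n)

def design(ts):
    freqs = np.arange(1, K + 1)
    return np.hstack([np.ones((len(ts), 1)),
                      np.cos(2 * np.pi * np.outer(ts, freqs)),
                      np.sin(2 * np.pi * np.outer(ts, freqs))])

coef, *_ = np.linalg.lstsq(design(t), f(t), rcond=None)

t_test = np.linspace(0, 1, 500)
print(np.max(np.abs(design(t_test) @ coef - f(t_test))))   # small uniform error
```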

Journal ArticleDOI
TL;DR: The three-loop remainder function as discussed by the authors describes the scattering of six gluons in the maximally-helicity-violating configuration in planar N = 4 super-Yang-Mills theory, as a function of the three dual conformal cross ratios.
Abstract: We present the three-loop remainder function, which describes the scattering of six gluons in the maximally-helicity-violating configuration in planar N = 4 super-Yang-Mills theory, as a function of the three dual conformal cross ratios. The result can be expressed in terms of multiple Goncharov polylogarithms. We also employ a more restricted class of hexagon functions which have the correct branch cuts and certain other restrictions on their symbols. We classify all the hexagon functions through transcendental weight five, using the coproduct for their Hopf algebra iteratively, which amounts to a set of first-order differential equations. The three-loop remainder function is a particular weight-six hexagon function, whose symbol was determined previously. The differential equations can be integrated numerically for generic values of the cross ratios, or analytically in certain kinematic limits, including the near-collinear and multi-Regge limits. These limits allow us to impose constraints from the operator product expansion and multi-Regge factorization directly at the function level, and thereby to fix uniquely a set of Riemann ζ-valued constants that could not be fixed at the level of the symbol. The near-collinear limits agree precisely with recent predictions by Basso, Sever and Vieira based on integrability. The multi-Regge limits agree with the factorization formula of Fadin and Lipatov, and determine three constants entering the impact factor at this order. We plot the three-loop remainder function for various slices of the Euclidean region of positive cross ratios, and compare it to the two-loop one. For large ranges of the cross ratios, the ratio of the three-loop to the two-loop remainder function is relatively constant, and close to −7.

Journal ArticleDOI
TL;DR: A convergence analysis of accelerated forward-backward splitting methods for composite function minimization, when the proximity operator is not available in closed form, and can only be computed up to a certain precision is proposed.
Abstract: We propose a convergence analysis of accelerated forward-backward splitting methods for composite function minimization, when the proximity operator is not available in closed form, and can only be computed up to a certain precision. We prove that the $1/k^2$ convergence rate for the function values can be achieved if the admissible errors are of a certain type and satisfy a sufficiently fast decay condition. Our analysis is based on the machinery of estimate sequences first introduced by Nesterov for the study of accelerated gradient descent algorithms. Furthermore, we give a global complexity analysis, taking into account the cost of computing admissible approximations of the proximal point. An experimental analysis is also presented.
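
For context, a bare-bones accelerated forward-backward (FISTA-style) loop for an ℓ1-regularized least-squares problem, using an exact soft-thresholding prox; the paper's analysis concerns the harder case where this prox step is computed only approximately, which the sketch does not model.

```python
import numpy as np

# Accelerated forward-backward sketch for min_x 0.5*||Ax - b||^2 + lam*||x||_1,
# with an exact l1 prox (soft-thresholding).
rng = np.random.default_rng(3)
A = rng.normal(size=(200, 500))
b = rng.normal(size=200)
lam = 0.1
L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part's gradient

def prox_l1(v, thr):
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

x = np.zeros(500)
y_k = x.copy()
t = 1.0
for _ in range(200):
    grad = A.T @ (A @ y_k - b)
    x_next = prox_l1(y_k - grad / L, lam / L)
    t_next = (1 + np.sqrt(1 + 4 * t**2)) / 2
    y_k = x_next + ((t - 1) / t_next) * (x_next - x)   # Nesterov extrapolation
    x, t = x_next, t_next

print(0.5 * np.linalg.norm(A @ x - b)**2 + lam * np.abs(x).sum())
```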

Journal ArticleDOI
TL;DR: Two generalized corollaries to the LaSalle-Yoshizawa Theorem are presented for nonautonomous systems described by nonlinear differential equations with discontinuous right-hand sides.
Abstract: In this technical note, two generalized corollaries to the LaSalle-Yoshizawa Theorem are presented for nonautonomous systems described by nonlinear differential equations with discontinuous right-hand sides. Lyapunov-based analysis methods that achieve asymptotic convergence when the candidate Lyapunov derivative is upper bounded by a negative semi-definite function in the presence of differential inclusions are presented. A design example illustrates the utility of the corollaries.

Book ChapterDOI
03 Mar 2013
TL;DR: Signatures of Correct Computation is introduced, a new model for verifying dynamic computations in cloud settings and it is shown that signatures of correct computation imply Publicly Verifiable Computation (PVC), a model recently introduced in several concurrent and independent works.
Abstract: We introduce Signatures of Correct Computation (SCC), a new model for verifying dynamic computations in cloud settings. In the SCC model, a trusted source outsources a function f to an untrusted server, along with a public key for that function (to be used during verification). The server can then produce a succinct signature σ vouching for the correctness of the computation of f, i.e., that some result v is indeed the correct outcome of the function f evaluated on some point a. There are two crucial performance properties that we want to guarantee in an SCC construction: (1) verifying the signature should take asymptotically less time than evaluating the function f; and (2) the public key should be efficiently updated whenever the function changes. We construct SCC schemes (satisfying the above two properties) supporting expressive manipulations over multivariate polynomials, such as polynomial evaluation and differentiation. Our constructions are adaptively secure in the random oracle model and achieve optimal updates, i.e., the function's public key can be updated in time proportional to the number of updated coefficients, without performing a linear-time computation (in the size of the polynomial). We also show that signatures of correct computation imply Publicly Verifiable Computation (PVC), a model recently introduced in several concurrent and independent works. Roughly speaking, in the SCC model, any client can verify the signature σ and be convinced of some computation result, whereas in the PVC model only the client that issued a query (or anyone who trusts this client) can verify that the server returned a valid signature (proof) for the answer to the query. Our techniques can be readily adapted to construct PVC schemes with adaptive security, efficient updates and without the random oracle model.

Journal ArticleDOI
TL;DR: It is shown that for certain classes of admissible inputs, the existence of an ISS-Lyapunov function implies the ISS of a system, and a linearization principle is proved that allows the construction of a local ISS-Lyapunov function for a system.
Abstract: We develop tools for investigation of input-to-state stability (ISS) of infinite-dimensional control systems. We show that for certain classes of admissible inputs, the existence of an ISS-Lyapunov function implies the ISS of a system. Then for the case of systems described by abstract equations in Banach spaces, we develop two methods of construction of local and global ISS-Lyapunov functions. We prove a linearization principle that allows the construction of a local ISS-Lyapunov function for a system whose linear approximation is ISS. In order to study the interconnections of nonlinear infinite-dimensional systems, we generalize the small-gain theorem to the case of infinite-dimensional systems and provide a way to construct an ISS-Lyapunov function for an entire interconnection, if ISS-Lyapunov functions for subsystems are known and the small-gain condition is satisfied. We illustrate the theory on examples of linear and semilinear reaction-diffusion equations.

Journal ArticleDOI
TL;DR: This study provides a solution to the problem of determining the values of the loss function in decision-theoretic rough sets (DTRS) and extends its range of applications through the use of particle swarm optimization.

Journal ArticleDOI
TL;DR: The numerical experiments show that SO-MI achieves significantly better results than the other algorithms when the number of function evaluations is very restricted (200-300 evaluations), and that the algorithm converges to the global optimum almost surely.

Journal ArticleDOI
TL;DR: In this paper, a Bayesian approach is adopted to the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map applied to u. The prior measure is specified as a Gaussian random field μ0.
Abstract: We consider the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map $\mathcal {G}$ applied to u. We adopt a Bayesian approach to the problem and work in a setting where the prior measure is specified as a Gaussian random field μ0. We work under a natural set of conditions on the likelihood which implies the existence of a well-posed posterior measure, μy. Under these conditions, we show that the maximum a posteriori (MAP) estimator is well defined as the minimizer of an Onsager–Machlup functional defined on the Cameron–Martin space of the prior; thus, we link a problem in probability with a problem in the calculus of variations. We then consider the case where the observational noise vanishes and establish a form of Bayesian posterior consistency for the MAP estimator. We also prove a similar result for the case where the observation of $\mathcal {G}(u)$ can be repeated as many times as desired with independent identically distributed noise. The theory is illustrated with examples from an inverse problem for the Navier–Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics.

Journal ArticleDOI
TL;DR: In this article, a normal approximation for the functional sample mean is developed and used to asymptotically justify testing procedures for the equality of means in two functional samples exhibiting temporal dependence.
Abstract: The paper is concerned with inference based on the mean function of a functional time series. We develop a normal approximation for the functional sample mean and then focus on the estimation of the asymptotic variance kernel. Using these results, we develop and asymptotically justify testing procedures for the equality of means in two functional samples exhibiting temporal dependence. Evaluated by means of a simulation study and application to a real data set, these two-sample procedures enjoy good size and power in finite samples.

Book ChapterDOI
TL;DR: The Gaussian Process Upper Confidence Bound and Pure exploration algorithm (GP-UCB-PE) is introduced which combines the UCB strategy and Pure Exploration in the same batch of evaluations along the parallel iterations and proves theoretical upper bounds on the regret with batches of size K for this procedure.
Abstract: In this paper, we consider the challenge of maximizing an unknown function f for which evaluations are noisy and are acquired with high cost. An iterative procedure uses the previous measures to actively select the next estimation of f which is predicted to be the most useful. We focus on the case where the function can be evaluated in parallel with batches of fixed size and analyze the benefit compared to the purely sequential procedure in terms of cumulative regret. We introduce the Gaussian Process Upper Confidence Bound and Pure Exploration algorithm (GP-UCB-PE) which combines the UCB strategy and Pure Exploration in the same batch of evaluations along the parallel iterations. We prove theoretical upper bounds on the regret with batches of size K for this procedure which show the improvement of the order of $\sqrt{K}$ for fixed iteration cost over purely sequential versions. Moreover, the multiplicative constants involved have the property of being dimension-free. We also confirm empirically the efficiency of GP-UCB-PE on real and synthetic problems compared to state-of-the-art competitors.
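
A rough one-batch sketch of the GP-UCB-PE idea on a 1-D grid (ours, using scikit-learn's GP): the first point maximizes the UCB, and the remaining points are pure-exploration picks chosen by posterior uncertainty after fantasizing the GP mean at earlier picks, a common heuristic (the GP posterior variance does not depend on the fantasized values).

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(4)
f = lambda x: np.sin(3 * x) + 0.5 * np.cos(7 * x)     # toy objective
grid = np.linspace(0, 2, 200).reshape(-1, 1)

X = rng.uniform(0, 2, size=(5, 1))                    # initial noisy evaluations
y = f(X).ravel() + 0.05 * rng.normal(size=5)

K, beta = 4, 2.0                                      # batch size, UCB exploration weight
batch = []
X_fant, y_fant = X.copy(), y.copy()
for j in range(K):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3),
                                  alpha=0.05**2).fit(X_fant, y_fant)
    mu, sd = gp.predict(grid, return_std=True)
    score = mu + np.sqrt(beta) * sd if j == 0 else sd  # UCB first, then pure exploration
    x_new = grid[np.argmax(score)]
    batch.append(float(x_new[0]))
    # Fantasize the GP mean as the outcome so later picks spread out.
    X_fant = np.vstack([X_fant, x_new.reshape(1, 1)])
    y_fant = np.append(y_fant, gp.predict(x_new.reshape(1, -1))[0])

print(batch)   # K points to evaluate in parallel
```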

Journal ArticleDOI
TL;DR: This article considers the problem of constructing nonparametric tolerance/prediction sets by starting from the general conformal prediction approach, and uses a kernel density estimator as a measure of agreement between a sample point and the underlying distribution.
Abstract: This article introduces a new approach to prediction by bringing together two different nonparametric ideas: distribution-free inference and nonparametric smoothing. Specifically, we consider the problem of constructing nonparametric tolerance/prediction sets. We start from the general conformal prediction approach, and we use a kernel density estimator as a measure of agreement between a sample point and the underlying distribution. The resulting prediction set is shown to be closely related to plug-in density level sets with carefully chosen cutoff values. Under standard smoothness conditions, we get an asymptotic efficiency result that is near optimal for a wide range of function classes. But the coverage is guaranteed whether or not the smoothness conditions hold and regardless of the sample size. The performance of our method is investigated through simulation studies and illustrated in a real data example.
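
A simplified, split-conformal version of the construction (the paper uses full conformal prediction): a kernel density estimate supplies the conformity score, and the prediction set is a plug-in density level set cut at an empirical quantile.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(5)

# Split version of the idea: fit a KDE on one half of the data, calibrate the density
# cutoff on the other half, and report a plug-in density level set.
data = rng.normal(0, 1, size=1_000)
train, calib = data[:500], data[500:]

kde = gaussian_kde(train)
scores = kde(calib)                                  # density at calibration points
alpha = 0.1
cutoff = np.quantile(scores, alpha)                  # ~alpha of calibration mass below

grid = np.linspace(-4, 4, 801)
prediction_set = grid[kde(grid) >= cutoff]           # plug-in density level set
print(prediction_set.min(), prediction_set.max())    # roughly the central 90% interval
```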

Proceedings ArticleDOI
06 May 2013
TL;DR: An approach to learning objective functions for robotic manipulation based on inverse reinforcement learning that can deal with high-dimensional continuous state-action spaces, and only requires local optimality of demonstrated trajectories is presented.
Abstract: We present an approach to learning objective functions for robotic manipulation based on inverse reinforcement learning. Our path integral inverse reinforcement learning algorithm can deal with high-dimensional continuous state-action spaces, and only requires local optimality of demonstrated trajectories. We use L1 regularization in order to achieve feature selection, and propose an efficient algorithm to minimize the resulting convex objective function. We demonstrate our approach by applying it to two core problems in robotic manipulation. First, we learn a cost function for redundancy resolution in inverse kinematics. Second, we use our method to learn a cost function over trajectories, which is then used in optimization-based motion planning for grasping and manipulation tasks. Experimental results show that our method outperforms previous algorithms in high-dimensional settings.

Book ChapterDOI
01 Jan 2013
TL;DR: In this paper, when the short-term interest rate is considered as a random variable, there is an unknown function λ(r, t), called the market price of risk, in the governing equation.
Abstract: As pointed out in Sect. 2.3, when the short-term interest rate is considered as a random variable, there is an unknown function λ(r, t), called the market price of risk, in the governing equation.

Proceedings ArticleDOI
03 Aug 2013
TL;DR: This work proposes LSE, an algorithm that guides both sampling and classification based on GP-derived confidence bounds, and extends LSE and its theory to two more natural settings: (1) where the threshold level is implicitly defined as a percentage of the (unknown) maximum of the target function and (2) where samples are selected in batches.
Abstract: Many information gathering problems require determining the set of points, for which an unknown function takes value above or below some given threshold level. We formalize this task as a classification problem with sequential measurements, where the unknown function is modeled as a sample from a Gaussian process (GP). We propose LSE, an algorithm that guides both sampling and classification based on GP-derived confidence bounds, and provide theoretical guarantees about its sample complexity. Furthermore, we extend LSE and its theory to two more natural settings: (1) where the threshold level is implicitly defined as a percentage of the (unknown) maximum of the target function and (2) where samples are selected in batches. We evaluate the effectiveness of our proposed methods on two problems of practical interest, namely autonomous monitoring of algal populations in a lake environment and geolocating network latency.
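
The core classification rule of an LSE-style step, sketched with the GP posterior mean and standard deviation assumed as given; the function name, ambiguity rule details, and parameter values are our simplification of the algorithm.

```python
import numpy as np

# Given GP posterior mean `mu` and standard deviation `sd` on a candidate grid
# (how mu/sd are obtained is omitted), classify points against threshold h and
# pick the most ambiguous still-unclassified point as the next measurement.
def lse_step(mu, sd, h, beta=3.0, eps=0.05):
    lower = mu - np.sqrt(beta) * sd
    upper = mu + np.sqrt(beta) * sd
    above = lower > h - eps             # confidently above the level
    below = upper < h + eps             # confidently below the level
    unknown = ~(above | below)
    # Ambiguity: how far the confidence interval sticks out past the threshold.
    ambiguity = np.minimum(upper - h, h - lower)
    ambiguity[~unknown] = -np.inf       # only unclassified points are candidates
    return above, below, int(np.argmax(ambiguity))

mu = np.array([0.2, 0.9, 1.1, 1.6])
sd = np.array([0.3, 0.4, 0.1, 0.2])
print(lse_step(mu, sd, h=1.0))
```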

Proceedings ArticleDOI
27 Oct 2013
TL;DR: In this article, a static greedy algorithm, named StaticGreedy, is proposed to strictly guarantee the submodularity of influence spread function during the seed selection process, which makes the computational expense dramatically reduced by two orders of magnitude without loss of accuracy.
Abstract: Influence maximization, defined as a problem of finding a set of seed nodes to trigger a maximized spread of influence, is crucial to viral marketing on social networks. For practical viral marketing on large scale social networks, it is required that influence maximization algorithms should have both guaranteed accuracy and high scalability. However, existing algorithms suffer a scalability-accuracy dilemma: conventional greedy algorithms guarantee the accuracy with expensive computation, while the scalable heuristic algorithms suffer from unstable accuracy. In this paper, we focus on solving this scalability-accuracy dilemma. We point out that the essential reason for the dilemma is the surprising fact that the submodularity, a key requirement of the objective function for a greedy algorithm to approximate the optimum, is not guaranteed in all conventional greedy algorithms in the literature of influence maximization. Therefore a greedy algorithm has to afford a huge number of Monte Carlo simulations to reduce the pain caused by unguaranteed submodularity. Motivated by this critical finding, we propose a static greedy algorithm, named StaticGreedy, to strictly guarantee the submodularity of the influence spread function during the seed selection process. The proposed algorithm makes the computational expense dramatically reduced by two orders of magnitude without loss of accuracy. Moreover, we propose a dynamical update strategy which can speed up the StaticGreedy algorithm by 2-7 times on large scale social networks.
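
A miniature sketch of the static idea: draw the live-edge snapshots once, then let every greedy marginal-gain evaluation reuse those same fixed snapshots. The toy graph, edge probabilities, and parameter values are invented for illustration.

```python
import random
from collections import defaultdict

random.seed(0)

# (node u, node v, activation probability) under the independent cascade model.
edges = [(0, 1, 0.4), (1, 2, 0.3), (0, 3, 0.2), (3, 4, 0.5), (2, 4, 0.3)]
nodes = {u for e in edges for u in e[:2]}
R, k = 200, 2                          # number of snapshots, number of seeds

def reachable(seeds, live_adj):
    seen, stack = set(seeds), list(seeds)
    while stack:
        u = stack.pop()
        for v in live_adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

# 1. Static phase: fix the randomness up front by sampling live-edge snapshots once.
snapshots = []
for _ in range(R):
    adj = defaultdict(list)
    for u, v, p in edges:
        if random.random() < p:        # edge is "live" in this snapshot
            adj[u].append(v)
    snapshots.append(adj)

# 2. Greedy phase: marginal gains reuse the same snapshots every round,
#    so the estimated spread stays submodular across iterations.
seeds = set()
for _ in range(k):
    gains = {u: sum(len(reachable(seeds | {u}, s)) for s in snapshots)
             for u in nodes - seeds}
    seeds.add(max(gains, key=gains.get))

print(seeds)
```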

Journal ArticleDOI
TL;DR: In this paper, the Taylor-Maclaurin coefficients of the function f are investigated when f is in the following subclasses: SΣ(λ,γ;ϕ), HSΣ(α), RΣ(η,γ;ϕ) and BΣ(μ;ϕ).
Abstract: In this paper, we introduce and investigate each of the following subclasses: SΣ(λ,γ;ϕ), HSΣ(α), RΣ(η,γ;ϕ) and BΣ(μ;ϕ) (0 ≤ λ ≤ 1; γ ∈ C∖{0}; α ∈ C; 0 ≤ η < 1; μ ≥ 0), and ϕ(D) is symmetric with respect to the real axis. We obtain coefficient bounds involving the Taylor-Maclaurin coefficients |a2| and |a3| of the function f when f is in these classes. The various results, which are presented in this paper, would generalize and improve those in related works of several earlier authors.

Journal ArticleDOI
23 Oct 2013-PLOS ONE
TL;DR: This work introduces diffusion state distance (DSD), a new metric based on a graph diffusion property, designed to capture finer-grained distinctions in proximity for transfer of functional annotation in PPI networks.
Abstract: In protein-protein interaction (PPI) networks, functional similarity is often inferred based on the function of directly interacting proteins, or more generally, some notion of interaction network proximity among proteins in a local neighborhood. Prior methods typically measure proximity as the shortest-path distance in the network, but this has only a limited ability to capture fine-grained neighborhood distinctions, because most proteins are close to each other, and there are many ties in proximity. We introduce diffusion state distance (DSD), a new metric based on a graph diffusion property, designed to capture finer-grained distinctions in proximity for transfer of functional annotation in PPI networks. We present a tool that, when input a PPI network, will output the DSD distances between every pair of proteins. We show that replacing the shortest-path metric by DSD improves the performance of classical function prediction methods across the board.
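
A toy computation of DSD following the definition above, with the random walk truncated at a finite number of steps (the paper takes the limit); the graph and the truncation length are ours.

```python
import numpy as np

# Diffusion state distance (DSD) on a toy graph: He^k(u) collects the expected visit
# counts of a k-step random walk started at u, and DSD(u, v) = || He^k(u) - He^k(v) ||_1.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)     # adjacency of a small undirected graph
P = A / A.sum(axis=1, keepdims=True)          # one-step random-walk transition matrix

k = 50
He = np.zeros_like(P)
step = np.eye(len(A))
for _ in range(k + 1):
    He += step                                # accumulate expected visit counts
    step = step @ P

def dsd(u, v):
    return np.abs(He[u] - He[v]).sum()

print(dsd(0, 1), dsd(0, 3))                   # pairwise DSD values on the toy graph
```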

Book ChapterDOI
23 Sep 2013
TL;DR: The Gaussian Process Upper Confidence Bound and Pure Exploration (GP-UCB-PE) algorithm as discussed by the authors combines the UCB strategy and pure exploration in the same batch of evaluations along the parallel iterations.
Abstract: In this paper, we consider the challenge of maximizing an unknown function f for which evaluations are noisy and are acquired with high cost. An iterative procedure uses the previous measures to actively select the next estimation of f which is predicted to be the most useful. We focus on the case where the function can be evaluated in parallel with batches of fixed size and analyze the benefit compared to the purely sequential procedure in terms of cumulative regret. We introduce the Gaussian Process Upper Confidence Bound and Pure Exploration algorithm (GP-UCB-PE) which combines the UCB strategy and Pure Exploration in the same batch of evaluations along the parallel iterations. We prove theoretical upper bounds on the regret with batches of size K for this procedure which show the improvement of the order of $\sqrt{K}$ for fixed iteration cost over purely sequential versions. Moreover, the multiplicative constants involved have the property of being dimension-free. We also confirm empirically the efficiency of GP-UCB-PE on real and synthetic problems compared to state-of-the-art competitors.