
Showing papers on "Function (mathematics) published in 2013"


Posted Content
TL;DR: This work considers a small-scale version of conditional computation, where sparse stochastic units form a distributed representation of gaters that can turn off, in combinatorially many ways, large chunks of the computation performed in the rest of the neural network.
Abstract: Stochastic neurons and hard non-linearities can be useful for a number of reasons in deep learning models, but in many cases they pose a challenging problem: how to estimate the gradient of a loss function with respect to the input of such stochastic or non-smooth neurons? I.e., can we "back-propagate" through these stochastic neurons? We examine this question, existing approaches, and compare four families of solutions, applicable in different settings. One of them is the minimum variance unbiased gradient estimator for stochastic binary neurons (a special case of the REINFORCE algorithm). A second approach, introduced here, decomposes the operation of a binary stochastic neuron into a stochastic binary part and a smooth differentiable part, which approximates the expected effect of the pure stochastic binary neuron to first order. A third approach involves the injection of additive or multiplicative noise in a computational graph that is otherwise differentiable. A fourth approach heuristically copies the gradient with respect to the stochastic output directly as an estimator of the gradient with respect to the sigmoid argument (we call this the straight-through estimator). To explore a context where these estimators are useful, we consider a small-scale version of conditional computation, where sparse stochastic units form a distributed representation of gaters that can turn off, in combinatorially many ways, large chunks of the computation performed in the rest of the neural network. In this case, it is important that the gating units produce an actual 0 most of the time. The resulting sparsity can potentially be exploited to greatly reduce the computational cost of large deep networks for which conditional computation would be useful.
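
A minimal sketch of the straight-through estimator described above, written by us in PyTorch (the class name and toy loss are invented for illustration): the forward pass samples hard binary gates, and the backward pass copies the gradient straight back to the sigmoid argument.

```python
import torch

class StochasticBinary(torch.autograd.Function):
    """Stochastic binary neuron with the straight-through gradient estimator."""

    @staticmethod
    def forward(ctx, logits):
        p = torch.sigmoid(logits)          # firing probability
        return torch.bernoulli(p)          # hard 0/1 sample

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through: copy the gradient w.r.t. the binary output
        # directly back to the sigmoid argument (the logits).
        return grad_output

# Toy usage: sparse stochastic gates over a hidden layer.
h = torch.randn(4, 8, requires_grad=True)
gates = StochasticBinary.apply(h)          # mostly exact zeros and ones
loss = (gates * h).pow(2).mean()
loss.backward()                            # gradients flow "through" the sampling
print(h.grad.shape)
```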

2,178 citations


Journal ArticleDOI
TL;DR: A revised version of GRASP2K supports the earlier non-block and block versions of the codes, as well as a new block version in which the njgraf library module has been replaced by the librang angular package developed by Gaigalas, based on Dirac–Hartree–Fock theory.

494 citations


Journal ArticleDOI
TL;DR: The proposed adaptive fuzzy tracking controller guarantees that all signals in the closed-loop system are bounded in probability and the system output eventually converges to a small neighborhood of the desired reference signal in the sense of mean quartic value.
Abstract: This paper is concerned with the problem of adaptive fuzzy tracking control for a class of pure-feedback stochastic nonlinear systems with input saturation. To overcome the design difficulty arising from the nondifferentiable saturation nonlinearity, a smooth nonlinear function of the control input signal is first introduced to approximate the saturation function; then, an adaptive fuzzy tracking controller based on the mean-value theorem is constructed by using the backstepping technique. The proposed adaptive fuzzy controller guarantees that all signals in the closed-loop system are bounded in probability and that the system output eventually converges to a small neighborhood of the desired reference signal in the sense of mean quartic value. Simulation results further illustrate the effectiveness of the proposed control scheme.
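
The abstract does not specify the smooth approximation it introduces; a tanh surrogate is one common way to smooth a hard saturation, sketched here only as an illustration (the bound u_max and the surrogate form are our assumptions, not the paper's choice).

```python
import numpy as np

# Hypothetical smooth stand-in for a hard input saturation sat(v) = clip(v, -u_max, u_max).
# A tanh-based surrogate is one common choice; the paper's exact function may differ.
u_max = 2.0
v = np.linspace(-6, 6, 13)

sat_hard = np.clip(v, -u_max, u_max)
sat_smooth = u_max * np.tanh(v / u_max)       # differentiable everywhere

print(np.max(np.abs(sat_hard - sat_smooth)))  # bounded approximation error
```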

386 citations


Journal ArticleDOI
TL;DR: It is proposed that Type II models currently offer the best compromise between accuracy and practicability for full-scale engineering applications, and that the recommended approach is uncoupled models.

337 citations


Proceedings Article
17 Jan 2013
TL;DR: It is shown that the auto-encoder captures the score (derivative of the log-density with respect to the input), contradicting previous interpretations of reconstruction error as an energy function.
Abstract: What do auto-encoders learn about the underlying data generating distribution? Recent work suggests that some auto-encoder variants do a good job of capturing the local manifold structure of data. This paper clarifies some of these previous observations by showing that minimizing a particular form of regularized reconstruction error yields a reconstruction function that locally characterizes the shape of the data generating density. We show that the auto-encoder captures the score (derivative of the log-density with respect to the input), which contradicts previous interpretations of reconstruction error as an energy function. Unlike previous results, the theorems provided here are completely generic and do not depend on the parametrization of the auto-encoder: they show what the auto-encoder would tend to if given enough capacity and examples. These results are for a contractive training criterion that we show to be similar to the denoising auto-encoder training criterion with small corruption noise, but with contraction applied to the whole reconstruction function rather than just the encoder. Similarly to score matching, one can consider the proposed training criterion as a convenient alternative to maximum likelihood because it does not involve a partition function. Finally, we show how an approximate Metropolis-Hastings MCMC can be set up to recover samples from the estimated distribution, and this is confirmed in sampling experiments.
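
A quick numerical illustration of the score connection claimed above, under our own Gaussian assumption rather than a trained auto-encoder: for Gaussian data the optimal denoising reconstruction has a closed form, and (r(x) - x)/σ² approaches the true score as the corruption noise shrinks.

```python
import numpy as np

# For x ~ N(0, s^2) corrupted with Gaussian noise of variance sigma^2, the optimal
# denoising reconstruction is the posterior mean r(x) = s^2 / (s^2 + sigma^2) * x.
# The result above says (r(x) - x) / sigma^2 should approximate the score
# d/dx log p(x) = -x / s^2 as sigma -> 0.  Quick numerical check:
s2, sigma2 = 1.0, 1e-3
x = np.linspace(-3, 3, 7)

r = (s2 / (s2 + sigma2)) * x          # optimal reconstruction function
score_estimate = (r - x) / sigma2     # derived from the reconstruction
score_true = -x / s2                  # exact score of N(0, s^2)

print(np.max(np.abs(score_estimate - score_true)))   # small; shrinks with sigma^2
```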

318 citations


Journal ArticleDOI
TL;DR: It is found that when f is locally Lipschitz and semi-algebraic with bounded sublevel sets, the BFGS method with the inexact line search almost always generates sequences whose cluster points are Clarke stationary and with function values converging R-linearly to a Clarke stationary value.
Abstract: We investigate the behavior of quasi-Newton algorithms applied to minimize a nonsmooth function f, not necessarily convex. We introduce an inexact line search that generates a sequence of nested intervals containing a set of points of nonzero measure that satisfy the Armijo and Wolfe conditions if f is absolutely continuous along the line. Furthermore, the line search is guaranteed to terminate if f is semi-algebraic. It seems quite difficult to establish a convergence theorem for quasi-Newton methods applied to such general classes of functions, so we give a careful analysis of a special but illuminating case, the Euclidean norm, in one variable using the inexact line search and in two variables assuming that the line search is exact. In practice, we find that when f is locally Lipschitz and semi-algebraic with bounded sublevel sets, the BFGS (Broyden-Fletcher-Goldfarb-Shanno) method with the inexact line search almost always generates sequences whose cluster points are Clarke stationary and with function values converging R-linearly to a Clarke stationary value. We give references documenting the successful use of BFGS in a variety of nonsmooth applications, particularly the design of low-order controllers for linear dynamical systems. We conclude with a challenging open question.

311 citations


Journal ArticleDOI
TL;DR: A new web application for calculating the dark matter halo mass function (HMF) is presented—the frontend HMFcalc and the engine hmf, designed to be flexible, efficient and easy to use.

272 citations


Posted Content
TL;DR: This work considers the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk.
Abstract: We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk. We focus on problems without strong convexity, for which all previously known algorithms achieve a convergence rate for function values of O(1/n^{1/2}). We consider and analyze two algorithms that achieve a rate of O(1/n) for classical supervised learning problems. For least-squares regression, we show that averaged stochastic gradient descent with constant step-size achieves the desired rate. For logistic regression, this is achieved by a simple novel stochastic gradient algorithm that (a) constructs successive local quadratic approximations of the loss functions, while (b) preserving the same running time complexity as stochastic gradient descent. For these algorithms, we provide a non-asymptotic analysis of the generalization error (in expectation, and also in high probability for least-squares), and run extensive experiments on standard machine learning benchmarks showing that they often outperform existing approaches.
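
A bare-bones sketch of the least-squares case described above: averaged stochastic gradient descent with a constant step size on synthetic data. The problem sizes and the step-size heuristic are ours, not the constants analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem: y = <w_true, x> + noise.
n, d = 10_000, 20
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)

# Averaged SGD with a constant step size (Polyak-Ruppert averaging),
# the scheme the abstract credits with the O(1/n) rate for least squares.
w = np.zeros(d)
w_avg = np.zeros(d)
step = 1.0 / (4 * np.mean(np.sum(X**2, axis=1)))   # heuristic constant step

for t in range(n):
    i = rng.integers(n)
    grad = (X[i] @ w - y[i]) * X[i]                # unbiased gradient of 0.5*(x.w - y)^2
    w -= step * grad
    w_avg += (w - w_avg) / (t + 1)                 # running average of the iterates

print(np.linalg.norm(w_avg - w_true))              # small estimation error
```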

266 citations


Proceedings Article
05 Dec 2013
TL;DR: In this article, the authors consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk.
Abstract: We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk. We focus on problems without strong convexity, for which all previously known algorithms achieve a convergence rate for function values of O(1/√n) after n iterations. We consider and analyze two algorithms that achieve a rate of O(1/n) for classical supervised learning problems. For least-squares regression, we show that averaged stochastic gradient descent with constant step-size achieves the desired rate. For logistic regression, this is achieved by a simple novel stochastic gradient algorithm that (a) constructs successive local quadratic approximations of the loss functions, while (b) preserving the same running-time complexity as stochastic gradient descent. For these algorithms, we provide a non-asymptotic analysis of the generalization error (in expectation, and also in high probability for least-squares), and run extensive experiments showing that they often outperform existing approaches.

241 citations


Posted Content
13 Feb 2013
TL;DR: In this paper, the tradeoff between privacy guarantees and the utility of the resulting statistical estimators was studied under local privacy constraints, and lower and upper bounds on mutual information and Kullback-Leibler divergence were established.
Abstract: Working under a model of privacy in which data remains private even from the statistician, we study the tradeoff between privacy guarantees and the utility of the resulting statistical estimators. We prove bounds on information-theoretic quantities, including mutual information and Kullback-Leibler divergence, that influence estimation rates as a function of the amount of privacy preserved. When combined with standard minimax techniques such as Le Cam's and Fano's methods, these inequalities allow for a precise characterization of statistical rates under local privacy constraints. In this paper, we provide a complete treatment of three canonical problem families: mean estimation in location family models, parameter estimation in fixed-design regression, and convex risk minimization. For all of these families, we provide lower and upper bounds that match up to constant factors, giving privacy-preserving mechanisms and computationally efficient estimators that achieve the bounds.
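
To make the local-privacy setting concrete, here is a small sketch (ours, not the paper's minimax-optimal mechanism): each data holder releases only a Laplace-perturbed value, and the statistician averages the noisy releases.

```python
import numpy as np

rng = np.random.default_rng(1)

# Local privacy: each data holder perturbs its own value before release, so even the
# statistician never sees raw data.  A clipped Laplace mechanism for mean estimation,
# as an illustration of the setting only.
epsilon = 1.0          # local differential privacy parameter
B = 1.0                # data assumed to lie in [-B, B]
x = rng.uniform(-1, 1, size=5_000)

sensitivity = 2 * B
z = np.clip(x, -B, B) + rng.laplace(scale=sensitivity / epsilon, size=x.size)

print(x.mean(), z.mean())   # utility degrades as epsilon shrinks
```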

223 citations


Journal ArticleDOI
TL;DR: This work considers a class of image denoising models incorporating $\ell_p$-norm-based analysis priors using a fixed set of linear operators and devises semismooth Newton methods for solving the resulting nonsmooth bilevel optimization problems.
Abstract: In this work we consider the problem of parameter learning for variational image denoising models. The learning problem is formulated as a bilevel optimization problem, where the lower-level problem is given by the variational model and the higher-level problem is expressed by means of a loss function that penalizes errors between the solution of the lower-level problem and the ground truth data. We consider a class of image denoising models incorporating $\ell_p$-norm-based analysis priors using a fixed set of linear operators. We devise semismooth Newton methods for solving the resulting nonsmooth bilevel optimization problems and show that the optimized image denoising models can achieve state-of-the-art performance.

Journal ArticleDOI
TL;DR: This work provides a criterion on m that describes the needed amount of regularization to ensure that the least squares method is stable and that its accuracy, measured in L2(X,ρX), is comparable to the best approximation error of f by elements from Vm.
Abstract: We consider the problem of reconstructing an unknown function f on a domain X from samples of f at n randomly chosen points with respect to a given measure ρX. Given a sequence of linear spaces (Vm)m>0 with dim(Vm)=m≤n, we study the least squares approximations from the spaces Vm. It is well known that such approximations can be inaccurate when m is too close to n, even when the samples are noiseless. Our main result provides a criterion on m that describes the needed amount of regularization to ensure that the least squares method is stable and that its accuracy, measured in L2(X,ρX), is comparable to the best approximation error of f by elements from Vm. We illustrate this criterion for various approximation schemes, such as trigonometric polynomials, with ρX being the uniform measure, and algebraic polynomials, with ρX being either the uniform or Chebyshev measure. For such examples we also prove similar stability results using deterministic samples that are equispaced with respect to these measures.
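
A small illustration of the setting with a trigonometric basis and the uniform measure; the sample size, the number of frequencies, and the "keep m well below n" rule of thumb are only indicative of the paper's criterion, not a statement of it.

```python
import numpy as np

rng = np.random.default_rng(2)

# Recover a smooth f on [0, 1] from n random (uniform) samples by least squares in a
# trigonometric space of dimension m = 2K + 1.  The criterion in the paper roughly
# requires m of order n / log n for this case; here m is simply taken far below n.
f = lambda t: np.exp(np.sin(2 * np.pi * t))
n, K = 2_000, 20                      # m = 2K + 1 = 41 basis functions, far below n
t = rng.uniform(0, 1, size=n)

def design(ts):
    freqs = np.arange(1, K + 1)
    return np.hstack([np.ones((len(ts), 1)),
                      np.cos(2 * np.pi * np.outer(ts, freqs)),
                      np.sin(2 * np.pi * np.outer(ts, freqs))])

coef, *_ = np.linalg.lstsq(design(t), f(t), rcond=None)

t_test = np.linspace(0, 1, 500)
print(np.max(np.abs(design(t_test) @ coef - f(t_test))))   # small uniform error
```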

Journal ArticleDOI
TL;DR: The three-loop remainder function as discussed by the authors describes the scattering of six gluons in the maximally-helicity-violating configuration in planar N = 4 super-Yang-Mills theory, as a function of the three dual conformal cross ratios.
Abstract: We present the three-loop remainder function, which describes the scattering of six gluons in the maximally-helicity-violating configuration in planar N = 4 super-Yang-Mills theory, as a function of the three dual conformal cross ratios. The result can be expressed in terms of multiple Goncharov polylogarithms. We also employ a more restricted class of hexagon functions which have the correct branch cuts and certain other restrictions on their symbols. We classify all the hexagon functions through transcendental weight five, using the coproduct for their Hopf algebra iteratively, which amounts to a set of first-order differential equations. The three-loop remainder function is a particular weight-six hexagon function, whose symbol was determined previously. The differential equations can be integrated numerically for generic values of the cross ratios, or analytically in certain kinematic limits, including the near-collinear and multi-Regge limits. These limits allow us to impose constraints from the operator product expansion and multi-Regge factorization directly at the function level, and thereby to fix uniquely a set of Riemann ζ-valued constants that could not be fixed at the level of the symbol. The near-collinear limits agree precisely with recent predictions by Basso, Sever and Vieira based on integrability. The multi-Regge limits agree with the factorization formula of Fadin and Lipatov, and determine three constants entering the impact factor at this order. We plot the three-loop remainder function for various slices of the Euclidean region of positive cross ratios, and compare it to the two-loop one. For large ranges of the cross ratios, the ratio of the three-loop to the two-loop remainder function is relatively constant, and close to −7.

Journal ArticleDOI
TL;DR: A convergence analysis of accelerated forward-backward splitting methods for composite function minimization, when the proximity operator is not available in closed form, and can only be computed up to a certain precision is proposed.
Abstract: We propose a convergence analysis of accelerated forward-backward splitting methods for composite function minimization, when the proximity operator is not available in closed form, and can only be computed up to a certain precision. We prove that the $1/k^2$ convergence rate for the function values can be achieved if the admissible errors are of a certain type and satisfy a sufficiently fast decay condition. Our analysis is based on the machinery of estimate sequences first introduced by Nesterov for the study of accelerated gradient descent algorithms. Furthermore, we give a global complexity analysis, taking into account the cost of computing admissible approximations of the proximal point. An experimental analysis is also presented.
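
For context, a bare-bones accelerated forward-backward (FISTA-style) loop for an ℓ1-regularized least-squares problem, using an exact soft-thresholding prox; the paper's analysis concerns the harder case where this prox step is computed only approximately, which the sketch does not model.

```python
import numpy as np

# Accelerated forward-backward sketch for min_x 0.5*||Ax - b||^2 + lam*||x||_1,
# with an exact l1 prox (soft-thresholding).
rng = np.random.default_rng(3)
A = rng.normal(size=(200, 500))
b = rng.normal(size=200)
lam = 0.1
L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part's gradient

def prox_l1(v, thr):
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

x = np.zeros(500)
y_k = x.copy()
t = 1.0
for _ in range(200):
    grad = A.T @ (A @ y_k - b)
    x_next = prox_l1(y_k - grad / L, lam / L)
    t_next = (1 + np.sqrt(1 + 4 * t**2)) / 2
    y_k = x_next + ((t - 1) / t_next) * (x_next - x)   # Nesterov extrapolation
    x, t = x_next, t_next

print(0.5 * np.linalg.norm(A @ x - b)**2 + lam * np.abs(x).sum())
```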

Journal ArticleDOI
TL;DR: Two generalized corollaries to the LaSalle-Yoshizawa Theorem are presented for nonautonomous systems described by nonlinear differential equations with discontinuous right-hand sides.
Abstract: In this technical note, two generalized corollaries to the LaSalle-Yoshizawa Theorem are presented for nonautonomous systems described by nonlinear differential equations with discontinuous right-hand sides. Lyapunov-based analysis methods that achieve asymptotic convergence when the candidate Lyapunov derivative is upper bounded by a negative semi-definite function in the presence of differential inclusions are presented. A design example illustrates the utility of the corollaries.

Book ChapterDOI
03 Mar 2013
TL;DR: Signatures of Correct Computation is introduced, a new model for verifying dynamic computations in cloud settings and it is shown that signatures of correct computation imply Publicly Verifiable Computation (PVC), a model recently introduced in several concurrent and independent works.
Abstract: We introduce Signatures of Correct Computation (SCC), a new model for verifying dynamic computations in cloud settings. In the SCC model, a trusted source outsources a function f to an untrusted server, along with a public key for that function (to be used during verification). The server can then produce a succinct signature σ vouching for the correctness of the computation of f, i.e., that some result v is indeed the correct outcome of the function f evaluated on some point a. There are two crucial performance properties that we want to guarantee in an SCC construction: (1) verifying the signature should take asymptotically less time than evaluating the function f; and (2) the public key should be efficiently updated whenever the function changes. We construct SCC schemes (satisfying the above two properties) supporting expressive manipulations over multivariate polynomials, such as polynomial evaluation and differentiation. Our constructions are adaptively secure in the random oracle model and achieve optimal updates, i.e., the function's public key can be updated in time proportional to the number of updated coefficients, without performing a linear-time computation (in the size of the polynomial). We also show that signatures of correct computation imply Publicly Verifiable Computation (PVC), a model recently introduced in several concurrent and independent works. Roughly speaking, in the SCC model, any client can verify the signature σ and be convinced of some computation result, whereas in the PVC model only the client that issued a query (or anyone who trusts this client) can verify that the server returned a valid signature (proof) for the answer to the query. Our techniques can be readily adapted to construct PVC schemes with adaptive security, efficient updates and without the random oracle model.

Journal ArticleDOI
TL;DR: It is shown that for certain classes of admissible inputs, the existence of an ISS-Lyapunov function implies the ISS of a system, and a linearization principle is proved that allows the construction of a local ISS-Lyapunov function for a system.
Abstract: We develop tools for investigation of input-to-state stability (ISS) of infinite-dimensional control systems. We show that for certain classes of admissible inputs, the existence of an ISS-Lyapunov function implies the ISS of a system. Then for the case of systems described by abstract equations in Banach spaces, we develop two methods of construction of local and global ISS-Lyapunov functions. We prove a linearization principle that allows the construction of a local ISS-Lyapunov function for a system whose linear approximation is ISS. In order to study the interconnections of nonlinear infinite-dimensional systems, we generalize the small-gain theorem to the case of infinite-dimensional systems and provide a way to construct an ISS-Lyapunov function for an entire interconnection, if ISS-Lyapunov functions for subsystems are known and the small-gain condition is satisfied. We illustrate the theory on examples of linear and semilinear reaction-diffusion equations.

Journal ArticleDOI
TL;DR: This study provides a solution to the problem of determining the values of the loss function in decision-theoretic rough sets (DTRS) and extends its range of applications through the use of particle swarm optimization.

Journal ArticleDOI
TL;DR: The numerical experiments show that SO-MI achieves significantly better results than the other algorithms when the number of function evaluations is very restricted (200-300 evaluations), and that the algorithm converges to the global optimum almost surely.

Journal ArticleDOI
TL;DR: In this paper, a Bayesian approach is adopted to the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map applied to u. The prior measure is specified as a Gaussian random field μ0.
Abstract: We consider the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map $\mathcal {G}$ applied to u. We adopt a Bayesian approach to the problem and work in a setting where the prior measure is specified as a Gaussian random field μ0. We work under a natural set of conditions on the likelihood which implies the existence of a well-posed posterior measure, μy. Under these conditions, we show that the maximum a posteriori (MAP) estimator is well defined as the minimizer of an Onsager–Machlup functional defined on the Cameron–Martin space of the prior; thus, we link a problem in probability with a problem in the calculus of variations. We then consider the case where the observational noise vanishes and establish a form of Bayesian posterior consistency for the MAP estimator. We also prove a similar result for the case where the observation of $\mathcal {G}(u)$ can be repeated as many times as desired with independent identically distributed noise. The theory is illustrated with examples from an inverse problem for the Navier–Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics.

Journal ArticleDOI
TL;DR: In this article, a normal approximation for the functional sample mean is developed and used to asymptotically justify testing procedures for the equality of means in two functional samples exhibiting temporal dependence.
Abstract: The paper is concerned with inference based on the mean function of a functional time series. We develop a normal approximation for the functional sample mean and then focus on the estimation of the asymptotic variance kernel. Using these results, we develop and asymptotically justify testing procedures for the equality of means in two functional samples exhibiting temporal dependence. Evaluated by means of a simulation study and application to a real data set, these two-sample procedures enjoy good size and power in finite samples.

Book ChapterDOI
TL;DR: The Gaussian Process Upper Confidence Bound and Pure exploration algorithm (GP-UCB-PE) is introduced which combines the UCB strategy and Pure Exploration in the same batch of evaluations along the parallel iterations and proves theoretical upper bounds on the regret with batches of size K for this procedure.
Abstract: In this paper, we consider the challenge of maximizing an unknown function f for which evaluations are noisy and are acquired with high cost. An iterative procedure uses the previous measures to actively select the next estimation of f which is predicted to be the most useful. We focus on the case where the function can be evaluated in parallel with batches of fixed size and analyze the benefit compared to the purely sequential procedure in terms of cumulative regret. We introduce the Gaussian Process Upper Confidence Bound and Pure Exploration algorithm (GP-UCB-PE) which combines the UCB strategy and Pure Exploration in the same batch of evaluations along the parallel iterations. We prove theoretical upper bounds on the regret with batches of size K for this procedure which show the improvement of the order of $\sqrt{K}$ for fixed iteration cost over purely sequential versions. Moreover, the multiplicative constants involved have the property of being dimension-free. We also confirm empirically the efficiency of GP-UCB-PE on real and synthetic problems compared to state-of-the-art competitors.
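
A rough one-batch sketch of the GP-UCB-PE idea on a 1-D grid (ours, using scikit-learn's GP): the first point maximizes the UCB, and the remaining points are pure-exploration picks chosen by posterior uncertainty after fantasizing the GP mean at earlier picks, a common heuristic (the GP posterior variance does not depend on the fantasized values).

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(4)
f = lambda x: np.sin(3 * x) + 0.5 * np.cos(7 * x)     # toy objective
grid = np.linspace(0, 2, 200).reshape(-1, 1)

X = rng.uniform(0, 2, size=(5, 1))                    # initial noisy evaluations
y = f(X).ravel() + 0.05 * rng.normal(size=5)

K, beta = 4, 2.0                                      # batch size, UCB exploration weight
batch = []
X_fant, y_fant = X.copy(), y.copy()
for j in range(K):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3),
                                  alpha=0.05**2).fit(X_fant, y_fant)
    mu, sd = gp.predict(grid, return_std=True)
    score = mu + np.sqrt(beta) * sd if j == 0 else sd  # UCB first, then pure exploration
    x_new = grid[np.argmax(score)]
    batch.append(float(x_new[0]))
    # Fantasize the GP mean as the outcome so later picks spread out.
    X_fant = np.vstack([X_fant, x_new.reshape(1, 1)])
    y_fant = np.append(y_fant, gp.predict(x_new.reshape(1, -1))[0])

print(batch)   # K points to evaluate in parallel
```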

Journal ArticleDOI
TL;DR: This article considers the problem of constructing nonparametric tolerance/prediction sets by starting from the general conformal prediction approach, and uses a kernel density estimator as a measure of agreement between a sample point and the underlying distribution.
Abstract: This article introduces a new approach to prediction by bringing together two different nonparametric ideas: distribution-free inference and nonparametric smoothing. Specifically, we consider the problem of constructing nonparametric tolerance/prediction sets. We start from the general conformal prediction approach, and we use a kernel density estimator as a measure of agreement between a sample point and the underlying distribution. The resulting prediction set is shown to be closely related to plug-in density level sets with carefully chosen cutoff values. Under standard smoothness conditions, we get an asymptotic efficiency result that is near optimal for a wide range of function classes. But the coverage is guaranteed whether or not the smoothness conditions hold and regardless of the sample size. The performance of our method is investigated through simulation studies and illustrated in a real data example.
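
A simplified, split-conformal version of the construction (the paper uses full conformal prediction): a kernel density estimate supplies the conformity score, and the prediction set is a plug-in density level set cut at an empirical quantile.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(5)

# Split version of the idea: fit a KDE on one half of the data, calibrate the density
# cutoff on the other half, and report a plug-in density level set.
data = rng.normal(0, 1, size=1_000)
train, calib = data[:500], data[500:]

kde = gaussian_kde(train)
scores = kde(calib)                                  # density at calibration points
alpha = 0.1
cutoff = np.quantile(scores, alpha)                  # ~alpha of calibration mass below

grid = np.linspace(-4, 4, 801)
prediction_set = grid[kde(grid) >= cutoff]           # plug-in density level set
print(prediction_set.min(), prediction_set.max())    # roughly the central 90% interval
```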

Proceedings ArticleDOI
06 May 2013
TL;DR: An approach to learning objective functions for robotic manipulation based on inverse reinforcement learning that can deal with high-dimensional continuous state-action spaces, and only requires local optimality of demonstrated trajectories is presented.
Abstract: We present an approach to learning objective functions for robotic manipulation based on inverse reinforcement learning. Our path integral inverse reinforcement learning algorithm can deal with high-dimensional continuous state-action spaces, and only requires local optimality of demonstrated trajectories. We use L1 regularization in order to achieve feature selection, and propose an efficient algorithm to minimize the resulting convex objective function. We demonstrate our approach by applying it to two core problems in robotic manipulation. First, we learn a cost function for redundancy resolution in inverse kinematics. Second, we use our method to learn a cost function over trajectories, which is then used in optimization-based motion planning for grasping and manipulation tasks. Experimental results show that our method outperforms previous algorithms in high-dimensional settings.

Book ChapterDOI
01 Jan 2013
TL;DR: In this paper, when the short-term interest rate is considered as a random variable, there is an unknown function λ(r, t), called the market price of risk, in the governing equation.
Abstract: As pointed out in Sect. 2.3, when the short-term interest rate is considered as a random variable, there is an unknown function λ(r, t), called the market price of risk, in the governing equation.

Proceedings ArticleDOI
03 Aug 2013
TL;DR: This work proposes LSE, an algorithm that guides both sampling and classification based on GP-derived confidence bounds, and extends LSE and its theory to two more natural settings: (1) where the threshold level is implicitly defined as a percentage of the (unknown) maximum of the target function and (2) where samples are selected in batches.
Abstract: Many information gathering problems require determining the set of points, for which an unknown function takes value above or below some given threshold level. We formalize this task as a classification problem with sequential measurements, where the unknown function is modeled as a sample from a Gaussian process (GP). We propose LSE, an algorithm that guides both sampling and classification based on GP-derived confidence bounds, and provide theoretical guarantees about its sample complexity. Furthermore, we extend LSE and its theory to two more natural settings: (1) where the threshold level is implicitly defined as a percentage of the (unknown) maximum of the target function and (2) where samples are selected in batches. We evaluate the effectiveness of our proposed methods on two problems of practical interest, namely autonomous monitoring of algal populations in a lake environment and geolocating network latency.
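
The core classification rule of an LSE-style step, sketched with the GP posterior mean and standard deviation assumed as given; the function name, ambiguity rule details, and parameter values are our simplification of the algorithm.

```python
import numpy as np

# Given GP posterior mean `mu` and standard deviation `sd` on a candidate grid
# (how mu/sd are obtained is omitted), classify points against threshold h and
# pick the most ambiguous still-unclassified point as the next measurement.
def lse_step(mu, sd, h, beta=3.0, eps=0.05):
    lower = mu - np.sqrt(beta) * sd
    upper = mu + np.sqrt(beta) * sd
    above = lower > h - eps             # confidently above the level
    below = upper < h + eps             # confidently below the level
    unknown = ~(above | below)
    # Ambiguity: how far the confidence interval sticks out past the threshold.
    ambiguity = np.minimum(upper - h, h - lower)
    ambiguity[~unknown] = -np.inf       # only unclassified points are candidates
    return above, below, int(np.argmax(ambiguity))

mu = np.array([0.2, 0.9, 1.1, 1.6])
sd = np.array([0.3, 0.4, 0.1, 0.2])
print(lse_step(mu, sd, h=1.0))
```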

Proceedings ArticleDOI
27 Oct 2013
TL;DR: In this article, a static greedy algorithm, named StaticGreedy, is proposed to strictly guarantee the submodularity of influence spread function during the seed selection process, which makes the computational expense dramatically reduced by two orders of magnitude without loss of accuracy.
Abstract: Influence maximization, defined as a problem of finding a set of seed nodes to trigger a maximized spread of influence, is crucial to viral marketing on social networks. For practical viral marketing on large scale social networks, it is required that influence maximization algorithms should have both guaranteed accuracy and high scalability. However, existing algorithms suffer a scalability-accuracy dilemma: conventional greedy algorithms guarantee the accuracy with expensive computation, while the scalable heuristic algorithms suffer from unstable accuracy. In this paper, we focus on solving this scalability-accuracy dilemma. We point out that the essential reason for the dilemma is the surprising fact that the submodularity, a key requirement of the objective function for a greedy algorithm to approximate the optimum, is not guaranteed in all conventional greedy algorithms in the literature of influence maximization. Therefore a greedy algorithm has to afford a huge number of Monte Carlo simulations to reduce the pain caused by unguaranteed submodularity. Motivated by this critical finding, we propose a static greedy algorithm, named StaticGreedy, to strictly guarantee the submodularity of the influence spread function during the seed selection process. The proposed algorithm makes the computational expense dramatically reduced by two orders of magnitude without loss of accuracy. Moreover, we propose a dynamical update strategy which can speed up the StaticGreedy algorithm by 2-7 times on large scale social networks.
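
A miniature sketch of the static idea: draw the live-edge snapshots once, then let every greedy marginal-gain evaluation reuse those same fixed snapshots. The toy graph, edge probabilities, and parameter values are invented for illustration.

```python
import random
from collections import defaultdict

random.seed(0)

# (node u, node v, activation probability) under the independent cascade model.
edges = [(0, 1, 0.4), (1, 2, 0.3), (0, 3, 0.2), (3, 4, 0.5), (2, 4, 0.3)]
nodes = {u for e in edges for u in e[:2]}
R, k = 200, 2                          # number of snapshots, number of seeds

def reachable(seeds, live_adj):
    seen, stack = set(seeds), list(seeds)
    while stack:
        u = stack.pop()
        for v in live_adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

# 1. Static phase: fix the randomness up front by sampling live-edge snapshots once.
snapshots = []
for _ in range(R):
    adj = defaultdict(list)
    for u, v, p in edges:
        if random.random() < p:        # edge is "live" in this snapshot
            adj[u].append(v)
    snapshots.append(adj)

# 2. Greedy phase: marginal gains reuse the same snapshots every round,
#    so the estimated spread stays submodular across iterations.
seeds = set()
for _ in range(k):
    gains = {u: sum(len(reachable(seeds | {u}, s)) for s in snapshots)
             for u in nodes - seeds}
    seeds.add(max(gains, key=gains.get))

print(seeds)
```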

Journal ArticleDOI
TL;DR: In this paper, the Taylor-Maclaurin coefficients of the function f are investigated when f is in the following subclasses: SΣ(λ,γ;ϕ), HSΣ(α), RΣ(η,γ;ϕ) and BΣ(μ;ϕ).
Abstract: In this paper, we introduce and investigate each of the following subclasses: SΣ(λ,γ;ϕ), HSΣ(α), RΣ(η,γ;ϕ) and BΣ(μ;ϕ) (0 ≤ λ ≤ 1; γ ∈ C∖{0}; α ∈ C; 0 ≤ η < 1; μ ≥ 0), and ϕ(D) is symmetric with respect to the real axis. We obtain coefficient bounds involving the Taylor-Maclaurin coefficients |a2| and |a3| of the function f when f is in these classes. The various results, which are presented in this paper, would generalize and improve those in related works of several earlier authors.

Journal ArticleDOI
23 Oct 2013-PLOS ONE
TL;DR: This work introduces diffusion state distance (DSD), a new metric based on a graph diffusion property, designed to capture finer-grained distinctions in proximity for transfer of functional annotation in PPI networks.
Abstract: In protein-protein interaction (PPI) networks, functional similarity is often inferred based on the function of directly interacting proteins, or more generally, some notion of interaction network proximity among proteins in a local neighborhood. Prior methods typically measure proximity as the shortest-path distance in the network, but this has only a limited ability to capture fine-grained neighborhood distinctions, because most proteins are close to each other, and there are many ties in proximity. We introduce diffusion state distance (DSD), a new metric based on a graph diffusion property, designed to capture finer-grained distinctions in proximity for transfer of functional annotation in PPI networks. We present a tool that, when input a PPI network, will output the DSD distances between every pair of proteins. We show that replacing the shortest-path metric by DSD improves the performance of classical function prediction methods across the board.
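
A toy computation of DSD following the definition above, with the random walk truncated at a finite number of steps (the paper takes the limit); the graph and the truncation length are ours.

```python
import numpy as np

# Diffusion state distance (DSD) on a toy graph: He^k(u) collects the expected visit
# counts of a k-step random walk started at u, and DSD(u, v) = || He^k(u) - He^k(v) ||_1.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)     # adjacency of a small undirected graph
P = A / A.sum(axis=1, keepdims=True)          # one-step random-walk transition matrix

k = 50
He = np.zeros_like(P)
step = np.eye(len(A))
for _ in range(k + 1):
    He += step                                # accumulate expected visit counts
    step = step @ P

def dsd(u, v):
    return np.abs(He[u] - He[v]).sum()

print(dsd(0, 1), dsd(0, 3))                   # pairwise DSD values on the toy graph
```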

Book ChapterDOI
23 Sep 2013
TL;DR: The Gaussian Process Upper Confidence Bound and Pure Exploration (GP-UCB-PE) algorithm as discussed by the authors combines the UCB strategy and pure exploration in the same batch of evaluations along the parallel iterations.
Abstract: In this paper, we consider the challenge of maximizing an unknown function f for which evaluations are noisy and are acquired with high cost. An iterative procedure uses the previous measures to actively select the next estimation of f which is predicted to be the most useful. We focus on the case where the function can be evaluated in parallel with batches of fixed size and analyze the benefit compared to the purely sequential procedure in terms of cumulative regret. We introduce the Gaussian Process Upper Confidence Bound and Pure Exploration algorithm (GP-UCB-PE) which combines the UCB strategy and Pure Exploration in the same batch of evaluations along the parallel iterations. We prove theoretical upper bounds on the regret with batches of size K for this procedure which show the improvement of the order of $\sqrt{K}$ for fixed iteration cost over purely sequential versions. Moreover, the multiplicative constants involved have the property of being dimension-free. We also confirm empirically the efficiency of GP-UCB-PE on real and synthetic problems compared to state-of-the-art competitors.