
Showing papers on "Convergence (routing)" published in 2011


Book ChapterDOI
01 Jan 2011
TL;DR: Weak-convergence methods in metric spaces were studied in this article, with applications sufficient to show their power and utility; the results of the first three chapters are used in Chapter 4 to derive a variety of limit theorems for dependent sequences of random variables.
Abstract: The author's preface gives an outline: "This book is about weak-convergence methods in metric spaces, with applications sufficient to show their power and utility. The Introduction motivates the definitions and indicates how the theory will yield solutions to problems arising outside it. Chapter 1 sets out the basic general theorems, which are then specialized in Chapter 2 to the space C[0, 1] of continuous functions on the unit interval and in Chapter 3 to the space D[0, 1] of functions with discontinuities of the first kind. The results of the first three chapters are used in Chapter 4 to derive a variety of limit theorems for dependent sequences of random variables." The book develops and expands on Donsker's 1951 and 1952 papers on the invariance principle and empirical distributions. The basic random variables remain real-valued although, of course, measures on C[0, 1] and D[0, 1] are vitally used. Within this framework, there are various possibilities for a different and apparently better treatment of the material. More of the general theory of weak convergence of probabilities on separable metric spaces would be useful. Metrizability of the convergence is not brought up until late in the Appendix. The close relation of the Prokhorov metric and a metric for convergence in probability is (hence) not mentioned (see V. Strassen, Ann. Math. Statist. 36 (1965), 423-439; the reviewer, ibid. 39 (1968), 1563-1572). This relation would illuminate and organize such results as Theorems 4.1, 4.2 and 4.4, which give isolated, ad hoc connections between weak convergence of measures and nearness in probability. In the middle of p. 16, it should be noted that C*(S) consists of signed measures which need only be finitely additive if S is not compact. On p. 239, where the author twice speaks of separable subsets having nonmeasurable cardinal, he means "discrete" rather than "separable." Theorem 1.4 is Ulam's theorem that a Borel probability on a complete separable metric space is tight. Theorem 1 of Appendix 3 weakens completeness to topological completeness. After mentioning that probabilities on the rationals are tight, the author says it is an…
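For reference, the central notion and the metric the review mentions, in standard notation (ours, not the book's):

```latex
% Weak convergence of probability measures P_n on a metric space (S, d):
\[
P_n \Rightarrow P
\quad\Longleftrightarrow\quad
\int_S f \, dP_n \longrightarrow \int_S f \, dP
\quad \text{for every bounded continuous } f : S \to \mathbb{R}.
\]
% The Prokhorov metric, which metrizes this convergence on separable S:
\[
\pi(P, Q) = \inf \bigl\{ \varepsilon > 0 : P(A) \le Q(A^{\varepsilon}) + \varepsilon
\ \text{for all Borel } A \subseteq S \bigr\},
\qquad
A^{\varepsilon} = \{ x \in S : d(x, A) < \varepsilon \}.
\]
```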

3,554 citations


Proceedings Article
12 Dec 2011
TL;DR: This work provides a non-asymptotic analysis of the convergence of two well-known algorithms, stochastic gradient descent and a simple modification in which iterates are averaged, suggesting that a learning rate proportional to the inverse of the number of iterations, while leading to the optimal convergence rate in the strongly convex case, is not robust to the lack of strong convexity or to the setting of the proportionality constant.
Abstract: We consider the minimization of a convex objective function defined on a Hilbert space, which is only available through unbiased estimates of its gradients. This problem includes standard machine learning algorithms such as kernel logistic regression and least-squares regression, and is commonly referred to as a stochastic approximation problem in the operations research community. We provide a non-asymptotic analysis of the convergence of two well-known algorithms, stochastic gradient descent (a.k.a. Robbins-Monro algorithm) as well as a simple modification where iterates are averaged (a.k.a. Polyak-Ruppert averaging). Our analysis suggests that a learning rate proportional to the inverse of the number of iterations, while leading to the optimal convergence rate in the strongly convex case, is not robust to the lack of strong convexity or the setting of the proportionality constant. This situation is remedied when using slower decays together with averaging, robustly leading to the optimal rate of convergence. We illustrate our theoretical results with simulations on synthetic and standard datasets.
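To make the trade-off concrete, here is a minimal sketch (our own toy least-squares instance and step-size constants, not the authors' experiments) contrasting a 1/n step size with a slower decay plus Polyak-Ruppert averaging:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_iter = 5, 10000
w_star = rng.normal(size=d)

def sgd(step_fn, average=False):
    """Run SGD on E[(x.w - y)^2] with unbiased gradient estimates."""
    w = np.zeros(d)
    w_bar = np.zeros(d)
    for n in range(1, n_iter + 1):
        x = rng.normal(size=d)
        y = x @ w_star + 0.1 * rng.normal()
        grad = 2 * (x @ w - y) * x          # unbiased stochastic gradient
        w -= step_fn(n) * grad
        w_bar += (w - w_bar) / n            # running Polyak-Ruppert average
    return w_bar if average else w

err = lambda w: np.linalg.norm(w - w_star)
print("1/n step, last iterate:    ", err(sgd(lambda n: 1.0 / n)))
print("1/sqrt(n) step + averaging:", err(sgd(lambda n: 0.1 / np.sqrt(n), average=True)))
```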

726 citations



Journal ArticleDOI
TL;DR: In this article, the choice of Gaussian-process prior determines an associated space of functions, its reproducing-kernel Hilbert space (RKHS); when the prior is fixed, expected improvement is known to converge on the minimum of any function in its RKHS.
Abstract: In the efficient global optimization problem, we minimize an unknown function f, using as few observations f(x) as possible. It can be considered a continuum-armed-bandit problem, with noiseless data, and simple regret. Expected-improvement algorithms are perhaps the most popular methods for solving the problem; in this paper, we provide theoretical results on their asymptotic behaviour. Implementing these algorithms requires a choice of Gaussian-process prior, which determines an associated space of functions, its reproducing-kernel Hilbert space (RKHS). When the prior is fixed, expected improvement is known to converge on the minimum of any function in its RKHS. We provide convergence rates for this procedure, optimal for functions of low smoothness, and describe a modified algorithm attaining optimal rates for smoother functions. In practice, however, priors are typically estimated sequentially from the data. For standard estimators, we show this procedure may never find the minimum of f. We then propose alternative estimators, chosen to minimize the constants in the rate of convergence, and show these estimators retain the convergence rates of a fixed prior.
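A minimal sketch of the expected-improvement rule under a fixed Gaussian-process prior (the RBF kernel, grid, and objective are our own illustrative choices, not the paper's setup):

```python
import numpy as np
from scipy.stats import norm

def rbf(a, b, ell=0.3):
    return np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

def gp_posterior(x_obs, y_obs, x_grid, noise=1e-8):
    """Posterior mean/sd of a zero-mean GP with RBF kernel at x_grid."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_grid, x_obs)
    mu = Ks @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.einsum('ij,ij->i', Ks, np.linalg.solve(K, Ks.T).T)
    return mu, np.sqrt(np.maximum(var, 0.0))

def expected_improvement(mu, sd, f_best):
    """EI for minimization: E[max(f_best - F, 0)] with F ~ N(mu, sd^2)."""
    sd = np.maximum(sd, 1e-12)
    z = (f_best - mu) / sd
    return (f_best - mu) * norm.cdf(z) + sd * norm.pdf(z)

f = lambda x: np.sin(3 * x) + x**2            # "unknown" objective (illustrative)
x_obs = np.array([-1.0, 0.0, 1.0])
y_obs = f(x_obs)
x_grid = np.linspace(-2, 2, 401)
mu, sd = gp_posterior(x_obs, y_obs, x_grid)
ei = expected_improvement(mu, sd, y_obs.min())
print("next evaluation point:", x_grid[np.argmax(ei)])
```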

413 citations


Journal ArticleDOI
TL;DR: Newton recursive and Newton iterative identification algorithms are derived using the Newton method (Newton-Raphson method), in order to reduce the sensitivity of the projection algorithm to noise and to improve the convergence rate of the SG algorithm.

312 citations


Journal ArticleDOI
TL;DR: Rate-of-convergence analysis shows that by controlling the sample size in an incremental gradient algorithm, it is possible to maintain the steady convergence rates of full-gradient methods.
Abstract: Many structured data-fitting applications require the solution of an optimization problem involving a sum over a potentially large number of measurements. Incremental gradient algorithms offer inexpensive iterations by sampling a subset of the terms in the sum. These methods can make great progress initially, but often slow as they approach a solution. In contrast, full-gradient methods achieve steady convergence at the expense of evaluating the full objective and gradient on each iteration. We explore hybrid methods that exhibit the benefits of both approaches. Rate-of-convergence analysis shows that by controlling the sample size in an incremental gradient algorithm, it is possible to maintain the steady convergence rates of full-gradient methods. We detail a practical quasi-Newton implementation based on this approach. Numerical experiments illustrate its potential benefits.
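A minimal sketch of the sample-size control idea on a toy least-squares sum (ours, not the paper's quasi-Newton implementation): the batch grows geometrically, trading cheap early iterations for steady late convergence.

```python
import numpy as np

rng = np.random.default_rng(1)
m, d = 10000, 10
A = rng.normal(size=(m, d))
b = A @ rng.normal(size=d) + 0.01 * rng.normal(size=m)

w = np.zeros(d)
batch, step = 10, 0.5
for it in range(200):
    idx = rng.choice(m, size=min(batch, m), replace=False)
    grad = A[idx].T @ (A[idx] @ w - b[idx]) / len(idx)   # subsampled gradient
    w -= step * grad
    batch = int(batch * 1.1) + 1       # geometric growth of the sample size
full_grad = A.T @ (A @ w - b) / m
print("final full-gradient norm:", np.linalg.norm(full_grad))
```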

289 citations


Journal ArticleDOI
TL;DR: The usefulness of the links highlighted in this paper for obtaining proofs of asymptotic synchronization in networks of identical nonlinear oscillators is illustrated via numerical simulations on some representative examples.
Abstract: In this paper, a relationship is discussed between three common assumptions made in the literature to prove local or global asymptotic stability of the synchronization manifold in networks of coupled nonlinear dynamical systems. In such networks, each node, when uncoupled, is described by a nonlinear ordinary differential equation of the form $\dot{x} = f(x,t)$. In this paper, we establish links between the QUAD condition on $f(x,t)$, i.e., $(x-y)^{\top}[f(x,t)-f(y,t)] - (x-y)^{\top}\Delta(x-y) \le -\omega(x-y)^{\top}(x-y)$ for some arbitrary $\Delta$ and $\omega$, and contraction theory. We then investigate the relationship between the assumption of $f$ being Lipschitz and the QUAD condition. We show the usefulness of the links highlighted in this paper to obtain proofs of asymptotic synchronization in networks of identical nonlinear oscillators and illustrate the results via numerical simulations on some representative examples.
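As a numerical illustration of this kind of result (our own toy, not the paper's examples): diffusively coupled scalar nodes $\dot{x}_i = f(x_i) - c\sum_j L_{ij}x_j$ with Lipschitz $f$ synchronize once the coupling gain dominates the Lipschitz constant.

```python
import numpy as np

N = 6
# Laplacian of a ring of 6 nodes; algebraic connectivity lambda_2 = 1.
L = 2 * np.eye(N) - np.roll(np.eye(N), 1, axis=0) - np.roll(np.eye(N), -1, axis=0)
f = np.sin            # Lipschitz with constant 1
c = 5.0               # c * lambda_2 = 5 > 1  =>  synchronization

x = np.random.default_rng(2).uniform(-2, 2, size=N)
dt = 0.001
for _ in range(20000):
    x = x + dt * (f(x) - c * (L @ x))     # forward Euler
print("max pairwise disagreement:", np.ptp(x))   # should be near 0
```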

224 citations


Journal ArticleDOI
TL;DR: This paper proposes a new routing/scheduling back-pressure algorithm that not only guarantees network stability (throughput optimality), but also adaptively selects a set of optimal routes based on shortest-path information in order to minimize average path lengths between each source and destination pair.
Abstract: Back-pressure-type algorithms based on the algorithm by Tassiulas and Ephremides have recently received much attention for jointly routing and scheduling over multihop wireless networks. However, this approach has a significant weakness in routing because the traditional back-pressure algorithm explores and exploits all feasible paths between each source and destination. While this extensive exploration is essential in order to maintain stability when the network is heavily loaded, under light or moderate loads, packets may be sent over unnecessarily long routes, and the algorithm could be very inefficient in terms of end-to-end delay and routing convergence times. This paper proposes a new routing/scheduling back-pressure algorithm that not only guarantees network stability (throughput optimality), but also adaptively selects a set of optimal routes based on shortest-path information in order to minimize average path lengths between each source and destination pair. Our results indicate that under the traditional back-pressure algorithm, the end-to-end packet delay first decreases and then increases as a function of the network load (arrival rate). This surprising low-load behavior is explained by the fact that the traditional back-pressure algorithm exploits all paths (including very long ones) even when the traffic load is light. On the other hand, the proposed algorithm adaptively selects a set of routes according to the traffic load so that long paths are used only when necessary, thus resulting in much smaller end-to-end packet delays as compared to the traditional back-pressure algorithm.
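For reference, a minimal sketch of the classical back-pressure decision rule the paper builds on (not the proposed shortest-path-aware algorithm): each link serves the commodity with the largest positive queue differential.

```python
def backpressure_weights(queues, links):
    """queues[node][commodity] -> backlog; links -> iterable of (u, v).
    Returns, per link, the commodity maximizing the queue differential
    and the resulting link weight max(0, Q_u[c] - Q_v[c])."""
    schedule = {}
    for u, v in links:
        best_c, best_w = None, 0
        for c in queues[u]:
            w = queues[u][c] - queues[v].get(c, 0)
            if w > best_w:
                best_c, best_w = c, w
        schedule[(u, v)] = (best_c, best_w)   # serve best_c if best_w > 0
    return schedule

# Toy 3-node line a-b-c with one commodity destined to node 'c'.
queues = {'a': {'c': 5}, 'b': {'c': 2}, 'c': {}}
print(backpressure_weights(queues, [('a', 'b'), ('b', 'c')]))
# {('a', 'b'): ('c', 3), ('b', 'c'): ('c', 2)}
```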

218 citations


Book ChapterDOI
10 May 2011
TL;DR: The recommended strategy (following Section 11.10 of Gelman et al., 2003) and the reasons for the recommendations are explained and illustrated with a relatively simple example from recent research: a hierarchical model fit to public-opinion survey data.
Abstract: Constructing efficient iterative simulation algorithms can be difficult, but inference and monitoring convergence are relatively easy. We first give our recommended strategy (following Section 11.10 of Gelman et al., 2003) and then explain the reasons for our recommendations, illustrating with a relatively simple example from our recent research: a hierarchical model fit to public-opinion survey data.
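A minimal sketch of the potential scale reduction factor underlying such convergence monitoring (our simplified version of the Gelman-Rubin diagnostic, not the chapter's code):

```python
import numpy as np

def gelman_rubin(chains):
    """chains: (m, n) array of m chains with n draws each.
    Returns the potential scale reduction factor R-hat (near 1 = converged)."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)           # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()     # within-chain variance
    var_plus = (n - 1) / n * W + B / n
    return np.sqrt(var_plus / W)

rng = np.random.default_rng(3)
mixed = rng.normal(size=(4, 1000))            # four well-mixed chains
stuck = mixed + np.arange(4)[:, None]         # chains stuck at different offsets
print("R-hat, mixed:", gelman_rubin(mixed))   # ~1.0
print("R-hat, stuck:", gelman_rubin(stuck))   # >> 1
```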

217 citations


Posted Content
TL;DR: In this paper, the authors consider the problem of optimizing the sum of a smooth convex function and a non-smooth convex function using proximal-gradient methods, where an error is present in the calculation of the gradient of the smooth term or in the proximity operator with respect to the non-smooth term, and show that both the basic proximal-gradient method and the accelerated proximal-gradient method achieve the same convergence rate as in the error-free case, provided that the errors decrease at appropriate rates.
Abstract: We consider the problem of optimizing the sum of a smooth convex function and a non-smooth convex function using proximal-gradient methods, where an error is present in the calculation of the gradient of the smooth term or in the proximity operator with respect to the non-smooth term. We show that both the basic proximal-gradient method and the accelerated proximal-gradient method achieve the same convergence rate as in the error-free case, provided that the errors decrease at appropriate rates. Using these rates, we perform as well as or better than a carefully chosen fixed error level on a set of structured sparsity problems.
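A minimal sketch of an inexact proximal-gradient step on an $\ell_1$-regularized least-squares toy problem (ours, not the authors' experiments), with a gradient error decaying fast enough to preserve the rate:

```python
import numpy as np

rng = np.random.default_rng(4)
m, d = 50, 20
A = rng.normal(size=(m, d))
x_true = np.zeros(d); x_true[:3] = [3.0, -2.0, 1.5]
b = A @ x_true + 0.01 * rng.normal(size=m)
lam = 0.5
L_lip = np.linalg.norm(A, 2) ** 2             # Lipschitz constant of the gradient

soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0)  # prox of t*||.||_1

x = np.zeros(d)
for k in range(1, 301):
    grad = A.T @ (A @ x - b)
    err = rng.normal(size=d) / k**2           # gradient error decaying as 1/k^2
    x = soft(x - (grad + err) / L_lip, lam / L_lip)
print("support found:", np.nonzero(np.abs(x) > 1e-3)[0])
```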

194 citations


Book
05 Sep 2011
TL;DR: In this article, the authors consider the problem of minimizing a smooth function of n variables subject to m smooth equality constraints, and give a new short proof of the Boggs-Tolle-Wang necessary and sufficient condition for Q-superlinear convergence of a class of quasi-Newton methods.
Abstract: We consider the problem of minimizing a smooth function of n variables subject to m smooth equality constraints. We begin by describing various approaches to Newton’s method for this problem, with emphasis on the recent work of Goodman. This leads to the proposal of a Broyden-type method which updates an $n \times (n - m)$ matrix approximating a “one-sided projected Hessian” of a Lagrangian function. This method is shown to converge Q-superlinearly. We also give a new short proof of the Boggs-Tolle-Wang necessary and sufficient condition for Q-superlinear convergence of a class of quasi-Newton methods for solving this problem. Finally, we describe an algorithm which updates an approximation to a “two-sided projected Hessian,” a symmetric matrix of order $n - m$ which is generally positive definite near a solution. We present several new variants of this algorithm and show that under certain conditions they all have a local two-step Q-superlinear convergence property, even though only one set of gradients ...

Journal ArticleDOI
TL;DR: In this paper, a Heaviside projection based topology optimization method with a scalar function that is filtered by a Helmholtz-type partial differential equation is proposed, where the optimality can be strictly discussed in terms of the KKT condition.
Abstract: This paper deals with topology optimization based on the Heaviside projection method using a scalar function as the design variable. The scalar function is then regularized by a PDE-based filter. Several image-processing-based filtering techniques have so far been proposed for regularization or for restricting the minimum length scale. They are conventionally applied to the design sensitivities rather than the design variables themselves. However, this causes discrepancies between the filtered sensitivities and the actual sensitivities that may confuse the optimization process and disturb convergence. In this paper, we propose a Heaviside projection based topology optimization method with a scalar function that is filtered by a Helmholtz-type partial differential equation. Therefore, the optimality can be strictly discussed in terms of the KKT condition. In order to demonstrate the effectiveness of the proposed method, a minimum compliance problem is solved.
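A minimal one-dimensional sketch of the PDE filter, solving the Helmholtz-type equation $-r^2\tilde{\psi}'' + \tilde{\psi} = \psi$ by finite differences (grid, filter radius, and Neumann boundary handling are our own illustrative choices):

```python
import numpy as np

def helmholtz_filter_1d(psi, r, h):
    """Filter a design field psi on a uniform grid with spacing h by solving
    -r^2 psi_f'' + psi_f = psi with homogeneous Neumann boundaries."""
    n = len(psi)
    c = (r / h) ** 2
    A = np.diag((1 + 2 * c) * np.ones(n)) \
        - c * np.diag(np.ones(n - 1), 1) - c * np.diag(np.ones(n - 1), -1)
    A[0, 1] -= c        # mirror ghost nodes => Neumann boundary
    A[-1, -2] -= c
    return np.linalg.solve(A, psi)

psi = (np.arange(100) % 10 < 5).astype(float)   # oscillatory raw design field
smooth = helmholtz_filter_1d(psi, r=0.05, h=0.01)
print("raw range:", psi.min(), psi.max(),
      "filtered range:", round(smooth.min(), 3), round(smooth.max(), 3))
```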

Journal ArticleDOI
TL;DR: A review of existing explicit approximations of the implicit Colebrook equation, with estimated accuracy, is given in this article; most of the available approximations are very accurate, with deviations of no more than a few percent.
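For context, the implicit Colebrook equation solved by fixed-point iteration next to one well-known explicit approximation (Swamee-Jain); the flow parameters are illustrative only:

```python
import math

def colebrook(Re, rel_rough, iters=50):
    """Solve 1/sqrt(f) = -2 log10(eps/(3.7 D) + 2.51/(Re sqrt(f)))
    by fixed-point iteration on x = 1/sqrt(f)."""
    x = 7.0                                   # initial guess for 1/sqrt(f)
    for _ in range(iters):
        x = -2.0 * math.log10(rel_rough / 3.7 + 2.51 * x / Re)
    return 1.0 / x**2

def swamee_jain(Re, rel_rough):
    """Explicit approximation to the Colebrook friction factor."""
    return 0.25 / math.log10(rel_rough / 3.7 + 5.74 / Re**0.9) ** 2

Re, rr = 1e5, 1e-4
f_impl, f_appr = colebrook(Re, rr), swamee_jain(Re, rr)
print(f"implicit {f_impl:.5f}  Swamee-Jain {f_appr:.5f}  "
      f"deviation {100 * abs(f_appr - f_impl) / f_impl:.2f}%")
```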

Journal ArticleDOI
TL;DR: A numerical comparison shows that the proposed decentralized dual algorithm converges faster, and with less communication, than a decentralized primal algorithm.

Journal ArticleDOI
TL;DR: It is shown that a simple adaptation of a consensus algorithm leads to an averaging algorithm, and lower bounds on the worst-case convergence time for various classes of linear, time-invariant, distributed consensus methods are proved.
Abstract: We study the convergence speed of distributed iterative algorithms for the consensus and averaging problems, with emphasis on the latter. We first consider the case of a fixed communication topology. We show that a simple adaptation of a consensus algorithm leads to an averaging algorithm. We prove lower bounds on the worst-case convergence time for various classes of linear, time-invariant, distributed consensus methods, and provide an algorithm that essentially matches those lower bounds. We then consider the case of a time-varying topology, and provide a polynomial-time averaging algorithm.
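A minimal sketch of a linear, time-invariant averaging iteration of the kind analyzed (our own toy: Metropolis weights on a fixed path graph, not the authors' algorithm):

```python
import numpy as np

# Path graph on 5 nodes; Metropolis weights give a doubly stochastic W,
# so the iteration x <- W x converges to the average of the initial values.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
deg = {i: 0 for i in range(5)}
for u, v in edges:
    deg[u] += 1; deg[v] += 1

W = np.zeros((5, 5))
for u, v in edges:
    W[u, v] = W[v, u] = 1.0 / (1 + max(deg[u], deg[v]))
np.fill_diagonal(W, 1.0 - W.sum(axis=1))

x = np.array([10.0, 0.0, 0.0, 0.0, 0.0])
for _ in range(200):
    x = W @ x
print("limit:", x.round(4), "true average:", 2.0)
```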

Proceedings ArticleDOI
26 Oct 2011
TL;DR: This paper develops a distributed computing framework, PrIter, which supports the prioritized execution of iterative computations, and shows that PrIter achieves up to 50x speedup over Hadoop for a series of iterative algorithms.
Abstract: Iterative computations are pervasive among data analysis applications in the cloud, including Web search, online social network analysis, recommendation systems, and so on. These cloud applications typically involve data sets of massive scale. Fast convergence of the iterative computation on the massive data set is essential for these applications. In this paper, we explore the opportunity for accelerating iterative computations and propose a distributed computing framework, PrIter, which enables fast iterative computation by providing support for prioritized iteration. Instead of performing computations on all data records without discrimination, PrIter prioritizes the computations that help convergence the most, so that the convergence speed of the iterative process is significantly improved. We evaluate PrIter on a local cluster of machines as well as on Amazon EC2 Cloud. The results show that PrIter achieves up to 50x speedup over Hadoop for a series of iterative algorithms.
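A minimal sketch of the prioritization idea (our own toy: a residual-based PageRank in which the largest pending update is always processed first; this is not PrIter's actual API):

```python
import heapq

def prioritized_pagerank(out_links, d=0.85, tol=1e-10):
    """Residual-push PageRank: always process the node with the largest
    pending residual, mimicking priority-ordered iteration."""
    nodes = list(out_links)
    rank = {u: 0.0 for u in nodes}
    residual = {u: (1 - d) / len(nodes) for u in nodes}
    heap = [(-residual[u], u) for u in nodes]
    heapq.heapify(heap)
    while heap:
        neg_r, u = heapq.heappop(heap)
        r = residual[u]
        if r <= tol or -neg_r != r:       # stale or negligible entry: skip
            continue
        rank[u] += r
        residual[u] = 0.0
        for v in out_links[u]:            # push residual mass to successors
            residual[v] += d * r / len(out_links[u])
            heapq.heappush(heap, (-residual[v], v))
    return rank

graph = {'a': ['b', 'c'], 'b': ['c'], 'c': ['a']}
print(prioritized_pagerank(graph))
```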

Journal ArticleDOI
TL;DR: In this paper, the concept of λ-statistical convergence of order α was introduced, and some relations between λ-statistical convergence and strong (V, λ)-summability were given.
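For reference, the definitions in question, in the standard notation of this literature (our reconstruction, not the paper's exact statement): $\lambda=(\lambda_n)$ is nondecreasing with $\lambda_n\to\infty$, $\lambda_{n+1}\le\lambda_n+1$, $\lambda_1=1$, and $I_n=[n-\lambda_n+1,\,n]$.

```latex
% lambda-statistical convergence of order alpha (0 < alpha <= 1):
\[
x_k \to L \ \bigl(S_\lambda^\alpha\bigr)
\quad\Longleftrightarrow\quad
\lim_{n\to\infty} \frac{1}{\lambda_n^{\alpha}}
\bigl|\{ k \in I_n : |x_k - L| \ge \varepsilon \}\bigr| = 0
\quad \text{for every } \varepsilon > 0.
\]
% Strong (V, lambda)-summability, to which it is compared:
\[
x_k \to L \ \bigl[V, \lambda\bigr]
\quad\Longleftrightarrow\quad
\lim_{n\to\infty} \frac{1}{\lambda_n} \sum_{k \in I_n} |x_k - L| = 0.
\]
```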

Posted Content
TL;DR: In this article, the authors give some explicit calculations for stable distributions and convergence to them, mainly based on less explicit results in Feller (1971).
Abstract: We give some explicit calculations for stable distributions and convergence to them, mainly based on less explicit results in Feller (1971). The main purpose is to provide ourselves with easy reference to explicit formulas and examples. (There are probably no new results.)

Journal ArticleDOI
TL;DR: This work considers the solution of generalized Nash equilibrium problems by concatenating the KKT optimality conditions of each player’s optimization problem into a single KKT-like system and shows that it is possible to establish global convergence under sensible conditions.
Abstract: We consider the solution of generalized Nash equilibrium problems by concatenating the KKT optimality conditions of each player’s optimization problem into a single KKT-like system. We then propose two approaches for solving this KKT system. The first approach is rather simple and uses a merit-function/equation-based technique for the solution of the KKT system. The second approach, partially motivated by the shortcomings of the first one, is an interior-point-based method. We show that this second approach has strong theoretical properties and, in particular, that it is possible to establish global convergence under sensible conditions, this probably being the first result of its kind in the literature. We discuss the results of an extensive numerical testing on four KKT-based solution algorithms, showing that the new interior-point method is efficient and very robust.
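Schematically, in our notation (not the paper's): if player $\nu$ solves $\min_{x^\nu}\theta_\nu(x^\nu,x^{-\nu})$ subject to constraints $g(x)\le 0$, the concatenated KKT-like system stacks, over all players,

```latex
\[
\nabla_{x^\nu} \theta_\nu(x) + \sum_{i} \lambda_i^{\nu}\, \nabla_{x^\nu} g_i(x) = 0,
\qquad
0 \le \lambda^{\nu} \ \perp\ -g(x) \ge 0,
\qquad \nu = 1, \dots, N.
\]
```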

Journal ArticleDOI
TL;DR: In this paper, the authors reexamine the band gap with the full-potential linearized augmented-plane-wave method and find that even with 3000 bands, the gap is not completely converged.
Abstract: Recently, Shih et al. [Phys. Rev. Lett. 105, 146401 (2010)] published a theoretical band gap for wurtzite ZnO, calculated with the non-self-consistent $\mathit{GW}$ approximation, that agreed surprisingly well with experiment while deviating strongly from previous studies. They showed that a very large number of empty bands is necessary to converge the gap. We reexamine the $\mathit{GW}$ calculation with the full-potential linearized augmented-plane-wave method and find that even with 3000 bands the band gap is not completely converged. A hyperbolic fit is used to extrapolate to infinite bands. Furthermore, we eliminate the linearization error for high-lying states with local orbitals. In fact, our calculated band gap is considerably larger than in previous studies, but somewhat smaller than that of Shih et al.

Journal ArticleDOI
TL;DR: A KM–CQ-like algorithm is presented, which combines the KM algorithm with the CQ algorithm by introducing two parameter sequences for solving the split feasibility problem and under some parametric controlling conditions, the strong convergence of the algorithm is shown.
Abstract: It is well known that the Krasnosel'skii–Mann algorithm and the CQ algorithm for a split feasibility problem are not strongly convergent. In this paper, we present a KM–CQ-like algorithm with strong convergence, which combines the KM algorithm with the CQ algorithm by introducing two parameter sequences for solving the split feasibility problem. Under some parametric controlling conditions, the strong convergence of the algorithm is shown. Finally, we propose a modified KM–CQ-like algorithm and establish its strong convergence theorem.
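A minimal sketch of the underlying CQ iteration, $x \leftarrow P_C\bigl(x - \gamma A^{\top}(Ax - P_Q(Ax))\bigr)$, on a toy box-constrained instance (ours, not the paper's KM-CQ-like scheme):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.normal(size=(3, 4))
P_C = lambda x: np.clip(x, -1.0, 1.0)        # C = [-1, 1]^4
P_Q = lambda y: np.clip(y, -0.5, 0.5)        # Q = [-0.5, 0.5]^3

gamma = 1.0 / np.linalg.norm(A, 2) ** 2      # step in (0, 2/||A||^2)
x = rng.normal(size=4) * 2                   # arbitrary starting point
for _ in range(500):
    y = A @ x
    x = P_C(x - gamma * A.T @ (y - P_Q(y)))
print("residual dist(Ax, Q):", np.linalg.norm(A @ x - P_Q(A @ x)))
```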

Journal ArticleDOI
TL;DR: In this article, it was shown that in the large-t limit, the extremal particles (first-, second-, third-largest, etc.) descend with overwhelming probability from ancestors having split either within a distance of order 1 from time 0, or within a distance of order 1 from time t. The approach relies on characterizing, up to a certain level of precision, the paths of the extremal particles.
Abstract: Branching Brownian motion describes a system of particles that diffuse in space and split into offspring according to a certain random mechanism. By virtue of the groundbreaking work by M. Bramson on the convergence of solutions of the Fisher-KPP equation to traveling waves, the law of the rightmost particle in the limit of large times is rather well understood. In this work, we address the full statistics of the extremal particles (first-, second-, third-largest, etc.). In particular, we prove that in the large t-limit, such particles descend with overwhelming probability from ancestors having split either within a distance of order 1 from time 0, or within a distance of order 1 from time t. The approach relies on characterizing, up to a certain level of precision, the paths of the extremal particles. As a byproduct, a heuristic picture of branching Brownian motion “at the edge” emerges, which sheds light on the still unknown limiting extremal process. © 2011 Wiley Periodicals, Inc.

Journal ArticleDOI
TL;DR: The global convergence of the neural network can be guaranteed even though the objective function is pseudoconvex, and the finite-time state convergence to the feasible region defined by the equality constraints is proved.
Abstract: In this paper, a one-layer recurrent neural network is presented for solving pseudoconvex optimization problems subject to linear equality constraints. The global convergence of the neural network can be guaranteed even though the objective function is pseudoconvex. The finite-time state convergence to the feasible region defined by the equality constraints is also proved. In addition, global exponential convergence is proved when the objective function is strongly pseudoconvex on the feasible region. Simulation results on illustrative examples and application on chemical process data reconciliation are provided to demonstrate the effectiveness and characteristics of the neural network.

Journal ArticleDOI
TL;DR: It has been observed that, by appropriately choosing the data assignment criterion, the proposed on-line method can be extended to handle the identification of piecewise affine models as well; the method is tested through computer simulations and the modeling of an open-channel system.

Journal ArticleDOI
TL;DR: In this paper, the authors provide an explicit rigorous derivation of a diffusion limit from a deterministic skew-product flow, which is assumed to exhibit time-scale separation and has the form of a slowly evolving system driven by a fast chaotic flow.
Abstract: We provide an explicit rigorous derivation of a diffusion limit—a stochastic differential equation (SDE) with additive noise—from a deterministic skew-product flow. This flow is assumed to exhibit time-scale separation and has the form of a slowly evolving system driven by a fast chaotic flow. Under mild assumptions on the fast flow, we prove convergence to a SDE as the time-scale separation grows. In contrast to existing work, we do not require the flow to have good mixing properties. As a consequence, our results incorporate a large class of fast flows, including the classical Lorenz equations.

Journal ArticleDOI
TL;DR: In this article, the transport current and magnetization problems for superconducting tape coils and Roebel cables are solved using an efficient numerical scheme based on a variational formulation of the Kim critical-state model.
Abstract: Superconducting tape coils and Roebel cables are often modeled as stacks of parallel superconducting tapes carrying the same transport current. We solved, in the infinitely thin approximation, the transport current and magnetization problems for such stacks using an efficient numerical scheme based on a variational formulation of the Kim critical-state model. We also refined the anisotropic bulk approximation, introduced by Clem et al. in order to simplify AC loss estimates for densely packed stacks of many tapes; this was achieved by removing the simplifying a priori assumptions on the current sheet density in the subcritical zone and the shape of this zone boundary. Finally, we studied the convergence of stack problem solutions to the solution of the modified bulk problem. It was shown that, due to the fast convergence to the anisotropic bulk limit, accurate AC loss estimates for stacks of hundreds of tapes can usually also be obtained using a properly rescaled model of a stack containing only ten to twenty tapes.

Journal ArticleDOI
TL;DR: A continuous-time fixed-point algorithm is introduced and its convergence is proved for effective path delay operators that allow a limited type of nonmonotone path delay; it is also shown that the DUE algorithm is compatible with network loading based on the LDM and the cell transmission model due to Daganzo (1995).
Abstract: In this paper we present a dual-time-scale formulation of dynamic user equilibrium (DUE) with demand evolution. Our formulation belongs to the problem class that Pang and Stewart (2008) refer to as differential variational inequalities. It combines the within-day time scale for which route and departure time choices fluctuate in continuous time with the day-to-day time scale for which demand evolves in discrete time steps. Our formulation is consistent with the often told story that drivers adjust their travel demands at the end of every day based on their congestion experience during one or more previous days. We show that analysis of the within-day assignment model is tremendously simplified by expressing dynamic user equilibrium as a differential variational inequality. We also show there is a class of day-to-day demand growth models that allow the dual-time-scale formulation to be decomposed by time-stepping to yield a sequence of continuous time, single-day, dynamic user equilibrium problems. To solve the single-day DUE problems arising during time-stepping, it is necessary to repeatedly solve a dynamic network loading problem. We observe that the network loading phase of DUE computation generally constitutes a differential algebraic equation (DAE) system, and we show that the DAE system for network loading based on the link delay model (LDM) of Friesz et al. (1993) may be approximated by a system of ordinary differential equations (ODEs). That system of ODEs, as we demonstrate, may be efficiently solved using traditional numerical methods for such problems. To compute an actual dynamic user equilibrium, we introduce a continuous time fixed-point algorithm and prove its convergence for effective path delay operators that allow a limited type of nonmonotone path delay. We show that our DUE algorithm is compatible with network loading based on the LDM and the cell transmission model (CTM) due to Daganzo (1995). We provide a numerical example based on the much studied Sioux Falls network.
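As a drastically simplified static analogue of the fixed-point idea (our toy two-route example, not the paper's continuous-time algorithm): flow is shifted toward the cheaper route by a projected adjustment until the delays equalize.

```python
import numpy as np

Q = 10.0                          # total travel demand
D1 = lambda h: 1.0 + 0.5 * h      # route delay functions (toy)
D2 = lambda h: 3.0 + 0.2 * h

h1, step = Q / 2, 0.2
for _ in range(200):
    # shift flow toward the currently cheaper route, projected to [0, Q]
    h1 = np.clip(h1 - step * (D1(h1) - D2(Q - h1)), 0.0, Q)
print(f"h1 = {h1:.3f}, delays: {D1(h1):.3f} vs {D2(Q - h1):.3f}")
```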

Patent
30 Nov 2011
TL;DR: In this paper, the authors describe techniques for using routing information obtained by operation of network routing protocols to dynamically generate network and cost maps for an application-layer traffic optimization (ALTO) service.
Abstract: In general, techniques are described for using routing information obtained by operation of network routing protocols to dynamically generate network and cost maps for an application-layer traffic optimization (ALTO) service. For example, an ALTO server of an autonomous system (AS) receives routing information from routers of the AS by listening for routing protocol updates outputted by the routers and uses the received topology information to dynamically generate a network map of PIDs that reflects a current topology of the AS and/or of the broader network that includes the AS. Additionally, the ALTO server dynamically calculates inter-PID costs using received routing information that reflects current link metrics. The ALTO server then assembles the inter-PID costs into a cost map that the ALTO server may provide, along with the network map, to clients of the ALTO service.
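A minimal sketch of the dynamic cost-map computation described (the data structures are our own; the patent specifies no code): inter-PID costs are recomputed as shortest-path costs over current link metrics.

```python
import heapq

def shortest_costs(adj, src):
    """Dijkstra over a metric-weighted topology: adj[u] -> [(v, metric)]."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def build_cost_map(adj, pids):
    """pids: PID name -> representative node. Returns the inter-PID cost map."""
    return {a: {b: shortest_costs(adj, u).get(v)
                for b, v in pids.items()}
            for a, u in pids.items()}

topology = {'r1': [('r2', 10), ('r3', 5)], 'r2': [('r1', 10)],
            'r3': [('r1', 5), ('r2', 2)]}
print(build_cost_map(topology, {'pid1': 'r1', 'pid2': 'r2'}))
```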

Journal ArticleDOI
TL;DR: A distributed game-based channel allocation (GBCA) algorithm is proposed that takes into account both network topology and routing information, and it is proved that there exists at least one Nash equilibrium for the problem.
Abstract: In this paper, multi-channel allocation in wireless sensor and actuator networks is formulated as an optimization problem which is NP-hard. In order to solve this problem efficiently, a distributed game-based channel allocation (GBCA) algorithm is proposed that takes into account both network topology and routing information. For both tree/forest routing and non-tree/forest routing scenarios, it is proved that there exists at least one Nash equilibrium for the problem. Furthermore, the suboptimality of the Nash equilibria and the convergence of the best-response dynamics are also analyzed. Simulation results demonstrate that GBCA significantly reduces interference and dramatically improves network performance in terms of delivery ratio, throughput, channel access delay, and energy consumption.
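A minimal sketch of best-response dynamics for interference-minimizing channel selection (our own toy potential game, not the GBCA algorithm itself): each node in turn picks the channel least used by its interfering neighbors, terminating at a Nash equilibrium.

```python
import random

def best_response_channels(neighbors, n_channels, rounds=20, seed=6):
    """neighbors: node -> set of interfering nodes (symmetric). Each node
    repeatedly switches to the channel minimizing conflicts with neighbors."""
    rng = random.Random(seed)
    chan = {u: rng.randrange(n_channels) for u in neighbors}
    for _ in range(rounds):
        changed = False
        for u in neighbors:
            conflicts = lambda c: sum(chan[v] == c for v in neighbors[u])
            best = min(range(n_channels), key=conflicts)
            if conflicts(best) < conflicts(chan[u]):
                chan[u], changed = best, True
        if not changed:        # no node wants to deviate: Nash equilibrium
            break
    return chan

g = {'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b', 'd'}, 'd': {'c'}}
print(best_response_channels(g, n_channels=3))
```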

Journal ArticleDOI
TL;DR: It is shown that using control input information from neighbors improves performance in two aspects: the convergence rate using time-delayed control inputs can still be selected with considerable freedom, and it remains superior to that of the standard local voting protocol, which depends on the graph topology.