Topic

Rate of convergence

About: Rate of convergence is a research topic. Over the lifetime, 31,257 publications have been published within this topic, receiving 795,334 citations. The topic is also known as: convergence rate.


Papers
Journal ArticleDOI
TL;DR: It is proved that the random consensus value is, in expectation, the average of the initial node measurements and that it can be made arbitrarily close to this value in the mean squared error sense, under a balanced connectivity model and by trading off convergence speed with accuracy of the computation.
Abstract: Motivated by applications to wireless sensor, peer-to-peer, and ad hoc networks, we study distributed broadcasting algorithms for exchanging information and computing in an arbitrarily connected network of nodes. Specifically, we study a broadcasting-based gossiping algorithm to compute the (possibly weighted) average of the initial measurements of the nodes at every node in the network. We show that the broadcast gossip algorithm converges almost surely to a consensus. We prove that the random consensus value is, in expectation, the average of the initial node measurements and that it can be made arbitrarily close to this value in the mean squared error sense, under a balanced connectivity model and by trading off convergence speed with accuracy of the computation. We provide theoretical and numerical results on the mean squared error performance and the convergence rate, and study the effect of the "mixing parameter" on the convergence rate of the broadcast gossip algorithm. The results indicate that the mean squared error strictly decreases through iterations until the consensus is achieved. Finally, we assess and compare the communication cost of the broadcast gossip algorithm to achieve a given distance to consensus through theoretical and numerical results.
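
For illustration, a minimal sketch of broadcast gossip rounds in Python, assuming a generic variant in which a uniformly random node broadcasts its current value and every neighbor mixes it in with a mixing parameter gamma; the exact update weights and wake-up model in the paper may differ.

import random

def broadcast_gossip(x, neighbors, gamma=0.5, iterations=10_000, seed=0):
    """x: list of initial node measurements; neighbors: dict node -> list of neighbor ids."""
    rng = random.Random(seed)
    x = list(x)
    n = len(x)
    for _ in range(iterations):
        i = rng.randrange(n)            # a uniformly random node wakes up and broadcasts x[i]
        for j in neighbors[i]:          # every neighbor mixes its value with the broadcast
            x[j] = gamma * x[j] + (1.0 - gamma) * x[i]
        # the broadcaster keeps its own value unchanged in this illustrative variant
    return x

# Example: a ring of 6 nodes; the consensus value is, in expectation, the initial average.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(broadcast_gossip([0, 1, 2, 3, 4, 5], ring, gamma=0.7))

A gamma close to 1 makes each neighbor move only slightly toward the broadcaster, which tends to slow convergence while keeping the consensus closer to the initial average, mirroring the speed/accuracy tradeoff discussed in the abstract.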

516 citations

Journal ArticleDOI
TL;DR: A simple and efficient adaptive FEM is constructed for elliptic partial differential equations (PDEs) that achieves a linear rate of convergence without any preliminary mesh adaptation or explicit knowledge of constants.
Abstract: Data oscillation is intrinsic information missed by the averaging process associated with finite element methods (FEM) regardless of quadrature. Ensuring a reduction rate of data oscillation, together with an error reduction based on a posteriori error estimators, we construct a simple and efficient adaptive FEM for elliptic partial differential equations (PDEs) with a linear rate of convergence, without any preliminary mesh adaptation or explicit knowledge of constants. Any prescribed error tolerance is thus achieved in a finite number of steps. A number of numerical experiments in two and three dimensions yield quasi-optimal meshes along with a competitive performance.
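
For intuition, a toy 1D sketch of the standard adaptive loop SOLVE -> ESTIMATE -> MARK -> REFINE, with piecewise linear interpolation of u(x) = sqrt(x) standing in for the finite element solve and the local interpolation error acting as the estimator; this only caricatures the paradigm and is not the paper's estimator or marking rule.

import math

def adapt(u, mesh, tol=1e-3, theta=0.5, max_iter=40):
    for _ in range(max_iter):
        # ESTIMATE: midpoint deviation from the linear interpolant on each element
        eta = []
        for a, b in zip(mesh[:-1], mesh[1:]):
            mid = 0.5 * (a + b)
            eta.append(abs(u(mid) - 0.5 * (u(a) + u(b))))
        total = math.sqrt(sum(e * e for e in eta))
        if total < tol:
            break
        # MARK: bulk (Doerfler) marking -- pick elements carrying a theta-fraction of the error
        order = sorted(range(len(eta)), key=lambda k: -eta[k])
        marked, acc = set(), 0.0
        for k in order:
            marked.add(k)
            acc += eta[k] ** 2
            if acc >= theta * total ** 2:
                break
        # REFINE: bisect marked elements
        new_mesh = []
        for k, (a, b) in enumerate(zip(mesh[:-1], mesh[1:])):
            new_mesh.append(a)
            if k in marked:
                new_mesh.append(0.5 * (a + b))
        new_mesh.append(mesh[-1])
        mesh = new_mesh
    return mesh, total

mesh, err = adapt(lambda x: math.sqrt(x), [0.0, 0.5, 1.0])
print(len(mesh), err)   # refinement concentrates near x = 0, where sqrt is least smooth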

515 citations

Journal ArticleDOI
TL;DR: Three novel algorithms are developed to estimate the regression coefficients via the Lasso when the training data are distributed across different agents and their communication to a central processing unit is prohibited due to, e.g., communication cost or privacy reasons.
Abstract: The Lasso is a popular technique for joint estimation and continuous variable selection, especially well-suited for sparse and possibly under-determined linear regression problems. This paper develops algorithms to estimate the regression coefficients via Lasso when the training data are distributed across different agents, and their communication to a central processing unit is prohibited due to, e.g., communication cost or privacy reasons. A motivating application is explored in the context of wireless communications, whereby sensing cognitive radios collaborate to estimate the radio-frequency power spectrum density. Attaining different tradeoffs between complexity and convergence speed, three novel algorithms are obtained after reformulating the Lasso into a separable form, which is iteratively minimized using the alternating-direction method of multipliers so as to gain the desired degree of parallelization. Interestingly, the per-agent estimate updates are given by simple soft-thresholding operations, and inter-agent communication overhead remains at an affordable level. Without exchanging elements from the different training sets, the local estimates consent to the global Lasso solution, i.e., the fit that would be obtained if the entire data set were centrally available. Numerical experiments with both simulated and real data demonstrate the merits of the proposed distributed schemes, corroborating their convergence and global optimality. The ideas in this paper can be easily extended for the purpose of fitting related models in a distributed fashion, including the adaptive Lasso, elastic net, fused Lasso and nonnegative garrote.
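
To make the soft-thresholding structure concrete, a hedged sketch of consensus ADMM for a Lasso fit split across agents, in its generic textbook form rather than the exact recursions of any of the paper's three algorithms; variable names and the penalty scaling are assumptions for illustration.

import numpy as np

def soft_threshold(v, kappa):
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def distributed_lasso(blocks, lam=0.1, rho=1.0, iters=200):
    """blocks: list of (A_k, b_k) pairs, one per agent."""
    p = blocks[0][0].shape[1]
    K = len(blocks)
    x = [np.zeros(p) for _ in range(K)]   # local estimates
    u = [np.zeros(p) for _ in range(K)]   # scaled dual variables
    z = np.zeros(p)                       # consensus variable
    # pre-factor the per-agent quadratic subproblems
    factors = [np.linalg.inv(A.T @ A + rho * np.eye(p)) for A, _ in blocks]
    for _ in range(iters):
        for k, (A, b) in enumerate(blocks):
            x[k] = factors[k] @ (A.T @ b + rho * (z - u[k]))   # local least-squares step
        # shared update: a single soft-thresholding of the averaged local estimates
        z = soft_threshold(np.mean([x[k] + u[k] for k in range(K)], axis=0), lam / (rho * K))
        for k in range(K):
            u[k] += x[k] - z
    return z

# Example: two agents, sparse ground truth
rng = np.random.default_rng(0)
w = np.array([1.5, 0.0, 0.0, -2.0, 0.0])
def make_block():
    A = rng.normal(size=(30, 5))
    return A, A @ w + 0.01 * rng.normal(size=30)
blocks = [make_block() for _ in range(2)]
print(np.round(distributed_lasso(blocks, lam=0.5), 2))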

514 citations

Journal ArticleDOI
TL;DR: In this paper, the authors studied the sparsistency and rates of convergence for estimating sparse covariance and precision matrices based on penalized likelihood with nonconvex penalty functions.
Abstract: This paper studies the sparsistency and rates of convergence for estimating sparse covariance and precision matrices based on penalized likelihood with nonconvex penalty functions. Here, sparsistency refers to the property that all parameters that are zero are actually estimated as zero with probability tending to one. Depending on the application, sparsity may occur a priori on the covariance matrix, its inverse or its Cholesky decomposition. We study these three sparsity exploration problems under a unified framework with a general penalty function. We show that the rates of convergence for these problems under the Frobenius norm are of order (s_n log p_n / n)^{1/2}, where s_n is the number of nonzero elements, p_n is the size of the covariance matrix and n is the sample size. This explicitly spells out that the contribution of high dimensionality is merely a logarithmic factor. The conditions on the rate at which the tuning parameter λ_n goes to 0 have been made explicit and compared under different penalties. As a result, for the L_1 penalty, to guarantee sparsistency and the optimal rate of convergence, the number of nonzero elements should be small: s_n' = O(p_n) at most, among O(p_n^2) parameters, for estimating a sparse covariance or correlation matrix, sparse precision or inverse correlation matrix, or sparse Cholesky factor, where s_n' is the number of nonzero elements in the off-diagonal entries. On the other hand, using the SCAD or hard-thresholding penalty functions, there is no such restriction.
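
For reference, the standard textbook forms of the three penalty functions compared above (L_1, hard thresholding, and SCAD with the customary choice a = 3.7), written as small Python helpers; the paper's exact parametrization may differ.

import numpy as np

def l1_penalty(theta, lam):
    return lam * np.abs(theta)

def hard_threshold_penalty(theta, lam):
    # p(theta) = lam^2 - (lam - |theta|)_+^2 : constant once |theta| >= lam
    return lam ** 2 - np.maximum(lam - np.abs(theta), 0.0) ** 2

def scad_penalty(theta, lam, a=3.7):
    t = np.abs(theta)
    small = lam * t
    middle = -(t ** 2 - 2 * a * lam * t + lam ** 2) / (2 * (a - 1))
    large = (a + 1) * lam ** 2 / 2
    return np.where(t <= lam, small, np.where(t <= a * lam, middle, large))

theta = np.linspace(-5, 5, 11)
print(l1_penalty(theta, 1.0))
print(hard_threshold_penalty(theta, 1.0))
print(scad_penalty(theta, 1.0))   # flat beyond a*lam = 3.7, so large entries are not over-penalized

Because the SCAD and hard-thresholding penalties level off for large entries, they do not keep shrinking large nonzero parameters the way L_1 does, which is the informal reason the restriction on s_n' disappears for them.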

509 citations

Posted Content
TL;DR: This paper investigates the optimality of SGD in a stochastic setting and shows that for smooth problems the algorithm attains the optimal O(1/T) rate; however, for non-smooth problems the convergence rate with averaging might really be Ω(log(T)/T), and this is not just an artifact of the analysis.
Abstract: Stochastic gradient descent (SGD) is a simple and popular method to solve stochastic optimization problems which arise in machine learning. For strongly convex problems, its convergence rate was known to be O(log(T)/T), by running SGD for T iterations and returning the average point. However, recent results showed that using a different algorithm, one can get an optimal O(1/T) rate. This might lead one to believe that standard SGD is suboptimal, and maybe should even be replaced as a method of choice. In this paper, we investigate the optimality of SGD in a stochastic setting. We show that for smooth problems, the algorithm attains the optimal O(1/T) rate. However, for non-smooth problems, the convergence rate with averaging might really be Ω(log(T)/T), and this is not just an artifact of the analysis. On the flip side, we show that a simple modification of the averaging step suffices to recover the O(1/T) rate, and no other change of the algorithm is necessary. We also present experimental results which support our findings, and point out open problems.
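
As a sketch of the kind of averaging modification in question, the snippet below runs SGD with 1/(mu*t) step sizes on a strongly convex, non-smooth toy objective and reports the last iterate, the full average, and a suffix average over the last half of the iterates; the toy problem and this particular suffix scheme are assumptions for illustration, not the paper's construction or lower-bound example.

import random

def sgd_averaging(T=100_000, mu=1.0, noise=1.0, seed=0):
    # f(x) = 0.5 * mu * x**2 + |x| is strongly convex and non-smooth at its minimizer 0
    rng = random.Random(seed)
    x = 10.0
    iterates = []
    for t in range(1, T + 1):
        subgrad = mu * x + (1.0 if x > 0 else -1.0 if x < 0 else 0.0)
        g = subgrad + rng.gauss(0.0, noise)   # noisy subgradient oracle
        x -= g / (mu * t)                     # step size 1/(mu * t)
        iterates.append(x)
    full_avg = sum(iterates) / T
    suffix_avg = sum(iterates[T // 2:]) / (T - T // 2)   # average only the last half
    return x, full_avg, suffix_avg

last, full, suffix = sgd_averaging()
print(f"last iterate {last:+.5f}  full average {full:+.5f}  suffix average {suffix:+.5f}")

The only change between the two estimators is which iterates enter the average, which is the sense in which the abstract's fix requires "no other change of the algorithm".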

509 citations


Network Information
Related Topics (5)
Partial differential equation: 70.8K papers, 1.6M citations (89% related)
Markov chain: 51.9K papers, 1.3M citations (88% related)
Optimization problem: 96.4K papers, 2.1M citations (88% related)
Differential equation: 88K papers, 2M citations (88% related)
Nonlinear system: 208.1K papers, 4M citations (88% related)
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2024    1
2023    693
2022    1,530
2021    2,129
2020    2,036
2019    1,995