
Showing papers by "Ali H. Sayed published in 2016"


Journal ArticleDOI
TL;DR: Adaptive filters are at the core of many signal processing applications, ranging from acoustic noise suppression to echo cancellation to array beamforming.
Abstract: Adaptive filters are at the core of many signal processing applications, ranging from acoustic noise suppression to echo cancellation [1], array beamforming [2], and channel equalization [3], to more recent sensor network applications in surveillance, target localization, and tracking. A trending approach in this direction is to resort to in-network distributed processing, in which individual nodes implement adaptation rules and diffuse their estimates across the network [4], [5].
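The adaptation rule at the heart of such filters is typically an LMS-type update. As a minimal illustration — a hypothetical stand-alone (non-distributed) setup, not the networked algorithms surveyed in the paper — a single LMS filter identifying an unknown FIR response might look like:

```python
import numpy as np

# Minimal single-agent LMS sketch (illustrative setup): identify an unknown
# FIR system w_true from a stream of input samples, noise-free for clarity.
rng = np.random.default_rng(0)
M = 4                          # filter length
w_true = rng.standard_normal(M)
w = np.zeros(M)                # adaptive weight estimate
mu = 0.05                      # step-size

x = rng.standard_normal(2000)  # streaming input
for n in range(M, len(x)):
    u = x[n - M:n][::-1]       # regressor (most recent sample first)
    d = w_true @ u             # desired response
    e = d - w @ u              # a priori estimation error
    w = w + mu * e * u         # LMS update

err = np.linalg.norm(w - w_true)
```

With white input and a small step-size, the weight-error norm `err` decays to a negligible value; distributed (diffusion) variants interleave such updates with combination steps across neighbors.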

115 citations


Journal ArticleDOI
TL;DR: This work considers multitask learning problems where clusters of nodes are interested in estimating their own parameter vector and proposes a fully distributed algorithm that relies on minimizing a global mean-square error criterion regularized by nondifferentiable terms to promote cooperation among neighboring clusters.
Abstract: In this work, we consider multitask learning problems where clusters of nodes are interested in estimating their own parameter vector. Cooperation among clusters is beneficial when the optimal models of adjacent clusters have a good number of similar entries. We propose a fully distributed algorithm for solving this problem. The approach relies on minimizing a global mean-square error criterion regularized by nondifferentiable terms to promote cooperation among neighboring clusters. A general diffusion forward–backward splitting strategy is introduced. Then, it is specialized to the case of sparsity promoting regularizers. A closed-form expression for the proximal operator of a weighted sum of $\ell_1$-norms is derived to achieve higher efficiency. We also provide conditions on the step-sizes that ensure convergence of the algorithm in the mean and mean-square error sense. Simulations are conducted to illustrate the effectiveness of the strategy.
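For the sparsity-promoting case, the proximal operator of a single weighted ℓ1 term has the well-known closed form of entrywise soft-thresholding. A minimal sketch of that building block (illustrative only; the paper derives the operator for a weighted sum of ℓ1-norms):

```python
import numpy as np

def prox_weighted_l1(v, weights, step):
    """Proximal operator of x -> sum_i weights[i] * |x[i]|:
    entrywise soft-thresholding with per-entry thresholds step * weights[i].
    This is the standard closed form for a single weighted l1 term."""
    t = step * np.asarray(weights)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

v = np.array([3.0, -0.2, 1.0, -2.5])
w = np.array([1.0, 1.0, 2.0, 0.5])
p = prox_weighted_l1(v, w, step=0.5)   # -> [2.5, 0.0, 0.0, -2.25]
```

In a forward-backward splitting scheme, a gradient (forward) step on the smooth mean-square-error term is followed by this proximal (backward) step on the nondifferentiable regularizer.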

74 citations


Posted Content
TL;DR: In this paper, the authors proposed an asynchronous, decentralized algorithm for consensus optimization, where each agent can compute and communicate independently at different times, for different durations, with the information it has even if the latest information from its neighbors is not yet available.
Abstract: We propose an asynchronous, decentralized algorithm for consensus optimization. The algorithm runs over a network in which the agents communicate with their neighbors and perform local computation. In the proposed algorithm, each agent can compute and communicate independently at different times, for different durations, with the information it has even if the latest information from its neighbors is not yet available. Such an asynchronous algorithm reduces the time that agents would otherwise waste idle because of communication delays or because their neighbors are slower. It also eliminates the need for a global clock for synchronization. Mathematically, the algorithm involves both primal and dual variables, uses fixed step-size parameters, and provably converges to the exact solution under a bounded delay assumption and a random agent assumption. When running synchronously, the algorithm performs just as well as existing competitive synchronous algorithms such as PG-EXTRA, which diverges without synchronization. Numerical experiments confirm the theoretical findings and illustrate the performance of the proposed algorithm.

59 citations


Journal ArticleDOI
TL;DR: In this article, a model for the solution of multitask problems over asynchronous networks is described and a detailed mean and mean-square error analysis is carried out, which shows that sufficiently small step-sizes can still ensure both stability and performance.
Abstract: The multitask diffusion LMS is an efficient strategy to simultaneously infer, in a collaborative manner, multiple parameter vectors. Existing works on multitask problems assume that all agents respond to data synchronously. In several applications, agents may not be able to act synchronously because networks can be subject to several sources of uncertainties such as changing topology, random link failures, or agents turning on and off for energy conservation. In this paper, we describe a model for the solution of multitask problems over asynchronous networks and carry out a detailed mean and mean-square error analysis. Results show that sufficiently small step-sizes can still ensure both stability and performance. Simulations and illustrative examples are provided to verify the theoretical findings.

52 citations


Journal ArticleDOI
TL;DR: In this paper, a robust adaptive filtering algorithm is developed that effectively learns and tracks the output error distribution to improve estimation performance.
Abstract: The popular least-mean-squares (LMS) algorithm for adaptive filtering is nonrobust against impulsive noise in the measurements. The presence of this type of noise degrades the transient and steady-state performance of the algorithm. Since the distribution of the impulsive noise is generally unknown, a robust semi-parametric approach to adaptive filtering is warranted, where the output error nonlinearity is adapted jointly with the parameter of interest. In this paper, a robust adaptive filtering algorithm is developed that effectively learns and tracks the output error distribution to improve estimation performance. The performance of the algorithm is analyzed mathematically and validated experimentally.
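A classical fixed error nonlinearity such as the sign function already illustrates why bounding the error term confers robustness to impulsive noise. The sketch below uses sign-error LMS with an assumed contamination model — it is a hypothetical baseline, not the paper's jointly adaptive nonlinearity:

```python
import numpy as np

# Sign-error LMS sketch: clipping the error bounds each update, so rare
# large-amplitude impulses cannot throw the weight estimate far off course.
rng = np.random.default_rng(1)
M, mu = 4, 0.01
w_true = rng.standard_normal(M)
w = np.zeros(M)

for n in range(5000):
    u = rng.standard_normal(M)
    noise = 0.01 * rng.standard_normal()
    if rng.random() < 0.05:                # 5% impulsive outliers
        noise += 100.0 * rng.standard_normal()
    d = w_true @ u + noise
    e = d - w @ u
    w = w + mu * np.sign(e) * u            # update is bounded regardless of |e|

err = np.linalg.norm(w - w_true)
```

Plain LMS would scale each update by the raw error `e`, so a single impulse of magnitude 100 perturbs the weights a thousand times more strongly than here; the adaptive semi-parametric approach in the paper learns a better nonlinearity than the fixed sign function.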

43 citations


Proceedings ArticleDOI
20 Mar 2016
TL;DR: This work analyzes the theoretical performance of the single-task diffusion LMS when it is run, intentionally or unintentionally, in a multitask environment in the presence of noisy links and introduces an improved strategy that allows the agents to promote or reduce exchanges of information with their neighbors.
Abstract: Diffusion LMS is an efficient strategy for solving distributed optimization problems with cooperating agents. In some applications, the optimum parameter vectors may not be the same for all agents. Moreover, agents usually exchange information through noisy communication links. In this work, we analyze the theoretical performance of the single-task diffusion LMS when it is run, intentionally or unintentionally, in a multitask environment in the presence of noisy links. To reduce the impact of these nuisance factors, we introduce an improved strategy that allows the agents to promote or reduce exchanges of information with their neighbors.

40 citations


Journal ArticleDOI
TL;DR: In this article, a scaling law for the steady-state probabilities of miss detection and false alarm in the slow adaptation regime was established for distributed detection schemes over fully decentralized networks, where the agents interact with each other according to distributed strategies that employ small constant step-sizes.
Abstract: This paper examines the close interplay between cooperation and adaptation for distributed detection schemes over fully decentralized networks. The combined attributes of cooperation and adaptation are necessary to enable networks of detectors to continually learn from streaming data and to continually track drifts in the state of nature when deciding in favor of one hypothesis or another. The results in this paper establish a fundamental scaling law for the steady-state probabilities of miss-detection and false alarm in the slow adaptation regime, when the agents interact with each other according to distributed strategies that employ small constant step-sizes. The latter are critical to enable continuous adaptation and learning. This paper establishes three key results. First, it is shown that the output of the collaborative process at each agent has a steady-state distribution. Second, it is shown that this distribution is asymptotically Gaussian in the slow adaptation regime of small step-sizes. Third, by carrying out a detailed large deviations analysis, closed-form expressions are derived for the decaying rates of the false-alarm and miss-detection probabilities. Interesting insights are gained from these expressions. In particular, it is verified that as the step-size $\mu$ decreases, the error probabilities are driven to zero exponentially fast as functions of $1/\mu$, and that the exponents governing the decay increase linearly in the number of agents. It is also verified that the scaling laws governing the errors of detection and the errors of estimation over the network behave very differently, with the former having exponential decay proportional to $1/\mu$, while the latter scales linearly with decay proportional to $\mu$. Moreover, and interestingly, it is shown that the cooperative strategy allows each agent to reach the same detection performance, in terms of detection error exponents, of a centralized stochastic-gradient solution. The results of this paper are illustrated by applying them to canonical distributed detection problems.

36 citations


Journal ArticleDOI
26 Sep 2016
TL;DR: In this paper, the authors consider distributed detection problems over adaptive networks, where dispersed agents learn continually from streaming data by means of local interactions, and employ diffusion algorithms with constant step-size, analyzed through the framework of exact asymptotics.
Abstract: We consider distributed detection problems over adaptive networks, where dispersed agents learn continually from streaming data by means of local interactions. The requirement of adaptation allows the network of detectors to track drifts in the underlying hypothesis. The requirement of cooperation allows each agent to deliver a performance superior to what would be obtained if it were acting individually. The simultaneous requirements of adaptation and cooperation are achieved by employing diffusion algorithms with constant step-size $\mu$. By conducting a refined asymptotic analysis based on the mathematical framework of exact asymptotics, we arrive at a revealing understanding of the universal behavior of distributed detection over adaptive networks: as functions of $1/\mu$, the error (log-)probability curves corresponding to different agents stay nearly-parallel to each other (as already discovered in [1] and [2]); however, these curves are ordered following a criterion reflecting the degree of connectivity of each agent. Depending on the combination weights, the more connected an agent is, the lower its error probability curve will be. The analysis provides explicit analytical formulas for the detection error probabilities and these expressions are also verified by means of extensive simulations. We further enlarge the reference setting from the case of doubly-stochastic combination matrices considered in [1] and [2] to the more general and demanding setting of right-stochastic combination matrices; this extension poses new and interesting questions in terms of the interplay between the network topology, the combination weights, and the inference performance. The potential of the proposed methods is illustrated by application of the results to canonical detection problems, to typical network topologies, for both doubly-stochastic and right-stochastic combination matrices. Interesting and somehow unexpected behaviors emerge, and the lesson learned is that connectivity matters.

35 citations


Journal ArticleDOI
TL;DR: In this paper, the authors studied the learning ability of consensus and diffusion learners from continuous streams of data arising from different but related statistical distributions and derived closed-form expressions for the evolution of their excess-risk under a diminishing step-size rule.
Abstract: This paper studies the learning ability of consensus and diffusion distributed learners from continuous streams of data arising from different but related statistical distributions. Four distinctive features for diffusion learners are revealed in relation to other decentralized schemes even under left-stochastic combination policies. First, closed-form expressions for the evolution of their excess-risk are derived for strongly convex risk functions under a diminishing step-size rule. Second, using these results, it is shown that the diffusion strategy improves the asymptotic convergence rate of the excess-risk relative to non-cooperative schemes. Third, it is shown that when the in-network cooperation rules are designed optimally, the performance of the diffusion implementation can outperform that of naive centralized processing. Finally, the arguments further show that diffusion outperforms consensus strategies asymptotically and that the asymptotic excess-risk expression is invariant to the particular network topology. The framework adopted in this paper studies convergence in the stronger mean-square-error sense, rather than in distribution, and develops tools that enable a close examination of the differences between distributed strategies in terms of asymptotic behavior, as well as in terms of convergence rates.

31 citations


Journal ArticleDOI
TL;DR: In this paper, the authors examine the learning mechanism of adaptive agents over weakly connected graphs and reveal an interesting behavior on how information flows through such topologies, and explain why strong-connectivity of the network topology, adaptation of the combination weights, and clustering of agents are important ingredients to equalize the learning abilities of all agents against such disturbances.
Abstract: This paper examines the learning mechanism of adaptive agents over weakly connected graphs and reveals an interesting behavior on how information flows through such topologies. The results clarify how asymmetries in the exchange of data can mask local information at certain agents and make them totally dependent on other agents. A leader-follower relationship develops with the performance of some agents being fully determined by the performance of other agents that are outside their domain of influence. This scenario can arise, for example, due to intruder attacks by malicious agents or as the result of failures by some critical links. The findings in this paper help explain why strong-connectivity of the network topology, adaptation of the combination weights, and clustering of agents are important ingredients to equalize the learning abilities of all agents against such disturbances. The results also clarify how weak-connectivity can be helpful in reducing the effect of outlier data on learning performance.

29 citations


Proceedings ArticleDOI
01 Sep 2016
TL;DR: This work draws from recent results in the field of online adaptation to derive new tight performance expressions for empirical implementations of stochastic gradient descent, mini-batchgradient descent, and importance sampling, and proposes an optimal importance sampling algorithm to optimize performance.
Abstract: The minimization of empirical risks over finite sample sizes is an important problem in large-scale machine learning. A variety of algorithms has been proposed in the literature to alleviate the computational burden per iteration at the expense of convergence speed and accuracy. Many of these approaches can be interpreted as stochastic gradient descent algorithms, where data is sampled from particular empirical distributions. In this work, we leverage this interpretation and draw from recent results in the field of online adaptation to derive new tight performance expressions for empirical implementations of stochastic gradient descent, mini-batch gradient descent, and importance sampling. The expressions are exact to first order in the step-size parameter and are tighter than existing bounds. We further quantify the performance gained from employing mini-batch solutions, and propose an optimal importance sampling algorithm to optimize performance.
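The unbiasedness requirement behind importance sampling is that a loss drawn with probability $p_i$ must have its gradient re-weighted by $1/(N p_i)$. A minimal sketch on an illustrative least-squares risk, with sampling proportional to row energy (an assumed choice, not necessarily the paper's optimal distribution):

```python
import numpy as np

# SGD with importance sampling over a finite sample of N least-squares
# losses; the 1/(N * p[i]) factor keeps the gradient estimate unbiased
# for the empirical risk (1/N) * sum_i 0.5 * (A[i] @ x - b[i])**2.
rng = np.random.default_rng(2)
N, M = 200, 3
A = rng.standard_normal((N, M))
x_true = rng.standard_normal(M)
b = A @ x_true                        # consistent (noise-free) targets

row_energy = np.linalg.norm(A, axis=1) ** 2
p = row_energy / row_energy.sum()     # sample proportional to row energy

x, mu = np.zeros(M), 0.01
for _ in range(5000):
    i = rng.choice(N, p=p)
    g = (A[i] @ x - b[i]) * A[i]      # gradient of the i-th loss
    x = x - mu * g / (N * p[i])       # importance-sampling re-weighting

err = np.linalg.norm(x - x_true)
```

Because the targets are consistent here, every individual loss is minimized at `x_true` and the constant-step iteration converges to it; with noisy targets the iterate would instead hover within a step-size-dependent neighborhood, which is what the tight performance expressions in the paper quantify.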

Proceedings ArticleDOI
20 Mar 2016
TL;DR: The results establish that momentum methods are equivalent to the standard stochastic gradient method with a re-scaled (larger) step-size value, and suggests a method to enhance performance in the Stochastic setting by tuning the momentum parameter over time.
Abstract: This paper examines the convergence rate and mean-square-error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime. The results establish that momentum methods are equivalent to the standard stochastic gradient method with a re-scaled (larger) step-size value. The equivalence result is established for all time instants and not only in steady-state. The analysis is carried out for general risk functions, and is not limited to quadratic risks. One notable conclusion is that the well-known benefits of momentum constructions for deterministic optimization problems do not necessarily carry over to the stochastic setting when gradient noise is present and continuous adaptation is necessary. The analysis suggests a method to enhance performance in the stochastic setting by tuning the momentum parameter over time.
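The claimed equivalence can be checked numerically on a scalar quadratic risk: heavy-ball SGD with parameters $(\mu, \beta)$ should deliver essentially the same steady-state mean-square deviation as plain SGD run with the re-scaled step $\mu/(1-\beta)$. The setup below is an illustrative sketch, not the paper's experiment:

```python
import numpy as np

# Compare the steady-state MSD of momentum (heavy-ball) SGD against plain
# SGD with the re-scaled step-size mu/(1-beta), on the noisy quadratic
# risk 0.5*(w - 1)^2 with unit-variance gradient noise.
rng = np.random.default_rng(3)
mu, beta, sigma = 0.002, 0.5, 1.0
T, burn = 200000, 20000

def run(momentum):
    w, v, acc, cnt = 0.0, 0.0, 0.0, 0
    step = mu if momentum else mu / (1 - beta)
    for t in range(T):
        g = (w - 1.0) + sigma * rng.standard_normal()  # noisy gradient
        if momentum:
            v = beta * v + g
            w -= step * v                              # heavy-ball update
        else:
            w -= step * g                              # plain SGD update
        if t >= burn:
            acc += (w - 1.0) ** 2
            cnt += 1
    return acc / cnt

msd_momentum = run(True)
msd_plain = run(False)
ratio = msd_momentum / msd_plain   # close to 1 for small step-sizes
```

To first order in the step-size, both steady-state MSDs equal $\mu\sigma^2/(2(1-\beta))$, so the measured ratio stays near one — the momentum parameter bought a larger effective step, not a better error floor.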

Proceedings ArticleDOI
20 Mar 2016
TL;DR: It is shown how the regularizers can be smoothed and how the Pareto solution can be sought by appealing to a multi-agent diffusion strategy under conditions that are weaker than assumed earlier in the literature.
Abstract: We develop an effective distributed strategy for seeking the Pareto solution of an aggregate cost consisting of regularized risks. The focus is on stochastic optimization problems where each risk function is expressed as the expectation of some loss function and the probability distribution of the data is unknown. We assume each risk function is regularized and allow the regularizer to be non-smooth. Under conditions that are weaker than assumed earlier in the literature and, hence, applicable to a broader class of adaptation and learning problems, we show how the regularizers can be smoothed and how the Pareto solution can be sought by appealing to a multi-agent diffusion strategy. The formulation is general enough and includes, for example, a multi-agent proximal strategy as a special case.

Journal ArticleDOI
TL;DR: In this article, the authors study the performance of diffusion least-mean squares algorithms for distributed parameter estimation in multi-agent networks when nodes exchange information over wireless communication links and show that by properly monitoring the CSI over the network and choosing sufficiently small adaptation step-sizes, diffusion strategies are able to deliver satisfactory performance in the presence of fading and path loss.
Abstract: We study the performance of diffusion least-mean squares algorithms for distributed parameter estimation in multi-agent networks when nodes exchange information over wireless communication links. Wireless channel impairments, such as fading and path-loss, adversely affect the exchanged data and cause instability and performance degradation if left unattended. To mitigate these effects, we incorporate equalization coefficients into the diffusion combination step and update the combination weights dynamically in the face of randomly changing neighborhoods due to fading conditions. When channel state information (CSI) is unavailable, we determine the equalization factors from pilot-aided channel coefficient estimates. The analysis reveals that by properly monitoring the CSI over the network and choosing sufficiently small adaptation step-sizes, the diffusion strategies are able to deliver satisfactory performance in the presence of fading and path loss.
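The equalization idea amounts to dividing each received neighbor estimate by a pilot-based estimate of the fading coefficient before it enters the combination step. A toy single-link sketch, where the fading model and noise levels are illustrative assumptions:

```python
import numpy as np

# One noisy wireless link: the neighbor's estimate arrives scaled by a
# fading coefficient h plus noise. A pilot symbol gives an estimate of h,
# and dividing by it (equalization) recovers the estimate before combining.
rng = np.random.default_rng(5)
w_neighbor = np.array([1.0, -2.0, 0.5])      # neighbor's local estimate

h = 0.7 + 0.1 * rng.standard_normal()        # true fading coefficient
pilot = 1.0
received_pilot = h * pilot + 0.01 * rng.standard_normal()
h_hat = received_pilot / pilot               # pilot-aided channel estimate

received = h * w_neighbor + 0.01 * rng.standard_normal(3)
equalized = received / h_hat                 # equalize before the combine step

err = np.linalg.norm(equalized - w_neighbor)
```

Without the division by `h_hat`, the combination step would average estimates attenuated by unknown factors, which is exactly the instability mechanism the paper mitigates.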

Journal Article
TL;DR: In this article, the convergence rate and mean square error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime was examined in the adaptive online setting.
Abstract: The article examines in some detail the convergence rate and mean-square-error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime. The results establish that momentum methods are equivalent to the standard stochastic gradient method with a re-scaled (larger) step-size value. The size of the re-scaling is determined by the value of the momentum parameter. The equivalence result is established for all time instants and not only in steady-state. The analysis is carried out for general strongly convex and smooth risk functions, and is not limited to quadratic risks. One notable conclusion is that the well-known benefits of momentum constructions for deterministic optimization problems do not necessarily carry over to the adaptive online setting when small constant step-sizes are used to enable continuous adaptation and learning in the presence of persistent gradient noise. From simulations, the equivalence between momentum and standard stochastic gradient methods is also observed for nondifferentiable and non-convex problems.

Proceedings ArticleDOI
20 Mar 2016
TL;DR: It is shown that the diffusion LMS algorithm for distributed inference over networks can be extended to deal with structured criteria built upon groups of variables, leading to a flexible framework that can encode various structures in the parameters to estimate.
Abstract: Considering groups of variables, rather than variables individually, can be beneficial for estimation accuracy if structural relationships between variables exist (e.g., spatial, hierarchical or related to the physics of the problem). Group-sparsity inducing estimators are typical examples that benefit from such type of prior knowledge. Building on this principle, we show that the diffusion LMS algorithm for distributed inference over networks can be extended to deal with structured criteria built upon groups of variables, leading to a flexible framework that can encode various structures in the parameters to estimate. We also propose an unsupervised online strategy to differentially promote or inhibit collaborations between nodes depending on the group of variables at hand.
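A standard group-sparsity operator consistent with this framework is block soft-thresholding, the proximal operator of the group-lasso penalty. A minimal sketch of that operator (illustrative building block, not the paper's diffusion recursion):

```python
import numpy as np

def group_soft_threshold(v, groups, lam):
    """Block soft-thresholding: proximal operator of the group-lasso
    penalty lam * sum_g ||v_g||_2. Groups whose norm falls below lam are
    zeroed as a block; surviving groups are shrunk toward zero."""
    out = np.zeros_like(v)
    for g in groups:
        norm = np.linalg.norm(v[g])
        if norm > lam:
            out[g] = (1.0 - lam / norm) * v[g]
    return out

v = np.array([3.0, 4.0, 0.1, -0.1])
groups = [[0, 1], [2, 3]]
p = group_soft_threshold(v, groups, lam=1.0)   # -> [2.4, 3.2, 0.0, 0.0]
```

Treating entries group-by-group is what lets prior structure (spatial, hierarchical, or physical) enter the estimator: the whole second block is discarded at once rather than entry by entry.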

Posted Content
TL;DR: In this article, the convergence rate and mean-square-error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime were analyzed for strongly convex and smooth risk functions, and not limited to quadratic risks.
Abstract: The article examines in some detail the convergence rate and mean-square-error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime. The results establish that momentum methods are equivalent to the standard stochastic gradient method with a re-scaled (larger) step-size value. The size of the re-scaling is determined by the value of the momentum parameter. The equivalence result is established for all time instants and not only in steady-state. The analysis is carried out for general strongly convex and smooth risk functions, and is not limited to quadratic risks. One notable conclusion is that the well-known benefits of momentum constructions for deterministic optimization problems do not necessarily carry over to the adaptive online setting when small constant step-sizes are used to enable continuous adaptation and learning in the presence of persistent gradient noise. From simulations, the equivalence between momentum and standard stochastic gradient methods is also observed for non-differentiable and non-convex problems.

Journal ArticleDOI
19 May 2016
TL;DR: In this article, the authors examine how a connected network of agents, with each one of them subjected to a stream of data with incomplete regression information, can cooperate with each other through local interactions to estimate the underlying model parameters in the presence of missing data.
Abstract: In many fields, and especially in the medical and social sciences and in recommender systems, data are gathered through clinical studies or targeted surveys. Participants are generally reluctant to respond to all questions in a survey or they may lack information to respond adequately to some questions. The data collected from these studies tend to lead to linear regression models where the regression vectors are only known partially: some of their entries are either missing completely or replaced randomly by noisy values. In this work, assuming missing positions are replaced by noisy values, we examine how a connected network of agents, with each one of them subjected to a stream of data with incomplete regression information, can cooperate with each other through local interactions to estimate the underlying model parameters in the presence of missing data. We explain how to adjust the distributed diffusion strategy through (de)regularization in order to eliminate the bias introduced by the incomplete model. We also propose a technique to recursively estimate the (de)regularization parameter and examine the performance of the resulting strategy. We illustrate the results by considering two applications: one dealing with a mental health survey and the other dealing with a household consumption survey.
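The (de)regularization idea can be illustrated in a batch setting: when regressors are observed with additive noise of known variance, the plain least-squares estimate is biased toward zero, and subtracting that variance from the sample covariance removes the bias. The sketch below is a hypothetical batch analogue of the adaptive strategy, not the paper's recursive algorithm:

```python
import numpy as np

# Errors-in-variables bias and its removal: regressors are observed as
# U + noise (variance sigma2 per entry), so the sample covariance is
# inflated by sigma2*I; subtracting it (de-regularization) de-biases.
rng = np.random.default_rng(6)
N, M, sigma2 = 50000, 3, 0.5
w_true = np.array([1.0, -0.5, 2.0])

U = rng.standard_normal((N, M))               # clean regressors
d = U @ w_true                                # noise-free measurements
U_noisy = U + np.sqrt(sigma2) * rng.standard_normal((N, M))

R = U_noisy.T @ U_noisy / N                   # inflated sample covariance
r = U_noisy.T @ d / N

w_biased = np.linalg.solve(R, r)              # attenuated toward zero
w_fixed = np.linalg.solve(R - sigma2 * np.eye(M), r)  # de-biased estimate
```

In the paper's adaptive setting the same correction is applied recursively inside the diffusion strategy, with the (de)regularization parameter itself estimated online.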

Posted Content
TL;DR: In this paper, the authors consider distributed detection problems over adaptive networks, where dispersed agents learn continually from streaming data by means of local interactions, and the simultaneous requirements of adaptation and cooperation are achieved by employing diffusion algorithms with constant step-size.
Abstract: We consider distributed detection problems over adaptive networks, where dispersed agents learn continually from streaming data by means of local interactions. The simultaneous requirements of adaptation and cooperation are achieved by employing diffusion algorithms with constant step-size $\mu$. In [1], [2] some main features of adaptive distributed detection were revealed. By resorting to large deviations analysis, it was established that the Type-I and Type-II error probabilities of all agents vanish exponentially as functions of $1/\mu$, and that all agents share the same Type-I and Type-II error exponents. However, numerical evidence presented in [1], [2] showed that the theory of large deviations does not capture the fundamental impact of network connectivity on performance, and that additional tools and efforts are required to obtain accurate predictions for the error probabilities. This work addresses these open issues and extends the results of [1], [2] in several directions. By conducting a refined asymptotic analysis based on the mathematical framework of exact asymptotics, we arrive at a revealing and powerful understanding of the universal behavior of distributed detection over adaptive networks: as functions of $1/\mu$, the error (log-)probability curves corresponding to different agents stay nearly-parallel to each other (as already discovered in [1], [2]); however, these curves are ordered following a criterion reflecting the degree of connectivity of each agent. Depending on the combination weights, the more connected an agent is, the lower its error probability curve will be. Interesting and somehow unexpected behaviors emerge, in terms of the interplay between the network topology, the combination weights, and the inference performance. The lesson learned is that connectivity matters.

Proceedings ArticleDOI
01 Nov 2016
TL;DR: An asynchronous, decentralized algorithm for consensus optimization that involves both primal and dual variables, uses fixed step-size parameters, and provably converges to the exact solution under a random agent assumption and both bounded and unbounded delay assumptions.
Abstract: We propose an asynchronous, decentralized algorithm for consensus optimization. The algorithm runs over a network of agents, where the agents perform local computation and communicate with neighbors. We design the algorithm so that the agents can compute and communicate independently at different times and for different durations. This reduces the waiting time for the slowest agent or longest communication delay and also eliminates the need for a global clock. Mathematically, the algorithm involves both primal and dual variables, uses fixed step-size parameters, and provably converges to the exact solution under a bounded delay assumption and a random agent assumption. When running synchronously, the algorithm performs just as well as existing competitive synchronous algorithms such as PG-EXTRA, which diverges without synchronization. Numerical experiments confirm the theoretical findings and illustrate the performance of the proposed algorithm.

Proceedings ArticleDOI
20 Mar 2016
TL;DR: It is shown that the asymmetric flow of information hinders the learning abilities of certain agents regardless of their local observations, and useful closed-form expressions are derived which can be used to motivate design problems to control it.
Abstract: In this paper, we study diffusion social learning over weakly-connected graphs. We show that the asymmetric flow of information hinders the learning abilities of certain agents regardless of their local observations. Under some circumstances that we clarify in this work, a scenario of total influence (or "mind-control") arises where a set of influential agents ends up shaping the beliefs of non-influential agents. We derive useful closed-form expressions that characterize this influence, and which can be used to motivate design problems to control it. We provide simulation examples to illustrate the results.

Posted Content
TL;DR: In this paper, the authors study diffusion social learning over weakly-connected graphs and show that the asymmetric flow of information hinders the learning abilities of certain agents regardless of their local observations.
Abstract: In this paper, we study diffusion social learning over weakly-connected graphs. We show that the asymmetric flow of information hinders the learning abilities of certain agents regardless of their local observations. Under some circumstances that we clarify in this work, a scenario of total influence (or "mind-control") arises where a set of influential agents ends up shaping the beliefs of non-influential agents. We derive useful closed-form expressions that characterize this influence, and which can be used to motivate design problems to control it. We provide simulation examples to illustrate the results.

Proceedings ArticleDOI
20 Mar 2016
TL;DR: The analysis establishes that sub-gradient strategies can attain exponential convergence rates, as opposed to sub-linear rates, and that they can approach the optimal solution within O(p), for sufficiently small step-sizes, p.
Abstract: This work examines the performance of stochastic sub-gradient learning strategies, for both cases of stand-alone and networked agents, under weaker conditions than usually considered in the literature. It is shown that these conditions are automatically satisfied by several important cases of interest, including support-vector machines and sparsity-inducing learning solutions. The analysis establishes that sub-gradient strategies can attain exponential convergence rates, as opposed to sub-linear rates, and that they can approach the optimal solution within O(p), for sufficiently small step-sizes, p. A realizable exponential-weighting procedure is proposed to smooth the intermediate iterates and to guarantee these desirable performance properties.
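The exponential-weighting procedure amounts to maintaining a geometrically weighted running average of the iterates alongside the raw sub-gradient recursion. The sketch below applies this to an illustrative regularized hinge-loss (SVM) problem; all parameter values and the data model are assumptions, not the paper's setup:

```python
import numpy as np

# Stochastic sub-gradient descent on a regularized hinge loss, with an
# exponentially weighted running average w_avg smoothing the iterates.
rng = np.random.default_rng(4)
M, mu, rho = 2, 0.01, 0.1          # dimension, step-size, l2-regularization
w = np.zeros(M)
w_avg = np.zeros(M)
beta = 1 - mu                       # exponential-weighting factor

for t in range(20000):
    x = rng.standard_normal(M)
    y = 1.0 if x[0] + x[1] > 0 else -1.0        # linearly separable labels
    margin = y * (w @ x)
    sub = rho * w - (y * x if margin < 1 else 0)  # sub-gradient of the loss
    w = w - mu * sub
    w_avg = beta * w_avg + (1 - beta) * w        # smoothed iterate

# the averaged iterate should align with the true separating direction [1, 1]
cos = (w_avg @ np.ones(M)) / (np.linalg.norm(w_avg) * np.sqrt(M))
```

The raw iterate `w` keeps jittering because the hinge sub-gradient is discontinuous; the averaged iterate `w_avg` is what settles near the optimal direction, which is the role the realizable exponential-weighting procedure plays in the analysis.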

Proceedings ArticleDOI
01 Aug 2016
TL;DR: The analysis provides insight into the interplay between the network topology, the combination weights, and the inference performance, revealing the universal behavior of diffusion-based detectors over adaptive networks.
Abstract: Exploiting recent progress [1]-[4] in the characterization of the detection performance of diffusion strategies over adaptive multi-agent networks: i) we present two theoretical approximations, one based on asymptotic normality and the other based on the theory of exact asymptotics; and ii) we develop an efficient simulation method by tailoring the importance sampling technique to diffusion adaptation. We show that these theoretical and experimental tools complement each other well, with their combination offering a substantial advance for a reliable quantitative detection-performance assessment. The analysis provides insight into the interplay between the network topology, the combination weights, and the inference performance, revealing the universal behavior of diffusion-based detectors over adaptive networks.

Proceedings ArticleDOI
01 Nov 2016
TL;DR: A projection-based diffusion LMS approach is derived and studied for distributed adaptive learning over multitask mean-square-error networks, where each agent estimates its own parameter vector subject to a set of linear equality constraints that couple the tasks of neighboring agents.
Abstract: In this work, we consider distributed adaptive learning over multitask mean-square-error (MSE) networks where each agent is interested in estimating its own parameter vector, also called a task, and where the tasks at neighboring agents are related according to a set of linear equality constraints. We assume that each agent knows its own cost function and the set of constraints involving its own vector. In order to solve the multitask problem and to optimize the individual costs subject to all constraints, a projection-based diffusion LMS approach is derived and studied. Simulation results illustrate the efficiency of the strategy.
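A minimal sketch of the idea — each agent runs an LMS adaptation step on its own task, followed by a projection that enforces the linear equality constraints linking the tasks — is shown below for two agents. The particular constraint, dimensions, and step-size are illustrative assumptions, not the algorithm from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two agents, each estimating its own 2-dim task; the tasks are linked
# by the linear equality constraint w1 - w2 = b for a known offset b.
d = 2
b = np.array([1.0, -0.5])
w2_true = rng.standard_normal(d)
w1_true = w2_true + b          # the constraint holds for the true tasks

mu = 0.02
w = np.zeros(2 * d)            # stacked estimate [w1; w2]

# Constraint in stacked form: D @ w = b, with D = [I, -I].
D = np.hstack([np.eye(d), -np.eye(d)])
# Projection onto the affine set {w : D w = b}: w -> P w + f
P = np.eye(2 * d) - D.T @ np.linalg.inv(D @ D.T) @ D
f = D.T @ np.linalg.inv(D @ D.T) @ b

for k in range(4000):
    for a, wt in enumerate([w1_true, w2_true]):
        u = rng.standard_normal(d)                 # regressor at agent a
        dk = u @ wt + 0.01 * rng.standard_normal() # streaming measurement
        sl = slice(a * d, (a + 1) * d)
        # LMS adaptation step on the agent's own block
        w[sl] = w[sl] + mu * u * (dk - u @ w[sl])
    # Projection step enforcing the linear equality constraint
    w = P @ w + f

err = np.linalg.norm(w[:d] - w1_true) + np.linalg.norm(w[d:] - w2_true)
print(err)
```

Because the true tasks satisfy the constraint, the adapt-then-project recursion converges toward both individual tasks while keeping the constraint satisfied at every iteration.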

Proceedings ArticleDOI
20 Mar 2016
TL;DR: Three stochastic gradient strategies are developed by relying on a penalty-based approach where the constrained GNEP formulation is replaced by a penalized unconstrained formulation that is able to approach the Nash equilibrium in a stable manner to within O(p) for small step-size values p.
Abstract: This work examines a stochastic formulation of the generalized Nash equilibrium problem (GNEP) where agents are subject to randomness in the environment of unknown statistical distribution. Three stochastic gradient strategies are developed by relying on a penalty-based approach where the constrained GNEP formulation is replaced by a penalized unconstrained formulation. It is shown that this penalty solution is able to approach the Nash equilibrium in a stable manner to within O(p) for small step-size values p. The operation of the algorithms is illustrated by considering the Cournot competition problem.
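To make the penalty idea concrete, here is a hedged sketch of a penalized stochastic gradient recursion for a two-firm Cournot game with a shared capacity constraint. It resembles only the basic penalized scheme (the paper develops three strategies), and the demand model, penalty weight, and step-size are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

N = 2              # number of firms
a, b = 10.0, 1.0   # inverse demand p = a - b * sum(q), with random shocks
c = 1.0            # marginal cost (identical firms)
cap = 10.0         # shared capacity constraint: sum(q) <= cap
rho = 5.0          # penalty parameter
mu = 0.005         # small step-size

q = np.ones(N)     # production quantities

for k in range(20000):
    a_k = a + 0.5 * rng.standard_normal()   # random demand realization
    total = q.sum()
    for i in range(N):
        # Stochastic gradient of firm i's profit q_i*(a_k - b*total) - c*q_i
        grad = a_k - b * total - b * q[i] - c
        # Gradient of the quadratic penalty replacing the shared constraint
        pen = 2.0 * rho * max(0.0, total - cap)
        q[i] = max(0.0, q[i] + mu * (grad - pen))

print(q)   # should hover near the Nash equilibrium
```

For these symmetric parameters the unconstrained Nash equilibrium is q_i = (a - c) / (b (N + 1)) = 3, and since the capacity constraint is inactive there, the penalized iterates fluctuate around that point with an error on the order of the step-size.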

Proceedings ArticleDOI
24 Feb 2016
TL;DR: In this paper, the authors develop an online dual coordinate-ascent (O-DCA) algorithm that is able to respond to streaming data and does not need to revisit the past data.
Abstract: The stochastic dual coordinate-ascent (S-DCA) technique is a useful alternative to the traditional stochastic gradient-descent algorithm for solving large-scale optimization problems due to its scalability to large data sets and strong theoretical guarantees. However, the available S-DCA formulation is limited to finite sample sizes and relies on performing multiple passes over the same data. This formulation is not well-suited for online implementations where data keep streaming in. In this work, we develop an online dual coordinate-ascent (O-DCA) algorithm that is able to respond to streaming data and does not need to revisit the past data. This feature infuses the resulting construction with continuous adaptation, learning, and tracking abilities, which are particularly useful for online learning scenarios.
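The flavor of a dual coordinate-ascent update on streaming data can be sketched with a standard single-pass SDCA-style recursion for l2-regularized least squares: each arriving sample receives one closed-form update of its own dual variable and is never revisited. This is a generic illustration, not the paper's O-DCA recursion, and the regularization and data are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

n, d = 2000, 5
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ w_true + 0.05 * rng.standard_normal(n)

lam = 0.01
w = np.zeros(d)
alpha = np.zeros(n)    # one dual variable per streamed sample

# Single pass over the stream: each sample is visited exactly once and
# its dual variable receives one closed-form coordinate-ascent update.
for i in range(n):
    xi, yi = X[i], y[i]
    delta = (yi - xi @ w - alpha[i]) / (1.0 + (xi @ xi) / (lam * n))
    alpha[i] += delta
    # Primal-dual relation: w = (1 / (lam * n)) * sum_i alpha[i] * x_i
    w += (delta / (lam * n)) * xi

mse = np.mean((X @ w - y) ** 2)
print(mse)
```

Even with a single visit per sample, the closed-form dual updates drive the primal iterate close to the regularized least-squares solution, which is the kind of streaming behavior the O-DCA construction is designed to deliver.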

Proceedings ArticleDOI
01 Dec 2016
TL;DR: This work proposes a BRAIN strategy for learning, which enhances the performance of traditional algorithms, such as logistic regression and SVM learners, by incorporating a graphical layer that tracks and learns in real-time the underlying correlation structure among feature subspaces.
Abstract: Complexity is a double-edged sword for learning algorithms when the number of available samples for training in relation to the dimension of the feature space is small. This is because simple models do not sufficiently capture the nuances of the data set, while complex models overfit. While remedies such as regularization and dimensionality reduction exist, they themselves can suffer from overfitting or introduce bias. To address the issue of overfitting, the incorporation of prior structural knowledge is generally of paramount importance. In this work, we propose a BRAIN strategy for learning, which enhances the performance of traditional algorithms, such as logistic regression and SVM learners, by incorporating a graphical layer that tracks and learns in real-time the underlying correlation structure among feature subspaces. In this way, the algorithm is able to identify salient subspaces and their correlations, while simultaneously dampening the effect of irrelevant features. This effect is particularly useful for high-dimensional feature spaces.
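The "graphical layer that tracks the correlation structure among feature subspaces in real time" can be illustrated, in heavily simplified form, by an exponentially weighted running estimate of the feature correlation matrix. This is not the BRAIN algorithm itself; the data model, forgetting factor, and subspace split are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

d = 6
beta = 0.99              # forgetting factor for real-time tracking
R = np.zeros((d, d))     # running estimate of the feature correlation
norm = 0.0

# Stream of feature vectors whose first three and last three coordinates
# form two internally correlated subspaces.
for k in range(5000):
    z = rng.standard_normal(2)
    x = np.concatenate([z[0] + 0.1 * rng.standard_normal(3),
                        z[1] + 0.1 * rng.standard_normal(3)])
    # Exponentially weighted update of the correlation estimate
    norm = beta * norm + 1.0
    R = R + (1.0 / norm) * (np.outer(x, x) - R)

# Within-subspace correlations dominate cross-subspace ones,
# revealing the two salient subspaces from the stream alone.
within = np.abs(R[0, 1])
across = np.abs(R[0, 4])
print(within, across)
```

A learner with access to such a tracked correlation structure can emphasize the salient subspaces and dampen irrelevant features, which is the effect the abstract describes.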
