
Showing papers by "Ali H. Sayed" published in 2015


Journal ArticleDOI
TL;DR: This paper conducts a theoretical analysis of the stochastic behavior of diffusion LMS in the case where the single-task hypothesis is violated, and proposes an unsupervised clustering strategy that allows each node to select, via adaptive adjustments of combination weights, the neighboring nodes with which it can collaborate to estimate a common parameter vector.
Abstract: The diffusion LMS algorithm has been extensively studied in recent years. This efficient strategy makes it possible to address distributed optimization problems over networks in which nodes collaboratively estimate a single parameter vector. Nevertheless, several problems in practice are multitask-oriented in the sense that the optimum parameter vector may not be the same for every node. This raises the issue of studying the performance of the diffusion LMS algorithm when it is run, either intentionally or unintentionally, in a multitask environment. In this paper, we conduct a theoretical analysis of the stochastic behavior of diffusion LMS in the case where the single-task hypothesis is violated. We analyze the competing factors that influence the performance of diffusion LMS in the multitask environment, and that allow the algorithm to continue to deliver performance superior to non-cooperative strategies in some useful circumstances. We also propose an unsupervised clustering strategy that allows each node to select, via adaptive adjustments of combination weights, the neighboring nodes with which it can collaborate to estimate a common parameter vector. Simulations are presented to illustrate the theoretical results and to demonstrate the efficiency of the proposed clustering strategy.
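
For readers unfamiliar with the algorithm under analysis, the adapt-then-combine (ATC) form of diffusion LMS can be sketched in a few lines. The following NumPy sketch is illustrative only: the ring topology, step-size, noise level, and averaging weights are assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, mu, n_iter = 10, 5, 0.01, 2000   # agents, filter length, step-size, iterations (assumed)
w_true = rng.standard_normal(M)        # common parameter vector (single-task scenario)

# Ring topology: each agent averages over itself and its two neighbors.
A = np.zeros((N, N))                   # A[l, k] = weight agent k assigns to neighbor l
for k in range(N):
    A[k, k] = A[(k - 1) % N, k] = A[(k + 1) % N, k] = 1.0 / 3   # columns sum to 1

w = np.zeros((N, M))                   # current estimates of the N agents
for i in range(n_iter):
    psi = np.empty_like(w)
    for k in range(N):                 # adaptation step: local LMS update
        u = rng.standard_normal(M)                      # regressor
        d = u @ w_true + 0.1 * rng.standard_normal()    # noisy measurement
        psi[k] = w[k] + mu * u * (d - u @ w[k])
    w = A.T @ psi                      # combination step: w[k] = sum_l A[l, k] psi[l]

print("network MSD:", np.mean(np.sum((w - w_true) ** 2, axis=1)))
```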

214 citations


Journal ArticleDOI
TL;DR: A detailed transient analysis of the learning behavior of multiagent networks reveals how combination policies influence the learning process of networked agents, and how these policies can steer the convergence point toward any of many possible Pareto optimal solutions.
Abstract: This paper carries out a detailed transient analysis of the learning behavior of multiagent networks, and reveals interesting results about the learning abilities of distributed strategies. Among other results, the analysis reveals how combination policies influence the learning process of networked agents, and how these policies can steer the convergence point toward any of many possible Pareto optimal solutions. The results also establish that the learning process of an adaptive network undergoes three (rather than two) well-defined stages of evolution with distinctive convergence rates during the first two stages, while attaining a finite mean-square-error level in the last stage. The analysis reveals what aspects of the network topology influence performance directly and suggests design procedures that can optimize performance by adjusting the relevant topology parameters. Interestingly, it is further shown that, in the adaptation regime, each agent in a sparsely connected network is able to achieve the same performance level as that of a centralized stochastic-gradient strategy even for left-stochastic combination strategies. These results lead to a deeper understanding and useful insights on the convergence behavior of coupled distributed learners. The results also lead to effective design mechanisms to help diffuse information more thoroughly over networks.
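
One way to make the "steering" claim concrete, stated here as a hedged sketch of the known result for this family of algorithms with uniform step-sizes: if $A$ is the left-stochastic combination matrix and $p$ its Perron eigenvector, the network limit point minimizes a Perron-weighted aggregate of the individual costs $J_k$.

```latex
% Hedged sketch of the steering effect. Let A be the (primitive)
% left-stochastic combination matrix with Perron eigenvector p:
%   A p = p, \quad \mathbf{1}^\top p = 1, \quad p_k > 0.
% Then, in the small step-size regime with uniform step-sizes, the
% network converges toward
w^\star = \arg\min_{w} \; \sum_{k=1}^{N} p_k \, J_k(w),
% so different combination policies A select different Perron weights p,
% and hence different Pareto optimal solutions of the costs {J_k}.
```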

159 citations


Journal ArticleDOI
TL;DR: An adaptive clustering and learning scheme is proposed that allows agents to learn which neighbors they should cooperate with and which other neighbors they should ignore; in doing so, the scheme enables the agents to identify their clusters and to attain improved learning and estimation accuracy over networks.
Abstract: Distributed processing over networks relies on in-network processing and cooperation among neighboring agents. Cooperation is beneficial when agents share a common objective. However, in many applications, agents may belong to different clusters that pursue different objectives. Then, indiscriminate cooperation will lead to undesired results. In this paper, we propose an adaptive clustering and learning scheme that allows agents to learn which neighbors they should cooperate with and which other neighbors they should ignore. In doing so, the resulting algorithm enables the agents to identify their clusters and to attain improved learning and estimation accuracy over networks. We carry out a detailed mean-square analysis and assess the error probabilities of Types I and II, i.e., false alarm and misdetection, for the clustering mechanism. Among other results, we establish that these probabilities decay exponentially with the step-sizes so that the probability of correct clustering can be made arbitrarily close to one.
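
A common way to realize such adaptive clustering, shown below as a hedged sketch rather than the paper's exact rule, is to weight each neighbor inversely to a running estimate of the squared distance between iterates, so that neighbors pursuing a different model are gradually assigned negligible combination weight. All names and parameters are illustrative assumptions.

```python
import numpy as np

def adapt_weights(w, neighbors, gamma2, nu=0.1, a_self=0.5, eps=1e-12):
    """Hypothetical combination-weight adaptation for clustering.

    w         : (N, M) current estimates of all agents
    neighbors : dict {k: list of neighbor indices, excluding k itself}
    gamma2    : (N, N) running estimates of squared distances ||w_k - w_l||^2
    nu        : forgetting factor for the distance estimates (assumed)
    a_self    : fixed weight each agent keeps on its own iterate (assumed)
    """
    N = w.shape[0]
    A = np.zeros((N, N))
    for k in range(N):
        for l in neighbors[k]:
            # track the squared distance between agent k and neighbor l
            gamma2[l, k] = (1 - nu) * gamma2[l, k] + nu * np.sum((w[k] - w[l]) ** 2)
            A[l, k] = 1.0 / (gamma2[l, k] + eps)   # closer models -> larger weight
        total = A[:, k].sum()
        if total > 0:
            A[:, k] *= (1 - a_self) / total        # spread 1 - a_self over neighbors
        A[k, k] = a_self if total > 0 else 1.0     # keep a fixed self-weight
    return A                                        # left-stochastic by construction
```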

109 citations


Journal ArticleDOI
TL;DR: This paper considers learning dictionary models over a network of agents, where each agent is only in charge of a portion of the dictionary elements; the collaborative inference step generates dual variables that the agents use to update their dictionaries without the need to share these dictionaries or even the coefficient models for the training data.
Abstract: In this paper, we consider learning dictionary models over a network of agents, where each agent is only in charge of a portion of the dictionary elements. This formulation is relevant in Big Data scenarios where large dictionary models may be spread over different spatial locations and it is not feasible to aggregate all dictionaries in one location due to communication and privacy considerations. We first show that the dual function of the inference problem is an aggregation of individual cost functions associated with different agents, which can then be minimized efficiently by means of diffusion strategies. The collaborative inference step generates dual variables that are used by the agents to update their dictionaries without the need to share these dictionaries or even the coefficient models for the training data. This is a powerful property that leads to an effective distributed procedure for learning dictionaries over large networks (e.g., hundreds of agents in our experiments). Furthermore, the proposed learning strategy operates in an online manner and is able to respond to streaming data, where each data sample is presented to the network once.
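
The separability property that the paper exploits can be stated generically as follows (a hedged sketch in assumed notation: $W_k$ is the sub-dictionary held by agent $k$ and $\nu$ the dual variable of the inference problem):

```latex
% Generic sketch of the exploited structure: with the dictionary
% W = [W_1, ..., W_N] partitioned column-wise across the N agents, the
% dual of the sparse inference problem splits into local terms,
g(\nu) \;=\; \sum_{k=1}^{N} g_k(\nu; W_k),
% so maximizing g(\nu) can be carried out by a diffusion strategy in
% which each agent evaluates only its own g_k from its sub-dictionary
% W_k, and the dictionaries themselves never need to be exchanged.
```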

108 citations


Journal ArticleDOI
TL;DR: In this paper, the authors examined the mean-square stability and convergence of the learning process of distributed strategies over graphs and identified conditions on the network topology, utilities, and data in order to ensure stability; the results also identified three distinct stages in the learning behavior of multiagent networks related to transient phases I and II and the steady-state phase.
Abstract: Part I of this paper examined the mean-square stability and convergence of the learning process of distributed strategies over graphs. The results identified conditions on the network topology, utilities, and data in order to ensure stability; the results also identified three distinct stages in the learning behavior of multiagent networks related to transient phases I and II and the steady-state phase. This Part II examines the steady-state phase of distributed learning by networked agents. Apart from characterizing the performance of the individual agents, it is shown that the network induces a useful equalization effect across all agents. In this way, the performance of noisier agents is enhanced to the same level as the performance of agents with less noisy data. It is further shown that in the small step-size regime, each agent in the network is able to achieve the same performance level as that of a centralized strategy corresponding to a fully connected network. The results in this part reveal explicitly which aspects of the network topology and operation influence performance and provide important insights into the design of effective mechanisms for the processing and diffusion of information over networks.
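
The equalization effect admits a compact illustration in the classical MSE (LMS) setting. The expression below is a hedged sketch of the small step-size result for uniform step-sizes and doubly-stochastic combination weights; the notation is assumed rather than taken from the paper.

```latex
% Hedged sketch: diffusion LMS, uniform step-size \mu, filter length M,
% doubly-stochastic combination matrix, small step-size regime:
\mathrm{MSD}_k \;\approx\; \frac{\mu M \bar{\sigma}_v^2}{2N},
\qquad
\bar{\sigma}_v^2 \;\triangleq\; \frac{1}{N} \sum_{\ell=1}^{N} \sigma_{v,\ell}^2
% Every agent k attains (approximately) the same steady-state MSD,
% governed by the network-averaged noise power rather than by its own
% noise level -- the equalization effect described above.
```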

106 citations


Journal ArticleDOI
TL;DR: In this paper, the authors apply diffusion strategies to develop a fully-distributed cooperative reinforcement learning algorithm in which agents in a network communicate only with their immediate neighbors to improve predictions about their environment.
Abstract: We apply diffusion strategies to develop a fully-distributed cooperative reinforcement learning algorithm in which agents in a network communicate only with their immediate neighbors to improve predictions about their environment. The algorithm can also be applied to off-policy learning, meaning that the agents can predict the response to a behavior different from the actual policies they are following. The proposed distributed strategy is efficient, with linear complexity in both computation time and memory footprint. We provide a mean-square-error performance analysis and establish convergence under constant step-size updates, which endow the network with continuous learning capabilities. The results show a clear gain from cooperation: when the individual agents can estimate the solution, cooperation increases stability and reduces bias and variance of the prediction error; but, more importantly, the network is able to approach the optimal solution even when none of the individual agents can (e.g., when the individual behavior policies restrict each agent to sample a small portion of the state space).
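
To make the "diffusion + reinforcement learning" combination concrete, here is a minimal sketch of one round of a diffusion-style policy-evaluation update. It is hedged: it uses a plain TD(0) adaptation step with linear function approximation, whereas the paper develops a gradient-TD variant with support for off-policy operation; all names are assumptions.

```python
import numpy as np

def diffusion_td0(agents, A, theta, mu=0.05, gamma=0.9):
    """One round of a hypothetical diffusion TD(0) strategy.

    agents : list of callables; agents[k]() returns one local transition
             (phi, reward, phi_next) with feature vectors phi, phi_next
    A      : (N, N) left-stochastic combination matrix
    theta  : (N, M) current value-function parameters of the N agents
    """
    psi = np.empty_like(theta)
    for k in range(theta.shape[0]):          # adaptation: local TD(0) step
        phi, r, phi_next = agents[k]()
        delta = r + gamma * phi_next @ theta[k] - phi @ theta[k]   # TD error
        psi[k] = theta[k] + mu * delta * phi
    return A.T @ psi                         # combination: theta[k] = sum_l A[l, k] psi[l]
```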

100 citations


Journal ArticleDOI
TL;DR: In this paper, the stability and performance of asynchronous strategies for distributed optimization and adaptation problems over networks were analyzed, and the results provided a solid justification for the remarkable resilience of cooperative networks in the face of random failures at multiple levels: agents, links, data arrivals and topology.
Abstract: In this work and the supporting Parts II and III of this paper, also in the current issue, we provide a rather detailed analysis of the stability and performance of asynchronous strategies for solving distributed optimization and adaptation problems over networks. We examine asynchronous networks that are subject to fairly general sources of uncertainties, such as changing topologies, random link failures, random data arrival times, and agents turning on and off randomly. Under this model, agents in the network may stop updating their solutions or may stop sending or receiving information in a random manner and without coordination with other agents. We establish in Part I conditions on the first and second-order moments of the relevant parameter distributions to ensure mean-square stable behavior. We derive in Part II expressions that reveal how the various parameters of the asynchronous behavior influence network performance. We compare in Part III the performance of asynchronous networks to the performance of both centralized solutions and synchronous networks. One notable conclusion is that the mean-square-error performance of asynchronous networks shows a degradation only on the order of O(ν), where ν is a small step-size parameter, while the convergence rate remains largely unaltered. The results provide a solid justification for the remarkable resilience of cooperative networks in the face of random failures at multiple levels: agents, links, data arrivals, and topology.

84 citations


Journal ArticleDOI
TL;DR: It is found that distributed primal-dual strategies for adaptation and learning over networks from streaming data have narrower stability ranges and worse steady-state mean-square-error performance than primal methods of the consensus and diffusion type.
Abstract: This paper studies distributed primal-dual strategies for adaptation and learning over networks from streaming data. Two first-order methods are considered based on the Arrow-Hurwicz (AH) and augmented Lagrangian (AL) techniques. Several revealing results are discovered in relation to the performance and stability of these strategies when employed over adaptive networks. The conclusions establish that the advantages that these methods exhibit for deterministic optimization problems do not necessarily carry over to stochastic optimization problems. It is found that they have narrower stability ranges and worse steady-state mean-square-error performance than primal methods of the consensus and diffusion type. It is also found that the AH technique can become unstable under a partial observation model, while the other techniques are able to recover the unknown under this scenario. A method to enhance the performance of AL strategies is proposed by tying the selection of the step-size to their regularization parameter. It is shown that this method allows the AL algorithm to approach the performance of consensus and diffusion strategies but that it remains less stable than these other strategies.
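
For reference, the saddle-point recursions being compared take the following generic first-order form (a hedged sketch for an agreement-constrained aggregate cost; the notation is assumed, not copied from the paper):

```latex
% Generic Arrow-Hurwicz (AH) recursion for  min_w J(w)  s.t.  Bw = 0,
% driven by a stochastic gradient approximation \widehat{\nabla J}:
w_{i+1} \;=\; w_i - \mu \left( \widehat{\nabla J}(w_i) + B^{\top} \lambda_i \right)
\\
\lambda_{i+1} \;=\; \lambda_i + \mu \, B w_{i+1}
% The augmented Lagrangian (AL) variant additionally penalizes the
% constraint, replacing \widehat{\nabla J}(w_i) above by
% \widehat{\nabla J}(w_i) + \rho B^{\top} B w_i  for some \rho > 0.
```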

58 citations


Journal ArticleDOI
TL;DR: In this paper, the mean-square-error performance of asynchronous strategies for solving distributed optimization and adaptation problems over networks was analyzed, and analytical expressions for the mean-square convergence rate and the steady-state mean-square deviation were derived.
Abstract: In Part I of this paper, also in this issue, we introduced a fairly general model for asynchronous events over adaptive networks including random topologies, random link failures, random data arrival times, and agents turning on and off randomly. We performed a stability analysis and established the notable fact that the network is still able to converge in the mean-square-error sense to the desired solution. Once stable behavior is guaranteed, it becomes important to evaluate how fast the iterates converge and how close they get to the optimal solution. This is a demanding task due to the various asynchronous events and due to the fact that agents influence each other. In this Part II, we carry out a detailed analysis of the mean-square-error performance of asynchronous strategies for solving distributed optimization and adaptation problems over networks. We derive analytical expressions for the mean-square convergence rate and the steady-state mean-square-deviation. The expressions reveal how the various parameters of the asynchronous behavior influence network performance. In the process, we establish the interesting conclusion that even under the influence of asynchronous events, all agents in the adaptive network can still reach an $O(\nu^{1+\gamma_o'})$ near-agreement for some $\gamma_o' > 0$ while approaching the desired solution within $O(\nu)$ accuracy, where ν is proportional to the small step-size parameter for adaptation.

34 citations


Proceedings ArticleDOI
28 Dec 2015
TL;DR: This work proposes an adaptive and distributed clustering technique that allows agents to learn and form clusters from streaming data in a robust manner, and shows how the clustering process enhances the mean-square-error performance of the agents across the network.
Abstract: Cooperation among agents across the network leads to better estimation accuracy. However, in many network applications the agents infer and track different models of interest in an environment where agents do not know beforehand which models are being observed by their neighbors. In this work, we propose an adaptive and distributed clustering technique that allows agents to learn and form clusters from streaming data in a robust manner. Once clusters are formed, cooperation among agents with similar objectives then enhances the performance of the inference task. The performance of the proposed clustering algorithm is assessed by characterizing the behavior of the probabilities of erroneous decisions. We validate the performance of the algorithm by numerical simulations, which show how the clustering process enhances the mean-square-error performance of the agents across the network.

27 citations


Journal ArticleDOI
TL;DR: In this paper, the authors compare the performance of synchronous and asynchronous networks, and of decentralized adaptation against centralized stochastic-gradient (batch) solutions, under a fairly general model for asynchronous events including random topologies, random link failures, random data arrival times, and agents turning on and off randomly.
Abstract: In Part II of this paper, also in this issue, we carried out a detailed mean-square-error analysis of the performance of asynchronous adaptation and learning over networks under a fairly general model for asynchronous events including random topologies, random link failures, random data arrival times, and agents turning on and off randomly. In this Part III, we compare the performance of synchronous and asynchronous networks. We also compare the performance of decentralized adaptation against centralized stochastic-gradient (batch) solutions. Two interesting conclusions stand out. First, the results establish that the performance of adaptive networks is largely immune to the effect of asynchronous events: the mean and mean-square convergence rates and the asymptotic bias values are not degraded relative to synchronous or centralized implementations. Only the steady-state mean-square-deviation suffers a degradation on the order of ν, which represents the small step-size parameter used for adaptation. Second, the results show that the adaptive distributed network matches the performance of the centralized solution. These conclusions highlight another critical benefit of cooperation by networked agents: cooperation not only enhances performance in comparison to stand-alone single-agent processing, but also endows the network with remarkable resilience to various forms of random failure events and enables it to deliver performance as powerful as that of batch solutions.

Proceedings ArticleDOI
19 Apr 2015
TL;DR: A diffusion-type algorithm is proposed to solve multitask estimation problems where each cluster of nodes is interested in estimating its own optimum parameter vector in a distributed manner, by minimizing a global mean-square error criterion regularized by a term that promotes piecewise constant transitions in the parameter vector entries estimated by neighboring clusters.
Abstract: In this work, a diffusion-type algorithm is proposed to solve multitask estimation problems where each cluster of nodes is interested in estimating its own optimum parameter vector in a distributed manner. The approach relies on minimizing a global mean-square error criterion regularized by a term that promotes piecewise constant transitions in the parameter vector entries estimated by neighboring clusters. We provide some results on the mean and mean-square-error convergence. Simulations are conducted to illustrate the effectiveness of the strategy.
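
In hedged generic notation (assumed here, not copied from the paper), the regularized criterion promoting piecewise constant transitions across neighboring clusters can be written as:

```latex
% Hedged sketch of the global criterion: d_k(i), u_{k,i} are agent k's
% streaming data, N_k its neighborhood, C(k) its cluster, and \eta >= 0
% the regularization strength:
\min_{\{w_k\}} \;
\sum_{k=1}^{N} \mathbb{E} \left| d_k(i) - \boldsymbol{u}_{k,i} w_k \right|^2
\;+\; \eta \sum_{k=1}^{N} \sum_{\ell \in \mathcal{N}_k \setminus \mathcal{C}(k)}
\left\| w_k - w_\ell \right\|_1
% The \ell_1 coupling term favors w_k = w_\ell entrywise between clusters,
% i.e., piecewise constant transitions in the estimated parameter vectors.
```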

Proceedings ArticleDOI
01 Aug 2015
TL;DR: An unsupervised strategy that allows each node to continuously select the neighboring nodes with which it should exchange information to improve its estimation accuracy is derived.
Abstract: Diffusion LMS was originally conceived for online distributed parameter estimation in single-task environments where agents pursue a common objective. However, estimating distinct but correlated objects (multitask problems) is useful in many applications. To address multitask problems with combine-then-adapt diffusion LMS strategies, we derive an unsupervised strategy that allows each node to continuously select the neighboring nodes with which it should exchange information to improve its estimation accuracy, even though nodes do not know beforehand which other nodes share similar objectives. Simulation experiments illustrate the efficiency of this clustering strategy.

Journal ArticleDOI
19 Jun 2015
TL;DR: A reputation protocol is developed to summarize the opponent's past actions into a reputation score, which can then be used to form a belief about the opponent's subsequent actions; the protocol entices agents to cooperate and turns their optimal strategy into an action-choosing strategy that enhances the overall social benefit of the network.
Abstract: We examine the behavior of multiagent networks where information-sharing is subject to a positive communications cost over the edges linking the agents. We consider a general mean-square-error formulation, where all agents are interested in estimating the same target vector. We first show that in the absence of any incentives to cooperate, the optimal strategy for the agents is to behave in a selfish manner, with each agent seeking the optimal solution independently of the other agents. Pareto inefficiency arises as a result of the fact that agents are not using historical data to predict the behavior of their neighbors and to know whether they will reciprocate and participate in sharing information. Motivated by this observation, we develop a reputation protocol to summarize the opponent's past actions into a reputation score, which can then be used to form a belief about the opponent's subsequent actions. The reputation protocol entices agents to cooperate and turns their optimal strategy into an action-choosing strategy that enhances the overall social benefit of the network. In particular, we show that when the communications cost becomes large, the expected social benefit of the proposed protocol outperforms the social benefit that is obtained by cooperative agents that always share data. We perform a detailed mean-square-error analysis of the evolution of the network over three domains: 1) far-field; 2) near-field; and 3) middle-field, and show that the network behavior is stable for sufficiently small step-sizes. The various theoretical results are illustrated by numerical simulations.
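
A representative form for such a reputation score, given here as a hedged sketch since the paper's exact protocol may differ, is a discounted average of the neighbor's observed sharing actions:

```latex
% Hypothetical reputation recursion: agent k scores neighbor \ell from its
% observed actions a_\ell(i) \in \{0, 1\} (share / withhold) at time i:
r_{\ell k}(i) \;=\; \lambda \, r_{\ell k}(i-1) + (1 - \lambda) \, a_\ell(i),
\qquad \lambda \in (0, 1)
% The score r_{\ell k}(i) then serves as agent k's belief that neighbor
% \ell will reciprocate and share information at time i+1.
```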

Book ChapterDOI
30 Nov 2015
TL;DR: In this paper, the authors extended the results to asynchronous networks where agents are subject to various sources of uncertainties that influence their behavior, including randomly changing topologies, random link failures, random data arrival times, and agents turning on and off randomly.
Abstract: The overview article [1] surveyed advances related to adaptation, learning, and optimization over synchronous networks. Various distributed strategies were discussed that enable a collection of networked agents to interact locally in response to online streaming data and to continually learn and adapt to drifts in the data and models. Under reasonable technical conditions, the adaptive networks were shown to be mean-square stable in the slow adaptation regime, and their mean-square-error performance and convergence rate were characterized in terms of the network topology and data statistical properties [2]. Classical results for single-agent adaptation and learning were recovered as special cases. Following the works [3]–[5], this chapter complements the exposition from [1] and extends the results to asynchronous networks where agents are subject to various sources of uncertainties that influence their behavior, including randomly changing topologies, random link failures, random data arrival times, and agents turning on and off randomly. In an asynchronous environment, agents may stop updating their solutions or may stop sending or receiving information in a random manner and without coordination with other agents. The presentation will reveal that the mean-square-error performance of asynchronous networks remains largely unaltered compared to synchronous networks. The results justify the remarkable resilience of cooperative networks.

Proceedings ArticleDOI
19 Apr 2015
TL;DR: This work addresses the open issue of how to capture the exact impact of network connectivity on the detection performance of each individual agent by exploiting the framework of exact asymptotics.
Abstract: In [1], an important step toward the characterization of distributed detection over adaptive networks was made by establishing the fundamental scaling law of the error probabilities. However, empirical evidence reported in [1] revealed that a refined asymptotic analysis is necessary in order to capture the exact impact of network connectivity on the detection performance of each individual agent. Here we address this open issue by exploiting the framework of exact asymptotics.

Proceedings ArticleDOI
19 Apr 2015
TL;DR: This work represents the implementation as the cascade of three operators and invokes Banach's fixed-point theorem to establish that, despite gradient noise, the stochastic implementation is able to converge in the mean-square-error sense within O(μ) from the optimal solution, for a sufficiently small step-size parameter, μ.
Abstract: We consider networks of agents cooperating to minimize a global objective, modeled as the aggregate sum of regularized costs that are not required to be differentiable. Since the subgradients of the individual costs cannot generally be assumed to be uniformly bounded, general distributed subgradient techniques are not applicable to these problems. We isolate the requirement of bounded subgradients into the regularizer and use splitting techniques to develop a stochastic proximal diffusion strategy for solving the optimization problem by continuously learning from streaming data. We represent the implementation as the cascade of three operators and invoke Banach's fixed-point theorem to establish that, despite gradient noise, the stochastic implementation is able to converge in the mean-square-error sense within O(μ) from the optimal solution, for a sufficiently small step-size parameter, μ.
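
The three-operator cascade lends itself to a compact sketch. The following is hedged: it instantiates the non-differentiable regularizer as an ℓ1 norm so the proximal operator is the familiar soft-threshold, and all names and parameters are assumptions for illustration.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def proximal_diffusion_step(w, grads, A, mu=0.01, eta=0.1):
    """One iteration of a hypothetical stochastic proximal diffusion strategy.

    w     : (N, M) agent iterates
    grads : list of callables; grads[k](w_k) returns a stochastic gradient
            of agent k's smooth (differentiable) cost component
    A     : (N, N) left-stochastic combination matrix
    """
    psi = np.array([w[k] - mu * grads[k](w[k]) for k in range(len(w))])  # gradient step
    phi = soft_threshold(psi, mu * eta)                                   # proximal step
    return A.T @ phi                                                      # combination step
```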

Proceedings ArticleDOI
01 Jan 2015
TL;DR: An adaptive regularized diffusion strategy using Gaussian kernel regularization is proposed to enable the agents to learn about the objectives of their neighbors and to ignore misleading information, so that the nodes can meet their objectives more accurately and improve the performance of the network.
Abstract: The focus of this paper is on multitask learning over adaptive networks where different clusters of nodes have different objectives. We propose an adaptive regularized diffusion strategy using Gaussian kernel regularization to enable the agents to learn about the objectives of their neighbors and to ignore misleading information. In this way, the nodes will be able to meet their objectives more accurately and improve the performance of the network. Simulation results are provided to illustrate the performance of the proposed adaptive regularization procedure in comparison with other implementations.

Posted Content
TL;DR: The results in this article establish that stochastic sub-gradient strategies can attain linear convergence rates, as opposed to sub-linear rates, in approaching the steady-state regime.
Abstract: The analysis in Part I revealed interesting properties for subgradient learning algorithms in the context of stochastic optimization when gradient noise is present. These algorithms are used when the risk functions are non-smooth and involve non-differentiable components. They have long been recognized as slow converging methods. However, it was revealed in Part I that the rate of convergence becomes linear for stochastic optimization problems, with the error iterate converging at an exponential rate $\alpha^i$ to within an $O(\mu)$-neighborhood of the optimizer, for some $\alpha \in (0,1)$ and small step-size $\mu$. The conclusion was established under weaker assumptions than the prior literature and, moreover, several important problems (such as LASSO, SVM, and Total Variation) were shown to satisfy these weaker assumptions automatically (but not the previously used conditions from the literature). These results revealed that sub-gradient learning methods have more favorable behavior than originally thought when used to enable continuous adaptation and learning. The results of Part I were exclusive to single-agent adaptation. The purpose of the current Part II is to examine the implications of these discoveries when a collection of networked agents employs subgradient learning as their cooperative mechanism. The analysis will show that, despite the coupled dynamics that arises in a networked scenario, the agents are still able to attain linear convergence in the stochastic case; they are also able to reach agreement within $O(\mu)$ of the optimizer.

Posted Content
TL;DR: This chapter complements the exposition from [1] and extends the results to asynchronous networks, revealing that the mean-square-error performance of asynchronous networks remains largely unaltered compared to synchronous networks.
Abstract: In a recent article [1] we surveyed advances related to adaptation, learning, and optimization over synchronous networks. Various distributed strategies were discussed that enable a collection of networked agents to interact locally in response to streaming data and to continually learn and adapt to track drifts in the data and models. Under reasonable technical conditions on the data, the adaptive networks were shown to be mean-square stable in the slow adaptation regime, and their mean-square-error performance and convergence rate were characterized in terms of the network topology and data statistical moments [2]. Classical results for single-agent adaptation and learning were recovered as special cases. Following the works [3]-[5], this chapter complements the exposition from [1] and extends the results to asynchronous networks. The operation of this class of networks can be subject to various sources of uncertainties that influence their dynamic behavior, including randomly changing topologies, random link failures, random data arrival times, and agents turning on and off randomly. In an asynchronous environment, agents may stop updating their solutions or may stop sending or receiving information in a random manner and without coordination with other agents. The presentation will reveal that the mean-square-error performance of asynchronous networks remains largely unaltered compared to synchronous networks. The results justify the remarkable resilience of cooperative networks in the face of random events.

Proceedings ArticleDOI
19 Apr 2015
TL;DR: The findings in this work help explain why strong-connectivity of the network topology, adaptation of the combination weights, and clustering of agents are important ingredients to equalize the learning abilities of all agents against such disturbances.
Abstract: In this paper, we examine the learning mechanism of adaptive agents over weakly-connected graphs and reveal an interesting behavior on how information flows through such topologies. The results clarify how asymmetries in the exchange of data can mask local information at certain agents and make them totally dependent on other agents. A leader-follower relationship develops, with the performance of some agents being fully determined by other agents that can even be outside their immediate domain of influence. This scenario can arise, for example, from intruder attacks by malicious agents or from failures of some critical links. The findings in this work help explain why strong-connectivity of the network topology, adaptation of the combination weights, and clustering of agents are important ingredients to equalize the learning abilities of all agents against such disturbances. The results also clarify how weak-connectivity can be helpful in reducing the effect of outlier data on learning performance.

Journal ArticleDOI
TL;DR: The analysis reveals that by properly monitoring the CSI over the network and choosing sufficiently small adaptation step-sizes, the diffusion strategies are able to deliver satisfactory performance in the presence of fading and path loss.
Abstract: We study the performance of diffusion least-mean-square algorithms for distributed parameter estimation in multi-agent networks when nodes exchange information over wireless communication links. Wireless channel impairments, such as fading and path-loss, adversely affect the exchanged data and cause instability and performance degradation if left unattended. To mitigate these effects, we incorporate equalization coefficients into the diffusion combination step and update the combination weights dynamically in the face of randomly changing neighborhoods due to fading conditions. When channel state information (CSI) is unavailable, we determine the equalization factors from pilot-aided channel coefficient estimates. The analysis reveals that by properly monitoring the CSI over the network and choosing sufficiently small adaptation step-sizes, the diffusion strategies are able to deliver satisfactory performance in the presence of fading and path loss.
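
The equalized combination step described above can be sketched as follows. This is hedged: it assumes scalar flat-fading links and pilot-aided channel estimates, and all names and dimensions are illustrative assumptions rather than the paper's notation.

```python
import numpy as np

def equalized_combine(y, A, h_hat):
    """Hypothetical diffusion combination step over fading channels.

    y     : (N, N, M) array; y[l, k] is the (faded, noisy) intermediate
            estimate received by agent k from neighbor l
    A     : (N, N) left-stochastic combination weights
    h_hat : (N, N) pilot-aided channel estimates for each link l -> k
    """
    N, _, M = y.shape
    w = np.zeros((N, M))
    for k in range(N):
        for l in range(N):
            if A[l, k] > 0:
                # equalize the received signal before combining; leaving the
                # fading coefficient uncompensated would bias the estimate
                w[k] += A[l, k] * y[l, k] / h_hat[l, k]
    return w
```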

Posted Content
TL;DR: The analysis establishes that stochastic sub-gradient strategies can attain exponential convergence rates, as opposed to sub-linear rates, toward the steady state; a realizable exponential-weighting procedure is proposed to guarantee the established performance bounds.
Abstract: This work examines the performance of stochastic sub-gradient learning strategies under weaker conditions than usually considered in the literature. The conditions are shown to be automatically satisfied by several important cases of interest including the construction of Linear-SVM, LASSO, and Total-Variation denoising formulations. In comparison, these problems do not satisfy the traditional assumptions automatically and, therefore, conclusions derived based on these earlier assumptions are not directly applicable to these problems. The analysis establishes that stochastic sub-gradient strategies can attain exponential convergence rates, as opposed to sub-linear rates, to the steady-state. A realizable exponential-weighting procedure is proposed to smooth the intermediate iterates by the sub-gradient procedure and to guarantee the established performance bounds in terms of convergence rate and excessive risk performance. Both single-agent and multi-agent scenarios are studied, where the latter case assumes that a collection of agents are interconnected by a topology and can only interact locally with their neighbors. The theoretical conclusions are illustrated by several examples and simulations, including comparisons with the FISTA procedure.
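
As an illustration of the problem class, here is a minimal single-agent stochastic subgradient recursion for a LASSO-type risk. It is hedged: it omits the paper's exponential-weighting smoothing of the iterates, and the problem sizes and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
M, mu, delta, n_iter = 10, 0.01, 0.05, 5000   # sizes and parameters (assumed)
w_true = np.zeros(M)
w_true[:3] = [1.0, -0.5, 0.25]                # sparse target vector

w = np.zeros(M)
for i in range(n_iter):
    u = rng.standard_normal(M)                        # streaming regressor
    d = u @ w_true + 0.1 * rng.standard_normal()      # noisy measurement
    # subgradient of 0.5*(d - u@w)^2 + delta*||w||_1 at the current iterate
    g = -(d - u @ w) * u + delta * np.sign(w)
    w -= mu * g

print("recovered support:", np.nonzero(np.abs(w) > 0.1)[0])
```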

Proceedings ArticleDOI
19 Apr 2015
TL;DR: This work studies distributed primal-dual strategies for adaptation and learning over networks from streaming data and finds that first-order methods based on the Arrow-Hurwicz and augmented Lagrangian techniques have worse steady-state mean-square-error performance than primal methods of the consensus and diffusion type.
Abstract: This work studies distributed primal-dual strategies for adaptation and learning over networks from streaming data. Two first-order methods are considered based on the Arrow-Hurwicz (AH) and augmented Lagrangian (AL) techniques. Several results are revealed in relation to the performance and stability of these strategies when employed over adaptive networks. It is found that these methods have worse steady-state mean-square-error performance than primal methods of the consensus and diffusion type. It is also found that the AH technique can become unstable under a partial observation model, while the other techniques are able to recover the unknown under this scenario. It is further shown that AL techniques are stable over a narrower range of step-sizes than primal strategies.