
Showing papers by "Ali H. Sayed published in 2017"


Posted Content
TL;DR: The exact diffusion method is applicable to locally balanced left-stochastic combination matrices which, compared to the conventional doubly stochastic matrix, are more general and able to endow the algorithm with faster convergence rates, more flexible step-size choices, and improved privacy-preserving properties.
Abstract: This work develops a distributed optimization strategy with guaranteed exact convergence for a broad class of left-stochastic combination policies. The resulting exact diffusion strategy is shown in Part II to have a wider stability range and superior convergence performance than the EXTRA strategy. The exact diffusion solution is applicable to non-symmetric left-stochastic combination matrices, while many earlier developments on exact consensus implementations are limited to doubly-stochastic matrices; these latter matrices impose stringent constraints on the network topology. The derivation of the exact diffusion strategy in this work relies on reformulating the aggregate optimization problem as a penalized problem and resorting to a diagonally-weighted incremental construction. Detailed stability and convergence analyses are pursued in Part II and are facilitated by examining the evolution of the error dynamics in a transformed domain. Numerical simulations illustrate the theoretical conclusions.
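The adapt-correct-combine structure of exact diffusion can be previewed with a minimal sketch. The quadratic costs, ring topology, and step-size below are hypothetical, and a symmetric doubly-stochastic matrix is used purely for simplicity of illustration (the strategy itself covers the broader left-stochastic class):

```python
import numpy as np

# Hypothetical setup: 4 agents minimize the aggregate of local quadratics
# J_k(w) = 0.5 * (w - t_k)^2, whose exact minimizer is the mean of the t_k.
targets = np.array([1.0, 2.0, 3.0, 4.0])
N = len(targets)

# Symmetric doubly-stochastic combination matrix over a ring topology
A = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])
A_bar = (A + np.eye(N)) / 2          # damped matrix used by exact diffusion

mu = 0.1
w = np.zeros(N)                      # one scalar iterate per agent
psi_prev = w.copy()                  # previous adaptation output

for _ in range(2000):
    psi = w - mu * (w - targets)     # adapt: local gradient step
    phi = psi + w - psi_prev         # correct: removes the steady-state bias
    w, psi_prev = A_bar @ phi, psi   # combine with neighbors

print(w)                             # every agent approaches the mean, 2.5
```

Dropping the correction step recovers plain diffusion, which settles at an O(μ)-biased point; the correction term is what yields exact convergence.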

118 citations


Journal ArticleDOI
TL;DR: In this article, a stochastic formulation of the generalized Nash equilibrium problem where agents are subject to randomness in the environment of unknown statistical distribution is examined, and penalized individual cost functions are employed to deal with coupled constraints.
Abstract: This paper examines a stochastic formulation of the generalized Nash equilibrium problem where agents are subject to randomness in the environment of unknown statistical distribution. We focus on fully distributed online learning by agents and employ penalized individual cost functions to deal with coupled constraints. Three stochastic gradient strategies are developed with constant step-sizes. We allow the agents to use heterogeneous step-sizes and show that the penalty solution is able to approach the Nash equilibrium in a stable manner within $O(\mu_{\text{max}})$, for a small step-size value $\mu_{\text{max}}$ and sufficiently large penalty parameters. The operation of the algorithm is illustrated by considering the network Cournot competition problem.
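Constant step-size stochastic gradient play in a game can be illustrated with a hypothetical two-firm Cournot duopoly under noisy price observations. The demand intercept, unit cost, and step-size below are made up, and this sketch omits the penalization and coupled constraints treated in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

P0, c = 10.0, 1.0        # hypothetical inverse-demand intercept and unit cost
mu = 0.01                # constant step-size
q = np.zeros(2)          # production quantities of the two firms
q_star = (P0 - c) / 3.0  # symmetric Nash equilibrium quantity (= 3.0)

history = []
for _ in range(50_000):
    price = P0 - q.sum() + rng.normal(0.0, 0.5)  # noisy observed market price
    # Stochastic gradient of each firm's profit q_k * price - c * q_k
    grad = price - q - c
    q = q + mu * grad
    history.append(q.copy())

q_avg = np.mean(history[-10_000:], axis=0)
print(q_avg)             # both entries hover near the Nash quantity 3.0
```

With a constant step-size, the iterates do not converge exactly but fluctuate in a small neighborhood of the Nash equilibrium whose size shrinks with μ, consistent with the O(μ_max) result.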

88 citations


Journal ArticleDOI
10 Feb 2017
TL;DR: It is shown that the asymmetric flow of information hinders the learning abilities of certain agents regardless of their local observations, and useful closed-form expressions are derived which can be used to motivate design problems to control it.
Abstract: In this paper, we study diffusion social learning over weakly connected graphs. We show that the asymmetric flow of information hinders the learning abilities of certain agents regardless of their local observations. Under some circumstances that we clarify in this paper, a scenario of total influence (or “mind-control”) arises where a set of influential agents ends up shaping the beliefs of noninfluential agents. We derive useful closed-form expressions that characterize this influence, and which can be used to motivate design problems to control it. We provide simulation examples to illustrate the results.
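The total-influence effect can be previewed with a deliberately stripped-down simulation: pure belief averaging over a weakly connected graph, omitting the Bayesian likelihood updates of the actual social learning model. The topology and beliefs below are hypothetical:

```python
import numpy as np

# Agents 0 and 1 are influential: they only listen to each other.
# Agents 2 and 3 are non-influential: they also listen to agents 0 and 1.
A = np.array([[0.50, 0.50, 0.00, 0.00],
              [0.50, 0.50, 0.00, 0.00],
              [0.25, 0.25, 0.25, 0.25],
              [0.25, 0.25, 0.25, 0.25]])

# Beliefs over two hypotheses; influential agents believe hypothesis 0,
# non-influential agents start out believing hypothesis 1.
beliefs = np.array([[1.0, 0.0],
                    [1.0, 0.0],
                    [0.0, 1.0],
                    [0.0, 1.0]])

for _ in range(100):
    beliefs = A @ beliefs            # each agent averages its neighbors' beliefs

print(beliefs[2], beliefs[3])        # both end up at [1, 0]: the influential belief
```

Because information flows only one way across the weakly connected cut, the non-influential agents inherit the influential agents' belief regardless of their own starting point, which is the averaging skeleton of the "mind-control" scenario.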

59 citations


Journal ArticleDOI
TL;DR: This paper examines an alternative way to model relations among tasks by assuming that they all share a common latent feature representation and presents a new multitask learning formulation and algorithms developed for its solution in a distributed online manner.
Abstract: Online learning with streaming data in a distributed and collaborative manner can be useful in a wide range of applications. This topic has been receiving considerable attention in recent years with emphasis on both single-task and multitask scenarios. In single-task adaptation, agents cooperate to track an objective of common interest, while in multitask adaptation agents track multiple objectives simultaneously. Regularization is one useful technique to promote and exploit similarity among tasks in the latter scenario. This paper examines an alternative way to model relations among tasks by assuming that they all share a common latent feature representation. As a result, a new multitask learning formulation is presented and algorithms are developed for its solution in a distributed online manner. We present a unified framework to analyze the mean-square-error performance of the adaptive strategies, and conduct simulations to illustrate the theoretical findings and potential applications.

57 citations


Posted Content
TL;DR: The exact diffusion algorithm developed to remove the bias that is characteristic of distributed solutions for deterministic optimization problems has a wider stability range than the EXTRA consensus solution, meaning that it is stable for a wider range of step-sizes and can attain faster convergence rates.
Abstract: Part I of this work [2] developed the exact diffusion algorithm to remove the bias that is characteristic of distributed solutions for deterministic optimization problems. The algorithm was shown to be applicable to a larger set of combination policies than earlier approaches in the literature. In particular, the combination matrices are not required to be doubly stochastic, a requirement that imposes stringent conditions on the graph topology and communication protocol. In this Part II, we examine the convergence and stability properties of exact diffusion in some detail and establish its linear convergence rate. We also show that it has a wider stability range than the EXTRA consensus solution, meaning that it is stable for a wider range of step-sizes and can, therefore, attain faster convergence rates. Analytical examples and numerical simulations illustrate the theoretical findings.

56 citations


Journal ArticleDOI
TL;DR: In this article, an adaptive stochastic algorithm based on the projection gradient method and diffusion strategies is proposed to optimize the individual costs subject to all constraints, including linear equality constraints.
Abstract: We consider distributed multitask learning problems over a network of agents where each agent is interested in estimating its own parameter vector, also called task, and where the tasks at neighboring agents are related according to a set of linear equality constraints. Each agent possesses its own convex cost function of its parameter vector and a set of linear equality constraints involving its own parameter vector and the parameter vectors of its neighboring agents. We propose an adaptive stochastic algorithm based on the projection gradient method and diffusion strategies in order to allow the network to optimize the individual costs subject to all constraints. Although the derivation is carried out for linear equality constraints, the technique can be applied to other forms of convex constraints. We conduct a detailed mean-square-error analysis of the proposed algorithm and derive closed-form expressions to predict its learning behavior. We provide simulations to illustrate the theoretical findings. Finally, the algorithm is employed for solving two problems in a distributed manner: A minimum-cost flow problem over a network and a space–time varying field reconstruction problem.
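The projection-gradient core of such a scheme can be sketched on a centralized toy problem; the distributed, diffusion-based version in the paper is more involved, and the costs and the single constraint below are hypothetical:

```python
import numpy as np

# Two tasks w = (w1, w2) with local quadratic costs centered at 1 and 3,
# coupled by the linear equality constraint w1 - w2 = 0.
centers = np.array([1.0, 3.0])
D = np.array([[1.0, -1.0]])          # constraint matrix
b = np.array([0.0])

# Orthogonal projection onto the affine set {w : D w = b}
P_correction = D.T @ np.linalg.inv(D @ D.T)

mu = 0.1
w = np.zeros(2)
for _ in range(500):
    w = w - mu * (w - centers)           # gradient step on the sum of costs
    w = w - P_correction @ (D @ w - b)   # project back onto the constraint set

print(w)     # converges to [2, 2], the constrained minimizer
```

Each iteration alternates a gradient step with an exact projection, so the iterates always satisfy the equality constraint after the correction; the same two-phase pattern underlies the stochastic, networked algorithm.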

54 citations


Journal ArticleDOI
TL;DR: A robust adaptive algorithm previously developed for stand-alone agents, which semi-parametrically estimates the optimal error nonlinearity jointly with the parameter of interest, is extended to solve the problem of robust distributed estimation by a network of agents.
Abstract: Diffusion adaptive networks tasked with solving estimation problems have attracted attention in recent years due to their reliability, scalability, resource efficiency, and resilience to node and link failure. Diffusion adaptation strategies that are based on the least-mean-squares algorithm can be nonrobust against impulsive noise corrupting the measurements. Impulsive noise can degrade stability and steady-state performance, leading to unreliable estimates. In previous work [“Robust adaptation in impulsive noise,” IEEE Trans. Signal Process. , vol. 64, no. 11, pp. 2851–2865, Jun. 2016], a robust adaptive algorithm for stand-alone agents was developed, one that semi-parametrically estimates the optimal error nonlinearity jointly with the parameter of interest. Prior knowledge of the impulsive noise distribution was not assumed. In this paper, we extend the framework to solve the problem of robust distributed estimation by a network of agents. Challenges arise due to the coupling among the agents and the distributed nature of the problem. The resulting diffusion strategy is analyzed and its performance illustrated by numerical simulations.
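The benefit of shaping the error can be previewed with a fixed clipping nonlinearity, a classical robust choice, standing in for the semi-parametric estimate used in the paper. The scalar system, impulse model, and clipping threshold below are hypothetical:

```python
import numpy as np

w_true, mu, T = 1.0, 0.05, 20_000

def run(clip_level=None):
    rng = np.random.default_rng(3)               # same noise stream for both runs
    w, errs = 0.0, []
    for _ in range(T):
        u = rng.normal()                         # regressor sample
        v = rng.normal(0.0, 0.1)                 # nominal Gaussian noise
        if rng.random() < 0.05:                  # occasional large impulse
            v += rng.choice([-10.0, 10.0])
        e = (w_true * u + v) - w * u             # a-priori estimation error
        if clip_level is not None:
            e = float(np.clip(e, -clip_level, clip_level))
        w += mu * u * e                          # (possibly clipped) LMS update
        errs.append(abs(w - w_true))
    return np.mean(errs[T // 2:])                # steady-state average error

lms_err = run()                   # plain LMS is degraded by the impulses
robust_err = run(clip_level=1.0)  # error clipping restores reliable estimates
print(lms_err, robust_err)
```

Clipping caps the contribution of any single impulsive sample to the update, which is why the robust variant hovers much closer to the true parameter in steady state.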

49 citations


Posted Content
TL;DR: When a batch implementation is employed, it is observed in simulations that diffusion-AVRG is more computationally efficient than exact diffusion or EXTRA, while maintaining almost the same communication efficiency.
Abstract: A new amortized variance-reduced gradient (AVRG) algorithm was developed in \cite{ying2017convergence}, which has constant storage requirement in comparison to SAGA and balanced gradient computations in comparison to SVRG. One key advantage of the AVRG strategy is its amenability to decentralized implementations. In this work, we show how AVRG can be extended to the network case where multiple learning agents are assumed to be connected by a graph topology. In this scenario, each agent observes data that is spatially distributed and all agents are only allowed to communicate with direct neighbors. Moreover, the amount of data observed by the individual agents may differ drastically. For such situations, the balanced gradient computation property of AVRG becomes a real advantage in reducing idle time caused by unbalanced local data storage requirements, which is characteristic of other reduced-variance gradient algorithms. The resulting diffusion-AVRG algorithm is shown to have linear convergence to the exact solution, and is much more memory efficient than other alternative algorithms. In addition, we propose a mini-batch strategy to balance the communication and computation efficiency for diffusion-AVRG. When a proper batch size is employed, it is observed in simulations that diffusion-AVRG is more computationally efficient than exact diffusion or EXTRA while maintaining almost the same communication efficiency.

37 citations


Posted Content
TL;DR: In this article, the authors provided the first theoretical guarantee of linear convergence under random reshuffling for SAGA and proposed a new amortized variance-reduced gradient (AVRG) algorithm with constant storage requirements and balanced gradient computations compared to SVRG.
Abstract: Several useful variance-reduced stochastic gradient algorithms, such as SVRG, SAGA, Finito, and SAG, have been proposed to minimize empirical risks with linear convergence properties to the exact minimizer. The existing convergence results assume uniform data sampling with replacement. However, it has been observed in related works that random reshuffling can deliver superior performance over uniform sampling and, yet, no formal proofs or guarantees of exact convergence exist for variance-reduced algorithms under random reshuffling. This paper makes two contributions. First, it resolves this open issue and provides the first theoretical guarantee of linear convergence under random reshuffling for SAGA; the argument is also adaptable to other variance-reduced algorithms. Second, under random reshuffling, the paper proposes a new amortized variance-reduced gradient (AVRG) algorithm with constant storage requirements compared to SAGA and with balanced gradient computations compared to SVRG. AVRG is also shown analytically to converge linearly.
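A minimal sketch of SAGA driven by random reshuffling, the sampling scheme analyzed in the paper, on a hypothetical least-squares problem; the data, step-size, and epoch count are made up, and the linear system is built to be consistent so the exact minimizer is recoverable:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical consistent least-squares problem: f_i(w) = 0.5*(a_i @ w - b_i)^2
N, d = 20, 3
A = rng.normal(size=(N, d))
A /= np.linalg.norm(A, axis=1, keepdims=True)   # unit-norm rows for easy stability
w_true = rng.normal(size=d)
b = A @ w_true                   # consistent system: w_true is the exact minimizer

mu, epochs = 0.1, 1000
w = np.zeros(d)
table = np.zeros((N, d))         # SAGA memory: last gradient seen for each sample
avg = table.mean(axis=0)

for _ in range(epochs):
    for j in rng.permutation(N): # random reshuffling: fresh permutation each epoch
        g = A[j] * (A[j] @ w - b[j])        # current gradient of sample j
        w = w - mu * (g - table[j] + avg)   # SAGA variance-reduced step
        avg = avg + (g - table[j]) / N      # keep the running average in sync
        table[j] = g

print(np.linalg.norm(w - w_true))           # approaches the exact minimizer
```

The only change relative to textbook SAGA is that the sample index is drawn from a fresh permutation each epoch instead of uniformly with replacement, which is precisely the regime whose linear convergence the paper establishes.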

21 citations


Posted Content
TL;DR: This work develops effective distributed strategies for the solution of constrained multi-agent stochastic optimization problems with coupled parameters across the agents, and derives a distributed learning strategy that is able to track drifts in the underlying parameter model.
Abstract: This work develops effective distributed strategies for the solution of constrained multi-agent stochastic optimization problems with coupled parameters across the agents. In this formulation, each agent is influenced by only a subset of the entries of a global parameter vector or model, and is subject to convex constraints that are only known locally. Problems of this type arise in several applications, most notably in disease propagation models, minimum-cost flow problems, distributed control formulations, and distributed power system monitoring. This work focuses on stochastic settings, where a stochastic risk function is associated with each agent and the objective is to seek the minimizer of the aggregate sum of all risks subject to a set of constraints. Agents are not aware of the statistical distribution of the data and, therefore, can only rely on stochastic approximations in their learning strategies. We derive an effective distributed learning strategy that is able to track drifts in the underlying parameter model. A detailed performance and stability analysis is carried out showing that the resulting coupled diffusion strategy converges at a linear rate to an $O(\mu)$-neighborhood of the true penalized optimizer.

21 citations


Journal ArticleDOI
TL;DR: This work proposes a decentralized clustering algorithm aimed at identifying and forming clusters of agents of similar objectives, and at guiding cooperation to enhance the inference performance, and illustrates the performance of the proposed method in comparison to other useful techniques.
Abstract: We consider the problem of decentralized clustering and estimation over multitask networks, where agents infer and track different models of interest. The agents do not know beforehand which model is generating their own data. They also do not know which agents in their neighborhood belong to the same cluster. We propose a decentralized clustering algorithm aimed at identifying and forming clusters of agents of similar objectives, and at guiding cooperation to enhance the inference performance. One key feature of the proposed technique is the integration of the learning and clustering tasks into a single strategy. We analyze the performance of the procedure and show that the error probabilities of types I and II decay exponentially to zero with the step-size parameter. While links between agents following different objectives are ignored in the clustering process, we nevertheless show how to exploit these links to relay critical information across the network for enhanced performance. Simulation results illustrate the performance of the proposed method in comparison to other useful techniques.

Proceedings ArticleDOI
25 May 2017
TL;DR: An energy-efficient, implantable, real-time, blind Adaptive Stimulation Artifact Rejection (ASAR) engine is proposed, which enables concurrent neural stimulation and recording for state-of-the-art closed-loop neuromodulation systems.
Abstract: In this work we propose an energy-efficient, implantable, real-time, blind Adaptive Stimulation Artifact Rejection (ASAR) engine. This enables concurrent neural stimulation and recording for state-of-the-art closed-loop neuromodulation systems. Two engines, implemented in 40 nm CMOS, achieve artifact suppression of 49.2 dB (peak-to-peak) without any prior knowledge of the stimulation pulse. The LFP and Spike ASAR designs occupy areas of 0.197 mm² and 0.209 mm², and consume 1.73 µW and 3.02 µW, respectively, at 0.644 V.

Proceedings ArticleDOI
01 Oct 2017
TL;DR: The objective of this paper is to blend concepts from adaptive networks and graph signal processing to propose new useful tools for adaptive graph signal processing.
Abstract: Graph signal processing allows the generalization of DSP concepts to the graph domain. However, most works assume graph signals that are static with respect to time, which is a limitation even in comparison to classical DSP formulations where signals are generally sequences that evolve over time. Several earlier works on adaptive networks have addressed problems involving streaming data over graphs by developing effective learning strategies that are well-suited to dynamic data scenarios, in a manner that generalizes adaptive signal processing concepts to the graph domain. The objective of this paper is to blend concepts from adaptive networks and graph signal processing to propose new useful tools for adaptive graph signal processing.

Posted Content
04 Aug 2017
TL;DR: This paper provides the first theoretical guarantee of linear convergence under random reshuffling for SAGA and proposes a new amortized variance-reduced gradient (AVRG) algorithm with constant storage requirements and balanced gradient computations compared to SVRG.
Abstract: Several useful variance-reduced stochastic gradient algorithms, such as SVRG, SAGA, Finito, and SAG, have been proposed to minimize empirical risks with linear convergence properties to the exact minimizers. The existing convergence results assume uniform data sampling with replacement. However, it has been observed that random reshuffling can deliver superior performance. No formal proofs or guarantees of exact convergence exist for variance-reduced algorithms under random reshuffling. This paper resolves this open convergence issue and provides the first theoretical guarantee of linear convergence under random reshuffling for SAGA; the argument is also adaptable to other variance-reduced algorithms. Under random reshuffling, the paper further proposes a new amortized variance-reduced gradient (AVRG) algorithm with constant storage requirements compared to SAGA and with balanced gradient computations compared to SVRG. The balancing in computations is attained by amortizing the full gradient calculation across all iterations. AVRG is also shown analytically to converge linearly.

Proceedings ArticleDOI
01 Oct 2017
TL;DR: This work develops an exact converging algorithm for the solution of a distributed optimization problem with partially-coupled parameters across agents in a multi-agent scenario that is shown to converge to the true optimizer at a linear rate for strongly-convex cost functions.
Abstract: This work develops an exact converging algorithm for the solution of a distributed optimization problem with partially-coupled parameters across agents in a multi-agent scenario. In this formulation, while the network performance is dependent on a collection of parameters, each individual agent may be influenced by only a subset of the parameters. Problems of this type arise in several applications, most notably in distributed control formulations and in power system monitoring. The resulting coupled exact diffusion strategy is shown to converge to the true optimizer at a linear rate for strongly-convex cost functions.

Proceedings ArticleDOI
01 Feb 2017
TL;DR: The analysis establishes analytically that random reshuffling outperforms independent sampling by showing that the iterate at the end of each run approaches a smaller neighborhood of size O(μ²) around the minimizer rather than O(μ).
Abstract: In empirical risk optimization, it has been observed that gradient descent implementations that rely on random reshuffling of the data achieve better performance than implementations that rely on sampling the data randomly and independently of each other. Recent works have pursued justifications for this behavior by examining the convergence rate of the learning process under diminishing step-sizes. Some of these justifications rely on loose bounds, or their conclusions are dependent on the sample size, which is problematic for large datasets. This work focuses on constant step-size adaptation, where the agent is continuously learning. In this case, convergence is only guaranteed to a small neighborhood of the optimizer, albeit at a linear rate. The analysis establishes analytically that random reshuffling outperforms independent sampling by showing that the iterate at the end of each run approaches a smaller neighborhood of size O(μ²) around the minimizer rather than O(μ). Simulation results illustrate the theoretical findings.
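The gap between the two sampling schemes can be observed on a hypothetical scalar least-mean-squares problem by comparing end-of-epoch iterates under constant step-size adaptation; all problem constants below are made up:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical empirical risk: f_i(w) = 0.5*(w - t_i)^2, minimizer = mean(t)
N = 50
t = rng.normal(size=N)
w_star = t.mean()
mu, epochs = 0.005, 600

def end_of_epoch_errors(reshuffle):
    w, errs = 0.0, []
    for _ in range(epochs):
        # one epoch: a permutation for reshuffling, i.i.d. draws otherwise
        idx = rng.permutation(N) if reshuffle else rng.integers(0, N, size=N)
        for j in idx:
            w -= mu * (w - t[j])
        errs.append((w - w_star) ** 2)
    return np.mean(errs[200:])       # average squared error after burn-in

mse_uniform = end_of_epoch_errors(reshuffle=False)
mse_rr = end_of_epoch_errors(reshuffle=True)
print(mse_uniform, mse_rr)           # reshuffling lands in a much smaller neighborhood
```

With independent sampling the steady-state error scales like O(μ), while under reshuffling each epoch visits every sample exactly once, leaving only the smaller O(μ²) residual at epoch boundaries.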

Proceedings ArticleDOI
01 Aug 2017
TL;DR: A distributed optimization algorithm with guaranteed exact convergence for a broad class of left-stochastic combination policies is developed, and the resulting exact diffusion strategy is shown to have a wider stability range and superior convergence performance than the EXTRA consensus strategy.
Abstract: This work develops a distributed optimization algorithm with guaranteed exact convergence for a broad class of left-stochastic combination policies. The resulting exact diffusion strategy is shown to have a wider stability range and superior convergence performance than the EXTRA consensus strategy. The exact diffusion solution is also applicable to non-symmetric left-stochastic combination matrices, while most earlier developments on exact consensus implementations are limited to doubly-stochastic matrices or right-stochastic matrices; these latter policies impose stringent constraints on the network topology. Stability and convergence results are noted, along with numerical simulations to illustrate the conclusions.

Proceedings ArticleDOI
01 Mar 2017
TL;DR: Using duality arguments from optimization theory, this work develops an effective distributed gradient boosting strategy for inference and classification by networked clusters of learners by sharing local dual variables with their immediate neighbors through a diffusion learning protocol.
Abstract: Using duality arguments from optimization theory, this work develops an effective distributed gradient boosting strategy for inference and classification by networked clusters of learners. By sharing local dual variables with their immediate neighbors through a diffusion learning protocol, the clusters are able to match the performance of centralized boosting solutions even when the individual clusters only have access to partial information about the feature space.

Patent
31 May 2017
TL;DR: Systems and methods that cancel artifacts of stimulation signals from neural signals are disclosed; a threshold value for the neural signal in the absence of artifacts is determined and then used to detect an artifact in received neural signals.
Abstract: Systems and methods that cancel artifacts of stimulation signals from neural signals are disclosed. In several embodiments, the systems and methods determine a threshold value for the neural signal in the absence of artifacts. The threshold value can then be used to detect an artifact in received neural signals. In a number of embodiments, a template can be used to cancel an artifact from a neural signal in response to the neural signal being greater than the threshold value.

Proceedings ArticleDOI
01 Mar 2017
TL;DR: In this paper, the authors characterize the set of beliefs that can be imposed on non-influential agents and how the graph topology of these latter agents helps resist manipulation but only to a certain degree.
Abstract: In diffusion social learning over weakly-connected graphs, it has been shown that influential agents end up shaping the beliefs of non-influential agents. In this paper, we analyse this control mechanism more closely and reveal some critical properties. In particular, we characterize the set of beliefs that can be imposed on non-influential agents (i.e., the set of attainable beliefs) and how the graph topology of these latter agents helps resist manipulation but only to a certain degree. We also derive a design procedure that allows influential agents to drive the beliefs of non-influential agents to desirable attainable states. We illustrate the results with two examples.

Proceedings ArticleDOI
01 Mar 2017
TL;DR: This work develops cooperative distributed techniques that enable agents to cooperate even when their interactions are limited to exchanging estimates of select few entries, which results in a significant reduction in communication overhead.
Abstract: In many scenarios of interest, agents may only have access to partial information about an unknown model or target vector. Each agent may be sensing only a subset of the entries of a global target vector, and the number of these entries can be different across the agents. If each of the agents were to solve an inference task independently of the other agents, then they would not benefit from neighboring agents that may be sensing similar entries. This work develops cooperative distributed techniques that enable agents to cooperate even when their interactions are limited to exchanging estimates of select few entries. In the proposed strategies, agents are only required to share estimates of their common entries, which results in a significant reduction in communication overhead. Simulations show that the proposed approach improves both the performance of individual agents and the entire network through cooperation.

Posted Content
TL;DR: In this paper, the problem of inferring whether an agent is directly influenced by another agent over an adaptive diffusion network is studied, where only the output of the diffusion learning algorithm is available to the external observer that must perform the inference based on these indirect measurements.
Abstract: This work studies the problem of inferring whether an agent is directly influenced by another agent over an adaptive diffusion network. Agent i influences agent j if they are connected (according to the network topology), and if agent j uses the data from agent i to update its online statistic. The solution of this inference task is challenging for two main reasons. First, only the output of the diffusion learning algorithm is available to the external observer that must perform the inference based on these indirect measurements. Second, only output measurements from a fraction of the network agents are available, with the total number of agents itself also being unknown. The main focus of this article is ascertaining under these demanding conditions whether consistent tomography is possible, namely, whether it is possible to reconstruct the interaction profile of the observable portion of the network, with negligible error as the network size increases. We establish a critical achievability result, namely, that for symmetric combination policies and for any given fraction of observable agents, the interacting and non-interacting agent pairs split into two separate clusters as the network size increases. This remarkable property then enables the application of clustering algorithms to identify the interacting agents influencing the observations. We provide a set of numerical experiments that verify the results for finite network sizes and time horizons. The numerical experiments show that the results hold for asymmetric combination policies as well, which is particularly relevant in the context of causation.

Proceedings ArticleDOI
01 Aug 2017
TL;DR: This work examines the mean-square error performance of diffusion stochastic algorithms under a generalized coordinate-descent scheme and shows that the steady-state performance of the learning strategy is not affected, while the convergence rate suffers some degradation.
Abstract: This work examines the mean-square error performance of diffusion stochastic algorithms under a generalized coordinate-descent scheme. In this setting, the adaptation step by each agent is limited to a random subset of the coordinates of its stochastic gradient vector. The selection of which coordinates to use varies randomly from iteration to iteration and from agent to agent across the network. Such schemes are useful in reducing computational complexity in power-intensive large data applications. The results show that the steady-state performance of the learning strategy is not affected, while the convergence rate suffers some degradation. The results provide yet another indication of the resilience and robustness of adaptive distributed strategies.
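A sketch of this scheme on a hypothetical diffusion LMS network, where each agent updates only a random subset of the coordinates of its stochastic gradient per iteration; the topology, step-size, and coordinate-selection probability are made up:

```python
import numpy as np

rng = np.random.default_rng(42)

# 4 agents cooperatively estimate a common 4-dim model w0 from streaming data
M = 4
w0 = np.array([0.5, -1.0, 2.0, 0.3])

# Doubly-stochastic combination matrix over a ring topology
A = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

mu = 0.02
W = np.zeros((4, M))                         # one row of estimates per agent

for _ in range(20_000):
    psi = W.copy()
    for k in range(4):
        u = rng.normal(size=M)               # regressor at agent k
        d = u @ w0 + rng.normal(0.0, 0.1)    # noisy streaming measurement
        mask = rng.random(M) < 0.5           # random coordinate subset this round
        psi[k] = W[k] + mu * mask * u * (d - u @ W[k])   # partial adapt step
    W = A @ psi                              # combine with neighbors

print(np.linalg.norm(W - w0, axis=1))        # every agent ends up close to w0
```

The masked update discards, on average, half of each gradient vector, which slows the transient but, in line with the paper's conclusion, still drives every agent to a small steady-state neighborhood of the true model.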