Author

Sergio Spanò

Bio: Sergio Spanò is an academic researcher from the University of Rome Tor Vergata. The author has contributed to research in topics: Field-programmable gate array & Computer science. The author has an h-index of 8, co-authored 27 publications receiving 150 citations.

Papers
Journal ArticleDOI
TL;DR: A detailed taxonomy of the main multi-agent reinforcement learning approaches proposed in the literature, focused on their related mathematical models, with a comparison in terms of nonstationarity, scalability, and observability.
Abstract: In this review, we present an analysis of the most used multi-agent reinforcement learning algorithms. Starting with the single-agent reinforcement learning algorithms, we focus on the most critical issues that must be taken into account in their extension to multi-agent scenarios. The analyzed algorithms were grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their related mathematical models. For each algorithm, we describe the possible application fields, while pointing out its pros and cons. The described multi-agent algorithms are compared in terms of the most important characteristics for multi-agent reinforcement learning applications—namely, nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performances of the considered methods.

96 citations

Journal ArticleDOI
TL;DR: An efficient hardware architecture that implements the Q-Learning algorithm, suitable for real-time applications, with low-power, high throughput and limited hardware resources, and a technique based on approximated multipliers to reduce the hardware complexity of the algorithm.
Abstract: In this paper we propose an efficient hardware architecture that implements the Q-Learning algorithm, suitable for real-time applications. Its main features are low power, high throughput and limited hardware resources. We also propose a technique based on approximated multipliers to reduce the hardware complexity of the algorithm. We implemented the design on a Xilinx Zynq UltraScale+ MPSoC ZCU106 Evaluation Kit. The implementation results are evaluated in terms of hardware resources, throughput and power consumption. The architecture is compared to the state of the art of Q-Learning hardware accelerators presented in the literature, obtaining better results in speed, power and hardware resources. Experiments using different sizes for the Q-Matrix and different wordlengths for the fixed-point arithmetic are presented. With a Q-Matrix of size 8×4 (8-bit data) we achieved a throughput of 222 MSPS (Mega Samples Per Second) and a dynamic power consumption of 37 mW, while with a Q-Matrix of size 256×16 (32-bit data) we achieved a throughput of 93 MSPS and a power consumption of 611 mW. Due to the small amount of hardware resources required by the accelerator, our system is suitable for multi-agent IoT applications. Moreover, the architecture can be used to implement the SARSA (State-Action-Reward-State-Action) Reinforcement Learning algorithm with minor modifications.

71 citations
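The core operation the accelerator implements is the standard Q-Learning update, Q(s,a) ← Q(s,a) + lr·(r + γ·max Q(s',·) − Q(s,a)). A minimal software sketch of that update (the function name and the power-of-two constants are illustrative, not taken from the paper):

```python
def q_update(Q, s, a, r, s_next, lr=0.5, gamma=0.875):
    """One Q-Learning step: Q(s,a) += lr * (r + gamma * max_a' Q(s',a') - Q(s,a)).

    In a hardware implementation, picking lr and gamma as sums of a few
    powers of two (here 0.5 and 0.5 + 0.25 + 0.125) lets the two
    multiplications be realized as shift-and-add networks -- one way to
    build the kind of approximated multipliers the paper uses to cut
    hardware complexity.
    """
    td_target = r + gamma * max(Q[s_next])   # best achievable next-state value
    Q[s][a] += lr * (td_target - Q[s][a])    # temporal-difference update
    return Q[s][a]
```

For example, starting from an all-zero 2-state, 2-action table, a reward of 1.0 moves the updated entry halfway to the target with lr = 0.5.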

Journal ArticleDOI
TL;DR: An extension of the αMax + βMin method to approximate the Euclidean distance in a multi-dimensional space, yielding a much smaller approximation error than the Manhattan approximation at the expense of a reasonable increase in hardware cost.
Abstract: Several applications in different engineering areas require the computation of the Euclidean distance, a quite complex operation based on squaring and square root. In some applications, the Euclidean distance can be replaced by the Manhattan distance. However, the approximation error introduced by the Manhattan distance may be rather large, especially in a multi-dimensional space, and may compromise the overall performance. In this brief, we propose an extension of the αMax + βMin method to approximate the Euclidean distance in a multi-dimensional space. Such a method results in a much smaller approximation error with respect to the Manhattan approximation at the expense of a reasonable increase in hardware cost. Moreover, with respect to the Euclidean distance, the αMax + βMin method provides a significant reduction in hardware if the application can tolerate some errors.

24 citations
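The 2-D αMax + βMin rule replaces sqrt(x² + y²) with α·max(|x|, |y|) + β·min(|x|, |y|), avoiding squaring and square root. A minimal sketch of the idea, using one simple recursive extension to n dimensions (the coefficient pair and the multi-dimensional combination rule shown here are illustrative assumptions, not necessarily those of the paper):

```python
import math

ALPHA, BETA = 0.96, 0.398  # illustrative coefficient pair

def amax_bmin_2d(x, y):
    """Approximate sqrt(x^2 + y^2) with one compare, two multiplies, one add."""
    a, b = abs(x), abs(y)
    return ALPHA * max(a, b) + BETA * min(a, b)

def amax_bmin_nd(v):
    """Fold the 2-D rule over the components -- one simple way to extend
    the approximation to an n-dimensional vector."""
    est = abs(v[0])
    for comp in v[1:]:
        est = amax_bmin_2d(est, comp)
    return est

# Comparison against the exact norm and the Manhattan distance:
v = [1.0, 2.0, 2.0]                       # true Euclidean norm = 3
exact = math.sqrt(sum(x * x for x in v))  # 3.0
manhattan = sum(abs(x) for x in v)        # 5.0, a 67% overestimate
approx = amax_bmin_nd(v)                  # within a few percent of 3
```

On this vector the Manhattan distance overestimates the norm by two thirds, while the folded αMax + βMin estimate stays within a few percent, which matches the trade-off described in the abstract.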

Journal ArticleDOI
TL;DR: A novel approach for swarm reinforcement learning that extends standard Q-learning to multi-agent systems by developing a Q-learning real-time swarm algorithm (Q-RTS), which is iteration-based and suitable for real-time systems.
Abstract: The authors introduce a novel approach for swarm reinforcement learning that extends standard Q-learning to multi-agent systems. State-of-the-art methods implement a knowledge sharing mechanism between the agents that is triggered by the succession of episodes. This causes an intrinsic limit in the convergence speed of the algorithms. They overcame this issue by developing a Q-learning real-time swarm algorithm (Q-RTS), which is iteration-based and suitable for real-time systems. Q-RTS was tested in different environments and compared to other related methods in the literature. They obtained positive results in terms of learning time and scalability, achieving a speed-up factor of at least 1.49 with respect to standard Q-learning. Moreover, Q-RTS shows enhanced learning performance as the environment's complexity increases.

22 citations
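Q-RTS's distinguishing feature is that knowledge sharing happens at every iteration rather than only at episode boundaries. A simplified sketch of such iteration-based sharing, assuming the swarm table is a weighted average of the agents' local Q-tables (the actual Q-RTS combination rule is defined in the paper; this is only an illustration of the mechanism):

```python
def share_q_tables(local_Qs, weights=None):
    """Combine the agents' local Q-tables into one swarm table each iteration.

    Here the combination is a plain weighted average over agents; weighting
    agents by their recent performance is one natural refinement.
    """
    n = len(local_Qs)
    weights = weights or [1.0 / n] * n
    n_states, n_actions = len(local_Qs[0]), len(local_Qs[0][0])
    swarm = [[0.0] * n_actions for _ in range(n_states)]
    for w, Q in zip(weights, local_Qs):
        for s in range(n_states):
            for a in range(n_actions):
                swarm[s][a] += w * Q[s][a]
    return swarm
```

Because the shared table is rebuilt every step, an agent's discoveries propagate to the rest of the swarm immediately instead of waiting for an episode to end.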

Journal ArticleDOI
TL;DR: The design of an RL Agent that learns the behavior of a Timing Recovery Loop (TRL) through the Q-Learning algorithm and adapts to different modulation formats without any tuning of the system parameters.
Abstract: Machine Learning (ML) based on supervised and unsupervised learning models has recently been applied in the telecommunication field. However, such techniques rely on application-specific large datasets, and their performance deteriorates if the statistics of the inference data change over time. Reinforcement Learning (RL) is a solution to these issues because it is able to adapt its behavior to the changing statistics of the input data. In this work, we propose the design of an RL Agent able to learn the behavior of a Timing Recovery Loop (TRL) through the Q-Learning algorithm. The Agent is compatible with popular PSK and QAM formats. We validated the RL synchronizer by comparing it to the Mueller and Muller TRL in terms of Modulation Error Ratio (MER) in a noisy channel scenario. The results show a good trade-off in terms of MER performance. The RL-based synchronizer loses less than 1 dB of MER with respect to the conventional one, but it is able to adapt its behavior to different modulation formats without any tuning of the system parameters.

18 citations


Cited by
01 Jan 1990
TL;DR: An overview of the self-organizing map algorithm, on which the papers in this issue are based.
Abstract: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.

2,933 citations

01 Jan 2016
TL;DR: Digital Signal Processing: A Computer-Based Approach.

343 citations

TL;DR: A non-exhaustive survey of the state of the art of Blockchain technology in multiple application fields, from both an industry/business perspective and a consumer one, with a dedicated focus on the frictions between distributed ledgers and data protection regulations across all these areas.
Abstract: The rapid evolution of technology and the challenges ahead demand that we adopt increasingly cutting-edge and up-to-date technological tools. In this perspective, Blockchain technology is revolutionizing countless areas of our daily lives. In this paper, we present a non-exhaustive survey of the state of the art of Blockchain technology in multiple application fields, from both an industry and business perspective and from a consumer one. We also present a dedicated focus on the frictions between distributed ledgers and data protection regulations crossing all these areas.

201 citations

Book
01 Jan 2003
TL;DR: Comprehensive in scope, and gentle in approach, this book will help you achieve a thorough grasp of the basics and move gradually to more sophisticated DSP concepts and applications.
Abstract: From the Publisher: This is undoubtedly the most accessible book on digital signal processing (DSP) available to the beginner. Using intuitive explanations and well-chosen examples, this book gives you the tools to develop a fundamental understanding of DSP theory. The author covers the essential mathematics by explaining the meaning and significance of the key DSP equations. Comprehensive in scope, and gentle in approach, the book will help you achieve a thorough grasp of the basics and move gradually to more sophisticated DSP concepts and applications.

162 citations
