Simone Medardoni

Bio: Simone Medardoni is an academic researcher at the University of Ferrara. The author has contributed to research on the topics Network on a chip and Multipath routing. The author has an h-index of 9 and has co-authored 13 publications receiving 437 citations.

Topics: Network on a chip, Multipath routing, MPSoC, Scalability, Logic synthesis, ...

Papers

Proceedings Article

Addressing Manufacturing Challenges with Cost-Efficient Fault Tolerant Routing

Samuel Rodrigo¹, Jose Flich¹, Antoni Roca¹, Simone Medardoni², Davide Bertozzi², J. Camacho¹, Federico Silla¹, José Duato¹

University of Valencia¹, University of Ferrara²

03 May 2010

TL;DR: Universal Logic-Based Distributed Routing (uLBDR) is an efficient logic-based mechanism that adapts to any irregular topology derived from 2D meshes, offering an alternative to routing tables.

Abstract: The high-performance computing domain is enriching with the inclusion of Networks-on-chip (NoCs) as a key component of many-core (CMPs or MPSoCs) architectures. NoCs face the communication scalability challenge while meeting tight power, area and latency constraints. Designers must address new challenges that were not present before. Defective components, the enhancement of application-level parallelism or power-aware techniques may break topology regularity, thus, efficient routing becomes a challenge. In this paper, uLBDR (Universal Logic-Based Distributed Routing) is proposed as an efficient logic-based mechanism that adapts to any irregular topology derived from 2D meshes, being an alternative to the use of routing tables (either at routers or at end-nodes). uLBDR requires a small set of configuration bits, thus being more practical than large routing tables implemented in memories. Several implementations of uLBDR are presented highlighting the trade-off between routing cost and coverage. The alternatives span from the previously proposed LBDR approach (with 30% of coverage) to the uLBDR mechanism achieving full coverage. This comes with a small performance cost, thus exhibiting the trade-off between fault tolerance and performance.

92 citations

Journal Article

Efficient implementation of distributed routing algorithms for NoCs

Samuel Rodrigo¹, Simone Medardoni², Jose Flich¹, Davide Bertozzi², José Duato¹

Polytechnic University of Valencia¹, University of Ferrara²

11 Aug 2009 - IET Computers and Digital Techniques

TL;DR: LBDR (logic-based distributed routing) is proposed as a new routing method that removes the need for routing tables altogether and enables the implementation of many routing algorithms on most of the practical topologies in a multi-core system.

Abstract: Chip multiprocessors (CMPs) are gaining momentum in the high-performance computing domain. Networks-on-chip (NoCs) are key components of CMP architectures, in that they have to deal with the communication scalability challenge while meeting tight power, area and latency constraints. 2D mesh topologies are usually preferred by designers of general purpose NoCs. However, manufacturing faults may break their regularity. Moreover, resource management frameworks may require the segmentation of the network into irregular regions. Under these conditions, efficient routing becomes a challenge. Although the use of routing tables at switches is flexible, it does not scale in terms of latency and area due to its memory requirements. Logic-based distributed routing (LBDR) is proposed as a new routing method that removes the need for routing tables at all. LBDR enables the implementation of many routing algorithms on most of the practical topologies we may find in the near future in a multi-core system. From an initial topology and routing algorithm, a set of three bits per switch/output port is computed. Evaluation results show that, by using a small logic, LBDR mimics the performance of routing algorithms when implemented with routing tables, both in regular and irregular topologies. LBDR implementation in a real NoC switch is also explored, proving its smooth integration in the architecture and its negligible hardware and performance overhead.

73 citations

Journal Article

Cost-Efficient On-Chip Routing Implementations for CMP and MPSoC Systems

Samuel Rodrigo¹, Jose Flich¹, Antoni Roca¹, Simone Medardoni², Davide Bertozzi², Jesús Camacho¹, Federico Silla¹, José Duato¹

University of Valencia¹, University of Ferrara²

01 Apr 2011 - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: uLBDR is presented, an efficient logic-based mechanism that adapts to any irregular topology derived from 2-D meshes without using routing tables. It requires only a small set of configuration bits, making it more practical than large routing tables implemented in memories.

Abstract: The high-performance computing domain is enriching with the inclusion of networks-on-chip (NoCs) as a key component of many-core (CMPs or MPSoCs) architectures. NoCs face the communication scalability challenge while meeting tight power, area, and latency constraints. Designers must address new challenges that were not present before. Defective components, the enhancement of application-level parallelism, or power-aware techniques may break topology regularity, thus, efficient routing becomes a challenge. This paper presents universal logic-based distributed routing (uLBDR), an efficient logic-based mechanism that adapts to any irregular topology derived from 2-D meshes, instead of using routing tables. uLBDR requires a small set of configuration bits, thus being more practical than large routing tables implemented in memories. Several implementations of uLBDR are presented highlighting the tradeoff between routing cost and coverage. The alternatives span from the previously proposed LBDR approach (with 30% of coverage) to the uLBDR mechanism achieving full coverage. This comes with a small performance cost, thus exhibiting the tradeoff between fault tolerance and performance. Power consumption, area, and delay estimates are also provided highlighting the efficiency of the mechanism. To do this, different router models (one for CMPs and one for MPSoCs) have been designed as a proof of concept.

51 citations

Proceedings Article

Improved Utilization of NoC Channel Bandwidth by Switch Replication for Cost-Effective Multi-processor Systems-on-Chip

F. Gilabert, Maria E. Gomez, Simone Medardoni¹, Davide Bertozzi¹

University of Ferrara¹

03 May 2010

TL;DR: Replicating the entire switch, rather than only the buffering resources of each physical link, is shown to implement virtual channels with lower area and power. The result builds on a well-known principle of logic synthesis for combinational circuits (the area-performance trade-off when inferring a logic function into a gate-level netlist) and shows that a designer aware of it can conceive novel architecture design techniques.

Abstract: Virtual channels are an appealing flow control technique for on-chip interconnection networks (NoCs), in that they can potentially avoid deadlock and improve link utilization and network throughput. However, their use in the resource constrained multi-processor system-on-chip (MPSoC) domain is still controversial, due to their significant overhead in terms of area, power and cycle time degradation. This paper proposes a simple yet efficient approach to VC implementation, which results in more area- and power-saving solutions than conventional design techniques. While these latter replicate only buffering resources for each physical link, we replicate the entire switch and prove that our solution is counterintuitively more area/power efficient while potentially operating at higher speeds. This result builds on a well-known principle of logic synthesis for combinational circuits (the area-performance trade-off when inferring a logic function into a gate-level netlist), and proves that when a designer is aware of this, novel architecture design techniques can be conceived.

43 citations

Proceedings Article

Assessing fat-tree topologies for regular network-on-chip design under nanoscale technology constraints

Daniele Ludovici¹, F. Gilabert², Simone Medardoni³, Crispín Gómez², Maria E. Gomez², Pedro López², Georgi Gaydadjiev¹, Davide Bertozzi³

Delft University of Technology¹, Polytechnic University of Valencia², University of Ferrara³

20 Apr 2009

TL;DR: This work aims at providing an in-depth assessment of physical synthesis efficiency of fat-trees and at extrapolating silicon-aware performance figures to back-annotate in the system-level performance analysis.

Abstract: Most past evaluations of fat-trees for on-chip interconnection networks rely on oversimplifying or even unrealistic architecture and traffic pattern assumptions, and very few layout analyses are available to relieve practical feasibility concerns in nanoscale technologies. This work aims at providing an in-depth assessment of physical synthesis efficiency of fat-trees and at extrapolating silicon-aware performance figures to back-annotate in the system-level performance analysis. A 2D mesh is used as a reference architecture for comparison, and a 65 nm technology is targeted by our study. Finally, in an attempt to mitigate the implementation cost of k-ary n-tree topologies, we also review an alternative unidirectional multi-stage interconnection network which is able to simplify the fat-tree architecture and to minimally impact performance.

42 citations

Cited by

IEEE International Solid-State Circuits Conference

Hurwitz Jonathan Ephraim David, Stewart Smith, A. A. Murray, Peter B. Denyer, John Thomson, Scot D. Anderson, E. Duncan, B. Paisley, A. Kinsey, E. Christison, B. Laffoley, J. Vittu, R. Bechignac, Robert Henderson, M.J. Panaghiston, P.-F. Pugibet, H. Hendry, K. M. Findlater

01 Jan 2001

401 citations

Journal Article

Methods for fault tolerance in networks-on-chip

Martin Radetzki¹, Chaochao Feng², Xueqian Zhao³, Axel Jantsch³

University of Stuttgart¹, National University of Defense Technology², Royal Institute of Technology³

11 Jul 2013 - ACM Computing Surveys

TL;DR: The article at hand reviews the failure mechanisms, fault models, diagnosis techniques, and fault-tolerance methods in on-chip networks, and surveys and summarizes the research of the last ten years.

Abstract: Networks-on-Chip constitute the interconnection architecture of future, massively parallel multiprocessors that assemble hundreds to thousands of processing cores on a single chip. Their integration is enabled by ongoing miniaturization of chip manufacturing technologies following Moore's Law. It comes with the downside of the circuit elements' increased susceptibility to failure. Research on fault-tolerant Networks-on-Chip tries to mitigate partial failure and its effect on network performance and reliability by exploiting various forms of redundancy at the suitable network layers. The article at hand reviews the failure mechanisms, fault models, diagnosis techniques, and fault-tolerance methods in on-chip networks, and surveys and summarizes the research of the last ten years. It is structured along three communication layers: the data link, the network, and the transport layers. The most important results are summarized and open research problems and challenges are highlighted to guide future research on this topic.

198 citations

Proceedings Article

Elastic-buffer flow control for on-chip networks

George Michelogiannakis¹, James Balfour¹, William J. Dally¹

Stanford University¹

06 Mar 2009

TL;DR: A channel occupancy detector is developed to apply universal globally adaptive load-balancing (UGAL) routing to load balance traffic in networks using elastic buffers (EBs). Using EBs results in up to 8% improvement in peak throughput per unit power compared to a VC flow-control network.

Abstract: This paper presents elastic buffers (EBs), an efficient flow-control scheme that uses the storage already present in pipelined channels in place of explicit input virtual-channel buffers (VCBs). With this approach, the channels themselves act as distributed FIFO buffers. Without VCBs, and hence virtual channels (VCs), deadlock prevention is achieved by duplicating physical channels. We develop a channel occupancy detector to apply universal globally adaptive load-balancing (UGAL) routing to load balance traffic in networks using EBs. Using EBs results in up to 8% (12% for low-swing channels) improvement in peak throughput per unit power compared to a VC flow-control network. These gains allow for a wider network datapath to be used to offset the removal of VCBs and increase throughput for a fixed power budget. EB networks have identical zero-load latency to VC networks operating under the same frequency. The microarchitecture of an EB router is considerably simpler than a VC router because allocators and credits are not required. For 5×5 mesh routers, this results in an 18% improvement in the cycle time.

132 citations

Journal Article

Elevator-First: A Deadlock-Free Distributed Routing Algorithm for Vertically Partially Connected 3D-NoCs

Florentine Dubois, Abbas Sheibanyrad, Frédéric Pétrot, Maryam Bahmani

01 Mar 2013 - IEEE Transactions on Computers

TL;DR: It is formally proved that independently of the shape and dimensions of the planar topologies and of the number and placement of the TSVs, the proposed routing algorithm using two virtual channels in the plane is deadlock and livelock free.

Abstract: In this paper, we propose a distributed routing algorithm for vertically partially connected regular 2D topologies of different shapes and sizes (e.g., 2D mesh, torus, ring). The topologies that are the target of this algorithm are of practical interest in the 3D integration of heterogeneous dies using Through-Silicon-Vias (TSVs). Indeed, TSV-based 3D integration makes it possible to envision the stacking of dies with different functions and technologies, using a 3D-NoC as the interconnect backbone. Intrinsically, 3D topologies have better performance, but yield and active area (and thus the cost) are functions of the number of TSVs; therefore, the designs tend to use only a subset of available TSVs between two dies. The definition of blockage-free and low-implementation-cost distributed deterministic routing on this kind of topology is thus of theoretical and practical interest. We formally prove that independently of the shape and dimensions of the planar topologies and of the number and placement of the TSVs, the proposed routing algorithm using two virtual channels in the plane is deadlock and livelock free. We also experimentally show that the performance of this algorithm is still acceptable when the number of vertical connections decreases.

121 citations

Journal Article

Scalable Hierarchical Network-on-Chip Architecture for Spiking Neural Network Hardware Implementations

Snaider Carrillo¹, Jim Harkin¹, Liam McDaid¹, Fearghal Morgan², Sandeep Pande², Seamus Cawley², Brian McGinley²

Ulster University¹, National University of Ireland, Galway²

01 Dec 2013 - IEEE Transactions on Parallel and Distributed Systems

TL;DR: A novel hierarchical network-on-chip (H-NoC) architecture for SNN hardware is presented, which aims to address the scalability issue by creating a modular array of clusters of neurons using a hierarchical structure of low and high-level routers.

Abstract: Spiking neural networks (SNNs) attempt to emulate information processing in the mammalian brain based on massively parallel arrays of neurons that communicate via spike events. SNNs offer the possibility to implement embedded neuromorphic circuits, with high parallelism and low power consumption compared to the traditional von Neumann computer paradigms. Nevertheless, the lack of modularity and poor connectivity shown by traditional neuron interconnect implementations based on shared bus topologies is prohibiting scalable hardware implementations of SNNs. This paper presents a novel hierarchical network-on-chip (H-NoC) architecture for SNN hardware, which aims to address the scalability issue by creating a modular array of clusters of neurons using a hierarchical structure of low and high-level routers. The proposed H-NoC architecture incorporates a spike traffic compression technique to exploit SNN traffic patterns and locality between neurons, thus reducing traffic overhead and improving throughput on the network. In addition, adaptive routing capabilities between clusters balance local and global traffic loads to sustain throughput under bursting activity. Analytical results show the scalability of the proposed H-NoC approach under different scenarios, while simulation and synthesis analysis using 65-nm CMOS technology demonstrate high-throughput, low-cost area, and power consumption per cluster, respectively.

110 citations
