scispace - formally typeset
Search or ask a question
Author

Horácio C. Neto

Bio: Horácio C. Neto is an academic researcher from Instituto Superior Técnico. The author has contributed to research in topics: Field-programmable gate array & Reconfigurable computing. The author has an hindex of 16, co-authored 101 publications receiving 981 citations. Previous affiliations of Horácio C. Neto include INESC-ID & University of Lisbon.


Papers
More filters
Proceedings ArticleDOI
21 Apr 1999
TL;DR: This paper presents a new approach to synthesize to reconfigurable hardware (HW) user-specified regions of a program, under the assumption of "virtual HW" support, which exploits the temporal partitions at the behavior level, resolves memory access conflicts, and generates the VHDL descriptions at register-transfer level that will be mapped into the reconfigured HW devices.
Abstract: This paper presents a new approach to synthesize to reconfigurable hardware (HW) user-specified regions of a program, under the assumption of "virtual HW" support. The automation of this approach is supported by a compiler front-end and by an HW compiler under development. The front-end starts from the Java bytecodes and, therefore, supports any language that can be compiled to the JVM (Java Virtual Machine) model. It extracts from the bytecodes all the dependencies inside and between basic blocks. This information is stored in representation graphs more suitable to efficiently exploit the existent parallelism in the program than those typically used in high-level synthesis. From the intermediate representations the HW compiler exploits the temporal partitions at the behavior level, resolves memory access conflicts, and generates the VHDL descriptions at register-transfer level that will be mapped into the reconfigurable HW devices.

96 citations

Proceedings ArticleDOI
20 Oct 2014
TL;DR: In this article, the authors compare the trends of these computing architectures for high-performance computing and survey these platforms in the execution of algorithms belonging to different scientific application domains, showing that FPGAs are increasing the gap to GPUs and many-core CPUs moving them away from highperformance computing with intensive floating-point calculations.
Abstract: Floating-point computing with more than one TFLOP of peak performance is already a reality in recent Field-Programmable Gate Arrays (FPGA). General-Purpose Graphics Processing Units (GPGPU) and recent many-core CPUs have also taken advantage of the recent technological innovations in integrated circuit (IC) design and had also dramatically improved their peak performances. In this paper, we compare the trends of these computing architectures for high-performance computing and survey these platforms in the execution of algorithms belonging to different scientific application domains. Trends in peak performance, power consumption and sustained performances, for particular applications, show that FPGAs are increasing the gap to GPUs and many-core CPUs moving them away from high-performance computing with intensive floating-point calculations. FPGAs become competitive for custom floating-point or fixed-point representations, for smaller input sizes of certain algorithms, for combinational logic problems and parallel map-reduce problems.

77 citations

Proceedings ArticleDOI
10 Jan 1999
TL;DR: This paper develops an optimization model and describes an efficient algorithm for reordering pattern sequences in the presence of don't cares and preliminary experimental results amply confirm that the resulting power savings due to pattern sequence reordering usingDon't cares can be significant.
Abstract: For a significant number of electronic systems used in safety-critical applications circuit testing is performed periodically. For these systems, power dissipation due to Built-in Self Test (BIST) can represent a significant percentage of the overall power dissipation. One approach to minimize power consumption in these systems consists of test pattern sequence reordering. Moreover a key observation is that test patterns are in general expected to exhibit don't cares, which can naturally be exploited during test pattern sequence reordering. In this paper we develop an optimization model and describe an efficient algorithm for reordering pattern sequences in the presence of don't cares. Preliminary experimental results amply confirm that the resulting power savings due to pattern sequence reordering using don't cares can be significant.

51 citations

Journal ArticleDOI
TL;DR: This paper provides techniques for compiling software programs into reconfigurable hardware which offer faster and more efficient performance than the complex resource-sharing approaches typical of high-level synthesis systems.
Abstract: This paper provides techniques for compiling software programs into reconfigurable hardware which offer faster and more efficient performance than the complex resource-sharing approaches typical of high-level synthesis systems. The Java-based compiler presented in this paper uses intermediate graph representations to embody parallelism at various levels.

51 citations

Book ChapterDOI
07 Sep 2008
TL;DR: It is shown that a hybrid between an insertion sorting unit and a merge FIFO sorting unit provides a speed-up between 1.6 and 25 compared to a quicksort software implementation.
Abstract: Sorting is an important operation for a number of embedded applications. As sorting large datasets may impose undesired performance degradation, acceleration units coupled to the embedded processor can be an interesting solution for speeding-up the computations. This paper presents and evaluates three hardware sorting units, bearing in mind embedded computing systems implemented with FPGAs. The proposed architectures take advantage of specific FPGA hardware resources to increase efficiency. Experimental results show the differences in resources and performances among the three proposed sorting units and also between the sorting units and pure software implementations for sorting.We show that a hybrid between an insertion sorting unit and a merge FIFO sorting unit provides a speed-up between 1.6 and 25 compared to a quicksort software implementation.

42 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: The hardware aspects of reconfigurable computing machines, from single chip architectures to multi-chip systems, including internal structures and external coupling are explored, and the software that targets these machines is focused on.
Abstract: Due to its potential to greatly accelerate a wide variety of applications, reconfigurable computing has become a subject of a great deal of research. Its key feature is the ability to perform computations in hardware to increase performance, while retaining much of the flexibility of a software solution. In this survey, we explore the hardware aspects of reconfigurable computing machines, from single chip architectures to multi-chip systems, including internal structures and external coupling. We also focus on the software that targets these machines, such as compilation tools that map high-level algorithms directly to the reconfigurable substrate. Finally, we consider the issues involved in run-time reconfigurable systems, which reuse the configurable hardware during program execution.

1,666 citations

Posted Content
TL;DR: An exhaustive review of the research conducted in neuromorphic computing since the inception of the term is provided to motivate further work by illuminating gaps in the field where new research is needed.
Abstract: Neuromorphic computing has come to refer to a variety of brain-inspired computers, devices, and models that contrast the pervasive von Neumann computer architecture This biologically inspired approach has created highly connected synthetic neurons and synapses that can be used to model neuroscience theories as well as solve challenging machine learning problems The promise of the technology is to create a brain-like ability to learn and adapt, but the technical challenges are significant, starting with an accurate neuroscience model of how the brain works, to finding materials and engineering breakthroughs to build devices to support these models, to creating a programming framework so the systems can learn, to creating applications with brain-like capabilities In this work, we provide a comprehensive survey of the research and motivations for neuromorphic computing over its history We begin with a 35-year review of the motivations and drivers of neuromorphic computing, then look at the major research areas of the field, which we define as neuro-inspired models, algorithms and learning approaches, hardware and devices, supporting systems, and finally applications We conclude with a broad discussion on the major research topics that need to be addressed in the coming years to see the promise of neuromorphic computing fulfilled The goals of this work are to provide an exhaustive review of the research conducted in neuromorphic computing since the inception of the term, and to motivate further work by illuminating gaps in the field where new research is needed

570 citations

Journal ArticleDOI
TL;DR: A microarchitecture based on reconfigurable hardware emulation to allow high-speed reconfiguration and execution of the processor and to prove the viability of the proposal, the proposal was experimented with the MPEG-2 encoder and decoder and a Xilinx Virtex II Pro FPGA.
Abstract: In this paper, we present a polymorphic processor paradigm incorporating both general-purpose and custom computing processing. The proposal incorporates an arbitrary number of programmable units, exposes the hardware to the programmers/designers, and allows them to modify and extend the processor functionality at will. To achieve the previously stated attributes, we present a new programming paradigm, a new instruction set architecture, a microcode-based microarchitecture, and a compiler methodology. The programming paradigm, in contrast with the conventional programming paradigms, allows general-purpose conventional code and hardware descriptions to coexist in a program: In our proposal, for a given instruction set architecture, a onetime instruction set extension of eight instructions, is sufficient to implement the reconfigurable functionality of the processor. We propose a microarchitecture based on reconfigurable hardware emulation to allow high-speed reconfiguration and execution. To prove the viability of the proposal, we experimented with the MPEG-2 encoder and decoder and a Xilinx Virtex II Pro FPGA. We have implemented three operations, SAD, DCT, and IDCT. The overall attainable application speedup for the MPEG-2 encoder and decoder is between 2.64-3.18 and between 1.56-1.94, respectively, representing between 93 percent and 98 percent of the theoretically obtainable speedups.

436 citations

Proceedings ArticleDOI
24 Jul 2006
TL;DR: A distributed reconfigurable fabric inserted at RTL provides a debug platform that can be configured and operated post-silicon via the JTAG port and can be repeatedly reused to configure many debug structures such as assertions checkers, transaction identifiers, triggers, and event counters.
Abstract: In this paper we present a design-for-debug (DFD) reconfigurable infrastructure for SoCs to support at-speed in-system functional debug. A distributed reconfigurable fabric inserted at RTL provides a debug platform that can be configured and operated post-silicon via the JTAG port. The platform can be repeatedly reused to configure many debug structures such as assertions checkers, transaction identifiers, triggers, and event counters.

351 citations

Journal ArticleDOI
TL;DR: Fault-Tolerant Computer System Design by Dhiraj K. Pradhan examines the design of fault-tolerant systems and their applications in the oil and gas industry.
Abstract: Fault-Tolerant Computer System Design by Dhiraj K. Pradhan 550 pp. $72 Prentice Hall Upper Saddle River, N.J. 1996 ISBN 0-13-057887-8

222 citations