Author

Stamatis Vassiliadis

Other affiliations: Nokia, IBM, Technical University of Crete
Bio: Stamatis Vassiliadis is an academic researcher from Delft University of Technology. The author has contributed to research in topics: Field-programmable gate array & Adder. The author has an h-index of 49 and has co-authored 395 publications receiving 8752 citations. Previous affiliations of Stamatis Vassiliadis include Nokia & IBM.


Papers
Journal ArticleDOI
TL;DR: A polymorphic processor paradigm whose microarchitecture is based on reconfigurable hardware emulation to allow high-speed reconfiguration and execution; its viability is demonstrated with the MPEG-2 encoder and decoder on a Xilinx Virtex II Pro FPGA.
Abstract: In this paper, we present a polymorphic processor paradigm incorporating both general-purpose and custom computing processing. The proposal incorporates an arbitrary number of programmable units, exposes the hardware to the programmers/designers, and allows them to modify and extend the processor functionality at will. To achieve the previously stated attributes, we present a new programming paradigm, a new instruction set architecture, a microcode-based microarchitecture, and a compiler methodology. The programming paradigm, in contrast with the conventional programming paradigms, allows general-purpose conventional code and hardware descriptions to coexist in a program. In our proposal, for a given instruction set architecture, a one-time instruction set extension of eight instructions is sufficient to implement the reconfigurable functionality of the processor. We propose a microarchitecture based on reconfigurable hardware emulation to allow high-speed reconfiguration and execution. To prove the viability of the proposal, we experimented with the MPEG-2 encoder and decoder and a Xilinx Virtex II Pro FPGA. We have implemented three operations: SAD, DCT, and IDCT. The overall attainable application speedup is between 2.64 and 3.18 for the MPEG-2 encoder and between 1.56 and 1.94 for the decoder, representing between 93 percent and 98 percent of the theoretically obtainable speedups.

436 citations

Book ChapterDOI
29 Aug 2000
TL;DR: Computer architecture is a truly fascinating field in which improvements in the basic technology and innovations in how to make best use of the underlying technology have yielded a performance growth exceeding a million times over the past 50 years.
Abstract: Computer architecture is a truly fascinating field in that improvements in the basic technology and innovations in how to make best use of the underlying technology have yielded a performance growth exceeding a million times over the past 50 years. What is even more amazing is the fact that the pressure on maintaining this rate of performance growth shows no decline. In fact, as performance thresholds are passed, application designers face new opportunities that give new challenging problems to work on for computer architects.

310 citations

Proceedings ArticleDOI
20 Feb 2005
TL;DR: A 64-bit ANSI/IEEE Std 754-1985 floating-point hardware matrix multiplier optimized for FPGA implementations, realized as a scalable linear array of processing elements (PEs) that supports the proposed block algorithm in the Xilinx Virtex II Pro technology.
Abstract: We introduce a 64-bit ANSI/IEEE Std 754-1985 floating point design of a hardware matrix multiplier optimized for FPGA implementations. A general block matrix multiplication algorithm, applicable to an arbitrary matrix size, is proposed. The algorithm potentially enables optimum performance by exploiting the data locality and reusability incurred by the general matrix multiplication scheme and considering the limitations of the I/O bandwidth and the local storage volume. We implement a scalable linear array of processing elements (PE) supporting the proposed algorithm in the Xilinx Virtex II Pro technology. Synthesis results confirm a superior performance-area ratio compared to related recent works. Assuming the same FPGA chip, the same amount of local memory, and the same I/O bandwidth, our design outperforms related proposals by at least 1.7X and up to 18X while consuming the least reconfigurable resources. A total of 39 PEs can be integrated into the xc2vp125-7 FPGA, reaching performance of, e.g., 15.6 GFLOPS with 1600 KB local memory and 400 MB/s external memory bandwidth.
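
The general idea of block (tiled) matrix multiplication mentioned in the abstract can be sketched in software as follows. This is only an illustrative sketch of the textbook technique, not the authors' hardware algorithm or PE array; the function name `blocked_matmul` and the `block` parameter are assumptions introduced here. Each tile of the operands is reused several times while it is "local", which is the data-locality property the abstract refers to.

```python
def blocked_matmul(a, b, block=2):
    """Multiply matrices a (n x k) and b (k x m), given as lists of rows,
    one tile at a time. Hypothetical helper, not taken from the paper."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Walk over tiles of the output (i0, j0) and of the shared dimension (k0).
    for i0 in range(0, n, block):
        for j0 in range(0, m, block):
            for k0 in range(0, k, block):
                # Multiply one tile of a by one tile of b and accumulate.
                # min(...) handles sizes that are not multiples of the block.
                for i in range(i0, min(i0 + block, n)):
                    for kk in range(k0, min(k0 + block, k)):
                        a_ik = a[i][kk]
                        for j in range(j0, min(j0 + block, m)):
                            c[i][j] += a_ik * b[kk][j]
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(blocked_matmul(a, b))
```

The tile size trades local storage against external traffic: larger tiles need more local memory but reload each operand less often, which is the kind of bandwidth and storage trade-off the abstract describes.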

224 citations

Patent
30 Sep 1987
TL;DR: A dynamic history table maintains a record of the pipeline processor number in which each incoming instruction is executing, together with other characteristics of the instruction; the pipe number is stored in the table for future reference.
Abstract: A dynamic multiple instruction stream, multiple data, multiple pipeline (MIMD) apparatus simultaneously executes more than one instruction associated with a multiple number of instruction streams utilizing multiple data associated with the multiple number of instruction streams in a multiple number of pipeline processors. Since instructions associated with a multiple number of instruction streams are being executed simultaneously by a multiple number of pipeline processors, a tracking mechanism is needed for keeping track of the pipe in which each instruction is executing. As a result, a dynamic history table maintains a record of the pipeline processor number in which each incoming instruction is executing, and other characteristics of the instruction. When a particular instruction is received, it is decoded and its type is determined. Each pipeline processor handles a certain category of instructions; the particular instruction is transmitted to the pipeline processor having its corresponding category. However, before transmission, the pipeline processor is checked for completion of its oldest instruction by consulting the dynamic history table. If the table indicates that the oldest instruction in the pipeline processor should complete, execution of the oldest instruction in such processor completes, leaving room for insertion of the particular instruction therein for execution. When the particular instruction is transmitted to its associated pipeline processor, information including the pipe number is stored in the dynamic history table for future reference.

171 citations

Proceedings ArticleDOI
01 Dec 2006
TL;DR: A nondeterministic finite automata (NFA) based implementation was presented, which takes advantage of new basic building blocks to support more complex regular expressions than the previous approaches.
Abstract: Recent intrusion detection systems (IDS) use regular expressions instead of static patterns as a more efficient way to represent hazardous packet payload contents. This paper focuses on regular expression pattern matching engines implemented in reconfigurable hardware. A nondeterministic finite automata (NFA) based implementation is presented, which takes advantage of new basic building blocks to support more complex regular expressions than the previous approaches. The methodology is supported by a tool that automatically generates the circuitry for the given regular expressions, outputting VHDL representations ready for logic synthesis. Furthermore, techniques to reduce the area cost of our designs and maximize performance when targeting FPGAs are included. Experimental results show that our tool is able to generate a regular expression engine to match more than 500 IDS regular expressions (from the Snort ruleset) using only 25K logic cells and achieving 2 Gbps throughput on a Virtex2 and 2.9 Gbps on a Virtex4 device. Concerning the throughput per area required per matching non-meta character, our design is 3.4 and 10 times more efficient than previous ASIC and FPGA approaches, respectively.
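
As a rough software analogue of NFA-based matching, the sketch below keeps a set of simultaneously active states and advances all of them on each input character, much as a hardware NFA updates all of its state flip-flops in the same clock cycle. It is a generic illustration under stated assumptions, not the paper's building blocks or its VHDL generator; the function `nfa_search`, the transition-table layout, and the hand-built automaton for `a(b|c)*d` are all hypothetical.

```python
def nfa_search(transitions, start, accept, data):
    """Return True if the NFA accepts some substring of data.

    transitions maps (state, char) to the set of next states.
    Hypothetical helper for illustration only.
    """
    active = set()
    for ch in data:
        # Re-arm the start state so a match may begin at any position,
        # as when scanning a packet payload.
        active.add(start)
        next_active = set()
        for state in active:
            next_active |= transitions.get((state, ch), set())
        active = next_active
        if accept in active:
            return True
    return False


# Hand-built NFA for the expression a(b|c)*d:
# state 0 = start, state 1 = after 'a', state 2 = accepting.
A_BC_STAR_D = {
    (0, "a"): {1},
    (1, "b"): {1},
    (1, "c"): {1},
    (1, "d"): {2},
}

if __name__ == "__main__":
    print(nfa_search(A_BC_STAR_D, 0, 2, "xxabcbd"))
    print(nfa_search(A_BC_STAR_D, 0, 2, "abca"))
```

Because every active state is advanced per character, the work per input symbol does not depend on backtracking, which is one reason NFAs map naturally onto parallel hardware.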

146 citations


Cited by
01 Jan 1990
TL;DR: An overview of the self-organizing map algorithm, on which the papers in this issue are based.
Abstract: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.

2,933 citations

Patent
29 Aug 2006
TL;DR: An intelligent electronic appliance, preferably a set top box for interacting with broadband media streams, with an adaptive user interface, content-based media processing and/or media metadata processing, and telecommunications integration.
Abstract: An intelligent electronic appliance preferably includes a user interface, data input and/or output port, and an intelligent processor. A preferred embodiment comprises a set top box for interacting with broadband media streams, with an adaptive user interface, content-based media processing and/or media metadata processing, and telecommunications integration. An adaptive user interface models the user, by observation, feedback, and/or explicit input, and presents a user interface and/or executes functions based on the user model. A content-based media processing system analyzes media content, for example audio and video, to understand the content, for example to generate content-descriptive metadata. A media metadata processing system operates on locally or remotely generated metadata to process the media in accordance with the metadata, which may be, for example, an electronic program guide, MPEG 7 data, and/or automatically generated format. A set top box preferably includes digital trick play effects, and incorporated digital rights management features.

2,644 citations

Patent
06 Jun 1995
TL;DR: An adaptive interface for a programmable system predicts a desired user function based on user history as well as machine internal status and context; a predicted input is presented for confirmation by the user, and the predictive mechanism is updated based on this feedback.
Abstract: An adaptive interface for a programmable system, for predicting a desired user function, based on user history, as well as machine internal status and context. The apparatus receives an input from the user and other data. A predicted input is presented for confirmation by the user, and the predictive mechanism is updated based on this feedback. Also provided is a pattern recognition system for a multimedia device, wherein a user input is matched to a video stream on a conceptual basis, allowing inexact programming of a multimedia device. The system analyzes a data stream for correspondence with a data pattern for processing and storage. The data stream is subjected to adaptive pattern recognition to extract features of interest to provide a highly compressed representation which may be efficiently processed to determine correspondence. Applications of the interface and system include a VCR, medical device, vehicle control system, audio device, environmental control system, securities trading terminal, and smart house. The system optionally includes an actuator for effecting the environment of operation, allowing closed-loop feedback operation and automated learning.

1,976 citations

Journal ArticleDOI
TL;DR: The aim of this paper is to present the reader with a perspective on how JFNK may be applicable to applications of interest and to provide sources of further practical information.

1,803 citations