
Showing papers on "Benchmark (computing)" published in 1998


Proceedings ArticleDOI
10 Aug 1998
TL;DR: This paper presents a foundation for the simulation and analysis of DVS algorithms applied to a benchmark suite specifically targeted for PDA devices.
Abstract: The reduction of energy consumption in microprocessors can be accomplished without impacting the peak performance through the use of dynamic voltage scaling (DVS). This approach varies the processor voltage under software control to meet dynamically varying performance requirements. This paper presents a foundation for the simulation and analysis of DVS algorithms. These algorithms are applied to a benchmark suite specifically targeted for PDA devices.

593 citations
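
The interval-based policies such a simulation framework evaluates are simple to state in code. Below is a minimal, hypothetical voltage-scaling policy (my sketch, not one of the paper's algorithms): each interval it measures utilization and rescales the clock so the observed work would just fill the next interval.

# Minimal sketch of an interval-based DVS policy (hypothetical; not one of
# the paper's algorithms). Supply voltage is assumed to track frequency.
F_MIN, F_MAX = 30e6, 200e6          # assumed frequency range in Hz

def next_frequency(busy_time, interval, f_current):
    """Rescale the clock so last interval's work would just fill an interval."""
    utilization = busy_time / interval      # fraction of the interval spent busy
    f_target = f_current * utilization      # stretch the work to fill the time
    return min(F_MAX, max(F_MIN, f_target))

# Example: 40% utilization at 200 MHz suggests 80 MHz for the next interval.
print(next_frequency(busy_time=0.004, interval=0.010, f_current=200e6))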


Proceedings ArticleDOI
01 Nov 1998
TL;DR: In this paper, two new algorithms, redundant vector elimination (RVE) and essential fault reduction (EFR), were proposed for generating compact test sets for combinational circuits under the single stuck-at fault model.
Abstract: This paper presents two new algorithms, Redundant Vector Elimination (RVE) and Essential Fault Reduction (EFR), for generating compact test sets for combinational circuits under the single stuck-at fault model, and a new heuristic for estimating the minimum single stuck-at fault test set size. These algorithms, together with the dynamic compaction algorithm, are incorporated into an advanced ATPG system for combinational circuits, called MinTest. MinTest found better lower bounds and generated smaller test sets than the previously published results for the ISCAS85 and the full-scan versions of the ISCAS89 benchmark circuits.

451 citations
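
The paper's Redundant Vector Elimination idea (drop a test vector when every fault it detects is also detected by the remaining vectors) can be sketched as a simple set computation. The fault-coverage data below is illustrative, not from MinTest.

# Sketch of redundant vector elimination (illustrative; not the MinTest code).
# A vector is redundant if every fault it detects is detected by some other
# vector still in the test set; such vectors can be removed one at a time.
def eliminate_redundant(coverage):
    """coverage maps vector name -> set of detected faults."""
    kept = dict(coverage)
    changed = True
    while changed:
        changed = False
        for v in sorted(kept, key=lambda v: len(kept[v])):   # try weakest first
            others = set().union(*(kept[u] for u in kept if u != v)) if len(kept) > 1 else set()
            if kept[v] <= others:
                del kept[v]          # every fault of v is covered elsewhere
                changed = True
                break
    return kept

tests = {"t1": {"f1", "f2"}, "t2": {"f2", "f3"}, "t3": {"f1", "f3"}}
print(sorted(eliminate_redundant(tests)))   # t1 is redundant: -> ['t2', 't3']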


Proceedings ArticleDOI
Charles J. Alpert1
01 Apr 1998
TL;DR: The ISPD98 benchmark suite is introduced which consists of 18 circuits with sizes ranging from 13,000 to 210,000 modules and Experimental results for three existing partitioners are presented so that future researchers in partitioning can more easily evaluate their heuristics.
Abstract: From 1985 to 1993, the MCNC regularly introduced and maintained circuit benchmarks for use by the Design Automation community. However, during the last five years, no new circuits have been introduced that can be used for developing fundamental physical design applications, such as partitioning and placement. The largest circuit in the existing set of benchmark suites has over 100,000 modules, but the second largest has just over 25,000 modules, which is small by today's standards. This paper introduces the ISPD98 benchmark suite, which consists of 18 circuits with sizes ranging from 13,000 to 210,000 modules. Experimental results for three existing partitioners are presented so that future researchers in partitioning can more easily evaluate their heuristics.

318 citations


Journal ArticleDOI
TL;DR: Modularity of the method is intended to fit the human organization and map well on the computing technology of concurrent processing.
Abstract: BLISS is a method for optimization of engineering systems by decomposition. It separates the system level optimization, having a relatively small number of design variables, from the potentially numerous subsystem optimizations that may each have a large number of local design variables. The subsystem optimizations are autonomous and may be conducted concurrently. Subsystem and system optimizations alternate, linked by sensitivity data, producing a design improvement in each iteration. Starting from a best guess initial design, the method improves that design in iterative cycles, each cycle comprised of two steps. In step one, the system level variables are frozen and the improvement is achieved by separate, concurrent, and autonomous optimizations in the local variable subdomains. In step two, further improvement is sought in the space of the system level variables. Optimum sensitivity data link the second step to the first. The method prototype was implemented using MATLAB and iSIGHT programming software and tested on a simplified, conceptual level supersonic business jet design, and a detailed design of an electronic device. Satisfactory convergence and favorable agreement with the benchmark results were observed. Modularity of the method is intended to fit the human organization and map well on the computing technology of concurrent processing.

263 citations
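
A toy sketch of the two-step cycle, using scipy for the one-dimensional minimizations, is shown below. The objective is made up for illustration, and the real method couples the two steps through optimum sensitivity derivatives rather than simple re-optimization; only the alternating control flow is faithful.

# Toy sketch of a BLISS-style two-step cycle (illustrative objective only).
from scipy.optimize import minimize_scalar

def local_opt(z):
    """Step 1: optimize one subsystem's local variable x with z frozen."""
    res = minimize_scalar(lambda x: (x - z) ** 2 + 0.1 * x ** 2)
    return res.x

z = 5.0                                    # system-level design variable
for cycle in range(5):
    x1 = local_opt(z)                      # subsystem optimizations are
    x2 = local_opt(-z)                     # autonomous; could run concurrently
    # Step 2: improve the system variable with the local optima frozen.
    res = minimize_scalar(lambda s: (x1 - s) ** 2 + (x2 + s) ** 2 + 0.01 * s ** 2)
    z = res.x
    print(f"cycle {cycle}: z = {z:.4f}")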


Proceedings ArticleDOI
19 Aug 1998
TL;DR: Modularity of the method is intended to fit the human organization and map well on the computing technology of concurrent processing.
Abstract: BLISS is a method for optimization of engineering systems by decomposition. It separates the system level optimization, having a relatively small number of design variables, from the potentially numerous subsystem optimizations that may each have a large number of local design variables. The subsystem optimizations are autonomous and may be conducted concurrently. Subsystem and system optimizations alternate, linked by sensitivity data, producing a design improvement in each iteration. Starting from a best guess initial design, the method improves that design in iterative cycles, each cycle comprised of two steps. In step one, the system level variables are frozen and the improvement is achieved by separate, concurrent, and autonomous optimizations in the local variable subdomains. In step two, further improvement is sought in the space of the system level variables. Optimum sensitivity data link the second step to the first. The method prototype was implemented using MATLAB and iSIGHT programming software and tested on a simplified, conceptual level supersonic business jet design, and a detailed design of an electronic device. Satisfactory convergence and favorable agreement with the benchmark results were observed. Modularity of the method is intended to fit the human organization and map well on the computing technology of concurrent processing.

241 citations


Journal ArticleDOI
TL;DR: In this paper, a benchmark problem for tracking maneuvering targets is presented, where the best tracking algorithm is the one that minimizes a weighted average of the radar energy and radar time, while satisfying a constraint of 4% on the maximum number of lost tracks.
Abstract: A benchmark problem for tracking maneuvering targets is presented. The benchmark problem involves beam pointing control of a phased array (i.e., agile beam) radar against highly maneuvering targets in the presence of false alarms (FAs) and electronic countermeasures (ECM). The testbed simulation described includes the effects of target amplitude fluctuations, beamshape, missed detections, FAs, finite resolution, target maneuvers, and track loss. Multiple waveforms are included in the benchmark so that the radar energy can be coordinated with the tracking algorithm. The ECM includes a standoff jammer (SOJ) broadcasting wideband noise and targets attempting range gate pull off (RGPO). The limits on the position and maneuverability of the targets are given along with descriptions of six target trajectories. The "best" tracking algorithm is the one that minimizes a weighted average of the radar energy and radar time, while satisfying a constraint of 4% on the maximum number of lost tracks. The radar model, the ECM techniques, the target scenarios, and performance criteria for the benchmark are presented.

200 citations
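
The paper's figure of merit, a weighted average of radar energy and radar time subject to a track-loss ceiling, is easy to state as a scoring function. The weights and field names below are placeholders, not the benchmark's actual values.

# Sketch of the benchmark's figure of merit (weights are placeholders).
def score(runs, w_energy=0.5, w_time=0.5, max_lost_fraction=0.04):
    """runs: list of dicts with 'energy', 'radar_time', 'lost' per trial."""
    lost_fraction = sum(r["lost"] for r in runs) / len(runs)
    if lost_fraction > max_lost_fraction:
        return float("inf")                 # constraint violated: disqualified
    avg_energy = sum(r["energy"] for r in runs) / len(runs)
    avg_time = sum(r["radar_time"] for r in runs) / len(runs)
    return w_energy * avg_energy + w_time * avg_time    # lower is better

runs = [{"energy": 1.2, "radar_time": 0.8, "lost": False} for _ in range(100)]
print(score(runs))   # -> 1.0 with the placeholder weights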


Journal ArticleDOI
TL;DR: In this article, the authors present the overview and problem definition for a benchmark structural control problem, which is a scale model of a three-storey building employing an active mass driver.
Abstract: This paper presents the overview and problem definition for a benchmark structural control problem. The structure considered—chosen because of the widespread interest in this class of systems—is a scale model of a three-storey building employing an active mass driver. A model for this structural system, including the actuator and sensors, has been developed directly from experimentally obtained data and will form the basis for the benchmark study. Control constraints and evaluation criteria are presented for the design problem. A simulation program has been developed and made available to facilitate comparison of the efficiency and merit of various control strategies. A sample control design is given to illustrate some of the design challenges. © 1998 John Wiley & Sons, Ltd.

196 citations


Book ChapterDOI
30 Mar 1998
TL;DR: It is argued that the focus should be on on-line open systems, and proposed that a standard workload should be used as a benchmark for schedulers, which will specify distributions of parallelism and runtime, as found by analyzing accounting traces.
Abstract: The evaluation of parallel job schedulers hinges on two things: the use of appropriate metrics, and the use of appropriate workloads on which the scheduler can operate. We argue that the focus should be on on-line open systems, and propose that a standard workload should be used as a benchmark for schedulers. This benchmark will specify distributions of parallelism and runtime, as found by analyzing accounting traces, and also internal structures that create different speedup and synchronization characteristics. As for metrics, we present some problems with slowdown and bounded slowdown that have been proposed recently.

191 citations
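
For reference, the slowdown metrics the paper critiques are commonly defined as follows (standard formulations; the threshold \tau is a chosen constant, e.g. 10 seconds, which is my example value, not the paper's):

\[
\mathrm{slowdown} = \frac{t_{\mathrm{wait}} + t_{\mathrm{run}}}{t_{\mathrm{run}}},
\qquad
\mathrm{bounded\ slowdown} = \max\left(1,\ \frac{t_{\mathrm{wait}} + t_{\mathrm{run}}}{\max(t_{\mathrm{run}},\ \tau)}\right)
\]

Bounded slowdown exists precisely because the raw slowdown of very short jobs explodes under even modest waits; the paper presents cases where both variants remain problematic.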


Journal ArticleDOI
TL;DR: This paper proposes a new multilevel partitioning algorithm that exploits some of the latest innovations of classical iterative partitioning approaches and presents quadrisection results which compare favorably to the partitionings obtained by the GORDIAN cell placement tool.
Abstract: Many previous works in partitioning have used some underlying clustering algorithm to improve performance. As problem sizes reach new levels of complexity, a single application of a clustering algorithm is insufficient to produce excellent solutions. Recent work has illustrated the promise of multilevel approaches. A multilevel partitioning algorithm recursively clusters the instance until its size is smaller than a given threshold, then unclusters the instance, while applying a partitioning refinement algorithm. In this paper, we propose a new multilevel partitioning algorithm that exploits some of the latest innovations of classical iterative partitioning approaches. Our method also uses a new technique to control the number of levels in our matching-based clustering algorithm. Experimental results show that our heuristic outperforms numerous existing bipartitioning heuristics with improvements ranging from 6.9 to 27.9% for 100 runs and 3.0 to 20.6% for just ten runs (while also using less CPU time). Further, our algorithm generates solutions better than the best known mincut bipartitionings for seven of the ACM/SIGDA benchmark circuits, including golem3 (which has over 100,000 cells). We also present quadrisection results which compare favorably to the partitionings obtained by the GORDIAN cell placement tool. Our work in multilevel quadrisection has been used as the basis for an effective cell placement package.

171 citations
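
The multilevel scheme the paper builds on has a compact skeleton: cluster (coarsen) the instance until it is small, partition the coarsest instance, then uncluster while refining. The sketch below is a generic illustration with heavy-edge matching and a trivial refinement pass that ignores balance; it is not the authors' algorithm.

# Generic multilevel bipartitioning skeleton (illustrative; not the paper's
# algorithm). A graph is a dict: node -> {neighbor: edge weight}.
def coarsen(graph):
    """Heavy-edge matching: merge each node with its heaviest unmatched neighbor."""
    matched, mapping = set(), {}
    for u in graph:
        if u in matched:
            continue
        nbrs = [v for v in graph[u] if v not in matched and v != u]
        v = max(nbrs, key=lambda n: graph[u][n]) if nbrs else u
        matched |= {u, v}
        mapping[u] = mapping[v] = u               # cluster is named after u
    coarse = {mapping[u]: {} for u in graph}
    for u, nb in graph.items():
        for v, w in nb.items():
            cu, cv = mapping[u], mapping[v]
            if cu != cv:
                coarse[cu][cv] = coarse[cu].get(cv, 0) + w
    return coarse, mapping

def bipartition(graph, threshold=4):
    if len(graph) > threshold:
        coarse, mapping = coarsen(graph)
        if len(coarse) < len(graph):              # recurse while coarsening helps
            coarse_side = bipartition(coarse, threshold)
            side = {u: coarse_side[mapping[u]] for u in graph}   # project back
            for u in graph:                       # toy refinement: flip a node
                gain = sum(w if side[v] != side[u] else -w       # if it reduces
                           for v, w in graph[u].items())         # the cut
                if gain > 0:
                    side[u] = 1 - side[u]
            return side
    nodes = sorted(graph)                         # base case: alternate sides
    return {u: i % 2 for i, u in enumerate(nodes)}

g = {0: {1: 3, 2: 1}, 1: {0: 3, 3: 1}, 2: {0: 1, 3: 3}, 3: {1: 1, 2: 3}}
print(bipartition(g, threshold=2))   # clusters {0,1} vs {2,3}: cut weight 2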


Proceedings ArticleDOI
10 Aug 1998
TL;DR: The main features of the proposed cell include a rich local-interconnect network, which drastically reduces the energy dissipated in the wiring, and a dual-voltage scheme that allows pass-transistor networks to operate at low-voltages yet maintains decent performance.
Abstract: This paper introduces an energy-efficient FPGA module, intended for embedded implementations. The main features of the proposed cell include a rich local-interconnect network, which drastically reduces the energy dissipated in the wiring, and a dual-voltage scheme that allows pass-transistor networks to operate at low-voltages yet maintains decent performance. Simulations on a benchmark set demonstrate that the proposed module succeeds in its goal of reducing energy consumption by an order of magnitude over existing implementations.

151 citations


Proceedings ArticleDOI
16 Apr 1998
TL;DR: This paper examines the performance of desktop applications running on the Microsoft Windows NT operating system on Intel x86 processors, and contrasts these applications to the programs in the integer SPEC95 benchmark suite, and shows that the desktop applications have similar characteristics to theinteger SPEC95 benchmarks for many of these metrics.
Abstract: This paper examines the performance of desktop applications running on the Microsoft Windows NT operating system on Intel x86 processors, and contrasts these applications to the programs in the integer SPEC95 benchmark suite. We present measurements of basic instruction set and program characteristics, and detailed simulation results of the way these programs use the memory system and processor branch architecture. We show that the desktop applications have similar characteristics to the integer SPEC95 benchmarks for many of these metrics. However, compared to the integer SPEC95 applications, desktop applications have larger instruction working sets, execute instructions in a greater number of unique functions, cross DLL boundaries frequently, and execute a greater number of indirect calls.

Journal ArticleDOI
TL;DR: In this paper, the design and implementation of robust controllers for the focus servo of a compact disk and the tracking servos of a hard disk mechanism are investigated, with emphasis on track-following performance in the presence of disk disturbances.

Journal ArticleDOI
TL;DR: The minimum-relative-entropy algorithm is a special case of a general class of algorithms for calibrating models based on stochastic control and convex optimization, and it has a unique solution which is stable, i.e. it depends smoothly on the input prices.
Abstract: We present an algorithm for calibrating asset-pricing models to the prices of benchmark securities. The algorithm computes the probability that minimizes the relative entropy with respect to a prior distribution and satisfies a finite number of moment constraints. These constraints arise from fitting the model to the prices of benchmark instruments. The sensitivities of the values of contingent claims with respect to variations in the prices of the benchmark instruments are studied in detail. We find that the sensitivities can be interpreted as regression coefficients of the payoffs of contingent claims on the set of payoffs of the benchmark instruments, in the risk-neutral measure. We show that the algorithm has a unique solution which is stable, i.e. it depends smoothly on the input prices. We also show that the minimum-relative-entropy algorithm is a special case of a general class of algorithms for calibrating models based on stochastic control and convex optimization. As an illustration, we use minimum-relative-entropy to construct a smooth curve of instantaneous forward rates from US LIBOR swap/FRA data and to study the corresponding sensitivities of fixed-income securities to variations in input prices.
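
In a standard discrete formulation of the paper's core optimization (notation mine, not the paper's), the calibrated measure solves

\[
\min_{p}\ \sum_i p_i \ln\frac{p_i}{q_i}
\quad\text{s.t.}\quad
\sum_i p_i\, g_j(\omega_i) = c_j,\ \ j = 1,\dots,m,
\qquad \sum_i p_i = 1,
\]

and Lagrangian duality gives the exponential-family form

\[
p_i = \frac{q_i \exp\bigl(\sum_j \lambda_j g_j(\omega_i)\bigr)}
           {\sum_k q_k \exp\bigl(\sum_j \lambda_j g_j(\omega_k)\bigr)},
\]

with the multipliers \lambda_j chosen so that the m benchmark prices c_j are matched. The regression-coefficient interpretation of the sensitivities follows from differentiating this solution with respect to the c_j.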

Journal ArticleDOI
TL;DR: Augmenting the estimation technique to a conventional systolic-architecture-based VLSI motion estimation reduces the power consumption by a factor of 2, while still preserving the optimal solution and the throughput.
Abstract: This paper presents an architectural enhancement to reduce the power consumption of the full-search block-matching (FSBM) motion estimation. Our approach is based on eliminating unnecessary computation using conservative approximation. Augmenting the estimation technique to a conventional systolic-architecture-based VLSI motion estimation reduces the power consumption by a factor of 2, while still preserving the optimal solution and the throughput. A register-transfer level implementation as well as simulation results on benchmark video clips are presented.
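
The idea of eliminating unnecessary computation by conservative approximation has a well-known software analogue in block matching: stop accumulating a candidate's sum of absolute differences (SAD) once it already exceeds the best match found so far. The sketch below shows that analogue; the paper's contribution is a systolic hardware scheme, not this code.

# Software analogue of cutting unnecessary SAD work in full-search block
# matching: abandon a candidate once its partial SAD exceeds the current best.
def sad_with_early_exit(block, candidate, best_so_far):
    """Row-by-row SAD; returns None if the candidate cannot beat best_so_far."""
    total = 0
    for row_b, row_c in zip(block, candidate):
        total += sum(abs(a - b) for a, b in zip(row_b, row_c))
        if total >= best_so_far:      # conservative: partial SAD only grows
            return None
    return total

def full_search(block, candidates):
    best, best_idx = float("inf"), None
    for i, cand in enumerate(candidates):
        s = sad_with_early_exit(block, cand, best)
        if s is not None:
            best, best_idx = s, i
    return best_idx, best

block = [[10, 10], [10, 10]]
cands = [[[0, 0], [0, 0]], [[10, 9], [10, 10]], [[10, 10], [10, 10]]]
print(full_search(block, cands))      # -> (2, 0): exact match found last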

Journal ArticleDOI
TL;DR: In this article, the exact analytical truss solutions for some "benchmark" problems, which are often used as test examples in both discretized layout optimization of trusses and variable topology shape optimization of perforated plates under plane stress, are provided.
Abstract: The aim of this paper is to provide the exact analytical truss solutions for some “benchmark” problems, which are often used as test examples in both discretized layout optimization of trusses and variable topology (or generalized) shape optimization of perforated plates under plane stress.

Proceedings ArticleDOI
16 Apr 1998
TL;DR: This paper presents Selective Eager Execution (SEE), an execution model to overcome mis-speculation penalties by executing both paths after diffident branches, and presents the micro-architecture of the PolyPath processor, which is an extension of an aggressive superscalar, out-of-order architecture.
Abstract: Control-flow misprediction penalties are a major impediment to high performance in wide-issue superscalar processors. In this paper we present Selective Eager Execution (SEE), an execution model to overcome mis-speculation penalties by executing both paths after diffident branches. We present the micro-architecture of the PolyPath processor, which is an extension of an aggressive superscalar, out-of-order architecture. The PolyPath architecture uses a novel instruction tagging and register renaming mechanism to execute instructions from multiple paths simultaneously in the same processor pipeline, while retaining maximum resource availability for single-path code sequences. Results of our execution-driven, pipeline-level simulations show that SEE can improve performance by as much as 36% for the go benchmark, and an average of 14% on SPECint95, when compared to a normal superscalar, out-of-order, speculative execution, monopath processor. Moreover, our architectural model is both elegant and practical to implement, using a small amount of additional state and control logic.

Proceedings ArticleDOI
01 Dec 1998
TL;DR: MPI-SIM as mentioned in this paper is a library for the execution-driven parallel simulation of MPI programs, which can be used to predict the performance of existing MPI applications as a function of architectural characteristics, including number of processors and message communication latencies.
Abstract: This paper describes the design and implementation of MPI-SIM, a library for the execution-driven parallel simulation of MPI programs. MPI-LITE, a portable library that supports multithreaded MPI, is also described. MPI-SIM, built on top of MPI-LITE, can be used to predict the performance of existing MPI programs as a function of architectural characteristics, including number of processors and message communication latencies. The simulation models can be executed sequentially or in parallel. Parallel executions of MPI-SIM models are synchronized using a set of asynchronous conservative protocols. MPI-SIM reduces synchronization overheads by exploiting the communication characteristics of the program it simulates. This paper presents validation and performance results from the use of MPI-SIM to simulate applications from the NAS Parallel Benchmark suite. Using the techniques described here, we are able to reduce the number of synchronizations in the parallel simulation as compared with the synchronous quantum protocol and are able to achieve speedups ranging from 3.2 to 11.9 in going from sequential to parallel simulation using 16 processors on the IBM SP2.
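
The basic trick of execution-driven performance prediction, advancing a virtual clock per simulated process and charging modeled latencies for communication instead of real network time, can be illustrated with a tiny sequential simulator. Everything below (the program model, the latency number) is a made-up example, not MPI-SIM's implementation.

# Toy execution-driven prediction of an MPI-style program (illustrative only).
# Each process has a virtual clock; messages charge a modeled latency.
LATENCY = 50e-6       # assumed per-message latency in seconds

def simulate_ring(num_procs, compute_time, rounds):
    """Predict completion time of alternating compute and ring-shift rounds."""
    clock = [0.0] * num_procs
    for _ in range(rounds):
        clock = [t + compute_time for t in clock]          # local computation
        # Each process receives from its left neighbor; the receive completes
        # no earlier than the sender's send time plus the modeled latency.
        sends = [t + LATENCY for t in clock]
        clock = [max(clock[i], sends[(i - 1) % num_procs]) for i in range(num_procs)]
    return max(clock)

# With uniform compute times the prediction is rounds * (compute + latency),
# independent of the process count, which the model makes easy to see.
for p in (4, 16, 64):
    print(p, simulate_ring(p, compute_time=1e-3, rounds=10))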

Journal ArticleDOI
TL;DR: Spencer et al. as mentioned in this paper defined a benchmark structural control problem for a model building configured with an active mass driver (AMD), based on a high-fidelity analytical model of a three-storey tendon-controlled structure at the National Center for Earthquake Engineering Research (NCEER).
Abstract: In a companion paper (Spencer et al.), an overview and problem definition was presented for a well-defined benchmark structural control problem for a model building configured with an Active Mass Driver (AMD). A second benchmark problem is posed here based on a high-fidelity analytical model of a three-storey, tendon-controlled structure at the National Center for Earthquake Engineering Research (NCEER). The purpose of formulating this problem is to provide another setting in which to evaluate the relative effectiveness and implementability of various structural control algorithms. To achieve a high level of realism, an evaluation model is presented in the problem definition which is derived directly from experimental data obtained for the structure. This model accurately represents the behaviour of the laboratory structure and fully incorporates actuator/sensor dynamics. As in the companion paper, the evaluation model will be considered as the real structural system. In general, controllers that are successfully implemented on the evaluation model can be expected to perform similarly in the laboratory setting. Several evaluation criteria are given, along with the associated control design constraints. © 1998 John Wiley & Sons, Ltd.

01 Jan 1998
Abstract: SKaMPI is a benchmark for MPI implementations. Its purpose is the detailed analysis of the runtime of individual MPI operations and comparison of these for different implementations of MPI. SKaMPI can be configured and tuned in many ways: operations, measurement precision, communication modes, packet sizes, number of processors used etc. The technically most interesting feature of SKaMPI are measurement mechanisms which combine accuracy, efficiency and robustness. Postprocessors support graphical presentation and comparisons of different sets of results which are collected in a public web-site. We describe the SKaMPI design and implementation and illustrate its main aspects with actual measurements.
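
SKaMPI's measurement mechanism repeats an operation until the result is statistically trustworthy. A generic version of that idea (my sketch; SKaMPI's actual mechanism and parameters differ, and it measures MPI operations rather than a local function) looks like this:

# Generic adaptive timing loop in the spirit of SKaMPI's measurements
# (illustrative; SKaMPI's real mechanism and parameters differ).
import time, statistics

def measure(op, rel_error=0.05, min_reps=8, max_reps=1000):
    """Repeat op() until the standard error is a small fraction of the mean."""
    samples = []
    while len(samples) < max_reps:
        t0 = time.perf_counter()
        op()
        samples.append(time.perf_counter() - t0)
        if len(samples) >= min_reps:
            mean = statistics.mean(samples)
            stderr = statistics.stdev(samples) / len(samples) ** 0.5
            if stderr < rel_error * mean:
                break
    return statistics.mean(samples), len(samples)

mean, reps = measure(lambda: sum(range(10000)))
print(f"{mean * 1e6:.1f} us over {reps} repetitions")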

Journal ArticleDOI
TL;DR: In this paper, the authors present a methodology for the evaluation of respirometry-based control strategies in a full-scale environment, comprising a simulation model, plant layout, controller, and test procedure.

Proceedings ArticleDOI
01 Nov 1998
TL;DR: This paper evaluates the X86 architecture's multimedia extension (MMX) instruction set on a set of benchmarks to understand which aspects of native signal processing instruction sets are most useful, the current limitations, and how they can be utilized most efficiently.
Abstract: Many current general purpose processors are using extensions to the instruction set architecture to enhance the performance of digital signal processing (DSP) and multimedia applications. In this paper, we evaluate the X86 architecture's multimedia extension (MMX) instruction set on a set of benchmarks. Our benchmark suite includes kernels (filtering, fast Fourier transforms, and vector arithmetic) and applications (JPEG compression, Doppler radar processing, imaging, and G.722 speech encoding). Each benchmark has at least one non-MMX version in C and an MMX version that makes calls to an MMX assembly library. The versions differ in the implementation of filtering, vector arithmetic, and other relevant kernels. The observed speedup for the MMX versions of the suite ranges from less than 1.0 to 6.1. In addition to quantifying the speedup, we perform detailed instruction level profiling using Intel's VTune profiling tool. Using VTune, we profile static and dynamic instructions, microarchitecture operations, and data references to isolate the specific reasons for speedup or lack thereof. This analysis allows one to understand which aspects of native signal processing instruction sets are most useful, the current limitations, and how they can be utilized most efficiently.


Journal ArticleDOI
TL;DR: An algorithm for automatically restructuring the controllers of the data paths in which variable-latency units have been introduced is formulated, and results show an average throughput improvement exceeding 27%, at the price of a modest area increase.
Abstract: This paper introduces a novel optimization paradigm for increasing the throughput of digital systems. The basic idea consists of transforming fixed-latency units into variable-latency ones that run with a faster clock cycle. The transformation is fully automatic and can be used in conjunction with traditional design techniques to improve the overall performance of speed-critical units. In addition, we introduce procedures for reducing the area overhead of the modified units, and we formulate an algorithm for automatically restructuring the controllers of the data paths in which variable-latency units have been introduced. Results, obtained on a large set of benchmark circuits, show an average throughput improvement exceeding 27%, at the price of a modest area increase (less than 8% on average).


Journal Article
TL;DR: In this article, the authors present a new method that integrates path and timing analysis to estimate worst-case execution time on contemporary processors with complex pipelines and multi-level memory hierarchies.
Abstract: Previously published methods for estimation of the worst-case execution time on contemporary processors with complex pipelines and multi-level memory hierarchies result in overestimations owing to insufficient path and/or timing analysis. This paper presents a new method that integrates path and timing analysis to address these limitations. First, it is based on instruction-level architecture simulation techniques and thus has a potential to perform arbitrarily detailed timing analysis of hardware platforms. Second, by extending the simulation technique with the capability of handling unknown input data values, it is possible to exclude infeasible (or false) program paths in many cases, and also calculate path information, such as bounds on number of loop iterations, without the need for annotating the programs. Finally, in order to keep the number of program paths to be analyzed at a manageable level, we have extended the simulator with a path-merging strategy. This paper presents the method and particularly evaluates its capability to exclude infeasible paths based on seven benchmark programs.
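
The path-merging idea is what keeps the simulation tractable: when two simulated paths reach the same program point, keep a single state whose time bound is the worse of the two. A toy worst-case execution time walk over a branch-annotated program (hypothetical program representation, not the authors' simulator):

# Toy worst-case execution time walk with path merging (illustrative only).
# A program is a list of items: ('op', cycles) or ('branch', then_prog, else_prog).
def wcet(program):
    t = 0
    for item in program:
        if item[0] == "op":
            t += item[1]
        else:                           # unknown input: explore both paths,
            _, then_p, else_p = item    # then merge at the join with max time
            t += max(wcet(then_p), wcet(else_p))
    return t

prog = [("op", 5),
        ("branch", [("op", 10)], [("op", 2), ("op", 3)]),
        ("op", 1)]
print(wcet(prog))    # 5 + max(10, 5) + 1 = 16

Merging this way is conservative: it may lose correlations between branches, which is the price paid for not enumerating every program path.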

Journal ArticleDOI
TL;DR: An overview of the VelociTI including architectural principles, data path, instruction set, and pipeline operation is presented, and both the C62x fixed-point CPU and the C67x floating-point CPUs are described.
Abstract: The Texas Instruments VelociTI architecture is a very long instruction word (VLIW) architecture. The TMS320C6x family of digital signal processors (DSPs) is the first to employ the VelociTI architecture, with the TMS320C6201 (C6201) being the first device in this family. The C6201 is based on the fixed-point TMS320C62x (C62x) CPU. This article describes the VelociTI VLIW architecture and discusses the C62x, C67x, C6201, and the VelociTI development tools. An overview of the VelociTI including architectural principles, data path, instruction set, and pipeline operation is presented, and both the C62x fixed-point CPU and the C67x floating-point CPU are described. A summary of the C62x benchmark performance is also presented. The chip-level support outside the CPU that allows the C6201 to operate in a variety of high-performance DSP environments is also described. An overview of the C6x development environment is also given, demonstrating the breadth of the development environment and illustrating the programming methodology. The article concludes with a performance analysis of the C compiler.

Journal ArticleDOI
TL;DR: In this paper, a more practical benchmark, which is specified in terms of desired closed-loop dynamics, is proposed for performance assessment of feedback controllers in the H2 framework for MIMO processes.

Journal ArticleDOI
TL;DR: In this paper, a series expansion solution to the Hamilton-Jacobi-Isaacs Equation associated with the nonlinear disturbance attenuation problem was obtained for a nonlinear controller.
Abstract: In this paper, we use the theory of L2 disturbance attenuation for linear (H∞) and nonlinear systems to obtain solutions to the Nonlinear Benchmark Problem (NLBP) proposed in the paper by Bupp et al. [1]. By considering a series expansion solution to the Hamilton-Jacobi-Isaacs equation associated with the nonlinear disturbance attenuation problem, we obtain a series expansion solution for a nonlinear controller. Numerical simulations compare the performance of the third-order approximation of the nonlinear controller with its first-order approximation (which is the same as the linear H∞ controller obtained from the linearized problem).
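
For context, the Hamilton-Jacobi-Isaacs equation referred to here, written for a system \dot{x} = f(x) + g_1(x)w + g_2(x)u with penalty output h(x) and attenuation level \gamma (standard textbook form and notation, not copied from the paper; factors of 1/2 vary by convention), is

\[
V_x f(x) + \tfrac{1}{2}\, V_x \left( \tfrac{1}{\gamma^2}\, g_1 g_1^{\top} - g_2 g_2^{\top} \right) V_x^{\top} + \tfrac{1}{2}\, h^{\top} h = 0,
\qquad u^{*}(x) = -g_2^{\top}(x)\, V_x^{\top}(x).
\]

Expanding V in polynomial terms V = V^{(2)} + V^{(3)} + \cdots makes the second-order term reproduce the Riccati solution of the linearized H∞ problem, with higher-order terms supplying the nonlinear corrections; this is why the first-order controller approximation coincides with the linear H∞ controller.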

Book ChapterDOI
TL;DR: Three new techniques were derived that produced good speedups when manually applied to the authors' benchmark codes and can be implemented in a parallelizing compiler and applied automatically.
Abstract: Automatic parallelization is usually believed to be less effective at exploiting implicit parallelism in sparse/irregular programs than in their dense/regular counterparts. However, not much is really known because there have been few research reports on this topic. In this work, we have studied the possibility of using an automatic parallelizing compiler to detect the parallelism in sparse/irregular programs. The study with a collection of sparse/irregular programs led us to some common loop patterns. Based on these patterns, three new techniques were derived that produced good speedups when manually applied to our benchmark codes. More importantly, these parallelization methods can be implemented in a parallelizing compiler and can be applied automatically.
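
A typical loop pattern from such sparse codes is the compressed-sparse-row matrix-vector product: the inner subscripts are only known at run time, yet the outer iterations write disjoint outputs and can run in parallel. A small illustration (my example, not one of the paper's benchmark codes):

# CSR sparse matrix-vector product: a classic sparse/irregular loop pattern.
# Outer iterations write disjoint y[i], so they can run in parallel even
# though the inner subscripts col[j] are only known at run time.
def spmv_csr(row_ptr, col, val, x):
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):                      # parallelizable outer loop
        for j in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += val[j] * x[col[j]]           # irregular (indirect) access
    return y

# 2x3 matrix [[1, 0, 2], [0, 3, 0]] in CSR form:
row_ptr, col, val = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
print(spmv_csr(row_ptr, col, val, [1.0, 1.0, 1.0]))   # -> [3.0, 3.0]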

Book ChapterDOI
01 Jun 1998
TL;DR: A new method that integrates path and timing analysis to address limitations of previously published methods for estimation of the worst-case execution time on contemporary processors with complex pipelines and multi-level memory hierarchies is presented.
Abstract: Previously published methods for estimation of the worst-case execution time on contemporary processors with complex pipelines and multi-level memory hierarchies result in overestimations owing to insufficient path and/or timing analysis. This paper presents a new method that integrates path and timing analysis to address these limitations. First, it is based on instruction-level architecture simulation techniques and thus has a potential to perform arbitrarily detailed timing analysis of hardware platforms. Second, by extending the simulation technique with the capability of handling unknown input data values, it is possible to exclude infeasible (or false) program paths in many cases, and also calculate path information, such as bounds on number of loop iterations, without the need for annotating the programs. Finally, in order to keep the number of program paths to be analyzed at a manageable level, we have extended the simulator with a path-merging strategy. This paper presents the method and particularly evaluates its capability to exclude infeasible paths based on seven benchmark programs.