Showing papers on "Benchmark (computing) published in 2004"

PDF

Open Access

Journal Article•DOI•

An experimental comparison of min-cut/max- flow algorithms for energy minimization in vision

[...]

Yuri Boykov¹, Vladimir Kolmogorov•Institutions (1)

01 Sep 2004-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper compares the running times of several standard algorithms, as well as a new algorithm that is recently developed that works several times faster than any of the other methods, making near real-time performance possible.

...read moreread less

Abstract: Minimum cut/maximum flow algorithms on graphs have emerged as an increasingly useful tool for exactor approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max flow algorithms for applications in vision. We compare the running times of several standard algorithms, as well as a new algorithm that we have recently developed. The algorithms we study include both Goldberg-Tarjan style "push -relabel" methods and algorithms based on Ford-Fulkerson style "augmenting paths." We benchmark these algorithms on a number of typical graphs in the contexts of image restoration, stereo, and segmentation. In many cases, our new algorithm works several times faster than any of the other methods, making near real-time performance possible. An implementation of our max-flow/min-cut algorithm is available upon request for research purposes.

...read moreread less

4,463 citations

Proceedings Article•DOI•

The Princeton Shape Benchmark

[...]

Philip Shilane¹, Patrick Min¹, Michael Kazhdan¹, Thomas Funkhouser¹•Institutions (1)

Princeton University¹

07 Jun 2004

TL;DR: It is concluded that no single descriptor is best for all classifications, and thus the main contribution of this paper is to provide a framework to determine the conditions under which each descriptor performs best.

...read moreread less

Abstract: In recent years, many shape representations and geometric algorithms have been proposed for matching 3D shapes. Usually, each algorithm is tested on a different (small) database of 3D models, and thus no direct comparison is available for competing methods. We describe the Princeton Shape Benchmark (PSB), a publicly available database of polygonal models collected from the World Wide Web and a suite of tools for comparing shape matching and classification algorithms. One feature of the benchmark is that it provides multiple semantic labels for each 3D model. For instance, it includes one classification of the 3D models based on function, another that considers function and form, and others based on how the object was constructed (e.g., man-made versus natural objects). We find that experiments with these classifications can expose different properties of shape-based retrieval algorithms. For example, out of 12 shape descriptors tested, extended Gaussian images by B. Horn (1984) performed best for distinguishing man-made from natural objects, while they performed among the worst for distinguishing specific object types. Based on experiments with several different shape descriptors, we conclude that no single descriptor is best for all classifications, and thus the main contribution of this paper is to provide a framework to determine the conditions under which each descriptor performs best.

...read moreread less

1,561 citations

Proceedings Article•DOI•

A comparative study of differential evolution, particle swarm optimization, and evolutionary algorithms on numerical benchmark problems

[...]

J.S. Vesterstrom¹, René Thomsen¹•Institutions (1)

Aarhus University¹

19 Jun 2004

TL;DR: The results from this study show that DE generally outperforms the other algorithms, however, on two noisy functions, both DE and PSO were outperformed by the EA.

...read moreread less

Abstract: Several extensions to evolutionary algorithms (EAs) and particle swarm optimization (PSO) have been suggested during the last decades offering improved performance on selected benchmark problems. Recently, another search heuristic termed differential evolution (DE) has shown superior performance in several real-world applications. In this paper, we evaluate the performance of DE, PSO, and EAs regarding their general applicability as numerical optimization techniques. The comparison is performed on a suite of 34 widely used benchmark problems. The results from our study show that DE generally outperforms the other algorithms. However, on two noisy functions, both DE and PSO were outperformed by the EA.

...read moreread less

1,252 citations

Proceedings Article•DOI•

Efficient belief propagation for early vision

[...]

Pedro F. Felzenszwalb¹, Daniel P. Huttenlocher¹•Institutions (1)

Cornell University¹

19 Jul 2004

TL;DR: New algorithmic techniques are presented that substantially improve the running time of the belief propagation approach and reduce the complexity of the inference algorithm to be linear rather than quadratic in the number of possible labels for each pixel.

...read moreread less

Abstract: Markov random field models provide a robust and unified framework for early vision problems such as stereo, optical flow and image restoration. Inference algorithms based on graph cuts and belief propagation yield accurate results, but despite recent advances are often still too slow for practical use. In this paper we present new algorithmic techniques that substantially improve the running time of the belief propagation approach. One of our techniques reduces the complexity of the inference algorithm to be linear rather than quadratic in the number of possible labels for each pixel, which is important for problems such as optical flow or image restoration that have a large label set. A second technique makes it possible to obtain good results with a small fixed number of message passing iterations, independent of the size of the input images. Taken together these techniques speed up the standard algorithm by several orders of magnitude. In practice we obtain stereo, optical flow and image restoration algorithms that are as accurate as other global methods (e.g., using the Middlebury stereo benchmark) while being as fast as local techniques.

...read moreread less

889 citations

Journal Article•DOI•

Benchmark Control Problems for Seismically Excited Nonlinear Buildings

[...]

Yasuki Ohtori¹, Richard Christenson², Billie F. Spencer³, Shirley J. Dyke⁴•Institutions (4)

Central Research Institute of Electric Power Industry¹, Colorado School of Mines², University of Illinois at Urbana–Champaign³, Washington University in St. Louis⁴

15 Mar 2004-Journal of Engineering Mechanics-asce

TL;DR: In this paper, the authors present the problem definition and guidelines of a set of benchmark control problems for seismically excited nonlinear buildings, focusing on three typical steel structures, 3-, 9-, and 20-story buildings designed for the SAC project for the Los Angeles, California region.

...read moreread less

Abstract: This paper presents the problem definition and guidelines of a set of benchmark control problems for seismically excited nonlinear buildings. Focusing on three typical steel structures, 3-, 9-, and 20-story buildings designed for the SAC project for the Los Angeles, California region, the goal of this study is to provide a clear basis to evaluate the efficacy of various structural control strategies. A nonlinear evaluation model has been developed that portrays the salient features of the structural system. Evaluation criteria and control constraints are presented for the design problems. The task of each participant in this benchmark study is to define (including sensors and control algorithms), evaluate, and report on their proposed control strategies. These strategies may be either passive, active, semiactive, or a combination thereof. The benchmark control problems will then facilitate direct comparison of the relative merits of the various control strategies. To illustrate some of the design challenges, a sample control strategy employing active control with a linear quadratic Gaussian control algorithm is applied to the 20-story building.

...read moreread less

609 citations

Book Chapter•DOI•

Linear road: a stream data management benchmark

[...]

Arvind Arasu¹, Mitch Cherniack², Eduardo Galvez², David Maier³, Anurag S. Maskey², Esther Ryvkina², Michael Stonebraker⁴, Richard Tibbetts⁴ - Show less +4 more•Institutions (4)

Stanford University¹, Brandeis University², Oregon Health & Science University³, Massachusetts Institute of Technology⁴

31 Aug 2004

TL;DR: Results show that a dedicated Stream Data Management System can outperform a Relational Database by at least a factor of 5 on streaming data applications.

...read moreread less

Abstract: This paper specifies the Linear Road Benchmark for Stream Data Management Systems (SDMS). Stream Data Management Systems process streaming data by executing continuous and historical queries while producing query results in real-time. This benchmark makes it possible to compare the performance characteristics of SDMS' relative to each other and to alternative (e.g., Relational Database) systems. Linear Road has been endorsed as an SDMS benchmark by the developers of both the Aurora [1] (out of Brandeis University, Brown University and MIT) and STREAM [8] (out of Stanford University) stream systems. Linear Road simulates a toll system for the motor vehicle expressways of a large metropolitan area. The tolling system uses "variable tolling" [6, 11, 9]: an increasingly prevalent tolling technique that uses such dynamic factors as traffic congestion and accident proximity to calculate toll charges. Linear Road specifies a variable tolling system for a fictional urban area including such features as accident detection and alerts, traffic congestion measurements, toll calculations and historical queries. After specifying the benchmark, we describe experimental results involving two implementations: one using a commercially available Relational Database and the other using Aurora. Our results show that a dedicated Stream Data Management System can outperform a Relational Database by at least a factor of 5 on streaming data applications.

...read moreread less

400 citations

Journal Article•DOI•

Neural network techniques for financial performance prediction: integrating fundamental and technical analysis

[...]

Monica Lam¹•Institutions (1)

California State University, Sacramento¹

01 Sep 2004

TL;DR: Experimental results indicate that neural networks using 1 year's or multiple years' financial data consistently and significantly outperform the minimum benchmark, but not the maximum benchmark, and rule extraction as a postprocessing technique for improving prediction accuracy and for explaining the prediction logic to financial decision makers is demonstrated.

...read moreread less

Abstract: This research project investigates the ability of neural networks, specifically, the backpropagation algorithm, to integrate fundamental and technical analysis for financial performance prediction. The predictor attributes include 16 financial statement variables and 11 macroeconomic variables. The rate of return on common shareholders' equity is used as the to-be-predicted variable. Financial data of 364 S&P companies are extracted from the CompuStat database, and macroeconomic variables are extracted from the Citibase database for the study period of 1985-1995. Used as predictors in Experiments 1, 2, and 3 are the 1 year's, the 2 years', and the 3 years' financial data, respectively. Experiment 4 has 3 years' financial data and macroeconomic data as predictors. Moreover, in order to compensate for data noise and parameter misspecification as well as to reveal prediction logic and procedure, we apply a rule extraction technique to convert the connection weights from trained neural networks to symbolic classification rules. The performance of neural networks is compared with the average return from the top one-third returns in the market (maximum benchmark) that approximates the return from perfect information as well as with the overall market average return (minimum benchmark) that approximates the return from highly diversified portfolios. Paired t tests are carried out to calculate the statistical significance of mean differences. Experimental results indicate that neural networks using 1 year's or multiple years' financial data consistently and significantly outperform the minimum benchmark, but not the maximum benchmark. As for neural networks with both financial and macroeconomic predictors, they do not outperform the minimum or maximum benchmark in this study. The experimental results also show that the average return of 0.25398 from extracted rules is the only compatible result to the maximum benchmark of 0.2786. Consequentially, we demonstrate rule extraction as a postprocessing technique for improving prediction accuracy and for explaining the prediction logic to financial decision makers.

...read moreread less

361 citations

Journal Article•DOI•

[...]

Kam Yim Sze¹, X. H. Liu¹, S. H. Lo¹•Institutions (1)

University of Hong Kong¹

01 Jul 2004-Finite Elements in Analysis and Design

TL;DR: In this article, the results of geometric nonlinear benchmark problems of shells are presented in the form of load-deflection curves and the relative convergent difficulty of the problems are revealed by the number of load increments and the total number of iterations required by an automatic load increment scheme for attaining the converged solutions under the maximum loads.

...read moreread less

357 citations

Journal Article•

Survey and Benchmark of Block Ciphers for Wireless Sensor Networks

[...]

Yee Wei Law¹, Jeroen Doumen¹, Pieter H. Hartel¹•Institutions (1)

University of Twente¹

01 Jan 2004-CTIT technical report series

TL;DR: In this article, the authors have identified the candidates of block ciphers suitable for WSNs based on existing literature and devised a systematic framework that not only considers the security properties but also the storage and energy-efficency of the candidates.

...read moreread less

Abstract: Choosing the most storage- and energy-efficient block cipher specifically for wireless sensor networks (WSNs) is not as straightforward as it seems. To our knowledge so far, there is no systematic evaluation framework for the purpose. In this paper, we have identified the candidates of block ciphers suitable for WSNs based on existing literature. For evaluating and assessing these candidates, we have devised a systematic framework that not only considers the security properties but also the storage- and energy-efficency of the candidates. Finally, based on the evaluation results, we have selected the suitable ciphers for WSNs, namely Rijndael for high security and energy efficiency requirements; and MISTY1 for good storage and energy efficiency.

...read moreread less

271 citations

Journal Article•DOI•

A multi-algorithm, multi-timescale method for cell simulation

[...]

Kouichi Takahashi¹, Kazunari Kaizu¹, Bin Hu¹, Masaru Tomita¹•Institutions (1)

Keio University¹

01 Mar 2004-Bioinformatics

TL;DR: A modular, object-oriented simulation meta-algorithm based on a discrete-event scheduler and Hermite polynomial interpolation has been developed and implemented and it is shown that this new method can efficiently handle many components driven by different algorithms and different timescales.

...read moreread less

Abstract: Motivation: Many important problems in cell biology require the dense nonlinear interactions between functional modules to be considered. The importance of computer simulation in understanding cellular processes is now widely accepted, and a variety of simulation algorithms useful for studying certain subsystems have been designed. Many of these are already widely used, and a large number of models constructed on these existing formalisms are available. A significant computational challenge is how we can integrate such sub-cellular models running on different types of algorithms to construct higher order models. Results: A modular, object-oriented simulation meta-algorithm based on a discrete-event scheduler and Hermite polynomial interpolation has been developed and implemented. It is shown that this new method can efficiently handle many components driven by different algorithms and different timescales. The utility of this simulation framework is demonstrated further with a 'composite' heat-shock response model that combines the Gillespie--Gibson stochastic algorithm and deterministic differential equations. Dramatic improvements in performance were obtained without significant accuracy drawbacks. A multi-timescale demonstration of coupled harmonic oscillators is also shown. Availability: An implementation of the method is available as part of E-Cell Simulation Environment Version 3 downloadable from http://www.e-cell.org/software. Benchmark models are included in the package, and also available upon request. Supplementary information: Complete lists of reactions and parameters of the heat-shock model, and more results are available at http://www.e-cell.org/bioinfo/takahashi03-1-supp.pdf

...read moreread less

194 citations

Proceedings Article•DOI•

Formal online methods for voltage/frequency control in multiple clock domain microprocessors

[...]

Qiang Wu¹, Philo Juang¹, Margaret Martonosi¹, Douglas W. Clark¹•Institutions (1)

Princeton University¹

07 Oct 2004

TL;DR: This paper presents an effective online DVFS scheme for an MCD processor which takes a formal analytic approach, is driven by dynamic workloads, and is suitable for all applications and can be generalized for energy control in processors other than MCD, such as tiled stream processors.

...read moreread less

Abstract: Multiple Clock Domain (MCD) processors are a promising future alternative to today's fully synchronous designs Dynamic Voltage and Frequency Scaling (DVFS) in an MCD processor has the extra flexibility to adjust the voltage and frequency in each domain independently Most existing DVFS approaches are profile-based offline schemes which are mainly suitable for applications whose execution char-acteristics are constrained and repeatable While some work has been published about online DVFS schemes, the prior approaches are typically heuristic-based In this paper, we present an effective online DVFS scheme for an MCD processor which takes a formal analytic approach, is driven by dynamic workloads, and is suitable for all applications In our approach, we model an MCD processor as a queue-domain network and the online DVFS as a feedback control problem with issue queue occupancies as feedback signals A dynamic stochastic queuing model is first proposed and linearized through an accu-rate linearization technique A controller is then designed and verified by stability analysis Finally we evaluate our DVFS scheme through a cycle-accurate simulation with a broad set of applications selected from MediaBench and SPEC2000 benchmark suites Compared to the best-known prior approach, which is heuristic-based, the proposed online DVFS scheme is substantially more effective due to its automatic regulation ability For example, we have achieved a 2-3 fold increase in efficiency in terms of energy-delay product improvement In addition, our control theoretic technique is more resilient, requires less tuning effort, and has better scalability as compared to prior online DVFS schemesWe believe that the techniques and methodology described in this paper can be generalized for energy control in processors other than MCD, such as tiled stream processors

...read moreread less

Journal Article•DOI•

Benchmarking for comparative evaluation of RP systems and processes

[...]

M. Mahesh¹, Y.S. Wong, Jerry Y. H. Fuh, Han Tong Loh•Institutions (1)

National University of Singapore¹

01 Apr 2004-Rapid Prototyping Journal

TL;DR: The ability of the benchmark part to determine achievable geometric features and accuracy by the aforementioned RP processes is presented and discussed.

...read moreread less

Abstract: A geometric benchmark part is proposed, designed and fabricated for the performance evaluation of rapid prototyping machines/processes. The benchmark part incorporates key shapes and features of better‐known benchmark parts. It also includes new geometric features, such as freeform surfaces, certain mechanical features and pass‐fail features that are increasingly required or expected of RP processes/systems. The part is suitable for fabrication on a typical RP machines. In this paper, the application of the benchmark part is demonstrated using relatively common RP processes. The ability of the benchmark part to determine achievable geometric features and accuracy by the aforementioned RP processes is presented and discussed.

...read moreread less

The Princeton Shape Benchmark (Figures 1 and 2).

[...]

Philip Shilane, Patrick Min, Michael Kazhdan, Thomas Funkhouser

01 Jan 2004

Journal Article•

Design and implementation of a lightweight dynamic optimization system

[...]

Jiwei Lu¹, Howard Chen¹, Pen-Chung Yew, Wei-Chung Hsu¹•Institutions (1)

University of Minnesota¹

01 Apr 2004-Journal of Instruction-level Parallelism

TL;DR: This paper describes a software system of real implementation that detects performance problems of running applications and deploys optimizations to increase execution eciency, and presents this lightweight system as an example of using existing hardware and software to deploy speculative optimizations to improve a program’s runtime performance.

...read moreread less

Abstract: Many opportunities exist to improve micro-architectural performance due to performance events that are dicult to optimize at static compile time. Cache misses and branch mis-prediction patterns may vary for dierent micro-architectures using dierent inputs. Dynamic optimization provides an approach to address these and other performance events at runtime. This paper describes a software system of real implementation that detects performance problems of running applications and deploys optimizations to increase execution eciency. We discuss issues of detecting performance bottlenecks, generating optimized traces and redirecting execution from the original code to the dynamically optimized code. Our current system speeds up many of the CPU2000 benchmark programs having large numbers of D-Cache misses through dynamically deployed cache prefetching. For other applications that don’t benefit from our runtime optimization, the average cost is only 2% of execution time. We present this lightweight system as an example of using existing hardware and software to deploy speculative optimizations to improve a program’s runtime performance.

...read moreread less

Journal Article•DOI•

Warp Processors

[...]

Roman Lysecky¹, Greg Stitt², Frank Vahid²•Institutions (2)

University of Arizona¹, University of California, Riverside²

07 Jun 2004

TL;DR: This work developed a custom FPGA fabric specifically designed to enable lean place and route tools, and developed extremely fast and efficient versions of partitioning, decompilation, synthesis, technology mapping, placement, and routing.

...read moreread less

Abstract: We describe a new processing architecture, known as a warp processor, that utilizes a field-programmable gate array (FPGA) to improve the speed and energy consumption of a software binary executing on a microprocessor. Unlike previous approaches that also improve software using an FPGA but do so using a special compiler, a warp processor achieves these improvements completely transparently and operates from a standard binary. A warp processor dynamically detects the binary's critical regions, reimplements those regions as a custom hardware circuit in the FPGA, and replaces the software region by a call to the new hardware implementation of that region. While not all benchmarks can be improved using warp processing, many can, and the improvements are dramatically better than those achievable by more traditional architecture improvements. The hardest part of warp processing is that of dynamically reimplementing code regions on an FPGA, requiring partitioning, decompilation, synthesis, placement, and routing tools, all having to execute with minimal computation time and data memory so as to coexist on chip with the main processor. We describe the results of developing our warp processor. We developed a custom FPGA fabric specifically designed to enable lean place and route tools, and we developed extremely fast and efficient versions of partitioning, decompilation, synthesis, technology mapping, placement, and routing. Warp processors achieve overall application speedups of 6.3X with energy savings of 66% across a set of embedded benchmark applications. We further show that our tools utilize acceptably small amounts of computation and memory which are far less than traditional tools. Our work illustrates the feasibility and potential of warp processing, and we can foresee the possibility of warp processing becoming a feature in a variety of computing domains, including desktop, server, and embedded applications.

...read moreread less

Journal Article•DOI•

Packetization and routing analysis of on-chip multiprocessor networks

[...]

Terry Tao Ye¹, Luca Benini², Giovanni De Micheli¹•Institutions (2)

Stanford University¹, University of Bologna²

01 Feb 2004-Journal of Systems Architecture

TL;DR: This paper analyzes different routing schemes for packetized on-chip communication on a mesh network architecture, with particular emphasis on specific benefits and limitations of silicon VLSI implementations and proposes a contention-look-ahead on- chip routing scheme.

...read moreread less

Journal Article•DOI•

A Benchmark for Methods in Reverse Engineering and Model Discrimination: Problem Formulation and Solutions

[...]

Andreas Kremling¹, Sophia Fischer¹, Kapil G. Gadkar², Francis J. Doyle², Thomas Sauter³, Eric Bullinger⁴, Frank Allgöwer⁵, Ernst Dieter Gilles¹ - Show less +4 more•Institutions (5)

Max Planck Society¹, University of California, Santa Barbara², University of Luxembourg³, University of Liège⁴, University of Stuttgart⁵

01 Sep 2004-Genome Research

TL;DR: A benchmark problem is described for the reconstruction and analysis of biochemical networks given sampled experimental data and several solutions based on linear and nonlinear models are discussed.

...read moreread less

Abstract: A benchmark problem is described for the reconstruction and analysis of biochemical networks given sampled experimental data. The growth of the organisms is described in a bioreactor in which one substrate is fed into the reactor with a given feed rate and feed concentration. Measurements for some intracellular components are provided representing a small biochemical network. Problems of reverse engineering, parameter estimation, and identifiability are addressed. The contribution mainly focuses on the problem of model discrimination. If two or more model variants describe the available experimental data, a new experiment must be designed to discriminate between the hypothetical models. For the problem presented, the feed rate and feed concentration of a bioreactor system are available as control inputs. To verify calculated input profiles an interactive Web site (http://www.sysbio.de/projects/benchmark/) is provided. Several solutions based on linear and nonlinear models are discussed.

...read moreread less

Book Chapter•DOI•

A Comparison Between ACO Algorithms for the Set Covering Problem

[...]

Lucas Lessing¹, Irina Dumitrescu², Thomas Stützle¹•Institutions (2)

Technische Universität Darmstadt¹, HEC Montréal²

05 Sep 2004

TL;DR: The best performing ACO algorithms implemented, when combined with a fine-tuned local search procedure, reach excellent performance on a set of well known benchmark instances.

...read moreread less

Abstract: In this paper we present a study of several Ant Colony Optimization (ACO) algorithms for the Set Covering Problem. In our computational study we emphasize the influence of different ways of defining the heuristic information on the performance of the ACO algorithms. Finally, we show that the best performing ACO algorithms we implemented, when combined with a fine-tuned local search procedure, reach excellent performance on a set of well known benchmark instances.

...read moreread less

Book Chapter•DOI•

On Test Functions for Evolutionary Multi-objective Optimization

[...]

Tatsuya Okabe¹, Yaochu Jin¹, Markus Olhofer¹, Bernhard Sendhoff¹•Institutions (1)

Honda¹

18 Sep 2004

TL;DR: This paper presents a straightforward way to define benchmark problems with an arbitrary Pareto front both in the fitness and parameter spaces and introduces a difficulty measure based on the mapping of probability density functions from parameter to fitness space.

...read moreread less

Abstract: In order to evaluate the relative performance of optimization algorithms benchmark problems are frequently used. In the case of multi-objective optimization (MOO), we will show in this paper that most known benchmark problems belong to a constrained class of functions with piecewise linear Pareto fronts in the parameter space. We present a straightforward way to define benchmark problems with an arbitrary Pareto front both in the fitness and parameter spaces. Furthermore, we introduce a difficulty measure based on the mapping of probability density functions from parameter to fitness space. Finally, we evaluate two MOO algorithms for new benchmark problems.

...read moreread less

Journal Article•

On test functions for evolutionary multi-objective optimization

[...]

Tatsuya Okabe¹, Yaochu Jin¹, Markus Olhofer¹, Bernhard Sendhoff¹•Institutions (1)

Honda¹

01 Jan 2004-Lecture Notes in Computer Science

TL;DR: In this paper, the authors define benchmark problems with an arbitrary Pareto front both in the fitness and parameter spaces, and introduce a difficulty measure based on the mapping of probability density functions from parameter to fitness space.

...read moreread less

Journal Article•DOI•

Application of a genetic algorithm for optimal damper distribution within the nonlinear seismic benchmark building

[...]

N. Wongprasert, Michael D. Symans¹•Institutions (1)

Rensselaer Polytechnic Institute¹

15 Mar 2004-Journal of Engineering Mechanics-asce

TL;DR: In this paper, a genetic algorithm with integer representation was used to determine the optimal damper locations to control the seismic response of a 20-story benchmark building and the results from numerical simulations of the nonlinear benchmark building show that, depending on the objective function used, the optimal DAMPER locations can vary significantly.

...read moreread less

Abstract: This paper presents a systematic method for identifying the optimal damper distribution to control the seismic response of a 20-story benchmark building. A genetic algorithm with integer representation was used to determine the damper locations. Both H2- and H∞-norms of the linear system transfer function were utilized as the objective functions. Moreover, frequency weighting was incorporated into the objective functions so that the genetic algorithm emphasized minimization of the response in the second mode of vibration instead of the dominant first mode. The results from numerical simulations of the nonlinear benchmark building show that, depending on the objective function used, the optimal damper locations can vary significantly. However, most of dampers tend to be concentrated in the lowermost and uppermost stories. In general, the damper configurations evaluated herein performed well in terms of reducing the seismic response of the benchmark building in comparison to the uncontrolled building.

...read moreread less

Journal Article•DOI•

Custom-instruction synthesis for extensible-processor platforms

[...]

Fei Sun¹, Srivaths Ravi, Anand Raghunathan, Niraj K. Jha•Institutions (1)

Princeton University¹

30 Jan 2004-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: It is demonstrated that the number of custom instruction candidates grows rapidly with program size, leading to a large design space, and that the quality (speedup) of custom instructions varies significantly across this space, motivating the need for the proposed flow.

...read moreread less

Abstract: Efficiency and flexibility are critical, but often conflicting, design goals in embedded system design. The recent emergence of extensible processors promises a favorable tradeoff between efficiency and flexibility, while keeping design turnaround times short. Current extensible processor design flows automate several tedious tasks, but typically require designers to manually select the parts of the program that are to be implemented as custom instructions. In this work, we describe an automatic methodology to select custom instructions to augment an extensible processor, in order to maximize its efficiency for a given application program. We demonstrate that the number of custom instruction candidates grows rapidly with program size, leading to a large design space, and that the quality (speedup) of custom instructions varies significantly across this space, motivating the need for the proposed flow. Our methodology features cost functions to guide the custom instruction selection process, as well as static and dynamic pruning techniques to eliminate inferior parts of the design space from consideration. Furthermore, we employ a two-stage process, wherein a limited number of promising instruction candidates are first short-listed using efficient selection criteria, and then evaluated in more detail through cycle-accurate instruction set simulation and synthesis of the corresponding hardware, to identify the custom instruction combinations that result in the highest program speedup or maximize speedup under a given area constraint. We have evaluated the proposed techniques using a state-of-the-art extensible processor platform, in the context of a commercial design flow. Experiments with several benchmark programs indicate that custom processors synthesized using automatic custom instruction selection can result in large improvements in performance (up to 5.4/spl times/, an average of 3.4/spl times/), energy (up to 4.5/spl times/, an average of 3.2/spl times/), and energy-delay products (up to 24.2/spl times/, an average of 12.6/spl times/), while speeding up the design process significantly.

...read moreread less

MicroLib: A Case for the Quantitative Comparison of Micro-ArchitectureMechanisms

[...]

Daniel Gracia Perez, Gilles Mouchard, Olivier Temam

19 Jun 2004

TL;DR: This study is part of a broader effort, called MicroLib, an open library of modular simulators aimed at promoting the disclosure and sharing of simulator models, to outline that the lack of interoperable simulators and not disclosing simulators at publication time make it difficult to fairly assess the benefit of research ideas.

...read moreread less

Abstract: While most research papers on computer architectures include some performance measurements, these performance numbers tend to be distrusted. Up to the point that, after so many research articles on data cache architectures, for instance, few researchers have a clear view of what are the best data cache mechanisms. To illustrate the usefulness of a fair quantitative comparison, we have picked a target architecture component for which lots of optimizations have been proposed (data caches), and we have implemented most of the hardware data cache optimizations of the past 4 years in top conferences. Then we have ranked the different mechanisms, or more precisely, we have examined the impact of benchmark selection, process model precision,. . . on ranking, and obtained some surprising results. This study is part of a broader effort, called MicroLib, aimed at promoting the disclosure and sharing of simulator models.

...read moreread less

Journal Article•DOI•

The COST benchmark simulation model—current state and future perspective

[...]

Ulf Jeppsson, Marie-Noëlle Pons¹•Institutions (1)

Centre national de la recherche scientifique¹

01 Mar 2004-Control Engineering Practice

TL;DR: The COST benchmark as mentioned in this paper is a platform-independent simulation environment defining a plant layout, a simulation model, influent loads, test procedures and evaluation criteria, which has been developed in close collaboration with the IWA Task Group on Respirometry.

...read moreread less

Proceedings Article•DOI•

Using AspectJ to separate concerns in parallel scientific Java code

[...]

Bruno Harbulot¹, John R. Gurd¹•Institutions (1)

University of Manchester¹

22 Mar 2004

TL;DR: It is concluded that: (1) scientific software is rarely produced in true object-oriented style; and (2) the inherent loop structure of many scientific algorithms is incompatible with the join point philosophy of AspectJ.

...read moreread less

Abstract: Scientific software frequently demands high performance in order to execute complex models in acceptable time. A major means of obtaining high performance is via parallel execution on multi-processor systems. However, traditional methods of programming for parallel execution can lead to substantial code-tangling where the needs of the mathematical model crosscut with the concern of parallel execution.Aspect-Oriented Programming is an attractive technology for solving the problem of code-tangling in high performance parallel scientific software. The underlying mathematical model and the parallelism can be treated as separate concerns and programmed accordingly. Their elements of code can then be woven together to produce the final application. This paper investigates the extent to which AspectJ technology can be used to achieve the desired separation of concerns in programs from the Java Grande Forum benchmark suite, a set of test applications for evaluation of the performance of Java in the context of numerical computation. The paper analyses three different benchmark programs and classifies the degrees of difficulty in separating concerns within them in a form suitable for AspectJ. This leads to an assessment of the influence of the design of a numerical application on the ability of AspectJ to solve this kind of code-tangling problem. It is concluded that: (1) scientific software is rarely produced in true object-oriented style; and (2) the inherent loop structure of many scientific algorithms is incompatible with the join point philosophy of AspectJ.Since AspectJ cannot intercept the iterations of for-loops (which are at the heart of high-performance computing), various object-oriented models are proposed for describing (embarrassingly parallel) rectangular double-nested forloops that make it possible to use AspectJ for encapsulating parallelisation in an aspect. Finally, a test-case using these models is presented, together with performance results obtained on various Java Virtual Machines.

...read moreread less

Book Chapter•DOI•

Simple Feasibility Rules and Differential Evolution for Constrained Optimization

[...]

Efrén Mezura-Montes¹, Carlos A. Coello Coello¹, Edy I. Tun-Morales²•Institutions (2)

CINVESTAV¹, Villahermosa Institute of Technology²

26 Apr 2004

TL;DR: In this paper, the authors proposed a differential evolution algorithm to solve constrained optimization problems, which does not require any extra parameters other than those normally adopted by the Differential Evolution algorithm and uses three simple selection criteria based on feasibility to guide the search to the feasible region.

...read moreread less

Abstract: In this paper, we propose a differential evolution algorithm to solve constrained optimization problems. Our approach uses three simple selection criteria based on feasibility to guide the search to the feasible region. The proposed approach does not require any extra parameters other than those normally adopted by the Differential Evolution algorithm. The present approach was validated using test functions from a well-known benchmark commonly adopted to validate constraint-handling techniques used with evolutionary algorithms. The results obtained by the proposed approach are very competitive with respect to other constraint-handling techniques that are representative of the state-of-the-art in the area.

...read moreread less

Journal Article•DOI•

New Bayesian Model Updating Algorithm Applied to a Structural Health Monitoring Benchmark

[...]

Jianye Ching¹, James L. Beck¹•Institutions (1)

California Institute of Technology¹

01 Dec 2004-Structural Health Monitoring-an International Journal

TL;DR: The results show that the probabilistic approach is able to successfully detect and locate the simulated damage involving stiffness loss in the braces of the analytical benchmark model based on simulated ambient-vibration data.

...read moreread less

Abstract: A new two-step approach for probabilistic structural health monitoring is presented, which involves modal identification followed by damage assessment using the pre- and post-damage modal parameters based on a new Bayesian model updating algorithm. The new approach aims to attack the structural health monitoring problems with incomplete modeshape information by including the underlying full modeshapes of the system as extra random variables, and by employing the Expectation-Maximisation algorithm to determine the most probable parameter values. The non-concave non-linear optimisation problem associated with incomplete modeshape cases is converted into two coupled quadratic optimisation problems, so that the computation becomes simpler and more robust. We illustrate the new approach by analysing the Phase II Simulated Benchmark problems sponsored by the IASC-ASCE Task Group on Structural Health Monitoring. The results of the analysis show that the probabilistic approach is able to successfully detect and locate the simulated damage involving stiffness loss in the braces of the analytical benchmark model based on simulated ambient-vibration data.

...read moreread less

Journal Article•DOI•

Topomers: a validated protocol for their self-consistent generation.

[...]

Robert J. Jilek, Richard D. Cramer

22 May 2004-Journal of Chemical Information and Computer Sciences

TL;DR: A general protocol is detailed for deterministically generating self-consistent shapes of molecular fragments from their topologies and other extensions to the topomer methodology are validated by repetition of earlier benchmark studies.

...read moreread less

Abstract: The hypothesis underlying topomer development is that describing molecular structures consistently may be at least as productive as describing them more realistically but incompletely. A general protocol is detailed for deterministically generating self-consistent shapes of molecular fragments from their topologies. These and other extensions to the topomer methodology are validated by repetition of earlier benchmark studies.

...read moreread less

Journal Article•

Simple feasibility rules and differential evolution for constrained optimization

[...]

Efrén Mezura-Montes¹, Carlos A. Coello Coello¹, Edy I. Tun-Morales²•Institutions (2)

CINVESTAV¹, Villahermosa Institute of Technology²

01 Jan 2004-Lecture Notes in Computer Science

TL;DR: The present approach uses three simple selection criteria based on feasibility to guide the search to the feasible region and results obtained are very competitive with respect to other constraint-handling techniques that are representative of the state-of-the-art in the area.

...read moreread less

Proceedings Article•DOI•

Dynamic FPGA routing for just-in-time FPGA compilation

[...]

R. Lyseckya¹, Frank Vahid¹, Sheldon X.-D. Tan¹•Institutions (1)

University of California, Riverside¹

07 Jun 2004

TL;DR: The concept of a standard hardware binary is introduced, using a just-in-time compiler to compile the hardware binary to an FPGA, and the Riverside On-Chip Router (ROCR) designed to efficiently route a hardware circuit for a simple configurable logic fabric is presented.

...read moreread less

Abstract: Just-in-time (JIT) compilation has previously been used in many applications to enable standard software binaries to execute on different underlying processor architectures. However, embedded systems increasingly incorporate Field Programmable Gate Arrays (FPGAs), for which the concept of a standard hardware binary did not previously exist, requiring designers to implement a hardware circuit for a single specific FPGA. We introduce the concept of a standard hardware binary, using a just-in-time compiler to compile the hardware binary to an FPGA. A JIT compiler for FPGAs requires the development of lean versions of technology mapping, placement, and routing algorithms, of which routing is the most computationally and memory expensive step. We present the Riverside On-Chip Router (ROCR) designed to efficiently route a hardware circuit for a simple configurable logic fabric that we have developed. Through experiments with MCNC benchmark hardware circuits, we show that ROCR works well for JIT FPGA compilation, producing good hardware circuits using an order of magnitude less memory resources and execution time compared with the well known Versatile Place and Route (VPR) tool suite. ROCR produces good hardware circuits using 13X less memory and executing 10X faster than VPR's fastest routing algorithm. Furthermore, our results show ROCR requires only 10% additional routing resources, and results in circuit speeds only 32% slower than VPR's timing-driven router, and speeds that are actually 10% faster than VPR's routability-driven router.

...read moreread less

Collapse