
Showing papers on "Benchmark (computing) published in 2009"


Proceedings ArticleDOI
04 Oct 2009
TL;DR: This characterization shows that the Rodinia benchmarks cover a wide range of parallel communication patterns, synchronization techniques and power consumption, and has led to some important architectural insight, such as the growing importance of memory-bandwidth limitations and the consequent importance of data layout.
Abstract: This paper presents and characterizes Rodinia, a benchmark suite for heterogeneous computing. To help architects study emerging platforms such as GPUs (Graphics Processing Units), Rodinia includes applications and kernels which target multi-core CPU and GPU platforms. The choice of applications is inspired by Berkeley's dwarf taxonomy. Our characterization shows that the Rodinia benchmarks cover a wide range of parallel communication patterns, synchronization techniques and power consumption, and has led to some important architectural insight, such as the growing importance of memory-bandwidth limitations and the consequent importance of data layout.

2,697 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: The Caltech Pedestrian Dataset is introduced, which is two orders of magnitude larger than existing datasets and proposes improved evaluation metrics, demonstrating that commonly used per-window measures are flawed and can fail to predict performance on full images.
Abstract: Pedestrian detection is a key problem in computer vision, with several applications including robotics, surveillance and automotive safety. Much of the progress of the past few years has been driven by the availability of challenging public datasets. To continue the rapid rate of innovation, we introduce the Caltech Pedestrian Dataset, which is two orders of magnitude larger than existing datasets. The dataset contains richly annotated video, recorded from a moving vehicle, with challenging images of low resolution and frequently occluded people. We propose improved evaluation metrics, demonstrating that commonly used per-window measures are flawed and can fail to predict performance on full images. We also benchmark several promising detection systems, providing an overview of state-of-the-art performance and a direct, unbiased comparison of existing methods. Finally, by analyzing common failure cases, we help identify future research directions for the field.
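The per-image evaluation style the paper advocates reduces, at a single operating point, to counting detections against ground truth per image; a sketch (illustrative only, not the paper's code; the box-matching step that yields the per-image counts is assumed to exist upstream):

```python
def per_image_metrics(per_image_results, num_images):
    """Per-image evaluation at one operating point: given
    (true_positives, false_positives, ground_truth_count) per image,
    return (miss rate, false positives per image)."""
    tp = sum(r[0] for r in per_image_results)
    fp = sum(r[1] for r in per_image_results)
    gt = sum(r[2] for r in per_image_results)
    return 1.0 - tp / gt, fp / num_images
```

Sweeping the detector's confidence threshold and recomputing these two numbers traces the miss-rate-versus-FPPI curves used in the paper's comparisons.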

1,329 citations


Proceedings ArticleDOI
29 Jun 2009
TL;DR: A benchmark consisting of a collection of tasks that are run on an open source version of MR as well as on two parallel DBMSs shows a dramatic performance difference between the two paradigms.
Abstract: There is currently considerable enthusiasm around the MapReduce (MR) paradigm for large-scale data analysis [17]. Although the basic control flow of this framework has existed in parallel SQL database management systems (DBMS) for over 20 years, some have called MR a dramatically new computing model [8, 17]. In this paper, we describe and compare both paradigms. Furthermore, we evaluate both kinds of systems in terms of performance and development complexity. To this end, we define a benchmark consisting of a collection of tasks that we have run on an open source version of MR as well as on two parallel DBMSs. For each task, we measure each system's performance for various degrees of parallelism on a cluster of 100 nodes. Our results reveal some interesting trade-offs. Although the process to load data into and tune the execution of parallel DBMSs took much longer than the MR system, the observed performance of these DBMSs was strikingly better. We speculate about the causes of the dramatic performance difference and consider implementation concepts that future systems should take from both kinds of architectures.
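The control flow the paper compares can be sketched as a minimal in-process MapReduce skeleton (illustrative only; real MR systems add partitioning, disk spills, and fault tolerance):

```python
from collections import defaultdict
from itertools import chain

def map_reduce(records, mapper, reducer):
    """Minimal in-process MapReduce skeleton: map each record to
    (key, value) pairs, group the values by key, then reduce each group."""
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map(mapper, records)):
        groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

# Word count, the canonical example of the paradigm:
counts = map_reduce(["a b a", "b"],
                    lambda line: [(w, 1) for w in line.split()],
                    lambda key, values: sum(values))
```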

1,188 citations


Journal ArticleDOI
TL;DR: A family of improved variants of the DE/target-to-best/1/bin scheme, which utilizes the concept of the neighborhood of each population member, and is shown to be statistically significantly better than or at least comparable to several existing DE variants as well as a few other significant evolutionary computing techniques over a test suite of 24 benchmark functions.
Abstract: Differential evolution (DE) is well known as a simple and efficient scheme for global optimization over continuous spaces. It has reportedly outperformed a few evolutionary algorithms (EAs) and other search heuristics like the particle swarm optimization (PSO) when tested over both benchmark and real-world problems. DE, however, is not completely free from the problems of slow and/or premature convergence. This paper describes a family of improved variants of the DE/target-to-best/1/bin scheme, which utilizes the concept of the neighborhood of each population member. The idea of small neighborhoods, defined over the index-graph of parameter vectors, draws inspiration from the community of the PSO algorithms. The proposed schemes balance the exploration and exploitation abilities of DE without imposing serious additional burdens in terms of function evaluations. They are shown to be statistically significantly better than or at least comparable to several existing DE variants as well as a few other significant evolutionary computing techniques over a test suite of 24 benchmark functions. The paper also investigates the applications of the new DE variants to two real-life problems concerning parameter estimation for frequency modulated sound waves and spread spectrum radar poly-phase code design.
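The baseline scheme the paper builds on, DE/target-to-best/1/bin, can be sketched as follows (a minimal illustration, not the authors' neighborhood-based variants; `F` and `CR` are the usual scale factor and crossover rate):

```python
import random

def de_target_to_best_mutation(pop, i, best_idx, F=0.8):
    """DE/target-to-best/1 mutation:
    v_i = x_i + F*(x_best - x_i) + F*(x_r1 - x_r2)."""
    r1, r2 = random.sample([j for j in range(len(pop)) if j != i], 2)
    return [pop[i][d]
            + F * (pop[best_idx][d] - pop[i][d])
            + F * (pop[r1][d] - pop[r2][d])
            for d in range(len(pop[i]))]

def bin_crossover(target, donor, CR=0.9):
    """Binomial crossover: each gene comes from the donor with probability
    CR; one randomly chosen gene is always taken from the donor."""
    jrand = random.randrange(len(target))
    return [donor[j] if (random.random() < CR or j == jrand) else target[j]
            for j in range(len(target))]
```

The paper's variants replace the global `best_idx` with the best member of a small index-graph neighborhood of particle `i`, trading some exploitation for exploration.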

1,086 citations


Journal ArticleDOI
TL;DR: The basic ideas behind the previous benchmark are extended to generate directed and weighted networks with built-in community structure, and the possibility that nodes belong to more than one community is considered, a feature occurring in real systems, such as social networks.
Abstract: Many complex networks display a mesoscopic structure with groups of nodes sharing many links with the other nodes in their group and comparatively few with nodes of different groups. This feature is known as community structure and encodes precious information about the organization and the function of the nodes. Many algorithms have been proposed, but it is not yet clear how they should be tested. Recently we proposed a general class of undirected and unweighted benchmark graphs, with heterogeneous distributions of node degree and community size. Increasing attention has recently been devoted to developing algorithms able to consider the direction and the weight of the links, which require suitable benchmark graphs for testing. In this paper we extend the basic ideas behind our previous benchmark to generate directed and weighted networks with built-in community structure. We also consider the possibility that nodes belong to more than one community, a feature occurring in real systems, such as social networks. As a practical application, we show how modularity optimization performs on our benchmark.

963 citations


Journal ArticleDOI
TL;DR: A novel optimization algorithm, group search optimizer (GSO), which is inspired by animal behavior, especially animal searching behavior, and has competitive performance to other EAs in terms of accuracy and convergence speed, especially on high-dimensional multimodal problems.
Abstract: Nature-inspired optimization algorithms, notably evolutionary algorithms (EAs), have been widely used to solve various scientific and engineering problems because of their simplicity and flexibility. Here we report a novel optimization algorithm, group search optimizer (GSO), which is inspired by animal behavior, especially animal searching behavior. The framework is mainly based on the producer-scrounger model, which assumes that group members search either for "finding" (producer) or for "joining" (scrounger) opportunities. Based on this framework, concepts from animal searching behavior, e.g., animal scanning mechanisms, are employed metaphorically to design optimum searching strategies for solving continuous optimization problems. When tested against benchmark functions, in low and high dimensions, the GSO algorithm has competitive performance to other EAs in terms of accuracy and convergence speed, especially on high-dimensional multimodal problems. The GSO algorithm is also applied to train artificial neural networks. The promising results on three real-world benchmark problems show the applicability of GSO for problem solving.

658 citations


Journal ArticleDOI
27 Jul 2009
TL;DR: The results suggest that people are remarkably consistent in the way that they segment most 3D surface meshes, that no one automatic segmentation algorithm is better than the others for all types of objects, and that algorithms based on non-local shape features seem to produce segmentations that most closely resemble ones made by humans.
Abstract: This paper describes a benchmark for evaluation of 3D mesh segmentation algorithms. The benchmark comprises a data set with 4,300 manually generated segmentations for 380 surface meshes of 19 different object categories, and it includes software for analyzing 11 geometric properties of segmentations and producing 4 quantitative metrics for comparison of segmentations. The paper investigates the design decisions made in building the benchmark, analyzes properties of human-generated and computer-generated segmentations, and provides quantitative comparisons of 7 recently published mesh segmentation algorithms. Our results suggest that people are remarkably consistent in the way that they segment most 3D surface meshes, that no one automatic segmentation algorithm is better than the others for all types of objects, and that algorithms based on non-local shape features seem to produce segmentations that most closely resemble ones made by humans.
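One common way to score agreement between two segmentations of the same mesh is a Rand-index-style pair count; the sketch below (illustrative, not the benchmark's own software) treats a segmentation as a per-face label list:

```python
from itertools import combinations

def rand_index(seg_a, seg_b):
    """Rand index between two segmentations given as per-face labels:
    the fraction of face pairs on which the segmentations agree about
    being in the same segment vs. different segments (1.0 = identical
    up to relabeling)."""
    agree = total = 0
    for i, j in combinations(range(len(seg_a)), 2):
        total += 1
        if (seg_a[i] == seg_a[j]) == (seg_b[i] == seg_b[j]):
            agree += 1
    return agree / total
```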

592 citations


01 Jan 2009
TL;DR: The testbed of noise-free functions is defined and motivated; participants need only run their favorite black-box real-parameter optimizer in a few dimensions a few hundred times and execute the provided post-processing script afterwards.
Abstract: Quantifying and comparing performance of optimization algorithms is one important aspect of research in search and optimization. However, this task turns out to be tedious and difficult to realize even in the single-objective case -- at least if one is willing to accomplish it in a scientifically decent and rigorous way. The BBOB 2009 workshop will furnish most of this tedious task for its participants: (1) choice and implementation of a well-motivated real-parameter benchmark function testbed, (2) design of an experimental set-up, (3) generation of data output for (4) post-processing and presentation of the results in graphs and tables. What remains to be done for the participants is to allocate CPU-time, run their favorite black-box real-parameter optimizer in a few dimensions a few hundreds of times and execute the provided post-processing script afterwards. In this report, the testbed of noise-free functions is defined and motivated.
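The experimental loop that BBOB automates can be illustrated as follows, with random search standing in for the participant's optimizer and the sphere function standing in for the testbed (both are stand-ins, not part of the workshop's actual setup):

```python
import random

def sphere(x):
    """The simplest noise-free test function: f(x) = sum of squares."""
    return sum(xi * xi for xi in x)

def run_trial(f, dim, target, budget, rng):
    """One benchmarking trial: run a (here: random-search) optimizer until
    the target value is reached, returning the number of function
    evaluations used, or None if the budget is exhausted."""
    best = float("inf")
    for evals in range(1, budget + 1):
        x = [rng.uniform(-5, 5) for _ in range(dim)]
        best = min(best, f(x))
        if best <= target:
            return evals
    return None
```

Recording evaluations-to-target over many such trials, dimensions, and functions is exactly the kind of bookkeeping the workshop's framework handles for its participants.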

521 citations


Journal ArticleDOI
TL;DR: In this article, benchmark configurations for quantitative validation and comparison of incompressible interfacial flow codes, which model two-dimensional bubbles rising in liquid columns, are proposed, and the benchmark quantities: circularity, center of mass, and mean rise velocity are defined and measured to monitor convergence toward a reference solution.
Abstract: Benchmark configurations for quantitative validation and comparison of incompressible interfacial flow codes, which model two-dimensional bubbles rising in liquid columns, are proposed. The benchmark quantities: circularity, center of mass, and mean rise velocity are defined and measured to monitor convergence toward a reference solution. Comprehensive studies are undertaken by three independent research groups, two representing Eulerian level set finite-element codes and one representing an arbitrary Lagrangian-Eulerian moving grid approach.
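Two of the benchmark quantities are straightforward to compute from a polygonal approximation of the interface; a sketch (assuming the bubble boundary is given as a closed polygon of vertices, which is an illustration rather than any of the three codes' actual discretizations):

```python
import math

def polygon_area_perimeter(pts):
    """Shoelace area and perimeter of a closed polygon of (x, y) vertices."""
    area = perim = 0.0
    n = len(pts)
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]
        area += x0 * y1 - x1 * y0
        perim += math.hypot(x1 - x0, y1 - y0)
    return abs(area) / 2.0, perim

def circularity(pts):
    """Perimeter of the area-equivalent circle divided by the actual
    perimeter: exactly 1.0 for a circle, below 1.0 otherwise."""
    area, perim = polygon_area_perimeter(pts)
    return 2.0 * math.pi * math.sqrt(area / math.pi) / perim
```

The center of mass and rise velocity follow similarly from first moments of the bubble region over time.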

486 citations


Journal ArticleDOI
TL;DR: This work investigates diagnostic accuracy as a function of several parameters (such as quality and quantity of the program spectra collected during the execution of the system) and shows that SFL can effectively be applied in the context of embedded software development in an industrial environment.

443 citations


Journal ArticleDOI
TL;DR: In this work, a seeker optimization algorithm (SOA)-based reactive power dispatch method is proposed, based on the concept of simulating the act of human searching, which is superior to the other listed algorithms and can be efficiently used for optimal reactive power dispatch.
Abstract: The optimal reactive power dispatch problem has a growing influence on the secure and economical operation of power systems. However, this issue is well known as a nonlinear, multimodal and mixed-variable problem. In the last decades, computational intelligence-based techniques, such as genetic algorithms (GAs), differential evolution (DE) algorithms and particle swarm optimization (PSO) algorithms, etc., have often been used for this aim. In this work, a seeker optimization algorithm (SOA)-based reactive power dispatch method is proposed. The SOA is based on the concept of simulating the act of human searching, where the search direction is based on the empirical gradient by evaluating the response to the position changes and the step length is based on uncertainty reasoning by using a simple fuzzy rule. In this study, the algorithm's performance is evaluated on benchmark function optimization. Then, the SOA is applied to optimal reactive power dispatch on standard IEEE 57- and 118-bus power systems, and compared with a conventional nonlinear programming method, two versions of GAs, three versions of DE algorithms and four versions of PSO algorithms. The simulation results show that the proposed approach is superior to the other listed algorithms and can be efficiently used for optimal reactive power dispatch.

Book
26 May 2009
TL;DR: This paper presents a component-wise decomposition of such an interdisciplinary multimedia system, covering influences from information retrieval, computer vision, machine learning, and human–computer interaction and lays down the anatomy of a concept-based video search engine.
Abstract: In this paper, we review 300 references on video retrieval, indicating when text-only solutions are unsatisfactory and showing the promising alternatives which are in majority concept-based. Therefore, central to our discussion is the notion of a semantic concept: an objective linguistic description of an observable entity. Specifically, we present our view on how its automated detection, selection under uncertainty, and interactive usage might solve the major scientific problem for video retrieval: the semantic gap. To bridge the gap, we lay down the anatomy of a concept-based video search engine. We present a component-wise decomposition of such an interdisciplinary multimedia system, covering influences from information retrieval, computer vision, machine learning, and human–computer interaction. For each of the components we review state-of-the-art solutions in the literature, each having different characteristics and merits. Because of these differences, we cannot understand the progress in video retrieval without serious evaluation efforts such as carried out in the NIST TRECVID benchmark. We discuss its data, tasks, results, and the many derived community initiatives in creating annotations and baselines for repeatable experiments. We conclude with our perspective on future challenges and opportunities.

01 Jan 2009
TL;DR: Quantifying and comparing the performance of numerical optimization algorithms is an important aspect of research in search and optimization, but this task turns out to be tedious and difficult to realize rigorously; this report thoroughly defines the experimental procedure and data formats.
Abstract: Quantifying and comparing performance of optimization algorithms is one important aspect of research in search and optimization. However, this task turns out to be tedious and difficult to realize even in the single-objective case -- at least if one is willing to accomplish it in a scientifically decent and rigorous way. The BBOB 2009 workshop will furnish most of this tedious task for its participants: (1) choice and implementation of a well-motivated single-objective benchmark function testbed, (2) design of an experimental set-up, (3) generation of data output for (4) post-processing and presentation of the results in graphs and tables. What remains to be done for the participants is to allocate CPU-time, run their favorite black-box real-parameter optimizer in a few dimensions a few hundreds of times and execute the provided post-processing script afterwards. Here, the experimental procedure and data formats are thoroughly defined and motivated and the data presentation is touched on.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: Several matting methods are evaluated with the proposed benchmark, showing that their performance varies depending on the error function; the challenging test set also reveals problems of existing algorithms not reflected in previously reported results.
Abstract: The availability of quantitative online benchmarks for low-level vision tasks such as stereo and optical flow has led to significant progress in the respective fields. This paper introduces such a benchmark for image matting. There are three key factors for a successful benchmarking system: (a) a challenging, high-quality ground truth test set; (b) an online evaluation repository that is dynamically updated with new results; (c) perceptually motivated error functions. Our new benchmark strives to meet all three criteria. We evaluated several matting methods with our benchmark and show that their performance varies depending on the error function. Also, our challenging test set reveals problems of existing algorithms, not reflected in previously reported results. We hope that our effort will lead to considerable progress in the field of image matting, and welcome the reader to visit our benchmark at www.alphamatting.com.
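Simple error functions of the kind the benchmark compares might look like the following (an illustration only; the paper's perceptually motivated functions, e.g. gradient- and connectivity-based errors, are more involved):

```python
def matting_errors(alpha_est, alpha_gt, unknown_mask):
    """SAD and MSE between estimated and ground-truth alpha mattes,
    restricted to the unknown region of the trimap. All arguments are
    flat, equal-length sequences; alpha values lie in [0, 1]."""
    diffs = [abs(a - g)
             for a, g, u in zip(alpha_est, alpha_gt, unknown_mask) if u]
    sad = sum(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    return sad, mse
```

Because different error functions rank methods differently, as the paper shows, reporting several of them side by side is part of the benchmark's design.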

Proceedings ArticleDOI
29 Mar 2009
TL;DR: SP^2Bench is a publicly available, language-specific performance benchmark for the SPARQL query language, settled in the DBLP scenario and comprising both a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries.
Abstract: Recently, the SPARQL query language for RDF has reached the W3C recommendation status. In response to this emerging standard, the database community is currently exploring efficient storage techniques for RDF data and evaluation strategies for SPARQL queries. A meaningful analysis and comparison of these approaches necessitates a comprehensive and universal benchmark platform. To this end, we have developed SP^2Bench, a publicly available, language-specific SPARQL performance benchmark. SP^2Bench is settled in the DBLP scenario and comprises both a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries. The generated documents mirror key characteristics and social-world distributions encountered in the original DBLP data set, while the queries implement meaningful requests on top of this data, covering a variety of SPARQL operator constellations and RDF access patterns. As a proof of concept, we apply SP^2Bench to existing engines and discuss their strengths and weaknesses that follow immediately from the benchmark results.

Journal ArticleDOI
TL;DR: This paper summarises the results of a benchmark study that compares a number of mathematical and numerical models applied to specific problems in the context of carbon dioxide (CO2) storage in geologic formations.
Abstract: This paper summarises the results of a benchmark study that compares a number of mathematical and numerical models applied to specific problems in the context of carbon dioxide (CO2) storage in geologic formations. The processes modelled comprise advective multi-phase flow, compositional effects due to dissolution of CO2 into the ambient brine and non-isothermal effects due to temperature gradients and the Joule–Thomson effect. The problems deal with leakage through a leaky well, methane recovery enhanced by CO2 injection and a reservoir-scale injection scenario into a heterogeneous formation. We give a description of the benchmark problems, then briefly introduce the participating codes and finally present and discuss the results of the benchmark study.

Journal ArticleDOI
TL;DR: A framework for analyzing the results of a SLAM approach based on a metric for measuring the error of the corrected trajectory is proposed, which overcomes serious shortcomings of approaches using a global reference frame to compute the error.
Abstract: In this paper, we address the problem of creating an objective benchmark for evaluating SLAM approaches. We propose a framework for analyzing the results of a SLAM approach based on a metric for measuring the error of the corrected trajectory. This metric uses only relative relations between poses and does not rely on a global reference frame. This overcomes serious shortcomings of approaches using a global reference frame to compute the error. Our method furthermore allows us to compare SLAM approaches that use different estimation techniques or different sensor modalities since all computations are made based on the corrected trajectory of the robot. We provide sets of relative relations needed to compute our metric for an extensive set of datasets frequently used in the robotics community. The relations have been obtained by manually matching laser-range observations to avoid the errors caused by matching algorithms. Our benchmark framework allows the user to easily analyze and objectively compare different SLAM approaches.
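The metric's core idea, comparing estimated relative transformations against ground-truth relations between selected pose pairs, can be sketched in 2D (an illustration of the idea rather than the paper's exact formulation; poses are (x, y, theta) tuples):

```python
import math

def rel_pose(p, q):
    """Pose q expressed in the frame of pose p (2D: x, y, theta)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    c, s = math.cos(-p[2]), math.sin(-p[2])
    theta = (q[2] - p[2] + math.pi) % (2 * math.pi) - math.pi
    return (c * dx - s * dy, s * dx + c * dy, theta)

def relation_error(traj, relations):
    """Average translational and rotational error of an estimated
    trajectory over ground-truth relative relations, given as
    (i, j, relative_pose_of_j_in_frame_of_i) tuples. No global
    reference frame is involved."""
    t_err = r_err = 0.0
    for i, j, gt in relations:
        est = rel_pose(traj[i], traj[j])
        t_err += math.hypot(est[0] - gt[0], est[1] - gt[1])
        r_err += abs((est[2] - gt[2] + math.pi) % (2 * math.pi) - math.pi)
    n = len(relations)
    return t_err / n, r_err / n
```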

Proceedings ArticleDOI
08 Jun 2009
TL;DR: Adagio is presented, a novel runtime system that makes DVS practical for complex, real-world scientific applications by incurring only negligible delay while achieving significant energy savings.
Abstract: Power and energy are first-order design constraints in high performance computing. Current research using dynamic voltage scaling (DVS) relies on trading increased execution time for energy savings, which is unacceptable for most high performance computing applications. We present Adagio, a novel runtime system that makes DVS practical for complex, real-world scientific applications by incurring only negligible delay while achieving significant energy savings. Adagio improves and extends previous state-of-the-art algorithms by combining the lessons learned from static energy-reducing CPU scheduling with a novel runtime mechanism for slack prediction. We present results using Adagio for two real-world programs, UMT2K and ParaDiS, along with the NAS Parallel Benchmark suite. While requiring no modification to the application source code, Adagio provides total system energy savings of 8% and 20% for UMT2K and ParaDiS, respectively, with less than 1% increase in execution time.

30 Jun 2009
TL;DR: This benchmark model deals with the wind turbine on a system level, and it includes sensor, actuator, and system faults, namely faults in the pitch system, the drive train, the generator, and the converter system.
Abstract: This paper presents a test benchmark model for the evaluation of fault detection and accommodation schemes. This benchmark model deals with the wind turbine on a system level, and it includes sensor, actuator, and system faults, namely faults in the pitch system, the drive train, the generator, and the converter system. Since it is a system-level model, converter and pitch system models are simplified because these are controlled by internal controllers working at higher frequencies than the system model. The model represents a three-bladed pitch-controlled variable-speed wind turbine with a nominal power of 4.8 MW. The fault detection and isolation (FDI) problem was addressed by several teams, and five of the solutions are compared in the second part of this paper. This comparison relies on additional test data in which the faults occur in different operating conditions than in the test data used for the FDI design.

Journal ArticleDOI
TL;DR: This paper proposes CHStone, a suite of benchmark programs for C-based high-level synthesis, which consists of a dozen large, easy-to-use programs written in C, selected from various application domains.
Abstract: In general, standard benchmark suites are critically important for researchers to quantitatively evaluate their new ideas and algorithms. This paper proposes CHStone, a suite of benchmark programs for C-based high-level synthesis. CHStone consists of a dozen large, easy-to-use programs written in C, which are selected from various application domains. This paper also analyzes the characteristics of the CHStone benchmark programs, which will be valuable for researchers to use CHStone for the evaluation of their new techniques. In addition, we present future challenges to be solved towards practical high-level synthesis.

Proceedings ArticleDOI
14 Nov 2009
TL;DR: This work investigates the design and scalability of work stealing on modern distributed memory systems and demonstrates high efficiency and low overhead when scaling to 8,192 processors for three benchmark codes: a producer-consumer benchmark, the unbalanced tree search benchmark, and a multiresolution analysis kernel.
Abstract: Irregular and dynamic parallel applications pose significant challenges to achieving scalable performance on large-scale multicore clusters. These applications often require ongoing, dynamic load balancing in order to maintain efficiency. Scalable dynamic load balancing on large clusters is a challenging problem which can be addressed with distributed dynamic load balancing systems. Work stealing is a popular approach to distributed dynamic load balancing; however its performance on large-scale clusters is not well understood. Prior work on work stealing has largely focused on shared memory machines. In this work we investigate the design and scalability of work stealing on modern distributed memory systems. We demonstrate high efficiency and low overhead when scaling to 8,192 processors for three benchmark codes: a producer-consumer benchmark, the unbalanced tree search benchmark, and a multiresolution analysis kernel.
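The work-stealing idea can be sketched with a toy single-threaded deque (real implementations, especially the distributed-memory ones studied here, need carefully synchronized or one-sided-communication operations, all elided):

```python
from collections import deque

class WorkStealingQueue:
    """Toy work-stealing deque: the owning worker pushes and pops tasks
    at the back (LIFO, good locality), while idle thieves steal from the
    front (FIFO, which tends to take the largest remaining subtasks)."""

    def __init__(self):
        self._tasks = deque()

    def push(self, task):          # called by the owner
        self._tasks.append(task)

    def pop(self):                 # called by the owner
        return self._tasks.pop() if self._tasks else None

    def steal(self):               # called by a thief
        return self._tasks.popleft() if self._tasks else None
```

On distributed memory, "stealing from the front" becomes a remote operation on another node's queue, which is where the overhead the paper measures comes from.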

Journal ArticleDOI
TL;DR: A benchmark model for simulation of fault detection and accommodation schemes of the wind turbine on a system level containing sensors, actuators and systems faults in the pitch system, drive train, generator and converter system is presented.

Proceedings ArticleDOI
15 Jun 2009
TL;DR: A new resource management scheme that integrates the Kalman filter into feedback controllers to dynamically allocate CPU resources to virtual machines hosting server applications; the controllers are also enhanced to deal with multi-tier server applications.
Abstract: Data center virtualization allows cost-effective server consolidation which can increase system throughput and reduce power consumption. Resource management of virtualized servers is an important and challenging task, especially when dealing with fluctuating workloads and complex multi-tier server applications. Recent results in control theory-based resource management have shown the potential benefits of adjusting allocations to match changing workloads. This paper presents a new resource management scheme that integrates the Kalman filter into feedback controllers to dynamically allocate CPU resources to virtual machines hosting server applications. The novelty of our approach is the use of the Kalman filter-the optimal filtering technique for state estimation in the sum of squares sense-to track the CPU utilizations and update the allocations accordingly. Our basic controllers continuously detect and self-adapt to unforeseen workload intensity changes. Our more advanced controller self-configures itself to any workload condition without any a priori information. Indicatively, it results in within 4.8% of the performance of workload-aware controllers under high intensity workload changes, and performs equally well under medium intensity traffic. In addition, our controllers are enhanced to deal with multi-tier server applications: by using the pair-wise resource coupling between application components, they provide a 3% on average server performance improvement when facing large unexpected workload increases when compared to controllers with no such resource-coupling mechanism. We evaluate our techniques by controlling a 3-tier Rubis benchmark web site deployed on a prototype Xen-virtualized cluster.
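The core estimator can be illustrated with a one-dimensional Kalman filter tracking utilization (a sketch under a random-walk model, not the paper's controllers; the noise variances are made-up defaults):

```python
class ScalarKalman:
    """1D Kalman filter for a slowly varying CPU utilization under a
    random-walk model: u_k = u_{k-1} + w_k,  y_k = u_k + v_k."""

    def __init__(self, q=0.01, r=0.1, u0=0.5, p0=1.0):
        self.q, self.r = q, r          # process / measurement noise variances
        self.u, self.p = u0, p0        # state estimate and its variance

    def update(self, y):
        self.p += self.q               # predict (state unchanged, variance grows)
        k = self.p / (self.p + self.r) # Kalman gain
        self.u += k * (y - self.u)     # correct with the new measurement
        self.p *= (1.0 - k)
        return self.u
```

A controller of the kind described would then set each VM's CPU allocation to the filtered utilization estimate plus some headroom, rather than reacting to raw, noisy samples.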

Journal ArticleDOI
TL;DR: This study explores the stress points of the EEMBC embedded benchmark suite using the benchmark characterization technique to understand how benchmarks stress the processors that they aim to test.
Abstract: Benchmark consumers expect benchmark suites to be complete, accurate, and consistent, and benchmark scores serve as relative measures of performance. However, it is important to understand how benchmarks stress the processors that they aim to test. This study explores the stress points of the EEMBC embedded benchmark suite using the benchmark characterization technique.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: A chip-level power control algorithm that is systematically designed based on optimal control theory that can precisely control the power of a CMP chip to the desired set point while maintaining the temperature of each core below a specified threshold is proposed.
Abstract: As chip multiprocessors (CMP) become the main trend in processor development, various power and thermal management strategies have recently been proposed to optimize system performance while controlling the power or temperature of a CMP chip to stay below a constraint. The availability of per-core DVFS (dynamic voltage and frequency scaling) also makes it possible to develop advanced management strategies. However, most existing solutions rely on open-loop search or optimization with the assumption that power can be estimated accurately, while others adopt oversimplified feedback control strategies to control power and temperature separately, without any theoretical guarantees. In this paper, we propose a chip-level power control algorithm that is systematically designed based on optimal control theory. Our algorithm can precisely control the power of a CMP chip to the desired set point while maintaining the temperature of each core below a specified threshold. Furthermore, an online model estimator is designed to achieve analytical assurance of control accuracy and system stability, even in the face of significant workload variations or unpredictable chip or core variations. Empirical results on a physical testbed show that our controller outperforms two state-of-the-art control algorithms by having better SPEC benchmark performance and more precise power control. In addition, extensive simulation results demonstrate the efficacy of our algorithm for various CMP configurations.

Proceedings ArticleDOI
You Zhou, Ying Tan
18 May 2009
TL;DR: A novel parallel approach to run standard particle swarm optimization (SPSO) on a Graphics Processing Unit (GPU) is presented, which shows special speed advantages on large swarm populations and high-dimensional problems and can be widely used in real-world optimization problems.
Abstract: A novel parallel approach to run standard particle swarm optimization (SPSO) on a Graphics Processing Unit (GPU) is presented in this paper. By using the general-purpose computing ability of the GPU and the Compute Unified Device Architecture (CUDA) software platform from NVIDIA, SPSO can be executed in parallel on the GPU. Experiments are conducted by running SPSO both on GPU and CPU, respectively, to optimize four benchmark test functions. The running time of the SPSO based on GPU (GPU-SPSO) is greatly shortened compared to that of the SPSO on CPU (CPU-SPSO). The running speed of GPU-SPSO can be more than 11 times as fast as that of CPU-SPSO, with the same performance. Compared to CPU-SPSO, GPU-SPSO shows special speed advantages on large swarm populations and high-dimensional problems, and can be widely used in real-world optimization problems.
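The data-parallel structure that maps well to CUDA is visible even in a plain-Python SPSO update, where every particle-dimension pair is independent (an illustrative sketch with standard inertia/acceleration constants, not the paper's GPU kernel):

```python
import random

def pso_step(positions, velocities, pbest, gbest,
             w=0.72, c1=1.49, c2=1.49, rng=random):
    """One synchronous SPSO update. On a GPU, each particle (or each
    particle-dimension pair) maps to one thread, which is what makes
    the data-parallel speedup possible."""
    for i in range(len(positions)):
        for d in range(len(positions[i])):
            r1, r2 = rng.random(), rng.random()
            velocities[i][d] = (w * velocities[i][d]
                                + c1 * r1 * (pbest[i][d] - positions[i][d])
                                + c2 * r2 * (gbest[d] - positions[i][d]))
            positions[i][d] += velocities[i][d]
    return positions, velocities
```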

Journal ArticleDOI
01 May 2009
TL;DR: An adaptive penalty function for solving constrained optimization problems using genetic algorithms; the method finds very good solutions comparable to the chosen state-of-the-art designs.
Abstract: This paper proposes an adaptive penalty function for solving constrained optimization problems using genetic algorithms. The proposed method aims to exploit infeasible individuals with low objective value and low constraint violation. The number of feasible individuals in the population is used to guide the search process either toward finding more feasible individuals or searching for the optimum solution. The proposed method is simple to implement and does not need any parameter tuning. The performance of the algorithm is tested on 22 benchmark functions in the literature. The results show that the proposed approach is able to find very good solutions comparable to the chosen state-of-the-art designs. Furthermore, it is able to find feasible solutions in every run for all of the benchmark functions tested.
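The core idea, using the feasibility ratio of the population to steer the trade-off between objective value and constraint violation, can be sketched as follows. The blending formula here is an illustrative assumption, not the paper's exact penalty function.

```python
# Sketch of a feasibility-ratio-driven adaptive penalty (illustrative
# formula, not the paper's). Lower fitness is better.

def adaptive_fitness(objective, violation, feasible_ratio):
    """objective: raw objective value; violation: summed constraint
    violation (0 means feasible); feasible_ratio: fraction of feasible
    individuals currently in the population, in [0, 1]."""
    if violation == 0:
        return objective
    # Few feasible individuals -> weight violation heavily, driving the
    # search toward feasibility. Many feasible individuals -> let the
    # objective of slightly-infeasible solutions matter more, so good
    # infeasible individuals near the boundary can still be exploited.
    penalty_weight = 1.0 - feasible_ratio
    return (1.0 - penalty_weight) * objective + penalty_weight * violation
```

This captures the abstract's point that infeasible individuals with low objective value and low violation remain competitive rather than being discarded outright.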

Journal ArticleDOI
TL;DR: This work proposes a robust termination condition for GPU-SF based on a filtered approximation of the normalized stress function, and shows that the performance of Glimmer on GPUs is substantially faster than a CPU implementation of the same algorithm.
Abstract: We present Glimmer, a new multilevel algorithm for multidimensional scaling designed to exploit modern graphics processing unit (GPU) hardware. We also present GPU-SF, a parallel, force-based subsystem used by Glimmer. Glimmer organizes input into a hierarchy of levels and recursively applies GPU-SF to combine and refine the levels. The multilevel nature of the algorithm makes local minima less likely while the GPU parallelism improves speed of computation. We propose a robust termination condition for GPU-SF based on a filtered approximation of the normalized stress function. We demonstrate the benefits of Glimmer in terms of speed, normalized stress, and visual quality against several previous algorithms for a range of synthetic and real benchmark datasets. We also show that the performance of Glimmer on GPUs is substantially faster than a CPU implementation of the same algorithm.
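The normalized stress function on which Glimmer's termination condition operates is a standard MDS quality measure: the squared mismatch between high- and low-dimensional pairwise distances, normalized by the total squared high-dimensional distance. The sketch below shows the measure itself; the filtering step the paper applies on top of it is omitted.

```python
import numpy as np

# Normalized stress between a high-dimensional point set and its
# low-dimensional embedding (standard MDS definition; Glimmer's filtered
# approximation of this quantity is not shown).

def normalized_stress(high_d, low_d):
    """Both arguments are (n_points, dim) arrays; returns a scalar >= 0,
    with 0 meaning the embedding preserves all pairwise distances."""
    def pdist(pts):
        diff = pts[:, None, :] - pts[None, :, :]
        return np.sqrt((diff ** 2).sum(-1))
    d_high, d_low = pdist(high_d), pdist(low_d)
    return ((d_high - d_low) ** 2).sum() / (d_high ** 2).sum()
```

A force-based layout like GPU-SF can terminate once this value (or a smoothed estimate of it) stops decreasing, which is more robust than a fixed iteration count.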

Journal ArticleDOI
TL;DR: The experimental results show that integrating the hybrid evolutionary algorithm with the adaptive constraint-handling technique is beneficial, and the proposed method achieves competitive performance with respect to some other state-of-the-art approaches in constrained evolutionary optimization.
Abstract: A novel approach to numerical and engineering constrained optimization problems, combining a hybrid evolutionary algorithm with an adaptive constraint-handling technique, is presented in this paper. The hybrid evolutionary algorithm simultaneously uses simplex crossover and two mutation operators to generate the offspring population. The adaptive constraint-handling technique distinguishes three main population states; for each state, a dedicated constraint-handling mechanism is applied based on the current population. Experiments on 13 benchmark test functions and four well-known constrained design problems verify the effectiveness and efficiency of the proposed method. The experimental results show that integrating the hybrid evolutionary algorithm with the adaptive constraint-handling technique is beneficial, and the proposed method achieves competitive performance with respect to other state-of-the-art approaches in constrained evolutionary optimization.
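Simplex crossover (SPX), one of the offspring-generating operators the abstract names, recombines a set of parents by expanding their simplex about its centroid and sampling inside it. The sketch below follows the standard SPX formulation (Tsutsui et al.); the default expansion factor is an assumption, not a value taken from this paper.

```python
import numpy as np

# Standard simplex crossover (SPX) sketch: expand the parent simplex about
# its centroid by eps, then sample one offspring inside the expanded
# simplex. The default eps is a common choice, not this paper's setting.

def simplex_crossover(parents, eps=None, rng=None):
    """parents: (k, dim) array of k parent vectors; returns one offspring."""
    rng = rng if rng is not None else np.random.default_rng()
    k, dim = parents.shape
    eps = eps if eps is not None else np.sqrt(k + 1)
    center = parents.mean(axis=0)
    y = center + eps * (parents - center)   # expanded simplex vertices
    c = np.zeros(dim)
    for i in range(1, k):
        r = rng.random() ** (1.0 / i)       # biased uniform weight
        c = r * (y[i - 1] - y[i] + c)
    return y[-1] + c
```

With `eps = 0` every expanded vertex collapses to the centroid, so the offspring is exactly the parents' mean, a handy sanity check on the recursion.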

Proceedings ArticleDOI
18 May 2009
TL;DR: This paper investigates the use of cloud computing for high-performance numerical applications and assumes unlimited monetary resources to answer the question, "How high can a cloud computing service get in the TOP500 list?"
Abstract: Computing as a utility has reached the mainstream. Scientists can now rent time on large commercial clusters through several vendors. The cloud computing model provides flexible support for "pay as you go" systems. In addition to no upfront investment in large clusters or supercomputers, such systems incur no maintenance costs. Furthermore, they can be expanded and reduced on demand in real time. Current cloud computing performance falls short of systems specifically designed for scientific applications. Scientific computing needs are quite different from those of web applications--composed primarily of database queries--that have been the focus of cloud computing vendors. In this paper we investigate the use of cloud computing for high-performance numerical applications. In particular, we assume unlimited monetary resources to answer the question, "How high can a cloud computing service get in the TOP500 list?" We show results for the Linpack benchmark on different allocations on Amazon EC2.
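The Linpack benchmark used for TOP500 rankings times a dense linear solve and converts the elapsed time to floating-point throughput using the standard operation count of roughly 2/3 n^3 + 2 n^2 flops. A back-of-envelope version (illustrative only, not HPL, the tuned distributed implementation the TOP500 actually uses) looks like this:

```python
import time
import numpy as np

# Back-of-envelope Linpack-style measurement: time a dense solve of
# Ax = b and report GFLOP/s using the standard 2/3 n^3 + 2 n^2 flop
# count. Illustrative sketch, not the HPL benchmark itself.

def linpack_gflops(n=1000, seed=0):
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal(n)
    t0 = time.perf_counter()
    x = np.linalg.solve(a, b)        # LU factorization + triangular solves
    elapsed = time.perf_counter() - t0
    flops = (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2
    return flops / elapsed / 1e9, x
```

On a cloud allocation, the gap between this single-node number scaled by node count and the measured multi-node HPL result is dominated by interconnect performance, which is the crux of the paper's question.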