
Showing papers on "Benchmark (computing)" published in 2007


Journal ArticleDOI
TL;DR: Three different approaches that use a recurrent neural network as a reservoir, which is not trained but instead read out by a simple external classification layer, are compared, and a new measure of the reservoir dynamics based on Lyapunov exponents is introduced.
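
A minimal illustration of the reservoir idea in the TL;DR: a fixed random recurrent network is driven by an input, and only a simple external linear readout is trained. This echo-state-style Python sketch is generic; the sizes, scaling, and toy task are invented here and are not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; real reservoir studies use task-specific setups.
n_in, n_res, T = 1, 100, 500

# Fixed random reservoir: these weights are never trained.
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # keep spectral radius < 1

u = rng.uniform(-1, 1, (T, n_in))   # toy input signal
y = np.roll(u[:, 0], 1)             # toy target: reproduce the previous input

# Drive the reservoir and record its states.
x = np.zeros(n_res)
states = np.empty((T, n_res))
for t in range(T):
    x = np.tanh(W_in @ u[t] + W @ x)
    states[t] = x

# Only this simple external readout layer is trained (ridge regression).
ridge = 1e-6
W_out = np.linalg.solve(states.T @ states + ridge * np.eye(n_res),
                        states.T @ y)
print("training MSE:", np.mean((states @ W_out - y) ** 2))
```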

930 citations


Proceedings ArticleDOI
10 Sep 2007
TL;DR: This work investigates diagnostic accuracy as a function of several parameters (such as the quality and quantity of the program spectra collected during the execution of the system), some of which directly relate to test design, and indicates that the superior performance of a particular similarity coefficient, used to analyze the program spectra in spectrum-based fault localization, is largely independent of test design.
Abstract: Spectrum-based fault localization shortens the test-diagnose-repair cycle by reducing the debugging effort. As a lightweight automated diagnosis technique, it can easily be integrated with existing testing schemes. However, as no model of the system is taken into account, its diagnostic accuracy is inherently limited. Using the Siemens Set benchmark, we investigate this diagnostic accuracy as a function of several parameters (such as quality and quantity of the program spectra collected during the execution of the system), some of which directly relate to test design. Our results indicate that the superior performance of a particular similarity coefficient, used to analyze the program spectra, is largely independent of test design. Furthermore, near-optimal diagnostic accuracy (exonerating about 80% of the blocks of code on average) is already obtained for low-quality error observations and limited numbers of test cases. The influence of the number of test cases is of primary importance for continuous (embedded) processing applications, where only limited observation horizons can be maintained.
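
To make the similarity-coefficient idea concrete, the sketch below ranks code blocks by the Ochiai coefficient, which the spectrum-based fault-localization literature of this period found to perform well; the tiny spectra matrix and error vector are invented for illustration.

```python
import numpy as np

# Toy program spectra: rows are test runs, columns are code blocks;
# spectra[i, j] = 1 if block j was exercised during run i.
spectra = np.array([[1, 1, 0, 1],
                    [1, 0, 1, 1],
                    [0, 1, 1, 0],
                    [1, 1, 1, 1]])
errors = np.array([1, 1, 0, 1])          # 1 = the run was observed to fail

# Per-block counters over all runs.
a11 = spectra[errors == 1].sum(axis=0)   # exercised in failing runs
a10 = spectra[errors == 0].sum(axis=0)   # exercised in passing runs
a01 = errors.sum() - a11                 # failing runs that skipped the block

# Ochiai similarity between each block's activity and the error vector.
ochiai = a11 / np.sqrt((a11 + a01) * (a11 + a10) + 1e-12)

# Blocks are inspected in decreasing order of suspiciousness.
for j in np.argsort(-ochiai):
    print(f"block {j}: suspiciousness {ochiai[j]:.2f}")
```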

686 citations


Journal ArticleDOI
TL;DR: This work presents ClassBench, a suite of tools for benchmarking packet classification algorithms and devices and seeks to eliminate the significant access barriers to realistic test vectors for researchers and initiate a broader discussion to guide the refinement of the tools and codification of a formal benchmarking methodology.
Abstract: Packet classification is an enabling technology for next generation network services and often a performance bottleneck in high-performance routers. The performance and capacity of many classification algorithms and devices, including TCAMs, depend upon properties of filter sets and query patterns. Despite the pressing need, no standard performance evaluation tools or filter sets are publicly available. In response to this problem, we present ClassBench, a suite of tools for benchmarking packet classification algorithms and devices. ClassBench includes a filter set generator that produces synthetic filter sets that accurately model the characteristics of real filter sets. Along with varying the size of the filter sets, we provide high-level control over the composition of the filters in the resulting filter set. The tool suite also includes a trace generator that produces a sequence of packet headers to exercise packet classification algorithms with respect to a given filter set. Along with specifying the relative size of the trace, we provide a simple mechanism for controlling locality of reference. While we have already found ClassBench to be very useful in our own research, we seek to eliminate the significant access barriers to realistic test vectors for researchers and initiate a broader discussion to guide the refinement of the tools and codification of a formal benchmarking methodology. (The ClassBench tools are publicly available at the following site: http://www.arl.wustl.edu/~det3/ClassBench/.)

478 citations


Proceedings ArticleDOI
25 Apr 2007
TL;DR: This paper describes why PTLsim's x86 focus is highly relevant and uses full system simulation results to demonstrate the pitfalls of userspace-only simulation.
Abstract: In this paper, we introduce PTLsim, a cycle accurate full system x86-64 microprocessor simulator and virtual machine. PTLsim models a modern superscalar out of order x86-64 processor core at a configurable level of detail ranging from RTL-level models of all key pipeline structures, caches and devices up to full-speed native execution on the host CPU. Unlike other microarchitectural simulators, PTLsim targets the real commercially available x86 ISA, rather than a discontinued architecture with limited tools and an uncertain future. PTLsim supports several flavors: a single threaded userspace version and a full system version providing an SMT model and the infrastructure for multi-core support. We first describe what it takes to perform cycle accurate modeling of a complete x86 machine at the muop (micro-operation) level, along with the challenges and requirements for effective full system multi-processor capable simulation. We then describe the internal architecture of full system PTLsim and how it interacts with the Xen hypervisor and PTLsim's native mode co-simulation technology. We experimentally evaluate PTLsim's real world accuracy by configuring it like an AMD Athlon 64 machine before running a demanding full system client-server networked benchmark inside PTLsim. We compare the statistics generated by our model with the actual numbers from the real processor to demonstrate PTLsim is accurate to within 5% across all major parameters. We provide a discussion of prior simulation tools, along with their strengths and weaknesses. We describe why PTLsim's x86 focus is highly relevant, and we use our full system simulation results to demonstrate the pitfalls of userspace-only simulation. Finally, we conclude by detailing future work.

389 citations


Proceedings ArticleDOI
25 Apr 2007
TL;DR: This paper identifies clusters of applications that generate certain types of performance interference and develops mathematical models that predict the performance of a new application from its workload characteristics with an average error of approximately 5%.
Abstract: Virtualization is an essential technology in modern datacenters. Despite advantages such as security isolation, fault isolation, and environment isolation, current virtualization techniques do not provide effective performance isolation between virtual machines (VMs). Specifically, hidden contention for physical resources impacts performance differently in different workload configurations, causing significant variance in observed system throughput. To this end, characterizing workloads that generate performance interference is important in order to maximize overall utility. In this paper, we study the effects of performance interference by looking at system-level workload characteristics. In a physical host, we allocate two VMs, each of which runs a sample application chosen from a wide range of benchmark and real-world workloads. For each combination, we collect performance metrics and runtime characteristics using an instrumented Xen hypervisor. Through subsequent analysis of collected data, we identify clusters of applications that generate certain types of performance interference. Furthermore, we develop mathematical models to predict the performance of a new application from its workload characteristics. Our evaluation shows our techniques were able to predict performance with an average error of approximately 5%.
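
A minimal sketch of the prediction step, assuming a plain linear model: regress observed performance against system-level workload characteristics of co-located VM pairs, then score an unseen combination. All features and data are synthetic stand-ins; the paper's actual models and metrics are richer.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy training set: each row holds system-level characteristics of a
# co-located VM pair (e.g., CPU, disk I/O, network, cache-miss rates);
# y is the observed normalized throughput of the foreground VM.
X = rng.uniform(0, 1, (40, 4))
true_w = np.array([-0.5, -0.3, -0.1, -0.4])
y = 1.0 + X @ true_w + rng.normal(0, 0.02, 40)

# Fit a linear interference model (with intercept) by least squares.
A = np.column_stack([np.ones(len(X)), X])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# Predict the performance of a new, unseen workload combination.
x_new = np.array([0.7, 0.2, 0.1, 0.5])
pred = w[0] + x_new @ w[1:]
print(f"predicted normalized throughput: {pred:.3f}")
```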

350 citations


Proceedings ArticleDOI
11 Jun 2007
TL;DR: This work applies a regression-based approximation of the CPU demand of client transactions on given hardware to an analytic model of a simple network of queues, each queue representing a tier, and shows the approximation's effectiveness for modeling diverse workloads with a changing transaction mix over time.
Abstract: The multi-tier implementation has become the industry standard for developing scalable client-server enterprise applications. Since these applications are performance sensitive, effective models for dynamic resource provisioning and for delivering quality of service to these applications become critical. Workloads in such environments are characterized by client sessions of interdependent requests with changing transaction mix and load over time, making model adaptivity to the observed workload changes a critical requirement for model effectiveness. In this work, we apply a regression-based approximation of the CPU demand of client transactions on a given hardware. Then we use this approximation in an analytic model of a simple network of queues, each queue representing a tier, and show the approximation's effectiveness for modeling diverse workloads with a changing transaction mix over time. Using the TPC-W benchmark and its three different transaction mixes we investigate factors that impact the efficiency and accuracy of the proposed performance prediction models. Experimental results show that this regression-based approach provides a simple and powerful solution for efficient capacity planning and resource provisioning of multi-tier applications under changing workload conditions.
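
The regression step can be read as an application of the utilization law U = sum_i D_i * lambda_i: per-window transaction rates and measured CPU utilization suffice to estimate per-transaction CPU demands, which then parameterize a queueing model. The sketch below uses synthetic data and a deliberately simplified single-queue residence-time formula in place of the paper's full network of queues.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy monitoring windows: arrival rates of 3 transaction types (req/s)
# and the measured CPU utilization of one tier in each window.
rates = rng.uniform(5, 50, (30, 3))
true_demand = np.array([0.004, 0.009, 0.002])   # CPU-seconds per transaction
util = rates @ true_demand + rng.normal(0, 0.005, 30)

# Regress per-transaction CPU demands from the utilization law
#   U = sum_i D_i * lambda_i
demand, *_ = np.linalg.lstsq(rates, util, rcond=None)

# Plug the demands into a simple open-queue (M/M/1-style) tier model
# to estimate residence time under a new transaction mix.
mix = np.array([20.0, 15.0, 40.0])              # new arrival rates
U = mix @ demand
service = mix @ demand / mix.sum()              # mean demand per request
R = service / (1 - U)                           # mean residence time at the tier
print(f"utilization {U:.2f}, mean residence time {R * 1000:.1f} ms")
```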

289 citations


Proceedings ArticleDOI
11 Jun 2007
TL;DR: This work proposes and motivates JouleSort, an external sort benchmark, for evaluating the energy efficiency of a wide range of computer systems from clusters to handhelds, and demonstrates a JouleSort system that is over 3.5x as energy-efficient as last year's estimated winner.
Abstract: The energy efficiency of computer systems is an important concern in a variety of contexts. In data centers, reducing energy use improves operating cost, scalability, reliability, and other factors. For mobile devices, energy consumption directly affects functionality and usability. We propose and motivate JouleSort, an external sort benchmark, for evaluating the energy efficiency of a wide range of computer systems from clusters to handhelds. We list the criteria, challenges, and pitfalls from our experience in creating a fair energy-efficiency benchmark. Using a commercial sort, we demonstrate a JouleSort system that is over 3.5x as energy-efficient as last year's estimated winner. This system is quite different from those currently used in data centers. It consists of a commodity mobile CPU and 13 laptop drives connected by server-style I/O interfaces.
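
JouleSort's figure of merit is records sorted per Joule of total system energy, so scoring a run reduces to the arithmetic below; all numbers are hypothetical placeholders, not measurements from the paper.

```python
# Hypothetical measurements for one JouleSort-style run; the benchmark's
# figure of merit is records sorted per Joule of total system energy.
records_sorted = 10_000_000
avg_power_watts = 85.0          # wall power averaged over the sort
elapsed_seconds = 420.0

energy_joules = avg_power_watts * elapsed_seconds
score = records_sorted / energy_joules
print(f"energy: {energy_joules:.0f} J, score: {score:.1f} records/Joule")
```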

278 citations


Proceedings ArticleDOI
03 Dec 2007
TL;DR: Experimental results from a 17-server farm running the industry-standard TPC-W e-commerce benchmark show that co-adaptation cuts energy consumption by more than 50% when the workload is not high, while maintaining latency within acceptable bounds.
Abstract: The increased complexity of performance-sensitive software systems leads to increased use of automated adaptation policies in lieu of manual performance tuning. Composition of adaptive components into larger adaptive systems, however, presents challenges that arise from potential incompatibilities among the respective adaptation policies. Consequently, unstable or poorly-tuned feedback loops may result that cause performance deterioration. This paper (i) presents a mechanism, called adaptation graph analysis, for identifying potential incompatibilities between composed adaptation policies and (ii) illustrates a general design methodology for co-adaptation that resolves such incompatibilities. Our results are demonstrated by a case study on energy minimization in multi-tier Web server farms subject to soft real-time constraints. Two independently efficient energy saving policies (an on/off policy that switches machines off when not needed and a dynamic voltage scaling policy) are shown to conflict, leading to increased energy consumption when combined. Our adaptation graph analysis predicts the problem, and our co-adaptation design methodology finds a solution that improves performance. Experimental results from a 17-server farm running the industry-standard TPC-W e-commerce benchmark show that co-adaptation cuts energy consumption by more than 50% when the workload is not high, while maintaining latency within acceptable bounds. The paper serves as a proof of concept of the proposed conflict-identification and resolution methodology and an invitation to further investigate a science for composing adaptive systems.

268 citations


Journal ArticleDOI
TL;DR: Results show that all three Subset Simulation methods are effective in high-dimensional problems and that some computational efficiency can be gained by adopting the splitting and hybrid strategies when calculating the reliability for the first-passage benchmark problems.

229 citations


Journal ArticleDOI
15 Feb 2007
TL;DR: This paper presents differential evolution algorithms that use different adaptive or self-adaptive mechanisms applied to the control parameters, and outlines detailed performance comparisons of these algorithms on the benchmark functions.
Abstract: Differential evolution (DE) has been shown to be a simple, yet powerful, evolutionary algorithm for global optimization for many real problems. Adaptation, especially self-adaptation, has been found to be highly beneficial for adjusting control parameters, especially when done without any user interaction. This paper presents differential evolution algorithms, which use different adaptive or self-adaptive mechanisms applied to the control parameters. Detailed performance comparisons of these algorithms on the benchmark functions are outlined.
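
As one representative of the self-adaptive mechanisms compared in such studies, the sketch below implements a jDE-style rule: each individual carries its own F and CR, occasionally resamples them, and keeps them only when the resulting trial vector survives selection. The toy objective and all constants are conventional illustrative choices, not values from this paper.

```python
import numpy as np

rng = np.random.default_rng(3)

def sphere(x):
    return float(np.sum(x ** 2))

NP, D, gens = 30, 10, 200
pop = rng.uniform(-5, 5, (NP, D))
fit = np.array([sphere(x) for x in pop])
F = np.full(NP, 0.5)                 # per-individual control parameters,
CR = np.full(NP, 0.9)                # self-adapted by the jDE-style rule below

for _ in range(gens):
    for i in range(NP):
        # With small probability, resample this individual's F and CR.
        Fi = rng.uniform(0.1, 1.0) if rng.random() < 0.1 else F[i]
        CRi = rng.random() if rng.random() < 0.1 else CR[i]
        r1, r2, r3 = rng.choice([k for k in range(NP) if k != i], 3,
                                replace=False)
        mutant = pop[r1] + Fi * (pop[r2] - pop[r3])       # DE/rand/1
        cross = rng.random(D) < CRi
        cross[rng.integers(D)] = True                     # force one gene over
        trial = np.where(cross, mutant, pop[i])
        f = sphere(trial)
        if f <= fit[i]:              # successful parameters survive with the trial
            pop[i], fit[i], F[i], CR[i] = trial, f, Fi, CRi

print("best fitness:", fit.min())
```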

227 citations


Proceedings ArticleDOI
21 Mar 2007
TL;DR: STMBench7, a candidate benchmark for evaluating STM implementations, is presented and illustrated with an evaluation of a well-known software transactional memory implementation.
Abstract: Software transactional memory (STM) is a promising technique for controlling concurrency in modern multi-processor architectures. STM aims to be more scalable than explicit coarse-grained locking and easier to use than fine-grained locking. However, STM implementations have yet to demonstrate that their runtime overheads are acceptable. To date, empirical evaluations of these implementations have suffered from the lack of realistic benchmarks. Measuring performance of an STM in an overly simplified setting can be at best uninformative and at worst misleading, as it may steer researchers to try to optimize irrelevant aspects of their implementations. This paper presents STMBench7: a candidate benchmark for evaluating STM implementations. The underlying data structure consists of a set of graphs and indexes intended to be suggestive of many complex applications, e.g., CAD/CAM. A collection of operations is supported to model a wide range of workloads and concurrency patterns. Companion locking strategies serve as a baseline for STM performance comparisons. STMBench7 strives for simplicity. Users may choose a workload, number of threads, benchmark length, as well as the possibility of structure modification and the nature of traversals of shared data structures. We illustrate the use of STMBench7 with an evaluation of a well-known software transactional memory implementation.

Journal ArticleDOI
TL;DR: In this paper, a benchmark study on reliability estimation of structural systems is presented, which attempts to assess various recently proposed alternative procedures for reliability estimation with respect to their accuracy and computational efficiency.

Proceedings ArticleDOI
09 Jun 2007
TL;DR: This paper analyzes the SPEC CPU2006 benchmarks using performance counter based experimentation from several state of the art systems, and uses statistical techniques such as principal component analysis and clustering to draw inferences on the similarity of the benchmarks and the redundancy in the suite and arrive at meaningful subsets.
Abstract: The recently released SPEC CPU2006 benchmark suite is expected to be used by computer designers and computer architecture researchers for pre-silicon early design analysis. Partial use of benchmark suites by researchers, due to simulation time constraints, compiler difficulties, or library or system call issues, is likely to happen; but a random subset can lead to misleading results. This paper analyzes the SPEC CPU2006 benchmarks using performance counter based experimentation from several state-of-the-art systems, and uses statistical techniques such as principal component analysis and clustering to draw inferences on the similarity of the benchmarks and the redundancy in the suite and arrive at meaningful subsets. The SPEC CPU2006 benchmark suite contains several programs from areas such as artificial intelligence and includes none from the electronic design automation (EDA) application area. Hence there is a concern about the application balance in the suite. An analysis from the perspective of fundamental program characteristics shows that the included programs offer characteristics broader than the EDA programs' space. A subset of 6 integer programs and 8 floating point programs can yield most of the information from the entire suite.
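
A sketch of the subsetting methodology: project per-benchmark characteristics into principal-component space, cluster there, and keep the benchmark nearest each cluster center. Random data stands in for the measured characteristics, and a hand-rolled k-means keeps the example dependency-free.

```python
import numpy as np

rng = np.random.default_rng(4)

# Random data stands in for measured per-benchmark characteristics
# (e.g., instruction mix, branch misprediction and cache miss rates).
X = rng.normal(0, 1, (29, 8))        # 29 "benchmarks", 8 metrics each

# PCA: center the data, then project onto the top principal components.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                    # scores in the top-2 PC space

# Tiny k-means in PC space; the benchmark nearest each center becomes
# that cluster's representative, and the representatives form the subset.
k = 6
centers = Z[rng.choice(len(Z), k, replace=False)]
for _ in range(50):
    labels = np.argmin(((Z[:, None] - centers) ** 2).sum(-1), axis=1)
    centers = np.array([Z[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
reps = sorted({int(np.argmin(((Z - c) ** 2).sum(-1))) for c in centers})
print("representative benchmark indices:", reps)
```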

Journal ArticleDOI
TL;DR: An empirical study on GE-HPGA, using a benchmark problem and a realistic aerodynamic airfoil shape optimization problem across diverse Grid environments with different communication protocols, cluster sizes, and processing nodes at geographically disparate locations, indicates that the proposed GE-HPGA offers a credible framework for providing a significant speed-up to evolutionary design optimization in science and engineering.

Proceedings ArticleDOI
01 Sep 2007
TL;DR: This paper proposes two new efficient DE variants, named DECC-I and DECC-II, for high-dimensional optimization (up to 1000 dimensions), based on a cooperative coevolution framework incorporated with several novel strategies.
Abstract: Most reported studies on differential evolution (DE) are obtained using low-dimensional problems, e.g., smaller than 100 dimensions, which are relatively small for many real-world problems. In this paper we propose two new efficient DE variants, named DECC-I and DECC-II, for high-dimensional optimization (up to 1000 dimensions). The two algorithms are based on a cooperative coevolution framework incorporated with several novel strategies. The new strategies mainly focus on problem decomposition and subcomponent cooperation. Experimental results have shown that these algorithms have superior performance on a set of widely used benchmark functions.
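
The heart of the cooperative coevolution framework is decomposing a high-dimensional vector into subcomponents and optimizing each in turn against a shared context vector. The sketch below shows that loop on a 1000-dimensional sphere function; for brevity it perturbs each subcomponent with random search, whereas DECC-I/II evolve each subcomponent with DE and add further cooperation strategies.

```python
import numpy as np

rng = np.random.default_rng(5)

def sphere(x):
    return float(np.sum(x ** 2))

D, group = 1000, 100                 # 10 subcomponents of 100 dims each
x = rng.uniform(-5, 5, D)            # context vector shared by all groups
best = sphere(x)

# Round-robin cooperative coevolution: optimize one subcomponent at a
# time while the rest of the context vector stays fixed.
for cycle in range(20):
    for start in range(0, D, group):
        idx = slice(start, start + group)
        for _ in range(30):
            trial = x.copy()
            trial[idx] += rng.normal(0, 0.5, group)
            f = sphere(trial)
            if f < best:
                x, best = trial, f

print("best fitness after decomposition cycles:", best)
```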

Journal ArticleDOI
01 Dec 2007
TL;DR: This paper proposes a new hybridization of optimization methodologies called particle swarm optimization with recombination and dynamic linkage discovery (PSO-RDL), which can provide a level of performance comparable to that given by other advanced optimization techniques.
Abstract: In this paper, we try to improve the performance of the particle swarm optimizer by incorporating the linkage concept, which is an essential mechanism in genetic algorithms, and design a new linkage identification technique called dynamic linkage discovery to address the linkage problem in real-parameter optimization problems. Dynamic linkage discovery is a costless and effective linkage recognition technique that adapts the linkage configuration by employing only the selection operator without extra judging criteria irrelevant to the objective function. Moreover, a recombination operator that utilizes the discovered linkage configuration to promote the cooperation of the particle swarm optimizer and dynamic linkage discovery is accordingly developed. By integrating the particle swarm optimizer, dynamic linkage discovery, and the recombination operator, we propose a new hybridization of optimization methodologies called particle swarm optimization with recombination and dynamic linkage discovery (PSO-RDL). In order to study the capability of PSO-RDL, numerical experiments were conducted on a set of benchmark functions as well as on an important real-world application. The benchmark functions used in this paper were proposed at the 2005 IEEE Congress on Evolutionary Computation. The experimental results on the benchmark functions indicate that PSO-RDL can provide a level of performance comparable to that given by other advanced optimization techniques. In addition to the benchmark, PSO-RDL was also used to solve the economic dispatch (ED) problem for power systems, which is a highly constrained real-world problem. The results indicate that PSO-RDL can successfully solve the ED problem for the three-unit power system and obtain the currently known best solution for the 40-unit system.
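
PSO-RDL layers recombination and dynamic linkage discovery on top of the standard particle swarm update, v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x). The sketch below shows only that global-best PSO baseline on a toy function; the linkage machinery is not reproduced, and the parameter values are conventional defaults rather than the paper's.

```python
import numpy as np

rng = np.random.default_rng(6)

def sphere(x):
    return float(np.sum(x ** 2))

n, D, iters = 30, 10, 300
w, c1, c2 = 0.7, 1.5, 1.5            # inertia and acceleration weights
pos = rng.uniform(-5, 5, (n, D))
vel = np.zeros((n, D))
pbest = pos.copy()
pbest_f = np.array([sphere(p) for p in pos])
g = pbest[pbest_f.argmin()].copy()   # global best position

for _ in range(iters):
    r1, r2 = rng.random((n, D)), rng.random((n, D))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
    pos = pos + vel
    f = np.array([sphere(p) for p in pos])
    better = f < pbest_f
    pbest[better], pbest_f[better] = pos[better], f[better]
    g = pbest[pbest_f.argmin()].copy()

print("best fitness:", pbest_f.min())
```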

Book ChapterDOI
09 Sep 2007
TL;DR: Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations is presented.
Abstract: This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a "Long Short-Term Memory" architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task.

Journal ArticleDOI
01 Feb 2007
TL;DR: A memetic algorithm (MA) for a nonslicing and hard-module VLSI floorplanning problem is presented that uses an effective genetic search method to explore the search space and an efficient local search methods to exploit information in the search region.
Abstract: Floorplanning is an important problem in very large scale integrated-circuit (VLSI) design automation as it determines the performance, size, yield, and reliability of VLSI chips. From the computational point of view, VLSI floorplanning is an NP-hard problem. In this paper, a memetic algorithm (MA) for a nonslicing and hard-module VLSI floorplanning problem is presented. This MA is a hybrid genetic algorithm that uses an effective genetic search method to explore the search space and an efficient local search method to exploit information in the search region. The exploration and exploitation are balanced by a novel bias search strategy. The MA has been implemented and tested on popular benchmark problems. Experimental results show that the MA can quickly produce optimal or nearly optimal solutions for all the tested benchmark problems.
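
The memetic template is genetic exploration plus local-search exploitation of offspring. The sketch below shows that skeleton on a toy continuous objective; a real floorplanner would instead operate on a nonslicing module representation and minimize chip area, which is well beyond this illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

def cost(x):                      # toy objective standing in for floorplan area
    return float(np.sum(x ** 2))

def local_search(x, steps=20):    # exploit: greedy perturbation around x
    f = cost(x)
    for _ in range(steps):
        y = x + rng.normal(0, 0.1, x.size)
        fy = cost(y)
        if fy < f:
            x, f = y, fy
    return x, f

NP, D = 20, 8
pop = rng.uniform(-5, 5, (NP, D))
fit = np.array([cost(x) for x in pop])

for _ in range(50):
    # Explore: tournament selection, blend crossover, and mutation.
    i, j = rng.choice(NP, 2, replace=False)
    p1 = pop[i] if fit[i] < fit[j] else pop[j]
    i, j = rng.choice(NP, 2, replace=False)
    p2 = pop[i] if fit[i] < fit[j] else pop[j]
    child = 0.5 * (p1 + p2) + rng.normal(0, 0.2, D)
    child, f = local_search(child)          # the "memetic" refinement step
    worst = fit.argmax()
    if f < fit[worst]:                      # replace the worst individual
        pop[worst], fit[worst] = child, f

print("best cost:", fit.min())
```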

Journal ArticleDOI
TL;DR: The benchmarks that make up the SPEC CPU2006 benchmark suite are set up, run, timed, and scored by the CPU tools harness.
Abstract: The benchmarks that make up the SPEC CPU2006 benchmark suite are set-up, run, timed, and scored by the CPU tools harness. The tools have evolved over time from a collection of edit-it-yourself makefiles, scripts, and an Excel spreadsheet to the current Perl-based suite. The basic purpose of the tools is to make life easier for the benchmarker; they make it easier to tweak compilation settings, easier to keep track of those settings, and most importantly, they make it easier to follow the run and reporting rules.

Journal ArticleDOI
TL;DR: This paper presents an automatic way of evolving hierarchical Takagi-Sugeno fuzzy systems (TS-FS) using probabilistic incremental program evolution (PIPE) with specific instructions and fine-tuning of the if-then rules' parameters encoded in the structure using evolutionary programming (EP).
Abstract: This paper presents an automatic way of evolving hierarchical Takagi-Sugeno fuzzy systems (TS-FS). The hierarchical structure is evolved using probabilistic incremental program evolution (PIPE) with specific instructions. The fine-tuning of the if-then rules' parameters encoded in the structure is accomplished using evolutionary programming (EP). The proposed method interleaves both PIPE and EP optimizations. Starting with random structures and rules' parameters, it first tries to improve the hierarchical structure and then, as soon as an improved structure is found, it further fine-tunes the rules' parameters. It then goes back to improve the structure and the rules' parameters. This loop continues until a satisfactory solution (hierarchical TS-FS model) is found or a time limit is reached. The proposed hierarchical TS-FS is evaluated using some well-known benchmark applications, namely identification of nonlinear systems, prediction of the Mackey-Glass chaotic time-series, and some classification problems. When compared to other neural networks and fuzzy systems, the developed hierarchical TS-FS exhibits competitive results with high accuracy and a smaller hierarchical architecture.

Proceedings ArticleDOI
10 Feb 2007
TL;DR: A scalable approach combines design space sampling and statistical inference to identify trends from a sparse simulation of the space, motivating the application of statistical inference techniques for more effective use of modern simulator infrastructure.
Abstract: We apply a scalable approach for practical, comprehensive design space evaluation and optimization. This approach combines design space sampling and statistical inference to identify trends from a sparse simulation of the space. The computational efficiency of sampling and inference enables new capabilities in design space exploration. We illustrate these capabilities using performance and power models for three studies of a 260,000-point design space: (1) Pareto frontier analysis, (2) pipeline depth analysis, and (3) multiprocessor heterogeneity analysis. For each study, we provide an assessment of predictive error and sensitivity of observed trends to such error. We construct Pareto frontiers and find predictions for Pareto optima are no less accurate than those for the broader design space. We reproduce and enhance prior pipeline depth studies, demonstrating constrained sensitivity studies may not generalize when many other design parameters are held at constant values. Lastly, we identify efficient heterogeneous core designs by clustering per-benchmark optimal architectures. Collectively, these studies motivate the application of techniques in statistical inference for more effective use of modern simulator infrastructure.
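
A sketch of the sample-and-infer loop: "simulate" a sparse random sample of a design space, fit a regression over the design parameters, predict the remaining points, and extract a predicted Pareto frontier over performance and power. The design space, stand-in simulator, and quadratic model are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(8)

# Toy design space: (pipeline depth, cache size, width) -> (perf, power).
def simulate(d):                   # stands in for a slow cycle-accurate run
    depth, cache, width = d
    perf = width * cache / (1 + 0.02 * (depth - 14) ** 2)
    power = 0.5 * depth + 2.0 * width + 0.1 * cache
    return perf, power

space = np.array([[d, c, w] for d in range(8, 25, 2)
                  for c in (1, 2, 4, 8) for w in (2, 4, 8)], dtype=float)

# Only a sparse sample of the space is "simulated"; a model fills in the rest.
sample = space[rng.choice(len(space), 30, replace=False)]
Y = np.array([simulate(d) for d in sample])
A = np.column_stack([np.ones(len(sample)), sample, sample ** 2])
W, *_ = np.linalg.lstsq(A, Y, rcond=None)

Afull = np.column_stack([np.ones(len(space)), space, space ** 2])
pred = Afull @ W                   # predicted (perf, power) for every design

# Pareto frontier: maximize performance, minimize power.
pareto = [i for i, (p, e) in enumerate(pred)
          if not any(p2 >= p and e2 <= e and (p2 > p or e2 < e)
                     for p2, e2 in pred)]
print(f"{len(pareto)} predicted Pareto-optimal designs of {len(space)}")
```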

Journal ArticleDOI
TL;DR: A key conclusion of this study is that a simple MS parallelization strategy can exploit time-continuation and parallel speedups to dramatically improve the efficiency and reliability of evolutionary multiobjective algorithms in water resources applications.

01 Jan 2007
TL;DR: The primary objective of the NEA is to promote cooperation among the governments of its participating countries in furthering the development of nuclear power as a safe, environmentally acceptable and economic energy source.
Abstract: Pursuant to Article 1 of the Convention signed in Paris on 14th December 1960, and which came into force on 30th September 1961, the Organization for Economic Cooperation and Development (OECD) shall promote policies designed:
− to achieve the highest sustainable economic growth and employment and a rising standard of living in Member countries, while maintaining financial stability, and thus to contribute to the development of the world economy;
− to contribute to sound economic expansion in Member as well as non-member countries in the process of economic development; and
− to contribute to the expansion of world trade on a multilateral, non-discriminatory basis in accordance with international obligations.
NUCLEAR ENERGY AGENCY
The OECD Nuclear Energy Agency (NEA) was established on 1st February 1958 under the name of OEEC European Nuclear Energy Agency. It received its present designation on 20th April 1972, when Japan became its first non-European full Member. NEA membership today consists of all OECD Member countries, except New Zealand and Poland. The Commission of the European Communities takes part in the work of the Agency. The primary objective of the NEA is to promote cooperation among the governments of its participating countries in furthering the development of nuclear power as a safe, environmentally acceptable and economic energy source. This is achieved by:
− encouraging harmonization of national regulatory policies and practices, with particular reference to the safety of nuclear installations, protection of man against ionising radiation and preservation of the environment, radioactive waste management, and nuclear third party liability and insurance;
− assessing the contribution of nuclear power to the overall energy supply by keeping under review the technical and economic aspects of nuclear power growth and forecasting demand and supply for the different phases of the nuclear fuel cycle;
− developing exchanges of scientific and technical information, particularly through participation in common services;
− setting up international research and development programs and joint undertakings.
In these and related tasks, the NEA works in close collaboration with the International Atomic Energy Agency in Vienna, with which it has concluded a Cooperation Agreement, as well as with other international organizations in the nuclear field.
In recent years there has been an increasing demand from nuclear research, industry, safety and regulation …

Proceedings ArticleDOI
01 Sep 2007
TL;DR: Compared to other self-adaptive DE algorithms, JADE converges faster and more reliably in at least 10 out of a set of 13 benchmark problems and shows competitive results in other cases as well.
Abstract: A new differential evolution algorithm, JADE, is proposed to improve the rate and the reliability of convergence performance by implementing a new mutation strategy, 'DE/current-to-p-best', and controlling the parameters in a self-adaptive manner. The 'DE/current-to-p-best' is a generalization of 'DE/current-to-best'. It diversifies the population but still inherits the fast convergence property. Self-adaptation is beneficial for performance improvement. Also, it avoids the requirement of prior knowledge about parameter settings and thus works well without user interaction. Compared to other self-adaptive DE algorithms, JADE converges faster and more reliably in at least 10 out of a set of 13 benchmark problems and shows competitive results in other cases as well. Simulation results also clearly show that there is no single parameter value suitable for various problems or even at different optimization stages of a single problem.
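
The mutation at the core of JADE is v_i = x_i + F*(x_pbest - x_i) + F*(x_r1 - x_r2), where x_pbest is drawn from the best 100p% of the population. A sketch of just that operator follows; full JADE additionally draws each individual's F from an adaptively updated Cauchy distribution, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(9)

def current_to_pbest(pop, fit, i, F, p=0.05):
    """JADE's 'DE/current-to-p-best' mutation for individual i:
    v = x_i + F*(x_pbest - x_i) + F*(x_r1 - x_r2),
    where x_pbest is sampled from the best 100p% of the population."""
    NP = len(pop)
    n_top = max(1, int(p * NP))
    pbest = pop[rng.choice(np.argsort(fit)[:n_top])]
    r1, r2 = rng.choice([k for k in range(NP) if k != i], 2, replace=False)
    return pop[i] + F * (pbest - pop[i]) + F * (pop[r1] - pop[r2])

pop = rng.uniform(-5, 5, (20, 10))
fit = (pop ** 2).sum(axis=1)
print("mutant (first 3 genes):", current_to_pbest(pop, fit, 0, F=0.5)[:3])
```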

Journal ArticleDOI
TL;DR: A modified version of the differential evolution algorithm is presented to allow each parent vector in the population to generate more than one trial (child) vector at each generation and therefore to increase its probability of generating a better one.
Abstract: This article presents a modified version of the differential evolution algorithm to solve engineering design problems. The aim is to allow each parent vector in the population to generate more than one trial (child) vector at each generation and therefore to increase its probability of generating a better one. To deal with constraints, some criteria based on feasibility and a diversity mechanism to maintain infeasible solutions in the population are used. The approach is tested on a set of well-known benchmark problems. After that, it is used to solve engineering design problems and its performance is compared with those provided by typical penalty function approaches and also against state-of-the-art techniques.
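
A sketch combining the two ingredients named in the abstract: several trial (child) vectors per parent, and feasibility-based selection rules (feasible beats infeasible; otherwise compare objective or constraint violation). The toy objective, constraint, and DE settings are invented stand-ins for the engineering design problems actually studied.

```python
import numpy as np

rng = np.random.default_rng(10)

def objective(x):
    return float(np.sum(x ** 2))

def violation(x):                            # toy constraint: sum(x) >= 1
    return float(max(0.0, 1.0 - float(x.sum())))

def better(fa, va, fb, vb):
    """Feasibility rules: feasible beats infeasible; among feasible,
    lower objective wins; among infeasible, lower violation wins."""
    if va == 0 and vb == 0:
        return fa <= fb
    if (va == 0) != (vb == 0):
        return va == 0
    return va <= vb

NP, D, K = 20, 5, 3                          # K trial vectors per parent
pop = rng.uniform(0.0, 2.0, (NP, D))

for _ in range(100):
    for i in range(NP):
        fi, vi = objective(pop[i]), violation(pop[i])
        for _ in range(K):                   # each parent spawns K children
            r1, r2, r3 = rng.choice([k for k in range(NP) if k != i], 3,
                                    replace=False)
            mutant = pop[r1] + 0.7 * (pop[r2] - pop[r3])
            cross = rng.random(D) < 0.9
            cross[rng.integers(D)] = True
            trial = np.where(cross, mutant, pop[i])
            ft, vt = objective(trial), violation(trial)
            if better(ft, vt, fi, vi):
                pop[i], fi, vi = trial, ft, vt

feasible = [objective(x) for x in pop if violation(x) == 0.0]
print("best feasible objective:", min(feasible) if feasible else None)
```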

Book ChapterDOI
28 Aug 2007
TL;DR: The neural simulation tool NEST is presented, a neuronal network simulator which uses a hybrid strategy, combining distributed simulation across cluster nodes (MPI) with thread-based simulation on each computer, to simulate very large networks with acceptable time and memory requirements.
Abstract: To understand the principles of information processing in the brain, we depend on models with more than 10^5 neurons and 10^9 connections. These networks can be described as graphs of threshold elements that exchange point events over their connections. From the computer science perspective, the key challenges are to represent the connections succinctly; to transmit events and update neuron states efficiently; and to provide a comfortable user interface. We present here the neural simulation tool NEST, a neuronal network simulator which addresses all these requirements. To simulate very large networks with acceptable time and memory requirements, NEST uses a hybrid strategy, combining distributed simulation across cluster nodes (MPI) with thread-based simulation on each computer. Benchmark simulations of a computationally hard biological neuronal network model demonstrate that hybrid parallelization yields significant performance benefits on clusters of multi-core computers, compared to purely MPI-based distributed simulation.

Journal ArticleDOI
TL;DR: In this paper, the authors presented a mathematical approach to reconfigure process plans to account for changes in parts' features beyond the scope of the original product family by inserting/removing features iteratively using a novel 0-1 integer programming model.

Proceedings Article
11 Mar 2007
TL;DR: This paper demonstrates how deterministic annealing can be applied to different SVM formulations of the multiple-instance learning (MIL) problem and proposes a new objective function which, together with the deterministic annealing algorithm, finds better local minima and achieves better performance on a set of benchmark datasets.
Abstract: In this paper we demonstrate how deterministic annealing can be applied to different SVM formulations of the multiple-instance learning (MIL) problem. Our results show that we find better local minima compared to the heuristic methods with which those problems are usually solved. However, this does not always translate into a better test error, suggesting an inadequacy of the objective function. Based on this finding, we propose a new objective function which, together with the deterministic annealing algorithm, finds better local minima and achieves better performance on a set of benchmark datasets. Furthermore, the results also show how the structure of MIL datasets influences the performance of MIL algorithms, and we discuss how future benchmark datasets for the MIL problem should be designed.

Proceedings Article
23 Sep 2007
TL;DR: The characteristics of the different phases of the TPC-DS workload, namely database load, query workload, and data maintenance, and their impact on the benchmark's performance metric are detailed.
Abstract: The Transaction Processing Performance Council (TPC) is completing development of TPC-DS, a new generation industry standard decision support benchmark. The TPC-DS benchmark, first introduced in the "The Making of TPC-DS" [9] paper at the 32nd International Conference on Very Large Data Bases (VLDB), has now entered the TPC's "Formal Review" phase for new benchmarks; companies and researchers alike can now download the draft benchmark specification and tools for evaluation. The first paper [9] gave an overview of the TPC-DS data model, workload model, and execution rules. This paper details the characteristics of the different phases of the workload, namely database load, query workload, and data maintenance, and their impact on the benchmark's performance metric. As with prior TPC benchmarks, this workload will be widely used by vendors to demonstrate their capabilities to support complex decision support systems, by customers as a key factor in purchasing servers and software, and by the database community for research and development of optimization techniques.

Journal ArticleDOI
TL;DR: This paper presents efforts to design and implement such an OpenMP compiler on top of Open64, an open source compiler framework, by extending its existing analysis and optimization and adopting a source-to-source translator approach where a native back end is not available.
Abstract: OpenMP has gained wide popularity as an API for parallel programming on shared memory and distributed shared memory platforms. Despite its broad availability, there remains a need for a portable, robust, open source, optimizing OpenMP compiler for C/C++/Fortran 90, especially for teaching and research, for example into its use on new target architectures, such as SMPs with chip multi-threading, as well as learning how to translate for clusters of SMPs. In this paper, we present our efforts to design and implement such an OpenMP compiler on top of Open64, an open source compiler framework, by extending its existing analysis and optimization and adopting a source-to-source translator approach where a native back end is not available. The compilation strategy we have adopted and the corresponding runtime support are described. The OpenMP validation suite is used to determine the correctness of the translation. The compiler's behavior is evaluated using benchmark tests from the EPCC microbenchmarks and the NAS parallel benchmark.