Journal ArticleDOI

Accurate and efficient processor performance prediction via regression tree based modeling

01 Oct 2009 · Journal of Systems Architecture (North-Holland) · Vol. 55, Iss. 10, pp. 457-467
TL;DR: A performance prediction approach is proposed that employs state-of-the-art techniques from experiment design, machine learning, and data mining; it generates highly accurate estimates for unsampled points in the design space and remains robust even for worst-case predictions.
About: This article was published in the Journal of Systems Architecture on 2009-10-01 and has received 43 citations to date. The article focuses on the topics: Performance prediction & Robustness (computer science).
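A minimal sketch of the paper's overall recipe may help: simulate a small sample of design points, fit a regression-tree ensemble, and predict the rest of the design space. The sketch below uses scikit-learn's GradientBoostingRegressor as a stand-in for the MART model the paper employs; the design-space parameters, sample size, and synthetic response are hypothetical, not taken from the paper.

```python
# Sketch: train a boosted regression-tree model on a small sample of
# simulated design points, then predict all unsampled configurations.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Hypothetical processor design space: (issue width, ROB size, L2 size in KB).
design_space = np.array([(w, rob, l2)
                         for w in (2, 4, 8)
                         for rob in (32, 64, 128, 256)
                         for l2 in (256, 512, 1024, 2048)])

# Pretend we can only afford to simulate a small sample of configurations.
sampled = rng.choice(len(design_space), size=12, replace=False)
X_train = design_space[sampled]
# Placeholder "simulator": some unknown performance response plus noise.
y_train = np.log2(X_train).sum(axis=1) + rng.normal(0, 0.05, len(X_train))

# Small learning rate (shrinkage), as the paper's reference [6] recommends.
model = GradientBoostingRegressor(n_estimators=500, learning_rate=0.05,
                                  max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Predict performance for every unsampled design point.
unsampled = np.setdiff1d(np.arange(len(design_space)), sampled)
predictions = model.predict(design_space[unsampled])
print(predictions[:5])
```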
Citations
Journal ArticleDOI
TL;DR: This article categorizes, reviews, and analyzes the state-of-the-art single-/multi-response adaptive sampling approaches for global metamodeling in support of simulation-based engineering design, and discusses some important issues that affect the success of an adaptive sampling approach.
Abstract: Metamodeling is becoming a rather popular means to approximate expensive simulations in today's complex engineering design problems, since accurate metamodels can bring many benefits. The metamodel's accuracy, however, heavily depends on the locations of the observed points. Adaptive sampling, as its name suggests, places more points in regions of interest by learning from previous data and metamodels. Consequently, compared to traditional space-filling sampling approaches, adaptive sampling has great potential to build more accurate metamodels with fewer points (simulations), and has therefore been gaining attention from both practitioners and academics in various fields. Noticing that the literature lacks a review of adaptive sampling for global metamodeling, this article categorizes, reviews, and analyzes the state-of-the-art single-/multi-response adaptive sampling approaches for global metamodeling in support of simulation-based engineering design. In addition, we review and discuss some important issues that affect the success of an adaptive sampling approach, and provide brief remarks on adaptive sampling for other purposes. Finally, challenges and future research directions are provided and discussed.
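To make the adaptive-sampling loop described above concrete, here is a minimal sketch under stated assumptions: a random forest's per-tree spread stands in for the metamodel's uncertainty, and the 1-D "expensive simulation" is a synthetic placeholder.

```python
# Sketch of an adaptive-sampling loop: start from a small initial sample,
# then iteratively add the candidate where the metamodel is most uncertain.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def expensive_simulation(x):          # placeholder for a costly simulator
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(1)
X = rng.uniform(0, 3, size=(5, 1))    # small initial sample
y = expensive_simulation(X).ravel()
candidates = np.linspace(0, 3, 200).reshape(-1, 1)

for _ in range(10):                   # sampling budget
    forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    # Spread of the individual trees' predictions as an uncertainty proxy.
    per_tree = np.stack([t.predict(candidates) for t in forest.estimators_])
    uncertainty = per_tree.std(axis=0)
    x_new = candidates[np.argmax(uncertainty)].reshape(1, -1)
    X = np.vstack([X, x_new])
    y = np.append(y, expensive_simulation(x_new).ravel())
```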

276 citations


Cites background or methods from "Accurate and efficient processor pe..."

  • ...For the improvement of global accuracy, a distance term d, which plays the role of global exploration, is often required to discard some points that are close to each other (Hendrickx and Dhaene 2005; Li et al. 2009; Eason and Cremaschi 2014)....


  • ...With no constraints, the new points tend to cluster in regions with large prediction difference (Li et al. 2009)....


  • ...Li et al. (2009) trained the multiple additive regression trees (MART) model on the sample set 20 times with different random seeds, and suggested using the coefficient of variation σ̂/μ (where μ represents the mean and σ̂ the standard deviation across the committee) instead of the QBC variance to better rank the candidate points.... (A sketch of this ranking follows these excerpts.)

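A minimal sketch of the committee ranking quoted in the last excerpt, under illustrative assumptions: scikit-learn's GradientBoostingRegressor stands in for MART, the distance threshold d_min is invented, and the coefficient of variation σ̂/μ is computed over 20 differently-seeded models as the excerpt describes.

```python
# Sketch: rank candidate points by committee coefficient of variation,
# discarding candidates too close to already-sampled points.
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.ensemble import GradientBoostingRegressor

def rank_candidates(X_train, y_train, candidates, n_seeds=20, d_min=0.1):
    # Train the same boosted-tree model n_seeds times with different seeds.
    preds = np.stack([
        GradientBoostingRegressor(random_state=seed)
        .fit(X_train, y_train)
        .predict(candidates)
        for seed in range(n_seeds)
    ])
    mu = preds.mean(axis=0)
    sigma_hat = preds.std(axis=0)
    cov = sigma_hat / np.abs(mu)             # coefficient of variation
    # Distance term d: exclude candidates within d_min of a sampled point.
    too_close = cdist(candidates, X_train).min(axis=1) < d_min
    cov[too_close] = -np.inf
    return np.argsort(cov)[::-1]             # best (largest CoV) first
```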

Journal ArticleDOI
TL;DR: A review of the adaptive schemes for kriging proposed in the literature is presented, to provide the reader with an overview of the main principles of adaptive techniques and insightful details for pertinently employing available tools depending on the application at hand.
Abstract: Metamodels aim to approximate characteristics of functions or systems from the knowledge extracted from only a finite number of samples. In recent years, kriging has emerged as a widely applied metamodeling technique for resource-intensive computational experiments. However, its prediction quality is highly dependent on the size and distribution of the given training points. Hence, in order to build proficient kriging models with as few samples as possible, adaptive sampling strategies have gained considerable attention. These techniques aim to find pertinent points in an iterative manner based on information extracted from the current metamodel. A review of the adaptive schemes for kriging proposed in the literature is presented in this article. The objective is to provide the reader with an overview of the main principles of adaptive techniques, along with insightful details for pertinently employing the available tools depending on the application at hand. In this context, commonly applied strategies are compared with regard to their characteristics and approximation capabilities. In light of these experiments, it is found that the success of a scheme depends on the features of the specific problem and the goal of the analysis. To facilitate entry into adaptive sampling, a guide is provided. All experiments described herein are replicable using a provided open-source toolbox.
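As a minimal illustration of the variance-driven schemes this review surveys, the sketch below fits a kriging (Gaussian process) model and selects the candidate with the largest predictive standard deviation; the kernel, data, and candidate grid are arbitrary placeholder choices.

```python
# Sketch: variance-based adaptive sampling for a kriging (GP) metamodel.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

X = np.array([[0.0], [1.0], [2.5]])          # points simulated so far
y = np.sin(X).ravel()                        # placeholder responses
candidates = np.linspace(0, 3, 100).reshape(-1, 1)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X, y)
mean, std = gp.predict(candidates, return_std=True)
x_next = candidates[np.argmax(std)]          # most uncertain location
```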

93 citations

Journal ArticleDOI
TL;DR: A multiobjective optimization heuristic which iteratively updates and queries the RSM to identify the design points with the highest expected improvement, and is compared with state-of-the-art techniques such as response-surface Pareto iterative refinement (ReSPIR) and the nondominated-sorting genetic algorithm (NSGA-II).
Abstract: This paper presents OSCAR, an optimization methodology exploiting the spatial correlation of multicore design spaces. The paper builds upon the observation that the power consumption and performance metrics of spatially close design configurations (or points) are statistically correlated. We propose to exploit this correlation by using a response surface model (RSM), i.e., a closed-form expression suitable for predicting the quality of nonsimulated design points. This model is useful during the design space exploration (DSE) phase to quickly converge to the Pareto set of the multiobjective problem without executing lengthy simulations. To this end, we introduce a multiobjective optimization heuristic which iteratively updates and queries the RSM to identify the design points with the highest expected improvement. The RSM makes it possible to consolidate the Pareto set while reducing the number of simulations required, thus speeding up the exploration process. We compare the proposed heuristic with state-of-the-art approaches [conventional, RSM-based, and structured design of experiments (DoEs)]. Experimental results show that OSCAR is a faster heuristic than state-of-the-art techniques such as response-surface Pareto iterative refinement (ReSPIR) and the nondominated-sorting genetic algorithm (NSGA-II). In fact, OSCAR used fewer simulations to produce a similar solution, i.e., an average of 150 simulations instead of 320 (NSGA-II) and 178 (ReSPIR). When the number of design points is fixed at an average of 300, OSCAR achieves less than 0.6% average distance from the reference solution, while NSGA-II achieves 3.4%. Reported results also show that OSCAR can significantly improve on structured DoE approaches while only slightly increasing the number of experiments.
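The expected-improvement criterion at the heart of such RSM-driven exploration can be sketched compactly. Note this is the standard single-objective EI formula for minimization, offered as a simplified illustration; OSCAR's actual heuristic is multiobjective and differs in detail.

```python
# Sketch: standard expected improvement (EI) for minimization.
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """EI of candidates with predicted mean mu and std sigma, given the
    best (lowest) simulated objective value f_best."""
    sigma = np.maximum(sigma, 1e-12)         # avoid division by zero
    z = (f_best - mu) / sigma
    return (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# Example: pick the design point with the highest EI under an RSM's output.
mu = np.array([1.2, 0.9, 1.0])
sigma = np.array([0.05, 0.30, 0.10])
print(np.argmax(expected_improvement(mu, sigma, f_best=1.0)))  # -> 1
```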

37 citations


Cites methods from "Accurate and efficient processor pe..."

  • ...The selection of a specific type of RSM (model selection) has been addressed by means of data mining [3], genetically evolved neural networks [28], and genetically evolved linear models [29]....


  • ...For these reasons, analytical models [or response surface models (RSMs)] are increasingly used to approximate the profiling information obtained with computer simulations by effectively replacing it during optimization [3], [4]....


Patent
07 Jan 2010
TL;DR: In this paper, the authors present a technique for implementing policies in a data store: a second schema is defined based at least in part on a policy and an ontology, and data is stored under it based on a mapping of the first schema to the second schema.
Abstract: Techniques for implementing policies. In an embodiment, first data is stored in a first data store according to a first schema. A second schema is defined based at least in part on a policy and an ontology. Second data, which includes at least a portion of the first data, is stored in a second data store according to the second schema. Storing the second data is based at least in part on a mapping of the first schema to the second schema. At least a portion of the second data is analyzed and results of the analysis are provided to a user.
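A purely illustrative sketch of the data flow this abstract describes, with all names and the field mapping invented: records stored under a first schema are rewritten under a second, policy-derived schema via a schema mapping and then analyzed.

```python
# Sketch: copy data from a first schema into a second, policy-derived
# schema via a field mapping, then analyze the second data store.
first_store = [{"uid": 1, "dept": "mem", "cycles": 120},
               {"uid": 2, "dept": "alu", "cycles": 95}]

# Second schema (in the patent, derived from a policy and an ontology).
schema_map = {"uid": "user_id", "cycles": "latency"}   # first -> second field

second_store = [{new: rec[old] for old, new in schema_map.items()}
                for rec in first_store]

# Analyze the second data and report results to a user.
avg_latency = sum(r["latency"] for r in second_store) / len(second_store)
print(f"average latency: {avg_latency}")
```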

37 citations

Patent
31 May 2011
TL;DR: In this paper, machine learning techniques are used to create models of systems, and those models are used to determine optimal configurations for a system while remaining within budgetary and/or resource constraints.
Abstract: Techniques for tuning systems generate configurations that are used to test the systems to determine optimal configurations for the systems. The configurations for a system are generated to allow for effective testing of the system while remaining within budgetary and/or resource constraints. The configurations may be selected to satisfy one or more conditions on their distributions to ensure that a satisfactory set of configurations are tested. Machine learning techniques may be used to create models of systems and those models can be used to determine optimal configurations.
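A minimal sketch of this tuning loop under invented details: candidate configurations are generated, a budget-limited test set is resampled until a simple distribution condition holds (every parameter level is covered), and a learned model picks the predicted-best configuration.

```python
# Sketch: budget-limited configuration testing plus model-based tuning.
import itertools
import numpy as np
from sklearn.ensemble import RandomForestRegressor

levels = {"threads": [1, 2, 4, 8], "cache_mb": [64, 128, 256]}
configs = np.array(list(itertools.product(*levels.values())))

rng = np.random.default_rng(2)
budget = 6
while True:  # resample until every level of every parameter is covered
    test_idx = rng.choice(len(configs), size=budget, replace=False)
    chosen = configs[test_idx]
    if all(set(vals) <= set(chosen[:, i])
           for i, vals in enumerate(levels.values())):
        break

# Placeholder benchmark results for the tested configurations.
scores = -np.abs(chosen[:, 0] - 4) + chosen[:, 1] / 128

model = RandomForestRegressor(random_state=0).fit(chosen, scores)
best = configs[np.argmax(model.predict(configs))]
print("predicted best configuration:", dict(zip(levels, best)))
```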

32 citations

References
Journal ArticleDOI
TL;DR: A general gradient descent boosting paradigm is developed for additive expansions based on any fitting criterion, and specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, as well as multiclass logistic likelihood for classification.
Abstract: Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient descent “boosting” paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification. Special enhancements are derived for the particular case where the individual additive components are regression trees, and tools for interpreting such “TreeBoost” models are presented. Gradient boosting of regression trees produces competitive, highly robust, interpretable procedures for both regression and classification, especially appropriate for mining less-than-clean data. Connections between this approach and the boosting methods of Freund and Schapire, and of Friedman, Hastie and Tibshirani, are discussed.
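The least-squares case of this “TreeBoost” idea is compact enough to sketch from scratch: for squared loss, the negative gradient is just the residual, so each stage fits a small regression tree to the residuals and adds it with a shrinkage factor. The data and hyperparameters below are illustrative.

```python
# Sketch: least-squares gradient boosting of regression trees from scratch.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 200)

learning_rate = 0.1                      # small steps generalize better [6]
F = np.full_like(y, y.mean())            # F_0: best constant prediction
trees = []
for _ in range(100):
    residuals = y - F                    # negative gradient of squared loss
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    trees.append(tree)
    F += learning_rate * tree.predict(X)

def predict(X_new):
    return y.mean() + learning_rate * sum(t.predict(X_new) for t in trees)
```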

17,764 citations


"Accurate and efficient processor pe..." refers background in this paper

  • ...[6] shows that using small values of the gradient descent step size always leads to better prediction performance....


Journal ArticleDOI
TL;DR: This article gives an introduction to the subject of classification and regression trees by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples.
Abstract: Classification and regression trees are machine-learning methods for constructing prediction models from data. The models are obtained by recursively partitioning the data space and fitting a simple prediction model within each partition. As a result, the partitioning can be represented graphically as a decision tree. Classification trees are designed for dependent variables that take a finite number of unordered values, with prediction error measured in terms of misclassification cost. Regression trees are for dependent variables that take continuous or ordered discrete values, with prediction error typically measured by the squared difference between the observed and predicted values. This article gives an introduction to the subject by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011, 1, 14-23. DOI: 10.1002/widm.8
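The recursive-partitioning step at the core of these methods can be sketched in a few lines: for one feature, scan every candidate threshold and keep the split that minimizes the total squared error of the two partitions (a full tree builder would apply this recursively over all features). The toy data are invented.

```python
# Sketch: the best-split search at the heart of regression-tree induction.
import numpy as np

def best_split(x, y):
    """Return (threshold, sse) of the best binary split of 1-D feature x."""
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]
    best = (None, np.inf)
    for i in range(1, len(x_sorted)):
        left, right = y_sorted[:i], y_sorted[i:]
        sse = (((left - left.mean()) ** 2).sum()
               + ((right - right.mean()) ** 2).sum())
        if sse < best[1]:
            best = ((x_sorted[i - 1] + x_sorted[i]) / 2, sse)
    return best

x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([0.1, 0.2, 0.1, 5.0, 5.1, 4.9])
print(best_split(x, y))   # threshold near 6.5 separates the two regimes
```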

16,974 citations

Book
01 Jan 1983
TL;DR: The methodology used to construct tree-structured rules is the focus of this monograph, which covers the use of trees as a data analysis method and, in a more mathematical framework, proves some of their fundamental properties.
Abstract: The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.

14,825 citations