
Showing papers on "Performance prediction" published in 1992


Proceedings ArticleDOI
09 Sep 1992
TL;DR: A performance prediction scheme is presented which could be incorporated into a compiler to determine the best data distribution for a given program and system.
Abstract: The effective utilization of a cluster of workstations for the implementation of a scientific application requires a highly flexible software environment. The characteristics of such an environment are considered and two novel data distribution aspects of this environment are explored. The performance of a distributed memory multicomputer, such as a workstation cluster, is very sensitive to the strategy used to distribute data to the processors. A performance prediction scheme is presented which could be incorporated into a compiler to determine the best data distribution for a given program and system. For heterogeneous systems a data distribution strategy has been developed which takes into account the different capabilities of the processors. A number of experiments have been conducted on workstation clusters to demonstrate these software techniques.
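As a rough illustration of the heterogeneous data distribution idea in the abstract above, the sketch below splits the rows of an array among processors in proportion to their relative speeds. The function name `proportional_blocks` and the speed values are hypothetical; the listing does not describe the paper's actual strategy, so this shows only one plausible reading of "takes into account the different capabilities of the processors".

```python
def proportional_blocks(n_items, speeds):
    """Split n_items into contiguous block sizes proportional to speeds.

    Hypothetical helper: uses largest-remainder rounding so that the
    block sizes always sum to n_items.
    """
    total = sum(speeds)
    exact = [n_items * s / total for s in speeds]
    sizes = [int(x) for x in exact]            # floor of each exact share
    leftover = n_items - sum(sizes)
    # Give the remaining items to the processors with the largest
    # fractional parts of their exact share.
    order = sorted(range(len(speeds)),
                   key=lambda i: exact[i] - sizes[i],
                   reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes


# Three workstations with assumed relative speeds 1 : 2 : 5.
print(proportional_blocks(100, [1.0, 2.0, 5.0]))
```

A faster workstation receives a larger block, so all processors would finish at roughly the same time if computation cost were proportional to block size and communication were ignored.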

58 citations


Journal ArticleDOI
TL;DR: A development process for parallel programs that launches performance engineering in the early design phase is proposed, based on a Petri net specification methodology for the performance critical parts of a parallel system.

53 citations


Proceedings ArticleDOI
24 Jun 1992
TL;DR: An effective approach to the performance evaluation without recourse to simulations is presented in this paper, based on a performance measure of hybrid nature that is a continuous-valued matrix function of a discrete-valued sequence - the system mode sequence.
Abstract: The Interacting Multiple Model (IMM) algorithm has been shown to be one of the most cost-effective hybrid state estimation schemes. Its performance, however, could only be evaluated via expensive Monte-Carlo simulations. An effective approach to the performance evaluation without recourse to simulations is presented in this paper. This approach is based on a performance measure of hybrid nature in the sense that it is a continuous-valued matrix function of a discrete-valued sequence - the system mode sequence. This system mode sequence is an essential description of the scenario of the problem on which the performance of the algorithm is dependent and being predicted. The performance measure is efficiently calculated in an off-line recursion. The capability of this approach in predicting quantitatively the average performance of the algorithm is illustrated via two important examples: a generic Air Traffic Control tracking problem and a nonstationary noise identification problem.

49 citations


Proceedings ArticleDOI
01 Aug 1992
TL;DR: A new performance prediction tool is introduced, which automatically derives performance estimates for single program multiple data (SPMD) parallel Fortran 77 programs based on distributed memory systems (DMS).
Abstract: In order to take on the challenge of fully automatic program parallelizing, one of the last and probably the most decisive missing tools is a performance estimation system. In this paper a new performance prediction tool is introduced, which automatically derives performance estimates for single program multiple data (SPMD) parallel Fortran 77 programs based on distributed memory systems (DMS). The underlying methodology is based on static and dynamic techniques. This paper discusses in particular a high level abstract description of the parallel program, which is utilized to derive performance estimates. The salient features of the overall design of this tool and its components are described.

44 citations


Proceedings ArticleDOI
01 Dec 1992
TL;DR: The authors describe how a single method based on communication-to-computation (C/C) ratio can be used to predict performance accurately and yet fairly simply in some commonly encountered cases.
Abstract: The authors' goal is to be able to predict the performance of a parallel program early in the program development process; to that end they require prediction methods that can be based on incomplete programs. They describe how a single method based on communication-to-computation (C/C) ratio can be used to predict performance accurately and yet fairly simply in some commonly encountered cases. They show how C/C-ratio-based methods are accomplished for both distributed-memory and coherent-memory multiprocessors. They show that focusing on C/C ratio simplifies the use of theory, machine benchmarking and application measurement necessary to provide good parallel performance prediction. In addition, the methods demonstrated are useful because they can be applied to program fragments, or serially executed code.
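To make the role of the C/C ratio concrete, here is a minimal sketch under a textbook-style assumption that is not taken from the paper: work divides evenly over p processors, and each processor spends `cc_ratio` units of non-overlapped communication time per unit of computation time. The function names are hypothetical.

```python
def predicted_speedup(p, cc_ratio):
    """Speedup on p processors when communication is not overlapped.

    Assumed model: parallel time = (T_serial / p) * (1 + cc_ratio),
    so speedup = p / (1 + cc_ratio).
    """
    return p / (1.0 + cc_ratio)


def predicted_efficiency(cc_ratio):
    """Parallel efficiency (speedup / p) under the same assumption."""
    return 1.0 / (1.0 + cc_ratio)


for r in (0.0, 0.25, 1.0):
    print(f"C/C = {r:4.2f}: speedup on 16 processors = "
          f"{predicted_speedup(16, r):5.2f}")
```

Under this simplification the ratio alone fixes the efficiency, which is one reason a prediction can be made from a program fragment once its C/C ratio has been estimated.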

29 citations


Proceedings ArticleDOI
01 Jun 1992
TL;DR: In this paper, a set of computer programs for the performance prediction of shaft-power and jet-propulsion cycles, such as simple, regenerative, intercooled-regenerative, turbojet and turbofan cycles, are presented.
Abstract: The performance of gas-turbine engines is the result of choices of type of cycle for the application, cycle temperature ratio, pressure ratio, cooling flows and component losses. The output is usually given as efficiency versus specific power. The type of efficiency of interest (thermal, propulsive, specific thrust, overall efficiency) must be specified. This paper presents a set of computer programs for the performance prediction of shaft-power and jet-propulsion cycles, such as simple, regenerative, intercooled-regenerative, turbojet and turbofan cycles. Each cycle is constructed using individual component modules. Realistic default assumptions are made by the programs, or other values can be specified by the user for component efficiencies as functions of pressure ratio, cooling mass-flow rate as a function of cooling technology levels, and various other cycle losses. The programs can be used to predict design point and off-design point operation using appropriate component efficiencies. The effects of various cycle choices on overall performance are discussed. Copyright © 1992 by ASME

28 citations


Journal ArticleDOI
TL;DR: In this paper, the average daily solar system performance was calculated from the product of clear-sky solar performance and the average time fraction of clear sky, and the predicted results compare favourably with results predicted by the φf chart method.

25 citations


Journal ArticleDOI
TL;DR: In this paper, two computational methods for predicting the performance of horizontal axis wind turbines (HAWTs) are described: the first is a nonlinear lifting line procedure with a helical wake model, and the second is a panel method with a new wake model.

23 citations


Proceedings ArticleDOI
09 Jun 1992
TL;DR: A general framework for analyzing the performance of this type of computation for any given topology is discussed and models for two widely used parallel programming strategies: processor farms and divide and conquer are derived.
Abstract: An efficient execution model for tree structured computations is presented. A general framework for analyzing the performance of this type of computation for any given topology is discussed. The framework is used to derive models for two widely used parallel programming strategies: processor farms and divide and conquer. The models were validated on a large multicomputer, and it was shown that their accuracy is such that they can be used to predict the performance of applications that use the above strategies. The use of these models to evaluate performance and to restructure the application to improve performance is discussed.

15 citations


Journal ArticleDOI
TL;DR: The time domain code ASSPIN gives acousticians a powerful technique of advanced propeller noise prediction, using exact solutions of the Ffowcs Williams-Hawkings equation with exact blade geometry and kinematics.
Abstract: The time domain code ASSPIN gives acousticians a powerful technique of advanced propeller noise prediction. Except for nonlinear effects, the code uses exact solutions of the Ffowcs Williams-Hawkings equation with exact blade geometry and kinematics. By including nonaxial inflow, periodic loading noise, and adaptive time steps to accelerate computer execution, the development of this code becomes complete.

13 citations


01 Feb 1992
TL;DR: In this article, measured and predicted rotor performance for the SERI advanced wind turbine blades were compared to assess the accuracy of predictions and to identify the sources of error affecting both predictions and measurements.
Abstract: Measured and predicted rotor performance for the SERI advanced wind turbine blades were compared to assess the accuracy of predictions and to identify the sources of error affecting both predictions and measurements. An awareness of these sources of error contributes to improved prediction and measurement methods that will ultimately benefit future rotor design efforts. Propeller/vane anemometers were found to underestimate the wind speed in turbulent environments such as the San Gorgonio Pass wind farm area. Using sonic or cup anemometers, good agreement was achieved between predicted and measured power output for wind speeds up to 8 m/sec. At higher wind speeds an optimistic predicted power output and the occurrence of peak power at wind speeds lower than measurements resulted from the omission of turbulence and yaw error. In addition, accurate two-dimensional (2-D) airfoil data prior to stall and a post stall airfoil data synthesization method that reflects three-dimensional (3-D) effects were found to be essential for accurate performance prediction. 11 refs.

Proceedings ArticleDOI
15 Jun 1992
TL;DR: In this paper, a linear performance prediction model for IC binning was built from electrical measurements collected before packaging on a high-volume manufacturing line, in order to predict the high-speed performance of manufactured parts before final test.
Abstract: A model for integrated circuit (IC) binning has been built using measurements collected on a high-volume manufacturing line. This model uses electrical measurements collected before packaging in order to predict the high-speed performance of manufactured parts before final test. A small set of measurable parameters responsible for the variation in fabricated parts was identified. The statistically significant, linear performance prediction model was built using data from a 1-Mbit CMOS EPROM production line. The applications of the model include aiding the packaging decision, production planning and scheduling, process characterization, and control and design for manufacturability.
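The kind of linear prediction the abstract describes can be sketched with an ordinary least-squares fit. Everything specific below is an assumption made for illustration: the single predictor, the made-up numbers, the 135 ns bin limit, and the names `fit_line` and `speed_bin`. The paper's model uses a small set of parameters and real production data, neither of which appears in this listing.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Made-up illustrative numbers, NOT data from the paper:
# a pre-packaging electrical measurement vs. final-test access time (ns).
measurement = [1.0, 2.0, 3.0, 4.0]
access_time = [150.0, 140.0, 130.0, 120.0]

a, b = fit_line(measurement, access_time)
predicted = a + b * 3.5                     # predict for an unpackaged part
speed_bin = "fast" if predicted <= 135.0 else "slow"   # hypothetical bin limit
print(a, b, predicted, speed_bin)
```

A prediction like this, made before packaging, is what would let the packaging decision depend on the expected speed bin.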

01 Mar 1992
TL;DR: Computer code TD2 computes design point velocity diagrams and performance for multistage, multishaft, cooled or uncooled, axial flow turbines and was recently modified to upgrade modeling related to turbine cooling and to the internal loss correlation.
Abstract: Computer code TD2 computes design point velocity diagrams and performance for multistage, multishaft, cooled or uncooled, axial flow turbines. This streamline analysis code was recently modified to upgrade modeling related to turbine cooling and to the internal loss correlation. These modifications are presented in this report along with descriptions of the code's expanded input and output. This report serves as the user's manual for the upgraded code, which is named TD2-2.

01 Jul 1992
TL;DR: In this article, an analytical method was proposed to predict the thermal performance of the hypervapotron made of materials other than copper, and preliminary results showed an excellent agreement between experimental results and analytical prediction over a wide range of flow velocities, pressures, subcooling temperatures and heat fluxes.
Abstract: A hypervapotron is a water-cooled device which combines the advantages of finned surfaces with the large heat transfer rates possible during boiling heat transfer. Hypervapotrons have been used as beam dumps in the past and plans are under way to use them for divertor cooling in the Joint European Torus (JET). Experiments at JET have shown that a surface heat flux of 25 MW/m² can be achieved in hypervapotrons. This performance makes such a device very attractive for cooling the divertor of the International Thermonuclear Experimental Reactor (ITER). This paper presents an analytical method to predict the thermal performance of hypervapotrons. Preliminary results show an excellent agreement between experimental results and analytical prediction over a wide range of flow velocities, pressures, subcooling temperatures and heat fluxes. This paper also presents the predicted performance of hypervapotrons made of materials other than copper. After further development and verification, the analytical method could be used for optimizing designs and performance prediction.

21 Jul 1992
TL;DR: This report presents the theoretical foundations of the RASP model as well as the numerical implementation of this theory and the results of a sample execution of the model.
Abstract: In 1984, the Naval Research Laboratory (NRL) published a report that described a sequence of computer programs to predict long-range, low-frequency monostatic or bistatic reverberation for either the ocean surface or bottom. Since that time, numerous improvements and extensions have been made to the original sequence of programs that have incorporated advances in the theory and understanding of underwater acoustics, numerical modeling, and computer software. Examples of enhancements include the addition of predicted target returns, improved spatial interpolations of sound speed, and the application of a wave-theoretic treatment of caustics. The present collective version of the programs is now referred to as the Range-dependent Active System Performance, or RASP, model. This report presents the theoretical foundations of the RASP model as well as the numerical implementation of this theory. Further, a detailed description of model software and instructions for model execution are provided along with the results of a sample execution.

01 Jan 1992
TL;DR: It is shown that the Lanczos algorithm with the Cholesky factorization scheme is far superior to the sub-space iteration method of eigensolution when substantial numbers of eigenvectors are required for control design and/or performance optimization.
Abstract: Simply transporting design codes from sequential-scalar computers to parallel-vector computers does not fully utilize the computational benefits offered by high performance computers. By performing integrated controls and structures design on an experimental truss platform with both sequential-scalar and parallel-vector design codes, conclusive results are presented to substantiate this claim. The efficiency of a Cholesky factorization scheme in conjunction with a variable-band row data structure is presented. In addition, the Lanczos eigensolution algorithm has been incorporated in the design code for both parallel and vector computations. Comparisons of computational efficiency between the initial design code and the parallel-vector design code are presented. It is shown that the Lanczos algorithm with the Cholesky factorization scheme is far superior to the sub-space iteration method of eigensolution when substantial numbers of eigenvectors are required for control design and/or performance optimization. Integrated design results show the need for continued efficiency studies in the area of element computations and matrix assembly.

Proceedings ArticleDOI
03 Feb 1992
TL;DR: Using an analytical model, it is shown that the proposed method achieves a significant increase in the throughput of database systems using redundant disk arrays by reducing the number of recovery operations needed to maintain the consistency of the database.
Abstract: The authors propose a method for using redundant disk arrays to support rapid recovery from system crashes and transaction aborts in addition to their role in providing media failure recovery. A twin-page scheme is used to store the parity information in the array, making it possible to keep the old version of the parity along with the new version. The old version of the parity is used to undo updates performed by aborted transactions or by transactions interrupted by a system failure. Using an analytical model, it is shown that the proposed method achieves a significant increase in the throughput of database systems using redundant disk arrays by reducing the number of recovery operations needed to maintain the consistency of the database.
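The undo step described above follows from XOR parity arithmetic: since new parity = old parity XOR old data XOR new data, the old data can be recovered from the new data and the two parity versions. The toy model below is a sketch under assumptions, not the authors' design: pages are small integers, the class `ParityGroup` and its methods are hypothetical, and only one uncommitted update per parity group can be undone.

```python
from functools import reduce
from operator import xor


class ParityGroup:
    """Toy parity group: data pages are ints, parity is their XOR.

    Hypothetical model: two parity slots (the "twin pages") hold the
    previous and the current version of the parity.
    """

    def __init__(self, pages):
        self.pages = list(pages)
        parity = reduce(xor, self.pages, 0)
        self.twin = [parity, parity]        # [old parity, current parity]

    def update(self, index, new_value):
        """Write a page and keep the pre-update parity in the twin slot."""
        old_value = self.pages[index]
        self.twin[0] = self.twin[1]
        self.twin[1] = self.twin[1] ^ old_value ^ new_value
        self.pages[index] = new_value

    def undo(self, index):
        """Rebuild the pre-update page from the two parity versions."""
        old_parity, new_parity = self.twin
        self.pages[index] = self.pages[index] ^ new_parity ^ old_parity
        self.twin[1] = old_parity


group = ParityGroup([0b1010, 0b0110, 0b1111])
group.update(1, 0b0001)     # a transaction overwrites page 1
group.undo(1)               # the transaction aborts
print(group.pages)
```

No separate undo log record is read in this sketch; the old page value comes entirely from the retained parity, which is the property the abstract attributes to the twin-page scheme.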

Dissertation
01 Jan 1992
TL;DR: The objective of this thesis is to develop fast, efficient simulation techniques that reduce simulation time sufficiently for the performance of large parallel programs to be examined in a reasonable time; when incorporated in simulation tools, such techniques will reduce design time, improve multiprocessor designs, and enable the concurrent development of optimized multiprocessor software.
Abstract: Simulation has emerged as the primary means for evaluating the design of multiprocessor systems. Simulation of such systems has become increasingly time-consuming because of the increasing complexity of the interactions between components of the systems and the need to simulate large parallel programs to obtain accurate performance prediction. The objective of this thesis is to develop fast efficient simulation techniques that can reduce simulation time sufficiently so that the performance of large parallel programs can be examined in a reasonable time. Such techniques when incorporated in simulation tools will reduce design time, improve multiprocessor designs, and enable the concurrent development of optimized multiprocessor software. SIMPLE (Simulation Instrument for Multiprocessors at Program Level using Emulation), a software package for simulating multiprocessor systems efficiently, uses high-resolution clocks on the host systems to directly obtain the timing of code-segments. SIMPLE automatically instruments parallel programs, directly executes the instrumented programs, and incorporates architecture parameters. SIMPLE usually takes no more than 2 instructions to simulate 1 instruction, compared to 300 or more for accurate instruction-level simulators. Techniques were developed to parallelize multiprocessor simulators. The importance of dynamic topological information and global simulation information was identified, and the DLC (Dynamic Logical Channel) and DAI (Direct Access of Information) schemes were developed to exploit this information. In the DLC scheme, parallel programs are analyzed and instrumented before simulation. During simulation, information on the interactions between execution threads is gathered. By exploiting this information, the set of possible interactions is limited and simulation parallelism is improved. The DAI scheme aggressively collects useful simulation information in shared-memory multiprocessor hosts to reduce nonessential blocking and resolve local deadlocks. The simulation overhead of collecting information is reduced by search-pruning techniques. The DLC and DAI schemes were used to parallelize SIMPLE. A prototype of parallel SIMPLE was constructed on the Sequent Symmetry multiprocessor.

01 Jan 1992
TL;DR: In this paper, a free-wake, vortex embedded, full-potential CFD method, called HELIX-I, was used to validate the performance of a UH-60 rotor.
Abstract: This is an effort aimed at validating recent hover prediction methods. The experimental basis for this validation work is an extensive set of loads, wake and performance data, which were obtained from a pressure instrumented model UH-60 rotor tested at the Sikorsky hover test facility and at Duits-Nederlandse Windtunnel (DNW). This model was equipped with replaceable tips - including a tapered and a BERP-type tip - which permitted studies of the effects of rotor geometry. The central prediction method studied is a free-wake, vortex embedded, full-potential CFD method - called HELIX-I. It is found that the HELIX-I code produces very good comparisons with the data including wake, surface pressure and performance. Comparisons with the measured radial load distributions have permitted an improved understanding of the wake resolution modelling requirements of CFD methods. Since HELIX-I is a combined Eulerian/Lagrangian method, limited comparisons are also made with a Lagrangian boundary element code (called EHPIC) and an Eulerian Navier-Stokes code (called TURNS). In most cases all methods produce good comparisons with the data. It is found that the HELIX-I code provides a good compromise between the speed of boundary integral methods and the comprehensive nature of Navier-Stokes methods.


Journal ArticleDOI
TL;DR: It is pointed out that the use of the database system has limited accuracy if only ship's main particulars are used for the database access and an idea of using a type ship concept for improvement of the maneuverability prediction is proposed.
Abstract: A study on the database system aimed at the prediction of ship's maneuverability has been carried out. The database system covering full-scale ship's performance, hydrodynamic ship model data and ship geometry is presented. The full scale ship maneuvering performance data and model hydrodynamic data are analyzed here. The results are explained by comparison with series ship model test data. It is pointed out that the use of the database system has limited accuracy if only ship's main particulars are used for the database access. To remedy this situation, an idea of using a type ship concept for improvement of the maneuverability prediction is proposed.

01 Feb 1992
TL;DR: A compilation of several lunar surface thermal management and power system studies completed under contract and IR&D is presented in this article, which includes analysis and preliminary design of all major components of an integrated thermal management system, including loads determination, active internal acquisition and transport equipment, external transport systems (active and passive), passive insulation, solar shielding, and a range of lunar surface radiator concepts.
Abstract: A compilation of several lunar surface thermal management and power system studies completed under contract and IR&D is presented. The work includes analysis and preliminary design of all major components of an integrated thermal management system, including loads determination, active internal acquisition and transport equipment, external transport systems (active and passive), passive insulation, solar shielding, and a range of lunar surface radiator concepts. Several computer codes were utilized in support of this study, including RADSIM to calculate radiation exchange factors and view factors, RADIATOR (developed in-house) for heat rejection system sizing and performance analysis over a lunar day, SURPWER for power system sizing, and CRYSTORE for cryogenic system performance predictions. Although much of the work was performed in support of lunar rover studies, any or all of the results can be applied to a range of surface applications. Output data include thermal loads summaries, subsystem performance data, mass, and volume estimates (where applicable), integrated and worst-case lunar day radiator size/mass and effective sink temperatures for several concepts (shielded and unshielded), and external transport system performance estimates for both single and two-phase (heat pumped) transport loops. Several advanced radiator concepts are presented, along with brief assessments of possible system benefits and potential drawbacks. System point designs are presented for several cases, executed in support of the contract and IR&D studies, although the parametric nature of the analysis is stressed to illustrate applicability of the analysis procedure to a wide variety of lunar surface systems. The reference configuration(s) derived from the various studies will be presented along with supporting criteria. A preliminary design will also be presented for the reference basing scenario, including qualitative data regarding TPS concerns and issues.

Journal ArticleDOI
TL;DR: This paper presents the simulation of a class of MIMD systems using discrete-event simulation enhanced by graphics-oriented reporting to aid in the analysis, understanding, design, and performance prediction of a class of MIMD systems in a user-friendly environment.
Abstract: This paper presents the simulation of a class of MIMD systems using discrete-event simulation enhanced by graphics-oriented reporting. MIMD systems can be specified, modelled and simulated using the package and in turn provide estimates of certain performance indices. A multiserver queueing model is constructed to describe the flow of data and instructions through the various elements of the system. The model is composed of P processors sharing M memory modules through a user-defined interconnection network. Graphical outputs allow the user to view the state of every processor/memory module over the simulation time with performance estimates such as the relative speedup, throughput, and utilization factor. Moreover, system performance graphs as a function of various system parameters are obtained to indicate the expected system behaviour for various loads and system configurations. The results section shows a case study of the influence of the memory modules' access time on the system performance. The purpose of the results is to aid in the analysis, understanding, design, and performance prediction of a class of MIMD systems in a user-friendly environment.
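For intuition about P processors contending for M memory modules, the sketch below estimates memory bandwidth with a deliberately simplified cycle-by-cycle Monte Carlo model. It is not the paper's discrete-event queueing simulator: the uniform request pattern, the one-request-per-module service rule, and the dropping (rather than queueing) of conflicting requests are all assumptions, and `simulate_bandwidth` is a hypothetical name.

```python
import random


def simulate_bandwidth(processors, modules, cycles, seed=0):
    """Average number of memory requests served per cycle.

    Assumed model: every processor issues one request per cycle to a
    uniformly random module, each module serves one request per cycle,
    and conflicting requests are dropped instead of queued.
    """
    rng = random.Random(seed)
    served = 0
    for _ in range(cycles):
        requested = {rng.randrange(modules) for _ in range(processors)}
        served += len(requested)        # one request served per busy module
    return served / cycles


for m in (2, 4, 8, 16):
    bw = simulate_bandwidth(processors=8, modules=m, cycles=10_000)
    print(f"M = {m:2d}: bandwidth = {bw:.2f} requests/cycle, "
          f"processor utilization = {bw / 8:.2f}")
```

Sweeping one system parameter while holding the others fixed, as the loop does for M, mirrors the kind of performance-versus-parameter graph the abstract mentions.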


Proceedings ArticleDOI
24 Mar 1992
TL;DR: A high-performance processor circuit called the SC-3 has been developed to meet the requirements of advanced experiment and attitude control applications; it is based on the 16 MHz Intel 80386/80387 chip set and implements a dual bus system configuration which allows high-speed, 32-bit wide memory and low-speed, 16-bit wide I/O circuits to be separated.
Abstract: A high-performance processor circuit called the SC-3 has been developed to meet the requirements of advanced experiment and attitude control applications. It is based on the 16 MHz Intel 80386/80387 chip set and implements a dual bus system configuration which allows high-speed, 32-bit wide memory and low-speed, 16-bit wide Input/Output (I/O) circuits to be separated. This separation maintains compatibility with a wide range of current I/O circuit designs while exploiting the high-bandwidth memory access capabilities of the 80386. Performance is further enhanced by means of a cache on the 32-bit bus. Gibson, Whetstone, and Dhrystone instruction mixes have been used to evaluate performance under various operating modes. When the SC-3 is constrained to execute from 16-bit memory, the Gibson mix indicates a 32% performance improvement compared to previous 16-bit processors. An average of 1.1 million Whetstones per second are performed over the typical range of memory wait states. The average Dhrystone performance improvement between 32-bit non-cached and 32-bit cached operation over a typical range of memory wait states is 115%. The initial application of this processor circuit is on Stanford University's Gravity Probe-B experiment.


Patent
16 Dec 1992
TL;DR: In this paper, the authors present a computer performance prediction system which executes a performance prediction with an entire system as an object from system constituting data at a specification retrieving stage, as for a computer development.
Abstract: PURPOSE: To obtain a computer performance prediction system which executes a performance prediction with an entire system as an object from system constituting data at a specification retrieving stage, as for a computer development. CONSTITUTION: Partial simulation executing means 21-26 respectively equipped with data tables 31-36 correspond to each resource constituting the computer system as the object of the performance prediction. A simulation managing means 20 successively monitors the situation of the simulation being executed by each partial simulation executing means 21-26, and when the situation assumed to change the states of the other resources is generated, the partial simulation executing means corresponding to the resource is urged to transit the state, and the parameter of a system model affected by this state transition is changed, and stored in a data table 30.


Journal Article
TL;DR: A parallel matrix multiplication algorithm is presented, and studies of its performance and estimation are discussed, and an efficient scheme for partitioning the input matrices is introduced which enables overlapping computation with communication.
Abstract: A parallel matrix multiplication algorithm is presented, and studies of its performance and estimation are discussed. The algorithm is implemented on a network of transputers connected in a ring topology. An efficient scheme for partitioning the input matrices is introduced which enables overlapping computation with communication. This makes the algorithm achieve near-ideal speed-up for reasonably large matrices. Analytical expressions for the execution time of the algorithm have been derived by analysing its computation and communication characteristics. These expressions are validated by comparing the theoretical results of the performance with the experimental values obtained on a four-transputer network for both square and irregular matrices. The analytical model is also used to estimate the performance of the algorithm for a varying number of transputers and varying problem sizes. Although the algorithm is implemented on transputers, the methodology and the partitioning scheme presented in this paper are quite general and can be implemented on other processors which have the capability of overlapping computation with communication. The equations for performance prediction can also be extended to other multiprocessor systems.
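The benefit of overlapping computation with communication can be seen in a small analytical sketch. The model below is an assumption made for illustration and is not the paper's derived expressions: each of p processors holds n/p rows of A and n/p columns of B, and in each of p steps it multiplies its A block by the B block it currently holds while forwarding that B block around the ring. The parameters `t_flop` (time per floating-point operation) and `t_word` (time to send one matrix element) are hypothetical, and message start-up costs are ignored.

```python
def ring_matmul_time(n, p, t_flop, t_word, overlap=True):
    """Estimated time to multiply two n x n matrices on a ring of p processors."""
    rows = n / p
    comp = 2 * rows * rows * n * t_flop     # (n/p x n) by (n x n/p) block product
    comm = rows * n * t_word                # forward one block of B to a neighbour
    # With overlap, a step costs the longer of the two activities;
    # without overlap, the two costs add up.
    step = max(comp, comm) if overlap else comp + comm
    return p * step


def speedup(n, p, t_flop, t_word, overlap=True):
    """Speedup relative to a serial multiply costing 2*n**3 operations."""
    serial = 2 * n ** 3 * t_flop
    return serial / ring_matmul_time(n, p, t_flop, t_word, overlap)


for n in (8, 64, 400):
    print(n,
          round(speedup(n, 4, 1.0, 10.0, overlap=True), 2),
          round(speedup(n, 4, 1.0, 10.0, overlap=False), 2))
```

In this model computation dominates once n exceeds p * t_word / (2 * t_flop), and the overlapped speedup then equals p, which is consistent with the abstract's claim of near-ideal speed-up for reasonably large matrices.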

01 Oct 1992
TL;DR: In this article, the reliability prediction process for large, closed (i.e., non-repairable) fault-tolerant, on-board satellite systems is addressed, along with techniques for reduction of error in prediction.
Abstract: The reliability prediction process for large, closed (i.e., non-repairable) fault-tolerant, on-board satellite systems is addressed, along with techniques for reduction of error in prediction. Three major error sources are discussed, only one of which can be effectively minimized by modelers. Specific error sources in model construction are identified. It is argued that, although errors exist in the reliability prediction process, complete accuracy is not required for effective use of the predictions in the design process. A set of three tools, CARE III (Bavu 84a, Bavu 84b), HARP (HARP 89), and CRAFTS (CRAF 88), is used to model an example on-board system design to illustrate minimization of errors. Guidelines for selecting the proper reliability tools are also given.