
Showing papers on "Reliability (statistics) published in 2000"


Journal ArticleDOI
TL;DR: A wider understanding of reliability and adoption of the typical error as the standard measure of reliability would improve the assessment of tests and equipment in the authors' disciplines.
Abstract: Reliability refers to the reproducibility of values of a test, assay or other measurement in repeated trials on the same individuals. Better reliability implies better precision of single measurements and better tracking of changes in measurements in research or practical settings. The main measures of reliability are within-subject random variation, systematic change in the mean, and retest correlation. A simple, adaptable form of within-subject variation is the typical (standard) error of measurement: the standard deviation of an individual’s repeated measurements. For many measurements in sports medicine and science, the typical error is best expressed as a coefficient of variation (percentage of the mean). A biased, more limited form of within-subject variation is the limits of agreement: the 95% likely range of change of an individual’s measurements between 2 trials. Systematic changes in the mean of a measure between consecutive trials represent such effects as learning, motivation or fatigue; these changes need to be eliminated from estimates of within-subject variation. Retest correlation is difficult to interpret, mainly because its value is sensitive to the heterogeneity of the sample of participants. Uses of reliability include decision-making when monitoring individuals, comparison of tests or equipment, estimation of sample size in experiments and estimation of the magnitude of individual differences in the response to a treatment. Reasonable precision for estimates of reliability requires approximately 50 study participants and at least 3 trials. Studies aimed at assessing variation in reliability between tests or equipment require complex designs and analyses that researchers seldom perform correctly. A wider understanding of reliability and adoption of the typical error as the standard measure of reliability would improve the assessment of tests and equipment in our disciplines.
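
For readers who want to compute these quantities, here is a minimal sketch (not from the paper) of the typical error, its coefficient-of-variation form, the limits of agreement and the retest correlation for two trials; all numeric values are invented for illustration:

```python
import numpy as np

# Two trials of the same test on the same participants (illustrative values).
trial1 = np.array([52.1, 48.3, 55.0, 50.2, 47.8, 53.6])
trial2 = np.array([53.0, 47.9, 54.1, 51.5, 48.6, 52.9])
diff = trial2 - trial1

# Systematic change in the mean between trials (learning, motivation, fatigue).
change_in_mean = diff.mean()

# Typical (standard) error of measurement: for two trials it is the standard
# deviation of the difference scores divided by sqrt(2).
typical_error = diff.std(ddof=1) / np.sqrt(2)

# Typical error expressed as a coefficient of variation (percentage of the mean).
cv_percent = 100 * typical_error / np.concatenate([trial1, trial2]).mean()

# 95% limits of agreement: likely range of change between two trials.
loa = (change_in_mean - 1.96 * diff.std(ddof=1),
       change_in_mean + 1.96 * diff.std(ddof=1))

# Retest correlation (sensitive to the heterogeneity of the sample).
retest_r = np.corrcoef(trial1, trial2)[0, 1]

print(change_in_mean, typical_error, cv_percent, loa, retest_r)
```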

4,149 citations


Journal ArticleDOI
TL;DR: This technical note details the preliminary stage in the development of a postural analysis tool, Rapid Entire Body Assessment, specifically designed to be sensitive to the type of unpredictable working postures found in health care and other service industries.

1,533 citations


Book
14 Jan 2000
TL;DR: This book discusses the concepts of limit states and limit state functions, presents methodologies for calculating reliability indices and calibrating partial safety factors, and supplies information on the probability distributions and parameters used to characterize both applied loads and member resistances.
Abstract: This book enables both students and practicing engineers to appreciate how to value and handle reliability as an important dimension of structural design. The book discusses the concepts of limit states and limit state functions, and presents methodologies for calculating reliability indices and calibrating partial safety factors. It also supplies information on the probability distributions and parameters used to characterize both applied loads and member resistances. This book contains more discussions of United States (US) and international codes and the issues underlying their development. There is a significant discussion on Monte Carlo simulation. The book's emphasis is on the practical applications of structural reliability theory rather than the theory itself. Consequently, probability theory is treated as a tool, and enough is given to show the novice reader how to calculate reliability. Some background in structural engineering and structural mechanics is assumed.
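
As a rough illustration of the reliability-index and Monte Carlo ideas the book covers, the following sketch estimates the failure probability of a simple linear limit state g = R - S with assumed normal load and resistance distributions; all parameter values are invented:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 1_000_000

# Assumed member resistance R and load effect S (illustrative distributions).
R = rng.normal(loc=300.0, scale=30.0, size=n)   # resistance
S = rng.normal(loc=200.0, scale=40.0, size=n)   # applied load effect

# Limit state function: failure occurs when g = R - S < 0.
g = R - S
pf = np.mean(g < 0)                  # Monte Carlo estimate of failure probability
beta = -norm.ppf(pf)                 # corresponding reliability index

# For this linear limit state with normal variables the exact index is
# beta = (mu_R - mu_S) / sqrt(sigma_R**2 + sigma_S**2) = 100 / 50 = 2.0.
print(pf, beta)
```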

944 citations


Proceedings ArticleDOI
17 Oct 2000
TL;DR: A set of mechanisms is proposed that eliminates, or significantly reduces, the negative effects of such fraudulent behavior; these mechanisms can be easily integrated into existing online reputation systems in order to safeguard their reliability in the presence of potentially deceitful buyers and sellers.
Abstract: Reputation reporting systems have emerged as an important risk management mechanism in online trading communities. However, the predictive value of these systems can be compromised in situations where conspiring buyers intentionally give unfair ratings to sellers, or where sellers discriminate on the quality of service they provide to different buyers. This paper proposes and evaluates a set of mechanisms that eliminate, or significantly reduce, the negative effects of such fraudulent behavior. The proposed mechanisms can be easily integrated into existing online reputation systems in order to safeguard their reliability in the presence of potentially deceitful buyers and sellers.

725 citations


Journal ArticleDOI
TL;DR: In this paper, a review of some of the strategies available for the pursuit of reliability and validity in qualitative research is undertaken; some of these strategies are clearly identified as means of satisfying existing criteria and are found to be of variable value.

693 citations


Journal ArticleDOI
TL;DR: In this paper, a structural deterioration reliability (probabilistic) model has been used to calculate probabilities of structural failure, and three durability design specifications are considered in a lifetime reliability analysis of an RC slab bridge.

676 citations


Book
05 Apr 2000
TL;DR: This book discusses SFEM-based reliability evaluation of nonlinear two- and three-dimensional structures and of structures under dynamic loading, along with SFEM formulations for linear static and spatial variability problems.
Abstract: Basic Concept of Reliability. Commonly Used Probability Distributions. Fundamentals of Reliability Analysis. Simulation Techniques. Implicit Performance Functions: Introduction to SFEM. SFEM for Linear Static Problems. SFEM for Spatial Variability Problems. SFEM-Based Reliability Evaluation of Nonlinear Two- and Three-Dimensional Structures. Structures under Dynamic Loading. Appendices. References. Index.

565 citations


Journal ArticleDOI
TL;DR: It is suggested that one must always use the results of cross-sectional studies to draw inferences about longitudinal processes with trepidation; the authors delineate the circumstances in which only longitudinal studies can answer crucial questions.
Abstract: OBJECTIVE: Cross-sectional studies are often used in psychiatric research as a basis of longitudinal inferences about developmental or disease processes. While the limitations of such usage are often acknowledged, they are frequently understated. The authors describe how such inferences are often, and sometimes seriously, misleading. METHOD: Why and how these inferences mislead are here demonstrated on an intuitive level, by using simulated data inspired by real problems in psychiatric research. RESULTS: Four factors with major roles in the relationship between cross-sectional studies and longitudinal inferences are selection of time scale, type of developmental process studied, reliability of measurement, and clarity of terminology. The authors suggest how to recognize inferential errors when they occur, describe how to protect against such errors in future research, and delineate the circumstances in which only longitudinal studies can answer crucial questions. CONCLUSIONS: The simple conclusion is that on...

499 citations


Journal ArticleDOI
TL;DR: In this paper, the authors provide clear definitions of the terms "validity" and "reliability" and illustrate these definitions through examples, and clarify how these issues may be addressed in the development of scoring rubrics.
Abstract: In Moskal (2000), a framework for developing scoring rubrics was presented and the issues of validity and reliability were given cursory attention. Although many teachers have been exposed to the statistical definitions of the terms "validity" and "reliability" in teacher preparation courses, these courses often do not discuss how these concepts are related to classroom practices (Stiggins, 1999). One purpose of this article is to provide clear definitions of the terms "validity" and "reliability" and illustrate these definitions through examples. A second purpose is to clarify how these issues may be addressed in the development of scoring rubrics. Scoring rubrics are descriptive scoring schemes that are developed by teachers or other evaluators to guide the analysis of the products and/or processes of students' efforts (Brookhart, 1999; Moskal, 2000). The ideas presented here are applicable for anyone using scoring rubrics in the classroom, regardless of the discipline or grade level.

452 citations


Journal ArticleDOI
TL;DR: A set of procedures was used to develop and assess intercoder reliability with free-flowing text data, in which the coders themselves determined the length of codable text segments; the findings suggest that these procedures may be useful for validating the conclusions drawn from other qualitative studies using text data.
Abstract: Intercoder reliability is a measure of agreement among multiple coders for how they apply codes to text data. Intercoder reliability can be used as a proxy for the validity of constructs that emerg...

382 citations
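
As an illustration of the kind of agreement statistic used for intercoder reliability, the sketch below computes percent agreement and Cohen's kappa for two coders; the codes and segments are invented, and the study's own procedure for segmenting free-flowing text is not reproduced:

```python
from collections import Counter

# Codes applied by two coders to the same text segments (illustrative labels).
coder_a = ["theme1", "theme2", "theme1", "theme3", "theme2", "theme1", "theme3", "theme2"]
coder_b = ["theme1", "theme2", "theme2", "theme3", "theme2", "theme1", "theme1", "theme2"]

n = len(coder_a)
observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n   # percent agreement

# Chance-expected agreement from each coder's marginal code frequencies.
freq_a, freq_b = Counter(coder_a), Counter(coder_b)
expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in set(coder_a) | set(coder_b))

# Cohen's kappa corrects the observed agreement for chance agreement.
kappa = (observed - expected) / (1 - expected)
print(observed, expected, kappa)
```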


Journal ArticleDOI
TL;DR: To help clinicians assess sedation at the bedside, to help readers critically appraise the growing number of sedation studies in the ICU literature, and to inform the design of future investigations, additional information about the measurement properties of sedation effectiveness instruments is needed.
Abstract: Objective: To systematically review instruments for measuring the level and effectiveness of sedation in adult and pediatric ICU patients. Study identification: We searched MEDLINE, EMBASE, the Cochrane Library and reference lists of the relevant articles. We selected studies if the sedation instrument reported items related to consciousness and one or more additional items related to the effectiveness or side effects of sedation. Data abstraction: We extracted data on the description of the instruments and on their measurement properties (internal consistency, reliability, validity and responsiveness). Results: We identified 25 studies describing relevant sedation instruments. In addition to the level of consciousness, agitation and synchrony with the ventilator were the most frequently assessed aspects of sedation. Among the 25 instruments, one developed in pediatric ICU patients (the Comfort Scale) and 3 developed in adult ICU patients (the Ramsay scale, the Sedation-Agitation-Scale and the Motor Activity Assessment Scale) were tested for both reliability and validity. None of these instruments were tested for their ability to detect change in sedation status over time (responsiveness). Conclusion: Many instruments have been used to measure sedation effectiveness in ICU patients. However, few of them exhibit satisfactory clinimetric properties. To help clinicians assess sedation at the bedside, to aid readers in critically appraising the growing number of sedation studies in the ICU literature, and to inform the design of future investigations, additional information about the measurement properties of sedation effectiveness instruments is needed.

Journal ArticleDOI
TL;DR: The DCDQ proved capable of distinguishing children who had motor problems (as measured by standardized tests) from children without motor problems, and is a succinct and useful measure for use by occupational therapists.
Abstract: Objective As the consequences of clumsiness in children become better understood, the need for valid measurement tools is apparent. Parent report has the potential for providing historical knowledge of the child's motor skills, as well as parents' perceptions of their child's motor difficulties. The objective was to develop a parent questionnaire to identify motor difficulties in children. Method A sample of 306 children participated in the development of a 17-item parent questionnaire, called the Developmental Coordination Disorder Questionnaire (DCDQ). Internal consistency, concurrent and construct validity were examined. Results The DCDQ proved capable of distinguishing children who had motor problems (as measured by standardized tests) from children without motor problems. Correlations with standardized tests were significant. Two other studies confirmed the construct validity of the DCDQ. Factor analysis revealed four distinct factors, useful in defining the nature of the difficulties. Conclusion The DCDQ is a succinct and useful measure for use by occupational therapists.

Book
27 Mar 2000
TL;DR: The reliability methodology presented covers modeling failures at the component level, design of experiments and analysis of variance, and model selection and validation.
Abstract: CONTEXT OF RELIABILITY ANALYSIS. An Overview. Illustrative Cases and Data Sets. BASIC RELIABILITY METHODOLOGY. Collection and Preliminary Analysis of Failure Data. Probability Distributions for Modeling Time to Failure. Basic Statistical Methods for Data Analysis. RELIABILITY MODELING, ESTIMATION, AND PREDICTION. Modeling Failures at the Component Level. Modeling and Analysis of Multicomponent Systems. Advanced Statistical Methods for Data Analysis. Software Reliability. Design of Experiments and Analysis of Variance. Model Selection and Validation. RELIABILITY MANAGEMENT, IMPROVEMENT, AND OPTIMIZATION. Reliability Management. Reliability Engineering. Reliability Prediction and Assessment. Reliability Improvement. Maintenance of Unreliable Systems. Warranties and Service Contracts. Reliability Optimization. EPILOGUE. Case Studies. Resource Materials. Appendices. References. Indexes.
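
As a small illustration of the "probability distributions for modeling time to failure" material, the following sketch fits a two-parameter Weibull distribution to a set of invented failure times and evaluates the reliability at a mission time (complete data, no censoring assumed):

```python
import numpy as np
from scipy.stats import weibull_min

# Observed times to failure in hours (illustrative, complete data, no censoring).
failure_times = np.array([105., 180., 240., 310., 365., 420., 500., 610., 720., 890.])

# Maximum-likelihood fit of a two-parameter Weibull (location fixed at zero).
shape, loc, scale = weibull_min.fit(failure_times, floc=0)

# Reliability (survival) function at an assumed mission time of 300 hours.
mission_time = 300.0
reliability = weibull_min.sf(mission_time, shape, loc=loc, scale=scale)

print(shape, scale, reliability)
```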

Book
14 Apr 2000
TL;DR: Terminology and notation for repairable systems are reviewed, along with probabilistic models including the Poisson process.
Abstract: Terminology and Notation for Repairable Systems. Probabilistic Models: The Poisson Process. Probabilistic Models: Renewal and Other Processes. Analyzing Data from a Single Repairable System. Analyzing Data from Multiple Repairable Systems. Appendix. References.
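
A minimal sketch of the homogeneous Poisson process model for a repairable system: simulate failure times over an observation window and estimate the failure intensity and mean time between failures; all parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Homogeneous Poisson process: inter-failure times are exponential with rate lam.
lam_true = 0.02            # failures per hour (illustrative)
horizon = 5000.0           # observation window in hours (time-truncated data)

inter_arrival = rng.exponential(1.0 / lam_true, size=500)
failure_times = np.cumsum(inter_arrival)
failure_times = failure_times[failure_times <= horizon]

# For a time-truncated homogeneous Poisson process, the maximum-likelihood
# estimate of the failure intensity is the failure count over the observation time.
lam_hat = len(failure_times) / horizon
mtbf_hat = 1.0 / lam_hat

print(len(failure_times), lam_hat, mtbf_hat)
```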

Journal ArticleDOI
TL;DR: In this article, a new adaptive adaptive site-specific spectra-based pushover analysis is proposed, which accounts for the effect of higher modes and overcomes the shortcomings of the FEMA procedure.
Abstract: Nonlinear static procedures, or pushover analyses, used to estimate inelastic seismic demands are inevitably going to be favored by practicing engineers over nonlinear time-history methods. While there has been some concern over the reliability of static procedures to predict inelastic seismic demands, improved procedures overcoming these drawbacks are still forthcoming. In this paper, the potential limitations of static procedures, such as those recommended in FEMA 273, are highlighted through an evaluation of the response of instrumented buildings that experienced strong ground shaking in the 1994 Northridge earthquake. A new enhanced adaptive “modal” site-specific spectra-based pushover analysis is proposed, which accounts for the effect of higher modes and overcomes the shortcomings of the FEMA procedure. Features of the proposed procedure include its similarity to traditional response spectrum-based analysis and the explicit consideration of ground motion characteristics during the an...

Journal ArticleDOI
01 Feb 2000
TL;DR: This paper reviews existing approaches and how these may be used and/or adapted to suit the needs and the required indexes of the new competitive industry and the different parties associated with it.
Abstract: Reliability is an important issue in power systems and historically has been assessed using deterministic criteria and indexes. However, these approaches can be, and in many cases have been, replaced by probabilistic methods that are able to respond to the actual stochastic factors that influence the reliability of the system. In the days of global, completely integrated and/or nationalized electricity supply industries, the only significant objective was the reliability seen by actual end users. Also, the system was structured in a relatively simple way such that generation, transmission, and distribution could be assessed as a series of sequential hierarchical levels. Failures at any level could cause interruptions of supply to the end user. All planning and operational criteria were intended to minimize such interruptions within economic limits. The system has been, or is being, restructured and now many individual parties are involved, often competitively, including generators, network owners, network operators, energy suppliers, regulators, as well as the end users. Each of these parties has a need to know the quality and performance of the system sector or subsector for which they are responsible. Therefore, there is now a need for a range of reliability measures; the actual measure(s) needed varying between the different system parties. This paper addresses these issues and, in particular, reviews existing approaches and how these may be used and/or adapted to suit the needs and the required indexes of the new competitive industry and the different parties associated with it.
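
As a toy example of the probabilistic methods the paper refers to, the sketch below estimates the loss-of-load probability of a small generation system with two-state units by Monte Carlo sampling; the unit data and load level are invented:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative generation system: unit capacities (MW) and forced outage rates.
capacities = np.array([200.0, 200.0, 150.0, 100.0, 100.0])
forced_outage_rate = np.array([0.05, 0.05, 0.08, 0.10, 0.10])
load = 500.0   # constant peak load in MW (a load duration curve could be used instead)

n = 200_000
# Sample unit availability: each unit is up with probability (1 - FOR).
up = rng.random((n, capacities.size)) > forced_outage_rate
available_capacity = up.astype(float) @ capacities

# Loss-of-load probability: fraction of samples with insufficient capacity.
lolp = np.mean(available_capacity < load)
print(lolp)
```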

Book
01 Dec 2000
TL;DR: In this article, a case study on design for software reliability optimization is presented, where the authors present an optimal scheduled-maintenance policy and a heuristic algorithm for optimization in reliability systems.
Abstract: List of figures List of tables Preface Acknowledgments 1 Introduction to reliability systems 2 Analysis and classification of reliability optimization models 3 Redundancy allocation by heuristic methods 4 Redundancy allocation by dynamic programming 5 Redundancy allocation by discrete optimization methods 6 Reliability optimization by nonlinear programming 7 Metaheuristic algorithms for optimization in reliability systems 8 Reliability-redundancy allocation 9 Component assignment in reliability systems 10 Reliability systems with multiple objectives 11 Other methods for system-reliability optimization 12 Burn-in optimization under limited capacity 13 Case study on design for software reliability optimization 14 Case study on an optimal scheduled-maintenance policy 15 Case studies on reliability optimization Appendices References Index
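
As a sketch of the redundancy-allocation problem treated in the early chapters, the following greedy heuristic (a common textbook approach, not necessarily one from the book) adds redundant components to a series system wherever the reliability gain per unit cost is largest, within a budget; all component data are invented:

```python
import numpy as np

# Series system of subsystems; parallel redundancy raises each subsystem's
# reliability as 1 - (1 - r) ** n. All component data are illustrative.
component_reliability = np.array([0.90, 0.80, 0.85])
component_cost = np.array([2.0, 3.0, 1.5])
budget = 12.0

n_units = np.ones(3, dtype=int)          # start with one component per subsystem
spent = component_cost.sum()

def system_reliability(n):
    return np.prod(1.0 - (1.0 - component_reliability) ** n)

# Greedy heuristic: repeatedly add the unit with the best gain per unit cost.
while True:
    base = system_reliability(n_units)
    gains = []
    for i in range(3):
        if spent + component_cost[i] > budget:
            gains.append(-np.inf)
            continue
        trial = n_units.copy()
        trial[i] += 1
        gains.append((system_reliability(trial) - base) / component_cost[i])
    best = int(np.argmax(gains))
    if gains[best] == -np.inf:
        break
    n_units[best] += 1
    spent += component_cost[best]

print(n_units, spent, system_reliability(n_units))
```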

Proceedings ArticleDOI
24 Jan 2000
TL;DR: A general model estimates the minimum reliability requirement for multiple components within a system that will yield the goal reliability value for the system by solving for an optimum component reliability, which satisfies the system's reliability goal requirement.
Abstract: During the design phase of a product, reliability engineers are called upon to evaluate the reliability of the system. The question of how to meet a reliability goal for the system arises when the estimated reliability is inadequate. This then becomes a reliability allocation problem at the component level. In this paper, a general model estimates the minimum reliability requirement for multiple components within a system that will yield the goal reliability value for the system. The model consists of two parts. The first part is a nonlinear programming formulation of the allocation problem. The second part is a cost function formulation to be used in the nonlinear programming algorithm. A general behavior of the cost as a function of a component's reliability is assumed for this matter. The system's cost is then minimized by solving for an optimum component reliability, which satisfies the system's reliability goal requirement. Once the reliability requirement for each component is estimated, one can then decide whether to achieve this reliability by fault tolerance or fault avoidance. The model has yielded very encouraging results and it can be applied to any type of system, simple or complex, and for a variety of distributions. The advantage of this model is that it is very flexible, and requires very little processing time. These advantages make the proposed reliability allocation solution a great system design tool. A computer program has been developed and the model is available in a commercial software package called BlockSim™.
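
A rough sketch of the two-part formulation described above: minimize an assumed component cost function subject to the series-system reliability meeting the goal, using a standard nonlinear programming solver. The exponential cost model and all parameter values are illustrative assumptions, not the function used in the paper or in BlockSim™:

```python
import numpy as np
from scipy.optimize import minimize

R_goal = 0.95                                   # system reliability goal
a = np.array([1.0, 1.2, 0.8, 1.5])              # illustrative cost scale factors
b = np.array([0.02, 0.03, 0.025, 0.04])         # illustrative cost growth factors

def total_cost(r):
    # Assumed cost model: component cost grows without bound as r -> 1.
    return np.sum(a * np.exp(b / (1.0 - r)))

# Series system: system reliability is the product of component reliabilities.
reliability_goal = {"type": "ineq", "fun": lambda r: np.prod(r) - R_goal}
bounds = [(0.5, 0.999)] * len(a)
r0 = np.full(len(a), 0.99)                      # feasible start: 0.99**4 > 0.95

result = minimize(total_cost, r0, method="SLSQP",
                  bounds=bounds, constraints=[reliability_goal])

print(result.x, np.prod(result.x), total_cost(result.x))
```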

Journal ArticleDOI
TL;DR: Differences in the construct validity of the three measures were evident, and their concurrent validity was relatively low; the implications for research and for assessing the quality of individual doctors' 'interpersonal' care are discussed.

Journal ArticleDOI
TL;DR: A condition-specific quality of life measure was developed for patients with severe dentofacial deformity requesting orthognathic treatment; the instrument was found to divide into four clinically meaningful domains and showed good reliability.
Abstract: The assessment of quality of life is becoming increasingly important in clinical research. Its importance in dentistry has been realised only relatively recently. Health-related quality of life is concerned with the aspects of quality of life that relate specifically to an individual’s health. This may be measured using two groups of instruments: (i) generic measures, which provide a summary of health-related quality of life and sometimes generate a single index measure of health or (ii) condition-specific measures, which focus on a particular condition, disease, population or problem and are potentially more responsive to small, but clinically important, changes in health. Objectives: The aim of this study was to develop a condition-specific quality of life measure for those patients with severe dentofacial deformity who were requesting orthognathic treatment and to assess the reliability of this instrument. Method: Instrument content was derived through a literature review and interviews with clinicians and patients. The resulting instrument was tested for internal consistency and test-retest reliability. Results and conclusions: The instrument was found to divide into four clinically meaningful domains. Internal consistency and test-retest reliability were good. Patient acceptance of the questionnaire was also encouraging.

Journal ArticleDOI
TL;DR: The results of this review could not demonstrate reliable outcomes and therefore provide no evidence on which to base acceptance of mobility tests of the sacroiliac joint (SIJ) in daily clinical practice; there are no indications that 'upgrading' of methodological quality would have improved the final conclusions.

Journal ArticleDOI
TL;DR: The AUDIT incorporated in a health risk screening questionnaire is a reliable and valid self-administered instrument to identify at-risk drinkers and alcohol-dependent individuals in primary care settings.
Abstract: Background: Self-administered, general health risk screening questionnaires that are administered while patients wait in the doctor's office may be a reasonable and timesaving approach to address the requirements of preventive medicine in a typical 10-min medical visit. The psychometric characteristics of the Alcohol Use Disorders Identification Test (AUDIT) incorporated within a health questionnaire (H-AUDIT) have not been examined. Methods: The reliability and validity of the self-administered AUDIT were compared between the H-AUDIT and the AUDIT used as a single scale (S-AUDIT) in 332 primary care patients. Results: No major differences in demographic or alcohol use characteristics were found between the 166 subjects who completed the H-AUDIT and the 166 individuals who completed the S-AUDIT. For the subjects who completed the H-AUDIT, the test-retest reliability (Spearman correlation coefficient at a 6-week interval, 0.88), the internal consistency (correlation coefficients for all items ranging from 0.38 to 0.69; Cronbach α index, 0.85), and the areas under the receiver operating characteristic curve for identifying at-risk drinkers (0.77) and alcohol-dependent subjects (0.89) were similar to the same measurements obtained with the individuals who completed the S-AUDIT. Conclusions: The AUDIT incorporated in a health risk screening questionnaire is a reliable and valid self-administered instrument to identify at-risk drinkers and alcohol-dependent individuals in primary care settings.
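
For reference, a minimal sketch of the two reliability statistics reported (Cronbach's α for internal consistency and a Spearman correlation for test-retest reliability), computed on simulated questionnaire scores rather than the AUDIT data:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(3)

# Simulated item scores for 166 respondents on a 10-item questionnaire (0-4 scale).
n_subjects, n_items = 166, 10
latent = rng.normal(size=(n_subjects, 1))
items = np.clip(np.round(2 + latent + rng.normal(scale=0.8, size=(n_subjects, n_items))), 0, 4)

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total score).
item_var = items.var(axis=0, ddof=1)
total_var = items.sum(axis=1).var(ddof=1)
alpha = (n_items / (n_items - 1)) * (1 - item_var.sum() / total_var)

# Test-retest reliability: Spearman correlation of total scores at two occasions
# (the simulated retest adds a little occasion-specific noise).
retest = items.sum(axis=1) + rng.normal(scale=1.0, size=n_subjects)
rho, _ = spearmanr(items.sum(axis=1), retest)

print(alpha, rho)
```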

Journal ArticleDOI
TL;DR: This paper defines the key components of such a platform; for each component, a detailed review of the techniques available for its implementation is provided, and a modular approach is used.

Journal ArticleDOI
TL;DR: In this article, a Bayesian procedure is proposed to quantify the modeling uncertainty, including the uncertainty in mechanical and statistical model selection and distribution parameters, for a fatigue reliability problem with the combination of two competing crack growth models.
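
As a generic illustration of combining two competing models in a Bayesian way (not the specific crack-growth formulation of the paper), posterior model probabilities can be obtained from priors and marginal likelihoods and used to average the models' failure-probability predictions; all numbers below are invented:

```python
import numpy as np

# Two competing models with equal prior model probabilities (illustrative).
prior = np.array([0.5, 0.5])

# Marginal likelihoods of the observed data under each model (assumed values).
marginal_likelihood = np.array([2.4e-5, 7.1e-6])

# Posterior model probabilities via Bayes' theorem.
posterior = prior * marginal_likelihood
posterior /= posterior.sum()

# Each model's predicted failure probability for the component (assumed values).
pf_by_model = np.array([1.2e-3, 3.5e-3])

# Model-averaged failure probability accounts for model-selection uncertainty.
pf_averaged = np.sum(posterior * pf_by_model)
print(posterior, pf_averaged)
```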

Journal ArticleDOI
TL;DR: This paper generalizes a preventive maintenance optimization problem to multi-state systems, which have a range of performance levels; an algorithm is developed that obtains the sequence of maintenance actions keeping the system functioning at the desired level of reliability during its lifetime at minimum maintenance cost.

Journal ArticleDOI
TL;DR: In this article, a study of 350 nonprofit organizations investigates the adequacy, reliability, and appropriate interpretation of IRS 990 Return data through comparisons of selected entries with corresponding measures from each organization's audited financial statements.
Abstract: The IRS 990 Return is becoming an increasingly prominent source of financial data underlying descriptions of the nonprofit sector and studies of nonprofit organizations. However, questions about the quality of the data continue to be of concern. This study of 350 nonprofit organizations investigates the adequacy, reliability, and appropriate interpretation of IRS 990 Return data through comparisons of selected entries with corresponding measures from each organization’s audited financial statements. Both quantitative and qualitative methods are used to examine and explain the consistency between the two data sources. The study concludes that the IRS 990 Return can be considered an adequate and reliable source of financial information for many types of investigations, but preparers and users of the data need a clearer understanding of its purposes to enable appropriate interpretations.

Journal ArticleDOI
TL;DR: In this paper, the authors use generalizability theory to show why rater variance is not properly interpreted as measurement error, and show how such systematic rater effects can influence both reliability estimates and validity coefficients.
Abstract: Interrater correlations are widely interpreted as estimates of the reliability of supervisory performance ratings, and are frequently used to correct the correlations between ratings and other measures (e.g., test scores) for attenuation. These interrater correlations do provide some useful information, but they are not reliability coefficients. There is clear evidence of systematic rater effects in performance appraisal, and variance associated with raters is not a source of random measurement error. We use generalizability theory to show why rater variance is not properly interpreted as measurement error, and show how such systematic rater effects can influence both reliability estimates and validity coefficients. We show conditions under which interrater correlations can either overestimate or underestimate reliability coefficients, and discuss reasons other than random measurement error for low interrater correlations.
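
A small simulation sketch of the generalizability-theory argument: decompose ratings from a fully crossed persons x raters design into person, rater and residual variance components, and note how a coefficient for absolute decisions charges the systematic rater variance against reliability while a relative coefficient does not; all effect sizes are invented:

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated fully crossed design: every rater rates every ratee once.
n_persons, n_raters = 100, 4
person = rng.normal(scale=1.0, size=(n_persons, 1))    # true ratee differences
rater = rng.normal(scale=0.7, size=(1, n_raters))       # systematic severity/leniency
residual = rng.normal(scale=0.8, size=(n_persons, n_raters))
ratings = 3.0 + person + rater + residual

# Two-way ANOVA sums of squares (one observation per person-rater cell).
grand = ratings.mean()
ss_person = n_raters * np.sum((ratings.mean(axis=1) - grand) ** 2)
ss_rater = n_persons * np.sum((ratings.mean(axis=0) - grand) ** 2)
ss_resid = np.sum((ratings - grand) ** 2) - ss_person - ss_rater

ms_person = ss_person / (n_persons - 1)
ms_rater = ss_rater / (n_raters - 1)
ms_resid = ss_resid / ((n_persons - 1) * (n_raters - 1))

# Variance components: rater variance is a systematic facet, not random error.
var_resid = ms_resid
var_person = (ms_person - ms_resid) / n_raters
var_rater = (ms_rater - ms_resid) / n_persons

# Single-rater coefficients: relative decisions ignore rater variance,
# absolute decisions count it against reliability.
g_relative = var_person / (var_person + var_resid)
g_absolute = var_person / (var_person + var_rater + var_resid)

print(var_person, var_rater, var_resid, g_relative, g_absolute)
```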


Journal ArticleDOI
TL;DR: In this article, the authors describe the development and initial psychometric evidence for a set of five constructed response measures designed to assess complex problem-solving skills and knowledge expected to influence leadership.
Abstract: We describe the development of and initial psychometric evidence for a set of five constructed response measures designed to assess complex problem-solving skills and knowledge expected to influence leadership. Structured (cued) and unstructured (uncued) problem solving scenarios intended to assess process skills associated with creative problem solving are presented first. Solution construction tasks developed to assess attention to constraints and characteristics in the broader problem context are presented next. Finally, social judgment tasks intended to assess understanding of people and social systems and a task sort to assess knowledge of leadership roles are presented. Preliminary evidence for the reliability and construct validity of these constructed response measures supports their efficacy in assessing skills that underlie effective organizational leadership.

Journal ArticleDOI
TL;DR: Analysis of measures of automobile reliability published in Consumer Reports yields three findings, the first being that quality improves over the production life of a car model with the same kind of regularity as an efficiency learning curve: there is a quality learning curve.
Abstract: Whereas most prior research on the learning curve has focused on improvements in efficiency, this paper deals with the impact of learning on product quality. The key data are measures of automobile reliability published in Consumer Reports. Analysis yields three findings: (1) Quality improves over the production life of a car model with the same kind of regularity as an efficiency learning curve. Thus, there is a quality learning curve. (2) Unlike in the efficiency domain, however, learning in the domain of product reliability is primarily a function of time, and not of how many cars have gone down the assembly line. Thus, quality depends not on the accumulation of production experience per se, but on the intensity of "off-line" quality improvement activities and on the transfer of knowledge from the general environment over time. (3) In contrast to the traditional injunction, "do not buy a new car in its first year of production," the opposite advice actually seems to apply: In any given year, the newest car models have the best quality. That is, new car-model designs typically include significant quality improvements that are more than enough to outweigh any disruption created in manufacturing by the new model's introduction and that even surpass the incremental improvements made to older, existing car models.