
Showing papers on "Reliability (statistics) published in 2004"


Journal ArticleDOI
TL;DR: Lombard, Snyder-Duch, and Bracken (2002) surveyed 200 content analyses for their reporting of reliability tests, compared the virtues and drawbacks of five popular reliability measures, and proposed guidelines and standards for their use; this article re-examines those measures and offers alternative recommendations.
Abstract: In a recent article in this journal, Lombard, Snyder-Duch, and Bracken (2002) surveyed 200 content analyses for their reporting of reliability tests, compared the virtues and drawbacks of five popular reliability measures, and proposed guidelines and standards for their use. Their discussion revealed that numerous misconceptions circulate in the content analysis literature regarding how these measures behave and can aid or deceive content analysts in their effort to ensure the reliability of their data. This article proposes three conditions for statistical measures to serve as indices of the reliability of data and examines the mathematical structure and the behavior of the five coefficients discussed by the authors, as well as two others. It compares common beliefs about these coefficients with what they actually do and concludes with alternative recommendations for testing reliability in content analysis and similar data-making efforts.
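Percent agreement and a chance-corrected coefficient such as Cohen's kappa are among the measures typically compared in this debate. As a minimal sketch (not the article's own analysis, and using made-up coder labels), the following computes both for two coders on nominal data:

```python
# Minimal sketch: two agreement coefficients commonly compared in content-analysis
# reliability checks (percent agreement and Cohen's kappa). Coder data are made up.
from collections import Counter

coder_a = ["pos", "neg", "pos", "neu", "pos", "neg", "neu", "pos"]
coder_b = ["pos", "neg", "neu", "neu", "pos", "pos", "neu", "pos"]

def percent_agreement(a, b):
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    n = len(a)
    p_o = percent_agreement(a, b)                      # observed agreement
    freq_a, freq_b = Counter(a), Counter(b)
    cats = set(a) | set(b)
    # expected agreement if both coders labelled at random with their own marginals
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in cats)
    return (p_o - p_e) / (1 - p_e)

print(f"percent agreement = {percent_agreement(coder_a, coder_b):.2f}")
print(f"Cohen's kappa     = {cohens_kappa(coder_a, coder_b):.2f}")
```

Chance-corrected coefficients can fall well below raw percent agreement when coders use categories unevenly, which is why the two are not interchangeable as reliability indices.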

2,101 citations


Book ChapterDOI
31 Aug 2004
TL;DR: A novel method for the detection and estimation of change that assumes that the points in the stream are independently generated, but otherwise makes no assumptions on the nature of the generating distribution.
Abstract: Detecting changes in a data stream is an important area of research with many applications. In this paper, we present a novel method for the detection and estimation of change. In addition to providing statistical guarantees on the reliability of detected changes, our method also provides meaningful descriptions and quantification of these changes. Our approach assumes that the points in the stream are independently generated, but otherwise makes no assumptions on the nature of the generating distribution. Thus our techniques work for both continuous and discrete data. In an experimental study we demonstrate the power of our techniques.
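The two-window idea behind many nonparametric change detectors can be sketched as follows. This is a generic illustration using a Kolmogorov–Smirnov test rather than the authors' algorithm, and the window sizes, significance level, and simulated stream are arbitrary choices:

```python
# Illustrative sketch only: a generic two-window change detector for a univariate
# stream, using a distribution-free Kolmogorov-Smirnov test. This is not the
# authors' algorithm; window sizes and alpha are arbitrary.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
stream = np.concatenate([rng.normal(0, 1, 500), rng.normal(0.8, 1, 500)])

REF, CUR, ALPHA = 200, 100, 0.01
for t in range(REF + CUR, len(stream), CUR):
    reference = stream[t - REF - CUR : t - CUR]   # older window
    current = stream[t - CUR : t]                 # most recent window
    stat, p = ks_2samp(reference, current)
    if p < ALPHA:
        print(f"change flagged at index {t} (KS statistic {stat:.3f}, p={p:.4f})")
        break
```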

883 citations


Journal ArticleDOI
TL;DR: The Nasal Obstruction Symptom Evaluation Scale is a valid, reliable, and responsive instrument that is brief and easy to complete and has potential use for outcomes studies in adults with nasal obstruction.
Abstract: Objective The study goal was to validate a disease-specific health status instrument for use in patients with nasal obstruction. Design, settings, and patients The study consisted of a prospective instrument validation conducted at 4 academic medical centers with 32 adults with nasal septal deformity. Methods Prospective instrument validation occurred in 2 stages. Stage 1 was the development of a preliminary (alpha-version) instrument of potential items. Stage 2 was a test of the alpha-version for item performance, internal consistency, and test-retest reliability; construct, discriminant, criterion validity, and responsiveness; and creation of the final instrument. Results Items with poor performance were eliminated from the alpha-version instrument. In testing the final instrument, test-retest reliability was adequate at 0.702; internal consistency reliability was also adequate at 0.785. Validity was confirmed using correlation and comparison analysis, and response sensitivity was excellent. Conclusions The Nasal Obstruction Symptom Evaluation Scale is a valid, reliable, and responsive instrument that is brief and easy to complete and has potential use for outcomes studies in adults with nasal obstruction.
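The two reliability figures reported above are of the kind usually computed as Cronbach's alpha (internal consistency) and a correlation between repeat administrations (test-retest). A hedged sketch on simulated item scores, not the NOSE study data:

```python
# Sketch of the two reliability quantities mentioned above, on simulated data:
# Cronbach's alpha for internal consistency and a Pearson test-retest correlation.
import numpy as np

def cronbach_alpha(items):
    """items: (n_subjects, n_items) array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(1)
latent = rng.normal(0, 1, (32, 1))                   # underlying symptom severity (hypothetical)
scores_t1 = latent + rng.normal(0, 0.7, (32, 5))     # 32 subjects, 5 correlated items
scores_t2 = scores_t1 + rng.normal(0, 0.3, (32, 5))  # retest with added noise

alpha = cronbach_alpha(scores_t1)
retest_r = np.corrcoef(scores_t1.sum(axis=1), scores_t2.sum(axis=1))[0, 1]
print(f"Cronbach's alpha = {alpha:.3f}, test-retest r = {retest_r:.3f}")
```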

811 citations


Journal ArticleDOI
TL;DR: This presentation explains how assessment data, like other scientific experimental data, must be reproducible in order to be meaningfully interpreted.
Abstract: Context All assessment data, like other scientific experimental data, must be reproducible in order to be meaningfully interpreted. Purpose The purpose of this paper is to discuss applications of reliability to the most common assessment methods in medical education. Typical methods of estimating reliability are discussed intuitively and non-mathematically. Summary Reliability refers to the consistency of assessment outcomes. The exact type of consistency of greatest interest depends on the type of assessment, its purpose and the consequential use of the data. Written tests of cognitive achievement look to internal test consistency, using estimation methods derived from the test-retest design. Rater-based assessment data, such as ratings of clinical performance on the wards, require interrater consistency or agreement. Objective structured clinical examinations, simulated patient examinations and other performance-type assessments generally require generalisability theory analysis to account for various sources of measurement error in complex designs and to estimate the consistency of the generalisations to a universe or domain of skills. Conclusions Reliability is a major source of validity evidence for assessments. Low reliability indicates that large variations in scores can be expected upon retesting. Inconsistent assessment scores are difficult or impossible to interpret meaningfully and thus reduce validity evidence. Reliability coefficients allow the quantification and estimation of the random errors of measurement in assessments, such that overall assessment can be improved.

621 citations


Proceedings ArticleDOI
28 Jun 2004
TL;DR: The results imply that leveraging a single microarchitecture design for multiple remaps across a few technology generations will become increasingly difficult, and motivate a need for workload-specific, microarchitectural lifetime reliability awareness at an early design stage.
Abstract: The relentless scaling of CMOS technology has provided a steady increase in processor performance for the past three decades. However, increased power densities (hence temperatures) and other scaling effects have an adverse impact on long-term processor lifetime reliability. This paper represents a first attempt at quantifying the impact of scaling on lifetime reliability due to intrinsic hard errors, taking workload characteristics into consideration. For our quantitative evaluation, we use RAMP (Srinivasan et al., 2004), a previously proposed industrial-strength model that provides reliability estimates for a workload, but for a given technology. We extend RAMP by adding scaling specific parameters to enable workload-dependent lifetime reliability evaluation at different technologies. We show that (1) scaling has a significant impact on processor hard failure rates - on average, with SPEC benchmarks, we find the failure rate of a scaled 65nm processor to be 316% higher than a similarly pipelined 180nm processor; (2) time-dependent dielectric breakdown and electromigration have the largest increases; and (3) with scaling, the difference in reliability from running at worst-case vs. typical workload operating conditions increases significantly, as does the difference from running different workloads. Our results imply that leveraging a single microarchitecture design for multiple remaps across a few technology generations will become increasingly difficult, and motivate a need for workload specific, microarchitectural lifetime reliability awareness at an early design stage.
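RAMP combines analytical lifetime models for the individual wear-out mechanisms. Purely as an illustration of why scaling hurts, and not as the RAMP model itself, Black's equation for electromigration shows how higher current density and temperature shorten MTTF and therefore raise the hard-failure rate; all parameter values below are invented:

```python
# Illustration only (not the RAMP model): Black's equation for electromigration,
# MTTF ~ J**(-n) * exp(Ea / (k*T)), showing how higher current density and
# temperature push the hard-failure rate up. All parameter values are made up.
import math

K_BOLTZMANN = 8.617e-5   # eV/K
EA_EM = 0.9              # activation energy for electromigration (illustrative)
N_EXPONENT = 1.1         # current-density exponent (illustrative)

def relative_em_failure_rate(j_rel, temp_k):
    """Failure rate relative to a J=1, T=345 K baseline (rate ~ 1/MTTF)."""
    mttf = j_rel ** (-N_EXPONENT) * math.exp(EA_EM / (K_BOLTZMANN * temp_k))
    mttf_ref = 1.0 ** (-N_EXPONENT) * math.exp(EA_EM / (K_BOLTZMANN * 345.0))
    return mttf_ref / mttf

# e.g. 1.5x current density and a 15 K hotter die at a scaled node:
print(f"relative EM failure rate: {relative_em_failure_rate(1.5, 360.0):.2f}x")
```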

577 citations


Journal ArticleDOI
TL;DR: Suggestions presented in this paper should help the analyst design and perform the minimum number of validation experiments needed to obtain all the required information to establish and demonstrate the reliability of the analytical procedure.

512 citations


Journal ArticleDOI
TL;DR: Results from the Distributed Model Intercomparison Project (DMIP) show that calibration strategies for distributed models are not as well defined as those for lumped models, but that some calibration efforts applied to distributed models significantly improve simulation results.

494 citations


Journal ArticleDOI
TL;DR: Since the desired outcome of the movement measurements is a reliable estimate of body segment kinematics, state-of-the-art techniques proposed for minimization of error propagation arising from a cluster of external markers are described.

473 citations


Journal ArticleDOI
TL;DR: A multitrait–multimethod analysis of perceptual performance measures to investigate item-specific trait, method and error variance finds that while random error and systematic bias account for a large portion of item variance, perceptual measures satisfy the requirements of reliability and validity.

460 citations


Journal Article
TL;DR: The CSA/MTI appeared to have acceptable reliability for most research applications and values with the other devices indicate some possible concerns with reliability, but additional work is needed to better understand factors contributing to variability in accelerometry data.
Abstract: Introduction Numerous studies have examined the validity of accelerometry-based activity monitors but few studies have systematically studied the reliability of different accelerometer units for assessing a standardized bout of physical activity. Improving understanding of error in these devices is an important research objective because they are increasingly being used in large surveillance studies and intervention trials that require the use of multiple units over time. Methods Four samples of college-aged participants were recruited to collect reliability data on four different accelerometer types (CSA/MTI, Biotrainer Pro, Tritrac-R3D, and Actical). The participants completed three trials of treadmill walking (3 mph) while wearing multiple units of a specific monitor type. For each trial, the participant completed a series of 5-min bouts of walking (one for each monitoring unit) with 1 min of standing rest between each bout. Generalizability (G) theory was used to quantify variance components associated with individual monitor units, trials, and subjects as well as interactions between these terms. Results The overall G coefficients ranged from 0.43 to 0.64 for the four monitor types. Corresponding intraclass correlation coefficients (ICC) ranged from 0.62 to 0.80. The CSA/MTI was found to have the least variability across monitor units and trials and the highest overall reliability. The Actical was found to have the poorest reliability. Conclusion The CSA/MTI appeared to have acceptable reliability for most research applications (G values above 0.60 and ICC values above 0.80), but values with the other devices indicate some possible concerns with reliability. Additional work is needed to better understand factors contributing to variability in accelerometry data and to determine appropriate calibration protocols to improve reliability of these measures for different research applications.
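The G coefficients above come from a variance-components decomposition. The sketch below shows a simplified version for a fully crossed subjects × monitor-units design with one observation per cell, on simulated counts; the actual study also crossed trials, so this is an illustration rather than a reproduction of its analysis:

```python
# Hedged sketch of a generalizability (G) analysis for a fully crossed
# subjects x monitor-units design with one observation per cell. Simulated data.
import numpy as np

rng = np.random.default_rng(2)
n_subj, n_unit = 20, 6
true_subj = rng.normal(0, 1.0, (n_subj, 1))     # subject (person) effect
unit_bias = rng.normal(0, 0.4, (1, n_unit))     # monitor-unit effect
counts = 300 + 60 * true_subj + 40 * unit_bias + rng.normal(0, 30, (n_subj, n_unit))

grand = counts.mean()
ms_subj = n_unit * ((counts.mean(axis=1) - grand) ** 2).sum() / (n_subj - 1)
ms_unit = n_subj * ((counts.mean(axis=0) - grand) ** 2).sum() / (n_unit - 1)
resid = counts - counts.mean(axis=1, keepdims=True) - counts.mean(axis=0, keepdims=True) + grand
ms_res = (resid ** 2).sum() / ((n_subj - 1) * (n_unit - 1))

var_res = ms_res
var_subj = max((ms_subj - ms_res) / n_unit, 0.0)
var_unit = max((ms_unit - ms_res) / n_subj, 0.0)

g_relative = var_subj / (var_subj + var_res)             # unit effect ignored
g_absolute = var_subj / (var_subj + var_unit + var_res)  # unit effect counted as error
print(f"relative G = {g_relative:.2f}, absolute G = {g_absolute:.2f}")
```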

438 citations


Journal ArticleDOI
TL;DR: This paper focuses particularly on the latter with respect to ways of ensuring content validity and achieving acceptable levels of reliability in the OSCE format.
Abstract: The traditional clinical examination has been shown to have serious limitations in terms of its validity and reliability. The OSCE provides some answers to these limitations and has become very popular. Many variants on the original OSCE format now exist and much research has been done on various aspects of their use. Issues to be addressed relate to organization matters and to the quality of the assessment. This paper focuses particularly on the latter with respect to ways of ensuring content validity and achieving acceptable levels of reliability. A particular concern has been the demonstrable need for long examinations if high levels of reliability are to be achieved. Strategies for reducing the practical difficulties this raises are discussed. Standard setting methods for use with OSCEs are described.
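The link between examination length and reliability mentioned above is usually quantified with the Spearman-Brown prophecy formula; a brief sketch with purely illustrative numbers:

```python
# The length-reliability trade-off is commonly quantified with the Spearman-Brown
# prophecy formula; the numbers here are purely illustrative.
def spearman_brown(reliability, length_factor):
    """Predicted reliability when the test is lengthened by `length_factor`."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# A 10-station OSCE with reliability 0.55, doubled and tripled in length:
for k in (2, 3):
    print(f"{10 * k} stations -> predicted reliability {spearman_brown(0.55, k):.2f}")
```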

Journal ArticleDOI
02 Mar 2004
TL;DR: This paper proposes dynamic reliability management (DRM) - a technique where the processor can respond to changing application behavior to maintain its lifetime reliability target, and describes an architecture-level model and its implementation that can dynamically track lifetime reliability, responding to changes in application behavior.
Abstract: Ensuring long processor lifetimes by limiting failures due to wear-out related hard errors is a critical requirement for all microprocessor manufacturers. We observe that continuous device scaling and increasing temperatures are making lifetime reliability targets even harder to meet. However, current methodologies for qualifying lifetime reliability are overly conservative since they assume worst-case operating conditions. This paper makes the case that the continued use of such methodologies will significantly and unnecessarily constrain performance. Instead, lifetime reliability awareness at the microarchitectural design stage can mitigate this problem, by designing processors that dynamically adapt in response to the observed usage to meet a reliability target. We make two specific contributions. First, we describe an architecture-level model and its implementation, called RAMP, that can dynamically track lifetime reliability, responding to changes in application behavior. RAMP is based on state-of-the-art device models for different wear-out mechanisms. Second, we propose dynamic reliability management (DRM) - a technique where the processor can respond to changing application behavior to maintain its lifetime reliability target. In contrast to current worst-case behavior based reliability qualification methodologies, DRM allows processors to be qualified for reliability at lower (but more likely) operating points than the worst case. Using RAMP, we show that this can save cost and/or improve performance, that dynamic voltage scaling is an effective response technique for DRM, and that dynamic thermal management neither subsumes nor is subsumed by DRM.

Journal ArticleDOI
TL;DR: This letter comments on Craig et al.'s validation of the International Physical Activity Questionnaire (IPAQ), noting that a lack of comparability across studies had long been a major limitation of IPAQ research.
Abstract: Dear Editor-in-Chief:In the August 2003 issue of Medicine & Science in Sports & Exercise®, Craig et al. (2) published an important article on the validation of the International Physical Activity Questionnaire (IPAQ). Lack of comparability was always a major limitation among studies aim

Journal ArticleDOI
TL;DR: The currently available methodological evidence points towards a high quality of the MRS scale for measuring and comparing the HRQoL of aging women in different regions and over time; it suggests high reliability and high validity as far as the process of construct validation has been completed.
Abstract: This paper compiles data from different sources to provide a first comprehensive picture of the psychometric and other methodological characteristics of the Menopause Rating Scale (MRS). The scale was designed and standardized as a self-administered scale (a) to assess symptoms/complaints of aging women under different conditions, (b) to evaluate the severity of symptoms over time, and (c) to measure changes pre- and post-menopause replacement therapy. The scale has become widely used (it is available in 10 languages). A large multinational survey (9 countries on 4 continents) from 2001/2002 is the basis for in-depth analyses of the reliability and validity of the MRS. Additional small convenience samples were used to get first impressions about test-retest reliability. The data were centrally analyzed. Data from a post-marketing HRT study were used to estimate discriminative validity. Reliability measures (consistency and test-retest stability) were found to be good across countries, although the sample size for test-retest reliability was small. Validity: the internal structure of the MRS was astonishingly similar across countries, supporting the conclusion that the scale really measures the same phenomenon in symptomatic women. Correlations between the sub-scores and the total score were high (0.7–0.9) but lower among the sub-scales (0.5–0.7); this suggests that the sub-scales are not fully independent. Norm values from different populations were presented, showing that a direct comparison between Europe and North America is possible, but caution is recommended for comparisons with data from Latin America and Indonesia; this, however, will not affect intra-individual comparisons within clinical trials. The comparison with the Kupperman Index showed sufficiently good correlations, illustrating good criterion-oriented validity. The same is true for the comparison with the generic quality-of-life scale SF-36, where a sufficiently close association was also shown. The currently available methodological evidence points towards a high quality of the MRS scale for measuring and comparing the HRQoL of aging women in different regions and over time; it suggests high reliability and high validity as far as the process of construct validation has been completed.

Book ChapterDOI
01 Jan 2004
TL;DR: System reliability often depends on the effort of many individuals, making reliability a public good; purely voluntary provision may therefore suffer from a free-rider problem, with individuals tending to shirk and the public good provided at an inefficient level, and the authors distinguish three prototype cases.
Abstract: System reliability often depends on the effort of many individuals, making reliability a public good. It is well-known that purely voluntary provision of public goods may result in a free rider problem: individuals may tend to shirk, resulting in an inefficient level of the public good. How much effort each individual exerts will depend on his own benefits and costs, the efforts exerted by the other individuals, and the technology that relates individual effort to outcomes. In the context of system reliability, we can distinguish three prototype cases.
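The three prototype cases are often characterized by how individual efforts aggregate into system reliability; the labels and numbers below (total effort, weakest link, best shot) are a common characterization offered as an assumption, not a quotation from the chapter:

```python
# Illustration under an assumption: the three prototype cases are often taken to be
# "total effort", "weakest link", and "best shot" technologies. The aggregation
# rules and effort levels below are illustrative, not the chapter's own numbers.
efforts = [0.9, 0.6, 0.3]   # individual reliability efforts (hypothetical)

total_effort = sum(efforts) / len(efforts)   # system depends on average/total effort
weakest_link = min(efforts)                  # system is only as good as its weakest contributor
best_shot = max(efforts)                     # one strong contributor suffices

print(f"total effort: {total_effort:.2f}, weakest link: {weakest_link:.2f}, "
      f"best shot: {best_shot:.2f}")
```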

Journal ArticleDOI
TL;DR: Analysis of different clinimetric parameters supports the use of the TIS both clinically and in future stroke research, and guidelines for treatment and the level of quality of trunk activity can be derived from the assessment.
Abstract: Objective: To examine the clinimetric characteristics of the Trunk Impairment Scale (TIS). This newly developed scale evaluates motor impairment of the trunk after stroke. The TIS scores static and dynamic sitting balance as well as trunk co-ordination on a range from 0 to 23. It also aims to score the quality of trunk movement and to be a guide for treatment. Design: Two physiotherapists observed each patient simultaneously, but scored independently. Each patient was re-examined by one of the therapists. Subjects: Twenty-eight patients in a rehabilitation setting. Results: Kappa and weighted kappa values for item-per-item reliability ranged, for all but two items, from 0.62 to 1. All percentages of agreement exceeded 81%. Intraclass correlations (ICC) for the summed scores of the different subscales were between 0.85 and 0.99. Test–retest and interobserver reliability for the TIS total score (ICC) was 0.96 and 0.99, respectively. The 95% limits of agreement for the test–retest and interexaminer measurement error...

Book
01 Jan 2004
TL;DR: This book covers describing test scores, investigating relationships among different sets of test scores, analyzing test tasks, investigating reliability for norm-referenced and criterion-referenced tests, making statistical inferences, and investigating validity.
Abstract: Contents: 1. Basic concepts and terms; 2. Describing test scores; 3. Investigating relationships among different sets of test scores; 4. Analyzing test tasks; 5. Investigating reliability for norm-referenced tests; 6. Investigating reliability for criterion-referenced tests; 7. Stating hypotheses and making statistical inferences; 8. Tests of statistical significance; 9. Investigating validity; 10. Reporting and interpreting test scores.

Journal ArticleDOI
TL;DR: The reliability of continuous or binary outcome measures is usually assessed by estimation of the intraclass correlation coefficient (ICC), and the optimal allocation for the number of subjects k and the number of repeated measurements n that minimize the variance of the estimated ICC is discussed.
Abstract: The reliability of continuous or binary outcome measures is usually assessed by estimation of the intraclass correlation coefficient (ICC). A crucial step for this purpose is the determination of the required sample size. In this review, we discuss the contributions made in this regard and derive the optimal allocation for the number of subjects k and the number of repeated measurements n that minimize the variance of the estimated ICC. Cost constraints are discussed for both normally and non-normally distributed responses, with emphasis on the case of dichotomous assessments. Tables showing optimal choices of k and n are given along with the guidelines for the efficient design of reliability studies.
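As a concrete instance of the allocation problem, the sketch below uses a commonly cited large-sample approximation to the variance of the one-way ANOVA ICC estimator to pick n when the total number of measurements k·n is fixed; the exact variance expressions and cost models in the review may differ, so treat this as an assumption-laden illustration:

```python
# Hedged sketch: a commonly cited large-sample approximation to the variance of
# the one-way ANOVA ICC estimator, used to pick the number of repeated measurements
# n per subject when the total budget k*n is fixed. The review's exact expressions
# and cost models may differ.
def icc_variance(rho, k, n):
    return 2 * (1 - rho) ** 2 * (1 + (n - 1) * rho) ** 2 / (n * (n - 1) * (k - 1))

budget, rho_guess = 120, 0.6          # total measurements and anticipated ICC (illustrative)
best = min(
    ((n, budget // n) for n in range(2, 11) if budget // n >= 2),
    key=lambda kn: icc_variance(rho_guess, kn[1], kn[0]),
)
n_opt, k_opt = best
print(f"n = {n_opt} measurements on k = {k_opt} subjects, "
      f"var(ICC_hat) ~ {icc_variance(rho_guess, k_opt, n_opt):.5f}")
```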

BookDOI
21 Jul 2004
TL;DR: SPSS for Introductory Statistics, Third Edition, is designed to help students analyze and interpret research data, with SPSS syntax and output included along with sets of problems to solve.
Abstract: This book distinguishes itself from other SPSS resources through its unique integration of the research process (including design) and the use and interpretation of the statistics. Designed to help students analyze and interpret research data, the authors demonstrate how to choose the appropriate statistic based on the research design, interpret SPSS output, and write about the output in a research paper. The authors describe the use and interpretation of these statistics in user-friendly, non-technical language. The book prepares students for all of the steps in the research process, from design and data collection to writing about the results. The new edition features SPSS 14.0 for Windows, but can also be used with older and newer versions. There are also new problems, expanded discussions of effect sizes, and an expanded appendix on getting started with SPSS. The book features discussions of writing about outputs, data entry and checking, reliability assessment, testing assumptions, and descriptive, inferential, and nonparametric statistics. Several related statistics are included in each chapter. SPSS syntax, along with the output, is included for those who prefer this format. Two realistic data sets are available on the book’s CD and are used to solve the end-of-chapter problems. SPSS for Introductory Statistics, Third Edition, provides these helpful teaching tools:
• All of the key SPSS windows needed to perform the analyses
• Complete outputs with call-out boxes to highlight key points
• Interpretation sections and questions to help students better understand the output
• Lab assignments organized the way students proceed when they conduct a research project
• Extra SPSS problems for practice in running and interpreting SPSS
• Helpful appendices on how to get started with SPSS, write research questions, and create tables and figures
This book is an ideal supplement for courses in either statistics or research methods taught in departments of psychology, education, and other social and health sciences. The Instructor’s Resource CD features PowerPoint slides and answers to, and additional information on, the questions and problems.

Journal ArticleDOI
TL;DR: The therapeutic relationship is a reliable predictor of patient outcome in mainstream psychiatric care and may need to take account of different, specific aspects of the relationship in psychiatric settings such as greater heterogeneity of treatment components and goals.
Abstract: Aims: To review the methods and findings from studies of the therapeutic relationship (TR) in the treatment of severe mental illness.Method: A literature search was conducted to identify all studies that used an operationalised measurement of the TR in the treatment of severe mental illness. Results: Fifteen scales–the majority of which were developed for psychotherapy–and the expressed emotion index have been used. Most scales have acceptable internal, inter-rater and test–retest reliability. As none of the scales has been used in more than five studies, no single scale is widely established in psychiatric research. A more positive relationship consistently predicts a better short-and long-term outcome. It appears that a large global factor accounts for the greatest proportion of the variance in the therapeutic relationship.Conclusions: The therapeutic relationship is a reliable predictor of patient outcome in mainstream psychiatric care. Valid assessments may need to take account of different, specific ...

Journal ArticleDOI
TL;DR: Spatial-temporal gait measurements demonstrate good to excellent test-retest reliability over a one-week time span.
Abstract: Background: The purpose of this study was to determine the test-retest reliability of temporal and spatial gait measurements over a one-week period as measured using an instrumented walkway system (GAITRite®). Methods: Subjects were tested on two occasions one week apart. Measurements were made at preferred and fast walking speeds using the GAITRite® system. Measurements tested included walking speed, step length, stride length, base of support, step time, stride time, swing time, stance time, single and double support times, and toe in-toe out angle. Results: Twenty-one healthy subjects participated in this study. The group consisted of 12 men and 9 women, with an average age of 34 years (range: 19–59 years). At preferred walking speed, all gait measurements had ICCs of 0.92 and higher, except base of support, which had an ICC of 0.80. At fast walking speed, all gait measurements had ICCs above 0.89 except base of support (ICC = 0.79). Conclusions: Spatial-temporal gait measurements demonstrate good to excellent test-retest reliability over a one-week time span.
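Test-retest ICCs of this kind are typically computed from a two-way ANOVA. The following sketch computes a single-measure, absolute-agreement ICC (Shrout and Fleiss ICC(2,1)) for two sessions on simulated walking-speed data; the study's own ICC variant may differ:

```python
# Hedged sketch: a single-measure, two-way random-effects ICC (Shrout & Fleiss
# ICC(2,1), absolute agreement) for two test sessions, computed from ANOVA mean
# squares. The walking-speed data are simulated, not the study's data.
import numpy as np

rng = np.random.default_rng(3)
n = 21                                               # subjects
week1 = rng.normal(1.35, 0.15, n)                    # preferred walking speed (m/s), made up
week2 = week1 + rng.normal(0.0, 0.04, n)             # retest one week later
data = np.column_stack([week1, week2])               # shape (n, k) with k = 2 sessions
k = data.shape[1]

grand = data.mean()
msr = k * ((data.mean(axis=1) - grand) ** 2).sum() / (n - 1)        # subjects (rows)
msc = n * ((data.mean(axis=0) - grand) ** 2).sum() / (k - 1)        # sessions (columns)
resid = data - data.mean(axis=1, keepdims=True) - data.mean(axis=0, keepdims=True) + grand
mse = (resid ** 2).sum() / ((n - 1) * (k - 1))

icc_2_1 = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
print(f"ICC(2,1) = {icc_2_1:.3f}")
```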

Journal Article
TL;DR: In this article, the authors present an approach for determining the reliability of component-based software architectures, based on a rich architecture definition language (RADL) oriented towards industrial middleware platforms such as Microsoft's .NET and Sun's EJB.
Abstract: One of the motivations for specifying software architectures explicitly is the use of high-level structural design information for improved control and prediction of software system quality attributes. In this paper, we present an approach for determining the reliability of component-based software architectures. Our method is based on a rich architecture definition language (RADL) oriented towards modern industrial middleware platforms, such as Microsoft's .NET and Sun's EJB. Our methods involve parameterised contractual specifications based on state machines and thus permit efficient static analysis. We show how RADL allows software architects to predict component reliability through compositional analysis of usage profiles and of environment component reliability. We illustrate our approach with an e-commerce example and report about empirical measurements which confirm our analytical reliability prediction through monitoring in our reliability test-bed. Our evaluation confirms that prediction accuracy for software components necessitates modelling the behaviour of binary components and the dependency of provided services on required components. Fortunately, our measurements also show that an abstract protocol view of that behaviour is sufficient to predict reliability with high accuracy. The reliability of a component most strongly depends on its environment. Therefore, we advocate a reliability model parameterized by required component reliability in a deployment context.
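Usage-profile-driven reliability prediction of this kind is often illustrated with a Markov-style model in which each visited component must succeed and control flow follows the usage profile. The sketch below is in that spirit (close to Cheung's user-oriented model) rather than the RADL tooling itself; the component names, transition probabilities, and reliabilities are hypothetical:

```python
# Illustration only (not RADL): a small usage-profile-driven reliability model in the
# spirit of Markov/Cheung-style prediction. Each visit to component i succeeds with
# probability r[i]; control then follows the transition probabilities P, where the
# last column denotes successful termination. All numbers are hypothetical.
import numpy as np

components = ["web", "cart", "payment"]
r = np.array([0.999, 0.995, 0.990])          # per-visit component reliabilities

# P[i, j]: probability control moves from component i to component j; the last
# column is the probability of terminating successfully after component i.
P = np.array([
    # web   cart  payment  done
    [0.20, 0.50, 0.00,    0.30],   # web
    [0.10, 0.10, 0.60,    0.20],   # cart
    [0.00, 0.05, 0.05,    0.90],   # payment
])
trans, done = P[:, :3], P[:, 3]

# Solve s = diag(r) @ (trans @ s + done), where s[i] is the probability of eventually
# terminating successfully given control is currently at component i.
A = np.eye(3) - np.diag(r) @ trans
s = np.linalg.solve(A, np.diag(r) @ done)
print(f"predicted system reliability starting at 'web': {s[0]:.4f}")
```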

Journal ArticleDOI
TL;DR: A sampling technique that uses lines to probe the failure domain, employed in conjunction with a stepwise procedure based on Markov chains, and exhibiting accelerated convergence.
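A minimal line sampling sketch is given below for a known limit-state function in standard normal space: each sample defines a line along an assumed important direction, the distance to the failure boundary is found along that line, and the conditional failure probabilities are averaged. The limit state, direction, and sample size are made up, and the paper's Markov-chain stepwise procedure for locating the important direction is not reproduced:

```python
# Minimal line sampling sketch for a failure probability in standard normal space.
# The limit state g, the important direction, and the sample size are illustrative.
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

rng = np.random.default_rng(4)

def g(x):
    """Limit-state function: failure when g(x) <= 0 (illustrative)."""
    return 3.0 - x[0] + 0.1 * x[1] ** 2

alpha = np.array([1.0, 0.0])          # assumed important direction (unit vector)
pf_terms = []
for _ in range(200):
    z = rng.standard_normal(2)
    z_perp = z - (z @ alpha) * alpha              # component orthogonal to alpha
    root = brentq(lambda c: g(z_perp + c * alpha), -20.0, 50.0)   # distance to failure
    pf_terms.append(norm.cdf(-root))              # conditional failure probability on this line

print(f"estimated failure probability: {np.mean(pf_terms):.2e}")
```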

Journal ArticleDOI
01 Sep 2004
TL;DR: In this paper, the authors present the major issues related to the validation of scales applied in health research and illustrate the steps of this process, including the selection of items, translation, validity, reliability, and usefulness.
Abstract: This article shows the major issues related to the validation of scales applied in health research and illustrates the steps of this process. Concepts related to the selection of items, translation, validity, reliability, and usefulness are discussed in this article.

Journal ArticleDOI
TL;DR: The revised version of the Health Assessment Questionnaire (HAQ) (the HAQ-II) is a reliable and valid 10-item questionnaire that performs at least as well as the HAQ and is simpler to administer and score.
Abstract: Objective The Health Assessment Questionnaire (HAQ) has become the most common tool for measuring functional status in rheumatology. However, the HAQ is long (34 questions, including 20 concerning activities of daily living and 14 relating to the use of aids and devices) and somewhat burdensome to score, has some floor effects, and has psychometric problems relating to linearity and confusing items. We undertook this study to develop and validate a revised version of the HAQ (the HAQ-II). Methods Using Rasch analysis and a 31-question item bank, including 20 HAQ items, the 10-item HAQ-II was developed. Five original items from the HAQ were retained. We studied the HAQ-II in 14,038 patients with rheumatic disease over a 2-year period to determine its validity and reliability. Results The HAQ-II was reliable (reliability of 0.88, compared with 0.83 for the HAQ), measured disability over a longer scale than the HAQ, and had no nonfitting items and no gaps. Compared with the HAQ, modified HAQ, and Medical Outcomes Study Short Form 36 physical function scale, the HAQ-II was as well correlated or better correlated with clinical and outcome variables. The HAQ-II performed as well as the HAQ in a clinical trial and in prediction of mortality and work disability. The mean difference between the HAQ and HAQ-II scores was 0.02 units. Conclusion The HAQ-II is a reliable and valid 10-item questionnaire that performs at least as well as the HAQ and is simpler to administer and score. Conversion from HAQ to HAQ-II and from HAQ-II to HAQ for research purposes is simple and reliable. The HAQ-II can be used in all places where the HAQ is now used, and it may prove to be easier to use in the clinic.

Journal ArticleDOI
TL;DR: The research developed reliable and valid measures for the components of an online store image, and examined the relationships of these components to attitudes and intentions to purchase online.


Journal ArticleDOI
TL;DR: In this paper on uncertainty and reliability in geotechnical engineering, the authors argue that, in developing a model from geotechnical data, the distinction between the trend (or systematic error) and the spatial error is a modeling choice, not a property of nature.
Abstract: Uncertainty and risk are central features of geotechnical and geological engineering. Engineers can deal with uncertainty by ignoring it, by being conservative, by using the observational method, or by quantifying it. In recent years, reliability analysis and probabilistic methods have found wide application in geotechnical engineering and related fields. The tools are well known, including methods of reliability analysis and decision trees. Analytical models for deterministic geotechnical applications are also widely available, even if their underlying reliability is sometimes suspect. The major issues involve input and output. In order to develop appropriate input, the engineer must understand the nature of uncertainty and probability. Most geotechnical uncertainty reflects lack of knowledge, and probability based on the engineer’s degree of belief comes closest to the profession’s practical approach. Bayesian approaches are especially powerful because they provide probabilities on the state of nature rather than on the observations. The first point in developing a model from geotechnical data is that the distinction between the trend or systematic error and the spatial error is a modeling choice, not a property of nature. Second, properties estimated from small samples may be seriously in error, whether they are used probabilistically or deterministically. Third, experts generally estimate mean trends well but tend to underestimate uncertainty and to be overconfident in their estimates. In this context, engineering judgment should be based on a demonstrable chain of reasoning and not on speculation. One difficulty in interpreting results is that most people, including engineers, have difficulty establishing an allowable probability of failure or dealing with low values of probability. The F-N plot is one useful vehicle for comparing calculated probabilities with observed frequencies of failure of comparable facilities. In any comparison it must be noted that a calculated probability is a lower bound because it must fail to incorporate the factors that are ignored in the analysis. It is useful to compare probabilities of failure for alternative designs, and the reliability methods reveal the contributions of different components to the uncertainty in the probability of failure. Probability is not a property of the world but a state of mind; geotechnical uncertainty is primarily epistemic, Bayesian, and belief based. The current challenges to the profession are to make use of probabilistic methods in practice and to sharpen our investigations and analyses so that each additional data point provides maximal information.
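As a tiny illustration of the belief-based, Bayesian view advocated here, the sketch below updates a degree of belief about a per-facility failure probability as performance records of comparable facilities accumulate; the Beta prior and the observation counts are invented for the example:

```python
# Small illustration of the belief-based Bayesian view discussed above: updating a
# degree of belief about a per-facility failure probability as performance records
# accumulate. The Beta prior and the observation counts are invented for the example.
from scipy.stats import beta

prior_a, prior_b = 1.0, 49.0            # prior belief: failures are rare (~2% mean)
failures, survivals = 1, 120            # hypothetical record of comparable facilities

post_a, post_b = prior_a + failures, prior_b + survivals
posterior = beta(post_a, post_b)
print(f"posterior mean p_f = {posterior.mean():.4f}, "
      f"95% credible interval = ({posterior.ppf(0.025):.4f}, {posterior.ppf(0.975):.4f})")
```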

Book ChapterDOI
01 Jan 2004
TL;DR: This paper gives the main definitions relating to dependability, a generic concept including as special cases such attributes as reliability, availability, safety, confidentiality, integrity, maintainability, etc.
Abstract: This paper gives the main definitions relating to dependability, a generic concept including as special case such attributes as reliability, availability, safety, confidentiality, integrity, maintainability, etc. Basic definitions are given first. They are then commented upon, and supplemented by additional definitions, which address the threats to dependability (faults, errors, failures), and the attributes of dependability. The discussion on the attributes encompasses the relationship of dependability with security, survivability and trustworthiness.

Journal ArticleDOI
TL;DR: This study estimated the reliability of a questionnaire measuring various self-reported aspects of the neighborhood environment of possible relevance to cardiovascular disease; the results suggest that self-reported neighborhood characteristics can be reliably measured.
Abstract: The majority of studies examining the relation between neighborhood environments and health have used census-based indicators to characterize neighborhoods. These studies have shown that neighborhood socioeconomic characteristics are associated with a range of health outcomes. Establishing if these associations reflect causal relations requires testing hypotheses regarding how specific features of neighborhoods are related to specific health outcomes. However, there is little information on the reliability of neighborhood measures. The purpose of this study was to estimate the reliability of a questionnaire measuring various self-reported measures of the neighborhood environment of possible relevance to cardiovascular disease. The study consisted of a face-to-face and telephone interview administered twice to 48 participants over a 2-week period. The face-to-face and telephone portions of the interview lasted an average of 5 and 11 minutes, respectively. The questionnaire was piloted among a largely Latino and African American study sample recruited from a public hospital setting in New York City. Scales were used to assess six neighborhood domains: aesthetic quality, walking/exercise environment, safety from crime, violence, access to healthy foods, and social cohesion. Cronbach’s α values ranged from .77 to .94 for the scales corresponding to these domains, with test-retest correlations ranging from 0.78 to 0.91. In addition, neighborhood indices for presence of recreational facilities, quality of recreational facilities, neighborhood participation, and neighborhood problems were examined. Test-retest reliability measures for these indices ranged from 0.73 to 0.91. The results from this study suggested that self-reported neighborhood characteristics can be reliably measured.