Journal ArticleDOI

Statistical methods for assessing agreement between two methods of clinical measurement.

08 Feb 1986 - The Lancet (Elsevier) - Vol. 327, Iss. 8476, pp. 307-310
TL;DR: An alternative approach, based on graphical techniques and simple calculations, is described, together with the relation between this analysis and the assessment of repeatability.
About: This article was published in The Lancet on 8 February 1986. It has received 43,884 citations to date.
Citations
Journal ArticleDOI
13 Sep 1997 - BMJ
TL;DR: Funnel plots, plots of the trials' effect estimates against sample size, are skewed and asymmetrical in the presence of publication bias and other biases. Funnel plot asymmetry, measured by regression analysis, predicts discordance of results when meta-analyses are compared with single large trials.
Abstract: Objective: Funnel plots (plots of effect estimates against sample size) may be useful to detect bias in meta-analyses that were later contradicted by large trials. We examined whether a simple test of asymmetry of funnel plots predicts discordance of results when meta-analyses are compared with large trials, and we assessed the prevalence of bias in published meta-analyses. Design: Medline search to identify pairs consisting of a meta-analysis and a single large trial (concordance of results was assumed if effects were in the same direction and the meta-analytic estimate was within 30% of the trial); analysis of funnel plots from 37 meta-analyses identified from a hand search of four leading general medicine journals 1993-6 and 38 meta-analyses from the second 1996 issue of the Cochrane Database of Systematic Reviews. Main outcome measure: Degree of funnel plot asymmetry as measured by the intercept from regression of standard normal deviates against precision. Results: In the eight pairs of meta-analysis and large trial that were identified (five from cardiovascular medicine, one from diabetic medicine, one from geriatric medicine, one from perinatal medicine) there were four concordant and four discordant pairs. In all cases discordance was due to meta-analyses showing larger effects. Funnel plot asymmetry was present in three out of four discordant pairs but in none of the concordant pairs. In 14 (38%) journal meta-analyses and 5 (13%) Cochrane reviews, funnel plot asymmetry indicated that there was bias. Conclusions: A simple analysis of funnel plots provides a useful test for the likely presence of bias in meta-analyses, but as the capacity to detect bias will be limited when meta-analyses are based on a limited number of small trials, the results from such analyses should be treated with considerable caution. Key messages: Systematic reviews of randomised trials are the best strategy for appraising evidence; however, the findings of some meta-analyses were later contradicted by large trials. Funnel plots, plots of the trials' effect estimates against sample size, are skewed and asymmetrical in the presence of publication bias and other biases. Funnel plot asymmetry, measured by regression analysis, predicts discordance of results when meta-analyses are compared with single large trials. Funnel plot asymmetry was found in 38% of meta-analyses published in leading general medicine journals and in 13% of reviews from the Cochrane Database of Systematic Reviews. Critical examination of systematic reviews for publication and related biases should be considered a routine procedure.
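The asymmetry measure described above is the intercept from regressing each trial's standard normal deviate (effect estimate divided by its standard error) on its precision (the reciprocal of the standard error). A minimal Python sketch of such a regression follows; the effect estimates and standard errors are invented purely for illustration and are not data from the study.

```python
import numpy as np
from scipy import stats

# Hypothetical per-trial effect estimates (e.g. log odds ratios) and standard errors.
effect = np.array([-0.9, -0.6, -0.5, -0.3, -0.2, -0.1])
se = np.array([0.45, 0.35, 0.30, 0.20, 0.12, 0.08])

z = effect / se          # standard normal deviate for each trial
precision = 1.0 / se     # precision = reciprocal of the standard error

# Ordinary least-squares regression of z on precision; the intercept measures
# funnel plot asymmetry (an intercept near zero suggests a symmetric funnel).
X = np.column_stack([np.ones_like(precision), precision])
beta, *_ = np.linalg.lstsq(X, z, rcond=None)
residuals = z - X @ beta
dof = len(z) - 2
cov = (residuals @ residuals / dof) * np.linalg.inv(X.T @ X)
t_intercept = beta[0] / np.sqrt(cov[0, 0])
p_value = 2 * stats.t.sf(abs(t_intercept), df=dof)
print(f"intercept = {beta[0]:.2f}, two-sided p = {p_value:.3f}")
```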

37,989 citations

Journal ArticleDOI
TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.
Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations


Cites methods from "Statistical methods for assessing a..."

  • ...Such a plot is called a Bland-Altman plot [36] or a Tukey mean-difference plot [10]....


Journal ArticleDOI
TL;DR: A practical guideline for clinical researchers to choose the correct form of ICC is provided, and best practice for reporting ICC parameters in scientific publications is suggested.

12,717 citations

Journal ArticleDOI
TL;DR: The 95% limits of agreement, estimated by the mean difference ± 1.96 standard deviations of the differences, provide an interval within which 95% of differences between measurements by the two methods are expected to lie.
Abstract: Agreement between two methods of clinical measurement can be quantified using the differences between observations made using the two methods on the same subjects. The 95% limits of agreement, estimated by mean difference +/- 1.96 standard deviation of the differences, provide an interval within which 95% of differences between measurements by the two methods are expected to lie. We describe how graphical methods can be used to investigate the assumptions of the method and we also give confidence intervals. We extend the basic approach to data where there is a relationship between difference and magnitude, both with a simple logarithmic transformation approach and a new, more general, regression approach. We discuss the importance of the repeatability of each method separately and compare an estimate of this to the limits of agreement. We extend the limits of agreement approach to data with repeated measurements, proposing new estimates for equal numbers of replicates by each method on each subject, for unequal numbers of replicates, and for replicated data collected in pairs, where the underlying value of the quantity being measured is changing. Finally, we describe a nonparametric approach to comparing methods.
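As a concrete illustration of the limits-of-agreement calculation summarised above, the sketch below computes the bias (mean difference) and the 95% limits of agreement from paired measurements by two methods. The arrays are hypothetical values invented for the example, not data from the paper.

```python
import numpy as np

# Hypothetical paired measurements of the same quantity by two methods on the same subjects.
method_a = np.array([101.0, 96.0, 110.0, 84.0, 93.0, 120.0, 105.0, 98.0])
method_b = np.array([98.0, 94.0, 114.0, 80.0, 96.0, 118.0, 101.0, 95.0])

diff = method_a - method_b
bias = diff.mean()             # mean difference between the methods
sd_diff = diff.std(ddof=1)     # sample standard deviation of the differences

# 95% limits of agreement: mean difference +/- 1.96 SD of the differences.
lower = bias - 1.96 * sd_diff
upper = bias + 1.96 * sd_diff
print(f"bias = {bias:.2f}, 95% limits of agreement = ({lower:.2f}, {upper:.2f})")

# A Bland-Altman plot would put (method_a + method_b) / 2 on the x-axis and diff
# on the y-axis, with horizontal lines at the bias and at the two limits.
```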

7,976 citations

Journal ArticleDOI
TL;DR: The criteria can be used in systematic reviews of health status questionnaires, to detect shortcomings and gaps in knowledge of measurement properties, and to design validation studies.

7,439 citations


Cites methods from "Statistical methods for assessing a..."

  • ...Another adequate parameter of agreement is described by Bland and Altman [34]....


  • ...[34] Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement....


  • ...[35] Altman DG. Practical statistics for medical research....


  • ...In both cases, we consider a sample size of at least 50 patients adequate for the assessment of the agreement parameter, based on a general guideline by Altman [35]....


References
Journal ArticleDOI
TL;DR: This paper describes what is usually done, shows why this is inappropriate, suggests a better approach, and asks why such studies are done so badly.
Abstract: In medicine we often want to compare two different methods of measuring some quantity, such as blood pressure, gestational age, or cardiac stroke volume. Sometimes we compare an approximate or simple method with a very precise one. This is a calibration problem, and we shall not discuss it further here. Frequently, however, we cannot regard either method as giving the true value of the quantity being measured. In this case we want to know whether the methods give answers which are, in some sense, comparable. For example, we may wish to see whether a new, cheap and quick method produces answers that agree with those from an established method sufficiently well for clinical purposes. Many such studies, using a variety of statistical techniques, have been reported. Yet few really answer the question “Do the two methods of measurement agree sufficiently closely?” In this paper we shall describe what is usually done, show why this is inappropriate, suggest a better approach, and ask why such studies are done so badly. We will restrict our consideration to the comparison of two methods of measuring a continuous variable, although similar problems can arise with categorical variables.
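The distinction between a strong relation and agreement can be seen with a tiny numerical example: below, a hypothetical second method reads systematically higher than the first, so the correlation coefficient is essentially perfect even though the two methods plainly do not agree. The numbers are invented for illustration only.

```python
import numpy as np

# Hypothetical readings: method B is linearly related to method A but reads much higher.
a = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0])
b = 1.5 * a + 2.0

r = np.corrcoef(a, b)[0, 1]
mean_diff = (b - a).mean()
print(f"correlation r = {r:.3f}")            # r = 1.000: a perfect linear relation
print(f"mean difference = {mean_diff:.1f}")  # large systematic disagreement despite r = 1
```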

3,847 citations


"Statistical methods for assessing a..." refers background in this paper

  • ...However, this high correlation does not mean that the two methods agree: (1) r measures the strength of a relation between two variables, not the agreement between them....


Journal ArticleDOI
TL;DR: Positive correlations between pre-treatment blood pressure and the fall in pressure after treatment were examined for most classes of antihypertensive drugs; the findings suggest that for all manoeuvres the response is related to the height of the pre-treatment pressure.

179 citations

Journal Article

99 citations

Journal ArticleDOI
21 Mar 1981 - BMJ

90 citations


"Statistical methods for assessing a..." refers methods in this paper

  • ...discussed method comparison studies in an article in the British Medical Journal and emphasized the importance of looking at between-method differences, rather than correlation.(4)...


Journal ArticleDOI
29 Nov 1980 - BMJ
TL;DR: This method was used to compare observed numbers of deaths from five types of leukaemia with their respective "expected" numbers, but seven deaths is far too few for such tests.
Abstract: The t tests to compare two groups of measurements are used extremely widely, but often incorrectly [2-4]. The problems usually relate to the data not complying with the underlying statistical assumption that the two sets of data come from populations that are Normal and have the same variance. Another serious error is to ignore the fact that the two sets of measurements relate to the same (or matched) individuals, in which case the paired t test is needed. These problems are fairly familiar and have been well illustrated by White [3], so I will not consider them further here. Although generally posing fewer problems, chi-squared tests for comparing proportions also suffer some abuse, notably where there are too few observations. The sample size constraint also applies to the form of chi-squared test which simply entails comparing observed and expected frequencies. This method was used to compare observed numbers of deaths from five types of leukaemia (0, 1, 2, 4, 0) with their respective "expected" numbers (2, 1, 1, 3, 0) [5], but seven deaths is far too few for such tests.
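The closing example in the abstract (observed deaths 0, 1, 2, 4, 0 against "expected" 2, 1, 1, 3, 0) can be used to show why so few events undermine a chi-squared comparison of observed and expected frequencies. The sketch below checks the conventional small-expected-count rule and computes the statistic over the categories with non-zero expected counts; it illustrates the warning rather than recommending the analysis.

```python
import numpy as np

# Observed and "expected" deaths for five types of leukaemia, as quoted in the abstract.
observed = np.array([0, 1, 2, 4, 0])
expected = np.array([2, 1, 1, 3, 0])

# A common rule of thumb: the chi-squared approximation is unreliable when expected
# counts are small (often stated as any expected count below about 5).
print("total deaths:", observed.sum())  # 7 in total
print("categories with expected count < 5:", int((expected < 5).sum()), "of", len(expected))

# Computing the statistic anyway, skipping the zero-expected category to avoid 0/0.
mask = expected > 0
chi2 = (((observed[mask] - expected[mask]) ** 2) / expected[mask]).sum()
print(f"chi-squared statistic = {chi2:.2f}  (approximation not trustworthy here)")
```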

62 citations