Journal Article

Assessing reports of therapeutic trials.

01 May 1970-British Journal of Pharmacology (Wiley-Blackwell)-Vol. 39, Iss: 1
About: This article was published in the British Journal of Pharmacology on 1970-05-01 and is currently open access. It has received 73 citations to date.
Citations
Journal ArticleDOI
TL;DR: It is shown that it is feasible to develop a checklist that can be used to assess the methodological quality not only of randomised controlled trials but also of non-randomised studies, and that such a checklist can provide a profile of the paper, alerting reviewers to its particular methodological strengths and weaknesses.
Abstract: OBJECTIVE: To test the feasibility of creating a valid and reliable checklist with the following features: appropriate for assessing both randomised and non-randomised studies; provision of both an overall score for study quality and a profile of scores not only for the quality of reporting, internal validity (bias and confounding) and power, but also for external validity. DESIGN: A pilot version was first developed, based on epidemiological principles, reviews, and existing checklists for randomised studies. Face and content validity were assessed by three experienced reviewers and reliability was determined using two raters assessing 10 randomised and 10 non-randomised studies. Using different raters, the checklist was revised and tested for internal consistency (Kuder-Richardson 20), test-retest and inter-rater reliability (Spearman correlation coefficient and sign rank test; kappa statistics), criterion validity, and respondent burden. MAIN RESULTS: The performance of the checklist improved considerably after revision of a pilot version. The Quality Index had high internal consistency (KR-20 = 0.89), as did the subscales apart from external validity (KR-20 = 0.54). Test-retest (r = 0.88) and inter-rater (r = 0.75) reliability of the Quality Index were good. Reliability of the subscales varied from good (bias) to poor (external validity). The Quality Index correlated highly with an existing, established instrument for assessing randomised studies (r = 0.90). There was little difference between its performance with non-randomised and with randomised studies. Raters took about 20 minutes to assess each paper (range 10 to 45 minutes). CONCLUSIONS: This study has shown that it is feasible to develop a checklist that can be used to assess the methodological quality not only of randomised controlled trials but also of non-randomised studies.
It has also shown that it is possible to produce a checklist that provides a profile of the paper, alerting reviewers to its particular methodological strengths and weaknesses. Further work is required to improve the checklist and the training of raters in the assessment of external validity.
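For readers unfamiliar with the Kuder-Richardson 20 statistic reported above, here is a minimal sketch of the computation on hypothetical 0/1 checklist ratings. The data below are illustrative only, not from the study:

```python
import numpy as np

# Hypothetical ratings: 5 papers (rows) scored on 8 dichotomous
# checklist items (columns); 1 = criterion met, 0 = not met.
scores = np.array([
    [1, 1, 1, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 1],
])

def kr20(items: np.ndarray) -> float:
    """Kuder-Richardson formula 20 for dichotomous (0/1) items."""
    k = items.shape[1]                         # number of items
    p = items.mean(axis=0)                     # proportion scoring 1 per item
    q = 1.0 - p
    total_var = items.sum(axis=1).var(ddof=1)  # variance of papers' total scores
    return (k / (k - 1)) * (1.0 - (p * q).sum() / total_var)

print(round(kr20(scores), 3))  # → 0.715
```

Values near 1 indicate that the checklist items vary together across papers, i.e. high internal consistency; the 0.89 reported for the Quality Index is in that range.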

6,849 citations

Journal ArticleDOI
TL;DR: The inability of case-mix adjustment methods to compensate for selection bias and the inability to identify non-randomised studies that are free of selection bias indicate that non-randomised studies should only be undertaken when RCTs are infeasible or unethical.
Abstract: OBJECTIVES: To consider methods and related evidence for evaluating bias in non-randomised intervention studies. DATA SOURCES: Systematic reviews and methodological papers were identified from a search of electronic databases, handsearches of key medical journals, and contact with experts working in the field. New empirical studies were conducted using data from two large randomised clinical trials. METHODS: Three systematic reviews and new empirical investigations were conducted. The reviews considered, in regard to non-randomised studies, (1) the existing evidence of bias, (2) the content of quality assessment tools, and (3) the ways that study quality has been assessed and addressed. The empirical investigations were conducted by generating non-randomised studies from two large, multicentre randomised controlled trials (RCTs), selectively resampling trial participants according to allocated treatment, centre, and period. RESULTS: In the systematic reviews, eight studies compared results of randomised and non-randomised studies across multiple interventions using meta-epidemiological techniques. A total of 194 tools were identified that could be or had been used to assess non-randomised studies. Sixty tools covered at least five of six pre-specified internal validity domains. Fourteen tools covered three of four core items of particular importance for non-randomised studies. Six tools were thought suitable for use in systematic reviews. Of 511 systematic reviews that included non-randomised studies, only 169 (33%) assessed study quality. Sixty-nine reviews investigated the impact of quality on study results in a quantitative manner. The new empirical studies estimated the bias associated with non-random allocation and found that the bias could lead to consistent over- or underestimation of treatment effects; the bias also increased variation in results for both historical and concurrent controls, owing to haphazard differences in case-mix between groups.
The biases were large enough to lead studies falsely to conclude significant findings of benefit or harm. Four strategies for case-mix adjustment were evaluated: none adequately adjusted for bias in historically and concurrently controlled studies. Logistic regression on average increased bias. Propensity score methods performed better, but were not satisfactory in most situations. Detailed investigation revealed that adequate adjustment can only be achieved in the unrealistic situation when selection depends on a single factor. CONCLUSIONS: Results of non-randomised studies sometimes, but not always, differ from results of randomised studies of the same intervention. Non-randomised studies may still give seriously misleading results when treated and control groups appear similar in key prognostic factors. Standard methods of case-mix adjustment do not guarantee removal of bias. Residual confounding may be high even when good prognostic data are available, and in some situations adjusted results may appear more biased than unadjusted results. Although many quality assessment tools exist and have been used for appraising non-randomised studies, most omit key quality domains. Healthcare policies based upon non-randomised studies or systematic reviews of non-randomised studies may need re-evaluation if the uncertainty in the true evidence base was not fully appreciated when policies were made. The inability of case-mix adjustment methods to compensate for selection bias and our inability to identify non-randomised studies that are free of selection bias indicate that non-randomised studies should only be undertaken when RCTs are infeasible or unethical. 
Recommendations for further research include: applying the resampling methodology in other clinical areas to ascertain whether the biases described are typical; developing or refining existing quality assessment tools for non-randomised studies; investigating how quality assessments of non-randomised studies can be incorporated into reviews and the implications of individual quality features for interpretation of a review's results; examination of the reasons for the apparent failure of case-mix adjustment methods; and further evaluation of the role of the propensity score.
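The review's finding that propensity score methods "performed better, but were not satisfactory in most situations" can be illustrated with a small simulation. The sketch below uses hypothetical data, not the review's method or datasets: it generates a non-randomised comparison in which a single prognostic factor drives both treatment selection and outcome, then applies propensity score stratification, one class of case-mix adjustment the review evaluated:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated non-randomised study: one prognostic covariate x influences
# both who gets treated and the outcome, so the treated and control
# groups differ systematically in case-mix.
n = 2000
x = rng.normal(size=n)
t = rng.binomial(1, 1 / (1 + np.exp(-1.5 * x)))   # sicker patients treated more
y = 2.0 * x + 1.0 * t + rng.normal(size=n)        # true treatment effect = 1.0

naive = y[t == 1].mean() - y[t == 0].mean()       # confounded estimate

# Fit a propensity model P(t = 1 | x) by logistic regression (gradient ascent).
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(w * x + b)))
    w += 0.1 * ((t - p) * x).mean()
    b += 0.1 * (t - p).mean()
ps = 1 / (1 + np.exp(-(w * x + b)))

# Stratify on propensity quintiles; average within-stratum differences.
strata = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))
effects = []
for s in range(5):
    m = strata == s
    if t[m].sum() > 0 and (1 - t[m]).sum() > 0:
        effects.append(y[m & (t == 1)].mean() - y[m & (t == 0)].mean())
adjusted = np.mean(effects)

print(f"naive: {naive:.2f}  propensity-adjusted: {adjusted:.2f}  truth: 1.00")
```

Even in this idealised single-factor setting, stratification only shrinks the bias rather than removing it, since case-mix still varies within each stratum; with selection on multiple or unmeasured factors, as the review notes, residual confounding can be much worse.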

2,651 citations

Journal ArticleDOI
TL;DR: An annotated bibliography of scales and checklists developed to assess quality is presented, giving readers a quantitative index of the likelihood that the reported methodology and results are free of bias.

1,295 citations

Journal ArticleDOI
TL;DR: Empirical evidence indicates that differences in scale development can lead to important differences in quality assessment, and several methods for including quality scores in systematic reviews have been proposed, but since little empirical evidence supports any given method, results must be interpreted cautiously.
Abstract: Assessing the quality of randomized controlled trials is a relatively new and important development. Three approaches have been developed: component, checklist, and scale assessment. Component approaches evaluate selected aspects of trials, such as masking. Checklists and scales involve lists of items thought to be integral to study quality. Scales, unlike the other methods, provide a summary numeric score of quality, which can be formally incorporated into a systematic review. Most scales to date have not been developed with sufficient rigor, however. Empirical evidence indicates that differences in scale development can lead to important differences in quality assessment. Several methods for including quality scores in systematic reviews have been proposed, but since little empirical evidence supports any given method, results must be interpreted cautiously. Future efforts may be best focused on gathering more empirical evidence to identify trial characteristics directly related to bias in the estimates of intervention effects and on improving the way in which trials are reported.
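The claim that differences in scale development can change quality assessments can be made concrete with a toy example. The item names and weights below are hypothetical, not from any published scale; the point is that two defensible weighting schemes reverse the ranking of the same two trials:

```python
# Hypothetical checklist ratings for two trials (1 = criterion met, 0 = not).
trials = {
    "trial_A": {"randomisation": 1, "blinding": 0, "dropouts": 1, "sample_size": 1},
    "trial_B": {"randomisation": 0, "blinding": 1, "dropouts": 1, "sample_size": 1},
}

# Two made-up scales scoring the same items with different weights.
scale_1 = {"randomisation": 3, "blinding": 1, "dropouts": 1, "sample_size": 1}
scale_2 = {"randomisation": 1, "blinding": 3, "dropouts": 1, "sample_size": 1}

def score(items: dict, weights: dict) -> int:
    """Summary quality score: weighted sum of met criteria."""
    return sum(weights[k] * v for k, v in items.items())

for name, items in trials.items():
    print(name, score(items, scale_1), score(items, scale_2))
# scale_1 ranks trial_A higher; scale_2 ranks trial_B higher.
```

This is why summary scores from scales developed without empirical justification must be interpreted cautiously, as the abstract argues.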

454 citations

Journal ArticleDOI
TL;DR: A set of statistical reporting guidelines, suitable for medical journals to include in their Instructions for Authors, tells authors, journal editors, and reviewers how to report basic statistical methods and results.

344 citations