Showing papers by "Gordon H. Guyatt" published in 2004


Journal ArticleDOI
19 Jun 2004-BMJ
TL;DR: A system for grading the quality of evidence and the strength of recommendations that can be applied across a wide range of interventions and contexts is developed, and a summary of the approach from the perspective of a guideline user is presented.
Abstract: Users of clinical practice guidelines and other recommendations need to know how much confidence they can place in the recommendations. Systematic and explicit methods of making judgments can reduce errors and improve communication. We have developed a system for grading the quality of evidence and the strength of recommendations that can be applied across a wide range of interventions and contexts. In this article we present a summary of our approach from the perspective of a guideline user. Judgments about the strength of a recommendation require consideration of the balance between benefits and harms, the quality of the evidence, translation of the evidence into specific circumstances, and the certainty of the baseline risk. It is also important to consider costs (resource utilisation) before making a recommendation. Inconsistencies among systems for grading the quality of evidence and the strength of recommendations reduce their potential to facilitate critical appraisal and improve communication of these judgments. Our system for guiding these complex judgments balances the need for simplicity with the need for full and transparent consideration of all important issues.

7,608 citations


Journal ArticleDOI
TL;DR: The objective was to critically appraise six prominent systems for grading levels of evidence and the strength of recommendations as a basis for agreeing on characteristics of a common, sensible approach.
Abstract: A number of approaches have been used to grade levels of evidence and the strength of recommendations. The use of many different approaches detracts from one of the main reasons for having explicit approaches: to concisely characterise and communicate this information so that it can easily be understood and thereby help people make well-informed decisions. Our objective was to critically appraise six prominent systems for grading levels of evidence and the strength of recommendations as a basis for agreeing on characteristics of a common, sensible approach to grading levels of evidence and the strength of recommendations. Six prominent systems for grading levels of evidence and strength of recommendations were selected and someone familiar with each system prepared a description of each of these. Twelve assessors independently evaluated each system based on twelve criteria to assess the sensibility of the different approaches. Systems used by 51 organisations were compared with these six approaches. There was poor agreement about the sensibility of the six systems. Only one of the systems was suitable for all four types of questions we considered (effectiveness, harm, diagnosis and prognosis). None of the systems was considered usable for all of the target groups we considered (professionals, patients and policy makers). The raters found low reproducibility of judgements made using all six systems. Systems used by 51 organisations that sponsor clinical practice guidelines included a number of minor variations of the six systems that we critically appraised. All of the currently used approaches to grading levels of evidence and the strength of recommendations have important shortcomings.

975 citations


Journal ArticleDOI
01 Sep 2004-Chest
TL;DR: The title from a conference emphasizing consensus to "ACCP Conference on Antithrombotic and Thrombolytic Therapy: Evidence-Based Guidelines" reflects the evidence-based approach to making recommendations.

428 citations


Journal ArticleDOI
TL;DR: Lower within-person variability was responsible for the higher test-retest reliability of the interviewer-administered format while between person variability was similar for both formats.
Abstract: Assessment of health-related quality of life (HRQL) is important in patients with chronic obstructive pulmonary disease (COPD). Despite the high prevalence of COPD in Germany, Switzerland and Austria there is no validated disease-specific instrument available. The objective of this study was to translate the Chronic Respiratory Questionnaire (CRQ), one of the most widely used respiratory HRQL questionnaires, into German, develop an interviewer- and self-administered version including both standardised and individualised dyspnoea questions, and validate these versions in two randomised studies. We recruited three groups of patients with COPD in Switzerland, Germany and Austria. The 44 patients of the first group completed the CRQ during pilot testing to adapt the CRQ to German-speaking patients. We then recruited 80 patients participating in pulmonary rehabilitation programs to assess internal consistency reliability and cross-sectional validity of the CRQ. The third group consisted of 38 patients with stable COPD without an intervention to assess test-retest reliability. To compare the interviewer- and self-administered versions, we randomised patients in groups 2 and 3 to the interviewer- or self-administered CRQ. Patients completed both the standardised and individualised dyspnoea questions. For both administration formats and all domains, we found good internal consistency reliability (Cronbach's alpha between 0.73 and 0.89). Cross-sectional validity tended to be better for the standardised compared to the individualised dyspnoea questions and cross-sectional validity was slightly better for the self-administered format. Test-retest reliability was good for both the interviewer-administered CRQ (intraclass correlation coefficients for different domains between 0.81 and 0.95) and the self-administered format (intraclass correlation coefficients between 0.78 and 0.86). Lower within-person variability was responsible for the higher test-retest reliability of the interviewer-administered format while between-person variability was similar for both formats. Investigators in German-speaking countries can choose between valid and reliable self- and interviewer-administered CRQ formats.

339 citations


Journal Article
TL;DR: Results suggest good construct validity, internal consistency, reliability and test-retest reliability, but do not demonstrate discriminative validity, which is consistent with theoretical models of oral disease and its consequences.
Abstract: Purpose This study measured oral health-related quality of life for children, which involved the construction of child perceptions questionnaires (CPQs) for ages 6 to 7, 8 to 10, and 11 to 14. The purpose of this study was to present the development and evaluation of the CPQ for 8- to 10-year-olds (CPQ8-10). Methods Questions (N=25) were selected from the CPQ for 11- to 14-year-olds based on the child development literature and input from parents, child psychologist, and teacher of grades 3 and 4. Validity and reliability were evaluated on 68 and 33 children, respectively. Results There was a positive moderate correlation between the CPQ8-10 score and overall well-being rating (R=.45). The level of impact was slightly higher in the orofacial than in the pediatric dentistry group (mean score=19.1 vs 18.4, respectively). Hypotheses concerning the relationship between the CPQ8-10 score and number of decayed surfaces were confirmed with R=.29, and the mean score higher in caries-afflicted than caries-free children (21.1 vs 14.7). The Cronbach's alpha and intraclass correlation coefficients were 0.89 and 0.75, respectively. Conclusions Results suggest good construct validity, internal consistency, reliability and test-retest reliability, but do not demonstrate discriminative validity. This is consistent, however, with theoretical models of oral disease and its consequences. Further research is required, as these are preliminary findings based on convenience sampling.

291 citations


Book
23 Dec 2004
TL;DR: The authors have enhanced the usefulness of the book as a reference manual by separating the chapters of part 2 into six units, and to cover what every nurse and nursing student should know about using the health-care literature.
Abstract: This 600-page reference guide is designed to assist nurses from various backgrounds in better understanding and using the health-care literature. Based on the Users’ Guide to the Medical Literature: A Manual for Evidence-Based Clinical Practice, edited by Gordon Guyatt and Drummond Rennie, Evidence-Based Nursing (EBN) is written specifically for nurses and students engaged in practice, research, and education. Like the medical version, the EBN is divided into 36 chapters presented in two sections, in this case The Basics: Using the Nursing Literature and Beyond the Basics: Using and Teaching the Principles of Evidence-Based Nursing. The authors have enhanced the usefulness of the book as a reference manual by separating the chapters of part 2 into six units. The purpose of part 1 is twofold: to cover what every nurse and nursing student should know about using the health-care literature, and to present a curriculum for basic and continuing education. This material is presented in 11 chapters:

282 citations


Journal ArticleDOI
TL;DR: Evidence supporting the measurement properties of the MacNew Heart Disease Health-related Quality of Life Questionnaire, designed to evaluate how daily activities and physical, emotional, and social functioning are affected by coronary heart disease and its treatment, is reviewed.
Abstract: Background: The measurement of health, the effects of disease, and the impact of health care include not only an indication of changes in disease frequency and severity but also an estimate of patients' perception of health status before and after treatment. One of the more important developments in health care in the past decade may be the recognition that the patient's perspective is as legitimate and valid as the clinician's in monitoring health care outcomes. This has led to the development of instruments to quantify the patients' perception of their health status before and after treatment. Methods: We review evidence supporting the measurement properties of the MacNew Heart Disease Health-related Quality of Life [MacNew] Questionnaire which was designed to evaluate how daily activities and physical, emotional, and social functioning are affected by coronary heart disease and its treatment. Results: Reliability was demonstrated by using internal consistency and the intraclass correlation coefficients for the three domains in the Dutch, English, Farsi, German, and Spanish versions of the MacNew. With internal consistency and intraclass correlation coefficients ≥0.73, reliability is high. Validity of the MacNew was examined with factor analysis and three core underlying factors, physical, emotional, and social, were identified, explaining 63.0 – 66.5% of the observed variance and replicated in the translations with psychometric data. Construct validity of the MacNew was further demonstrated by extensive substantiation of the logical relationships, defined a priori, between items and other comparison tools. The MacNew is responsive and sensitive to changes in HRQL following various interventions for patients with heart disease with 11 of 13 effect size statistics >0.80. Taking an average of 10 minutes or less to complete, the respondent-burden for the MacNew is low and its acceptability is demonstrated by response rates of over 90%. Normative data are available for patients with myocardial infarction, angina, and heart failure in the English version. Conclusion: The MacNew may be a valuable tool for assessing and evaluating health related quality of life in patients with heart disease.

266 citations


Journal ArticleDOI
TL;DR: Physician estimates of intensive care unit survival <10% are associated with subsequent life support limitation and more powerfully predict intensive care unit mortality than illness severity, evolving or resolving organ dysfunction, and use of inotropes or vasopressors.
Abstract: Objective: Predicting outcomes for critically ill patients is an important aspect of discussions with families in the intensive care unit. Our objective was to evaluate clinical intensive care unit survival predictions and their consequences for mechanically ventilated patients. Design: Prospective coh

239 citations


Journal ArticleDOI
TL;DR: This systematic review suggests that specific heart failure-targeted interventions significantly decrease hospital readmissions but do not affect mortality rates.
Abstract: Background Heart failure is the leading cause of hospitalization and readmission in many hospitals worldwide. We performed a meta-analysis to evaluate the effectiveness of multidisciplinary heart failure management programs on hospital admission rates. Methods We identified studies through an electronic search and mortality using 8 distinct methods. Eligible studies met the following criteria: (1) randomized controlled clinical trials of adult inpatients hospitalized for heart failure enrolled either at the time of discharge or within 1 week after discharge; (2) heart failure–specific patient education intervention coupled with a postdischarge follow-up assessment; and (3) unplanned readmission reported. Four reviewers independently assessed each study for eligibility and quality, achieving a weighted κ of 0.73 for eligibility and 0.77 for quality. For each study we calculated the relative risk for readmissions and mortality for patients receiving enhanced education relative to patients receiving usual care. Results A total of 529 citation titles were identified, of which 8 randomized trials proved eligible. The pooled relative risk for hospital readmission rates using a random-effects model was 0.79 (95% confidence interval, 0.68-0.91; P P = .25). There was no apparent effect on mortality (relative risk, 0.98; 95% confidence interval, 0.72-1.34; P = .90; heterogeneity P = .20). Data were insufficient to meaningfully pool intervention effects on quality of life or compliance. Conclusion This systematic review suggests that specific heart failure–targeted interventions significantly decrease hospital readmissions but do not affect mortality rates.
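As a rough plausibility check on the pooled estimates above, a ratio and its 95% confidence interval can be converted back into an approximate standard error and two-sided P value on the log scale. This is only a sketch under the assumption that the interval is symmetric on the log scale (a normal approximation). It is not the authors' random-effects computation, and `p_from_ratio_ci` is a hypothetical helper name.

```python
import math


def p_from_ratio_ci(ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided P value from a ratio estimate and its 95% CI.

    Assumes the CI is symmetric around log(ratio), which is an assumption
    made for illustration and not something stated in the abstract.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(ratio) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Mortality estimate from the abstract: RR 0.98 (95% CI 0.72-1.34), reported P = .90.
print(round(p_from_ratio_ci(0.98, 0.72, 1.34), 2))
# Readmission estimate from the abstract: RR 0.79 (95% CI 0.68-0.91).
print(round(p_from_ratio_ci(0.79, 0.68, 0.91), 4))
```

For the mortality estimate, this back-calculation lands close to the reported P = .90, which suggests the printed interval and P value are mutually consistent.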

237 citations


Journal ArticleDOI
28 Oct 2004-BMJ
TL;DR: Important developments in evidence based medicine over the subsequent decade included the increasing popularity of structured abstracts and secondary journals summarising evidence from clinical research and the application of formal rules of evidence in evaluating the clinical literature.
Abstract: The second decade will be as exciting as the first. Evidence based medicine seeks to empower clinicians so that they can develop independent views regarding medical claims and controversies. Although many helped to lay the foundations of evidence based medicine,1 Archie Cochrane's insistence that clinical disciplines summarise evidence concerning their practices, Alvan Feinstein's role in defining the principles of quantitative clinical reasoning, and David Sackett's innovation in teaching critical appraisal all proved seminal. The term evidence based medicine,2 and the first comprehensive description of its tenets, appeared little more than a decade ago. In its original formulation, this discipline reduced the emphasis on unsystematic clinical experience and pathophysiological rationale, and promoted the examination of evidence from clinical research. Evidence based medicine therefore required new skills including efficient literature searching and the application of formal rules of evidence in evaluating the clinical literature. Important developments in evidence based medicine over the subsequent decade included the increasing popularity of structured abstracts3 and secondary journals summarising …

233 citations


Journal ArticleDOI
TL;DR: The authors' Internet-based survey to surgeons resulted in a significantly lower response rate than a traditional mailed survey, and researchers should not assume that the widespread availability and potential ease of Internet- based surveys will translate into higher response rates.
Abstract: BACKGROUND: Low response rates among surgeons can threaten the validity of surveys. Internet technologies may reduce the time, effort, and financial resources needed to conduct surveys. OBJECTIVE: We investigated whether using Web-based technology could increase the response rates to an international survey. METHODS: We solicited opinions from the 442 surgeon–members of the Orthopaedic Trauma Association regarding the treatment of femoral neck fractures. We developed a self-administered questionnaire after conducting a literature review, focus groups, and key informant interviews, for which we used sampling to redundancy techniques. We administered an Internet version of the questionnaire on a Web site, as well as a paper version, which looked similar to the Internet version and which had identical content. Only those in our sample could access the Web site. We alternately assigned the participants to receive the survey by mail (n=221) or an email invitation to participate on the Internet (n=221). Non-respondents in the mail arm received up to three additional copies of the survey, while non-respondents in the Internet arm received up to three additional requests, including a final mailed copy. All participants in the Internet arm had an opportunity to request an emailed Portable Document Format (PDF) version. RESULTS: The Internet arm demonstrated a lower response rate (99/221, 45%) than the mail questionnaire arm (128/221, 58%) (absolute difference 13%, 95% confidence interval 4%-22%, P<0.01). CONCLUSIONS: Our Internet-based survey to surgeons resulted in a significantly lower response rate than a traditional mailed survey. Researchers should not assume that the widespread availability and potential ease of Internet-based surveys will translate into higher response rates. [J Med Internet Res 2004;6(4):e39]
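The absolute difference and confidence interval reported in the RESULTS above can be reproduced, approximately, with a simple Wald interval for a difference of two proportions. The sketch below is illustrative only: the abstract does not say which interval method the authors used, so the Wald formula and the function name `risk_difference_ci` are assumptions.

```python
import math


def risk_difference_ci(events_a, n_a, events_b, n_b, z=1.959964):
    """Wald confidence interval for the difference in proportions (b - a)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_b - p_a
    # Standard error of the difference of two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Counts as printed in the abstract above: Internet arm 99/221, mail arm 128/221.
diff, low, high = risk_difference_ci(99, 221, 128, 221)
print(f"difference {diff:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

Rounded to whole percentages, this simple interval agrees with the figures given in the abstract (13%, 4% to 22%).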

Journal ArticleDOI
01 Sep 2004-Chest
TL;DR: The Seventh ACCP Conference on Antithrombotic and Thrombolytic Therapy: Evidence-Based Guidelines presents the following recommendations: for patients presenting with non-ST-segment elevation (NSTE) acute coronary syndrome (ACS), the authors recommend immediate and then daily oral aspirin (Grade 1A); for patients with an aspirin allergy, they recommend immediate treatment with clopidogrel, 300-mg bolus po, followed by 75 mg/d indefinitely (Grade 2A).

Journal ArticleDOI
TL;DR: Whether concealment of randomization or blinding was used in RCTs that failed to report these bias-reducing strategies is determined to ensure the use of these methodological safeguards in their publications.

Journal ArticleDOI
TL;DR: Telecare is associated with small effects on glycemic control in patients with type 1 diabetes on intensive insulin therapy but with inadequate glycemic control.
Abstract: OBJECTIVE —To determine the efficacy of telecare (modem transmission of glucometer data and clinician feedback) to support intensive insulin therapy in patients with type 1 diabetes and inadequate glycemic control. RESEARCH DESIGN AND METHODS —Thirty-one patients with type 1 diabetes on intensive insulin therapy and with HbA1c >7.8% were randomized to telecare (glucometer transmission with feedback) or control (glucometer transmission without feedback) for 6 months. The primary end point was 6-month HbA1c. To place our findings in context, we pooled HbA1c change from baseline reported in randomized trials of telecare identified in a systematic review of the literature. RESULTS —Compared with the control group, telecare patients had a significantly lower 6-month HbA1c (8.2 vs. 7.8%, P = 0.03, after accounting for HbA1c at baseline) and a nonsignificant fourfold greater chance of achieving 6-month HbA1c ≤7% (29 vs. 7%; risk difference 21.9%, 95% CI −4.7 to 50.5). Nurses spent 50 more min/patient giving feedback on the phone with telecare patients than with control patients. Meta-analysis of seven randomized trials of adult patients with type 1 diabetes found a 0.4% difference (95% CI 0–0.8) in HbA1c mean change from baseline between the telecare and control groups. CONCLUSIONS —Telecare is associated with small effects on glycemic control in patients with type 1 diabetes on intensive insulin therapy but with inadequate glycemic control.

Journal ArticleDOI
TL;DR: In a survey of 2472 patients with chronic disease (hypertension, diabetes, heart failure, myocardial infarction, or depression) completed between 1986 and 1990, in response to the statement “I prefer to leave decisions about my medical care up to my doctor,” 17.1% strongly agreed, 45.5% agreed, 11.1% were uncertain, 22.5% disagreed, and only 4.8% strongly disagreed.
Abstract: In a formerly widespread approach to medical decision making, physicians made a diagnosis, considered the management alternatives, and informed patients what would be done to help them. Decision making rested exclusively in the physicians’ domain. This parental model of patient care challenged clinicians to interpret what was best for their patients. How much benefit must a treatment offer before it was worth subjecting a patient to its short term side effects, long term risks, inconveniences, and costs? With the clinician at the centre of decision making, the question was whether a treatment effect was “clinically relevant.” Even in the discipline of health related quality of life measurement, investigators sought the “minimal clinically important difference.” The parental model has its strengths, and in the past it is very likely that many patients preferred leaving the decisions to their doctors, particularly when the knowledge gap between physicians and the lay public was greater. Consider, for instance, the results of a survey of 2472 patients with chronic disease (hypertension, diabetes, heart failure, myocardial infarction, or depression) completed between 1986 and 1990. In response to the statement: “I prefer to leave decisions about my medical care up to my doctor,” 17.1% strongly agreed, 45.5% agreed, 11.1% were uncertain, 22.5% disagreed, and only 4.8% strongly disagreed.1 Cultural changes since the 1950s suggest that in the more distant past, the percentage preferring to leave decisions up to the doctor was even greater. For example, in the survey that generated these data, older patients were less inclined to prefer an active decision making role than younger respondents. Parental approaches to decision making have limitations related to physicians’ evaluation of how patients value benefits and risks. One issue is physicians’ tendency to assume that physiological outcomes will lead to improvements in mortality …

Journal ArticleDOI
TL;DR: The PCOSQ proved as responsive as the F-G, and more responsive than the objective measures of hair growth, to effects of troglitazone, and provides some support for the discriminative and longitudinal validity, and appreciable support for the responsiveness, of the PCOSQ.

Journal ArticleDOI
01 Sep 2004-Chest
TL;DR: In this article, grades of recommendation for antithrombotic and thrombolytic therapy are given by considering the trade-off between the benefits of a treatment and the risks, burdens, and costs.

Journal ArticleDOI
TL;DR: The authors' Internet-based survey to surgeons resulted in a significantly lower response rate than a traditional mailed survey, and researchers should not assume that the widespread availability and potential ease of Internet- based surveys will translate into higher response rates.
Abstract: BACKGROUND: Low response rates among surgeons can threaten the validity of surveys. Internet technologies may reduce the time, effort, and financial resources needed to conduct surveys. OBJECTIVE: We investigated whether using Web-based technology could increase the response rates to an international survey. METHODS: We solicited opinions from the 442 surgeon–members of the Orthopaedic Trauma Association regarding the treatment of femoral neck fractures. We developed a self-administered questionnaire after conducting a literature review, focus groups, and key informant interviews, for which we used sampling to redundancy techniques. We administered an Internet version of the questionnaire on a Web site, as well as a paper version, which looked similar to the Internet version and which had identical content. Only those in our sample could access the Web site. We alternately assigned the participants to receive the survey by mail (n=221) or an email invitation to participate on the Internet (n=221). Non-respondents in the mail arm received up to three additional copies of the survey, while non-respondents in the Internet arm received up to three additional requests, including a final mailed copy. All participants in the Internet arm had an opportunity to request an emailed Portable Document Format (PDF) version. RESULTS: The Internet arm demonstrated a lower response rate (99/221, 45%) than the mail questionnaire arm (129/221, 58%) (absolute difference 13%, 95% confidence interval 4%-22%, P<0.01). CONCLUSIONS: Our Internet-based survey to surgeons resulted in a significantly lower response rate than a traditional mailed survey. Researchers should not assume that the widespread availability and potential ease of Internet-based surveys will translate into higher response rates. [J Med Internet Res 2004;6(3):e30]

Journal ArticleDOI
TL;DR: Evidence strongly supports a policy of not-for-profit health care delivery at the hospital level after a systematic review and meta-analysis of observational studies that directly compared the payments for care at private for-profit and private not-for-profit hospitals.
Abstract: Background: It has been shown that patients cared for at private for-profit hospitals have higher risk-adjusted mortality rates than those cared for at private not-for-profit hospitals. Uncertainty remains, however, about the economic implications of these forms of health care delivery. Since some policy-makers might still consider for-profit health care if expenditure savings were sufficiently large, we undertook a systematic review and meta-analysis to compare payments for care at private for-profit and private not-for-profit hospitals. Methods: We used 6 search strategies to identify published and unpublished observational studies that directly compared the payments for care at private for-profit and private not-for-profit hospitals. We masked the study results before teams of 2 reviewers independently evaluated the eligibility of all studies. We confirmed data or obtained additional data from all but 1 author. For each study, we calculated the payments for care at private for-profit hospitals relative to private not-for-profit hospitals and pooled the results using a random effects model. Results: Eight observational studies, involving more than 350 000 patients altogether and a median of 324 hospitals each, fulfilled our eligibility criteria. In 5 of 6 studies showing higher payments for care at private for-profit hospitals, the difference was statistically significant; in 1 of 2 studies showing higher payments for care at private not-for-profit hospitals, the difference was statistically significant. The pooled estimate demonstrated that private for-profit hospitals were associated with higher payments for care (relative payments for care 1.19, 95% confidence interval 1.07–1.33, p = 0.001). Interpretation: Private for-profit hospitals result in higher payments for care than private not-for-profit hospitals. Evidence strongly supports a policy of not-for-profit health care delivery at the hospital level.

Journal ArticleDOI
01 Jul 2004-BMJ
TL;DR: Clinicians and patients should beware of possible decreases in the systemic bioavailability of conventional drugs when taken concomitantly with St John's wort.
Abstract: Objective To determine the methodological quality of clinical trials that examined possible interactions of St John's wort with conventional drugs, and to examine the results of these trials. Design Systematic review. Data sources Electronic databases from inception to April 2004, reference lists from published reports, and experts in the field. Study selection Eligible studies were prospective clinical trials evaluating the pharmacokinetic effect of St John's wort on the metabolism of conventional drugs. Data extraction Two reviewers selected studies for inclusion and independently extracted data. Data synthesis 22 pharmacokinetic trials studied an average of 12 (SD 5) participants; 17 trials studied healthy volunteers and five studied patients. Most (17) studies used a “before and after” design; four studies used control groups other than the active group. Three studies randomised the sequence of administration or the participants to study arms or periods; three studies blinded participants or investigators. In 15 trials, investigators independently assayed the herb. Of 19 trials with available plasma data, three found no important interaction (change in area under the curve Conclusion Clinicians and patients should beware of possible decreases in the systemic bioavailability of conventional drugs when taken concomitantly with St John's wort.

Journal ArticleDOI
04 Nov 2004-BMJ
TL;DR: Would you be able to identify misleading claims in a report of a well conducted study?
Abstract: Plenty of advice is available to help readers identify studies with weak methods, but would you be able to identify misleading claims in a report of a well conducted study?

Journal ArticleDOI
19 Nov 2004-AIDS
TL;DR: An effective preventative vaccine may represent the best hope for reversing the growing HIV pandemic.
Abstract: Human immunodeficiency virus (HIV) exacts a heavy toll in terms of morbidity and mortality. Education programs stressing preventative methods have managed to slow the spread of HIV infection in some regions [1]. However, an estimated 15 000 new individuals are infected each day. Highly active antiretroviral therapy (HAART), despite its proven benefits, remains too costly and complex for the majority of HIV-infected people, 90% of whom live in developing countries [2]. Under these circumstances, an effective preventative vaccine may represent the best hope for reversing the growing HIV pandemic.

Journal ArticleDOI
TL;DR: This editorial will address the problems created by traditional statistical concepts that call for 2-sided tests of statistical significance and require rejection of the null hypothesis and turn their backs on the ‘‘1-sided’’ clinical questions of superiority and non-inferiority.
Abstract: W hen busy clinicians bump into a new treatment, they ask themselves 2 questions. Firstly, is it better than (‘‘superior to’’) what they are using now? Secondly, if it’s not superior, is it as good as what they are using now (‘‘noninferior’’) and preferable for some other reason (eg, fewer side effects or more affordable)? Moreover, they want answers to these questions right away. Evidence-Based Medicine and its related evidence-based journals do their best to answer these questions in their ‘‘more informative titles.’’ That’s why this issue contains titles such as: ‘‘Angioplasty at an invasive treatment centre reduced mortality compared with first contact thrombolysis’’ (http://ebm.bmjjournals.com/cgi/content/9/2/ 42) and ‘‘Ximelagatran was non-inferior to warfarin in preventing stroke and systemic embolism in atrial fibrillation.’’ (http://ebm.bmjjournals.com/cgi/content/9/2/43) The latter of these 2 studies prompted this editorial. Progress toward this ‘‘more informative’’ goal has been slow because we have been prisoners of traditional statistical concepts that call for 2-sided tests of statistical significance and require rejection of the null hypothesis. We have further imprisoned ourselves by misinterpreting ‘‘statistically nonsignificant’’ results of these 2-tailed tests. Rather than recognising such results as ‘‘indeterminate’’ (uncertain), we conclude that they are ‘‘negative’’ (certain, providing proof of no difference between treatments). This editorial will address the problems created by these ways of thinking and, more importantly, their clinically relevant solutions. At the root of our problem is the ‘‘null hypothesis,’’ which decrees that the difference between a new and standard treatment ought to be zero. Two-sided p values tell us the probability that the results are compatible with that null hypothesis. 
When that probability is small (say, <5%), we “reject” the null hypothesis and “accept” the “alternative hypothesis” that the difference we’ve observed is not zero. In doing so, however, we make no distinction between the new treatment being better, on the one hand, or worse, on the other, than the standard treatment. There are 3 consequences of this faulty reasoning. Firstly, by performing “2-sided” tests of statistical significance, investigators turn their backs on the “1-sided” clinical questions of superiority and non-inferiority. Secondly, they often fail to recognise that the results of these 2-sided tests, especially in small trials, can be “statistically nonsignificant” even when their confidence intervals include clinically important benefit or harm. Thirdly, investigators (abetted by editors) frequently misinterpret this failure to reject the null hypothesis (based on 2-sided p values >5%, or 95% confidence intervals that include zero). Rather than recognising their results as uncertain (“indeterminate”), they report them as “negative” and conclude that there is “no difference” between the treatments. By doing so, authors and editors and readers regularly fall into the trap of concluding that the “absence of proof of a difference” between 2 treatments constitutes “proof of an absence of a difference” between them. This mistake was forcefully pointed out by Phil Alderson and Iain Chalmers: “It is never correct to claim that treatments have no effect or that there is no difference in the effects of treatments. It is impossible to prove ... that two treatments have the same effect.
There will always be some uncertainty surrounding estimates of treatment effects, and a small difference can never be excluded.” A solution to both this incompatibility (between 1-sided clinical reasoning and 2-sided statistical testing) and confusion (about the clinical interpretation of statistically nonsignificant results) has been around for decades, but is just now gaining widespread recognition and application. I assign most of the credit to a pair of biostatisticians, Charles Dunnett and Michael Gent, and others have also contributed to its development (although the latter sometimes refer to “non-inferiority” as “equivalence,” a term whose common usage fails to distinguish 1-sided from 2-sided thinking). I’ll illustrate the contribution of Charles Dunnett and Michael Gent with a pair of trials in which their thinking helped clinical colleagues escape from the prison of 2-sided null hypothesis testing and, by doing so, prevented the misinterpretation of statistically nonsignificant results. Thirty years ago, a group of us performed a randomised controlled trial (RCT) of nurse practitioners as providers of primary care. We wanted to know if patients fared as well under their care as under the care of general practitioners. Guided by Mike Gent, we came to realise that a 2-sided analysis that produced an “indeterminate,” statistically nonsignificant difference in patient outcomes could confuse rather than clarify matters. We therefore abandoned our initial 2-sided null hypothesis and decided that we’d ask a non-inferiority question: Were the outcomes of patients cared for by nurse practitioners non-inferior to those of patients cared for by general practitioners? Mike then helped us recognise the need to specify our limit of acceptable “inferiority” in terms of these outcomes.
With his prodding, we decided that we would tolerate no worse than 5% lower physical, social, or emotional function at the end of the trial among patients randomised to our nurse practitioners as we observed among patients randomised to our general practitioners. As it happened, our 1-sided analysis revealed that the probability that our nurse practitioners’ patients were worse off (by >5%) than our general practitioners’ patients was as small as 0.008. We had established that nurse practitioners were not inferior to general practitioners as providers of primary care. Twenty years ago, a group of us performed an RCT of superficial temporal artery–middle cerebral artery anastomosis (“EC-IC bypass”) for patients with threatened stroke. To the disappointment of many, we failed to show a statistically significant superiority of surgery for preventing subsequent fatal and non-fatal stroke. It became important to overcome the ambiguity of this “indeterminate” result. We therefore asked the 1-sided question: What degree of surgical benefit could we rule out? That 1-sided analysis, which calculated 38
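
The contrast this abstract draws between a 2-sided test of "no difference" and a 1-sided non-inferiority test against a prespecified margin can be sketched in a few lines. This is a minimal illustration only: it assumes a normal approximation for the estimated difference, the function names are hypothetical, and the numbers are invented rather than taken from either trial described above.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def two_sided_p_value(diff: float, se: float) -> float:
    """Two-sided p-value for H0: true difference == 0."""
    z = abs(diff / se)
    return 2.0 * (1.0 - normal_cdf(z))


def noninferiority_p_value(diff: float, se: float, margin: float) -> float:
    """One-sided p-value for H0: true difference <= -margin.

    diff is (new minus standard) on a scale where higher is better;
    margin is the largest shortfall we are willing to tolerate.
    """
    z = (diff + margin) / se
    return 1.0 - normal_cdf(z)


# Hypothetical inputs (assumptions for illustration, not trial data):
# the new treatment scores 0.01 lower, with standard error 0.02,
# and the tolerated shortfall (margin) is 0.05.
diff, se, margin = -0.01, 0.02, 0.05
print(f"two-sided p = {two_sided_p_value(diff, se):.3f}")
print(f"one-sided non-inferiority p = {noninferiority_p_value(diff, se, margin):.3f}")
```

With these made-up inputs the 2-sided test is far from significant (an "indeterminate" result), while the 1-sided test against the margin gives a p-value of roughly 0.02, which is the kind of distinction the editorial argues for.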

Journal ArticleDOI
TL;DR: Of the approximately 26 million North Americans who undergo noncardiac surgical procedures every year, 1%–5% suffer a major perioperative cardiovascular event, and these events prolong hospital stays by a mean of 11 days and cost the US economy more than $1.2 trillion a year.
Abstract: Of the approximately 26 million North Americans who undergo noncardiac surgical procedures every year,[1],[2] 1%–5% suffer a major perioperative cardiovascular event.[1],[3] Perioperative ischemic events prolong hospital stays by a mean of 11 days[4] and cost the US economy

Journal ArticleDOI
TL;DR: Following an earlier article on estimating a treatment's effectiveness (relative risk reduction, absolute risk reduction and number needed to treat), this article asks how precise those estimates of treatment effect are.
Abstract: In the first article in this series,[1] we presented an approach to understanding how to estimate a treatment's effectiveness that covered relative risk reduction, absolute risk reduction and number needed to treat. But how precise are these estimates of treatment effect?
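
The three measures named in this abstract, and the question of their precision, can be illustrated with a short calculation. The sketch below is an assumption-laden example, not the article's own method: the function name and the event counts are hypothetical, and the 95% interval uses the simple Wald (normal approximation) formula for a difference of two proportions.

```python
from math import sqrt


def treatment_effect(events_control: int, n_control: int,
                     events_treated: int, n_treated: int,
                     z: float = 1.96) -> dict:
    """Absolute and relative risk reduction, NNT, and a Wald CI for the ARR."""
    cer = events_control / n_control    # control event rate
    eer = events_treated / n_treated    # experimental (treated) event rate
    arr = cer - eer                     # absolute risk reduction
    rrr = arr / cer                     # relative risk reduction
    nnt = 1.0 / arr if arr != 0 else float("inf")  # number needed to treat
    # Standard error of a difference between two independent proportions.
    se = sqrt(cer * (1 - cer) / n_control + eer * (1 - eer) / n_treated)
    return {
        "arr": arr,
        "rrr": rrr,
        "nnt": nnt,
        "arr_ci": (arr - z * se, arr + z * se),
    }


# Hypothetical trial: 20/100 events on control, 15/100 on treatment.
result = treatment_effect(20, 100, 15, 100)
print(result)
```

In this invented example the point estimates look favourable (an absolute risk reduction of about 0.05), yet the interval around the absolute risk reduction spans zero, so a trial of this size cannot distinguish benefit from harm.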

Journal ArticleDOI
TL;DR: As is the case in many other areas, social factors may be important determinants of outcome in patients with traumatic fractures, and optimal orthopedic care may involve attention to modifiable risk factors, including smoking and alcohol consumption.
Abstract: Background: Although Weber type B ankle fractures are often considered benign with a good prognosis, evidence from observational studies suggests that 17% to 24% of such patients may have less satisfactory outcomes. Although the explanation for variability in outcomes remains unclear, previous studie

Journal ArticleDOI
01 Sep 2004-Chest
TL;DR: There are few implementation strategies that are of unequivocal, consistent benefit, and that are clearly and consistently worth resource investment, and fully informed decisions will require additional research to identify effective guideline implementation strategies.

Journal ArticleDOI
TL;DR: The literature indicates that the drug has no effect on long-term mortality, but reduces the incidence of hospitalization, and has a positive effect on the clinical status of symptomatic patients.

Journal ArticleDOI
TL;DR: Wide variation in the management of acute respiratory distress syndrome appears related to limited awareness of relevant research, conflicting interpretations of research findings, and adherence to varying local practice patterns.
Abstract: Objective: To determine physicians’ opinions and practices related to the management of patients with acute respiratory distress syndrome. Design: Cross-sectional mail survey. Setting: Province of Ontario, Canada. Participants: Physicians treating patients with acute respiratory distress syndrome at university

Journal ArticleDOI
TL;DR: The data supported the Qualiveen's validity as a discriminative instrument for use with patients with MS and further studies should explore the questionnaire's longitudinal validity and responsiveness.