
Showing papers by "Peter C Gøtzsche" published in 2009


Journal ArticleDOI
TL;DR: This Explanation and Elaboration of the PRISMA Statement presents updated guidelines for the reporting of systematic reviews and meta-analyses.
Abstract: Systematic reviews and meta-analyses are essential to summarize evidence relating to efficacy and safety of health care interventions accurately and reliably. The clarity and transparency of these reports, however, are not optimal. Poor reporting of systematic reviews diminishes their value to clinicians, policy makers, and other users. Since the development of the QUOROM (QUality Of Reporting Of Meta-analysis) Statement, a reporting guideline published in 1999, there have been several conceptual, methodological, and practical advances regarding the conduct and reporting of systematic reviews and meta-analyses. Also, reviews of published systematic reviews have found that key information about these studies is often poorly reported. Realizing these issues, an international group that included experienced authors and methodologists developed PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) as an evolution of the original QUOROM guideline for systematic reviews and meta-analyses of evaluations of health care interventions. The PRISMA Statement consists of a 27-item checklist and a four-phase flow diagram. The checklist includes items deemed essential for transparent reporting of a systematic review. In this Explanation and Elaboration document, we explain the meaning and rationale for each checklist item. For each item, we include an example of good reporting and, where possible, references to relevant empirical studies and methodological literature. The PRISMA Statement, this document, and the associated Web site (http://www.prisma-statement.org/) should be helpful resources to improve reporting of systematic reviews and meta-analyses.

25,711 citations


Journal ArticleDOI
21 Jul 2009-BMJ
TL;DR: The meaning and rationale for each PRISMA checklist item is explained, with an example of good reporting and, where possible, references to relevant empirical studies and methodological literature.
Abstract: Systematic reviews and meta-analyses are essential to summarise evidence relating to efficacy and safety of healthcare interventions accurately and reliably. The clarity and transparency of these reports, however, are not optimal. Poor reporting of systematic reviews diminishes their value to clinicians, policy makers, and other users. Since the development of the QUOROM (quality of reporting of meta-analysis) statement—a reporting guideline published in 1999—there have been several conceptual, methodological, and practical advances regarding the conduct and reporting of systematic reviews and meta-analyses. Also, reviews of published systematic reviews have found that key information about these studies is often poorly reported. Realising these issues, an international group that included experienced authors and methodologists developed PRISMA (preferred reporting items for systematic reviews and meta-analyses) as an evolution of the original QUOROM guideline for systematic reviews and meta-analyses of evaluations of health care interventions. The PRISMA statement consists of a 27-item checklist and a four-phase flow diagram. The checklist includes items deemed essential for transparent reporting of a systematic review. In this explanation and elaboration document, we explain the meaning and rationale for each checklist item. For each item, we include an example of good reporting and, where possible, references to relevant empirical studies and methodological literature. The PRISMA statement, this document, and the associated website (www.prisma-statement.org/) should be helpful resources to improve reporting of systematic reviews and meta-analyses.

13,813 citations


Journal ArticleDOI
TL;DR: This Explanation and Elaboration document explains the meaning and rationale for each checklist item and includes an example of good reporting and, where possible, references to relevant empirical studies and methodological literature.

8,021 citations



Journal ArticleDOI
TL;DR: The updating of the QUOROM Statement is described, to ensure clear presentation of what was planned, done, and found in a systematic review, and the name of the reporting guidance was changed to PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses).

3,513 citations


Journal ArticleDOI
09 Jul 2009-BMJ
TL;DR: One in three breast cancers detected in a population offered organised screening is overdiagnosed; data from three countries showed a drop in incidence once women exceeded the age limit for screening, but the reduction was small.
Abstract: Objective To estimate the extent of overdiagnosis (the detection of cancers that will not cause death or symptoms) in publicly organised screening programmes. Design Systematic review of published trends in incidence of breast cancer before and after the introduction of mammography screening. Data sources PubMed (April 2007), reference lists, and authors. Review methods One author extracted data on incidence of breast cancer (including carcinoma in situ), population size, screening uptake, time periods, and age groups, which were checked independently by the other author. Linear regression was used to estimate trends in incidence before and after the introduction of screening and in older, previously screened women. Meta-analysis was used to estimate the extent of overdiagnosis. Results Incidence data covering at least seven years before screening and seven years after screening had been fully implemented, and including both screened and non-screened age groups, were available from the United Kingdom; Manitoba, Canada; New South Wales, Australia; Sweden; and parts of Norway. The implementation phase with its prevalence peak was excluded and adjustment made for changing background incidence and compensatory drops in incidence among older, previously screened women. Overdiagnosis was estimated at 52% (95% confidence interval 46% to 58%). Data from three countries showed a drop in incidence as the women exceeded the age limit for screening, but the reduction was small and the estimate of overdiagnosis was compensated for in this review. Conclusions The increase in incidence of breast cancer was closely related to the introduction of screening and little of this increase was compensated for by a drop in incidence of breast cancer in previously screened women. One in three breast cancers detected in a population offered organised screening is overdiagnosed.

474 citations
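
The step from "overdiagnosis of 52%" to "one in three" in the abstract above can be checked with a short calculation. This is a minimal sketch, assuming the 52% is the excess of detected cancers relative to the incidence expected without screening; the function name `overdiagnosed_share` is hypothetical and not taken from the paper.

```python
def overdiagnosed_share(excess_over_expected: float) -> float:
    """Convert an excess relative to expected incidence into a share of all detected cancers.

    Assumption: for every 1.0 expected cancers, `excess_over_expected` additional
    cancers are detected, so the total detected is 1.0 + excess_over_expected.
    """
    if excess_over_expected < 0:
        raise ValueError("excess must be non-negative")
    return excess_over_expected / (1.0 + excess_over_expected)


# Point estimate and 95% confidence limits reported in the abstract.
for excess in (0.46, 0.52, 0.58):
    print(f"excess {excess:.0%} -> share of detected cancers {overdiagnosed_share(excess):.1%}")
```

Under that assumption the point estimate works out to roughly 34% of detected cancers, which is consistent with the "one in three" wording.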


Journal ArticleDOI
TL;DR: Spontaneous improvement and effect of placebo contributed importantly to the observed treatment effect in actively treated patients, but the relative importance of these factors differed according to clinical condition and intervention.
Abstract: It can be challenging for patients and clinicians to properly interpret a change in the clinical condition after a treatment has been given. It is not known to which extent spontaneous improvement, effect of placebo and effect of active interventions contribute to the observed change from baseline, and we aimed at quantifying these contributions. Systematic review and meta-analysis, based on a Cochrane review of the effect of placebo interventions for all clinical conditions. We selected all trials that had randomised the patients to three arms: no treatment, placebo and active intervention, and that had used an outcome that was measured on a continuous scale or on a ranking scale. Clinical conditions that had been studied in less than three trials were excluded. We analysed 37 trials (2900 patients) that covered 8 clinical conditions. The active interventions were psychological in 17 trials, physical in 15 trials, and pharmacological in 5 trials. Overall, across all conditions and interventions, there was a statistically significant change from baseline in all three arms. The standardized mean difference (SMD) for change from baseline was -0.24 (95% confidence interval -0.36 to -0.12) for no treatment, -0.44 (-0.61 to -0.28) for placebo, and -1.01 (-1.16 to -0.86) for active treatment. Thus, on average, the relative contributions of spontaneous improvement and of placebo to that of the active interventions were 24% and 20%, respectively, but with some uncertainty, as indicated by the confidence intervals for the three SMDs. The conditions that had the most pronounced spontaneous improvement were nausea (45%), smoking (40%), depression (35%), phobia (34%) and acute pain (25%). Spontaneous improvement and effect of placebo contributed importantly to the observed treatment effect in actively treated patients, but the relative importance of these factors differed according to clinical condition and intervention.

433 citations
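
The 24% and 20% figures in the abstract above follow from the three reported SMDs. The sketch below reproduces them under the assumption that the placebo contribution is the placebo change minus the no-treatment change, each divided by the change in the active arm; the function name and the "remainder" value are illustrative and are not stated in the abstract.

```python
def relative_contributions(no_treatment: float, placebo: float, active: float):
    """Split the change in the active arm into three shares (assumed additive model)."""
    spontaneous = no_treatment / active                # change seen with no treatment
    placebo_effect = (placebo - no_treatment) / active  # extra change seen with placebo
    remainder = (active - placebo) / active             # what is left over (not in the abstract)
    return spontaneous, placebo_effect, remainder


# SMDs for change from baseline, as reported in the abstract.
spontaneous, placebo_effect, remainder = relative_contributions(-0.24, -0.44, -1.01)
print(f"spontaneous improvement: {spontaneous:.0%}")  # 24%
print(f"placebo effect:          {placebo_effect:.0%}")  # 20%
print(f"remainder:               {remainder:.0%}")  # 56%
```

The calculation uses point estimates only; as the abstract notes, the confidence intervals around the three SMDs make these shares uncertain.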


Journal ArticleDOI
28 Jan 2009-BMJ
TL;DR: A small analgesic effect of acupuncture was found, which seems to lack clinical relevance and cannot be clearly distinguished from bias. Whether needling at acupuncture points, or at any site, reduces pain independently of the psychological impact of the treatment ritual is unclear.
Abstract: Objectives To study the analgesic effect of acupuncture and placebo acupuncture and to explore whether the type of the placebo acupuncture is associated with the estimated effect of acupuncture. Design Systematic review and meta-analysis of three armed randomised clinical trials. Data sources Cochrane Library, Medline, Embase, Biological Abstracts, and PsycLIT. Data extraction and analysis Standardised mean differences from each trial were used to estimate the effect of acupuncture and placebo acupuncture. The different types of placebo acupuncture were ranked from 1 to 5 according to assessment of the possibility of a physiological effect, and this ranking was meta-regressed with the effect of acupuncture. Data synthesis Thirteen trials (3025 patients) involving a variety of pain conditions were eligible. The allocation of patients was adequately concealed in eight trials. The clinicians managing the acupuncture and placebo acupuncture treatments were not blinded in any of the trials. One clearly outlying trial (70 patients) was excluded. A small difference was found between acupuncture and placebo acupuncture: standardised mean difference −0.17 (95% confidence interval −0.26 to −0.08), corresponding to 4 mm (2 mm to 6 mm) on a 100 mm visual analogue scale. No statistically significant heterogeneity was present (P=0.10, I²=36%). A moderate difference was found between placebo acupuncture and no acupuncture: standardised mean difference −0.42 (−0.60 to −0.23). However, considerable heterogeneity (P<0.001, I²=66%) was also found, as large trials reported both small and large effects of placebo. No association was detected between the type of placebo acupuncture and the effect of acupuncture (P=0.60). Conclusions A small analgesic effect of acupuncture was found, which seems to lack clinical relevance and cannot be clearly distinguished from bias. Whether needling at acupuncture points, or at any site, reduces pain independently of the psychological impact of the treatment ritual is unclear.

351 citations
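
The abstract above translates an SMD of −0.17 into 4 mm on a 100 mm visual analogue scale. The standard deviation used for that conversion is not given, so the sketch below back-calculates an implied value from the rounded figures and checks that it also reproduces the confidence limits. The implied SD is an inference for illustration, not a number from the paper, and both function names are hypothetical.

```python
def implied_sd(mm_difference: float, smd: float) -> float:
    """Standard deviation (in mm) implied by a reported SMD and its mm equivalent."""
    return abs(mm_difference / smd)


def smd_to_mm(smd: float, sd_mm: float) -> float:
    """Re-express an SMD on the original scale: difference = SMD x SD."""
    return smd * sd_mm


sd = implied_sd(4.0, -0.17)            # inferred from rounded values in the abstract
print(f"implied SD: {sd:.1f} mm")       # 23.5 mm
print(f"lower limit: {smd_to_mm(-0.26, sd):.1f} mm")  # -6.1 mm
print(f"upper limit: {smd_to_mm(-0.08, sd):.1f} mm")  # -1.9 mm
```

The back-calculated limits of about 2 mm and 6 mm agree with the interval quoted in the abstract, which suggests a single SD of roughly 23 to 24 mm was applied throughout.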


Journal Article
TL;DR: The STROBE Statement provides guidance to authors about how to improve the reporting of observational studies and facilitates critical appraisal and interpretation of studies by reviewers, journal editors and readers.

187 citations


Journal ArticleDOI
28 Jan 2009-BMJ
TL;DR: Peter Gøtzsche and colleagues argue that women are still not given enough, nor correct, information about the harms of screening for breast cancer.
Abstract: Peter Gøtzsche and colleagues argue that women are still not given enough, nor correct, information about the harms of screening.

152 citations


Journal ArticleDOI
TL;DR: The STROBE Statement provides guidance to authors about how to improve the reporting of observational studies and facilitates critical appraisal and interpretation of studies by reviewers, journal editors and readers.

Journal ArticleDOI
TL;DR: This debate examines how best to tackle ghostwriting in the medical literature from the perspectives of a researcher, an editor, and the professional medical writer.
Abstract: Background to the debate: Ghostwriting occurs when someone makes substantial contributions to a manuscript without attribution or disclosure. It is considered bad publication practice in the medical sciences, and some argue it is scientific misconduct. At its extreme, medical ghostwriting involves pharmaceutical companies hiring professional writers to produce papers promoting their products but hiding those contributions and instead naming academic physicians or scientists as the authors. To improve transparency, many editors' associations and journals allow professional medical writers to contribute to the writing of papers without being listed as authors provided their role is acknowledged. This debate examines how best to tackle ghostwriting in the medical literature from the perspectives of a researcher, an editor, and the professional medical writer.

Journal ArticleDOI
TL;DR: One in four breast cancers diagnosed in the screened age group in the Danish screening programme is overdiagnosed, which is lower than that for comparable countries, likely because of lower uptake, lower recall rates and lower detection rates of carcinoma in situ.
Abstract: Overdiagnosis in cancer screening is the detection of cancer lesions that would otherwise not have been detected. It is arguably the most important harm. We quantified overdiagnosis in the Danish mammography screening programme, which is uniquely suited for this purpose, as only 20% of the Danish population has been offered organised mammography screening over a long time-period. We collected incidence rates of carcinoma in situ and invasive breast cancer in areas with and without screening over 13 years with screening (1991-2003), and 20 years before its introduction (1971-1990). We explored the incidence increase comparing unadjusted incidence rates and used Poisson regression analysis to compensate for the background incidence trend, variation in age distribution and geographical variation in incidence. For the screened age group, 50 to 69 years, we found an overdiagnosis of 35% when we compared unadjusted incidence rates for the screened and non-screened areas, but after compensating for a small decline in incidence in older, previously screened women. Our adjusted Poisson regression analysis indicated a relative risk of 1.40 (95% CI: 1.35-1.45) for the whole screening period, and a potential compensatory drop in older women of 0.90 (95% CI: 0.88-0.96), yielding an overdiagnosis of 33%, which we consider the most reliable estimate. The drop in previously screened women was only present in one of the two screened regions and was small in absolute numbers. One in four breast cancers diagnosed in the screened age group in the Danish screening programme is overdiagnosed. Our estimate for Denmark is lower than that for comparable countries, likely because of lower uptake, lower recall rates and lower detection rates of carcinoma in situ.

Journal ArticleDOI
13 Aug 2009-BMJ
TL;DR: In 14 out of the 100 SMDs calculated at the meta-analysis level, individual observers reached different conclusions than the originally published review, and meta-analyses using SMDs are prone to observer variation and should be interpreted with caution.
Abstract: Objective To study the inter-observer variation related to extraction of continuous and numerical rating scale data from trial reports for use in meta-analyses. Design Observer agreement study. Data sources A random sample of 10 Cochrane reviews that presented a result as a standardised mean difference (SMD), the protocols for the reviews and the trial reports (n=45) were retrieved. Data extraction Five experienced methodologists and five PhD students independently extracted data from the trial reports for calculation of the first SMD result in each review. The observers did not have access to the reviews but to the protocols, where the relevant outcome was highlighted. The agreement was analysed at both trial and meta-analysis level, pairing the observers in all possible ways (45 pairs, yielding 2025 pairs of trials and 450 pairs of meta-analyses). Agreement was defined as SMDs that differed less than 0.1 in their point estimates or confidence intervals. Results The agreement was 53% at trial level and 31% at meta-analysis level. Including all pairs, the median disagreement was SMD=0.22 (interquartile range 0.07-0.61). The experts agreed somewhat more than the PhD students at trial level (61% v 46%), but not at meta-analysis level. Important reasons for disagreement were differences in selection of time points, scales, control groups, and type of calculations; whether to include a trial in the meta-analysis; and data extraction errors made by the observers. In 14 out of the 100 SMDs calculated at the meta-analysis level, individual observers reached different conclusions than the originally published review. Conclusions Disagreements were common and often larger than the effect of commonly used treatments. Meta-analyses using SMDs are prone to observer variation and should be interpreted with caution. The reliability of meta-analyses might be improved by having more detailed review protocols, more than one observer, and statistical expertise.
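
The comparison counts quoted in the abstract above (45 observer pairs, 2025 pairs of trials, 450 pairs of meta-analyses, 100 SMDs) follow directly from the study design. A minimal sketch, with placeholder observer labels that are not from the paper:

```python
from itertools import combinations
from math import comb

N_OBSERVERS = 10   # five methodologists plus five PhD students
N_TRIALS = 45      # trial reports retrieved
N_REVIEWS = 10     # Cochrane reviews sampled, one SMD result each

# Enumerate every unordered pair of observers (labels are placeholders).
observers = [f"observer_{i}" for i in range(1, N_OBSERVERS + 1)]
observer_pairs = list(combinations(observers, 2))
assert len(observer_pairs) == comb(N_OBSERVERS, 2)

print(len(observer_pairs))               # 45 observer pairs
print(len(observer_pairs) * N_TRIALS)    # 2025 trial-level comparisons
print(len(observer_pairs) * N_REVIEWS)   # 450 meta-analysis-level comparisons
print(N_OBSERVERS * N_REVIEWS)           # 100 SMDs calculated at meta-analysis level
```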

Journal ArticleDOI
TL;DR: The reporting on blinding in both trial protocols and publications is often inadequate and it is suggested that international guidelines for the reporting of trial protocols and public access to protocols should be developed.

Journal ArticleDOI
TL;DR: A study found that women participating in mammography screening were content with the programme and the paternalistic invitations, but it is argued that this merely reflects that the information presented to the invited women is seriously biased in favour of participation.
Abstract: A study found that women participating in mammography screening were content with the programme and the paternalistic invitations that directly encourage participation and include a pre-specified time of appointment. We argue that this merely reflects that the information presented to the invited women is seriously biased in favour of participation. Women are not informed about the major harms of screening, and the decision to attend has already been made for them by a public authority. This short-circuits informed decision-making and the legislation on informed consent, and violates the autonomy of the women. Screening invitations must present both benefits and harms in a balanced fashion, and should offer, not encourage, participation. It should be stated clearly that the choice not to participate is as sensible as the choice to do so. To allow this to happen, the responsibility for the screening programmes must be separated from the responsibility for the information material.


Journal ArticleDOI
TL;DR: A systematic review of incidence trends in countries with organised mammography screening that presented data on breast cancer incidence for both screened and nonscreened age groups for at least 7 years before screening and 7 years after screening had been fully implemented, found that compensatory drops in the older age groups were small or absent, although major drops would have been expected if the lead time of 2.4 years were true.
Abstract: Sir, It is reported that 7 years after the start of screening in Sweden, the incidence of invasive breast cancer in the screened age group was 69% higher than expected (Duffy et al, 2008). After adjustment for a lead time of 2.4 years and for the increased use of hormone replacement therapy, they found 39% excess. We have reservations about adjustments for lead time (Zahl et al, 2008). Correction for lead time should only be done for those cancers that would have been diagnosed at a later time in the absence of screening. But many screen-detected cancers would not have come to the women’s attention in their remaining lifetime, if they had not attended screening and are by definition overdiagnosed. If these cancers are not excluded from the calculation of lead time, the lead-time distribution will be artificially right-skewed, and the average lead time will appear to be much longer than it really is. Duffy et al (2008) did not exclude such cancers from the calculation of lead time, but counted all excess cancers detected in a randomised trial as advanced diagnoses, when some of them were in fact overdiagnosed cases. They have therefore overestimated the lead-time effect. Duffy et al (2008) recognise that the detection of non-overdiagnosed cancers should give rise to a drop in incidence when the women leave the screening programme. Quantifying such a compensatory drop is a less bias-prone method for adjusting the incidence increase for lead-time effects, as it makes no assumption about the average lead time. It is also very simple to use (see, e.g., Zahl et al, 2004). We have done a systematic review of incidence trends in countries with organised mammography screening that presented data on breast cancer incidence for both screened and nonscreened age groups for at least 7 years before screening and 7 years after screening had been fully implemented (Jorgensen and Gotzsche, 2008). We were able to include data from the United Kingdom, Canada, Australia, Sweden and Norway. We found that compensatory drops in the older age groups were small or absent, although major drops would have been expected if the lead time of 2.4 years were true. When we adjusted for these drops, we found 36% overdiagnosis of invasive breast cancer, in good agreement with the results of Duffy et al (2008), and 51% when we included carcinoma in situ. Duffy et al (2008) mention that after prolonged follow-up of the Malmo randomised screening trial, an overdiagnosis of 7–8% was reported and they find this estimate considerably more plausible than their own estimate of 39%. However, they have overlooked that the former estimate is seriously flawed (Zahl et al, 2008). There was substantial opportunistic screening in the control group and after adjustment for this, the overdiagnosis estimate in the Malmo trial is 24% (Gotzsche and Jorgensen, 2006). It is asserted that after adjustments, overdiagnosis estimates will be smaller than many rates quoted in the past (Duffy et al, 2008). We disagree, as most ‘rates quoted in the past’ have been too small (Gotzsche and Nielsen, 2006). On the basis of randomised trials, the overdiagnosis is 30% (Gotzsche and Nielsen, 2006), and it becomes 44% if adjusted for opportunistic screening in the control groups (Jorgensen and Gotzsche, 2008). The low rates in the past have been too low for the very reason that they have been based on flawed lead-time models (Jorgensen and Gotzsche, 2008; Zahl et al, 2008). We believe the most reliable estimate for organised mammography screening is 51% (Jorgensen and Gotzsche, 2008). This means that one in three breast cancers detected in a population offered organised mammography screening is overdiagnosed. Many women are therefore harmed substantially by screening, as practically all detected carcinoma in situ cases and invasive cancers are treated, at great physical and psychological costs.

Journal ArticleDOI
TL;DR: A recent Cochrane review of 10 randomised trials aimed at testing the religious belief that praying to a god can help those who are prayed for presented a scientifically unsound mixture of theological and scientific arguments.
Abstract: We discuss in this commentary a recent Cochrane review of 10 randomised trials aimed at testing the religious belief that praying to a god can help those who are prayed for. The review concluded that the available studies merit additional research. However, the review presented a scientifically unsound mixture of theological and scientific arguments, and two of the included trials that had a large impact on the findings had problems that were not described in the review. The review fails to live up to the high standards required for Cochrane reviews.

Journal ArticleDOI
07 Jan 2009-Trials
TL;DR: This article proposed that raw data from all trials should be posted on a public website to make it much easier to detect errors and flaws in publications, and it would allow many research projects to be performed without collecting new data.
Abstract: Flaws in research papers are common but it may require arduous detective work to unravel them. Checklists are helpful, but many inconsistencies will only be revealed through repeated cross-checks of every little detail, just like in a crime case. As a major deterrent for dishonesty, raw data from all trials should be posted on a public website. This would also make it much easier to detect errors and flaws in publications, and it would allow many research projects to be performed without collecting new data. The prevailing culture of secrecy and ownership to data is not in the best interests of patients.

Journal ArticleDOI
27 May 2009-BMJ
TL;DR: The figure in the news item by Mayor shows that UK breast cancer mortality has fallen steadily since 1989; according to Cancer Research UK, the NHS breast screening programme has contributed, but the programme was introduced in 1988, only a year before the decline began.
Abstract: The figure in the news item by Mayor shows that UK breast cancer mortality has fallen steadily since 1989.1 According to Cancer Research UK, the NHS breast screening programme has contributed, but it was introduced in 1988, only a year before the decline began.1 Screening works through advancing the time of detection, and …

Journal ArticleDOI
25 Aug 2009-BMJ
TL;DR: Wald and colleagues estimate that one death from breast cancer is avoided for every 100 women screened over 20 years, which is about 10 times more optimistic than the estimate reported in a comprehensive systematic review.
Abstract: Wald and colleagues estimate that one death from breast cancer is avoided for every 100 women screened over 20 years,1 which is about 10 times more optimistic than that reported in a comprehensive systematic review.2 They assume a “generally accepted” 24% reduction in mortality from breast cancer in a population offered screening from a consensus report of …

Journal ArticleDOI
TL;DR: The authors' findings in central Europe are consistent with Earle’s work, but they do not agree with his conclusion that much more work is needed, including studies at the level of individual firms, to explain the undoubtedly complex pathways linking mass privatisation and mortality.


01 Jan 2009
TL;DR: All trials of acupuncture for pain that had two control groups consisting of placebo acupuncture and no acupuncture were analyzed, finding a small analgesic effect of acupuncture, which seems to lack clinical relevance and cannot be clearly distinguished from bias.