
Showing papers in "BMC Medical Research Methodology in 2011"


Journal ArticleDOI
TL;DR: Based on the experiences of conducting several health-related case studies, the authors reflect on the different types of case study design, the specific research questions this approach can help answer, the data sources that tend to be used, and the particular advantages and disadvantages of employing this methodological approach.
Abstract: The case study approach allows in-depth, multi-faceted explorations of complex issues in their real-life settings. The value of the case study approach is well recognised in the fields of business, law and policy, but somewhat less so in health services research. Based on our experiences of conducting several health-related case studies, we reflect on the different types of case study design, the specific research questions this approach can help answer, the data sources that tend to be used, and the particular advantages and disadvantages of employing this methodological approach. The paper concludes with key pointers to aid those designing and appraising proposals for conducting case study research, and a checklist to help readers assess the quality of case study reports.

1,489 citations


Journal ArticleDOI
TL;DR: The PPV of NRP coding of the Charlson conditions was consistently high and ranged from 82.0% (95% CI; 68.6%, 91.6%) for diabetes with diabetic complications to 100% (one-sided 97.9%, 100%) for congestive heart failure.
Abstract: Background The Charlson comorbidity index is often used to control for confounding in research based on medical databases. There are few studies of the accuracy of the codes obtained from these databases. We examined the positive predictive value (PPV) of the ICD-10 diagnostic coding in the Danish National Registry of Patients (NRP) for the 19 Charlson conditions.
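
For orientation, the quantity being validated can be written as below; this is the standard definition of positive predictive value in a coding-validation setting, not notation taken from the paper itself.

```latex
% Registry code as the "test", medical-record review as the reference standard:
\[
  \mathrm{PPV}
  = \Pr(\text{condition present} \mid \text{condition coded in the NRP})
  = \frac{\text{coded cases confirmed on review}}{\text{all coded cases sampled}}.
\]
```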

984 citations


Journal ArticleDOI
TL;DR: Conducting a systematic review of reviews highlights the usefulness of bringing together a summary of reviews in one place, where there is more than one review on an important topic.
Abstract: Background: Hundreds of studies of maternity care interventions have been published, too many for most people involved in providing maternity care to identify and consider when making decisions. It became apparent that systematic reviews of individual studies were required to appraise, summarise and bring together existing studies in a single place. However, decision makers are increasingly faced by a plethora of such reviews and these are likely to be of variable quality and scope, with more than one review of important topics. Systematic reviews (or overviews) of reviews are a logical and appropriate next step, allowing the findings of separate reviews to be compared and contrasted, providing clinical decision makers with the evidence they need. Methods: The methods used to identify and appraise published and unpublished reviews systematically, drawing on our experiences and good practice in the conduct and reporting of systematic reviews, are described. The process of identifying and appraising all published reviews allows researchers to describe the quality of this evidence base, summarise and compare the reviews' conclusions and discuss the strength of these conclusions. Results: Methodological challenges and possible solutions are described within the context of (i) sources, (ii) study selection, (iii) quality assessment (i.e. the extent of searching undertaken for the reviews, description of study selection and inclusion criteria, comparability of included studies, assessment of publication bias and assessment of heterogeneity), (iv) presentation of results, and (v) implications for practice and research. Conclusion: Conducting a systematic review of reviews highlights the usefulness of bringing together a summary of reviews in one place, where there is more than one review on an important topic. The methods described here should help clinicians to review and appraise published reviews systematically, and aid evidence-based clinical decision-making.

856 citations


Journal ArticleDOI
TL;DR: The current status of sample size in focus group studies reported in health journals is described, highlighting the often poor and inconsistent reporting seen in these studies, and it is suggested that journals adopt more stringent requirements for focus group method reporting.
Abstract: Focus group studies are increasingly published in health related journals, but we know little about how researchers use this method, particularly how they determine the number of focus groups to conduct. The methodological literature commonly advises researchers to follow principles of data saturation, although practical advice on how to do this is lacking. Our objectives were, firstly, to describe the current status of sample size in focus group studies reported in health journals and, secondly, to assess whether and how researchers explain the number of focus groups they carry out. We searched PubMed for studies that had used focus groups and that had been published in open access journals during 2008, and extracted data on the number of focus groups and on any explanation authors gave for this number. We also did a qualitative assessment of the papers with regard to how the number of groups was explained and discussed. We identified 220 papers published in 117 journals. In these papers insufficient reporting of sample sizes was common. The number of focus groups conducted varied greatly (mean 8.4, median 5, range 1 to 96). Thirty-seven (17%) studies attempted to explain the number of groups. Six studies referred to rules of thumb in the literature, three stated that they were unable to organize more groups for practical reasons, while 28 studies stated that they had reached a point of saturation. Among those stating that they had reached a point of saturation, several appeared not to have followed principles from grounded theory where data collection and analysis is an iterative process until saturation is reached. Studies with high numbers of focus groups did not offer explanations for the number of groups. None of the reviewed papers discussed having too much data as a study weakness. Based on these findings we suggest that journals adopt more stringent requirements for focus group method reporting. The often poor and inconsistent reporting seen in these studies may also reflect the lack of clear, evidence-based guidance about deciding on sample size. More empirical research is needed to develop focus group methodology.

568 citations


Journal ArticleDOI
TL;DR: Compared to the standard Q statistic, the generalised Q statistic provided a more accurate platform for estimating the amount of heterogeneity in the 18 meta-analyses, and should be incorporated as standard into statistical software.
Abstract: Clinical researchers have often preferred to use a fixed effects model for the primary interpretation of a meta-analysis. Heterogeneity is usually assessed via the well-known Q and I² statistics, along with the random effects estimate they imply. In recent years, alternative methods for quantifying heterogeneity have been proposed that are based on a 'generalised' Q statistic. We review 18 IPD meta-analyses of RCTs into treatments for cancer, in order to quantify the amount of heterogeneity present and also to discuss practical methods for explaining heterogeneity. Differing results were obtained when the standard Q and I² statistics were used to test for the presence of heterogeneity. The two meta-analyses with the largest amount of heterogeneity were investigated further, and on inspection the straightforward application of a random effects model was not deemed appropriate. Compared to the standard Q statistic, the generalised Q statistic provided a more accurate platform for estimating the amount of heterogeneity in the 18 meta-analyses. Explaining heterogeneity via the pre-specification of trial subgroups, graphical diagnostic tools and sensitivity analyses produced a more desirable outcome than an automatic application of the random effects model. Generalised Q statistic methods for quantifying and adjusting for heterogeneity should be incorporated as standard into statistical software. Software is provided to help achieve this aim.
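
For context, the standard heterogeneity statistics referred to above are conventionally defined as follows; these are textbook definitions, not formulas reproduced from the paper, and the sketch of the generalised Q is likewise a common formulation rather than the authors' exact specification.

```latex
% Standard Cochran Q and Higgins-Thompson I^2 for k studies with estimates
% \hat{\theta}_i and within-study variances v_i:
\[
  w_i = \frac{1}{v_i}, \qquad
  \hat{\theta} = \frac{\sum_i w_i \hat{\theta}_i}{\sum_i w_i}, \qquad
  Q = \sum_{i=1}^{k} w_i \bigl(\hat{\theta}_i - \hat{\theta}\bigr)^2, \qquad
  I^2 = \max\!\left(0,\ \frac{Q-(k-1)}{Q}\right) \times 100\%.
\]
% A generalised Q instead uses weights 1/(v_i + \tau^2) for a candidate
% between-study variance \tau^2, e.g. chosen so that Q(\tau^2) equals its
% expectation k - 1.
```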

367 citations


Journal ArticleDOI
TL;DR: By employing grounded theory methodology rigorously, medical researchers can better design and justify their methods, and produce high-quality findings that will be more useful to patients, professionals and the research community.
Abstract: Background: Qualitative methodologies are increasingly popular in medical research. Grounded theory is the methodology most-often cited by authors of qualitative studies in medicine, but it has been suggested that many ‘grounded theory’ studies are not concordant with the methodology. In this paper we provide a worked example of a grounded theory project. Our aim is to provide a model for practice, to connect medical researchers with a useful methodology, and to increase the quality of ‘grounded theory’ research published in the medical literature. Methods: We documented a worked example of using grounded theory methodology in practice. Results: We describe our sampling, data collection, data analysis and interpretation. We explain how these steps were consistent with grounded theory methodology, and show how they related to one another. Grounded theory methodology assisted us to develop a detailed model of the process of adapting preventive protocols into dental practice, and to analyse variation in this process in different dental practices. Conclusions: By employing grounded theory methodology rigorously, medical researchers can better design and justify their methods, and produce high-quality findings that will be more useful to patients, professionals and the research community.

340 citations


Journal ArticleDOI
TL;DR: It is clear that the numbers of studies eligible for meta-analyses are typically very small for all medical areas, outcomes and interventions covered by Cochrane reviews, highlighting the particular importance of suitable methods for the meta-analysis of small data sets.
Abstract: Background Cochrane systematic reviews collate and summarise studies of the effects of healthcare interventions. The characteristics of these reviews and the meta-analyses and individual studies they contain provide insights into the nature of healthcare research and important context for the development of relevant statistical and other methods.

303 citations


Journal ArticleDOI
TL;DR: The novel and pragmatic approach to framework synthesis developed and described here was found to be fit for purpose, and future research should seek to test this approach to qualitative data synthesis further.
Abstract: A variety of different approaches to the synthesis of qualitative data are advocated in the literature. The aim of this paper is to describe the application of a pragmatic method of qualitative evidence synthesis and the lessons learned from adopting this "best fit" framework synthesis approach. We evaluated framework synthesis as an approach to the qualitative systematic review of evidence exploring the views of adults on the taking of potential agents within the context of the primary prevention of colorectal cancer. Twenty papers from North America, Australia, the UK and Europe met the criteria for inclusion. Fourteen themes were identified a priori from a related, existing conceptual model identified in the literature, which were then used to code the extracted data. Further analysis resulted in the generation of a more sophisticated model with additional themes. The synthesis required a combination of secondary framework and thematic analysis approaches and was conducted within a health technology assessment timeframe. The novel and pragmatic "best fit" approach to framework synthesis developed and described here was found to be fit for purpose. Future research should seek to test this approach to qualitative data synthesis further.

302 citations


Journal ArticleDOI
TL;DR: The currently accepted simple product-based method for calculating exposure-specific risks (ESR) is a reasonable approach only when the exposure probability is small and the RR is ≤ 3.0; the revised product-based estimator provides much improved accuracy.
Abstract: Background Previous studies have proposed a simple product-based estimator for calculating exposure-specific risks (ESR), but the methodology has not been rigorously evaluated. The goal of our study was to evaluate the existing methodology for calculating the ESR, propose an improved point estimator, and propose variance estimates that will allow the calculation of confidence intervals (CIs).
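
A plausible algebraic reading of the estimators being compared, in my notation rather than the paper's: write R for the overall risk, p for the exposure prevalence and RR for the relative risk. The simple product-based estimate multiplies R by RR, which is accurate only when the correction term below is close to one, consistent with the "small exposure probability, RR ≤ 3" condition in the TL;DR.

```latex
% Overall risk R decomposed over exposed (prevalence p, risk R_1) and
% unexposed (risk R_0) groups, with R_1 = RR * R_0:
\[
  R = p\,R_1 + (1-p)\,R_0
  \quad\Longrightarrow\quad
  R_1 = \frac{RR \cdot R}{1 + p\,(RR-1)}, \qquad
  R_0 = \frac{R}{1 + p\,(RR-1)}.
\]
% The simple product-based estimate R_1 \approx RR \cdot R drops the
% denominator, which is close to 1 only when p is small and RR is modest.
```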

296 citations


Journal ArticleDOI
TL;DR: Few of the self-report adherence measures currently available were designed to be completed by, or in conjunction with, carers, and few were able to distinguish between different types of non-adherence, which limits their ability to be used effectively in the continuous improvement of targeted adherence-enhancing interventions.
Abstract: There is a recognised need to build primary care medication adherence services which are tailored to patients' needs. Continuous quality improvement of such services requires a regular working method of measuring adherence in order to monitor effectiveness. Self report has been considered the method of choice for clinical use; it is cheap, relatively unobtrusive and able to distinguish between intentional and unintentional non-adherence, which have different underlying causes and therefore require different interventions. A self report adherence measure used in routine clinical practice would ideally be brief, acceptable to patients, valid, reliable, have the ability to distinguish between different types of non-adherence and be able to be completed by or in conjunction with carers where necessary.

266 citations


Journal ArticleDOI
TL;DR: It is found that shortening a relatively lengthy questionnaire significantly increased the response, and sending a full reminder pack to non-respondents appears a worthwhile, albeit more costly, strategy.
Abstract: Background Minimising participant non-response in postal surveys helps to maximise the generalisability of the inferences made from the data collected. The aim of this study was to examine the effect of questionnaire length, personalisation and reminder type on postal survey response rate and quality and to compare the cost-effectiveness of the alternative survey strategies.

Journal ArticleDOI
TL;DR: This study aims to produce methodological guidance, publication standards and training resources for those seeking to use the realist and/or meta-narrative approach to systematic review, whose overall place in the secondary research toolkit is not yet fully established.
Abstract: Background There is growing interest in theory-driven, qualitative and mixed-method approaches to systematic review as an alternative to (or to extend and supplement) conventional Cochrane-style reviews. These approaches offer the potential to expand the knowledge base in policy-relevant areas - for example by explaining the success, failure or mixed fortunes of complex interventions. However, the quality of such reviews can be difficult to assess. This study aims to produce methodological guidance, publication standards and training resources for those seeking to use the realist and/or meta-narrative approach to systematic review.

Journal ArticleDOI
TL;DR: GEM Initiative evidence maps have a broad range of potential end-users, including funding agencies, researchers and clinicians, and evidence mapping complements other review methods for describing existing research, informing future research efforts, and addressing evidence gaps.
Abstract: Evidence mapping describes the quantity, design and characteristics of research in broad topic areas, in contrast to systematic reviews, which usually address narrowly-focused research questions. The breadth of evidence mapping helps to identify evidence gaps, and may guide future research efforts. The Global Evidence Mapping (GEM) Initiative was established in 2007 to create evidence maps providing an overview of existing research in Traumatic Brain Injury (TBI) and Spinal Cord Injury (SCI). The GEM evidence mapping method involved three core tasks: 1. Setting the boundaries and context of the map: Definitions for the fields of TBI and SCI were clarified, the prehospital, acute inhospital and rehabilitation phases of care were delineated and relevant stakeholders (patients, carers, clinicians, researchers and policymakers) who could contribute to the mapping were identified. Researchable clinical questions were developed through consultation with key stakeholders and a broad literature search. 2. Searching for and selection of relevant studies: Evidence search and selection involved development of specific search strategies, development of inclusion and exclusion criteria, searching of relevant databases and independent screening and selection by two researchers. 3. Reporting on yield and study characteristics: Data extraction was performed at two levels - 'interventions and study design' and 'detailed study characteristics'. The evidence map and commentary reflected the depth of data extraction. One hundred and twenty-nine researchable clinical questions in TBI and SCI were identified. These questions were then prioritised into high (n = 60) and low (n = 69) importance by the stakeholders involved in question development. Since 2007, 58 263 abstracts have been screened, 3 731 full text articles have been reviewed and 1 644 relevant neurotrauma publications have been mapped, covering fifty-three high priority questions. GEM Initiative evidence maps have a broad range of potential end-users including funding agencies, researchers and clinicians. Evidence mapping is at least as resource-intensive as systematic reviewing. The GEM Initiative has made advancements in evidence mapping, most notably in the area of question development and prioritisation. Evidence mapping complements other review methods for describing existing research, informing future research efforts, and addressing evidence gaps.

Journal ArticleDOI
TL;DR: When estimating the mean survival time, the proposed method achieves bias and mean square error similar to those achieved by analysis of the complete underlying individual patient data, and it naturally yields estimates of the uncertainty in curve fits, which are not available using the traditional methods.
Abstract: Mean costs and quality-adjusted-life-years are central to the cost-effectiveness of health technologies. They are often calculated from time to event curves such as for overall survival and progression-free survival. Ideally, estimates should be obtained from fitting an appropriate parametric model to individual patient data. However, such data are usually not available to independent researchers. Instead, it is common to fit curves to summary Kaplan-Meier graphs, either by regression or by least squares. Here, a more accurate method of fitting survival curves to summary survival data is described. First, the underlying individual patient data are estimated from the numbers of patients at risk (or other published information) and from the Kaplan-Meier graph. The survival curve can then be fit by maximum likelihood estimation or other suitable approach applied to the estimated individual patient data. The accuracy of the proposed method was compared against that of the regression and least squares methods and the use of the actual individual patient data by simulating the survival of patients in many thousands of trials. The cost-effectiveness of sunitinib versus interferon-alpha for metastatic renal cell carcinoma, as recently calculated for NICE in the UK, is reassessed under several methods, including the proposed method. Simulation shows that the proposed method gives more accurate curve fits than the traditional methods under realistic scenarios. Furthermore, the proposed method achieves similar bias and mean square error when estimating the mean survival time to that achieved by analysis of the complete underlying individual patient data. The proposed method also naturally yields estimates of the uncertainty in curve fits, which are not available using the traditional methods. The cost-effectiveness of sunitinib versus interferon-alpha is substantially altered when the proposed method is used. The method is recommended for cost-effectiveness analysis when only summary survival data are available. An easy-to-use Excel spreadsheet to implement the method is provided.
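
Purely as an illustration of the general idea (reconstruct approximate individual patient data from a published Kaplan-Meier curve and numbers at risk, then fit a parametric model by maximum likelihood), a minimal Python sketch might look like the following. The interval boundaries, survival probabilities, numbers at risk and the exponential model are assumptions for illustration; the paper's own implementation is an Excel spreadsheet and is more refined.

```python
# Illustrative sketch only: rebuild approximate individual patient data (IPD)
# from summary Kaplan-Meier information, then fit a simple exponential model
# by maximum likelihood. All numbers below are hypothetical.
import numpy as np
from scipy.optimize import minimize_scalar

t = np.array([0.0, 6.0, 12.0, 18.0, 24.0])    # interval boundaries (months)
S = np.array([1.00, 0.80, 0.62, 0.50, 0.41])  # KM survival read off the graph
n_risk = np.array([200, 150, 105, 72, 48])    # published numbers at risk

times, status = [], []
for j in range(len(t) - 1):
    deaths = int(round(n_risk[j] * (1 - S[j + 1] / S[j])))        # events in interval
    censored = max(int(n_risk[j] - deaths - n_risk[j + 1]), 0)    # losses to follow-up
    mid = 0.5 * (t[j] + t[j + 1])
    times += [mid] * (deaths + censored)      # place pseudo-IPD at interval midpoints
    status += [1] * deaths + [0] * censored
times += [t[-1]] * int(n_risk[-1])            # those still at risk are censored at the end
status += [0] * int(n_risk[-1])
times, status = np.array(times), np.array(status)

def neg_log_lik(rate):
    # exponential likelihood: events contribute log(rate) - rate*t, censored -rate*t
    return -(status * np.log(rate) - rate * times).sum()

fit = minimize_scalar(neg_log_lik, bounds=(1e-6, 1.0), method="bounded")
print(f"hazard = {fit.x:.4f}/month, mean survival = {1 / fit.x:.1f} months")
```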

Journal ArticleDOI
TL;DR: The sequential mixed mode appears to be the most cost-effective mode of survey administration for surveys of the population of doctors, if one is prepared to accept a degree of response bias.
Abstract: Surveys of doctors are an important data collection method in health services research. Ways to improve response rates, minimise survey response bias and item non-response, within a given budget, have not previously been addressed in the same study. The aim of this paper is to compare the effects and costs of three different modes of survey administration in a national survey of doctors. A stratified random sample of 4.9% (2,702/54,160) of doctors undertaking clinical practice was drawn from a national directory of all doctors in Australia. Stratification was by four doctor types: general practitioners, specialists, specialists-in-training, and hospital non-specialists, and by six rural/remote categories. A three-arm parallel trial design with equal randomisation across arms was used. Doctors were randomly allocated to: online questionnaire (902); simultaneous mixed mode (a paper questionnaire and login details sent together) (900); or sequential mixed mode (online followed by a paper questionnaire with the reminder) (900). Analysis was by intention to treat, as within each primary mode, doctors could choose either paper or online. Primary outcome measures were response rate, survey response bias, item non-response, and cost. The online mode had a response rate of 12.95%, compared with 19.7% for the simultaneous mixed mode and 20.7% for the sequential mixed mode. After adjusting for observed differences between the groups, the online mode had a 7 percentage point lower response rate compared to the simultaneous mixed mode, and a 7.7 percentage point lower response rate compared to sequential mixed mode. The difference in response rate between the sequential and simultaneous modes was not statistically significant. Both mixed modes showed evidence of response bias, whilst the characteristics of online respondents were similar to the population. However, the online mode had a higher rate of item non-response compared to both mixed modes. The total cost of the online survey was 38% lower than simultaneous mixed mode and 22% lower than sequential mixed mode. The cost of the sequential mixed mode was 14% lower than simultaneous mixed mode. Compared to the online mode, the sequential mixed mode was the most cost-effective, although exhibiting some evidence of response bias. Decisions on which survey mode to use depend on response rates, response bias, item non-response and costs. The sequential mixed mode appears to be the most cost-effective mode of survey administration for surveys of the population of doctors, if one is prepared to accept a degree of response bias. Online surveys are not yet suitable to be used exclusively for surveys of the doctor population.

Journal ArticleDOI
TL;DR: This work systematically outline sample size formulae (including required number of randomisation units, detectable difference and power) for CRCTs with a fixed number of clusters, to provide a concise summary for both binary and continuous outcomes.
Abstract: Cluster randomised controlled trials (CRCTs) are frequently used in health service evaluation. Assuming an average cluster size, required sample sizes are readily computed for both binary and continuous outcomes, by estimating a design effect or inflation factor. However, where the number of clusters is fixed in advance, but where it is possible to increase the number of individuals within each cluster, as is frequently the case in health service evaluation, sample size formulae have been less well studied. We systematically outline sample size formulae (including required number of randomisation units, detectable difference and power) for CRCTs with a fixed number of clusters, to provide a concise summary for both binary and continuous outcomes. Extensions to the case of unequal cluster sizes are provided. For trials with a fixed number of equal sized clusters (k), the trial will be feasible provided the number of clusters is greater than the product of the number of individuals required under individual randomisation (n_I) and the estimated intra-cluster correlation (ρ). So, a simple rule is that the number of clusters (k) will be sufficient provided k > n_I × ρ. Where this is not the case, investigators can determine the maximum available power to detect the pre-specified difference, or the minimum detectable difference under the pre-specified value for power. Designing a CRCT with a fixed number of clusters might mean that the study will not be feasible, leading to the notion of a minimum detectable difference (or a maximum achievable power), irrespective of how many individuals are included within each cluster.
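
The simple feasibility rule quoted above follows directly from the usual design-effect calculation; a sketch of the algebra, in standard notation rather than the paper's, is:

```latex
% k clusters of size m, intra-cluster correlation \rho, and n_I the sample size
% required under individual randomisation; equate the cluster-trial sample size
% to n_I inflated by the design effect 1 + (m-1)\rho and solve for m:
\[
  k\,m = n_I\,\bigl[1 + (m-1)\rho\bigr]
  \quad\Longrightarrow\quad
  m = \frac{n_I\,(1-\rho)}{k - n_I\,\rho},
\]
% which is finite and positive only when k > n_I \rho, giving the simple
% feasibility rule quoted in the abstract.
```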

Journal ArticleDOI
TL;DR: Given the increasing number of simple indices of IR it may be difficult for clinicians and researchers to select the most appropriate index for their studies.
Abstract: Insulin resistance is one of the major aggravating factors for metabolic syndrome. There are many methods available for estimation of insulin resistance which range from complex techniques down to simple indices. For all methods of assessing insulin resistance it is essential that their validity and reliability are established before using them as investigations. The reference techniques of the hyperinsulinaemic euglycaemic clamp and its alternative, the frequently sampled intravenous glucose tolerance test, are the most reliable methods available for estimating insulin resistance. However, many simple methods, from which indices can be derived, have been assessed and validated, e.g. homeostasis model assessment (HOMA) and the quantitative insulin sensitivity check index (QUICKI). Given the increasing number of simple indices of IR, it may be difficult for clinicians and researchers to select the most appropriate index for their studies. This review therefore provides guidelines and advice which must be considered before proceeding with a study.
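
For reference, the two fasting-sample indices named above are commonly defined as follows (note the units); these standard formulas are given for orientation and are not reproduced from the review itself.

```latex
\[
  \mathrm{HOMA\text{-}IR} =
    \frac{\text{fasting insulin}\ (\mu\mathrm{U/mL}) \times \text{fasting glucose}\ (\mathrm{mmol/L})}{22.5},
\]
\[
  \mathrm{QUICKI} =
    \frac{1}{\log_{10}\bigl[\text{fasting insulin}\ (\mu\mathrm{U/mL})\bigr]
           + \log_{10}\bigl[\text{fasting glucose}\ (\mathrm{mg/dL})\bigr]}.
\]
```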

Journal ArticleDOI
TL;DR: The large majority of recently published papers where authors have described their trial as a pilot or addressing feasibility do not primarily address methodological issues preparatory to planning a subsequent study, and this is particularly so for papers reporting drug trials.
Abstract: In the last decade several authors have reviewed the features of pilot and feasibility studies and advised on the issues that should be addressed within them. We extend this literature by examining published pilot/feasibility trials that incorporate random allocation, examining their stated objectives, results presented and conclusions drawn, and comparing drug and non-drug trials. A search of EMBASE and MEDLINE databases for 2000 to 2009 revealed 3652 papers that met our search criteria. A random sample of 50 was selected for detailed review. Most of the papers focused on efficacy: those reporting drug trials additionally addressed safety/toxicity, while those reporting non-drug trials additionally addressed methodological issues. In only 56% (95% confidence interval 41% to 70%) were methodological issues discussed in substantial depth, 18% (95% confidence interval 9% to 30%) discussed future trials and only 12% (95% confidence interval 5% to 24%) of authors were actually conducting one. Despite recent advice on topics that can appropriately be described as pilot or feasibility studies, the large majority of recently published papers where authors have described their trial as a pilot or addressing feasibility do not primarily address methodological issues preparatory to planning a subsequent study, and this is particularly so for papers reporting drug trials. Many journals remain willing to accept the pilot/feasibility designation for a trial, possibly as an indication of inconclusive results or lack of adequate sample size.

Journal ArticleDOI
TL;DR: Comparison of AUCs is a useful descriptive tool for initial evaluation of whether a new predictor might be of clinical relevance, but it has vastly inferior statistical properties.
Abstract: We have observed that the area under the receiver operating characteristic curve (AUC) is increasingly being used to evaluate whether a novel predictor should be incorporated in a multivariable model to predict risk of disease. Frequently, investigators will approach the issue in two distinct stages: first, by testing whether the new predictor variable is significant in a multivariable regression model; second, by testing differences between the AUC of models with and without the predictor using the same data from which the predictive models were derived. These two steps often lead to discordant conclusions. We conducted a simulation study in which two predictors, X and X*, were generated as standard normal variables with varying levels of predictive strength, represented by means that differed depending on the binary outcome Y. The data sets were analyzed using logistic regression, and likelihood ratio and Wald tests for the incremental contribution of X* were performed. The patient-specific predictors for each of the models were then used as data for a test comparing the two AUCs. Under the null, the sizes of the likelihood ratio and Wald tests were close to nominal, but the area test was extremely conservative, with test sizes less than 0.006 for all configurations studied. Where X* was associated with outcome, the area test had much lower power than the likelihood ratio and Wald tests. Evaluation of the statistical significance of a new predictor when there are existing clinical predictors is most appropriately accomplished in the context of a regression model. Although comparison of AUCs is a conceptually equivalent approach to the likelihood ratio and Wald tests, it has vastly inferior statistical properties. Use of both approaches will frequently lead to inconsistent conclusions. Nonetheless, comparison of receiver operating characteristic curves remains a useful descriptive tool for initial evaluation of whether a new predictor might be of clinical relevance.
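
A minimal Python sketch of the first of the two stages described above: fit nested logistic models and test the incremental value of X* with a likelihood ratio test. The sample size, effect sizes and the use of statsmodels are assumptions for illustration, not the authors' code.

```python
# Illustrative sketch: generate an outcome Y and two predictors whose means depend
# on Y (as in the simulation design described above), then test the incremental
# value of X* with a likelihood ratio test on nested logistic models.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
n = 1000
y = rng.integers(0, 2, n)              # binary outcome
x = rng.normal(0.6 * y, 1.0)           # existing predictor
x_star = rng.normal(0.3 * y, 1.0)      # candidate new predictor

X_reduced = sm.add_constant(x)
X_full = sm.add_constant(np.column_stack([x, x_star]))

fit_reduced = sm.Logit(y, X_reduced).fit(disp=0)
fit_full = sm.Logit(y, X_full).fit(disp=0)

lr = 2 * (fit_full.llf - fit_reduced.llf)     # likelihood ratio statistic, 1 df
print(f"LR = {lr:.2f}, p = {stats.chi2.sf(lr, df=1):.4f}")
```

The AUCs of the two fitted models could then be compared on the same data, which is the step the study shows to be far less powerful than the likelihood ratio or Wald test.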

Journal ArticleDOI
TL;DR: One-third of Cochrane reviews with substantial heterogeneity had major problems in relation to their handling of heterogeneity, and more attention is needed to this issue, as the problems identified can be essential for the conclusions of the reviews.
Abstract: Dealing with heterogeneity in meta-analyses is often tricky, and there is only limited advice for authors on what to do. We investigated how authors addressed different degrees of heterogeneity, in particular whether they used a fixed effect model, which assumes that all the included studies are estimating the same true effect, or a random effects model where this is not assumed. We randomly sampled 60 Cochrane reviews from 2008 that presented a result in their first meta-analysis with substantial heterogeneity (I² greater than 50%, i.e. more than 50% of the variation is due to heterogeneity rather than chance). We extracted information on choice of statistical model, how the authors had handled the heterogeneity, and assessed the methodological quality of the reviews in relation to this. The distribution of heterogeneity was rather uniform in the whole I² interval, 50-100%. A fixed effect model was used in 33 reviews (55%), but there was no correlation between I² and choice of model (P = 0.79). We considered that 20 reviews (33%), 16 of which had used a fixed effect model, had major problems. The most common problems were: use of a fixed effect model and lack of rationale for choice of that model, lack of comment on even severe heterogeneity and of reservations and explanations of its likely causes. The problematic reviews had significantly fewer included trials than other reviews (4.3 vs. 8.0, P = 0.024). The problems became less pronounced with time, as those reviews that were most recently updated more often used a random effects model. One-third of Cochrane reviews with substantial heterogeneity had major problems in relation to their handling of heterogeneity. More attention is needed to this issue, as the problems we identified can be essential for the conclusions of the reviews.
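
For orientation, the random effects approach most Cochrane reviews use is the DerSimonian-Laird moment estimator, conventionally written as below; these are standard formulas, not ones quoted from the paper.

```latex
% With inverse-variance weights w_i = 1/v_i over k trials and Cochran's
% Q = \sum_i w_i (\hat{\theta}_i - \hat{\theta}_{FE})^2:
\[
  \hat{\tau}^2 = \max\!\left(0,\ \frac{Q-(k-1)}{\sum_i w_i - \sum_i w_i^{2}/\sum_i w_i}\right),
  \qquad
  w_i^{\mathrm{RE}} = \frac{1}{v_i + \hat{\tau}^2},
\]
% so a fixed effect analysis amounts to assuming \hat{\tau}^2 = 0, while
% I^2 = \max\{0, (Q-(k-1))/Q\} expresses the share of variation beyond chance.
```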

Journal ArticleDOI
TL;DR: Strong artificial seasonal patterns in gestation length by month of conception were found, which depended on the end date of the study; studies using retrospective birth cohorts should account for this fixed cohort bias by removing selected births to obtain unbiased estimates of seasonal health effects.
Abstract: Many previous studies have found seasonal patterns in birth outcomes, but with little agreement about which season poses the highest risk. Some of the heterogeneity between studies may be explained by a previously unknown bias. The bias occurs in retrospective cohorts which include all births occurring within a fixed start and end date, which means shorter pregnancies are missed at the start of the study, and longer pregnancies are missed at the end. Our objective was to show the potential size of this bias and how to avoid it. To demonstrate the bias we simulated a retrospective birth cohort with no seasonal pattern in gestation and used a range of cohort end dates. As a real example, we used a cohort of 114,063 singleton births in Brisbane between 1 July 2005 and 30 June 2009 and examined the bias when estimating changes in gestation length associated with season (using month of conception) and a seasonal exposure (temperature). We used survival analyses with temperature as a time-dependent variable. We found strong artificial seasonal patterns in gestation length by month of conception, which depended on the end date of the study. The bias was avoided when the day and month of the start date was just before the day and month of the end date (regardless of year), so that the longer gestations at the start of the study were balanced by the shorter gestations at the end. After removing the fixed cohort bias there was a noticeable change in the effect of temperature on gestation length. The adjusted hazard ratios were flatter at the extremes of temperature but steeper between 15 and 25°C. Studies using retrospective birth cohorts should account for the fixed cohort bias by removing selected births to get unbiased estimates of seasonal health effects.
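
A minimal simulation sketch in Python of the bias mechanism described above; the dates, cohort size and gestation distribution are illustrative assumptions, not the paper's Brisbane data or code.

```python
# Illustrative sketch: gestation has no true seasonal pattern, yet restricting a
# retrospective cohort to births between fixed, misaligned calendar dates creates
# an artificial pattern in mean gestation by month of conception.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 200_000
conception = pd.Timestamp("2004-01-01") + pd.to_timedelta(rng.integers(0, 6 * 365, n), unit="D")
gestation = rng.normal(280, 10, n)                       # days, no seasonality by design
birth = conception + pd.to_timedelta(gestation, unit="D")
df = pd.DataFrame({"conception": conception, "birth": birth, "gestation": gestation})

# Misaligned vs aligned end dates relative to the 1 July start:
for end in ["2009-12-31", "2009-06-30"]:
    cohort = df[(df.birth >= "2005-07-01") & (df.birth <= end)]
    by_month = cohort.groupby(cohort.conception.dt.month)["gestation"].mean().round(2)
    print(end, by_month.values)
# With the misaligned end date an artificial pattern appears; with the end date
# sharing the start's day and month (as the paper recommends) it flattens out.
```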

Journal ArticleDOI
TL;DR: It is proposed that, with further psychometric testing, this new measure of resilience will provide researchers and clinicians with a comprehensive and developmentally appropriate instrument to measure a young person's capacity to achieve positive outcomes despite life stressors.
Abstract: The concept of resilience has captured the imagination of researchers and policy makers over the past two decades. However, despite the ever growing body of resilience research, there is a paucity of relevant, comprehensive measurement tools. In this article, the development of a theoretically based, comprehensive multi-dimensional measure of resilience in adolescents is described. Extensive literature review and focus groups with young people living with chronic illness informed the conceptual development of scales and items. Two sequential rounds of factor and scale analyses were undertaken to revise the conceptually developed scales using data collected from young people living with a chronic illness and a general population sample. The revised Adolescent Resilience Questionnaire comprises 93 items and 12 scales measuring resilience factors in the domains of self, family, peer, school and community. All scales have acceptable alpha coefficients. Revised scales closely reflect conceptually developed scales. It is proposed that, with further psychometric testing, this new measure of resilience will provide researchers and clinicians with a comprehensive and developmentally appropriate instrument to measure a young person's capacity to achieve positive outcomes despite life stressors.

Journal ArticleDOI
TL;DR: Simulation methods offer a flexible option to estimate statistical power for standard and non-traditional study designs and parameters of interest and are universally applicable for evaluating study designs used in epidemiologic and social science research.
Abstract: Background: Estimating the required sample size and statistical power for a study is an integral part of study design. For standard designs, power equations provide an efficient solution to the problem, but they are unavailable for many complex study designs that arise in practice. For such complex study designs, computer simulation is a useful alternative for estimating study power. Although this approach is well known among statisticians, in our experience many epidemiologists and social scientists are unfamiliar with the technique. This article aims to address this knowledge gap. Methods: We review an approach to estimate study power for individual- or cluster-randomized designs using computer simulation. This flexible approach arises naturally from the model used to derive conventional power equations, but extends those methods to accommodate arbitrarily complex designs. The method is universally applicable to a broad range of designs and outcomes, and we present the material in a way that is approachable for quantitative, applied researchers. We illustrate the method using two examples (one simple, one complex) based on sanitation and nutritional interventions to improve child growth. Results: We first show how simulation reproduces conventional power estimates for simple randomized designs over a broad range of sample scenarios to familiarize the reader with the approach. We then demonstrate how to extend the simulation approach to more complex designs. Finally, we discuss extensions to the examples in the article, and provide computer code to efficiently run the example simulations in both R and Stata.
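
The article supplies R and Stata code; purely as an illustration of the same idea, a Python sketch for a simple two-arm, individually randomised design with a continuous outcome might look like the following (effect size, SD, alpha and the number of simulations are assumed values).

```python
# Illustrative sketch: estimate power by simulating many two-arm trials with a
# continuous outcome and counting how often the null is rejected at alpha = 0.05.
import numpy as np
from scipy import stats

def simulated_power(n_per_arm, effect, sd=1.0, alpha=0.05, n_sims=2000, seed=1):
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, sd, n_per_arm)
        treated = rng.normal(effect, sd, n_per_arm)
        _, p_value = stats.ttest_ind(treated, control)
        rejections += p_value < alpha
    return rejections / n_sims

# Detecting a 0.5 SD difference with 64 participants per arm:
print(simulated_power(n_per_arm=64, effect=0.5))
```

With 64 per arm and a 0.5 SD difference the simulated power should land near the analytic value of roughly 0.80; more complex designs (e.g. cluster randomisation) are handled by replacing the data-generating step.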

Journal ArticleDOI
TL;DR: The FORM framework has now been widely adopted by Australian guideline developers who find it to be a logical and intuitive way to formulate and grade recommendations in clinical practice guidelines.
Abstract: Clinical practice guidelines are an important element of evidence-based practice. Considering an often complicated body of evidence can be problematic for guideline developers, who in the past may have resorted to using levels of evidence of individual studies as a quasi-indicator for the strength of a recommendation. This paper reports on the production and trial of a methodology and associated processes to assist Australian guideline developers in considering a body of evidence and grading the resulting guideline recommendations. In recognition of the complexities of clinical guidelines and the multiple factors that influence choice in health care, a working group of experienced guideline consultants was formed under the auspices of the Australian National Health and Medical Research Council (NHMRC) to produce and pilot a framework to formulate and grade guideline recommendations. Consultation with national and international experts and extensive piloting informed the process. The FORM framework consists of five components (evidence base, consistency, clinical impact, generalisability and applicability) which are used by guideline developers to structure their decisions on how to convey the strength of a recommendation through wording and grading via a considered judgement form. In parallel (but separate from the grading process) guideline developers are asked to consider implementation implications for each recommendation. The framework has now been widely adopted by Australian guideline developers who find it to be a logical and intuitive way to formulate and grade recommendations in clinical practice guidelines.

Journal ArticleDOI
TL;DR: (Network) meta-analysis of survival data with models where the treatment effect is represented with several parameters using fractional polynomials can be more closely fitted to the available data than meta-analysis based on the constant hazard ratio.
Abstract: Pairwise meta-analysis, indirect treatment comparisons and network meta-analysis for aggregate level survival data are often based on the reported hazard ratio, which relies on the proportional hazards assumption. This assumption is implausible when hazard functions intersect, and can have a huge impact on decisions based on comparisons of expected survival, such as cost-effectiveness analysis. As an alternative to network meta-analysis of survival data in which the treatment effect is represented by the constant hazard ratio, a multi-dimensional treatment effect approach is presented. With fractional polynomials the hazard functions of interventions compared in a randomized controlled trial are modeled, and the differences between the parameters of these fractional polynomials within a trial are synthesized (and indirectly compared) across studies. The proposed models are illustrated with an analysis of survival data in non-small-cell lung cancer. Fixed and random effects first and second order fractional polynomials were evaluated. (Network) meta-analysis of survival data with models where the treatment effect is represented with several parameters using fractional polynomials can be more closely fitted to the available data than meta-analysis based on the constant hazard ratio.
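
A sketch of the kind of model meant here, in standard fractional-polynomial notation (the exact parameterisation in the paper may differ): the log-hazard of each arm is a first- or second-order fractional polynomial of time, and the treatment effect is the vector of coefficient differences, pooled across trials.

```latex
% Second-order fractional polynomial for the log-hazard of arm a in trial j,
% powers p_1, p_2 from {-2, -1, -0.5, 0, 0.5, 1, 2, 3} and t^0 read as ln t:
\[
  \ln h_{ja}(t) = \beta_{0ja} + \beta_{1ja}\,t^{p_1} + \beta_{2ja}\,t^{p_2},
  \qquad
  d_j = \bigl(\beta_{0jB}-\beta_{0jA},\ \beta_{1jB}-\beta_{1jA},\ \beta_{2jB}-\beta_{2jA}\bigr),
\]
% where the vector d_j of within-trial coefficient differences is the
% multi-dimensional treatment effect synthesised across trials; a constant log
% hazard ratio is the special case in which only the intercepts differ.
```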

Journal ArticleDOI
TL;DR: Treating death as a competing risk gives estimators which address the clinical questions of interest, and allows for simultaneous modelling of both in-hospital mortality and TCS / LOS.
Abstract: Hospital length of stay (LOS) and time for a patient to reach clinical stability (TCS) have increasingly become important outcomes when investigating ways in which to combat Community Acquired Pneumonia (CAP). Difficulties arise when deciding how to handle in-hospital mortality. Ad-hoc approaches that are commonly used to handle time to event outcomes with mortality can give disparate results and provide conflicting conclusions based on the same data. To ensure compatibility among studies investigating these outcomes, this type of data should be handled in a consistent and appropriate fashion. Using both simulated data and data from the international Community Acquired Pneumonia Organization (CAPO) database, we evaluate two ad-hoc approaches for handling mortality when estimating the probability of hospital discharge and clinical stability: 1) restricting analysis to those patients who lived, and 2) assigning individuals who die the "worst" outcome (right-censoring them at the longest recorded LOS or TCS). Estimated probability distributions based on these approaches are compared with right-censoring the individuals who died at time of death (the complement of the Kaplan-Meier (KM) estimator), and treating death as a competing risk (the cumulative incidence estimator). Tests for differences in probability distributions based on the four methods are also contrasted. The two ad-hoc approaches give different estimates of the probability of discharge and clinical stability. Analysis restricted to patients who survived is conceptually problematic, as estimation is conditioned on events that happen at a future time. Estimation based on assigning those patients who died the worst outcome (longest LOS and TCS) coincides with the complement of the KM estimator based on the subdistribution hazard, which has been previously shown to be equivalent to the cumulative incidence estimator. However, in either case the time to in-hospital mortality is ignored, preventing simultaneous assessment of patient mortality in addition to LOS and/or TCS. The power to detect differences in underlying hazards of discharge between patient populations differs for test statistics based on the four approaches, and depends on the underlying hazard ratio of mortality between the patient groups. Treating death as a competing risk gives estimators which address the clinical questions of interest, and allows for simultaneous modelling of both in-hospital mortality and TCS / LOS. This article advocates treating mortality as a competing risk when investigating other time related outcomes.
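
For readers who want the estimator being advocated: the cumulative incidence of, say, discharge in the presence of death as a competing risk is usually estimated with the Aalen-Johansen form below (standard notation, not the paper's); the naive "1 minus Kaplan-Meier" that censors deaths overestimates this probability.

```latex
% Ordered event times t_i, n_i at risk, d_{ki} events of type k (e.g. discharge),
% and \hat{S}(t) the all-cause Kaplan-Meier estimator:
\[
  \hat{F}_k(t) = \sum_{t_i \le t} \hat{S}(t_{i-1})\,\frac{d_{ki}}{n_i},
\]
% whereas 1 - \hat{S}^{KM}_k(t), obtained by censoring deaths, treats patients
% who died as if they could still be discharged and so overstates F_k(t).
```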

Journal ArticleDOI
TL;DR: For a large data set there seems to be no explicit preference for either a frequentist or Bayesian approach (if based on vague priors), and on relatively large data sets, the different software implementations of logistic random effects regression models produced similar results.
Abstract: Logistic random effects models are a popular tool to analyze multilevel (also called hierarchical) data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models. We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR; Bayesian approaches included WinBUGS, MLwiN (MCMC), the R package MCMCglmm and the SAS experimental procedure MCMC. Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set, a proportional odds model with a random center effect was also fitted. The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on a relatively sparse data set, i.e. when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability. There are also differences in the availability of additional tools for model evaluation, such as diagnostic plots. The experimental SAS (version 9.2) procedure MCMC appeared to be inefficient. On relatively large data sets, the different software implementations of logistic random effects regression models produced similar results. Thus, for a large data set there seems to be no explicit preference (of course if there is no preference from a philosophical point of view) for either a frequentist or Bayesian approach (if based on vague priors). The choice for a particular implementation may largely depend on the desired flexibility, and the usability of the package. For small data sets the random effects variances are difficult to estimate. In the frequentist approaches the MLE of this variance was often estimated as zero with a standard error that is either zero or could not be determined, while for Bayesian methods the estimates could depend on the chosen "non-informative" prior of the variance parameter. The starting value for the variance parameter may also be critical for the convergence of the Markov chain.
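
To fix ideas, the binary-outcome model being fitted across all these packages is essentially a random-intercept logistic regression of the following generic form (my notation, not the paper's), with the ordinal outcome handled by a proportional odds analogue.

```latex
% Patient i in center j, covariates x_{ij} (e.g. age, motor score, pupil
% reactivity, trial), random intercept u_j:
\[
  \operatorname{logit}\Pr(Y_{ij}=1 \mid u_j) = x_{ij}^{\top}\beta + u_j,
  \qquad u_j \sim N(0,\ \sigma_u^2),
\]
% optionally with a second random effect for trial, and a proportional odds
% analogue for the ordinal (5-point GOS) outcome.
```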

Journal ArticleDOI
TL;DR: The flexible parametric survival model is extended to incorporate cure as a special case, estimating the cure proportion and the survival of the "uncured"; it gives similar results to a standard cure model when the standard model is reliable, and a better fit when the standard cure model gives biased estimates.
Abstract: When the mortality among a cancer patient group returns to the same level as in the general population, that is, the patients no longer experience excess mortality, the patients still alive are considered "statistically cured". Cure models can be used to estimate the cure proportion as well as the survival function of the "uncured". One limitation of parametric cure models is that the functional form of the survival of the "uncured" has to be specified. It can sometimes be hard to find a survival function flexible enough to fit the observed data, for example, when there is high excess hazard within a few months from diagnosis, which is common among older age groups. This has led to the exclusion of older age groups in population-based cancer studies using cure models. Here we have extended the flexible parametric survival model to incorporate cure as a special case to estimate the cure proportion and the survival of the "uncured". Flexible parametric survival models use splines to model the underlying hazard function, and therefore no parametric distribution has to be specified. We have compared the fit from standard cure models to our flexible cure model, using data on colon cancer patients in Finland. This new method gives similar results to a standard cure model, when it is reliable, and better fit when the standard cure model gives biased estimates. Cure models within the framework of flexible parametric models enables cure modelling when standard models give biased estimates. These flexible cure models enable inclusion of older age groups and can give stage-specific estimates, which is not always possible from parametric cure models.
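
One common way to write the mixture cure model in the relative-survival setting discussed here (generic notation; the paper's flexible parametric version replaces a fixed parametric form for the survival of the "uncured" with a spline-based one):

```latex
% All-cause survival factorised into expected (general population) survival
% S^*(t), a cure proportion \pi, and survival of the "uncured" S_u(t):
\[
  S(t) = S^{*}(t)\,\bigl[\pi + (1-\pi)\,S_u(t)\bigr],
\]
% so the excess hazard tends to zero over time and the statistically cured
% fraction \pi remains; flexible parametric versions model the underlying
% hazard with splines rather than a fixed parametric distribution.
```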

Journal ArticleDOI
TL;DR: A number of methods, particularly the AFT method of Branson and Whitehead, were found to give less biased estimates of the true treatment effect in these situations; alternative approaches such as the Branson & Whitehead method should be considered to adjust for switching.
Abstract: We investigate methods used to analyse the results of clinical trials with survival outcomes in which some patients switch from their allocated treatment to another trial treatment. These included simple methods that are commonly used in the medical literature and may be subject to selection bias if patients switching are not typical of the population as a whole. Methods which attempt to adjust the estimated treatment effect, either through adjustment to the hazard ratio or via accelerated failure time models, were also considered. A simulation study was conducted to assess the performance of each method in a number of different scenarios. Sixteen different scenarios were identified which differed by the proportion of patients switching, underlying prognosis of switchers and the size of true treatment effect. 1000 datasets were simulated for each of these and all methods applied. Selection bias was observed in simple methods when the difference in survival between switchers and non-switchers was large. A number of methods, particularly the AFT method of Branson and Whitehead, were found to give less biased estimates of the true treatment effect in these situations. Simple methods are often not appropriate to deal with treatment switching. Alternative approaches such as the Branson & Whitehead method to adjust for switching should be considered.
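
For orientation, adjustment methods of the kind attributed to Branson & Whitehead build on a counterfactual accelerated failure time model; one common formulation is sketched below. Sign conventions vary between authors, so treat this as a sketch rather than the paper's exact specification.

```latex
% Observed time split into time off and on the experimental treatment; the
% counterfactual treatment-free time U_i rescales time on treatment by e^{\psi}:
\[
  U_i = T_i^{\text{off}} + e^{\psi}\,T_i^{\text{on}},
\]
% with \psi chosen (iteratively, or by g-estimation) so that U_i is independent
% of randomised arm; under this convention \psi < 0 indicates a beneficial
% treatment, and e^{-\psi} is the acceleration factor.
```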

Journal ArticleDOI
TL;DR: There is consensus from a group of 21 international experts that methodological criteria to assess moderators within systematic reviews of RCTs are both timely and necessary.
Abstract: Background Current methodological guidelines provide advice about the assessment of sub-group analysis within RCTs, but do not specify explicit criteria for assessment. Our objective was to provide researchers with a set of criteria that will facilitate the grading of evidence for moderators in systematic reviews.