Showing papers in "BMC Medical Research Methodology in 2010"


Journal ArticleDOI
TL;DR: A detailed examination of the key aspects of pilot studies for phase III trials including the general reasons for conducting a pilot study, the relationships between pilot studies, proof-of-concept studies, and adaptive designs, and some suggestions on how to report the results of pilot investigations using the CONSORT format.
Abstract: Pilot studies for phase III trials - which are comparative randomized trials designed to provide preliminary evidence on the clinical efficacy of a drug or intervention - are routinely performed in many clinical areas. Also commonly known as "feasibility" or "vanguard" studies, they are designed to assess the safety of treatment or interventions; to assess recruitment potential; to assess the feasibility of international collaboration or coordination for multicentre trials; and to increase clinical experience with the study medication or intervention for the phase III trials. They are the best way to assess feasibility of a large, expensive full-scale study, and in fact are an almost essential pre-requisite. Conducting a pilot prior to the main study can enhance the likelihood of success of the main study and potentially help to avoid doomed main studies. The objective of this paper is to provide a detailed examination of the key aspects of pilot studies for phase III trials including: 1) the general reasons for conducting a pilot study; 2) the relationships between pilot studies, proof-of-concept studies, and adaptive designs; 3) the challenges of and misconceptions about pilot studies; 4) the criteria for evaluating the success of a pilot study; 5) frequently asked questions about pilot studies; 7) some ethical aspects related to pilot studies; and 8) some suggestions on how to report the results of pilot investigations using the CONSORT format.

2,365 citations


Journal ArticleDOI
TL;DR: This paper explains the choices made in the COSMIN checklist for the design requirements and preferred statistical methods for which no evidence is available in the literature or on which the Delphi panel members had substantial discussion.
Abstract: Background: The COSMIN checklist (COnsensus-based Standards for the selection of health status Measurement INstruments) was developed in an international Delphi study to evaluate the methodological quality of studies on measurement properties of health-related patient reported outcomes (HR-PROs). In this paper, we explain our choices for the design requirements and preferred statistical methods for which no evidence is available in the literature or on which the Delphi panel members had substantial discussion. Methods: The issues described in this paper are a reflection of the Delphi process in which 43 panel members participated. Results: The topics discussed are internal consistency (relevance for reflective and formative models, and distinction with unidimensionality), content validity (judging relevance and comprehensiveness), hypotheses testing as an aspect of construct validity (specificity of hypotheses), criterion validity (relevance for PROs), and responsiveness (concept and relation to validity, and (in) appropriate measures). Conclusions: We expect that this paper will contribute to a better understanding of the rationale behind the items, thereby enhancing the acceptance and use of the COSMIN checklist.

1,213 citations


Journal ArticleDOI
TL;DR: Pilot studies are still poorly reported, with inappropriate emphasis on hypothesis-testing, and authors should be aware of the different requirements of pilot studies, feasibility studies and main studies and report them appropriately.
Abstract: In 2004, a review of pilot studies published in seven major medical journals during 2000-01 recommended that the statistical analysis of such studies should be either mainly descriptive or focus on sample size estimation, while results from hypothesis testing must be interpreted with caution. We revisited these journals to see whether the subsequent recommendations have changed the practice of reporting pilot studies. We also conducted a survey to identify the methodological components in registered research studies which are described as 'pilot' or 'feasibility' studies. We extended this survey to grant-awarding bodies and editors of medical journals to discover their policies regarding the function and reporting of pilot studies. Papers from 2007-08 in seven medical journals were screened to retrieve published pilot studies. Reports of registered and completed studies on the UK Clinical Research Network (UKCRN) Portfolio database were retrieved and scrutinized. Guidance on the conduct and reporting of pilot studies was retrieved from the websites of three grant giving bodies and seven journal editors were canvassed. 54 pilot or feasibility studies published in 2007-8 were found, of which 26 (48%) were pilot studies of interventions and the remainder feasibility studies. The majority incorporated hypothesis-testing (81%), a control arm (69%) and a randomization procedure (62%). Most (81%) pointed towards the need for further research. Only 8 out of 90 pilot studies identified by the earlier review led to subsequent main studies. Twelve studies which were interventional pilot/feasibility studies and which included testing of some component of the research process were identified through the UKCRN Portfolio database. There was no clear distinction in use of the terms 'pilot' and 'feasibility'. Five journal editors replied to our entreaty. In general they were loath to publish studies described as 'pilot'.
Pilot studies are still poorly reported, with inappropriate emphasis on hypothesis-testing. Authors should be aware of the different requirements of pilot studies, feasibility studies and main studies and report them appropriately. Authors should be explicit as to the purpose of a pilot study. The definitions of feasibility and pilot studies vary and we make proposals here to clarify terminology.

1,167 citations


Journal ArticleDOI
TL;DR: This paper illustrates the process and required steps involved in the cross-cultural adaptation of a research instrument, using the adaptation of an attitudinal instrument as an example, and shows the importance of ensuring that concepts within an instrument are equal between the original and target language, time and context.
Abstract: Background: Research questionnaires are not always translated appropriately before they are used in new temporal, cultural or linguistic settings. The results based on such instruments may therefore not accurately reflect what they are supposed to measure. This paper aims to illustrate the process and required steps involved in the cross-cultural adaptation of a research instrument using the adaptation process of an attitudinal instrument as an example.

528 citations


Journal ArticleDOI
TL;DR: These findings show that for a broad range of risk factors, two studies of the same population with varying response rate, sampling frame and mode of questionnaire administration yielded consistent estimates of exposure-outcome relationships; however, ORs varied between the studies where they did not use identical questionnaire items.
Abstract: There is little empirical evidence regarding the generalisability of relative risk estimates from studies which have relatively low response rates or are of limited representativeness. The aim of this study was to investigate variation in exposure-outcome relationships in studies of the same population with different response rates and designs by comparing estimates from the 45 and Up Study, a population-based cohort study (self-administered postal questionnaire, response rate 18%), and the New South Wales Population Health Survey (PHS) (computer-assisted telephone interview, response rate ~60%). Logistic regression analysis of questionnaire data from 45 and Up Study participants (n = 101,812) and 2006/2007 PHS participants (n = 14,796) was used to calculate prevalence estimates and odds ratios (ORs) for comparable variables, adjusting for age, sex and remoteness. ORs were compared using Wald tests modelling each study separately, with and without sampling weights. Prevalence of some outcomes (smoking, private health insurance, diabetes, hypertension, asthma) varied between the two studies. For highly comparable questionnaire items, exposure-outcome relationship patterns were almost identical between the studies and ORs for eight of the ten relationships examined did not differ significantly. For questionnaire items that were only moderately comparable, the nature of the observed relationships did not differ materially between the two studies, although many ORs differed significantly. These findings show that for a broad range of risk factors, two studies of the same population with varying response rate, sampling frame and mode of questionnaire administration yielded consistent estimates of exposure-outcome relationships. However, ORs varied between the studies where they did not use identical questionnaire items.

368 citations
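
The abstract above says that ORs from the two studies were compared using Wald tests. A minimal sketch of one such comparison follows, assuming the two estimates come from independent samples and that each OR is reported with the standard error of its logarithm. The function name and the input values are hypothetical and are not taken from the paper.

```python
import math

def compare_odds_ratios(or_a, se_log_a, or_b, se_log_b):
    """Wald test for equality of two independent odds ratios.

    The comparison is made on the log scale, where the estimates are
    approximately normal. Returns the z statistic and two-sided p-value.
    """
    diff = math.log(or_a) - math.log(or_b)
    se_diff = math.sqrt(se_log_a ** 2 + se_log_b ** 2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical estimates: (OR, SE of log OR) for the same exposure in two studies.
z, p = compare_odds_ratios(1.50, 0.05, 1.40, 0.10)
print(f"z = {z:.2f}, p = {p:.2f}")
```

With these illustrative inputs the difference is well within sampling error, which is the pattern the abstract reports for highly comparable questionnaire items.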


Journal ArticleDOI
TL;DR: Investigating time-varying effects of prognostic factors of metastases should be an integral part of Cox survival analyses and provide insights on some specific time patterns, and on valuable biological information that could be missed otherwise.
Abstract: The Cox model relies on the proportional hazards (PH) assumption, implying that the factors investigated have a constant impact on the hazard - or risk - over time. We emphasize the importance of this assumption and the misleading conclusions that can be inferred if it is violated; this is particularly essential in the presence of long follow-ups. We illustrate our discussion by analyzing prognostic factors of metastases in 979 women treated for breast cancer with surgery. Age, tumour size and grade, lymph node involvement, peritumoral vascular invasion (PVI), status of hormone receptors (HRec), Her2, and Mib1 were considered. Median follow-up was 14 years; 264 women developed metastases. The conventional Cox model suggested that all factors but HRec, Her2, and Mib1 status were strong prognostic factors of metastases. Additional tests indicated that the PH assumption was not satisfied for some variables of the model. Tumour grade had a significant time-varying effect, but although its effect diminished over time, it remained strong. Interestingly, while the conventional Cox model did not show any significant effect of the HRec status, tests provided strong evidence that this variable had a non-constant effect over time. Negative HRec status increased the risk of metastases early but became protective thereafter. This reversal of effect may explain non-significant hazard ratios provided by previous conventional Cox analyses in studies with long follow-ups. Investigating time-varying effects should be an integral part of Cox survival analyses. Detecting and accounting for time-varying effects provide insights on some specific time patterns, and on valuable biological information that could be missed otherwise.

297 citations


Journal ArticleDOI
TL;DR: A significant and positive relationship between HI titre and clinical protection against influenza was observed in all tested models and was found to be similar irrespective of the type of viral strain (A or B) and the vaccination status of the individuals.
Abstract: Antibodies directed against haemagglutinin, measured by the haemagglutination inhibition (HI) assay are essential to protective immunity against influenza infection. An HI titre of 1:40 is generally accepted to correspond to a 50% reduction in the risk of contracting influenza in a susceptible population, but limited attempts have been made to further quantify the association between HI titre and protective efficacy. We present a model, using a meta-analytical approach, that estimates the level of clinical protection against influenza at any HI titre level. Source data were derived from a systematic literature review that identified 15 studies, representing a total of 5899 adult subjects and 1304 influenza cases with interval-censored information on HI titre. The parameters of the relationship between HI titre and clinical protection were estimated using Bayesian inference with a consideration of random effects and censorship in the available information. A significant and positive relationship between HI titre and clinical protection against influenza was observed in all tested models. This relationship was found to be similar irrespective of the type of viral strain (A or B) and the vaccination status of the individuals. Although limitations in the data used should not be overlooked, the relationship derived in this analysis provides a means to predict the efficacy of inactivated influenza vaccines when only immunogenicity data are available. This relationship can also be useful for comparing the efficacy of different influenza vaccines based on their immunological profile.

291 citations


Journal ArticleDOI
TL;DR: A review of recent research in automated de-identification of narrative text documents from the electronic health record finds methods based on dictionaries performed better with PHI that is rarely mentioned in clinical text, but are more difficult to generalize.
Abstract: Background: In the United States, the Health Insurance Portability and Accountability Act (HIPAA) protects the confidentiality of patient data and requires the informed consent of the patient and approval of the Internal Review Board to use data for research purposes, but these requirements can be waived if data is de-identified. For clinical data to be considered de-identified, the HIPAA “Safe Harbor” technique requires 18 data elements (called PHI: Protected Health Information) to be removed. The de-identification of narrative text documents is often realized manually, and requires significant resources. Well aware of these issues, several authors have investigated automated de-identification of narrative text documents from the electronic health record, and a review of recent research in this domain is presented here. Methods: This review focuses on recently published research (after 1995), and includes relevant publications from bibliographic queries in PubMed, conference proceedings, the ACM Digital Library, and interesting publications referenced in already included papers. Results: The literature search returned more than 200 publications. The majority focused only on structured data de-identification instead of narrative text, on image de-identification, or described manual de-identification, and were therefore excluded. Finally, 18 publications describing automated text de-identification were selected for detailed analysis of the architecture and methods used, the types of PHI detected and removed, the external resources used, and the types of clinical documents targeted. All text de-identification systems aimed to identify and remove person names, and many included other types of PHI. Most systems used only one or two specific clinical document types, and were mostly based on two different groups of methodologies: pattern matching and machine learning. 
Many systems combined both approaches for different types of PHI, but the majority relied only on pattern matching, rules, and dictionaries. Conclusions: In general, methods based on dictionaries performed better with PHI that is rarely mentioned in clinical text, but are more difficult to generalize. Methods based on machine learning tend to perform better, especially with PHI that is not mentioned in the dictionaries used. Finally, the issues of anonymization, sufficient performance, and “over-scrubbing” are discussed in this publication.

288 citations
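
The review above groups de-identification systems into pattern matching and machine learning. A minimal sketch of the pattern-matching idea follows, using only Python's standard `re` module. The two patterns, the placeholder tags and the sample note are hypothetical illustrations and do not come from any of the reviewed systems, which rely on far richer rules and dictionaries.

```python
import re

# Hypothetical patterns for two PHI types (dates and US-style phone numbers).
PHI_PATTERNS = [
    ("[DATE]", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("[PHONE]", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
]

def scrub(text):
    """Replace every match of each PHI pattern with its placeholder tag."""
    for placeholder, pattern in PHI_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

note = "Seen on 03/14/2009; call 555-123-4567 to follow up."
print(scrub(note))
```

Rules like these are easy to audit but only catch the formats they anticipate, which is consistent with the review's remark that such methods are difficult to generalize.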


Journal ArticleDOI
TL;DR: Results indicate that raters often choose the same response option on the COSMIN checklist, but that it is difficult at item level to distinguish between articles.
Abstract: Background: The COSMIN checklist is a tool for evaluating the methodological quality of studies on measurement properties of health-related patient-reported outcomes. The aim of this study is to determine the inter-rater agreement and reliability of each item score of the COSMIN checklist (n = 114). Methods: 75 articles evaluating measurement properties were randomly selected from the bibliographic database compiled by the Patient-Reported Outcome Measurement Group, Oxford, UK. Raters were asked to assess the methodological quality of three articles, using the COSMIN checklist. In a one-way design, percentage agreement and intraclass kappa coefficients or quadratic-weighted kappa coefficients were calculated for each item. Results: 88 raters participated. Of the 75 selected articles, 26 articles were rated by four to six participants, and 49 by two or three participants. Overall, percentage agreement was appropriate (68% was above 80% agreement), and the kappa coefficients for the COSMIN items were low (61% was below 0.40, 6% was above 0.75). Reasons for low inter-rater agreement were the need for subjective judgement and raters being accustomed to different standards, terminology and definitions. Conclusions: Results indicated that raters often choose the same response option, but that it is difficult at item level to distinguish between articles. When using the COSMIN checklist in a systematic review, we recommend getting some training and experience, completing it by two independent raters, and reaching consensus on one final rating. Instructions for using the checklist are improved.

233 citations
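
The combination of high percentage agreement and low kappa reported above can look contradictory. The sketch below shows how it arises when nearly all ratings fall in one category. It uses plain Cohen's kappa for two raters, which is a simpler statistic than the intraclass and quadratic-weighted kappa coefficients used in the study, and the ratings are invented for illustration.

```python
from collections import Counter

def percent_agreement(a, b):
    """Share of items on which the two raters gave the same rating."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen's kappa: agreement beyond what the marginal frequencies predict."""
    n = len(a)
    p_obs = percent_agreement(a, b)
    freq_a, freq_b = Counter(a), Counter(b)
    p_exp = sum(freq_a[c] * freq_b[c] for c in set(a) | set(b)) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical ratings of 20 articles on one item; "yes" dominates for both raters.
rater_1 = ["yes"] * 19 + ["no"]
rater_2 = ["yes"] * 18 + ["no", "yes"]

agreement = percent_agreement(rater_1, rater_2)
kappa = cohen_kappa(rater_1, rater_2)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.3f}")
```

Because both raters say "yes" almost every time, chance agreement is already about 0.905, so 90% observed agreement yields a kappa near zero.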


Journal ArticleDOI
TL;DR: This paper presents a tutorial illustrating how network meta-analyses of survival endpoints can combine count and hazard ratio statistics in a single analysis on the hazard ratio scale and avoids the potential selection bias associated with conducting an analysis for a single statistic.
Abstract: Data on survival endpoints are usually summarised using either hazard ratio, cumulative number of events, or median survival statistics. Network meta-analysis, an extension of traditional pairwise meta-analysis, is typically based on a single statistic. In this case, studies which do not report the chosen statistic are excluded from the analysis which may introduce bias. In this paper we present a tutorial illustrating how network meta-analyses of survival endpoints can combine count and hazard ratio statistics in a single analysis on the hazard ratio scale. We also describe methods for accounting for the correlations in relative treatment effects (such as hazard ratios) that arise in trials with more than two arms. Combination of count and hazard ratio data in a single analysis is achieved by estimating the cumulative hazard for each trial arm reporting count data. Correlation in relative treatment effects in multi-arm trials is preserved by converting the relative treatment effect estimates (the hazard ratios) to arm-specific outcomes (hazards). A worked example of an analysis of mortality data in chronic obstructive pulmonary disease (COPD) is used to illustrate the methods. The data set and WinBUGS code for fixed and random effects models are provided. By incorporating all data presentations in a single analysis, we avoid the potential selection bias associated with conducting an analysis for a single statistic and the potential difficulties of interpretation, misleading results and loss of available treatment comparisons associated with conducting separate analyses for different summary statistics.

213 citations
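
The tutorial above places count data on the hazard ratio scale by estimating a cumulative hazard for each trial arm. A minimal sketch of that conversion follows, assuming proportional hazards, a common follow-up time in both arms, and survival estimated simply as one minus the event proportion. The function names and counts are hypothetical, and the sketch is not the WinBUGS model provided with the paper.

```python
import math

def cumulative_hazard(events, n):
    """Cumulative hazard H = -ln(S), with survival S estimated as 1 - events/n."""
    return -math.log(1 - events / n)

def log_hazard_ratio_from_counts(events_trt, n_trt, events_ctl, n_ctl):
    """Log hazard ratio implied by event counts in two arms.

    Under proportional hazards the ratio of cumulative hazards at any
    common time point equals the hazard ratio.
    """
    h_trt = cumulative_hazard(events_trt, n_trt)
    h_ctl = cumulative_hazard(events_ctl, n_ctl)
    return math.log(h_trt) - math.log(h_ctl)

# Hypothetical trial: 20/100 deaths on treatment, 30/100 on control.
log_hr = log_hazard_ratio_from_counts(20, 100, 30, 100)
print(f"HR = {math.exp(log_hr):.3f}")
```

Once every arm is expressed this way, trials that report counts and trials that report hazard ratios contribute to the same analysis instead of the former being excluded.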


Journal ArticleDOI
TL;DR: The results from this simulation study suggest that performing MICE-PMM may be the preferred MI approach provided that less than 50% of the cases have missing data and the missing data are not MNAR.
Abstract: There is no consensus on the most appropriate approach to handle missing covariate data within prognostic modelling studies. Therefore a simulation study was performed to assess the effects of different missing data techniques on the performance of a prognostic model. Datasets were generated to resemble the skewed distributions seen in a motivating breast cancer example. Multivariate missing data were imposed on four covariates using four different mechanisms: missing completely at random (MCAR), missing at random (MAR), missing not at random (MNAR) and a combination of all three mechanisms. Five amounts of incomplete cases from 5% to 75% were considered. Complete case analysis (CC), single imputation (SI) and five multiple imputation (MI) techniques available within the R statistical software were investigated: a) data augmentation (DA) approach assuming a multivariate normal distribution, b) DA assuming a general location model, c) regression switching imputation, d) regression switching with predictive mean matching (MICE-PMM) and e) flexible additive imputation models. A Cox proportional hazards model was fitted and appropriate estimates for the regression coefficients and model performance measures were obtained. Performing a CC analysis produced unbiased regression estimates, but inflated standard errors, which affected the significance of the covariates in the model with 25% or more missingness. Using SI underestimated the variability, resulting in poor coverage even with 10% missingness. Of the MI approaches, applying MICE-PMM produced, in general, the least biased estimates and better coverage for the incomplete covariates and better model performance for all mechanisms. However, this MI approach still produced biased regression coefficient estimates for the incomplete skewed continuous covariates when 50% or more cases had missing data imposed with a MCAR, MAR or combined mechanism.
When the missingness depended on the incomplete covariates, i.e. MNAR, estimates were biased with more than 10% incomplete cases for all MI approaches. The results from this simulation study suggest that performing MICE-PMM may be the preferred MI approach provided that less than 50% of the cases have missing data and the missing data are not MNAR.
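
The predictive mean matching step named in the abstract above can be sketched in a few lines. The version below is a deliberately simplified, single-imputation illustration with one predictor. It omits the random draw of regression parameters and the chained, repeated imputation across variables that a full MICE-PMM procedure performs. All names and data are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def pmm_impute(xs, ys, k=3, rng=None):
    """Fill missing y values (None) with an observed y from a nearby donor.

    Donors are the k complete cases whose predicted y is closest to the
    predicted y of the incomplete case; one donor is chosen at random.
    """
    rng = rng or random.Random(0)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in observed],
                                [y for _, y in observed])

    def predict(x):
        return intercept + slope * x

    completed = []
    for x, y in zip(xs, ys):
        if y is not None:
            completed.append(y)
            continue
        target = predict(x)
        donors = sorted(observed, key=lambda d: abs(predict(d[0]) - target))[:k]
        completed.append(rng.choice(donors)[1])
    return completed

# Hypothetical covariate x with two missing values in y.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 1.9, None, 4.2, 5.0, None, 7.1, 8.3]
completed = pmm_impute(xs, ys)
print(completed)
```

Because every imputed value is borrowed from a real observation, the imputations stay within the observed range and follow the observed distribution, which is a commonly cited reason for preferring this approach with skewed covariates.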

Journal ArticleDOI
TL;DR: These surname lists can be used to identify cohorts of people with South Asian and Chinese origins from secondary data sources with a high degree of accuracy, and these cohorts could then be used in epidemiologic and health service research studies of populations with South Asian and Chinese origins.
Abstract: Surname lists are useful for identifying cohorts of ethnic minority patients from secondary data sources. This study sought to develop and validate lists to identify people of South Asian and Chinese origin. Comprehensive lists of South Asian and Chinese surnames were reviewed to identify those that uniquely belonged to the ethnic minority group. Surnames that were common in other populations, communities or ethnic groups were specifically excluded. These surname lists were applied to the Registered Persons Database, a registry of the health card numbers assigned to all residents of the Canadian province of Ontario, so that all residents were assigned to South Asian ethnicity, Chinese ethnicity or the General Population. Ethnic assignment was validated against self-identified ethnicity through linkage with responses to the Canadian Community Health Survey. The final surname lists included 9,950 South Asian surnames and 1,133 Chinese surnames. All 16,688,384 current and former residents of Ontario were assigned to South Asian ethnicity, Chinese ethnicity or the General Population based on their surnames. Among 69,859 respondents to the Canadian Community Health Survey, both lists performed extremely well when compared against self-identified ethnicity: positive predictive value was 89.3% for the South Asian list, and 91.9% for the Chinese list. Because surnames shared with other ethnic groups were deliberately excluded from the lists, sensitivity was lower (50.4% and 80.2%, respectively). These surname lists can be used to identify cohorts of people with South Asian and Chinese origins from secondary data sources with a high degree of accuracy. These cohorts could then be used in epidemiologic and health service research studies of populations with South Asian and Chinese origins.
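
The abstract above reports high positive predictive value but lower sensitivity for the surname lists. A minimal sketch of how the two measures are computed from a validation cross-tabulation follows. The counts are hypothetical round numbers chosen for illustration, not the study's data.

```python
def positive_predictive_value(true_pos, false_pos):
    """Of the people the list flags, the share who self-identify with the group."""
    return true_pos / (true_pos + false_pos)

def sensitivity(true_pos, false_neg):
    """Of the people who self-identify with the group, the share the list flags."""
    return true_pos / (true_pos + false_neg)

# Hypothetical validation counts against self-identified ethnicity.
true_pos, false_pos, false_neg = 90, 10, 90

ppv = positive_predictive_value(true_pos, false_pos)
sens = sensitivity(true_pos, false_neg)
print(f"PPV = {ppv:.2f}, sensitivity = {sens:.2f}")
```

Excluding surnames shared with other groups shrinks the false positives, which raises PPV, while moving genuine group members into the false negatives, which lowers sensitivity.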

Journal ArticleDOI
TL;DR: The error matrix provides an overview of the validity of the available evidence at a glance, and may assist in deciding which interventions to use in clinical practice.
Abstract: Clinical evidence continues to expand and is increasingly difficult to overview. We aimed at conceptualizing a visual assessment tool, i.e., a matrix for overviewing studies and their data in order to assess the clinical evidence at a glance. A four-step matrix was constructed using the three dimensions of systematic error, random error, and design error. Matrix step I ranks the identified studies according to the dimensions of systematic errors and random errors. Matrix step II orders the studies according to the design errors. Matrix step III assesses the three dimensions of errors in studies. Matrix step IV assesses the size and direction of the intervention effect. The application of this four-step matrix is illustrated with two examples: peri-operative beta-blockade initialized in relation to surgery versus placebo for major non-cardiac surgery, and antiarrhythmics for maintaining sinus rhythm after cardioversion of atrial fibrillation. When clinical evidence is deemed both internally and externally valid, the size of the intervention effect is to be assessed. The error matrix provides an overview of the validity of the available evidence at a glance, and may assist in deciding which interventions to use in clinical practice.

Journal ArticleDOI
TL;DR: The overall findings of this study suggest that most selected indicators in the HBSC survey questionnaire have satisfactory test-retest reliability for the students in Beijing.
Abstract: Background: Children’s health and health behaviour are essential for their development and it is important to obtain abundant and accurate information to understand young people’s health and health behaviour. The Health Behaviour in School-aged Children (HBSC) study is among the first large-scale international surveys on adolescent health through self-report questionnaires. So far, more than 40 countries in Europe and North America have been involved in the HBSC study. The purpose of this study is to assess the test-retest reliability of selected items in the Chinese version of the HBSC survey questionnaire in a sample of adolescents in Beijing, China. Methods: A sample of 95 male and female students aged 11 or 15 years old participated in a test and retest with a three weeks interval. Student Identity numbers of respondents were utilized to permit matching of test-retest questionnaires. 23 items concerning physical activity, sedentary behaviour, sleep and substance use were evaluated by using the percentage of response shifts and the single measure Intraclass Correlation Coefficients (ICC) with 95% confidence interval (CI) for all respondents and stratified by gender and age. Items on substance use were only evaluated for school children aged 15 years old. Results: The percentage of no response shift between test and retest varied from 32% for the item on computer use at weekends to 92% for the three items on smoking. Of all the 23 items evaluated, 6 items (26%) showed a moderate reliability, 12 items (52%) displayed a substantial reliability and 4 items (17%) indicated almost perfect reliability. No gender and age group difference of the test-retest reliability was found except for a few items on sedentary behaviour. Conclusions: The overall findings of this study suggest that most selected indicators in the HBSC survey questionnaire have satisfactory test-retest reliability for the students in Beijing. 
Further test-retest studies in a large and diverse sample, as well as validity studies, should be considered for the future Chinese HBSC study.
Background: Health behaviour of young people is a global concern. Currently, in China, a wide range of problems concerning the health behaviour of the youth is emerging along with changes in lifestyle brought about by rapid economic development and globalization [1,2]. So far, only a few national surveys concerning the health behaviour of the Chinese youth have been conducted. In addition to national level research, many studies which investigate a particular health behaviour, or a number of health behaviours and lifestyle traits of young people, have been done by Chinese researchers independently or through a collaborative project with foreign researchers [3-9]. Nevertheless, very few of them can give a comprehensive and comparable portfolio of health behaviour of young Chinese people. Research exploring children’s health behaviours and the factors that influence them is important for the development of effective health education and health promotion programs and policies for young people [10]. Many national and international level studies concerning young people’s health behaviour have been conducted in recent decades. The Health Behaviour in School-aged Children (HBSC) study is among the first large-scale

Journal ArticleDOI
TL;DR: Of the MI approaches investigated, MI using MICE-PMM produced the least biased estimates and better model performance measures; however, this MI approach still produced biased regression coefficient estimates with 75% missingness.
Abstract: The appropriate handling of missing covariate data in prognostic modelling studies is yet to be conclusively determined. A resampling study was performed to investigate the effects of different missing data methods on the performance of a prognostic model. Observed data for 1000 cases were sampled with replacement from a large complete dataset of 7507 patients to obtain 500 replications. Five levels of missingness (ranging from 5% to 75%) were imposed on three covariates using a missing at random (MAR) mechanism. Five missing data methods were applied: a) complete case analysis (CC), b) single imputation using regression switching with predictive mean matching (SI), c) multiple imputation using regression switching imputation, d) multiple imputation using regression switching with predictive mean matching (MICE-PMM) and e) multiple imputation using flexible additive imputation models. A Cox proportional hazards model was fitted to each dataset and estimates for the regression coefficients and model performance measures obtained. CC produced biased regression coefficient estimates and inflated standard errors (SEs) with 25% or more missingness. The underestimated SE after SI resulted in poor coverage with 25% or more missingness. Of the MI approaches investigated, MI using MICE-PMM produced the least biased estimates and better model performance measures. However, this MI approach still produced biased regression coefficient estimates with 75% missingness. Very few differences were seen between the results from all missing data approaches with 5% missingness. However, performing MI using MICE-PMM may be the preferred missing data approach for handling between 10% and 50% MAR missingness.

Journal ArticleDOI
TL;DR: This study confirms the cross-gender construct validity of psychological distress as assessed with the K6 despite differences in the expression of some symptoms in women and in men over the life-course and over time.
Abstract: Background: Psychological distress is a widespread indicator of mental health and mental illness in research and clinical settings. A recurrent finding from epidemiological studies and population surveys is that women report a higher mean level and a higher prevalence of psychological distress than men. These differences may reflect, to some extent, cultural norms associated with the expression of distress in women and men. Assuming that these norms differ across age groups and that they evolve over time, one would expect gender differences in psychological distress to vary over the life-course and over time. The objective of this study was to investigate the construct validity of a psychological distress scale, the K6, across gender in different age groups and over a twelve-year period. Methods: This study is based on data from the Canadian National Population Health Survey (C-NPHS). Psychological distress was assessed with the K6, a scale developed by Kessler and his colleagues. Data were examined through multi-group confirmatory factor analyses. Increasing levels of measurement and structural invariance across gender were assessed cross-sectionally with data from cycle 1 (n = 13019) of the C-NPHS and longitudinally with cycles 1 (1994-1995), 4 (2000-2001) and 7 (2006-2007). Results: Higher levels of measurement and structural invariance across gender were reached only after the constraint of equivalence was relaxed for various parameters of a few items of the K6. Some items had a different pattern of gender non-invariance across age groups and over the course of the study. Gender differences in the expression of psychological distress may vary over the lifespan and over a 12-year period without markedly affecting the construct validity of the K6. Conclusions: This study confirms the cross-gender construct validity of psychological distress as assessed with the K6 despite differences in the expression of some symptoms in women and in men over the life-course and over time. Findings suggest that the higher mean level of psychological distress observed in women reflects a true difference in distress and is unlikely to be gender-biased. Gender differences in psychological distress are an important public health and clinical issue and further research is needed to decipher the factors underlying these differences.

Journal ArticleDOI
TL;DR: The use of the graphical VAR approach for the analysis of electronic diary data leads to a deeper insight into patients' dynamics and dependence structures and could lead to a better understanding of complex psychological and physiological mechanisms in different areas of medical care and research.
Abstract: Background: In recent years, electronic diaries have been increasingly used in medical research and practice to investigate patients' processes and fluctuations in symptoms over time. To model dynamic dependence structures and feedback mechanisms between symptom-relevant variables, a multivariate time series method has to be applied. Methods: We propose to analyse the temporal interrelationships among the variables by a structural modelling approach based on graphical vector autoregressive (VAR) models. We give a comprehensive description of the underlying concepts and explain how the dependence structure can be recovered from electronic diary data by a search over suitable constrained (graphical) VAR models. Results: The graphical VAR approach is applied to the electronic diary data of 35 obese patients with and without binge eating disorder (BED). The dynamic relationships for the two subgroups between eating behaviour, depression, anxiety and eating control are visualized in two path diagrams. Results show that the two subgroups of obese patients with and without BED are distinguishable by the temporal patterns which influence their respective eating behaviours. Conclusion: The use of the graphical VAR approach for the analysis of electronic diary data leads to a deeper insight into patients' dynamics and dependence structures. An increasing use of this modelling approach could lead to a better understanding of complex psychological and physiological mechanisms in different areas of medical care and research.
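
As a rough illustration of the kind of model the abstract refers to, the sketch below fits an unconstrained first-order VAR, y_t = A y_(t-1), to a two-variable series by least squares. It is not the authors' graphical VAR search: the coefficient matrix, the starting values, the noise-free simulation and the function names `simulate_var1` and `fit_var1` are all assumptions made for this example. In a graphical VAR, a zero entry in A (such as the lower-left entry here) would correspond to a missing directed edge in the path diagram.

```python
def simulate_var1(A, y0, steps):
    """Generate a noise-free bivariate VAR(1) series y_t = A y_(t-1).

    A is a hypothetical 2x2 coefficient matrix; real diary data would
    also contain an error term, which is omitted here for clarity.
    """
    series = [list(y0)]
    for _ in range(steps):
        prev = series[-1]
        series.append([A[i][0] * prev[0] + A[i][1] * prev[1] for i in range(2)])
    return series


def fit_var1(series):
    """Least-squares estimate of A: (sum y_t y_(t-1)^T) (sum y_(t-1) y_(t-1)^T)^-1."""
    s10 = [[0.0, 0.0], [0.0, 0.0]]  # cross-products of current and lagged values
    s00 = [[0.0, 0.0], [0.0, 0.0]]  # cross-products of lagged values
    for prev, cur in zip(series[:-1], series[1:]):
        for i in range(2):
            for j in range(2):
                s10[i][j] += cur[i] * prev[j]
                s00[i][j] += prev[i] * prev[j]
    det = s00[0][0] * s00[1][1] - s00[0][1] * s00[1][0]
    inv = [[s00[1][1] / det, -s00[0][1] / det],
           [-s00[1][0] / det, s00[0][0] / det]]
    return [[sum(s10[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical coefficients: variable 1 depends on both lagged variables,
# variable 2 depends only on its own lag (A[1][0] == 0 means "no edge").
A_true = [[0.5, 0.2], [0.0, 0.3]]
series = simulate_var1(A_true, [1.0, 1.0], steps=4)
A_hat = fit_var1(series)
```

Because the simulated series has no noise, `A_hat` reproduces `A_true` up to floating-point error; with real data the estimates would be noisy and a model search of the kind the abstract describes would decide which entries to constrain to zero.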

Journal ArticleDOI
TL;DR: A method framework is presented to inform the design and evaluation of subgrouping research in low back pain, to describe method options when investigating prognostic effects or subgroup treatment effects, and to discuss the strengths and limitations of research methods suitable for the hypothesis-setting phase of subgroup studies.
Abstract: There is considerable clinician and researcher interest in whether the outcomes for patients with low back pain, and the efficiency of the health systems that treat them, can be improved by 'subgrouping research'. Subgrouping research seeks to identify subgroups of people who have clinically important distinctions in their treatment needs or prognoses. Due to a proliferation of research methods and variability in how subgrouping results are interpreted, it is timely to open discussion regarding a conceptual framework for the research designs and statistical methods available for subgrouping studies (a method framework). The aims of this debate article are: (1) to present a method framework to inform the design and evaluation of subgrouping research in low back pain, (2) to describe method options when investigating prognostic effects or subgroup treatment effects, and (3) to discuss the strengths and limitations of research methods suitable for the hypothesis-setting phase of subgroup studies. The method framework proposes six phases for studies of subgroups: studies of assessment methods, hypothesis-setting studies, hypothesis-testing studies, narrow validation studies, broad validation studies, and impact analysis studies. This framework extends and relabels a classification system previously proposed by McGinn et al (2000) as suitable for studies of clinical prediction rules. This extended classification, and its descriptive terms, explicitly anchor research findings to the type of evidence each provides. The inclusive nature of the framework invites appropriate consideration of the results of diverse research designs. Method pathways are described for studies designed to test and quantify prognostic effects or subgroup treatment effects, and examples are discussed. The proposed method framework is presented as a roadmap for conversation amongst researchers and clinicians who plan, stage and perform subgrouping research.
This article proposes a research method framework for studies of subgroups in low back pain. Research designs and statistical methods appropriate for sequential phases in this research are discussed, with an emphasis on those suitable for hypothesis-setting studies of subgroups of people seeking care.

Journal ArticleDOI
TL;DR: Findings suggest that prospective analyses from this cohort are not substantially biased by non-response at the first follow-up assessment, and that bias was greatest in subgroups with small numbers.
Abstract: Nonresponse bias in a longitudinal study could affect the magnitude and direction of measures of association. We identified sociodemographic, behavioral, military, and health-related predictors of response to the first follow-up questionnaire in a large military cohort and assessed the extent to which nonresponse biased measures of association. Data are from the baseline and first follow-up survey of the Millennium Cohort Study. Seventy-six thousand, seven hundred and seventy-five eligible individuals completed the baseline survey and were presumed alive at the time of follow-up; of these, 54,960 (71.6%) completed the first follow-up survey. Logistic regression models were used to calculate inverse probability weights using propensity scores. Characteristics associated with a greater probability of response included female gender, older age, higher education level, officer rank, active-duty status, and a self-reported history of military exposures. Ever smokers, those with a history of chronic alcohol consumption or a major depressive disorder, and those separated from the military at follow-up had a lower probability of response. Nonresponse to the follow-up questionnaire did not result in appreciable bias; bias was greatest in subgroups with small numbers. These findings suggest that prospective analyses from this cohort are not substantially biased by non-response at the first follow-up assessment.
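
The weighting step mentioned in the abstract can be sketched as follows. This is a simplified stand-in, not the study's model: instead of a multivariable logistic regression, it estimates the response propensity as the observed response proportion within each level of a single categorical covariate (which is what a logistic model with only that covariate would give), and the inverse of that proportion is the weight. The strata, counts and the function name `response_weights` are hypothetical.

```python
from collections import defaultdict


def response_weights(records):
    """Return {stratum: inverse-probability weight} from (stratum, responded) pairs.

    The response propensity in each stratum is estimated as
    responders / total, and the weight is its inverse.
    """
    totals = defaultdict(int)
    responders = defaultdict(int)
    for stratum, responded in records:
        totals[stratum] += 1
        if responded:
            responders[stratum] += 1
    # Strata with no responders cannot be weighted and are left out.
    return {s: totals[s] / responders[s] for s in totals if responders[s] > 0}


# Hypothetical baseline cohort: 10 officers (8 respond), 10 enlisted (6 respond).
records = (
    [("officer", True)] * 8 + [("officer", False)] * 2
    + [("enlisted", True)] * 6 + [("enlisted", False)] * 4
)
weights = response_weights(records)

# Each responder stands in for 1/propensity baseline participants, so the
# weighted responder count recovers the baseline cohort size.
weighted_total = sum(weights[stratum] for stratum, responded in records if responded)
```

In this toy cohort the group with the lower response proportion receives the larger weight, which is how weighted analyses of responders are meant to compensate for differential nonresponse.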

Journal ArticleDOI
TL;DR: Recruitment rates varied markedly across the projects despite similar initial strategies, and investigators must continue to find effective ways of reaching and involving diverse and representative samples of primary care providers and practices by building personal connections with, and buy-in from, potential participants.
Abstract: Background: While some research has been conducted examining recruitment methods to engage physicians and practices in primary care research, further research is needed on recruitment methodology as it remains a recurrent challenge and plays a crucial role in primary care research. This paper reviews recruitment strategies, common challenges, and innovative practices from five recent primary care health services research studies in Ontario, Canada. Methods: We used mixed qualitative and quantitative methods to gather data from investigators and/or project staff from five research teams. Team members were interviewed and asked to fill out a brief survey on recruitment methods, results, and challenges encountered during a recent or ongoing project involving primary care practices or physicians. Data analysis included qualitative analysis of interview notes and descriptive statistics generated for each study. Results: Recruitment rates varied markedly across the projects despite similar initial strategies. Common challenges and creative solutions were reported by many of the research teams, including building a sampling frame, developing front-office rapport, adapting recruitment strategies, promoting buy-in and interest in the research question, and training a staff recruiter. Conclusions: Investigators must continue to find effective ways of reaching and involving diverse and representative samples of primary care providers and practices by building personal connections with, and buy-in from, potential participants. Flexible recruitment strategies and an understanding of the needs and interests of potential participants may also facilitate recruitment.

Journal ArticleDOI
TL;DR: Results show that, provided essential and non-waivable conditions for causal inference are met, the CIR is most often inestimable whether through the Prevalence Ratio or the Prevalence Odds Ratio, and that the latter is the measure that consistently yields an appropriate measure of the Incidence Density Ratio.
Abstract: Several papers have discussed which effect measures are appropriate to capture the contrast between exposure groups in cross-sectional studies, and which related multivariate models are suitable. Although some have favored the Prevalence Ratio over the Prevalence Odds Ratio -- thus suggesting the use of log-binomial or robust Poisson instead of the logistic regression models -- this debate is still far from settled and requires close scrutiny. In order to evaluate how accurately true causal parameters such as Incidence Density Ratio (IDR) or the Cumulative Incidence Ratio (CIR) are effectively estimated, this paper presents a series of scenarios in which a researcher happens to find a preset ratio of prevalences in a given cross-sectional study. Results show that, provided essential and non-waivable conditions for causal inference are met, the CIR is most often inestimable whether through the Prevalence Ratio or the Prevalence Odds Ratio, and that the latter is the measure that consistently yields an appropriate measure of the Incidence Density Ratio. Multivariate regression models should be avoided when assumptions for causal inference from cross-sectional data do not hold. Nevertheless, if these assumptions are met, it is the logistic regression model that is best suited for this task as it provides a suitable estimate of the Incidence Density Ratio.
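
To make the two competing effect measures concrete, the sketch below computes the Prevalence Ratio and the Prevalence Odds Ratio from a single 2x2 cross-sectional table. The counts and the function names are hypothetical and chosen only for illustration; the example shows how the two measures diverge when the outcome is common and says nothing about which one better estimates the causal parameters discussed in the abstract.

```python
def prevalence_ratio(a, b, c, d):
    """Prevalence among the exposed divided by prevalence among the unexposed.

    a: exposed with outcome, b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    return (a / (a + b)) / (c / (c + d))


def prevalence_odds_ratio(a, b, c, d):
    """Odds of the outcome among the exposed divided by odds among the unexposed."""
    return (a / b) / (c / d)


# Hypothetical cross-sectional counts: prevalence 40% (exposed) vs 20% (unexposed).
table = (40, 60, 20, 80)
pr = prevalence_ratio(*table)        # 0.40 / 0.20
por = prevalence_odds_ratio(*table)  # (40/60) / (20/80)
```

With these counts the Prevalence Ratio is 2 while the Prevalence Odds Ratio is about 2.67, so the choice between log-binomial or robust Poisson models and logistic regression changes the reported estimate whenever the outcome is not rare.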

Journal ArticleDOI
TL;DR: This paper presents the study protocol for a trial, to evaluate the influence of mobile phone reminders on adherence to first-line antiretroviral treatment in South India.
Abstract: Poor adherence to antiretroviral treatment has been a public health challenge associated with the treatment of HIV. Although different adherence-supporting interventions have been reported, their l ...

Journal ArticleDOI
TL;DR: Better framing of the research question using the PICOT format is independently associated with better overall reporting quality - although the effect is small - and better reporting of key methodologies.
Abstract: Background: Experts recommend formulating a structured research question to guide the research design. However, the basis for this recommendation has not been formally evaluated. The aim of this study was to examine if a structured research question using the PICOT (Population, Intervention, Comparator, Outcome, Timeframe) format is associated with a better reporting quality of randomized controlled trials (RCTs). Methods: We evaluated 89 RCT reports published in three endocrinology journals in 2005 and 2006, the quality of reporting of which was assessed in a previous study. We examined whether the reports stated each of the five elements of a structured research question: population, intervention, comparator, outcome and time-frame. A PICOT score was created with a possible score between 0 and 5. Outcomes were: 1) a 14-point overall reporting quality score (OQS) based on the Consolidated Standards for Reporting Trials; and 2) a 3-point key score (KS), based on allocation concealment, blinding and use of intention-to-treat analysis. We conducted multivariable regression analyses using generalized estimating equations to determine if a higher PICOT score or the use of a structured research question were independently associated with a better reporting quality. Journal of publication, funding source and sample size were identified as factors associated with OQS in our previous report on this dataset, and therefore included in the model. Results: A higher PICOT score was independently associated with OQS (incidence rate ratio (IRR) = 1.021, 95% CI: 1.012 to 1.029) and KS (IRR = 1.142, 95% CI: 1.079 to 1.210). A structured research question was present in 33.7% of the reports and it was associated with a better OQS (IRR = 1.095, 95% CI 1.059-1.132) and KS (IRR = 1.530, 95% CI 1.311-1.786). Conclusions: Better framing of the research question using the PICOT format is independently associated with better overall reporting quality - although the effect is small - and better reporting of key methodologies.

Journal ArticleDOI
TL;DR: The results supported the appropriateness of the weights assigned to response option categories and showed an absence of gender differences in item functioning.
Abstract: Previous studies have analyzed the psychometric properties of the World Health Organization Disability Assessment Schedule II (WHO-DAS II) using classical omnibus measures of scale quality. These analyses are sample dependent and do not model item responses as a function of the underlying trait level. The main objective of this study was to examine the effectiveness of the WHO-DAS II items and their options in discriminating between changes in the underlying disability level by means of item response analyses. We also explored differential item functioning (DIF) in men and women. The participants were 3615 adult general practice patients from 17 regions of Spain, with a first diagnosed major depressive episode. The 12-item WHO-DAS II was administered by the general practitioners during the consultation. We used a non-parametric item response method (Kernel-Smoothing) implemented with the TestGraf software to examine the effectiveness of each item (item characteristic curves) and their options (option characteristic curves) in discriminating between changes in the underlying disability level. We examined composite DIF to know whether women had a higher probability than men of endorsing each item. Item response analyses indicated that the twelve items forming the WHO-DAS II perform very well. All items were determined to provide good discrimination across varying standardized levels of the trait. The items also had option characteristic curves that showed good discrimination, given that each increasing option became more likely than the previous as a function of increasing trait level. No gender-related DIF was found on any of the items. All WHO-DAS II items were very good at assessing overall disability. Our results supported the appropriateness of the weights assigned to response option categories and showed an absence of gender differences in item functioning.

Journal ArticleDOI
TL;DR: GoWell compares the health and wellbeing effects of different approaches to regeneration, generates theory on pathways from regeneration to health and explores the attitudes and responses of residents and other stakeholders to neighbourhood change.
Abstract: There is little robust evidence to test the policy assumption that housing-led area regeneration strategies will contribute to health improvement and reduce social inequalities in health. The GoWell Programme has been designed to measure effects on health and wellbeing of multi-faceted regeneration interventions on residents of disadvantaged neighbourhoods in the city of Glasgow, Scotland. This mixed methods study focused (initially) on 14 disadvantaged neighbourhoods experiencing regeneration. These were grouped by intervention into 5 categories for comparison. GoWell includes a pre-intervention householder survey (n = 6008) and three follow-up repeat-cross sectional surveys held at two or three year intervals (the main focus of this protocol) conducted alongside a nested longitudinal study of residents from 6 of those areas. Self-reported responses from face-to-face questionnaires are analysed along with various routinely produced ecological data and documentary sources to build a picture of the changes taking place, their cost and impacts on residents and communities. Qualitative methods include interviews and focus groups of residents, housing managers and other stakeholders exploring issues such as the neighbourhood context, potential pathways from regeneration to health, community engagement and empowerment. Urban regeneration programmes are 'natural experiments.' They are complex interventions that may impact upon social determinants of population health and wellbeing. Measuring the effects of such interventions is notoriously challenging. GoWell compares the health and wellbeing effects of different approaches to regeneration, generates theory on pathways from regeneration to health and explores the attitudes and responses of residents and other stakeholders to neighbourhood change.

Journal ArticleDOI
TL;DR: Non-death attrition may cause greater bias than death in longitudinal studies; however, although more than a quarter of the oldest participants in the ALSWH died in the 12 years following recruitment, differences from the national population changed only slightly.
Abstract: There are well-established risk factors, such as lower education, for attrition of study participants. Consequently, the representativeness of the cohort in a longitudinal study may deteriorate over time. Death is a common form of attrition in cohort studies of older people. The aim of this paper is to examine the effects of death and other forms of attrition on risk factor prevalence in the study cohort and the target population over time. Differential associations between a risk factor and death and non-death attrition are considered under various hypothetical conditions. Empirical data from the Australian Longitudinal Study on Women's Health (ALSWH) for participants born in 1921-26 are used to identify associations which occur in practice, and national cross-sectional data from Australian Censuses and National Health Surveys are used to illustrate the evolution of bias over approximately ten years. The hypothetical situations illustrate how death and other attrition can theoretically affect changes in bias over time. Between 1996 and 2008, 28.4% of ALSWH participants died, 16.5% withdrew and 10.4% were lost to follow up. There were differential associations with various risk factors, for example, non-English speaking country of birth was associated with non-death attrition but not death whereas being underweight (body mass index < 18.5) was associated with death but not other forms of attrition. Compared to national data, underrepresentation of women with non-English speaking country of birth increased from 3.9% to 7.2% and over-representation of current and ex-smoking increased from 2.6% to 5.8%. Deaths occur in both the target population and study cohort, while other forms of attrition occur only in the study cohort. Therefore non-death attrition may cause greater bias than death in longitudinal studies. However, although more than a quarter of the oldest participants in the ALSWH died in the 12 years following recruitment, differences from the national population changed only slightly.

Journal ArticleDOI
TL;DR: Mailed surveys may provide a suitable alternative option for survey-based research with cancer patients and have promise in securing higher response rates, according to a pilot study.
Abstract: In recent years, response rates to telephone surveys have declined. Online surveys may miss many older and poorer adults. Mailed surveys may have promise in securing higher response rates. In a pilot study, 1200 breast, prostate and colon patients, randomly selected from the Pennsylvania Cancer Registry, were sent surveys in the mail. Incentive amount ($3 vs. $5) and length of the survey (10 pages vs. 16 pages) were randomly assigned. Overall, there was a high response rate (AAPOR RR4 = 64%). Neither the amount of the incentive, nor the length of the survey affected the response rate significantly. Colon cancer surveys were returned at a significantly lower rate (RR4 = 54%), than breast or prostate surveys (RR4 = 71%, and RR4 = 67%, respectively; p < .001 for both comparisons). There were no significant interactions among cancer type, length of survey and incentive amount in their effects on response likelihood. Mailed surveys may provide a suitable alternative option for survey-based research with cancer patients.

Journal ArticleDOI
TL;DR: The rate of mobile only households has been increasing in Australia and is following worldwide trends, but has not reached the high levels seen internationally.
Abstract: To examine the trend of "mobile only" households, and households that have a mobile phone or landline telephone listed in the telephone directory, and to describe these groups by various socio-demographic and health indicators. Representative face-to-face population health surveys of South Australians, aged 15 years and over, were conducted in 1999, 2004, 2006, 2007 and 2008 (n = 14285, response rates = 51.9% to 70.6%). Self-reported information on mobile phone ownership and usage (1999 to 2008) and listings in White Pages telephone directory (2006 to 2008), and landline telephone connection and listings in the White Pages (1999 to 2008), was provided by participants. Additional information was collected on self-reported health conditions and health-related risk behaviours. Mobile only households have been steadily increasing from 1.4% in 1999 to 8.7% in 2008. In terms of sampling frame for telephone surveys, 68.7% of South Australian households in 2008 had at least a mobile phone or landline telephone listed in the White Pages (73.8% in 2006; 71.5% in 2007). The proportion of mobile only households was highest among young people, unemployed, people who were separated, divorced or never married, low income households, low SES areas, rural areas, current smokers, people with current asthma or people in the normal weight range. The proportion with landline or mobile telephone numbers listed in the White Pages telephone directory was highest among older people, married or in a de facto relationship or widowed, low SES areas, rural areas, people classified as overweight, or those diagnosed with arthritis or osteoporosis. The rate of mobile only households has been increasing in Australia and is following worldwide trends, but has not reached the high levels seen internationally (12% to 52%). In general, the effect of mobile telephones on current sampling frames (exclusion or non-listing of mobile only households or not listed in the White Pages directory) may have a low impact on health estimates obtained using telephone surveys. However, researchers need to be aware that mobile only households are distinctly different to households with a landline connection, and the increase in the number of mobile-only households is not uniform across all groups in the community. Listing in the White Pages directory continues to decrease and only a small proportion of mobile only households are listed. Researchers need to be aware of these telephone sampling issues when considering telephone surveys.

Journal ArticleDOI
TL;DR: To meet the needs of a number of research programmes, a new model is required as a matter of importance and should be validated against both retrospective and prospective data, to ensure the predictions it gives are superior to those currently used.
Abstract: Background: Less than one third of publicly funded trials managed to recruit according to their original plan, often resulting in requests for additional funding and/or time extensions. The aim was to identify models which might be useful to a major public funder of randomised controlled trials when estimating likely time requirements for recruiting trial participants. The requirements of a useful model were identified as usability, being based on experience, ability to reflect time trends, accounting for centre recruitment, and contribution to a commissioning decision. Methods: A systematic review of English language articles using MEDLINE and EMBASE. Search terms included: randomised controlled trial, patient, accrual, predict, enrol, models, statistical; Bayes Theorem; Decision Theory; Monte Carlo Method and Poisson. Only studies discussing prediction of recruitment to trials using a modelling approach were included. Information was extracted from articles by one author, and checked by a second, using a pre-defined form. Results: Out of 326 identified abstracts, only 8 met all the inclusion criteria. Of these 8 studies examined, there are five major classes of model discussed: the unconditional model, the conditional model, the Poisson model, Bayesian models and Monte Carlo simulation of Markov models. None of these meet all the pre-identified needs of the funder.

Journal ArticleDOI
TL;DR: A simulation study showed that under the given conditions, Bayesian model averaging had a higher probability of not selecting a redundant variable than stepwise regression and had a similar probability of selecting a true predictor.
Abstract: Automatic variable selection methods are usually discouraged in medical research although we believe they might be valuable for studies where subject matter knowledge is limited. Bayesian model averaging may be useful for model selection but only limited attempts to compare it to stepwise regression have been published. We therefore performed a simulation study to compare stepwise regression with Bayesian model averaging. We simulated data corresponding to five different data generating processes and thirty different values of the effect size (the parameter estimate divided by its standard error). Each data generating process contained twenty explanatory variables in total and had between zero and two true predictors. Three data generating processes were built of uncorrelated predictor variables while two had a mixture of correlated and uncorrelated variables. We fitted linear regression models to the simulated data. We used Bayesian model averaging and stepwise regression respectively as model selection procedures and compared the estimated selection probabilities. The estimated probability of not selecting a redundant variable was between 0.99 and 1 for Bayesian model averaging while approximately 0.95 for stepwise regression when the redundant variable was not correlated with a true predictor. These probabilities did not depend on the effect size of the true predictor. In the case of correlation between a redundant variable and a true predictor, the probability of not selecting a redundant variable was 0.95 to 1 for Bayesian model averaging while for stepwise regression it was between 0.7 and 0.9, depending on the effect size of the true predictor. The probability of selecting a true predictor increased as the effect size of the true predictor increased and leveled out at between 0.9 and 1 for stepwise regression, while it leveled out at 1 for Bayesian model averaging. 
Our simulation study showed that under the given conditions, Bayesian model averaging had a higher probability of not selecting a redundant variable than stepwise regression and had a similar probability of selecting a true predictor. Medical researchers building regression models with limited subject matter knowledge could thus benefit from using Bayesian model averaging.