
Showing papers in "Journal of the Royal Statistical Society: Series A (Statistics in Society)", 2006


Journal ArticleDOI
TL;DR: In this paper, a pseudolikelihood approach for accommodating inverse probability weights in multilevel models with an arbitrary number of levels is implemented by using adaptive quadrature, and a sandwich estimator is used to obtain standard errors that account for stratification and clustering.
Abstract: Summary. Multilevel modelling is sometimes used for data from complex surveys involving multistage sampling, unequal sampling probabilities and stratification. We consider generalized linear mixed models and particularly the case of dichotomous responses. A pseudolikelihood approach for accommodating inverse probability weights in multilevel models with an arbitrary number of levels is implemented by using adaptive quadrature. A sandwich estimator is used to obtain standard errors that account for stratification and clustering. When level 1 weights are used that vary between elementary units in clusters, the scaling of the weights becomes important. We point out that not only variance components but also regression coefficients can be severely biased when the response is dichotomous. The pseudolikelihood methodology is applied to complex survey data on reading proficiency from the American sample of the ‘Program for international student assessment’ 2000 study, using the Stata program gllamm, which can estimate a wide range of multilevel and latent variable models. Performance of pseudo-maximum-likelihood with different methods for handling level 1 weights is investigated in a Monte Carlo experiment. Pseudo-maximum-likelihood estimators of (conditional) regression coefficients perform well for large cluster sizes but are biased for small cluster sizes. In contrast, estimators of marginal effects perform well in both situations. We conclude that caution must be exercised in pseudo-maximum-likelihood estimation for small cluster sizes when level 1 weights are used.
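
For orientation (this is the generic form of such an estimator, not an expression quoted from the paper), the weighted log-pseudolikelihood for a two-level model with level 2 weights \(w_j\) and conditional level 1 weights \(w_{i|j}\) can be written as
\[
\log L_{\mathrm{PL}} \;=\; \sum_j w_j \log \int \prod_i \bigl[f(y_{ij}\mid \mathbf{x}_{ij}, u_j)\bigr]^{\,w_{i|j}}\, \phi(u_j; 0, \psi)\, \mathrm{d}u_j ,
\]
with the integral over the random effect evaluated by adaptive quadrature. One common scaling of the level 1 weights, among the alternatives the paper compares, rescales them to sum to the cluster size, \(w^{*}_{i|j} = w_{i|j}\, n_j / \sum_{i'} w_{i'|j}\).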

580 citations


Journal ArticleDOI
TL;DR: In this paper, the authors quantify and characterize model uncertainty and model choice in adjusting for seasonal and long-term trends in time series models of air pollution and mortality, and compare the modelling approaches with the National Morbidity, Mortality, and Air Pollution Study database which comprises daily time series of several pollutants, weather variables and mortality counts covering the period 1987-2000 for the largest 100 cities in the USA.
Abstract: Summary. Multicity time series studies of particulate matter and mortality and morbidity have provided evidence that daily variation in air pollution levels is associated with daily variation in mortality counts. These findings served as key epidemiological evidence for the recent review of the US national ambient air quality standards for particulate matter. As a result, methodological issues concerning time series analysis of the relationship between air pollution and health have attracted the attention of the scientific community and critics have raised concerns about the adequacy of current model formulations. Time series data on pollution and mortality are generally analysed by using log-linear, Poisson regression models for overdispersed counts with the daily number of deaths as outcome, the (possibly lagged) daily level of pollution as a linear predictor and smooth functions of weather variables and calendar time used to adjust for time-varying confounders. Investigators around the world have used different approaches to adjust for confounding, making it difficult to compare results across studies. To date, the statistical properties of these different approaches have not been comprehensively compared. To address these issues, we quantify and characterize model uncertainty and model choice in adjusting for seasonal and long-term trends in time series models of air pollution and mortality. First, we conduct a simulation study to compare and describe the properties of statistical methods that are commonly used for confounding adjustment. We generate data under several confounding scenarios and systematically compare the performance of the various methods with respect to the mean-squared error of the estimated air pollution coefficient. We find that the bias in the estimates generally decreases with more aggressive smoothing and that model selection methods which optimize prediction may not be suitable for obtaining an estimate with small bias. Second, we apply and compare the modelling approaches with the National Morbidity, Mortality, and Air Pollution Study database which comprises daily time series of several pollutants, weather variables and mortality counts covering the period 1987–2000 for the largest 100 cities in the USA. When applying these approaches to adjusting for seasonal and long-term trends we find that the Study’s estimates for the national average effect of PM10 at lag 1 on mortality vary over approximately a twofold range, with 95% posterior intervals always excluding zero risk.
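
A minimal sketch of the kind of model described above, written in Python with statsmodels (the file name, column names and spline degrees of freedom are illustrative assumptions, not taken from the Study):

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical daily series for one city: deaths, pm10, temperature,
# day of week (dow) and a running time index (column names are assumptions).
city = pd.read_csv("city_daily.csv")
city["pm10_lag1"] = city["pm10"].shift(1)

# Log-linear model for overdispersed counts: a linear lag-1 PM10 term plus
# smooth functions (natural cubic splines) of calendar time and temperature.
model = smf.glm(
    "deaths ~ pm10_lag1 + cr(time, df=98) + cr(temperature, df=6) + C(dow)",
    data=city.dropna(),
    family=sm.families.Poisson(),
)
fit = model.fit(scale="X2")  # Pearson chi-squared scale allows overdispersion
print(fit.params["pm10_lag1"], fit.bse["pm10_lag1"])
```

Varying the degrees of freedom of the calendar-time spline corresponds to the more or less aggressive smoothing compared in the simulation study.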

519 citations


Journal ArticleDOI
TL;DR: The paper defines responsive design and uses examples to illustrate the responsive use of paradata to guide mid-survey decisions affecting the non-response, measurement and sampling variance properties of resulting statistics.
Abstract: Summary. Over the past few years surveys have expanded to new populations, have incorporated measurement of new and more complex substantive issues and have adopted new data collection tools. At the same time there has been a growing reluctance among many household populations to participate in surveys. These factors have combined to present survey designers and survey researchers with increased uncertainty about the performance of any given survey design at any particular point in time. This uncertainty has, in turn, challenged the survey practitioner’s ability to control the cost of data collection and quality of resulting statistics. The development of computer-assisted methods for data collection has provided survey researchers with tools to capture a variety of process data (‘paradata’) that can be used to inform cost–quality trade-off decisions in real time. The ability to monitor continually the streams of process data and survey data creates the opportunity to alter the design during the course of data collection to improve survey cost efficiency and to achieve more precise, less biased estimates. We label such designs ‘responsive designs’. The paper defines responsive design and uses examples to illustrate the responsive use of paradata to guide mid-survey decisions affecting the non-response, measurement and sampling variance properties of resulting statistics.

420 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present an intuitive review of the developments in inverse probability weighting that have led to 'doubly robust' or 'doubly protected' estimators, and contrast these estimators with multiple imputation from both a theoretical and a practical viewpoint.
Abstract: Multiple imputation is now a well-established technique for analysing data sets where some units have incomplete observations. Provided that the imputation model is correct, the resulting estimates are consistent. An alternative, weighting by the inverse probability of observing complete data on a unit, is conceptually simple and involves fewer modelling assumptions, but it is known to be both inefficient (relative to a fully parametric approach) and sensitive to the choice of weighting model. Over the last decade, there has been a considerable body of theoretical work to improve the performance of inverse probability weighting, leading to the development of 'doubly robust' or 'doubly protected' estimators. We present an intuitive review of these developments and contrast these estimators with multiple imputation from both a theoretical and a practical viewpoint.
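
As an illustration of the 'doubly robust' idea, here is a minimal augmented inverse-probability-weighting estimator of a mean with missing outcomes, written in Python with scikit-learn; it is a teaching sketch under simple working models, not the specific estimators reviewed in the paper:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def aipw_mean(X, y, observed):
    """Augmented IPW ('doubly robust') estimate of the mean of y when y is
    missing for some units. X: covariate matrix, observed: 0/1 indicator;
    simple working models are assumed purely for illustration."""
    observed = np.asarray(observed, dtype=float)
    # Weighting model: probability that y is observed, given covariates
    p = LogisticRegression(max_iter=1000).fit(X, observed).predict_proba(X)[:, 1]
    # Outcome model: fitted to complete cases, predicted for every unit
    m = LinearRegression().fit(X[observed == 1], y[observed == 1]).predict(X)
    y_filled = np.where(observed == 1, y, 0.0)  # placeholder where y is missing
    # Consistent if either the weighting model or the outcome model is correct
    return np.mean(observed * y_filled / p + (1.0 - observed / p) * m)
```

The final line combines the weighted complete cases with regression predictions for everyone, which is what gives the estimator its double protection against misspecification of one of the two models.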

235 citations


Journal ArticleDOI
TL;DR: In this article, the authors focus on the National Child Development Study, show how non-response has accumulated over time, distinguish between attrition and wave non-response, and show that the best predictors of non-response at any sweep are generally variables measured at the previous sweep.
Abstract: Summary. There is widespread concern that the cumulative effects of the non-response that is bound to affect any long-running longitudinal study will lead to mistaken inferences about change. We focus on the National Child Development Study and show how non-response has accumulated over time. We distinguish between attrition and wave non-response and show how these two kinds of non-response can be related to a set of explanatory variables. We model the discrete time hazard of non-response and also fit a set of multinomial logistic regressions to the probabilities of different kinds of non-response at a particular sweep. We find that the best predictors of non-response at any sweep are generally variables that are measured at the previous sweep but, although non-response is systematic, much of the variation in it remains unexplained by our models. We consider the implications of our results for both design and analysis.
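
A schematic of how a discrete-time hazard of non-response can be fitted as a logistic regression on a person-sweep file (the layout and variable names below are hypothetical, not the NCDS variables used in the paper):

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per cohort member per sweep, up to and including the sweep of first
# non-response (file and variable names are hypothetical assumptions).
person_sweep = pd.read_csv("person_sweep.csv")

# Discrete-time hazard of non-response with sweep-specific intercepts and
# predictors measured at the previous sweep.
hazard = smf.logit(
    "nonresponse ~ C(sweep) + prev_tenure + prev_health + prev_interest",
    data=person_sweep,
).fit()
print(hazard.summary())
```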

213 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated whether people who split up actually become happier, using the British Household Panel Survey (BHPS) to observe an individual's level of psychological well-being in the years before and after divorce.
Abstract: Divorce is a leap in the dark. The paper investigates whether people who split up actually become happier. Using the British Household Panel Survey, we can observe an individual's level of psychological well-being in the years before and after divorce. Our results show that divorcing couples reap psychological gains from the dissolution of their marriages. Men and women benefit equally. The paper also studies the effects of bereavement, of having dependent children and of remarriage. We measure well-being by using General Health Questionnaire and life satisfaction scores.

170 citations


Journal ArticleDOI
TL;DR: In this paper, the authors explored the consequences of health-related attrition for panel data models of self-assessed health and found that, while health-related attrition exists, it does not appear to distort the magnitudes of the estimated average partial effects of socioeconomic status.
Abstract: This paper considers models of the association between socioeconomic status and self-assessed health (SAH) based on eleven waves of the British Household Panel Survey (BHPS) and the full eight waves of the European Community Household Panel (ECHP). The objective is to explore the consequences of health-related attrition for these models. Attrition may be important as there is a risk of survivorship bias: long-term survivors who remain in the panel are likely to be healthier on average. To address this issue we describe the pattern of health-related attrition revealed by the BHPS and ECHP data. We both test and correct for attrition in our empirical models of the impact of socioeconomic status on self-assessed health. Descriptive evidence shows that there is health-related attrition in the data, with those in poor initial health more likely to drop out, and variable addition tests provide evidence of attrition bias in the panel data models of SAH. Nevertheless a comparison of estimates - based on the balanced sample, the unbalanced sample and corrected for non-response using inverse probability weights - does not show substantive differences in the average partial effects of the variables of interest. So, while health-related attrition exists, it does not appear to distort the magnitudes of the estimated average partial effects of socioeconomic status. Similar findings have been reported concerning the negligible influence of attrition bias in models of various labour market outcomes; we discuss possible explanations for our results. JEL codes: I12, C23.

165 citations


Journal ArticleDOI
TL;DR: A rationale for evidence synthesis is developed that is based on Bayesian decision modelling and expected value of information theory, which stresses not only the need for a lack of bias in estimates of treatment effects but also a lack of bias in assessments of uncertainty.
Abstract: Summary. Alongside the development of meta-analysis as a tool for summarizing research literature, there is renewed interest in broader forms of quantitative synthesis that are aimed at combining evidence from different study designs or evidence on multiple parameters. These have been proposed under various headings: the confidence profile method, cross-design synthesis, hierarchical models and generalized evidence synthesis. Models that are used in health technology assessment are also referred to as representing a synthesis of evidence in a mathematical structure. Here we review alternative approaches to statistical evidence synthesis, and their implications for epidemiology and medical decision-making. The methods include hierarchical models, models informed by evidence on different functions of several parameters and models incorporating both of these features. The need to check for consistency of evidence when using these powerful methods is emphasized. We develop a rationale for evidence synthesis that is based on Bayesian decision modelling and expected value of information theory, which stresses not only the need for a lack of bias in estimates of treatment effects but also a lack of bias in assessments of uncertainty. The increasing reliance of governmental bodies like the UK National Institute for Clinical Excellence on complex evidence synthesis in decision modelling is discussed.
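
For reference, the expected value of perfect information that underpins this rationale is the standard quantity
\[
\mathrm{EVPI} \;=\; \mathbb{E}_{\theta}\Bigl[\max_{d}\,\mathrm{NB}(d,\theta)\Bigr] \;-\; \max_{d}\,\mathbb{E}_{\theta}\bigl[\mathrm{NB}(d,\theta)\bigr],
\]
where \(\mathrm{NB}(d,\theta)\) is the net benefit of decision \(d\) under parameters \(\theta\). A synthesis that understates the uncertainty in \(\theta\) therefore understates the value of collecting further evidence, which is why unbiased assessments of uncertainty matter alongside unbiased effect estimates.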

161 citations


Journal ArticleDOI
TL;DR: A hierarchical Bayesian modelling approach is presented and its use is demonstrated in a relatively small but statistically challenging exercise: the reconstruction of prehistoric climate at Glendalough in Ireland from fossil pollen.
Abstract: Summary. We consider the problem of reconstructing prehistoric climates by using fossil data that have been extracted from lake sediment cores. Such reconstructions promise to provide one of the few ways to validate modern models of climate change. A hierarchical Bayesian modelling approach is presented and its use in inverse inference is demonstrated in a relatively small but statistically challenging exercise: the reconstruction of prehistoric climate at Glendalough in Ireland from fossil pollen. This computationally intensive method extends current approaches by explicitly modelling uncertainty and reconstructing entire climate histories. The statistical issues that are raised relate to the use of compositional data (pollen) with covariates (climate) which are available at many modern sites but are missing for the fossil data. The compositional data arise as mixtures and the missing covariates have a temporal structure. Novel aspects of the analysis include a spatial process model for compositional data, local modelling of lattice data, the use, as a prior, of a random walk with long-tailed increments, a two-stage implementation of the Markov chain Monte Carlo approach and a fast approximate procedure for cross-validation in inverse problems. We present some details of the analysis, contrasting its reconstructions with those generated by a method in use in the palaeoclimatology literature. We suggest that the method provides a basis for resolving important issues in palaeoclimate research, and we draw attention to several challenging statistical issues that remain to be overcome.

141 citations


Journal ArticleDOI
TL;DR: It is concluded that improvements in population educational attainment may not automatically lead to improvements in population health, and that health policies for improving health and reducing health inequalities need to target specific causal pathways.
Abstract: The association of poor education and poor health has been consistently observed in many studies and in various countries. Thus far, studies examining the mechanisms underlying this association have looked at only a limited set of potential pathways. This study simultaneously examines six distinctive pathways, which have been hypothesized to link education and health and have found support in previous studies. A causal analysis of education and health was performed using structural equation models. Data were used from six phases of the National Child Development Study, which is based on following up an initial sample of 17,416 children who were born in 1958. The association between education and health appears to be explained by a combination of mechanisms: adolescent health and adult health behaviours for men and women, adult social class among men and parental social class among women. We conclude that improvements in population educational attainment may not automatically lead to improvements in population health, and that health policies for improving health and reducing health inequalities need to target specific causal pathways.

140 citations


Journal ArticleDOI
TL;DR: In this paper, the authors analyse patterns of consent and consent bias in the context of a large general household survey, the ‘Improving survey measurement of income and employment’ survey, also addressing issues that arise when there are multiple consent questions.
Abstract: We analyse patterns of consent and consent bias in the context of a large general household survey, the ‘Improving survey measurement of income and employment’ survey, also addressing issues that arise when there are multiple consent questions. A multivariate probit regression model for four binary outcomes with two incidental truncations is used. We show that there are biases in consent to data linkage with benefit and tax credit administrative records that are held by the Department for Work and Pensions, and with wage and employment data held by employers. There are also biases in respondents’ willingness and ability to supply their national insurance number. The biases differ according to the question that is considered. We also show that modelling questions on consent independently rather than jointly may lead to misleading inferences about consent bias. A positive correlation between unobservable individual factors affecting consent to Department for Work and Pensions record linkage and consent to employer record linkage is suggestive of a latent individual consent propensity.

Journal ArticleDOI
TL;DR: This article showed that the success of a transformation may be judged solely in terms of how closely the total error follows a Gaussian distribution, which avoids the complexity of separately evaluating pure errors and random effects.
Abstract: Summary. For a univariate linear model, the Box–Cox method helps to choose a response transformation to ensure the validity of a Gaussian distribution and related assumptions. The desire to extend the method to a linear mixed model raises many vexing questions. Most importantly, how do the distributions of the two sources of randomness (pure error and random effects) interact in determining the validity of assumptions? For an otherwise valid model, we prove that the success of a transformation may be judged solely in terms of how closely the total error follows a Gaussian distribution. Hence the approach avoids the complexity of separately evaluating pure errors and random effects. The extension of the transformation to the mixed model requires an exploration of its potential effect on estimation and inference of the model parameters. Analysis of longitudinal pulmonary function data and Monte Carlo simulations illustrate the methodology discussed.
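
For reference (a standard statement of the set-up rather than a quotation from the paper), the Box–Cox family and the transformed mixed model are
\[
y^{(\lambda)} = \begin{cases} (y^{\lambda}-1)/\lambda, & \lambda \neq 0,\\ \log y, & \lambda = 0, \end{cases}
\qquad
y^{(\lambda)} = X\beta + Zb + \varepsilon,
\]
and the criterion discussed above concerns the total error \(Zb + \varepsilon\): the transformation is judged successful if this sum is close to Gaussian, without separate checks on \(b\) and \(\varepsilon\).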


Journal ArticleDOI
TL;DR: This paper applied a multinomial logistic model to the latent response and investigated how class membership relates to demographic and life style factors, political beliefs and religiosity over time, finding that marijuana use and attitudes are well summarized by a four-class model.
Abstract: Summary. Analysing the use of marijuana is challenging in part because there is no widely accepted single measure of individual use. Similarly, there is no single response variable that effectively captures attitudes toward its social and moral acceptability. One approach is to view the joint distribution of multiple use and attitude indicators as a mixture of latent classes. Pooling items from the annual ‘Monitoring the future’ surveys of American high school seniors from 1977 to 2001, we find that marijuana use and attitudes are well summarized by a four-class model. Secular trends in class prevalences over this period reveal major shifts in use and attitudes. Applying a multinomial logistic model to the latent response, we investigate how class membership relates to demographic and life style factors, political beliefs and religiosity over time. Inferences about the parameters of the latent class logistic model are obtained by a combination of maximum likelihood and Bayesian techniques.
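
In generic notation (not the paper's own), a latent class regression model of this type combines a finite mixture over \(C\) classes for the indicators with a multinomial logit for class membership:
\[
P(\mathbf{y}_i \mid \mathbf{x}_i) \;=\; \sum_{c=1}^{C} \pi_c(\mathbf{x}_i) \prod_{j=1}^{J} P(y_{ij}\mid c),
\qquad
\pi_c(\mathbf{x}_i) \;=\; \frac{\exp(\mathbf{x}_i^{\top}\boldsymbol{\gamma}_c)}{\sum_{k=1}^{C}\exp(\mathbf{x}_i^{\top}\boldsymbol{\gamma}_k)},
\]
where the \(y_{ij}\) are the use and attitude items and \(\mathbf{x}_i\) contains the demographic, life style, political and religiosity covariates.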

Journal ArticleDOI
TL;DR: In this paper, the authors analyzed the Irish college degree applications data from the year 2000 using mixture models based on ranked data models to investigate the types of application behavior exhibited by college applicants and found that applicants form groups according to both the discipline and geographical location of their course choices.
Abstract: The Irish college admissions system involves prospective students listing up to ten courses in order of preference on their application. Places in third level educational institutions are subsequently offered to the applicants on the basis of both their preferences and their final second level examination results. The college applications system is a large area of public debate in Ireland. Detractors suggest the process creates artificial demand for ‘high profile’ courses, causing applicants to ignore their vocational callings. Supporters argue that the system is impartial and transparent. The Irish college degree applications data from the year 2000 is analyzed using mixture models based on ranked data models to investigate the types of application behavior exhibited by college applicants. The results of this analysis show that applicants form groups according to both the discipline and geographical location of their course choices. In addition, there is evidence of the suggested ‘points race’ for high profile courses. Finally, gender emerges as an influential factor when studying course choice behavior.
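
One standard component model for such ranked preference data is the Plackett–Luce model (used here only to fix ideas; the paper's mixture components may differ in detail). In group \(g\) with course support parameters \(p_{g1},\ldots,p_{gK}\), an applicant who lists courses \(c_1,\ldots,c_M\) in order does so with probability
\[
P(c_1,\ldots,c_M \mid g) \;=\; \prod_{t=1}^{M} \frac{p_{g c_t}}{\sum_{k \notin \{c_1,\ldots,c_{t-1}\}} p_{g k}},
\qquad
P(c_1,\ldots,c_M) \;=\; \sum_{g} \tau_g\, P(c_1,\ldots,c_M \mid g),
\]
with mixing proportions \(\tau_g\) giving the relative sizes of the applicant groups.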

Journal ArticleDOI
TL;DR: In this article, the authors examined household and area effects on the incidence of total property crimes and burglaries and thefts using data from the 2000 British Crime Survey and the 1991 Census Small Area Statistics.
Abstract: This study examines household and area effects on the incidence of total property crimes and burglaries and thefts. It uses data from the 2000 British Crime Survey and the 1991 Census Small Area Statistics. Results are obtained from estimated random effects multilevel models, with an assumed negative binomial distribution of the dependent variable. Both household and area characteristics, as well as selected interactions, explain a significant portion of the variation in property crimes. There are also a large number of significant between-area random variances and covariances of household characteristics. The estimated fixed and random effects may assist in advancing victimisation theory. The methods have potential for developing a better understanding of factors that give rise to crime and so assist in framing crime prevention policy.
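
A simplified single-level version of such a count model, written in Python with statsmodels, is sketched below; the variable names are hypothetical and, unlike the paper's models, it omits the area-level random effects entirely:

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical household-level file with a property-crime count per household.
hh = pd.read_csv("bcs_households.csv")

# Negative binomial regression with household and area covariates plus one
# interaction; alpha is fixed here only to keep the illustration short.
nb = smf.glm(
    "property_crimes ~ tenure + household_type + area_deprivation"
    " + tenure:area_deprivation",
    data=hh,
    family=sm.families.NegativeBinomial(alpha=1.0),
).fit()
print(nb.summary())
```

The random intercepts and random coefficients reported in the paper would require a mixed (multilevel) extension of this model.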

Journal ArticleDOI
TL;DR: This paper explored the relationship between the day of the week on which a survey respondent is interviewed and their self-reported job satisfaction and mental health scores by using data from the British Household Panel Survey.
Abstract: The paper explores the relationship between the day of the week on which a survey respondent is interviewed and their self-reported job satisfaction and mental health scores by using data from the British Household Panel Survey. Evidence presented here confirms that self-reported levels of job satisfaction and subjective levels of mental distress systematically vary according to the day of the week on which respondents are interviewed even when controlling for other observed and unobserved characteristics. However, we find that the main conclusions from previous studies of the determinants of job satisfaction and mental well-being are robust to the inclusion of day-of-interview controls.

Journal ArticleDOI
TL;DR: This paper investigated the life cycle relationship of work and family life in Britain based on the British Household Panel Survey and found that transitions in and out of employment for men are relatively independent of other transitions.
Abstract: The paper investigates the life-cycle relationship of work and family life in Britain based on the British Household Panel Survey. Using hazard regression techniques we estimate a five-equation model, which includes birth events, union formation, union dissolution, employment and non-employment events. We find that transitions in and out of employment for men are relatively independent of other transitions. In contrast, there are strong links between employment of females, having children and union formation. By undertaking a detailed microsimulation analysis, we show that different levels of labour force participation by females do not necessarily lead to large changes in fertility events. Changes in union formation and fertility events, in contrast, have larger effects on employment.

Journal ArticleDOI
TL;DR: In this paper, a combination of analytical and simulation-based techniques within the Bayesian framework is proposed to determine sample size in clinical trials under normal likelihoods and at the substantive testing stage of a financial audit where normality is not an appropriate assumption.
Abstract: The problem motivating this article is the determination of sample size in clinical trials under normal likelihoods and at the substantive testing stage of a financial audit where normality is not an appropriate assumption. A combination of analytical and simulation-based techniques within the Bayesian framework is proposed. The framework accommodates two different prior distributions: one is the general purpose 'fitting' prior distribution used in the Bayesian analysis, and the other is the expert subjective 'sampling' prior distribution, which is believed to generate the parameter values that in turn generate the data. We obtain several theoretical results; one key result is that typical non-informative prior distributions lead to very small sample sizes. On the other hand, a very informative prior distribution may lead to either a very small or a very large sample size depending on the location of the centre of the prior distribution and the hypothesized value of the parameter. The methods developed here are quite general and can be applied to other sample size determination (SSD) problems. A number of numerical illustrations which bring out many other aspects of the optimum sample size are given.
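
A minimal Monte Carlo sketch of the fitting-prior/sampling-prior idea for a normal mean with known variance (purely illustrative, with arbitrary numerical settings; it is not the paper's clinical trial or audit formulation): draw the parameter from the sampling prior, simulate data of size n, analyse them under the fitting prior and pick the smallest n meeting an accuracy criterion.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 1.0                    # known sampling standard deviation
fit_mu, fit_sd = 0.0, 10.0     # vague 'fitting' prior used in the analysis
samp_mu, samp_sd = 0.5, 0.1    # informative 'sampling' prior generating theta

def assurance(n, reps=4000):
    """Probability, over the sampling prior, that the 95% posterior interval
    computed under the fitting prior excludes the hypothesized value 0."""
    theta = rng.normal(samp_mu, samp_sd, size=reps)
    ybar = rng.normal(theta, sigma / np.sqrt(n))
    post_var = 1.0 / (1.0 / fit_sd**2 + n / sigma**2)
    post_mean = post_var * (fit_mu / fit_sd**2 + n * ybar / sigma**2)
    half_width = 1.96 * np.sqrt(post_var)
    return np.mean((post_mean - half_width > 0) | (post_mean + half_width < 0))

# Smallest n giving at least 90% assurance of excluding the hypothesized value
n = next(n for n in range(2, 1000) if assurance(n) >= 0.9)
print(n)
```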


Journal ArticleDOI
TL;DR: In this paper, the authors estimate a joint model of the formation and dissolution of cohabiting and marital unions among British women who were born in 1970 and use a multilevel simultaneous equations event history model to allow for residual correlation between the hazards of moving from an unpartnered state into cohabitation or marriage.
Abstract: We estimate a joint model of the formation and dissolution of cohabiting and marital unions among British women who were born in 1970. The focus of the analysis is the effect of previous cohabitation and marriage on subsequent partnership transitions. We use a multilevel simultaneous equations event history model to allow for residual correlation between the hazards of moving from an unpartnered state into cohabitation or marriage, converting a cohabiting union into marriage and dissolution of either form of union. A simultaneous modelling approach allows for the joint determination of these transitions, which may otherwise bias estimates of the effects of previous partnership outcomes on later transitions.

Journal ArticleDOI
TL;DR: In this article, the authors present the results of a Dutch study of the consequences of replacing home interviews conducted by trained interviewers with Internet-delivered interviews in a survey on fraud in the area of disability benefits.
Abstract: Summary. In the Netherlands, there is a research tradition that measures fraud against regulations by interviewing eligible individuals using a survey. In these studies the sensitive questions about fraud are posed by using a randomized response method. The paper describes the results of a Dutch study into the consequences of replacing home interviews by trained interviewers with Internet-delivered interviews in a survey on fraud in the area of disability benefits. Both surveys used computer-assisted self-interviews with randomized response questions. This study has three goals: first to present the research tradition that makes use of randomized response, second to compare the results of home interviews and the Internet survey and finally to introduce an adapted weighted logistic regression method to test the relationship between the probability of fraud and explanatory variables. The results show that there are no systematic differences between modes of interview, either for estimates of the prevalence of fraud or for the identification of associated variables. These outcomes result in the conclusion that the Internet survey is a useful and cost-effective instrument for measuring fraud in a population, and that it is unlikely that replacing home interviews with the Internet survey will result in a significant break with tradition.
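
For concreteness, the simplest textbook randomized response design is Warner's: each respondent answers either the sensitive statement (with known probability p) or its negation, so only the overall proportion of 'yes' answers is observed. The estimator below is that classic case and is only a stand-in; the Dutch surveys described in the paper may use a different randomizing device and, as noted, an adapted weighted logistic regression for covariate effects.

```python
def warner_prevalence(yes_rate, p):
    """Classic Warner randomized response estimator: each respondent answers
    the sensitive statement with known probability p and its negation with
    probability 1 - p, so only the overall 'yes' rate is observed (p != 0.5).
    A textbook stand-in, not necessarily the design used in the Dutch surveys
    discussed in the paper."""
    return (yes_rate - (1.0 - p)) / (2.0 * p - 1.0)

# Example: 30% 'yes' answers under p = 0.8 implies roughly 17% prevalence.
print(warner_prevalence(0.30, 0.8))
```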

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the extent of state dependence in India, distinguishing this from family-level risk factors that are common to siblings, and found that there is a significant degree of state dependence in each of the three regions.
Abstract: Data from a range of environments indicate that the incidence of death is not randomly distributed across families but, rather, that there is a clustering of death among siblings. A natural explanation of this would be that there are (observed or unobserved) differences across families, e.g. in genetic frailty, education or living standards. Another hypothesis that is of considerable interest for both theory and policy is that there is a causal process whereby the death of a child influences the risk of death of the succeeding child in the family. Drawing language from the literature on the economics of unemployment, the causal effect is referred to here as state dependence (or scarring). The paper investigates the extent of state dependence in India, distinguishing this from family-level risk factors that are common to siblings. It offers some methodological innovations on previous research. Estimates are obtained for each of three Indian states, which exhibit dramatic differences in socio-economic and demographic variables. The results suggest a significant degree of state dependence in each of the three regions. Eliminating scarring, it is estimated, would reduce the incidence of infant mortality (among children who are born after the first child) by 9.8% in the state of Uttar Pradesh, 6.0% in West Bengal and 5.9% in Kerala.
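
A generic form of such a scarring model (standard in this literature rather than the paper's exact specification) is a dynamic binary response model with a family-specific random effect: for the \(j\)th birth in family \(i\),
\[
P(d_{ij}=1 \mid d_{i,j-1}, \mathbf{x}_{ij}, \alpha_i) \;=\; F\bigl(\mathbf{x}_{ij}^{\top}\beta + \gamma\, d_{i,j-1} + \alpha_i\bigr),
\]
where \(d_{ij}\) indicates death of the \(j\)th child, \(\gamma\) captures state dependence (scarring), \(\alpha_i\) captures unobserved family-level frailty, and separating the two requires careful treatment of the initial condition for the first-born child.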


Journal ArticleDOI
TL;DR: In this article, the authors developed a method to empirically distinguish between two explanations for a bias in results based on only survey data: (1) selectivity due to related unobserved determinants of unemployment durations and non-response, and (2) a causal effect of a job exit on non-response.
Abstract: Social surveys are often used to estimate unemployment duration distributions. Survey non-response may then cause a bias. We study this using a unique dataset that combines survey information of individual workers with administrative records of the same workers. The latter provide information on unemployment durations and personal characteristics of all survey respondents and non-respondents. We develop a method to empirically distinguish between two explanations for a bias in results based on only survey data: (1) selectivity due to related unobserved determinants of unemployment durations and non-response, and (2) a causal effect of a job exit on non-response. The latter may occur even in fully homogeneous populations. The methodology exploits variation in the timing of the duration outcome relative to the survey moment. The results show evidence for both explanations. We discuss implications for standard methods to deal with non-response bias.

Journal ArticleDOI
TL;DR: Three approaches for fitting the response models and estimating parameters of substantive interest and their standard errors are compared: a modified conditional likelihood method, an EM procedure with the Louis formula and a Bayesian approach using Markov chain Monte Carlo methods.
Abstract: Summary. We present a general method of adjustment for non-ignorable non-response in studies where one or more further attempts are made to contact initial non-responders. A logistic regression model relates the probability of response at each contact attempt to covariates and outcomes of interest. We assume that the effect of these covariates and outcomes on the probability of response is the same at all contact attempts. Knowledge of the number of contact attempts enables estimation of the model by using only information from the respondents and the number of non-responders. Three approaches for fitting the response models and estimating parameters of substantive interest and their standard errors are compared: a modified conditional likelihood method in which the fitted inverse probabilities of response are used in weighted analyses for the outcomes of interest, an EM procedure with the Louis formula and a Bayesian approach using Markov chain Monte Carlo methods. We further propose the creation of several sets of weights to incorporate uncertainty in the probability weights in subsequent analyses. Our methods are applied as a sensitivity analysis to a postal survey of symptoms in Persian Gulf War veterans and other servicemen.
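
Schematically (in generic notation rather than the paper's), the response model has the form
\[
\operatorname{logit} P(\text{response at attempt } t \mid \text{no response at attempts } 1,\ldots,t-1) \;=\; \alpha_t + \mathbf{x}_i^{\top}\boldsymbol{\beta} + \mathbf{y}_i^{\top}\boldsymbol{\gamma},
\]
where the intercepts \(\alpha_t\) vary across contact attempts but the covariate and outcome effects \(\boldsymbol{\beta}\) and \(\boldsymbol{\gamma}\) are assumed common to all attempts, which is what makes the model estimable from the respondents together with the count of final non-responders.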

Journal ArticleDOI
TL;DR: The authors used data from the British Household Panel Survey from 1994 to 2003 to assess the long-term effectiveness of refusal conversion procedures in terms of sample sizes, sample composition and data quality.
Abstract: Survey organizations often attempt to 'convert' sample members who refuse to take part in a survey. Persuasive techniques are used in an effort to change the refusers' minds and to agree to an interview. This is done to improve the response rate and, possibly, to reduce non-response bias. However, refusal conversion attempts are expensive and must be justified. Previous studies of the effects of refusal conversion attempts are few and have been restricted to cross-sectional surveys. The criteria for 'success' of a refusal conversion attempt are different in a longitudinal survey, where for many purposes the researcher requires complete data over multiple waves. The paper uses data from the British Household Panel Survey from 1994 to 2003 to assess the long-term effectiveness of refusal conversion procedures in terms of sample sizes, sample composition and data quality.

Journal ArticleDOI
TL;DR: In this paper, a capture-recapture technique is used to estimate the size of difficult-to-count human populations using only the number of times that particular individuals were encountered during the study period.
Abstract: Summary. Capture–recapture techniques are widely used to estimate the size of difficult-to-count human populations. Applications often focus on the overlap between two or more samples, but another type of data that is encountered in human studies involves only the number of times that particular individuals were encountered in the study period. We present a method for estimating the population size in this situation. This method is simple and technically accessible and allows for entries and exits by individuals and for a difference between probabilities of initial and subsequent contacts. We apply the method to arrest data on male clients of prostitute women in Vancouver.
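
As a simple point of comparison, a Chao-type lower bound can be computed directly from the encounter counts; unlike the method developed in the paper, this sketch ignores entries and exits and assumes homogeneous contact probabilities:

```python
from collections import Counter

def chao_estimate(encounter_counts):
    """Chao-type lower-bound population size estimate from the number of times
    each observed individual was encountered. Assumes a closed population and
    homogeneous contact probabilities, unlike the richer model in the paper."""
    freq = Counter(encounter_counts)   # freq[k] = individuals seen exactly k times
    f1, f2 = freq.get(1, 0), freq.get(2, 0)
    if f2 == 0:
        raise ValueError("need at least one individual encountered exactly twice")
    return len(encounter_counts) + f1 * f1 / (2.0 * f2)

# Hypothetical encounter counts for the individuals actually observed
print(chao_estimate([1, 1, 1, 1, 2, 2, 3, 1, 1, 2]))
```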