Journal ArticleDOI

Web-Based Network Sampling: Efficiency and Efficacy of Respondent-Driven Sampling for Online Research

01 Aug 2008-Sociological Methods & Research (SAGE Publications)-Vol. 37, Iss: 1, pp 105-134
TL;DR: Web-based RDS (WebRDS) is found to be highly efficient and effective, and methods for testing the validity of the assumptions required by RDS estimation are presented.
Abstract: This study tests the feasibility, effectiveness, and efficiency of respondent-driven sampling (RDS) as a Web-based sampling method. Web-based RDS (WebRDS) is found to be highly efficient and effective. The online nature of WebRDS allows referral chains to progress very quickly, such that studies with large samples can be expected to proceed up to 20 times faster than with traditional sampling methods. Additionally, the unhidden nature of the study population allows comparison of RDS estimators to institutional data. Results indicate that RDS estimates are reasonable but not precise. This is likely due to bias associated with the random recruitment assumption and small sample size of the study. Finally, this article presents methods for testing the validity of assumptions required by RDS estimation.
Citations
Proceedings ArticleDOI
22 Jan 2006
TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, including those related to the WWW.
Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

7,116 citations

Journal ArticleDOI
TL;DR: A wide range of non-probability designs exist and are being used in various settings, including case control studies, clinical trials, evaluation research, and more.
Abstract: Survey researchers routinely conduct studies that use different methods of data collection and inference. But for at least the past 60 years, the probability sampling framework has been used in most surveys. More recently, concerns about coverage and nonresponse coupled with rising costs have led some to wonder whether non-probability sampling methods might be an acceptable alternative, at least under some conditions (Groves 2006; Savage and Burrows 2007). A wide range of non-probability designs exist and are being used in various settings, including case control studies, clinical trials, evaluation research

539 citations

Journal ArticleDOI
TL;DR: In this article, the authors evaluate three critical sensitivities of the estimators: to bias induced by the initial sample, to uncontrollable features of respondent behavior, and to the without-replacement structure of sampling.
Abstract: Respondent-Driven Sampling (RDS) employs a variant of a link-tracing network sampling strategy to collect data from hard-to-reach populations. By tracing the links in the underlying social network, the process exploits the social structure to expand the sample and reduce its dependence on the initial (convenience) sample. The current estimators of population averages make strong assumptions in order to treat the data as a probability sample. We evaluate three critical sensitivities of the estimators: to bias induced by the initial sample, to uncontrollable features of respondent behavior, and to the without-replacement structure of sampling. Our analysis indicates: (1) that the convenience sample of seeds can induce bias, and the number of sample waves typically used in RDS is likely insufficient for the type of nodal mixing required to obtain the reputed asymptotic unbiasedness; (2) that preferential referral behavior by respondents leads to bias; (3) that when a substantial fraction of the target population is sampled the current estimators can have substantial bias. This paper sounds a cautionary note for the users of RDS. While current RDS methodology is powerful and clever, the favorable statistical properties claimed for the current estimates are shown to be heavily dependent on often unrealistic assumptions. We recommend ways to improve the methodology.

495 citations

Posted Content
TL;DR: It is indicated that the convenience sample of seeds can induce bias, and the number of sample waves typically used in RDS is likely insufficient for the type of nodal mixing required to obtain the reputed asymptotic unbiasedness.
Abstract: Respondent-Driven Sampling (RDS) employs a variant of a link-tracing network sampling strategy to collect data from hard-to-reach populations. By tracing the links in the underlying social network, the process exploits the social structure to expand the sample and reduce its dependence on the initial (convenience) sample. The primary goal of RDS is typically to estimate population averages in the hard-to-reach population. The current estimates make strong assumptions in order to treat the data as a probability sample. In particular, we evaluate three critical sensitivities of the estimators: to bias induced by the initial sample, to uncontrollable features of respondent behavior, and to the without-replacement structure of sampling. This paper sounds a cautionary note for the users of RDS. While current RDS methodology is powerful and clever, the favorable statistical properties claimed for the current estimates are shown to be heavily dependent on often unrealistic assumptions.

434 citations


Cites background from "Web-Based Network Sampling: Efficie..."

  • ...Wejnert and Heckathorn (2008) introduce a novel and important means of evaluating RDS estimation....


Journal ArticleDOI
TL;DR: Investigating the performance of RDS by simulating sampling from 85 known, network populations finds that RDS is substantially less accurate than generally acknowledged and that reported RDS confidence intervals are misleadingly narrow.
Abstract: Respondent-driven sampling (RDS) is a network-based technique for estimating traits in hard-to-reach populations, for example, the prevalence of HIV among drug injectors. In recent years RDS has been used in more than 120 studies in more than 20 countries and by leading public health organizations, including the Centers for Disease Control and Prevention in the United States. Despite the widespread use and growing popularity of RDS, there has been little empirical validation of the methodology. Here we investigate the performance of RDS by simulating sampling from 85 known, network populations. Across a variety of traits we find that RDS is substantially less accurate than generally acknowledged and that reported RDS confidence intervals are misleadingly narrow. Moreover, because we model a best-case scenario in which the theoretical RDS sampling assumptions hold exactly, it is unlikely that RDS performs any better in practice than in our simulations. Notably, the poor performance of RDS is driven not by the bias but by the high variance of estimates, a possibility that had been largely overlooked in the RDS literature. Given the consistency of our results across networks and our generous sampling conditions, we conclude that RDS as currently practiced may not be suitable for key aspects of public health surveillance where it is now extensively applied.

371 citations

References
Journal ArticleDOI
04 Jun 1998-Nature
TL;DR: Simple network models that can be tuned between completely regular and completely random topologies, by 'rewiring' regular networks to introduce increasing amounts of disorder, are explored; these systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs.
Abstract: Networks of coupled dynamical systems have been used to model biological oscillators, Josephson junction arrays, excitable media, neural networks, spatial games, genetic control networks and many other self-organizing systems. Ordinarily, the connection topology is assumed to be either completely regular or completely random. But many biological, technological and social networks lie somewhere between these two extremes. Here we explore simple models of networks that can be tuned through this middle ground: regular networks 'rewired' to introduce increasing amounts of disorder. We find that these systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs. We call them 'small-world' networks, by analogy with the small-world phenomenon (popularly known as six degrees of separation). The neural network of the worm Caenorhabditis elegans, the power grid of the western United States, and the collaboration graph of film actors are shown to be small-world networks. Models of dynamical systems with small-world coupling display enhanced signal-propagation speed, computational power, and synchronizability. In particular, infectious diseases spread more easily in small-world networks than in regular lattices.

39,297 citations

Book
01 Sep 1985

7,736 citations

Proceedings ArticleDOI
22 Jan 2006
TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, including those related to the WWW.
Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

7,116 citations

Journal ArticleDOI
TL;DR: A new variant of chain-referral sampling, respondent-driven sampling, is introduced that employs a dual system of structured incentives to overcome some of the deficiencies of such samples and discusses how respondent-driven sampling can improve both network sampling and ethnographic investigation.
Abstract: A population is “hidden” when no sampling frame exists and public acknowledgment of membership in the population is potentially threatening. Accessing such populations is difficult because standard probability sampling methods produce low response rates and responses that lack candor. Existing procedures for sampling these populations, including snowball and other chain-referral samples, the key-informant approach, and targeted sampling, introduce well-documented biases into their samples. This paper introduces a new variant of chain-referral sampling, respondent-driven sampling, that employs a dual system of structured incentives to overcome some of the deficiencies of such samples. A theoretic analysis, drawing on both Markov-chain theory and the theory of biased networks, shows that this procedure can reduce the biases generally associated with chain-referral methods. The analysis includes a proof showing that even though sampling begins with an arbitrarily chosen set of initial subjects, as do most chain-referral samples, the composition of the ultimate sample is wholly independent of those initial subjects. The analysis also includes a theoretic specification of the conditions under which the procedure yields unbiased samples. Empirical results, based on surveys of 277 active drug injectors in Connecticut, support these conclusions. Finally, the conclusion discusses how respondent-driven sampling can improve both network sampling and ethnographic investigation.

3,950 citations

Reference EntryDOI
15 Aug 2006
TL;DR: In this paper, the authors present two different approaches to snowball sampling: the first is to ask a person to inform potential subjects about the research project and share the investigator's contact information, and then it is up to the potential subjects to contact the investigator.
Abstract: Snowball sampling is a recruitment method in which an investigator enlists the help of a research subject in identifying, and possibly recruiting, additional subjects. It is useful when the investigator may not have access to a population of potential subjects who meet inclusion criteria, which may often be stigmatizing. There are two different approaches to snowball recruitment. In the first method, the investigator asks a person to inform potential subjects about the research project and share the investigator's contact information. It is then up to the potential subjects to contact the investigator. The informed consent process should make it clear that agreeing to contact others is not a requisite for participating in the research. Also, the researcher should not offer a reward or a "bounty" for recruiting subjects. This method rarely presents ethical issues for the IRB. The second method is more common but problematic. The investigator asks the first recruited subject for contact information about potential subjects and then contacts them directly. The major ethical issue is that the first subject may be divulging information about other people that they would prefer to be kept confidential. And it is especially problematic when the referring individual is a person of authority in the community. The IRB would evaluate very carefully the context in which this approach to recruitment is occurring. The Boise State University IRB has a good discussion of snowball sampling. For additional discussion, see the discussion on the NSF site.

2,795 citations


"Web-Based Network Sampling: Efficie..." refers background in this paper

  • ...Traditionally, the nonrandomness of social network connections has led such samples to be viewed as convenience samples from which unbiased estimation is not possible (Berg 1988)....

