Showing papers on "Sampling (statistics) published in 2011"

PDF

Open Access

Journal Article•DOI•

Sampling-based algorithms for optimal motion planning

[...]

Sertac Karaman¹, Emilio Frazzoli¹•Institutions (1)

01 Jun 2011-The International Journal of Robotics Research

TL;DR: In this paper, the authors studied the asymptotic behavior of the cost of the solution returned by stochastic sampling-based path planning algorithms as the number of samples increases.

...read moreread less

Abstract: During the last decade, sampling-based path planning algorithms, such as probabilistic roadmaps (PRM) and rapidly exploring random trees (RRT), have been shown to work well in practice and possess theoretical guarantees such as probabilistic completeness. However, little effort has been devoted to the formal analysis of the quality of the solution returned by such algorithms, e.g. as a function of the number of samples. The purpose of this paper is to fill this gap, by rigorously analyzing the asymptotic behavior of the cost of the solution returned by stochastic sampling-based algorithms as the number of samples increases. A number of negative results are provided, characterizing existing algorithms, e.g. showing that, under mild technical conditions, the cost of the solution returned by broadly used sampling-based algorithms converges almost surely to a non-optimal value. The main contribution of the paper is the introduction of new algorithms, namely, PRM* and RRT*, which are provably asymptotically optimal, i.e. such that the cost of the returned solution converges almost surely to the optimum. Moreover, it is shown that the computational complexity of the new algorithms is within a constant factor of that of their probabilistically complete (but not asymptotically optimal) counterparts. The analysis in this paper hinges on novel connections between stochastic sampling-based path planning algorithms and the theory of random geometric graphs.

...read moreread less

3,438 citations

Posted Content•

Sampling-based Algorithms for Optimal Motion Planning

[...]

Sertac Karaman¹, Emilio Frazzoli¹•Institutions (1)

Massachusetts Institute of Technology¹

05 May 2011-arXiv: Robotics

TL;DR: The main contribution of the paper is the introduction of new algorithms, namely, PRM and RRT*, which are provably asymptotically optimal, i.e. such that the cost of the returned solution converges almost surely to the optimum.

...read moreread less

Abstract: During the last decade, sampling-based path planning algorithms, such as Probabilistic RoadMaps (PRM) and Rapidly-exploring Random Trees (RRT), have been shown to work well in practice and possess theoretical guarantees such as probabilistic completeness. However, little effort has been devoted to the formal analysis of the quality of the solution returned by such algorithms, e.g., as a function of the number of samples. The purpose of this paper is to fill this gap, by rigorously analyzing the asymptotic behavior of the cost of the solution returned by stochastic sampling-based algorithms as the number of samples increases. A number of negative results are provided, characterizing existing algorithms, e.g., showing that, under mild technical conditions, the cost of the solution returned by broadly used sampling-based algorithms converges almost surely to a non-optimal value. The main contribution of the paper is the introduction of new algorithms, namely, PRM* and RRT*, which are provably asymptotically optimal, i.e., such that the cost of the returned solution converges almost surely to the optimum. Moreover, it is shown that the computational complexity of the new algorithms is within a constant factor of that of their probabilistically complete (but not asymptotically optimal) counterparts. The analysis in this paper hinges on novel connections between stochastic sampling-based path planning algorithms and the theory of random geometric graphs.

...read moreread less

2,210 citations

Proceedings Article•

Bayesian Learning via Stochastic Gradient Langevin Dynamics

[...]

Max Welling¹, Yee Whye Teh•Institutions (1)

University of California, Irvine¹

28 Jun 2011

TL;DR: This paper proposes a new framework for learning from large scale datasets based on iterative learning from small mini-batches by adding the right amount of noise to a standard stochastic gradient optimization algorithm and shows that the iterates will converge to samples from the true posterior distribution as the authors anneal the stepsize.

...read moreread less

Abstract: In this paper we propose a new framework for learning from large scale datasets based on iterative learning from small mini-batches. By adding the right amount of noise to a standard stochastic gradient optimization algorithm we show that the iterates will converge to samples from the true posterior distribution as we anneal the stepsize. This seamless transition between optimization and Bayesian posterior sampling provides an inbuilt protection against overfitting. We also propose a practical method for Monte Carlo estimates of posterior statistics which monitors a "sampling threshold" and collects samples after it has been surpassed. We apply the method to three models: a mixture of Gaussians, logistic regression and ICA with natural gradients.

...read moreread less

2,080 citations

Journal Article•DOI•

unmarked: An R Package for Fitting Hierarchical Models of Wildlife Occurrence and Abundance

[...]

Ian Fiske, Richard B. Chandler

24 Aug 2011-Journal of Statistical Software

TL;DR: The R package unmarked provides a unified modeling framework for ecological research, including tools for data exploration, model fitting, model criticism, post-hoc analysis, and model comparison.

...read moreread less

Abstract: Ecological research uses data collection techniques that are prone to substantial and unique types of measurement error to address scientific questions about species abundance and distribution. These data collection schemes include a number of survey methods in which unmarked individuals are counted, or determined to be present, at spatially- referenced sites. Examples include site occupancy sampling, repeated counts, distance sampling, removal sampling, and double observer sampling. To appropriately analyze these data, hierarchical models have been developed to separately model explanatory variables of both a latent abundance or occurrence process and a conditional detection process. Because these models have a straightforward interpretation paralleling mechanisms under which the data arose, they have recently gained immense popularity. The common hierarchical structure of these models is well-suited for a unified modeling interface. The R package unmarked provides such a unified modeling framework, including tools for data exploration, model fitting, model criticism, post-hoc analysis, and model comparison.

...read moreread less

1,675 citations

Journal Article•DOI•

Purposeful sampling in qualitative research synthesis

[...]

Harsh Suri¹•Institutions (1)

University of Melbourne¹

03 Aug 2011-Qualitative Research Journal

TL;DR: In this article, the adaptability of purposeful sampling strategies to the process of qualitative research synthesis is examined, and the authors make a unique contribution to the literature by examining how different sampling strategies might be particularly suited to constructing multi-perspectival, emancipatory, participatory and deconstructive interpretations of published research.

...read moreread less

Abstract: Informed decisions about sampling are critical to improving the quality of research synthesis. Even though several qualitative research synthesists have recommended purposeful sampling for synthesizing qualitative research, the published literature holds sparse discussion on how different strategies for purposeful sampling may be applied to a research synthesis. In primary research, Patton is frequently cited as an authority on the topic of purposeful sampling. In Patton’s original texts that are referred to in this article, Patton does not make any suggestion of using purposeful sampling for research synthesis. This article makes a unique contribution to the literature by examining the adaptability of each of Patton’s 16 purposeful sampling strategies to the process of qualitative research synthesis. It illuminates how different purposeful sampling strategies might be particularly suited to constructing multi‐perspectival, emancipatory, participatory and deconstructive interpretations of published research.

...read moreread less

1,414 citations

Journal Article•

Riemann manifold Langevin and Hamiltonian Monte Carlo methods

[...]

Mark Girolami, Ben Calderhead

01 Jan 2011-Journal of the Royal Statistical Society

TL;DR: The methodology proposed automatically adapts to the local structure when simulating paths across this manifold, providing highly efficient convergence and exploration of the target density, and substantial improvements in the time‐normalized effective sample size are reported when compared with alternative sampling approaches.

...read moreread less

Abstract: The paper proposes Metropolis adjusted Langevin and Hamiltonian Monte Carlo sampling methods defined on the Riemann manifold to resolve the shortcomings of existing Monte Carlo algorithms when sampling from target densities that may be high dimensional and exhibit strong correlations. The methods provide fully automated adaptation mechanisms that circumvent the costly pilot runs that are required to tune proposal densities for Metropolis-Hastings or indeed Hamiltonian Monte Carlo and Metropolis adjusted Langevin algorithms. This allows for highly efficient sampling even in very high dimensions where different scalings may be required for the transient and stationary phases of the Markov chain. The methodology proposed exploits the Riemann geometry of the parameter space of statistical models and thus automatically adapts to the local structure when simulating paths across this manifold, providing highly efficient convergence and exploration of the target density. The performance of these Riemann manifold Monte Carlo methods is rigorously assessed by performing inference on logistic regression models, log-Gaussian Cox point processes, stochastic volatility models and Bayesian estimation of dynamic systems described by non-linear differential equations. Substantial improvements in the time-normalized effective sample size are reported when compared with alternative sampling approaches. MATLAB code that is available from http://www.ucl.ac.uk/statistics/research/rmhmc allows replication of all the results reported.

...read moreread less

1,031 citations

Journal Article•DOI•

Improving Marginal Likelihood Estimation for Bayesian Phylogenetic Model Selection

[...]

Wangang Xie, Paul O. Lewis¹, Yu Fan, Lynn Kuo¹, Ming-Hui Chen¹ - Show less +1 more•Institutions (1)

University of Connecticut¹

01 Mar 2011-Systematic Biology

TL;DR: A new method is introduced, steppingstone sampling (SS), which uses importance sampling to estimate each ratio in a series (the "stepping stones") bridging the posterior and prior distributions, which concludes that the greatly increased accuracy of the SS and TI methods argues for their use instead of the HM method, despite the extra computation needed.

...read moreread less

Abstract: The marginal likelihood is commonly used for comparing different evolutionary models in Bayesian phylogenet- ics and is the central quantity used in computing Bayes Factors for comparing model fit. A popular method for estimating marginal likelihoods, the harmonic mean (HM) method, can be easily computed from the output of a Markov chain Monte Carlo analysis but often greatly overestimates the marginal likelihood. The thermodynamic integration (TI) method is much more accurate than the HM method but requires more computation. In this paper, we introduce a new method, stepping- stone sampling (SS), which uses importance sampling to estimate each ratio in a series (the "stepping stones") bridging the posterior and prior distributions. We compare the performance of the SS approach to the TI and HM methods in simulation and using real data. We conclude that the greatly increased accuracy of the SS and TI methods argues for their use instead of the HM method, despite the extra computation needed. (Bayes factor; harmonic mean; phylogenetics, marginal likelihood; model selection; path sampling; thermodynamic integration; steppingstone sampling.)

...read moreread less

875 citations

Journal Article•DOI•

Comment: snowball versus respondent-driven sampling

[...]

Douglas D. Heckathorn¹•Institutions (1)

Cornell University¹

01 Aug 2011-Sociological Methodology

TL;DR: This comment summarizes the development of the RDS method, distinguishing among seven forms of the estimator, and offers a clarification of a related set of issues.

...read moreread less

Abstract: Leo Goodman (2011) provided a useful service with his clarification of the differences among snowball sampling as originally introduced by Coleman (1958–1959) and Goodman (1961) as a means for studying the structure of social networks; snowball sampling as a convenience method for studying hard-to-reach populations (Biernacki and Waldorf 1981); and respondent-driven sampling (RDS), a sampling method with good estimability for studying hard-to-reach populations (Heckathorn 1997, 2002, 2007; Salganik and Heckathorn 2004; Volz and Heckathorn 2008). This comment offers a clarification of a related set of issues. One is confusion between the latter form of snowball sampling, and RDS. A second is confusion resulting from multiple forms of the RDS estimator that derives from the incremental manner in which the method was developed. This comment summarizes the development of the method, distinguishing among seven forms of the estimator.

...read moreread less

652 citations

Book•

Sampling Essentials: Practical Guidelines for Making Sampling Choices

[...]

Johnnie N. Daniel

04 May 2011

TL;DR: This chapter discusses the choices made when choosing between taking a Census and Sampling, and the nature of the Sampling Unit and Mixed-Methods Sample Designs and the size of the Sample Glossary.

...read moreread less

Abstract: Preface List of Tables, Figures, and Research Notes Chapter 1. Preparing to Make Sampling Choices Chapter 2. Choosing Between Taking a Census and Sampling Chapter 3. Choosing Between Nonprobability Sampling and Probability Sampling Chapter 4. Choosing the Type of Nonprobability Sampling Chapter 5. Choosing the Type of Probability Sampling Chapter 6. Sampling Based on the Nature of the Sampling Unit and Mixed-Methods Sample Designs Chapter 7. Choosing the Size of the Sample Glossary References Index

...read moreread less

544 citations

Journal Article•DOI•

Challenges and results of sampling Chinese restaurant menu items for the USDA National Nutrient Database for Standard Reference

[...]

Robin Thomas, Susan E. Gebhardt

01 Apr 2011-The FASEB Journal

520 citations

Journal Article•DOI•

Ambient Sampling/Ionization Mass Spectrometry: Applications and Current Trends

[...]

Glenn A. Harris¹, Asiri S. Galhena¹, Facundo M. Fernández¹•Institutions (1)

Georgia Institute of Technology¹

06 May 2011-Analytical Chemistry

Journal Article•DOI•

A non-adapted sparse approximation of PDEs with stochastic inputs

[...]

Alireza Doostan¹, Houman Owhadi²•Institutions (2)

University of Colorado Boulder¹, California Institute of Technology²

01 Apr 2011-Journal of Computational Physics

TL;DR: The method converges in probability as a consequence of sparsity and a concentration of measure phenomenon on the empirical correlation between samples, and it is shown that the method is well suited for truly high-dimensional problems.

...read moreread less

Journal Article•DOI•

Nondestructive sampling of living systems using in vivo solid-phase microextraction.

[...]

Gangfeng Ouyang¹, Dajana Vuckovic², Janusz Pawliszyn²•Institutions (2)

Sun Yat-sen University¹, University of Waterloo²

27 Jan 2011-Chemical Reviews

TL;DR: Nondestructive Sampling of Living Systems Using in Vivo Solid-Phase Microextraction Gangfeng Ouyang, Dajana Vuckovic, and Janusz Pawliszyn.

...read moreread less

Abstract: Nondestructive Sampling of Living Systems Using in Vivo Solid-Phase Microextraction Gangfeng Ouyang,* Dajana Vuckovic, and Janusz Pawliszyn* MOEKeyLaboratory of Aquatic Product Safety/KLGHEI of Environment andEnergyChemistry, School ofChemistry andChemical Engineering, Sun Yat-sen University, Guangzhou 510275, China Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada

...read moreread less

Book Chapter•DOI•

Statistics of Random Processes

[...]

Markus J. Aschwanden¹•Institutions (1)

Advanced Technology Center¹

01 Jan 2011

TL;DR: The phenomenon of self-organized criticality (SOC) can be identified from many observations in the universe, by sampling statistical distributions of physical parameters, such as the distributions of time scales, spatial scales, or energies, for a set of events.

...read moreread less

Abstract: The phenomenon of self-organized criticality (SOC) can be identified from many observations in the universe, by sampling statistical distributions of physical parameters, such as the distributions of time scales, spatial scales, or energies, for a set of events. SOC manifests itself in the statistics of nonlinear processes.

...read moreread less

Journal Article•DOI•

Reassessing biases and other uncertainties in sea surface temperature observations measured in situ since 1850: 1. Measurement and sampling uncertainties

[...]

John Kennedy¹, Nick Rayner¹, Robert O. Smith², David E. Parker¹, Michael Saunby¹ - Show less +1 more•Institutions (2)

Met Office¹, University of Otago²

27 Jul 2011-Journal of Geophysical Research

TL;DR: The measurement uncertainties account for correlations between errors in observations made by the same ship or buoy due to miscalibration of the thermometer, which increase the estimated uncertainties on grid box averages as mentioned in this paper.

...read moreread less

Abstract: [1] New estimates of measurement and sampling uncertainties of gridded in situ sea surface temperature anomalies are calculated for 1850 to 2006. The measurement uncertainties account for correlations between errors in observations made by the same ship or buoy due, for example, to miscalibration of the thermometer. Correlations between the errors increase the estimated uncertainties on grid box averages. In grid boxes where there are many observations from only a few ships or drifting buoys, this increase can be large. The correlations also increase uncertainties of regional, hemispheric, and global averages above and beyond the increase arising solely from the inflation of the grid box uncertainties. This is due to correlations in the errors between grid boxes visited by the same ship or drifting buoy. At times when reliable estimates can be made, the uncertainties in global average, Southern Hemisphere, and tropical sea surface temperature anomalies are between 2 and 3 times as large as when calculated assuming the errors are uncorrelated. Uncertainties of Northern Hemisphere averages are approximately double. A new estimate is also made of sampling uncertainties. They are largest in regions of high sea surface temperature variability such as the western boundary currents and along the northern boundary of the Southern Ocean. The sampling uncertainties are generally smaller in the tropics and in the ocean gyres.

...read moreread less

Proceedings Article•DOI•

A global sampling method for alpha matting

[...]

Kaiming He¹, Christoph Rhemann², Carsten Rother³, Xiaoou Tang¹, Jian Sun³ - Show less +1 more•Institutions (3)

The Chinese University of Hong Kong¹, Vienna University of Technology², Microsoft³

20 Jun 2011

TL;DR: This paper proposes a global sampling method that uses all samples available in the image to handle the computational complexity introduced by the large number of samples, and poses the sampling task as a correspondence problem.

...read moreread less

Abstract: Alpha matting refers to the problem of softly extracting the foreground from an image. Given a trimap (specifying known foreground/background and unknown pixels), a straightforward way to compute the alpha value is to sample some known foreground and background colors for each unknown pixel. Existing sampling-based matting methods often collect samples near the unknown pixels only. They fail if good samples cannot be found nearby. In this paper, we propose a global sampling method that uses all samples available in the image. Our global sample set avoids missing good samples. A simple but effective cost function is defined to tackle the ambiguity in the sample selection process. To handle the computational complexity introduced by the large number of samples, we pose the sampling task as a correspondence problem. The correspondence search is efficiently achieved by generalizing a randomized algorithm previously designed for patch matching[3]. A variety of experiments show that our global sampling method produces both visually and quantitatively high-quality matting results.

...read moreread less

Book Chapter•DOI•

On the stratification of multi-label data

[...]

Konstantinos Sechidis¹, Grigorios Tsoumakas¹, Ioannis Vlahavas¹•Institutions (1)

Aristotle University of Thessaloniki¹

05 Sep 2011

TL;DR: This paper considers two stratification methods for multi- label data and empirically compares them along with random sampling on a number of datasets and reveals some interesting conclusions with respect to the utility of each method for particular types of multi-label datasets.

...read moreread less

Abstract: Stratified sampling is a sampling method that takes into account the existence of disjoint groups within a population and produces samples where the proportion of these groups is maintained. In single-label classification tasks, groups are differentiated based on the value of the target variable. In multi-label learning tasks, however, where there are multiple target variables, it is not clear how stratified sampling could/should be performed. This paper investigates stratification in the multi-label data context. It considers two stratification methods for multi-label data and empirically compares them along with random sampling on a number of datasets and based on a number of evaluation criteria. The results reveal some interesting conclusions with respect to the utility of each method for particular types of multi-label datasets.

...read moreread less

Journal Article•DOI•

Approaches to Recruiting ‘Hard-To-Reach’ Populations into Research: A Review of the Literature

[...]

Abdolreza Shaghaghi¹, Raj Bhopal², Aziz Sheikh²•Institutions (2)

Tabriz University of Medical Sciences¹, University of Edinburgh²

20 Dec 2011-health promotion perspectives

TL;DR: A range of techniques to recruit hard-to-reach populations, including snowball sampling, respondent-driven sampling (RDS), indigenous field worker sampling (IFWS), facility-based sampling (FBS), targeted sampling (TS), time-location (space) sampling (TLS), conventional cluster sampling (CCS) and capture re-capture sampling (CR) are identified.

...read moreread less

Abstract: Background: ‘Hard-to-reach’ is a term used to describe those sub-groups of the population that may be difficult to reach or involve in research or public health programmes. Application of a single term to call these sub-sections of populations implies a homogeneity within distinct groups, which does not necessarily exist. Different sampling techniques were introduced so far to recruit hard-to-reach populations. In this article, we have reviewed a range of approaches that have been used to widen participation in studies. Methods: We performed a Pubmed and Google search for relevant English language articles using the keywords and phrases: (hard-to-reach AND population* OR sampl*), (hidden AND population* OR sample*) and (“hard to reach” AND population* OR sample*) and a consultation of the retrieved articles’ bibliographies to extract empirical evidence from publications that discussed or examined the use of sampling techniques to recruit hidden or hard-to-reach populations in health studies. Results: Reviewing the literature has identified a range of techniques to recruit hard-to-reach populations, including snowball sampling, respondent-driven sampling (RDS), indigenous field worker sampling (IFWS), facility-based sampling (FBS), targeted sampling (TS), timelocation (space) sampling (TLS), conventional cluster sampling (CCS) and capture re-capture sampling (CR). Conclusion: The degree of compliance with a study by a certain ‘hard-to-reach’ group depends on the characteristics of that group, recruitment technique used and the subject of interest. Irrespective of potential advantages or limitations of the recruitment techniques reviewed, their successful use depends mainly upon our knowledge about specific characteristics of the target populations. Thus in line with attempts to expand the current boundaries of our knowledge about recruitment techniques in health studies and their applications in varying situations, we should also focus on possibly all contributing factors which may have an impact on participation rate within a defined population group.

...read moreread less

Proceedings Article•DOI•

Visual saliency detection by spatially weighted dissimilarity

[...]

Lijuan Duan¹, Chunpeng Wu¹, Jun Miao², Laiyun Qing², Yu Fu³ - Show less +1 more•Institutions (3)

Beijing University of Technology¹, Chinese Academy of Sciences², University of Surrey³

20 Jun 2011

TL;DR: Experimental results show that the proposed new visual saliency detection method outperforms current state-of-the-art methods on predicting human fixations.

...read moreread less

Abstract: In this paper, a new visual saliency detection method is proposed based on the spatially weighted dissimilarity. We measured the saliency by integrating three elements as follows: the dissimilarities between image patches, which were evaluated in the reduced dimensional space, the spatial distance between image patches and the central bias. The dissimilarities were inversely weighted based on the corresponding spatial distance. A weighting mechanism, indicating a bias for human fixations to the center of the image, was employed. The principal component analysis (PCA) was the dimension reducing method used in our system. We extracted the principal components (PCs) by sampling the patches from the current image. Our method was compared with four saliency detection approaches using three image datasets. Experimental results show that our method outperforms current state-of-the-art methods on predicting human fixations.

...read moreread less

Journal Article•DOI•

Sampling for validation of digital soil maps

[...]

Dick J. Brus¹, Bas Kempen¹, Gerard B. M. Heuvelink¹•Institutions (1)

Wageningen University and Research Centre¹

01 Jun 2011-European Journal of Soil Science

TL;DR: In this article, the suitability of five basic types of random sampling design for soil map validation was evaluated: simple, stratified simple, systematic, cluster and two-stage random sampling.

...read moreread less

Abstract: The increase in digital soil mapping around the world means that appropriate and efficient sampling strategies are needed for validation. Data used for calibrating a digital soil mapping model typically are non-random samples. In such a case we recommend collection of additional independent data and validation of the soil map by a design-based sampling strategy involving probability sampling and design-based estimation of quality measures. An important advantage over validation by data-splitting or cross-validation is that model-free estimates of the quality measures and their standard errors can be obtained, and thus no assumptions on the spatial auto-correlation of prediction errors need to be made. The quality of quantitative soil maps can be quantified by the spatial cumulative distribution function (SCDF) of the prediction errors, whereas for categorical soil maps the overall purity and the map unit purities (user's accuracies) and soil class representation (producer's accuracies) are suitable quality measures. The suitability of five basic types of random sampling design for soil map validation was evaluated: simple, stratified simple, systematic, cluster and two-stage random sampling. Stratified simple random sampling is generally a good choice: it is simple to implement, estimation of the quality measures and their precision is straightforward, it gives relatively precise estimates, and no assumptions are needed in quantifying the standard error of the estimated quality measures. Validation by probability sampling is illustrated with two case studies. A categorical soil map on point support depicting soil classes in the province of Drenthe of the Netherlands (268 000 ha) was validated by stratified simple random sampling. Sub-areas with different expected purities were used as strata. The estimated overall purity was 58% with a standard error of 4%. This was 9% smaller than the theoretical purity computed with the model. Map unit purities and class representations were estimated by the ratio estimator. A quantitative soil map, depicting the average soil organic carbon (SOC) contents of pixels in an area of 81 600 ha in Senegal, was validated by random transect sampling. SOC predictions were seriously biased, and the random error was considerable. Both case studies underpin the importance of independent validation of soil maps by probability sampling, to avoid unfounded trust in visually attractive maps produced by advanced pedometric techniques

...read moreread less

Diseños Observacionales: Ajuste y aplicación en psicología del deporte

[...]

María Teresa Anguera Argilaga¹, Angel Blanco Villaseñor¹, Antonio Hernández Mendo², José Luis Losada López¹•Institutions (2)

University of Barcelona¹, University of Málaga²

20 Jul 2011

TL;DR: In this article, the concept of observational design and its applications in the field of sports psychology are reviewed. But, as stated by the authors, "once the objectives of an observational design have been defined they will guide the subsequent research process, influencing the construction of observation instruments, the recording procedure and its metric, the observational sampling, the monitoring of data quality, and, above all, the choice of the most suitable analytic techniques in each case".

...read moreread less

Abstract: This study reviews the concept of observational design and its applications in the field of sports psychology. Once the objectives of an observational design have been defined they will guide the subsequent research process, influencing the construction of observation instruments, the recording procedure and its metric, the observational sampling, the monitoring of data quality, and, above all, the choice of the most suitable analytic techniques in each case. Obviously, they also have repercussions in terms of the interpretation of results.

...read moreread less

Journal Article•DOI•

Improved Inference for Respondent-Driven Sampling Data With Application to HIV Prevalence Estimation

[...]

Krista J. Gile¹•Institutions (1)

University of Massachusetts Amherst¹

01 Mar 2011-Journal of the American Statistical Association

TL;DR: In this paper, a successive sampling based estimator for population means based on respondent-driven sampling data is presented, and demonstrated its superior performance when the size of the hidden population is known.

...read moreread less

Abstract: Respondent-driven sampling is a form of link-tracing network sampling, which is widely used to study hard-to-reach populations, often to estimate population proportions. Previous treatments of this process have used a with-replacement approximation, which we show induces bias in estimates for large sample fractions and differential network connectedness by characteristic of interest. We present a treatment of respondent-driven sampling as a successive sampling process. Unlike existing representations, our approach respects the essential without-replacement feature of the process, while converging to an existing with-replacement representation for small sample fractions, and to the sample mean for a full-population sample. We present a successive-sampling based estimator for population means based on respondent-driven sampling data, and demonstrate its superior performance when the size of the hidden population is known. We present sensitivity analyses for unknown population sizes. In addition, we note tha...

...read moreread less

Journal Article•DOI•

Tamanho da amostra em estudos clínicos e experimentais

[...]

Hélio Amante Miot¹•Institutions (1)

Sao Paulo State University¹

01 Dec 2011-Jornal Vascular Brasileiro

TL;DR: In this article, a proper sample planning depends on basic knowledge of the study statistics and deep knowledge of a problem under investigation, in order to combine the statistical significance of the tests with the clinical meaning of the results.

...read moreread less

Abstract: dimension and also the sampling technique (collection/selection) of the elements of the study. It is essential in the elaboration of the project, and problems with such planning may compromise the final data analysis and interpretation of its results. A proper sample planning depends on basic knowledge of the study statistics and deep knowledge of the problem under investigation, in order to combine the statistical significance of the tests with the clinical meaning of the results 1,3,4 . Most biostatistical tests assume that the study sample is probabilistically representative of the population. Some samples of convenience, e.g. like choosing consecutive patients of a specific outpatient clinic, may not properly rep resent all the study population. The investigator should be alert to possible selection biases resulting from the availability of patients in consecutive sampling, because increasing sample size would not correct the effect of biased samples. In addition, strategies of non-probability stratified sampling, by using sample quotas, complex sampling (conglomerates, multi-levels), voluntary response, saturation of variables,

...read moreread less

Journal Article•DOI•

Multichannel Sampling of Pulse Streams at the Rate of Innovation

[...]

Kfir Gedalyahu¹, R Tur¹, Yonina C. Eldar¹•Institutions (1)

Technion – Israel Institute of Technology¹

01 Apr 2011-IEEE Transactions on Signal Processing

TL;DR: This paper proposes a multichannel architecture for sampling pulse streams with arbitrary shape, operating at the rate of innovation, and shows that the pulse stream can be recovered from the proposed minimal-rate samples using standard tools taken from spectral estimation in a stable way even at high rates of innovation.

...read moreread less

Abstract: We consider minimal-rate sampling schemes for infinite streams of delayed and weighted versions of a known pulse shape. The minimal sampling rate for these parametric signals is referred to as the rate of innovation and is equal to the number of degrees of freedom per unit time. Although sampling of infinite pulse streams was treated in previous works, either the rate of innovation was not achieved, or the pulse shape was limited to Diracs. In this paper we propose a multichannel architecture for sampling pulse streams with arbitrary shape, operating at the rate of innovation. Our approach is based on modulating the input signal with a set of properly chosen waveforms, followed by a bank of integrators. This architecture is motivated by recent work on sub-Nyquist sampling of multiband signals. We show that the pulse stream can be recovered from the proposed minimal-rate samples using standard tools taken from spectral estimation in a stable way even at high rates of innovation. In addition, we address practical implementation issues, such as reduction of hardware complexity and immunity to failure in the sampling channels. The resulting scheme is flexible and exhibits better noise robustness than previous approaches.

...read moreread less

Journal Article•DOI•

Comment: on respondent‐driven sampling and snowball sampling in hard‐to‐reach populations and snowball sampling not in hard‐to‐reach populations

[...]

Leo A. Goodman¹•Institutions (1)

University of California, Berkeley¹

01 Aug 2011-Sociological Methodology

TL;DR: In this article, the difference between snowball sampling not in hard-to-reach populations and snowball sampling and respondent-driven sampling in hard to reach populations was pointed out and discussed.

...read moreread less

Abstract: In this commentary attention is drawn to the difference between snowball sampling not in hard-to-reach populations and snowball sampling and respondent-driven sampling in hard-to-reach populations. The approach to sampling design and inference called snowball sampling (not in hard-to-reach populations) was introduced in Coleman (1958– 1959) and Goodman (1961); and respondent-driven sampling in hardto-reach populations was introduced more recently in Heckathorn (1997, 2002, 2007). Still more recently, Gile and Handcock (2010) sounded a cautionary note for the users of respondent-driven sampling in hard-to-reach populations. Coleman (1958–1959) notes that snowball sampling in survey research is amenable to the same scientific procedures as ordinary random sampling, and Goodman (1961) introduces statistical methods with snowball sampling for the estimation

...read moreread less

Journal Article•DOI•

Equilibrium sampling in biomolecular simulations.

[...]

Daniel M. Zuckerman¹•Institutions (1)

University of Pittsburgh¹

05 May 2011-Annual Review of Biophysics

TL;DR: Efforts to enhance sampling capability range from the development of new algorithms to parallelization to novel uses of hardware, and special focus is placed on classifying algorithms in order to understand their fundamental strengths and limitations.

...read moreread less

Abstract: Equilibrium sampling of biomolecules remains an unmet challenge after more than 30 years of atomistic simulation. Efforts to enhance sampling capability, which are reviewed here, range from the development of new algorithms to parallelization to novel uses of hardware. Special focus is placed on classifying algorithms--most of which are underpinned by a few key ideas--in order to understand their fundamental strengths and limitations. Although algorithms have proliferated, progress resulting from novel hardware use appears to be more clear-cut than from algorithms alone, due partly to the lack of widely used sampling measures.

...read moreread less

Journal Article•DOI•

Inference about density and temporary emigration in unmarked populations

[...]

Richard B. Chandler¹, J. Andrew Royle¹, David I. King²•Institutions (2)

Patuxent Wildlife Research Center¹, University of Massachusetts Amherst²

01 Jul 2011-Ecology

TL;DR: A hierarchical model allowing inference about the density of unmarked populations subject to temporary emigration and imperfect detection is presented, which can be fit to data collected using a variety of standard survey methods such as repeated point counts.

...read moreread less

Abstract: Few species are distributed uniformly in space, and populations of mobile organisms are rarely closed with respect to movement, yet many models of density rely upon these assumptions. We present a hierarchical model allowing inference about the density of unmarked populations subject to temporary emigration and imperfect detection. The model can be fit to data collected using a variety of standard survey methods such as repeated point counts in which removal sampling, double-observer sampling, or distance sampling is used during each count. Simulation studies demonstrated that parameter estimators are unbiased when temporary emigration is either “completely random” or is determined by the size and location of home ranges relative to survey points. We also applied the model to repeated removal sampling data collected on Chestnut-sided Warblers (Dendroica pensylvancia) in the White Mountain National Forest, USA. The density estimate from our model, 1.09 birds/ha, was similar to an estimate of 1.11 birds/ha produced by an intensive spot-mapping effort. Our model is also applicable when processes other than temporary emigration affect the probability of being available for detection, such as in studies using cue counts. Functions to implement the model have been added to the R package unmarked.

...read moreread less

Journal Article•DOI•

Bayesian Adaptive Sampling for Variable Selection and Model Averaging

[...]

Merlise A. Clyde¹, Joyee Ghosh¹, Michael L. Littman¹•Institutions (1)

Duke University¹

01 Jan 2011-Journal of Computational and Graphical Statistics

TL;DR: A Bayesian adaptive sampling algorithm (BAS), that samples models without replacement from the space of models, is introduced and it is shown that BAS can outperform Markov chain Monte Carlo methods.

...read moreread less

Abstract: For the problem of model choice in linear regression, we introduce a Bayesian adaptive sampling algorithm (BAS), that samples models without replacement from the space of models. For problems that permit enumeration of all models, BAS is guaranteed to enumerate the model space in 2p iterations where p is the number of potential variables under consideration. For larger problems where sampling is required, we provide conditions under which BAS provides perfect samples without replacement. When the sampling probabilities in the algorithm are the marginal variable inclusion probabilities, BAS may be viewed as sampling models “near” the median probability model of Barbieri and Berger. As marginal inclusion probabilities are not known in advance, we discuss several strategies to estimate adaptively the marginal inclusion probabilities within BAS. We illustrate the performance of the algorithm using simulated and real data and show that BAS can outperform Markov chain Monte Carlo methods. The algorithm is imple...

...read moreread less

Journal Article•DOI•

Global distribution of linear and cyclic volatile methyl siloxanes in air.

[...]

Susie Genualdi¹, Tom Harner¹, Yu Cheng¹, Matthew MacLeod², Kaj M. Hansen³, Roger van Egmond⁴, Mahiba Shoeib¹, Sum Chi Lee¹ - Show less +4 more•Institutions (4)

Environment Canada¹, Stockholm University², Aarhus University³, University of Bedfordshire⁴

25 Mar 2011-Environmental Science & Technology

TL;DR: Concentrations of D3 and D4 are significantly correlated, as are D5 and D6, which suggests different sources for these two pairs of compounds, and agreement between measurements and models indicate that the sources, transport pathways, and sinks of D5 in the global atmosphere are fairly well understood.

...read moreread less

Abstract: The global distribution of linear and cyclic volatile methyl silxoanes (VMS) was investigated at 20 sites worldwide, including 5 locations in the Arctic, using sorbent-impregnated polyurethane foam ...

...read moreread less

Journal Article•DOI•

A Novel Sampling Theorem on the Sphere

[...]

Jason D. McEwen¹, Yves Wiaux¹•Institutions (1)

École Normale Supérieure¹

01 Dec 2011-IEEE Transactions on Signal Processing

TL;DR: This work develops a novel sampling theorem on the sphere and corresponding fast algorithms by associating the sphere with the torus through a periodic extension and highlights the advantages of the sampling theorem in the context of potential applications, notably in the field of compressive sampling.

...read moreread less

Abstract: We develop a novel sampling theorem on the sphere and corresponding fast algorithms by associating the sphere with the torus through a periodic extension. The fundamental property of any sampling theorem is the number of samples required to represent a band-limited signal. To represent exactly a signal on the sphere band-limited at L, all sampling theorems on the sphere require O(L2) samples. However, our sampling theorem requires less than half the number of samples of other equiangular sampling theorems on the sphere and an asymptotically identical, but smaller, number of samples than the Gauss-Legendre sampling theorem. The complexity of our algorithms scale as O(L3), however, the continual use of fast Fourier transforms reduces the constant prefactor associated with the asymptotic scaling considerably, resulting in algorithms that are fast. Furthermore, we do not require any precomputation and our algorithms apply to both scalar and spin functions on the sphere without any change in computational complexity or computation time. We make our implementation of these algorithms available publicly and perform numerical experiments demonstrating their speed and accuracy up to very high band-limits. Finally, we highlight the advantages of our sampling theorem in the context of potential applications, notably in the field of compressive sampling.

...read moreread less

Collapse