Showing papers on "Population proportion published in 2019"


Journal Article
TL;DR: This paper proposes an exact hypergeometric test for a population proportion to detect significant changes in the process quality level; the test monitors a manufacturing process with considerably lower inspection effort than its binomial counterpart.

19 citations
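
To make the contrast between the two exact tests concrete, here is a minimal sketch of the upper-tail p-values under each sampling model. It is not the paper's procedure, which this listing does not show; the function names and the lot figures (200 items, 10 defective under the null, 40 inspected, 5 defectives found) are assumptions chosen purely for illustration.

```python
from math import comb


def hypergeom_upper_tail(x: int, lot_size: int, defectives: int, n: int) -> float:
    """P(X >= x), X = defectives found in n draws WITHOUT replacement from the lot."""
    upper = min(n, defectives)
    favourable = sum(
        comb(defectives, k) * comb(lot_size - defectives, n - k)
        for k in range(max(x, 0), upper + 1)
    )
    return favourable / comb(lot_size, n)


def binom_upper_tail(x: int, n: int, p: float) -> float:
    """P(X >= x) for X ~ Binomial(n, p), i.e. draws treated as independent."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(max(x, 0), n + 1)
    )


# Hypothetical lot: 200 items, 10 defective under H0 (p0 = 0.05), sample of 40,
# 5 defectives observed. All numbers are illustrative assumptions.
p_hyper = hypergeom_upper_tail(5, lot_size=200, defectives=10, n=40)
p_binom = binom_upper_tail(5, n=40, p=0.05)
print(f"hypergeometric p-value: {p_hyper:.4f}")
print(f"binomial p-value:       {p_binom:.4f}")
```

The hypergeometric version accounts for the finite lot, which is the modelling difference the entry above refers to.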


Journal Article
TL;DR: This article proposes a family of exponential ratio estimators for the population mean of the study variable using the information of the population proportion posse...
Abstract: In this article, we propose a family of exponential ratio estimators for the estimation of the population mean of the study variable using the information of the population proportion posse...

9 citations


Proceedings Article
01 Dec 2019
TL;DR: This review covers the most frequently encountered study designs in medical research and provides statistical formulas for sample size calculation in studies with varying objectives, such as estimation of population proportion(s) and hypothesis testing, estimation of population mean, estimation of diagnostic accuracy, and estimation of association and reliability.
Abstract: For the reliability and generalizability of the results of any epidemiological or medical research study, an adequate number of study units is extremely important. A study with either fewer units than needed or an unnecessarily large sample is ethically and economically unfair. Considering the importance of sample size, and of its determination and justification, in all kinds of epidemiological or medical research studies, we propose a simple and non-technical explanation of the available sample size calculation methods. In this review we cover the most frequently encountered study designs in medical research. The present review provides statistical formulas for the sample size calculation of research studies with varying objectives such as estimation of population proportion(s) and hypothesis testing, estimation of population mean, estimation of diagnostic accuracy, estimation of association and reliability, sample size calculation for case control studies, and sample size calculation for cohort studies. Limited technical details and explanations of the underlying theoretical assumptions for each method have been included to ensure the adaptability of the method with minimal theoretical understanding and statistical knowledge.

6 citations
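
As an illustration of the simplest case mentioned above, estimating a single population proportion, the following sketch uses the standard normal-approximation formula n = z² p(1 − p) / d². The review's own formulas are not reproduced in this listing, so the function name, the default confidence level, and the example inputs are assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p: float, margin: float, confidence: float = 0.95) -> int:
    """Smallest n with z^2 * p * (1 - p) / margin^2 <= n (normal approximation).

    p          anticipated proportion (0.5 is the most conservative choice)
    margin     desired half-width d of the confidence interval
    confidence two-sided confidence level
    """
    if not 0 < p < 1 or margin <= 0 or not 0 < confidence < 1:
        raise ValueError("need 0 < p < 1, margin > 0 and 0 < confidence < 1")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p * (1 - p) / margin**2)


# Anticipated proportion 50%, margin of error 5 percentage points, 95% confidence.
print(sample_size_for_proportion(0.5, 0.05))  # 385
```

The formula assumes simple random sampling from a large population; finite-population corrections and design effects are not included in this sketch.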


Posted Content
TL;DR: This paper proposes a beta-binomial model that allows for a taxon's relative abundance to be associated with covariates of interest, and exploits this model in order to propose tests not only for differential relative abundance, but also for differential variability.
Abstract: Using a sample from a population to estimate the proportion of the population with a certain category label is a broadly important problem. In the context of microbiome studies, this problem arises when researchers wish to use a sample from a population of microbes to estimate the population proportion of a particular taxon, known as the taxon's relative abundance. In this paper, we propose a beta-binomial model for this task. Like existing models, our model allows for a taxon's relative abundance to be associated with covariates of interest. However, unlike existing models, our proposal also allows for the overdispersion in the taxon's counts to be associated with covariates of interest. We exploit this model in order to propose tests not only for differential relative abundance, but also for differential variability. The latter is particularly valuable in light of speculation that dysbiosis, the perturbation from a normal microbiome that can occur in certain disease conditions, may manifest as a loss of stability, or increase in variability, of the counts associated with each taxon. We demonstrate the performance of our proposed model using a simulation study and an application to soil microbial data.

5 citations
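
To show what overdispersion means for count data of this kind, here is a minimal sketch of a beta-binomial distribution written in terms of a mean `mu` (the relative abundance) and an overdispersion parameter `phi`. This mean/overdispersion parameterization and the example values are assumptions for illustration; the paper's covariate-dependent model and tests are not reproduced here.

```python
from math import exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k: int, n: int, mu: float, phi: float) -> float:
    """P(K = k) for a beta-binomial with mean n*mu and overdispersion phi in (0, 1)."""
    a = mu * (1 - phi) / phi          # beta shape parameters implied by (mu, phi)
    b = (1 - mu) * (1 - phi) / phi
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


# Illustrative values: 20 reads, relative abundance 0.3, overdispersion 0.1.
n, mu, phi = 20, 0.3, 0.1
pmf = [beta_binomial_pmf(k, n, mu, phi) for k in range(n + 1)]
mean = sum(k * p for k, p in enumerate(pmf))
var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))

print(f"mean = {mean:.3f}")
print(f"beta-binomial variance = {var:.3f}")
print(f"binomial variance      = {n * mu * (1 - mu):.3f}")
```

Under this parameterization the variance is n·mu·(1 − mu)·(1 + (n − 1)·phi), so any phi > 0 inflates the variance relative to the binomial while leaving the mean unchanged.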


Journal Article
TL;DR: This paper proposes proportion estimators and associated variance estimators for a binary variable with a concomitant variable, based on modified ranked set sampling methods.
Abstract: In this paper, proportion estimators and associated variance estimators are proposed for a binary variable with a concomitant variable based on modified ranked set sampling methods, which a...

5 citations


Journal Article
TL;DR: A Bayesian Cramér–Rao bound is derived for the estimation of p, the unknown proportion of individuals possessing a certain characteristic in a dichotomous population, based on a ranked set sample.
Abstract: In this article, we consider a dichotomous population characterized by the parameter p defined as the proportion of individuals in the population possessing a certain characteristic. The unknown proportion p is our parameter of interest in the present work. Under the assumption that p is a random quantity we derive a Bayesian Cramér–Rao (BCR) bound in connection with the estimation of p. The proposed procedure is based on a ranked set sample (RSS) observed on the variable of interest which is binary in nature. This RSS-based approach is compared with its corresponding SRS (simple random sample) counterpart in the cases of both perfect and imperfect rankings. The proposed procedure is applied for estimating the proportion of children aged months (to the mothers aged 15–49 years of rural India) who are not immunized with the vaccine against measles using National Family Health Survey-3 (2005–2006) data of India.

4 citations


Journal Article
TL;DR: The present work proposes a new two-stage randomized response model to eliminate misleading responses or non-response due to the stigmatized nature of the attribute under study; the model yields an unbiased estimator of the population proportion possessing the sensitive attribute.
Abstract: Surveys on stigmatized characteristics lead to non-response problems if they are conducted according to classical (direct) methods developed especially for non-sensitive issues; an appropriate survey methodology is therefore needed to obtain reliable responses from respondents on incriminating issues. The randomized response model is one of the most recent methods attracting the attention of survey practitioners for dealing with non-response, because it protects the privacy of individuals in order to elicit truthful responses. The present work proposes a new two-stage randomized response model to eliminate misleading responses or non-response due to the stigmatized nature of the attribute under study. The proposed randomized response model yields an unbiased estimator of the population proportion possessing the sensitive attribute. The properties of the resultant estimator have been studied and empirical comparisons are performed to show its dominance ov...

4 citations


Journal Article
TL;DR: This article considers the case in which both classifiers are fallible and proposes asymptotic and approximate unconditional test procedures based on six test statistics for a population proportion, together with five approximate sample size formulas based on the recommended test procedures under two models; the results suggest that both the asymptotic and the approximate unconditional procedures based on the score statistic perform satisfactorily for small to large sample sizes and are highly recommended.
Abstract: Double sampling is usually applied to collect necessary information for situations in which an infallible classifier is available for validating a subset of the sample that has already been classified by a fallible classifier. Inference procedures have previously been developed based on the partially validated data obtained by the double-sampling process. However, it could happen in practice that such an infallible classifier or gold standard does not exist. In this article, we consider the case in which both classifiers are fallible and propose asymptotic and approximate unconditional test procedures based on six test statistics for a population proportion and five approximate sample size formulas based on the recommended test procedures under two models. Our results suggest that both asymptotic and approximate unconditional procedures based on the score statistic perform satisfactorily for small to large sample sizes and are highly recommended. When sample size is moderate or large, asymptotic procedures based on the Wald statistic with the variance being estimated under the null hypothesis, likelihood ratio statistic, log- and logit-transformation statistics based on both models generally perform well and are hence recommended. The approximate unconditional procedures based on the log-transformation statistic under Model I, Wald statistic with the variance being estimated under the null hypothesis, log- and logit-transformation statistics under Model II are recommended when sample size is small. In general, sample size formulae based on the Wald statistic with the variance being estimated under the null hypothesis, likelihood ratio statistic and score statistic are recommended in practical applications. The applicability of the proposed methods is illustrated by a real-data example.

4 citations


Journal Article
TL;DR: This paper proposes an efficient stratified randomized response model based on Chang et al.'s (2004) model and obtains the variance of the proposed estimator of πs, the proportion of the resp...
Abstract: This paper proposes an efficient stratified randomized response model based on Chang et al.'s (2004) model. We have obtained the variance of the proposed estimator of πs, the proportion of the resp...

3 citations


Journal Article
TL;DR: This study has presented an example with sparse data on college cheating and a simulation study to illustrate the properties of the EM algorithm and the Bayesian method, and discusses two extensions to accommodate finite population sampling and optional responses.
Abstract: In sample surveys with sensitive items, sampled units may not respond or they respond untruthfully. Usually a negative answer is given when it is actually positive, thereby leading to an estimate of the population proportion of positives (sensitive proportion) that is too small. In our study, we have binary data obtained from the unrelated-question design, and both the sensitive proportion and the nonsensitive proportion are of interest. A respondent answers the sensitive item with a known probability, and to avoid non-identifiable parameters, at least two (not necessarily exactly two) different random mechanisms are used, but only one for each cluster of respondents. The key point here is that the counts are sparse (very small sample sizes), and we show how to overcome some of the problems associated with the unrelated question design. A standard approach to this problem is to use the expectation-maximization (EM) algorithm. However, because we consider only small sample sizes (sparse counts), the EM algorithm may not converge and asymptotic theory, which can permit normality assumptions for inference, is not appropriate; so we develop a Bayesian method. To compare the EM algorithm and the Bayesian method, we have presented an example with sparse data on college cheating and a simulation study to illustrate the properties of our procedure. Finally, we discuss two extensions to accommodate finite population sampling and optional responses.

3 citations
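
The identifiability point in the abstract above can be illustrated with the classical moment-based solution for the unrelated-question design. This is a sketch of the textbook estimator, not the EM or Bayesian method of the paper; the function name and all numeric inputs are hypothetical. With mechanism i, a respondent answers the sensitive item with known probability p_i, so the probability of a "yes" is λ_i = p_i·π_S + (1 − p_i)·π_N, and two mechanisms with p_1 ≠ p_2 give two equations in the two unknowns.

```python
def unrelated_question_estimates(
    lam1: float, lam2: float, p1: float, p2: float
) -> tuple[float, float]:
    """Solve lam_i = p_i * pi_s + (1 - p_i) * pi_n, i = 1, 2, for (pi_s, pi_n).

    lam1, lam2  observed "yes" proportions under the two random mechanisms
    p1, p2      known probabilities of being asked the sensitive item
    """
    if p1 == p2:
        # A single mechanism gives one equation in two unknowns: not identifiable.
        raise ValueError("the two mechanisms must use different probabilities")
    denom = p1 - p2
    pi_s = (lam1 * (1 - p2) - lam2 * (1 - p1)) / denom
    pi_n = (lam2 * p1 - lam1 * p2) / denom
    return pi_s, pi_n


# Hypothetical "yes" proportions from two groups of respondents.
pi_s, pi_n = unrelated_question_estimates(lam1=0.32, lam2=0.48, p1=0.7, p2=0.3)
print(f"sensitive proportion ≈ {pi_s:.3f}, nonsensitive proportion ≈ {pi_n:.3f}")
```

With sparse counts the observed proportions are noisy and these moment estimates can fall outside [0, 1], which is one reason a small-sample setting calls for a different approach.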


Journal Article
TL;DR: This article considers the use of the zero-truncated binomial distribution as a randomization device for estimating the population proportion of a sensitive characteristic.
Abstract: In this article, we consider the use of the zero-truncated binomial distribution as a randomization device while estimating the population proportion of a sensitive characteristic. The resultant ne...

Posted Content
TL;DR: This paper proves a 'Neyman-Pearson lemma' for binomial data under differential privacy (DP) and shows that, in general, DP hypothesis tests can be written in terms of linear constraints and, for exchangeable data, can always be expressed as a function of the empirical distribution.
Abstract: We derive uniformly most powerful (UMP) tests for simple and one-sided hypotheses for a population proportion within the framework of Differential Privacy (DP), optimizing finite sample performance. We show that in general, DP hypothesis tests can be written in terms of linear constraints, and for exchangeable data can always be expressed as a function of the empirical distribution. Using this structure, we prove a 'Neyman-Pearson lemma' for binomial data under DP, where the DP-UMP only depends on the sample sum. Our tests can also be stated as a post-processing of a random variable, whose distribution we coin "Truncated-Uniform-Laplace" (Tulap), a generalization of the Staircase and discrete Laplace distributions. Furthermore, we obtain exact p-values, which are easily computed in terms of the Tulap random variable. Using the above techniques, we show that our tests can be applied to give uniformly most accurate one-sided confidence intervals and optimal confidence distributions. We also derive uniformly most powerful unbiased (UMPU) two-sided tests, which lead to uniformly most accurate unbiased (UMAU) two-sided confidence intervals. We show that our results can be applied to distribution-free hypothesis tests for continuous data. Our simulation results demonstrate that all our tests have exact type I error, and are more powerful than current techniques.

Journal Article
11 Feb 2019
TL;DR: Using a Bayesian approach, this paper derives an easy-to-implement closed-form algorithm for drawing from the posterior distributions of the difference of two population proportion parameters, based on two independent samples of binomial data subject to one type of misclassification.
Abstract: We construct point and interval estimates using a Bayesian approach for the difference of two population proportion parameters based on two independent samples of binomial data subject to one type of misclassification. Specifically, we derive an easy-to-implement closed-form algorithm for drawing from the posterior distributions. For illustration, we apply our algorithm to a real data example. Finally, we conduct simulation studies to demonstrate the efficiency of our algorithm for Bayesian inference.