
Showing papers on "Population proportion" published in 2020


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a beta-binomial model for estimating the relative abundance of a particular taxon in a population of microbes, which allows for the overdispersion in the taxon's counts to be associated with covariates of interest.
Abstract: Using a sample from a population to estimate the proportion of the population with a certain category label is a broadly important problem. In the context of microbiome studies, this problem arises when researchers wish to use a sample from a population of microbes to estimate the population proportion of a particular taxon, known as the taxon's relative abundance. In this paper, we propose a beta-binomial model for this task. Like existing models, our model allows for a taxon's relative abundance to be associated with covariates of interest. However, unlike existing models, our proposal also allows for the overdispersion in the taxon's counts to be associated with covariates of interest. We exploit this model in order to propose tests not only for differential relative abundance, but also for differential variability. The latter is particularly valuable in light of speculation that dysbiosis, the perturbation from a normal microbiome that can occur in certain disease conditions, may manifest as a loss of stability, or increase in variability, of the counts associated with each taxon. We demonstrate the performance of our proposed model using a simulation study and an application to soil microbial data.

188 citations
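
A minimal sketch of the kind of likelihood the abstract above describes, assuming a beta-binomial parameterized by a mean `mu` (relative abundance) and an overdispersion `phi`, each linked to a single covariate through a logit link. The function names, the single-covariate setup, and the logit link for `phi` are assumptions made for illustration; the abstract does not specify them.

```python
import math

def log_beta(a, b):
    """Log of the Beta function, computed via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def betabinom_logpmf(w, m, mu, phi):
    """Log-probability of w taxon counts out of m reads.

    mu  : mean proportion, in (0, 1)
    phi : overdispersion, in (0, 1); phi -> 0 approaches the binomial
    """
    # Assumed mapping from (mu, phi) to the usual Beta shape parameters.
    a = mu * (1.0 - phi) / phi
    b = (1.0 - mu) * (1.0 - phi) / phi
    log_choose = math.lgamma(m + 1) - math.lgamma(w + 1) - math.lgamma(m - w + 1)
    return log_choose + log_beta(w + a, m - w + b) - log_beta(a, b)

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def log_likelihood(counts, depths, covariate, beta, beta_star):
    """Sum of log-pmfs when both mu and phi depend on one covariate.

    beta      : (intercept, slope) for the mean
    beta_star : (intercept, slope) for the overdispersion
    """
    total = 0.0
    for w, m, x in zip(counts, depths, covariate):
        mu = inv_logit(beta[0] + beta[1] * x)
        phi = inv_logit(beta_star[0] + beta_star[1] * x)
        total += betabinom_logpmf(w, m, mu, phi)
    return total
```

Under this setup, a test for differential relative abundance would ask whether the slope in `beta` is zero, and a test for differential variability would ask the same of the slope in `beta_star`.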


Journal ArticleDOI
TL;DR: It turns out that the proposed estimator is substantially more efficient than its simple random sampling and ranked set sampling analogs, as the true population proportion tends to zero/unity.
Abstract: This article studies the properties of the maximum likelihood estimator of the population proportion in ranked set sampling with extreme ranks. The maximum likelihood estimator is described and its...

36 citations


Journal ArticleDOI
15 Jan 2020
TL;DR: In this article, the authors derive uniformly most powerful (UMP) tests for simple and one-sided hypotheses for a population proportion within the framework of differential privacy, optimizing finite sample performance.
Abstract: We derive uniformly most powerful (UMP) tests for simple and one-sided hypotheses for a population proportion within the framework of Differential Privacy (DP), optimizing finite sample performance. We show that in general, DP hypothesis tests can be written in terms of linear constraints, and for exchangeable data can always be expressed as a function of the empirical distribution. Using this structure, we prove a "Neyman-Pearson lemma" for binomial data under DP, where the DP-UMP only depends on the sample sum. Our tests can also be stated as a post-processing of a random variable, whose distribution we coin "Truncated-Uniform-Laplace" (Tulap), a generalization of the Staircase and discrete Laplace distributions. Furthermore, we obtain exact p-values, which are easily computed in terms of the Tulap random variable. Using the above techniques, we show that our tests can be applied to give uniformly most accurate one-sided confidence intervals and optimal confidence distributions. We also derive uniformly most powerful unbiased (UMPU) two-sided tests, which lead to uniformly most accurate unbiased (UMAU) two-sided confidence intervals. We show that our results can be applied to distribution-free hypothesis tests for continuous data. Our simulation results demonstrate that all our tests have exact type I error, and are more powerful than current techniques.

15 citations


Journal ArticleDOI
TL;DR: This paper illustrates via real-life examples that in contrast to classical test theory, fuzzy hypothesis testing provides an additional partial and gradual consideration of the indifference zone for both complementary and non-complementary hypotheses.

13 citations


Journal ArticleDOI
TL;DR: This study focuses on estimating π, the prevalence of a sensitive characteristic, and ω, the sensitivity level of the underlying sensitive question, using two-question and split-sample approaches for parameter estimation.
Abstract: In this study, we propose optional randomized response technique (RRT) models in binary response situation. Gupta, Gupta, and Singh (2002) introduced the basic premise of optional RRT model, that a...

6 citations


Posted ContentDOI
07 Jul 2020-medRxiv
TL;DR: Sub-Saharan African countries did defy the dire predictions of the COVID-19 burden, although South Africa always exceeded the predicted number and population proportion of COVID-19 infections.
Abstract: Introduction: Since its identification, the COVID-19 infection has caused substantial mortality and morbidity worldwide, but sub-Saharan Africa seems to defy the predictions. We aimed to verify this hypothesis using strong statistical methods. Methods: We conducted a cross-sectional study comparing the projected and actual numbers as well as population proportions of COVID-19 cases in the 46 sub-Saharan African countries on May 1st, May 29th (4 weeks later) and June 26th (8 weeks later). The source of the projected number of cases was a publication by scientists from the Center for Mathematical Modeling of Infectious Diseases of the London School of Hygiene & Tropical Medicine, whereas the actual number of cases was obtained from the WHO situation reports. We calculated the percentage difference between the projected and actual numbers of cases per country. Further, "N-1" chi-square tests with Bonferroni correction were used to compare the projected and actual population proportion of COVID-19 cases, along with the 95% confidence interval of the difference between these population proportions. All statistical tests were 2-sided, with 0.05 used as the threshold for statistical significance. Results: On May 1st, May 29th and June 26th, respectively, 40 (86.95%), 45 (97.82%) and 41 (89.13%) of the sub-Saharan African countries reported a number of confirmed cases that was lower than the predicted number of 1000 cases for May 1st and 10000 for both May 29th and June 26th. At these dates, the population proportions of confirmed COVID-19 cases were significantly lower (p-value […]). Conclusion: Sub-Saharan African countries did defy the dire predictions of the COVID-19 burden. Preventive measures should be further enforced to preserve this positive outcome.

4 citations
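
The "N-1" chi-square comparison of two proportions mentioned in the abstract above can be sketched as follows. This assumes the common definition of the statistic as the Pearson chi-square for a 2x2 table multiplied by (N-1)/N, with one degree of freedom; the function names and the Bonferroni helper are illustrative and are not taken from the paper.

```python
import math

def n_minus_1_chi_square(x1, n1, x2, n2):
    """Compare two proportions x1/n1 and x2/n2 with the 'N-1' chi-square test.

    Returns (statistic, two-sided p-value) on 1 degree of freedom.
    """
    a, b = x1, n1 - x1          # group 1: cases, non-cases
    c, d = x2, n2 - x2          # group 2: cases, non-cases
    n = n1 + n2
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column total must be positive")
    pearson = n * (a * d - b * c) ** 2 / margins
    stat = pearson * (n - 1) / n
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value for n_tests comparisons."""
    return min(1.0, p_value * n_tests)
```

For a per-country analysis such as the one described, `n_tests` would be the number of comparisons made at a given date.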


Journal ArticleDOI
TL;DR: In this article, the authors derived a closed formula for the exact distribution of the difference between two independent sample proportions, and used it to perform related inferences such as a confidence interval.
Abstract: Comparing two population proportions using a confidence interval can be misleading in many cases, such as when the sample size is small and the test is based on the normal approximation. In such cases, the only option is to collect a larger sample, which may not be possible, for example when studying people suffering from a rare disease. The main purpose of this paper is to derive a closed formula for the exact distribution of the difference between two independent sample proportions, to use it to perform related inferences such as constructing a confidence interval regardless of the sample sizes, and to compare it with the existing Wald, Agresti-Caffo and Score intervals. We have derived such a closed formula. The resulting distribution imposes no requirements and can be used to perform inferences such as a hypothesis test for two population proportions, regardless of the nature of the distribution and the sample sizes. We claim that the exact distribution gives the smallest confidence interval width among the Wald, Agresti-Caffo and Score methods, so it is suitable for inference on the difference between population proportions regardless of sample size.

3 citations
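
To make the object studied in the abstract above concrete, the exact distribution of the difference between two independent sample proportions can be tabulated by brute-force enumeration of two binomials. This is only an illustrative sketch: it is not the closed formula the authors derive, and the function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from math import comb

def binom_pmf(k, n, p):
    """Probability of k successes in n independent trials with success rate p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def diff_distribution(n1, p1, n2, p2):
    """Exact pmf of X/n1 - Y/n2 for independent X ~ Bin(n1, p1), Y ~ Bin(n2, p2).

    Keys are exact Fractions so that equal differences are merged reliably.
    """
    dist = defaultdict(float)
    for x in range(n1 + 1):
        for y in range(n2 + 1):
            d = Fraction(x, n1) - Fraction(y, n2)
            dist[d] += binom_pmf(x, n1, p1) * binom_pmf(y, n2, p2)
    return dict(dist)
```

The enumeration costs on the order of `n1 * n2` operations, which is fine for the small samples the abstract is concerned with.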


Journal ArticleDOI
TL;DR: In this paper, a two-fold approach for estimating the population proportion of a sensitive attribute is proposed, which gives respondents a choice depending on the sensitivity of the attribute.
Abstract: In this article, we propose a two-fold approach for estimating the population proportion of a sensitive attribute. The rationale of the proposed technique is to give respondents a choice if they wan...

3 citations


Journal ArticleDOI
TL;DR: In this article, the estimation of a population proportion, using the auxiliary information available, which is incorporated into the estimation procedure by a probit model fit, is discussed, and three probit models are used.
Abstract: This article discusses the estimation of a population proportion, using the auxiliary information available, which is incorporated into the estimation procedure by a probit model fit. Three probit ...

2 citations


Journal ArticleDOI
TL;DR: This paper establishes the consistency of the tree bootstrap approach, which is used to estimate the uncertainty in population proportion estimates from respondent-driven sampling, in the case of [Formula: see text]-trees.
Abstract: Respondent-driven sampling is an approach for estimating features of populations that are difficult to access using standard survey tools, e.g., the fraction of injection drug users who are HIV positive. Baraff et al. (2016) introduced an approach to estimating uncertainty in population proportion estimates from respondent-driven sampling using the tree bootstrap method. In this paper we establish the consistency of this tree bootstrap approach in the case of [Formula: see text]-trees.

2 citations


Journal ArticleDOI
TL;DR: This paper has corrected a major mistake in the research paper of Singh and Mathur and proposed the corresponding corrected estimator of sensitive population proportion, and obtained the variance of the proposed estimator.
Abstract: In this paper, we have pointed out a major mistake in the research paper of Singh and Mathur [(2004). Unknown repeated trials in the unrelated question randomized response model, Biometrical Journal, 46:375-378]. We have corrected this mistake and proposed the corresponding corrected estimator of the sensitive population proportion. Furthermore, we have obtained the variance of our proposed estimator. Like Singh and Mathur, we have also compared the variance of our proposed estimator with that of Greenberg et al.'s estimator, both theoretically and numerically.
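
For orientation, the baseline against which the abstract above compares its estimator can be sketched as follows. This assumes the standard form of the unrelated-question randomized response estimator with a known prevalence for the unrelated (innocuous) question; it is not the corrected estimator proposed in the paper, and the parameter names are illustrative.

```python
def unrelated_question_estimate(yes_count, n, p, pi_y):
    """Estimate the sensitive proportion under the unrelated-question design.

    yes_count : number of 'yes' answers among n respondents
    p         : probability the device selects the sensitive question (p > 0)
    pi_y      : known population proportion answering 'yes' to the
                unrelated question (assumed known here)
    Returns (estimate, estimated variance of the estimate).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    lam = yes_count / n                       # observed share of 'yes' answers
    # Invert lam = p * pi_sensitive + (1 - p) * pi_y for the sensitive proportion.
    estimate = (lam - (1.0 - p) * pi_y) / p
    variance = lam * (1.0 - lam) / (n * p ** 2)
    return estimate, variance
```

Because the observed share of "yes" answers is a sample quantity, the raw estimate can fall outside [0, 1] in small samples; this sketch does not truncate it.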

Journal Article
TL;DR: The factors affecting the interprovincial transmission and development of coronavirus disease 2019 in China are explored with a view to providing recommendations for the formulation of preventive and control measures according to the actual conditions in different regions during the outbreak of the severe infectious disease.
Abstract: Objective: To explore the factors affecting the interprovincial transmission and development of coronavirus disease 2019 (COVID-19) in China, with a view to providing recommendations for the formulation of preventive and control measures according to the actual conditions in different regions during the outbreak of the severe infectious disease. Methods: We collected the total number of confirmed cases of COVID-19 in 30 provinces and cities in China as of 24:00 on February 25, 2020. We also collected the distance from each region to Hubei province, the proportion of the population moving out from Wuhan city from January 1 to January 23, population density, urban population, traffic passenger volume, passenger turnover volume and other relevant data for each region. The cumulative confirmed cases as of 24:00 on January 29, 2020, consisting mostly of imported cases, were taken as the first-stage case cluster, and the cumulative newly confirmed cases from 0:00 on January 30 to 24:00 on February 25, 2020, consisting mostly of secondary cases, were taken as the second-stage case cluster. Pearson bivariate correlation and linear fitting regression were used to analyze the effects of population migration, transportation, economy and other factors on the transmission and development of COVID-19 in different regions. In the linear fitting regression, the multi-factor optimal subset model was used to screen the factors most closely related to COVID-19. Results: The distance from each region to Hubei province was negatively correlated with both the first-stage case cluster (mostly imported cases) and the second-stage case cluster (mostly secondary cases) (t=-3.654, t=-3.679, both P[…]2.760, all P<0.05). GDP and the proportion of the population moving out from Wuhan were most closely related to the first-stage case cluster (t=4.173, t=7.851, all P<0.05). The first-stage case cluster, the proportion of the population moving out from Wuhan, and urban population were most closely related to the second-stage case cluster (t=4.734, t=3.491, t=2.855, all P<0.05). Conclusion: GDP and the proportion of the population moving out from Wuhan city had the greatest impact on the stage dominated by imported cases. The imported cases, the proportion of the population moving out from Wuhan and the urban population had the greatest impact on the stage dominated by secondary cases. In the early stage of an epidemic outbreak, when most cases are imported, we should consider strengthening prevention and control in areas with a high level of GDP and a high proportion of population moving out from the epidemic area. The flow of population should be restricted more strictly as soon as possible in order to effectively curb the outbreak. In the later stage of the epidemic, when most cases are secondary, regionalized control policies should be formulated mainly according to the indicators of imported cases, the population proportion from the epidemic area, and the urban population. Finally, contact within the population should be reasonably restricted to prevent further development of the epidemic.

Posted ContentDOI
10 Jul 2020-medRxiv
TL;DR: Using mathematical relationships relating these generalized logistic distributions, the population proportion remaining Susceptible can be approximated using the inverse of a standard cumulative logistic distribution, while the population proportion actively Infectious can be approximated using the density of a logistic or log-logistic distribution.
Abstract: Infectious epidemics are often described using a three-compartment Susceptible-Infectious-Removed (SIR) model, whose solution can be shown to involve generalizations of the logistic distribution. Using mathematical relationships relating these generalized logistic distributions, the population proportion remaining Susceptible can be approximated using the inverse of a standard cumulative logistic distribution, while the population proportion actively Infectious can be approximated using the density of a logistic or log-logistic distribution. Conversely, the parameters of an underlying SIR model can be approximately inferred from population-based data that have been estimated using logistic and/or log-logistic models.
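
As a reference point for the three-compartment model named in the abstract above, here is a minimal sketch that integrates the standard SIR equations for population proportions with a forward Euler step. The parameter names, step size, and the Euler scheme are assumptions chosen for brevity; the sketch does not implement the logistic approximations the paper discusses.

```python
def simulate_sir(beta, gamma, s0, i0, dt, steps):
    """Forward-Euler integration of the SIR model in population proportions.

    dS/dt = -beta * S * I
    dI/dt =  beta * S * I - gamma * I
    dR/dt =  gamma * I
    Returns a list of (S, I, R) tuples, one per time point.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    path = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt    # flow from S to I in this step
        new_removals = gamma * i * dt         # flow from I to R in this step
        s = s - new_infections
        i = i + new_infections - new_removals
        r = r + new_removals
        path.append((s, i, r))
    return path
```

The Susceptible column of the returned path is the curve the abstract proposes to approximate with a logistic form, and the Infectious column is the one it compares with a logistic or log-logistic density.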

Journal ArticleDOI
01 Jun 2020
TL;DR: In this article, a simple survey technique is applied to estimate the population proportion π of a sensitive trait, together with T, the probability that a respondent truthfully states that he or she bears the sensitive characteristic when questioned directly, and its properties are examined.
Abstract: In this paper, a simple survey technique is applied to estimate the population proportion π of a sensitive trait, together with T, the probability that a respondent truthfully states that he or she bears the sensitive characteristic when questioned directly, and its properties are examined. It has been found that the suggested model is efficient. Numerical illustrations are presented to support the theoretical results.

Journal ArticleDOI
TL;DR: This article proposes a model-based predictive estimator of the finite population proportion of a misclassified binary response, when information on the auxiliary variable(s) is available for all units in the population.
Abstract: We propose a model-based predictive estimator of the finite population proportion of a misclassified binary response, when information on the auxiliary variable(s) is available for all units in the population. Asymptotic properties of the misclassification-adjusted predictive estimator are also explored. We propose a computationally efficient bootstrap variance estimator that exhibits better performance compared to usual analytical variance estimator. The performance of the proposed estimator is compared with other commonly used design-based estimators through extensive simulation studies. The results are supplemented by an empirical study based on literacy data.

Book ChapterDOI
22 Dec 2020

Book
20 Sep 2020
TL;DR: In this book, the authors suggest an estimator using two auxiliary variables in stratified random sampling for estimating the population mean, as well as an almost unbiased estimator that uses the known value of some population parameter(s) together with the known population proportion of an auxiliary variable.
Abstract: The main aim of the present book is to suggest some improved estimators using auxiliary and attribute information in the case of simple random sampling and stratified random sampling, and some inventory models related to capacity constraints. This volume is a collection of five papers, written by six co-authors (listed in the order of the papers): Dr. Rajesh Singh, Dr. Sachin Malik, Dr. Florentin Smarandache, Dr. Neeraj Kumar, Mr. Sanjey Kumar & Pallavi Agarwal. In the first chapter, the authors suggest an estimator using two auxiliary variables in stratified random sampling for estimating the population mean. In the second chapter, they propose a family of estimators for estimating population means using the known value of some population parameters. In the third chapter, an almost unbiased estimator using the known value of some population parameter(s) with the known population proportion of an auxiliary variable is used. In the fourth chapter, the authors investigate a fuzzy economic order quantity model for a two-storage facility. The demand, holding cost, ordering cost and storage capacity of the own-warehouse are taken as trapezoidal fuzzy numbers. The fifth chapter presents a two-warehouse inventory model for deteriorating items, with a stock-dependent demand rate and the effect of inflation under the time value of money over a finite planning horizon. Shortages are allowed and partially backordered depending on the waiting time for the next replenishment. The purpose of this model is to minimize the total inventory cost by using a genetic algorithm. This book will be helpful for researchers and students who are working in the field of sampling techniques and inventory control.