
Showing papers on "Sample size determination" published in 1997


Journal ArticleDOI
TL;DR: An examination of the performance of the tests when the correct model has a quadratic term but a model containing only the linear term has been fit shows that the Pearson chi-square, the unweighted sum-of-squares, the Hosmer-Lemeshow decile of risk, the smoothed residual sum-of-squares and Stukel's score test have power exceeding 50 per cent to detect moderate departures from linearity.
Abstract: Recent work has shown that there may be disadvantages in the use of the chi-square-like goodness-of-fit tests for the logistic regression model proposed by Hosmer and Lemeshow that use fixed groups of the estimated probabilities. A particular concern with these grouping strategies based on estimated probabilities, or fitted values, is that groups may contain subjects with widely different values of the covariates. It is possible to demonstrate situations where one set of fixed groups shows that the model fits while the test rejects fit using a different set of fixed groups. We compare by simulation the performance of these tests to tests based on smoothed residuals proposed by le Cessie and Van Houwelingen and Royston, a score test for an extended logistic regression model proposed by Stukel, the Pearson chi-square and the unweighted residual sum-of-squares. These simulations demonstrate that all but one of Royston's tests have the correct size. An examination of the performance of the tests when the correct model has a quadratic term but a model containing only the linear term has been fit shows that the Pearson chi-square, the unweighted sum-of-squares, the Hosmer-Lemeshow decile of risk, the smoothed residual sum-of-squares and Stukel's score test have power exceeding 50 per cent to detect moderate departures from linearity when the sample size is 100 and have power over 90 per cent for these same alternatives for samples of size 500. None of the tests had power when the correct model had an interaction between a dichotomous and continuous covariate but only the continuous covariate model was fit. Power to detect an incorrectly specified link was poor for samples of size 100. For samples of size 500 Stukel's score test had the best power, but it exceeded 50 per cent only for an asymmetric link function. The power of the unweighted sum-of-squares test to detect an incorrectly specified link function was slightly less than that of Stukel's score test. We illustrate the tests within the context of a model for factors associated with low birth weight.
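
As a concrete illustration of the decile-of-risk idea discussed above, the sketch below fits a deliberately misspecified linear-logit model and computes a Hosmer-Lemeshow-style statistic from deciles of the fitted probabilities. The simulated quadratic data, sample size, and variable names are illustrative assumptions, not the authors' simulation design.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)

# Illustrative data: the true logit is quadratic, but only a linear term is fit (assumption).
n = 500
x = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-(-0.5 + 0.8 * x + 0.6 * x**2)))
y = rng.binomial(1, p_true)

X = sm.add_constant(x)
p_hat = sm.Logit(y, X).fit(disp=0).predict(X)   # fitted probabilities of the misspecified model

# Decile-of-risk grouping: ten groups formed from deciles of the fitted probabilities.
groups = np.digitize(p_hat, np.quantile(p_hat, np.linspace(0.1, 0.9, 9)))
obs = np.array([y[groups == g].sum() for g in range(10)])
exp = np.array([p_hat[groups == g].sum() for g in range(10)])
ng = np.array([(groups == g).sum() for g in range(10)])

C = np.sum((obs - exp) ** 2 / (exp * (1 - exp / ng)))   # Hosmer-Lemeshow-style statistic
p_value = stats.chi2.sf(C, df=10 - 2)                   # compared to chi-square with g - 2 df
print(f"decile-of-risk statistic = {C:.2f}, p = {p_value:.4f}")
```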

1,666 citations


Book
15 Jan 1997
TL;DR: A reference text on sample size determination for clinical studies, covering basic design considerations, comparisons of independent and paired groups for binary, ordered categorical and continuous data, equivalence, confidence intervals, post-marketing surveillance, correlation, survival curves, phase II trials, observer agreement studies, and randomization.
Abstract: Preface to the Second Edition Preface to the First Edition 1. Basic Design Considerations 2. The Normal Distribution 3. Comparing Independent Groups for Binary, Ordered Categorical and Continuous Data 4. Comparing Paired Groups for Binary, Ordered Categorical and Continuous Outcomes 5. Demonstrating Equivalence 6. Confidence Intervals 7. Post-Marketing Surveillance 8. The Correlation Coefficient 9. Comparing Two Survival Curves 10. Phase II Trials 11. Observer Agreement Studies 12. Randomization Cumulative References Author Index Subject Index

1,188 citations


Journal ArticleDOI
TL;DR: The special characteristics of items (low reliability, confounds by minor, unwanted covariance, and the likelihood of a general factor) and a better understanding of factor analysis mean that the default procedure of many statistical packages (Little Jiffy) is no longer adequate for exploratory item factor analysis.
Abstract: The special characteristics of items (low reliability, confounds by minor, unwanted covariance, and the likelihood of a general factor) and a better understanding of factor analysis mean that the default procedure of many statistical packages (Little Jiffy) is no longer adequate for exploratory item factor analysis. It produces too many factors and precludes a general factor even when that means the factors extracted are nonreplicable. More appropriate procedures that reduce these problems are presented, along with guidance on how to select the sample, the sample size required, and how to select items for scales. Proposed scales can be evaluated by their correlations with the factors; a new procedure for doing so eliminates the biased values produced by correlating them with either total or factor scores. The role of exploratory factor analysis relative to cluster analysis and confirmatory factor analysis is noted.

984 citations


Journal ArticleDOI
TL;DR: In this paper, the generality of latent variable modeling of individual differences in development over time is demonstrated, with particular emphasis on randomized intervention studies, and an approach for estimating the power to detect treatment effects in this framework is presented.
Abstract: The generality of latent variable modeling of individual differences in development over time is demonstrated with a particular emphasis on randomized intervention studies. First, a brief overview is given of biostatistical and psychometric approaches to repeated measures analysis. Second, the generality of the psychometric approach is indicated by some nonstandard models. Third, a multiple-population analysis approach is proposed for the estimation of treatment effects. The approach clearly describes the treatment effect as development that differs from normative, control-group development. This framework allows for interactions between treatment and initial status in their effects on development. Finally, an approach for the estimation of power to detect treatment effects in this framework is demonstrated. Illustrations of power calculations are carried out with artificial data, varying the sample sizes, number of timepoints, and treatment effect sizes. Real data are used to illustrate analysis strategies and power calculations. Further modeling extensions are discussed.

730 citations


01 Jan 1997
TL;DR: In this article, the authors explore the frequently encountered problem that a timeseries formed as an average of a sample of individual timeseries has a variance that depends upon the size of the sample.
Abstract: This note explores the frequently encountered problem that a timeseries formed as an average of a sample of individual timeseries has a variance that depends upon the size of the sample. Methods for adjusting the timeseries to reduce this variance bias are demonstrated: first a simple one, and then extensions to it that allow for time-dependent and timescale-dependent effects. The discussion and techniques are applicable to the construction of tree-ring chronologies from an average of individual cores (or trees) and to the calculation of regional tree-growth timeseries by averaging individual chronologies.
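
As a rough illustration of the simple adjustment mentioned first, the sketch below rescales each value of a mean timeseries by sqrt(n_t / (1 + (n_t - 1) * rbar)), which, up to a constant factor, removes the dependence of the variance on the number of series n_t averaged at time t. The constant mean inter-series correlation rbar and the toy ensemble are assumptions for illustration, not the note's tree-ring data or its time-dependent extensions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy ensemble: 30 series of length 200 with pairwise correlation rbar; fewer series
# are available in the early period, so the sample size changes through time (assumption).
T, N, rbar = 200, 30, 0.4
common = rng.normal(size=T)
series = np.sqrt(rbar) * common[:, None] + np.sqrt(1 - rbar) * rng.normal(size=(T, N))
series[:50, 15:] = np.nan                            # only 15 series in the first 50 steps

n_t = np.sum(~np.isnan(series), axis=1)              # sample size at each time step
mean_t = np.nanmean(series, axis=1)                  # raw average: variance depends on n_t

# Variance of a mean of n_t unit-variance series with mean correlation rbar is
# (1 + (n_t - 1) * rbar) / n_t, so this rescaling stabilises the variance over time.
adjusted_t = mean_t * np.sqrt(n_t / (1.0 + (n_t - 1) * rbar))

print("raw variance, early vs late:     ", mean_t[:50].var().round(3), mean_t[50:].var().round(3))
print("adjusted variance, early vs late:", adjusted_t[:50].var().round(3), adjusted_t[50:].var().round(3))
```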

413 citations



Journal ArticleDOI
01 Oct 1997-Geoderma
TL;DR: In this article, the authors compare the performance of two sampling strategies, stratified simple random sampling and systematic sampling with the block kriging predictor, in a simulation study on the basis of design-based quality criteria; the results can be assembled in a decision tree that can be helpful in choosing between the two approaches.

388 citations


Journal ArticleDOI
TL;DR: Using actual elbow flexor make and break strength measurements, this article illustrates a method for estimating a confidence interval for the SEM, shows how an a priori specification of confidence interval width can be used to estimate sample size, and provides several approaches for comparing error variances.
Abstract: The intraclass correlation coefficient (ICC) and the standard error of measurement (SEM) are two reliability coefficients that are reported frequently. Both measures are related; however, they define distinctly different properties. The magnitude of the ICC defines a measure's ability to discriminate among subjects, and the SEM quantifies error in the same units as the original measurement. Most of the statistical methodology addressing reliability presented in the physical therapy literature (eg, point and interval estimations, sample size calculations) focuses on the ICC. Using actual elbow flexor make and break strength measurements, this article illustrates a method for estimating a confidence interval for the SEM, shows how an a priori specification of confidence interval width can be used to estimate sample size, and provides several approaches for comparing error variances (and square root of the error variance, or the SEM).
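
A minimal sketch of one way to interval-estimate the SEM, assuming the error variance from a repeated-measures layout behaves as sigma^2 * chi-square(df) / df so that its quantiles can be inverted. The simulated two-trial data are an assumption for illustration, not the article's elbow flexor measurements, and the chi-square approach shown is a standard construction rather than necessarily the article's exact procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Illustrative data: 30 subjects, each measured on 2 trials (assumed layout).
n_subj, n_trials = 30, 2
subject_mean = rng.normal(100, 15, size=n_subj)
scores = subject_mean[:, None] + rng.normal(0, 5, size=(n_subj, n_trials))

# Within-subject (error) mean square from the repeated-measures decomposition.
ss_error = np.sum((scores - scores.mean(axis=1, keepdims=True)) ** 2)
df_error = n_subj * (n_trials - 1)
ms_error = ss_error / df_error

sem = np.sqrt(ms_error)                              # standard error of measurement
# df * MS_error / sigma^2 ~ chi-square(df): invert the quantiles for a 95% CI on the SEM.
lower = np.sqrt(df_error * ms_error / stats.chi2.ppf(0.975, df_error))
upper = np.sqrt(df_error * ms_error / stats.chi2.ppf(0.025, df_error))
print(f"SEM = {sem:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```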

385 citations


Journal ArticleDOI
TL;DR: The structural components method is extended to the estimation of the Receiver Operating Characteristics (ROC) curve area for clustered data, incorporating the concepts of design effect and effective sample size used by Rao and Scott (1992, Biometrics 48, 577-585) for clustered binary data.
Abstract: Current methods for estimating the accuracy of diagnostic tests require independence of the test results in the sample. However, cases in which there are multiple test results from the same patient are quite common. In such cases, estimation and inference of the accuracy of diagnostic tests must account for intracluster correlation. In the present paper, the structural components method of DeLong, DeLong, and Clarke-Pearson (1988, Biometrics 44, 837-844) is extended to the estimation of the Receiver Operating Characteristics (ROC) curve area for clustered data, incorporating the concepts of design effect and effective sample size used by Rao and Scott (1992, Biometrics 48, 577-585) for clustered binary data. Results of a Monte Carlo simulation study indicate that the size of statistical tests that assume independence is inflated in the presence of intracluster correlation. The proposed method, on the other hand, appropriately handles a wide variety of intracluster correlations, e.g., correlations between true disease statuses and between test results. In addition, the method can be applied to both continuous and ordinal test results. A strategy for estimating sample size requirements for future studies using clustered data is discussed.

376 citations


Journal ArticleDOI
TL;DR: In this paper, the authors suggest that confidence intervals be used in lieu of retrospective power analyses for null hypotheses that were not rejected to assess the likely size of the true effect, and that minimum biologically significant effect sizes be used for all power analyses, and if retrospective power estimates are to be reported, then the α-level, effect sizes, and sample sizes used in calculations must also be reported.
Abstract: Statistical power analysis can be used to increase the efficiency of research efforts and to clarify research results. Power analysis is most valuable in the design or planning phases of research efforts. Such prospective (a priori) power analyses can be used to guide research design and to estimate the number of samples necessary to achieve a high probability of detecting biologically significant effects. Retrospective (a posteriori) power analysis has been advocated as a method to increase information about hypothesis tests that were not rejected. However, estimating power for tests of null hypotheses that were not rejected with the effect size observed in the study is incorrect; these power estimates will always be ≤0.50 when bias adjusted and have no relation to true power. Therefore, retrospective power estimates based on the observed effect size for hypothesis tests that were not rejected are misleading; retrospective power estimates are only meaningful when based on effect sizes other than the observed effect size, such as those effect sizes hypothesized to be biologically significant. Retrospective power analysis can be used effectively to estimate the number of samples or effect size that would have been necessary for a completed study to have rejected a specific null hypothesis. Simply presenting confidence intervals can provide additional information about null hypotheses that were not rejected, including information about the size of the true effect and whether or not there is adequate evidence to accept a null hypothesis as true. We suggest that (1) statistical power analyses be routinely incorporated into research planning efforts to increase their efficiency, (2) confidence intervals be used in lieu of retrospective power analyses for null hypotheses that were not rejected to assess the likely size of the true effect, (3) minimum biologically significant effect sizes be used for all power analyses, and (4) if retrospective power estimates are to be reported, then the α-level, effect sizes, and sample sizes used in calculations must also be reported.
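
The warning about observed-effect retrospective power can be seen numerically: when a two-sample t-test lands exactly at p = 0.05, power computed at the observed effect size is roughly 0.5, regardless of the data. The sketch below shows this with statsmodels and contrasts it with a prospective calculation at a hypothesized minimum biologically significant effect; the effect size d = 0.5 and the other numbers are illustrative assumptions.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.power import TTestIndPower

power_calc = TTestIndPower()
n_per_group, alpha = 30, 0.05

# Observed effect size that lands exactly at p = 0.05 for this design.
t_crit = stats.t.ppf(1 - alpha / 2, df=2 * n_per_group - 2)
d_observed = t_crit * np.sqrt(2 / n_per_group)

retro = power_calc.power(effect_size=d_observed, nobs1=n_per_group, alpha=alpha, ratio=1.0)
print(f"retrospective power at the observed effect size: {retro:.2f}")   # about 0.5

# Prospective use: sample size per group needed to detect an assumed minimum
# biologically significant effect of d = 0.5 with 80% power.
n_needed = power_calc.solve_power(effect_size=0.5, power=0.8, alpha=alpha, ratio=1.0)
print(f"n per group for d = 0.5 at 80% power: {int(np.ceil(n_needed))}")
```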

363 citations


Journal ArticleDOI
TL;DR: In this article, Monte Carlo simulations were conducted to examine the degree to which the statistical power of moderated multiple regression (MMR) to detect the effects of a dichotomous moderator variable was affected by the main and interactive effects of predictor variable range restriction, total sample size, sample sizes for 2 moderator variable-based subgroups, predictor variable intercorrelation, and magnitude of the moderating effect.
Abstract: Monte Carlo simulations were conducted to examine the degree to which the statistical power of moderated multiple regression (MMR) to detect the effects of a dichotomous moderator variable was affected by the main and interactive effects of (a) predictor variable range restriction, (b) total sample size, (c) sample sizes for 2 moderator variable-based subgroups, (d) predictor variable intercorrelation, and (e) magnitude of the moderating effect. Results showed that the main and interactive influences of these variables may have profound effects on power. Thus, future attempts to detect moderating effects with MMR should consider the power implications of both the main and interactive effects of the variables assessed in the present study. Otherwise, even moderating effects of substantial magnitude may go undetected.
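
A stripped-down version of this kind of simulation: draw a continuous predictor and a dichotomous moderator with a chosen subgroup split, fit the moderated regression, and estimate power as the rejection rate of the interaction term. The effect sizes, subgroup proportions, and replication count below are illustrative assumptions, not the conditions of the study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

def mmr_power(n_total=120, prop_group1=0.3, beta_interaction=0.2, n_reps=2000, alpha=0.05):
    """Monte Carlo power of the x*m interaction test in moderated multiple regression."""
    rejections = 0
    for _ in range(n_reps):
        m = (rng.random(n_total) < prop_group1).astype(float)    # dichotomous moderator
        x = rng.normal(size=n_total)                             # continuous predictor
        y = 0.3 * x + 0.2 * m + beta_interaction * x * m + rng.normal(size=n_total)
        X = sm.add_constant(np.column_stack([x, m, x * m]))
        rejections += sm.OLS(y, X).fit().pvalues[3] < alpha      # test of the interaction
    return rejections / n_reps

for prop in (0.5, 0.3, 0.1):    # increasingly unequal moderator-based subgroups
    print(f"subgroup proportion {prop}: estimated power = {mmr_power(prop_group1=prop):.2f}")
```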

Journal ArticleDOI
TL;DR: A number of techniques for dealing with multicollinearity are discussed and demonstrated using a dataset from a recent study of risk factors for pneumonia in swine.
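
One common diagnostic a reader might compute before applying any remedy is the variance inflation factor; the sketch below does so for a small synthetic dataset. The collinear predictors are assumptions for illustration, not the swine pneumonia data, and VIF is offered here as a generic diagnostic rather than one of the article's specific techniques.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)

# Synthetic predictors: x2 is nearly a linear function of x1 (assumption).
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = sm.add_constant(np.column_stack([x1, x2, x3]))

# A VIF above about 10 (some authors use 5) is a common rule of thumb for problematic collinearity.
for i, name in enumerate(["const", "x1", "x2", "x3"]):
    print(f"VIF({name}) = {variance_inflation_factor(X, i):.1f}")
```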

Journal ArticleDOI
TL;DR: A new sampling technique is presented that generates and inverts the Hammersley points to provide a representative sample for multivariate probability distributions and is compared to a sample obtained from a Latin hypercube design by propagating it through a set of nonlinear functions.
Abstract: The basic setting of this article is that of parameter-design studies using data from computer models. A general approach to parameter design is introduced by coupling an optimizer directly with the computer simulation model using stochastic descriptions of the noise factors. The computational burden of these approaches can be extreme, however, and depends on the sample size used for characterizing the parametric uncertainties. In this article, we present a new sampling technique that generates and inverts the Hammersley points (a low-discrepancy design for placing n points uniformly in a k-dimensional cube) to provide a representative sample for multivariate probability distributions. We compare the performance of this to a sample obtained from a Latin hypercube design by propagating it through a set of nonlinear functions. The number of samples required to converge to the mean and variance is used as a measure of performance. The sampling technique based on the Hammersley points requires far fewer samples...
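
The Hammersley construction itself is simple to show: the first coordinate of point i is i/n and the remaining coordinates are radical-inverse (van der Corput) digits in successive prime bases. The sketch below builds such a design and compares mean convergence against plain Monte Carlo for an assumed nonlinear test function; it does not reproduce the article's Latin hypercube comparison or the inversion to non-uniform distributions.

```python
import numpy as np

def radical_inverse(i, base):
    """Van der Corput radical inverse of the integer i in the given base."""
    f, result = 1.0, 0.0
    while i > 0:
        f /= base
        result += f * (i % base)
        i //= base
    return result

def hammersley(n, dim):
    """n Hammersley points in the unit cube [0, 1)^dim."""
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    pts = np.empty((n, dim))
    pts[:, 0] = np.arange(n) / n
    for d in range(1, dim):
        pts[:, d] = [radical_inverse(i, primes[d - 1]) for i in range(n)]
    return pts

def g(u):
    """Assumed nonlinear test function of a 3-dimensional input in [0, 1)^3."""
    return np.exp(np.sum(u, axis=1)) / np.prod(u + 0.5, axis=1)

rng = np.random.default_rng(5)
for n in (50, 200, 800):
    mc = g(rng.random((n, 3))).mean()
    ham = g(hammersley(n, 3)).mean()
    print(f"n={n:4d}  Monte Carlo mean = {mc:7.3f}   Hammersley mean = {ham:7.3f}")
```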

Journal ArticleDOI
TL;DR: This paper presents a method to compute sample sizes and statistical powers for studies involving correlated observations, and appeals to a statistic based on the generalized estimating equation method for correlated data.
Abstract: Correlated data occur frequently in biomedical research. Examples include longitudinal studies, family studies, and ophthalmologic studies. In this paper, we present a method to compute sample sizes and statistical powers for studies involving correlated observations. This is a multivariate extension of the work by Self and Mauritsen (1988, Biometrics 44, 79-86), who derived a sample size and power formula for generalized linear models based on the score statistic. For correlated data, we appeal to a statistic based on the generalized estimating equation method (Liang and Zeger, 1986, Biometrika 73, 13-22). We highlight the additional assumptions needed to deal with correlated data. Some special cases that are commonly seen in practice are discussed, followed by simulation studies.
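
For intuition, the simplest special case of a correlated-data sample size adjustment inflates the usual two-sample formula by the design effect 1 + (m - 1) * rho for clusters of size m with exchangeable correlation rho. The sketch below is this classical back-of-the-envelope version, not the paper's GEE score-statistic method; all numerical values are assumptions.

```python
import numpy as np
from scipy import stats

def n_per_arm_clustered(delta, sigma, m, rho, alpha=0.05, power=0.8):
    """Observations per arm to detect a mean difference delta when observations come
    in clusters of size m with exchangeable intracluster correlation rho."""
    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)
    n_independent = 2 * (sigma * (z_a + z_b) / delta) ** 2   # classic two-sample formula
    deff = 1 + (m - 1) * rho                                 # design effect for clustering
    return int(np.ceil(n_independent * deff))

# Example (all values assumed): a difference of 0.5 SD, clusters of 5, rho = 0.2.
print("independent observations:", n_per_arm_clustered(0.5, 1.0, m=1, rho=0.0))
print("clustered observations:  ", n_per_arm_clustered(0.5, 1.0, m=5, rho=0.2))
```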

Journal ArticleDOI
TL;DR: This paper covers methods of sample size determination for a small number of simple problems, such as estimating the mean of a normal distribution or the slope in a regression equation, and presents some key frequentist and Bayesian techniques.
Abstract: This paper is concerned with methods of sample size determination. The approach is to cover a small number of simple problems, such as estimating the mean of a normal distribution or the slope in a regression equation, and to present some key techniques. The methods covered are in two groups: frequentist and Bayesian. Frequentist methods specify a null and alternative hypothesis for the parameter of interest and then find the sample size by controlling both size and power. These methods often need to use prior information but cannot allow for the uncertainty that is associated with it. By contrast, the Bayesian approach offers a wide variety of techniques, all of which offer the ability to deal with uncertainty associated with prior information.
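
For the simplest frequentist case mentioned, the mean of a normal distribution with known sigma, the recipe reduces to n = ((z_{1-alpha/2} + z_{power}) * sigma / delta)^2. A minimal sketch with assumed illustrative values:

```python
import math
from scipy import stats

def n_for_normal_mean(delta, sigma, alpha=0.05, power=0.9):
    """Frequentist sample size for detecting a shift of size delta in a normal mean
    with known sigma, two-sided level alpha, and the stated power."""
    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)
    return math.ceil(((z_a + z_b) * sigma / delta) ** 2)

# Example with assumed values: detect a shift of 2 units when sigma = 5.
print(n_for_normal_mean(delta=2, sigma=5))   # about 66 observations
```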

Journal ArticleDOI
TL;DR: Potential applications of the P-value distribution under the alternative hypothesis to the design, analysis, and interpretation of results of clinical trials are considered.
Abstract: The P-value is a random variable derived from the distribution of the test statistic used to analyze a data set and to test a null hypothesis. Under the null hypothesis, the P-value based on a continuous test statistic has a uniform distribution over the interval [0, 1], regardless of the sample size of the experiment. In contrast, the distribution of the P-value under the alternative hypothesis is a function of both sample size and the true value or range of true values of the tested parameter. The characteristics, such as mean and percentiles, of the P-value distribution can give valuable insight into how the P-value behaves for a variety of parameter values and sample sizes. Potential applications of the P-value distribution under the alternative hypothesis to the design, analysis, and interpretation of results of clinical trials are considered.
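
The uniform-under-the-null versus skewed-under-the-alternative behaviour is easy to check by simulation. Below is a minimal sketch for a two-sided one-sample z-test; the effect sizes and sample sizes are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

def p_value_distribution(mu, n, n_sims=20000):
    """Simulated two-sided one-sample z-test P-values for N(mu, 1) data of size n."""
    xbar = rng.normal(mu, 1 / np.sqrt(n), size=n_sims)   # sampling distribution of the mean
    z = xbar * np.sqrt(n)
    return 2 * stats.norm.sf(np.abs(z))

for mu, n in [(0.0, 25), (0.3, 25), (0.3, 100)]:
    p = p_value_distribution(mu, n)
    print(f"mu = {mu}, n = {n:3d}: mean P = {p.mean():.3f}, "
          f"median P = {np.median(p):.3f}, P(P < 0.05) = {(p < 0.05).mean():.3f}")
```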

Journal ArticleDOI
TL;DR: This paper builds on a previous study showing that n independent locational observations contain more spatial information than n autocorrelated observations, and clarifies the assumptions of the authors' statistical test of the null hypothesis that successive observations are independent; the test is robust when used with data from utilization distributions that are not normal but sensitive to nonstationary distributions induced by shifts in centers of activity or variance-covariance structure.
Abstract: In a previous study, we showed that n independent locational observations contain more spatial information than n autocorrelated observations. We also developed a statistical test of the null hypothesis that successive observations are independent. Here, we expand our discussion of testing for independence by clarifying assumptions associated with the tests. Specifically, the tests are robust when used with data collected from utilization distributions that are not normal, but they are sensitive to nonstationary distributions induced by shifts in centers of activity or variance-covariance structure. We also used simulations to examine how negative bias in kernel and polygon estimators of home-range size is influenced by level of autocorrelation, sampling rate, sampling design, and study duration. Relative bias increased with increasing levels of autocorrelation and reduced sample sizes. Kernel (95%) estimates were less biased than minimum convex polygon estimates. The effect of autocorrelation is greatest when low levels of bias (> -5%) are desired. For percent relative bias in the range of -20% to -5%, though, collection of moderately autocorrelated data bears little cost in terms of additional loss of spatial information relative to an equal number of independent observations. Tests of independence, when used with stationary data, provide a useful measure of the rate of home-range use and a means of checking assumptions associated with analyses of habitat use. However, our results indicate that exclusive use of independent observations is unnecessary when estimating home-range size with kernel or polygon methods.

Book
01 Oct 1997
TL;DR: A textbook on statistical methods for research, covering the objectives and scope of statistics, basic concepts of statistical inference for interval, ordinal and categorical data, the design of research studies, regression and ANOVA models, and the comparison of means and determination of sample sizes.
Abstract: 1. Statistics: Its Objectives And Scope. 2. Describing Statistical Populations. 3. Statistical Inference: Basic Concepts. 4. Inferences About One Or Two Populations: Interval Data. 5. Inferences About One Or Two Populations: Ordinal Data. 6. Inferences About One Or Two Populations: Categorical Data. 7. Designing Research Studies. 8. Single Factor Studies. 9. Single Factor Studies: Comparing Means And Determining Sample Sizes. 10. Simple Linear Regression. 11. Multiple Linear Regression. 12. The General Linear Model. 13. Completely Randomized Factorial Experiments. 14. Random And Mixed Effects Anova. 15. Anova Models With Block Effects. 16. Repeated Measures Studies. Appendix A: Notes On Theoretical Statistics. Appendix B: Notes On Theoretical Basis For Some Common Statistical Methods. Appendix C: Statistical Tables.

Journal ArticleDOI
TL;DR: In this article, a Monte Carlo approach was used to examine bias in the estimation of indirect effects and their associated standard errors; robust standard errors consistently yielded the most accurate estimates of sampling variability.
Abstract: A Monte Carlo approach was used to examine bias in the estimation of indirect effects and their associated standard errors. In the simulation design, (a) sample size, (b) the level of nonnormality characterizing the data, (c) the population values of the model parameters, and (d) the type of estimator were systematically varied. Estimates of model parameters were generally unaffected by either nonnormality or small sample size. Under severely nonnormal conditions, normal theory maximum likelihood estimates of the standard error of the mediated effect exhibited less bias (approximately 10% to 20% too small) compared to the standard errors of the structural regression coefficients (20% to 45% too small). Asymptotically distribution free standard errors of both the mediated effect and the structural parameters were substantially affected by sample size, but not nonnormality. Robust standard errors consistently yielded the most accurate estimates of sampling variability.
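
For reference, the normal-theory (first-order delta method, or Sobel) standard error of a mediated effect a*b is sqrt(a^2 * s_b^2 + b^2 * s_a^2). The sketch below computes it on simulated mediation data; the model, coefficients, and sample size are assumptions for illustration, not the simulation design of the article.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

# Simulated mediation chain X -> M -> Y (an illustrative assumption).
n = 200
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.4 * m + 0.1 * x + rng.normal(size=n)

fit_a = sm.OLS(m, sm.add_constant(x)).fit()                         # a path: X -> M
fit_b = sm.OLS(y, sm.add_constant(np.column_stack([m, x]))).fit()   # b path: M -> Y given X

a, sa = fit_a.params[1], fit_a.bse[1]
b, sb = fit_b.params[1], fit_b.bse[1]

indirect = a * b
se_sobel = np.sqrt(a**2 * sb**2 + b**2 * sa**2)    # first-order delta-method (Sobel) SE
print(f"indirect effect = {indirect:.3f}, Sobel SE = {se_sobel:.3f}, "
      f"95% CI = ({indirect - 1.96 * se_sobel:.3f}, {indirect + 1.96 * se_sobel:.3f})")
```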

Journal ArticleDOI
01 Jul 1997
TL;DR: A fully Bayesian treatment of the problem of how large a sample to take from a population in order to make an inference about, or to take a decision concerning, some feature of the population is described.
Abstract: This paper discusses the problem of how large a sample to take from a population in order to make an inference about, or to take a decision concerning, some feature of the population. It first describes a fully Bayesian treatment. It then compares this with other methods described in recent papers in The Statistician. The major contrast lies in the use of a utility function in the Bayesian approach, whereas other methods use constraints, such as fixing an error probability.

Journal ArticleDOI
TL;DR: A simulation study evaluates the coverage error, interval width and relative bias of four main methods for the construction of confidence intervals for log-normal means: the naive method; Cox's method; a conservative method; and a parametric bootstrap method.
Abstract: In this paper we conduct a simulation study to evaluate coverage error, interval width and relative bias of four main methods for the construction of confidence intervals of log-normal means: the naive method; Cox's method; a conservative method; and a parametric bootstrap method. The simulation study finds that the naive method is inappropriate, that Cox's method has the smallest coverage error for moderate and large sample sizes, and that the bootstrap method has the smallest coverage error for small sample sizes. In addition, Cox's method produces the smallest interval width among the three appropriate methods. We also apply the four methods to a real data set to contrast the differences.
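
Two of the four interval constructions are easy to sketch: Cox's method works on the log scale with estimate ybar + s^2/2 and standard error sqrt(s^2/n + s^4/(2(n-1))), and a parametric bootstrap resamples from the fitted log-normal. The data below are simulated assumptions, the z-quantile form of Cox's interval is used, and the percentile bootstrap shown is one common variant rather than necessarily the paper's exact implementation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)

x = rng.lognormal(mean=1.0, sigma=0.8, size=40)   # assumed log-normal sample
logx = np.log(x)
n, ybar, s2 = len(x), logx.mean(), logx.var(ddof=1)
z = stats.norm.ppf(0.975)

# Cox's method: interval for E[X] = exp(mu + sigma^2 / 2) built on the log scale.
est = ybar + s2 / 2
se = np.sqrt(s2 / n + s2**2 / (2 * (n - 1)))
cox_ci = (np.exp(est - z * se), np.exp(est + z * se))

# Parametric bootstrap: resample from the fitted log-normal and recompute the estimate.
boot = np.empty(5000)
for i in range(boot.size):
    lb = rng.normal(ybar, np.sqrt(s2), size=n)
    boot[i] = np.exp(lb.mean() + lb.var(ddof=1) / 2)
boot_ci = tuple(np.percentile(boot, [2.5, 97.5]))

print(f"Cox CI:       ({cox_ci[0]:.2f}, {cox_ci[1]:.2f})")
print(f"bootstrap CI: ({boot_ci[0]:.2f}, {boot_ci[1]:.2f})")
```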

Journal ArticleDOI
TL;DR: Although the Breslow approximation is the default in many standard software packages, the Efron method for handling ties is to be preferred, particularly when the sample size is small either from the outset or due to heavy censoring.
Abstract: Survival-time studies sometimes do not yield distinct failure times. Several methods have been proposed to handle the resulting ties. The goal of this paper is to compare these methods. Simulations were conducted, in which failure times were generated for a two-sample problem with an exponential hazard, a constant hazard ratio, and no censoring. Failure times were grouped to produce heavy, moderate, and light ties, corresponding to a mean of 10.0, 5.0, and 2.5 failures per interval. Cox proportional hazards models were fit using each of three approximations for handling ties with each interval size for sample sizes of n = 25, 50, 250, and 500 in each group. The Breslow (1974, Biometrics 30, 89-99) approximation tends to underestimate the true beta, while the Kalbfleisch-Prentice (1973, Biometrika 60, 267-279) approximation tends to overestimate beta. As the ties become heavier, the bias of these approximations increases. The Efron (1977, Journal of the American Statistical Association 72, 557-565) approximation performs far better than the other two, particularly with moderate or heavy ties; even with n = 25 in each group, the bias is under 2%, and for sample sizes larger than 50 per group, it is less than 1%. Except for the heaviest ties in the smallest sample, confidence interval coverage for all three estimators fell in the range of 94-96%. However, the tail probabilities were asymmetric with the Breslow and Kalbfleisch-Prentice formulas; using the Efron approximation, they were closer to the nominal 2.5%. Although the Breslow approximation is the default in many standard software packages, the Efron method for handling ties is to be preferred, particularly when the sample size is small either from the outset or due to heavy censoring.
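
The Breslow-versus-Efron comparison can be reproduced in miniature with statsmodels, whose proportional hazards routine exposes both tie-handling approximations. The data generation below (two-sample exponential failure times grouped into wide intervals to create ties, no censoring) follows the general setup described, with the specific interval width, sample size, and true beta as assumptions.

```python
import numpy as np
from statsmodels.duration.hazard_regression import PHReg

rng = np.random.default_rng(9)

# Two-sample exponential failure times with true log hazard ratio beta = 0.7 (assumption).
n, beta = 250, 0.7
group = np.repeat([0.0, 1.0], n)
times = rng.exponential(scale=np.exp(-beta * group))

# Group the failure times into wide intervals to create heavy ties; no censoring here.
tied_times = np.ceil(times / 0.4)
status = np.ones_like(tied_times)

for method in ("breslow", "efron"):
    fit = PHReg(tied_times, group.reshape(-1, 1), status=status, ties=method).fit()
    print(f"{method:8s}: beta_hat = {fit.params[0]:.3f}   (true beta = {beta})")
```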

Journal ArticleDOI
TL;DR: In this article, a theoretical analysis of the sampling distribution of correlations led to the surprising conclusion that the use of small samples has a potential advantage for the early detection of a correlation.
Abstract: A theoretical analysis (Y. Kareev, 1995b) of the sampling distribution of correlations led to the surprising conclusion that the use of small samples has a potential advantage for the early detection of a correlation. This is so because the distribution is highly skewed, and the smaller the sample size, the more the distribution is skewed. This article describes 2 experiments that were designed as empirical tests of this conclusion. In Experiment 1 (N = 112), the authors compared the predictions of participants differing in their working-memory capacity (hence in the size of the samples they were likely to consider). In Experiment 2 (N = 144), the authors compared the predictions of participants who viewed samples of different sizes, whose size was determined by the authors. The results fully supported Y. Kareev's conclusion: In both experiments, participants with lower capacity (or smaller samples) indeed perceived the correlation as more extreme and were more accurate in their predictions.
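
The skew driving this argument is easy to see by simulation: for a positive population correlation, the sampling distribution of Pearson's r is left-skewed, more so for small n, so the median sample correlation tends to exceed the population value. A minimal sketch, with the population correlation and sample sizes as assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
rho, n_sims = 0.4, 20000
cov = np.array([[1.0, rho], [rho, 1.0]])

for n in (5, 8, 20, 100):
    xy = rng.multivariate_normal([0.0, 0.0], cov, size=(n_sims, n))   # n_sims samples of size n
    xc = xy[..., 0] - xy[..., 0].mean(axis=1, keepdims=True)
    yc = xy[..., 1] - xy[..., 1].mean(axis=1, keepdims=True)
    r = (xc * yc).sum(axis=1) / np.sqrt((xc**2).sum(axis=1) * (yc**2).sum(axis=1))
    print(f"n = {n:3d}: median r = {np.median(r):.3f}, skewness = {stats.skew(r):.2f} "
          f"(true rho = {rho})")
```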

Posted Content
TL;DR: In this article, the structural parameters of the rational expectations model of speculative storage are estimated using three simulation-based estimators, namely the Simulated Method of Moments estimator, the indirect inference estimator and the matching score estimator, and their finite sample properties are compared with those of the PMLE.
Abstract: The non-negativity constraint on inventories imposed on the rational expectations theory of speculative storage implies that the conditional mean and variance of commodity prices are nonlinear in lagged prices and have a kink at a threshold point. In this paper, the structural parameters of this model are estimated using three simulation-based estimators. The finite sample properties of the Simulated Method of Moments estimator of Duffie and Singleton (1993), the Indirect Inference estimator of Gourieroux, Monfort and Renault (1993), and the matching score estimator of Gallant and Tauchen (1996) are assessed. Exploiting the invariant distribution implied by the theory allows us to assess the error induced by simulations. Our results show that while all three estimators produce reasonably good estimates with properties that stack up well with those of the PMLE, there are tradeoffs among the three estimators in terms of bias, efficiency, and computational demands. Some estimators are more sensitive to the sample size and the number of simulations than others. A careful choice of the moments/auxiliary models can lead to a substantial reduction in bias and an improvement in efficiency. Increasing the number of simulated data points can sometimes reduce the bias and improve the efficiency of the estimates when the sample size is small.

Journal ArticleDOI
TL;DR: Several Bayesian and mixed Bayesian/likelihood approaches to sample size calculations based on lengths and coverages of posterior credible intervals are applied to the design of an experiment to estimate the difference between two binomial proportions.
Abstract: Sample size estimation is a major component of the design of virtually every experiment in medicine. Prudent use of the available prior information is a crucial element of experimental planning. Most sample size formulae in current use employ this information only in the form of point estimates, even though it is usually more accurately expressed as a distribution over a range of values. In this paper, we review several Bayesian and mixed Bayesian/likelihood approaches to sample size calculations based on lengths and coverages of posterior credible intervals. We apply these approaches to the design of an experiment to estimate the difference between two binomial proportions, and we compare results to those derived from standard formulae. Consideration of several criteria can contribute to selection of a final sample size.
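
A minimal sketch of one average-length-style criterion of this kind: for a candidate sample size, draw the true proportions from their priors, simulate data, and record the average width of the posterior credible interval for p1 - p2 (approximated by Monte Carlo draws from the two Beta posteriors). The Beta priors, target width, and grid of sample sizes are illustrative assumptions, and this is only one of the several criteria the paper reviews.

```python
import numpy as np

rng = np.random.default_rng(11)

def expected_ci_width(n_per_arm, a1=3, b1=7, a2=3, b2=7, n_datasets=400, n_post=2000):
    """Average width of the central 95% posterior credible interval for p1 - p2,
    averaged over the prior predictive distribution (independent Beta priors assumed)."""
    widths = np.empty(n_datasets)
    for i in range(n_datasets):
        p1, p2 = rng.beta(a1, b1), rng.beta(a2, b2)              # truth drawn from the priors
        y1, y2 = rng.binomial(n_per_arm, p1), rng.binomial(n_per_arm, p2)
        post1 = rng.beta(a1 + y1, b1 + n_per_arm - y1, size=n_post)
        post2 = rng.beta(a2 + y2, b2 + n_per_arm - y2, size=n_post)
        lo, hi = np.percentile(post1 - post2, [2.5, 97.5])
        widths[i] = hi - lo
    return widths.mean()

target = 0.25
for n in (25, 50, 100, 200):
    w = expected_ci_width(n)
    print(f"n per arm = {n:3d}: expected 95% interval width = {w:.3f}"
          + ("   <- meets the assumed target" if w <= target else ""))
```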

Journal ArticleDOI
TL;DR: In this article, the authors propose the hypothesis that human intuition conforms to the "empirical law of large numbers" and distinguish between two kinds of tasks: one that can be solved by this intuition (frequency distributions) and one for which it is not sufficient (sampling distributions).
Abstract: According to Jacob Bernoulli, even the 'stupidest man' knows that the larger one's sample of observations, the more confidence one can have in being close to the truth about the phenomenon observed. Two-and-a-half centuries later, psychologists empirically tested people's intuitions about sample size. One group of such studies found participants attentive to sample size; another found participants ignoring it. We suggest an explanation for a substantial part of these inconsistent findings. We propose the hypothesis that human intuition conforms to the 'empirical law of large numbers' and distinguish between two kinds of tasks: one that can be solved by this intuition (frequency distributions) and one for which it is not sufficient (sampling distributions). A review of the literature reveals that this distinction can explain a substantial part of the apparently inconsistent results. © 1997 by John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: For typical ROC curves, the DBM method is robust in testing for modality effects in the null case, given a sufficient sample size, and the test for modality difference performed well for the low and intermediate ROC curves, even with small case samples.

Journal Article
TL;DR: Although effect sizes were small to moderate, the consistent pattern of reduced hospital days across a majority of studies suggests for the first time that home care has a significant impact on this costly outcome.
Abstract: OBJECTIVE: To examine the impact of home care on hospital days. DATA SOURCES: Search of automated databases covering 1964-1994 using the key words "home care," "hospice," and "healthcare for the elderly." Home care literature review references also were inspected for additional citations. STUDY SELECTION: Of 412 articles that examined impact on hospital use/cost, those dealing with generic home care that reported hospital admissions/cost and used a comparison group receiving customary care were selected (N = 20). STUDY DESIGN: A meta-analytic analysis used secondary data sources between 1967 and 1992. DATA EXTRACTION: Study characteristics that could have an impact on effect size (i.e., country of origin, study design, disease characteristics of study sample, and length of follow-up) were abstracted and coded to serve as independent variables. Available statistics on hospital days necessary to calculate an effect size were extracted. If necessary information was missing, the authors of the articles were contacted. METHODS: Effect sizes and homogeneity of variance measures were calculated using Dstat software, weighted for sample size. Overall effect sizes were compared by the study characteristics described above. PRINCIPAL FINDINGS: Effect sizes indicate a small to moderate positive impact of home care in reducing hospital days, ranging from 2.5 to 6 days (effect sizes of -.159 and -.379, respectively), depending on the inclusion of a large quasi-experimental study with a large treatment effect. When this outlier was removed from analysis, the effect size for studies that targeted terminally ill patients exclusively was homogeneous across study subcategories; however, the effect size of studies that targeted nonterminal patients was heterogeneous, indicating that unmeasured variables or interactions account for variability. CONCLUSION: Although effect sizes were small to moderate, the consistent pattern of reduced hospital days across a majority of studies suggests for the first time that home care has a significant impact on this costly outcome.

Journal ArticleDOI
TL;DR: Simon's two-stage design is extended to a three-stage design, and tables for both optimal and minimax designs are provided; the three-stage design can reduce the expected sample size when the treatment is not promising a priori and when the accrual rate is slow.
Abstract: The objective of a phase II cancer clinical trial is to screen a treatment that can produce a similar or better response rate compared to the current treatment results. This screening is usually carried out in two stages as proposed by Simon. For ineffective treatment, the trial should terminate at the first stage. Ensign et al. extended two-stage optimal designs to three stages; however, they restricted the rejection region in the first stage to be zero response, and the sample size to at least 5. This paper extends Simon's two-stage to a three-stage design without these restrictions, and provides tables for both optimal and minimax designs. One can use the three-stage design to reduce the expected sample size when the treatment is not promising a priori and when the accrual rate is slow. The average reduction in size from a two-stage to three-stage design is 10 per cent.
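
The quantities that the optimal and minimax searches trade off are simple binomial sums. As a sketch, the function below evaluates a given two-stage design (n1, r1, n, r): its type I error, power, probability of early termination, and expected sample size under p0; the three-stage extension adds one more layer of the same bookkeeping. The example parameters are a commonly cited Simon optimal two-stage design for p0 = 0.10, p1 = 0.30, alpha = beta = 0.10, used here as an assumed illustration.

```python
from scipy import stats

def two_stage_properties(n1, r1, n, r, p0, p1):
    """Operating characteristics of a Simon two-stage design: stop (abandon the drug)
    after stage 1 if responses <= r1, otherwise enroll to n patients and declare the
    drug promising only if total responses exceed r."""
    def prob_declare_promising(p):
        total = 0.0
        for x1 in range(r1 + 1, n1 + 1):                 # continue past stage 1
            total += stats.binom.pmf(x1, n1, p) * stats.binom.sf(r - x1, n - n1, p)
        return total

    pet0 = stats.binom.cdf(r1, n1, p0)                   # probability of early termination
    en0 = n1 + (1 - pet0) * (n - n1)                     # expected sample size under p0
    return {"alpha": round(prob_declare_promising(p0), 3),
            "power": round(prob_declare_promising(p1), 3),
            "PET(p0)": round(pet0, 3), "E[N|p0]": round(en0, 1)}

# A commonly cited optimal two-stage design for p0 = 0.10, p1 = 0.30, alpha = beta = 0.10.
print(two_stage_properties(n1=12, r1=1, n=35, r=5, p0=0.10, p1=0.30))
```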

Journal ArticleDOI
TL;DR: In this article, a modification of the W statistic is proposed so that the test can be applied to all sample sizes, and critical values of the modified statistic are given for n up to 5000.
Abstract: The W statistic of Shapiro and Wilk provides the best omnibus test of normality, but its application is limited to samples of size n ≤ 50. This study modifies W so that it can be extended to all sample sizes. Critical values of the modified statistic are given for n up to 5000. The empirical moments show that the null distribution of the modified statistic is skewed to the left and is consistent for all sample sizes. Empirical powers of the modified statistic are also comparable with those of W.
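
The extension's critical values come from the null distribution of the statistic at each sample size, and the same idea can be sketched by Monte Carlo using scipy's Shapiro-Wilk implementation (which itself relies on approximations for larger n). The sample sizes and replication count below are illustrative assumptions, and the statistic simulated is scipy's W rather than the modified statistic proposed in the article.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12)

def empirical_w_critical(n, alpha=0.05, n_sims=5000):
    """Monte Carlo lower critical value and skewness of the Shapiro-Wilk W under normality."""
    w = np.array([stats.shapiro(rng.normal(size=n))[0] for _ in range(n_sims)])
    return np.quantile(w, alpha), stats.skew(w)

for n in (20, 50, 200, 1000):
    crit, skw = empirical_w_critical(n)
    print(f"n = {n:5d}: 5% critical value of W ~ {crit:.4f}, skewness of null W = {skw:.2f}")
```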