
Showing papers in "Psychological Methods in 1996"


Journal ArticleDOI
TL;DR: In this article, a framework for hypothesis testing and power analysis in the assessment of fit of covariance structure models is presented, where the value of confidence intervals for fit indices is emphasized.
Abstract: A framework for hypothesis testing and power analysis in the assessment of fit of covariance structure models is presented. We emphasize the value of confidence intervals for fit indices, and we stress the relationship of confidence intervals to a framework for hypothesis testing. The approach allows for testing null hypotheses of not-good fit, reversing the role of the null hypothesis in conventional tests of model fit, so that a significant result provides strong support for good fit. The approach also allows for direct estimation of power, where effect size is defined in terms of a null and alternative value of the root-mean-square error of approximation fit index proposed by J. H. Steiger and J. M. Lind (1980). It is also feasible to determine minimum sample size required to achieve a given level of power for any test of fit in this framework. Computer programs and examples are provided for power analyses and calculation of minimum sample sizes.
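The RMSEA interval at the center of this framework can be sketched numerically. The following is a minimal illustration, not the authors' own programs: it assumes the standard noncentral chi-square relationship, and the function name `rmsea_ci` and the bracketing constants are mine.

```python
# Sketch: RMSEA point estimate and confidence interval from a model chi-square.
# Assumes the usual noncentral chi-square relationship; names are illustrative.
from math import sqrt
from scipy.optimize import brentq
from scipy.stats import ncx2

def rmsea_ci(chi2, df, n, level=0.90):
    """Return (point estimate, lower bound, upper bound) for the RMSEA."""
    point = sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

    def lam_for(p):
        # noncentrality lambda with P(X <= chi2 | df, lambda) = p
        f = lambda lam: ncx2.cdf(chi2, df, lam) - p
        eps = 1e-9
        if f(eps) <= 0:  # even lambda ~ 0 puts too little mass below chi2
            return 0.0
        return brentq(f, eps, 10.0 * chi2 + 100.0)

    lo = sqrt(lam_for((1 + level) / 2) / (df * (n - 1)))
    hi = sqrt(lam_for((1 - level) / 2) / (df * (n - 1)))
    return point, lo, hi
```

A test of "not-good fit" in the article's sense then amounts to checking whether the entire interval falls below a chosen cutoff value of the RMSEA.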

8,401 citations


Journal ArticleDOI
TL;DR: This article reviews the distinction between the various forms of the intraclass correlation coefficient (ICC) and presents procedures for forming inferences about ICCs.
Abstract: Although intraclass correlation coefficients (ICCs) are commonly used in behavioral measurement, psychometrics, and behavioral genetics, procedures available for forming inferences about ICCs are not widely known. Following a review of the distinction between various forms of the ICC, this article …
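As a concrete companion to the forms of the ICC being distinguished, here is a minimal sketch of the classic single-rater estimators from an n-targets-by-k-raters data layout, computed from the standard two-way ANOVA mean squares (Shrout-Fleiss notation). The function name is mine, and the article's inference procedures go well beyond these point estimates.

```python
# Sketch: single-rater ICC estimates from an n_targets x k_raters matrix,
# via the standard two-way ANOVA mean squares. Illustrative only.
import numpy as np

def icc_estimates(x):
    """Return (ICC(1,1), ICC(2,1), ICC(3,1)) in Shrout-Fleiss notation."""
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)            # one mean per target
    col_means = x.mean(axis=0)            # one mean per rater
    bms = k * ((row_means - grand) ** 2).sum() / (n - 1)         # between targets
    jms = n * ((col_means - grand) ** 2).sum() / (k - 1)         # between raters
    resid = x - row_means[:, None] - col_means[None, :] + grand
    ems = (resid ** 2).sum() / ((n - 1) * (k - 1))               # residual
    wms = ((x - row_means[:, None]) ** 2).sum() / (n * (k - 1))  # within targets
    icc1 = (bms - wms) / (bms + (k - 1) * wms)
    icc2 = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    icc3 = (bms - ems) / (bms + (k - 1) * ems)
    return icc1, icc2, icc3
```

With a constant rater bias, ICC(3,1) (consistency) stays at 1 while ICC(1,1) and ICC(2,1) (agreement) fall below 1 — exactly the kind of difference between ICC forms that makes the distinction matter.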

5,858 citations


Journal ArticleDOI
TL;DR: In this paper, Monte Carlo simulations were used to investigate the performance of three χ² test statistics in confirmatory factor analysis (CFA) — normal-theory maximum likelihood χ² (ML), Browne's asymptotic distribution-free χ² (ADF), and the Satorra-Bentler rescaled χ² (SB) — under varying conditions of sample size, model specification, and multivariate distribution.
Abstract: Monte Carlo computer simulations were used to investigate the performance of three χ² test statistics in confirmatory factor analysis (CFA). Normal-theory maximum likelihood χ² (ML), Browne's asymptotic distribution-free χ² (ADF), and the Satorra-Bentler rescaled χ² (SB) were examined under varying conditions of sample size, model specification, and multivariate distribution. For properly specified models, ML and SB showed no evidence of bias under normal distributions across all sample sizes, whereas ADF was biased at all but the largest sample sizes. ML was increasingly overestimated with increasing nonnormality, but both SB (at all sample sizes) and ADF (only at large sample sizes) showed no evidence of bias. For misspecified models, ML was again inflated with increasing nonnormality, but both SB and ADF were underestimated with increasing nonnormality. It appears that the power of the SB and ADF test statistics to detect a model misspecification is attenuated given nonnormally distributed data.
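For reference, the normal-theory ML statistic examined here has a simple closed form. The sketch below assumes the standard ML discrepancy function for covariance structures; names are illustrative and the ADF and SB adjustments are not reproduced.

```python
# Sketch: normal-theory ML test statistic T = (n - 1) * F_ML for a covariance
# structure model. S is the sample covariance, Sigma the model-implied one.
import numpy as np

def ml_chi_square(S, Sigma, n):
    p = S.shape[0]
    Sigma_inv = np.linalg.inv(Sigma)
    f_ml = (np.log(np.linalg.det(Sigma)) + np.trace(S @ Sigma_inv)
            - np.log(np.linalg.det(S)) - p)
    return (n - 1) * f_ml  # referred to a chi-square with the model's df
```

When the model reproduces the sample covariance exactly (S = Sigma), T is 0; the SB statistic rescales this quantity to correct its behavior under nonnormality.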

4,168 citations


Journal ArticleDOI
TL;DR: In this article, the authors address issues that arise when meta-analyses are conducted on experiments with matched groups or repeated measures designs, and discuss procedures for computing effect size appropriately from matched groups and repeated measures.
Abstract: Tests for experiments with matched groups or repeated measures designs use error terms that involve the correlation between the measures as well as the variance of the data. The larger the correlation between the measures, the smaller the error and the larger the test statistic. If an effect size is computed from the test statistic without taking the correlation between the measures into account, effect size will be overestimated. Procedures for computing effect size appropriately from matched groups or repeated measures designs are discussed. The purpose of this article is to address issues that arise when meta-analyses are conducted on experiments with matched groups or repeated measures designs. It should be made clear at the outset that although this article pertains to meta-analyses of experiments with correlated measures, it does not pertain to meta-analyses of correlations. Such experimental designs, often called matched groups designs or designs with repeated measures, we call correlated designs (CDs), and their analysis is decidedly different from that of the independent groups designs (IGDs). The matched groups design in its simplest form occurs when subjects are matched on some variable and then randomly assigned by matched pairs to experimental and control conditions. The correlation for this type of CD is the correlation between experimental and control scores across matched pairs. The second type of CD is the repeated measures design, which in its simplest form tests the same subject under both experimental and control conditions, usually in random or counterbalanced orders to minimize carryover. The correlation of importance here is the correlation that commonly occurs between the repeated measures, and this correlation is often quite high in human research.
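The correction described above can be sketched in one line: the conversion d = t·√(2(1 − r)/n) turns a paired t statistic (n pairs, correlation r between the measures) into an effect size without the correlation-driven inflation. The function name is illustrative.

```python
# Sketch: effect size from a correlated (matched-pairs / repeated measures)
# design, removing the inflation due to the correlation r between measures.
from math import sqrt

def d_from_correlated_design(t_paired, r, n_pairs):
    """Effect size d from a paired t statistic with n_pairs pairs and
    correlation r between the two sets of scores."""
    return t_paired * sqrt(2.0 * (1.0 - r) / n_pairs)
```

Setting r = 0 recovers the independent-groups conversion d = t·√(2/n), so for any r > 0 the naive conversion overestimates d — the overestimation the abstract warns about.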

1,642 citations


Journal ArticleDOI
TL;DR: This article argues that the benefits researchers believe flow from the use of significance testing are illusory, and that significance tests should be replaced with point estimates and confidence intervals in individual studies and with meta-analyses in the integration of multiple studies.
Abstract: Data analysis methods in psychology still emphasize statistical significance testing, despite numerous articles demonstrating its severe deficiencies. It is now possible to use meta-analysis to show that reliance on significance testing retards the development of cumulative knowledge. But reform of teaching and practice will also require that researchers learn that the benefits that they believe flow from use of significance testing are illusory. Teachers must revamp their courses to bring students to understand that (a) reliance on significance testing retards the growth of cumulative research knowledge; (b) benefits widely believed to flow from significance testing do not in fact exist; and (c) significance testing methods must be replaced with point estimates and confidence intervals in individual studies and with meta-analyses in the integration of multiple studies. This reform is essential to the future progress of cumulative knowledge in psychological research.

1,012 citations


Journal ArticleDOI
TL;DR: In this paper, the authors examined evidence for dimensional and typological models of dissociation, reviewing previous research with the Dissociative Experiences Scale (DES; E. B. Bernstein-Carlson & F. W. Putnam, 1986) and noting that this scale, like other dissociation questionnaires, was developed to …
Abstract: This article examined evidence for dimensional and typological models of dissociation. The authors reviewed previous research with the Dissociative Experiences Scale (DES; E. B. Bernstein-Carlson & F. W. Putnam, 1986) and noted that this scale, like other dissociation questionnaires, was developed to …

814 citations



Journal ArticleDOI
TL;DR: In this article, the authors examine 26 real-world "case studies" and explicate them on the basis of the principles of reliability theory found in Cronbach (1947, 1951) and Cronbach et al. (1972).
Abstract: …technical principles to concrete research problems that appear in a wide variety of forms and with a myriad of obscuring features, details, and idiosyncrasies. This article is based on a different approach: examination of a series of concrete research scenarios that we have encountered in our work as researchers, advisors to researchers, or reviewers. This article examines 26 real-world "case studies" and explicates them on the basis of the principles of reliability theory found in Cronbach (1947, 1951) and Cronbach et al. (1972). Although we cite and rely on these sources, other sources (e.g., Thorndike, 1951) would yield identical resolutions; only the terminology would sometimes be (slightly) different. In classical measurement theory, the fundamental general formula for the observed correlation between any two measures, x and y, is r_xy = r_TxTy * sqrt(r_xx * r_yy), where r_TxTy is the correlation between the true scores and r_xx and r_yy are the reliabilities of the two measures; solving for r_TxTy yields the disattenuation formula r_TxTy = r_xy / sqrt(r_xx * r_yy).
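The disattenuation step reduces to one line of code; a minimal sketch (the function name is mine):

```python
# Sketch: classical correction for attenuation.
from math import sqrt

def disattenuate(r_xy, r_xx, r_yy):
    """Estimated true-score correlation from the observed correlation r_xy
    and the reliabilities r_xx, r_yy of the two measures."""
    return r_xy / sqrt(r_xx * r_yy)
```

For example, an observed correlation of .30 between measures with reliabilities .64 and .81 disattenuates to .30 / (.8 × .9) ≈ .42.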

485 citations




Journal ArticleDOI
TL;DR: The line between sufficient and insufficient evidence is currently set at p < .05, and, as argued in this paper, there is little reason to allow experimenters to select their own value of alpha; thus null hypothesis testing is an optimal method for demonstrating sufficient evidence for an ordinal claim.
Abstract: The many criticisms of null hypothesis testing suggest when it is not useful and what it should not be used for. This article explores when and why its use is appropriate. Null hypothesis testing is insufficient when the size of an effect is important, but it is ideal for testing ordinal claims about the order of conditions, which are common in psychology. Null hypothesis testing also is insufficient for determining beliefs, but it is ideal for demonstrating sufficient evidential strength to support an ordinal claim, with sufficient evidence being one criterion for a finding entering the corpus of legitimate findings in psychology. The line between sufficient and insufficient evidence is currently set at p < .05; there is little reason for allowing experimenters to select their own value of alpha. Thus null hypothesis testing is an optimal method for demonstrating sufficient evidence for an ordinal claim.


Journal ArticleDOI
TL;DR: In this paper, the authors show that for the case of two raters and two possible ratings, the assumption of equal frequencies can be dropped, allowing for almost immediate sample-size determination for a variety of common study designs.
Abstract: In recent years, researchers in the psychosocial and biomedical sciences have become increasingly aware of the importance of sample-size calculations in the design of research projects. Such considerations are, however, rarely applied to studies involving agreement among raters. Published results on this topic are limited and generally provide rather complex formulas. In addition, they generally make the assumption that the raters have the same set of frequencies for the possible ratings. In this article I show that for the case of 2 raters and 2 possible ratings the assumption of equal frequencies can be dropped. Tables that allow almost immediate sample-size determination for a variety of common study designs are given.






Journal ArticleDOI
TL;DR: Laschewer, A. D. (1986), cited in this paper, studied the effect of computer assisted instruction as a coaching technique for Scholastic Aptitude Test preparation of high school juniors.
Abstract: …s International, 25, 324. (University Microfilms No. 64-4257) *Laschewer, A. D. (1986). The effect of computer assisted instruction as a coaching technique for the Scholastic Aptitude Test preparation of high school juniors. Unpublished doctoral dissertation, Hofstra University.

Journal ArticleDOI
TL;DR: In this article, the authors extend the usual approach to the assessment of test or rater reliability to situations that have previously not been appropriate for the application of this standard (Spearman-Brown) approach.
Abstract: The authors extend the usual approach to the assessment of test or rater reliability to situations that have previously not been appropriate for the application of this standard (Spearman-Brown) approach. Specifically, the authors (a) provide an accurate overall estimate of the reliability of a test (or a panel of raters) comprising 2 or more different kinds of items (or raters), a quite common situation, and (b) provide a simple procedure for constructing the overall instrument when it comprises 2 or more kinds of items, judges, or raters, each with its own costs and its own reliabilities.
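The classical Spearman-Brown step that the authors generalize is reproduced below for reference; their extension to composites of different item or rater types, with differing costs and reliabilities, is in the article itself and is not sketched here.

```python
# Sketch: classical Spearman-Brown prophecy formula.
def spearman_brown(r_unit, k):
    """Reliability of a composite k times the length of a part whose
    reliability is r_unit (parts assumed parallel)."""
    return k * r_unit / (1.0 + (k - 1.0) * r_unit)
```

For example, doubling a test whose half has reliability .50 yields 2(.50)/(1 + .50) ≈ .67.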




Journal ArticleDOI
TL;DR: Permutation tests, which require far fewer assumptions, are an attractive alternative to the standard asymptotic methods for assigning significance to transitions; as discussed by the authors, log-linear and permutation methods can be used to winnow even further the set of transitions initially identified as significant.
Abstract: Assigning p values to all transitions in a matrix based on their z scores is problematic on 2 counts: The z scores may not be normally distributed, and transitions are interconnected. Permutation tests, which require far fewer assumptions, are an attractive alternative to the standard asymptotic methods for assigning significance. Moreover, when asymptotic z scores are only somewhat above their critical value and sequences are short, often the exact probabilities of permutation tests are not less than .05. Log-linear and permutation methods may be used to winnow the set of transitions initially identified as significant even further. A computer program that performs these tests is available from the authors.
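The permutation idea can be sketched for a single transition: shuffle the event sequence (holding event frequencies fixed), recount the transition, and take the proportion of shuffles that match or beat the observed count as the p value. This is an illustrative reimplementation, not the authors' program, and it ignores the interconnectedness of multiple transitions.

```python
# Sketch: one-sided permutation test for a single a -> b transition.
import random

def transition_count(seq, a, b):
    """Number of times event a is immediately followed by event b."""
    return sum(1 for x, y in zip(seq, seq[1:]) if x == a and y == b)

def permutation_p(seq, a, b, n_perm=5000, seed=0):
    """Permutation p value for 'a -> b occurs more often than chance'."""
    rng = random.Random(seed)
    observed = transition_count(seq, a, b)
    pool = list(seq)
    ge = 0
    for _ in range(n_perm):
        rng.shuffle(pool)
        if transition_count(pool, a, b) >= observed:
            ge += 1
    return (ge + 1) / (n_perm + 1)  # add-one rule keeps p strictly positive
```

For short sequences this permutation p often stays above .05 even when the asymptotic z score just clears its critical value, which is the winnowing effect the abstract describes.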




Journal ArticleDOI
TL;DR: In this paper, the authors propose an estimator that combines effect-size estimates and vote counts to handle missing estimates; it uses all the information available from studies in a research synthesis and gives each study weight proportional to the Fisher information it provides.
Abstract: Missing effect-size estimates pose a difficult problem in meta-analysis. Conventional procedures for dealing with this problem include discarding studies with missing estimates and imputing single values for missing estimates (e.g., 0, mean). An alternative procedure, which combines effect-size estimates and vote counts, is proposed for handling missing estimates. The combined estimator has several desirable features: (a) It uses all the information available from studies in a research synthesis, (b) it is consistent, (c) it is more efficient than other estimators, (d) it has known variance, and (e) it gives weight to all studies proportional to the Fisher information they provide. The combined procedure is the method of choice in a research synthesis when some studies do not provide enough information to compute effect-size estimates but do provide information about the direction or statistical significance of results.
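The Fisher-information weighting that property (e) refers to is the familiar inverse-variance rule; below is a minimal sketch of that pooling step only (the vote-count component of the combined estimator is in the article and is not reproduced here). The function name is mine.

```python
# Sketch: fixed-effect inverse-variance pooling of effect-size estimates.
def inverse_variance_pool(estimates, variances):
    """Weight each study's estimate by the reciprocal of its variance,
    i.e., by the Fisher information it carries.
    Returns (pooled estimate, variance of the pooled estimate)."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, 1.0 / total
```

A study contributing only a direction or significance level carries less Fisher information than one with a full effect-size estimate; the combined estimator extends this weighting so such studies still contribute rather than being discarded.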