Journal Article

Kappa coefficients in medical research

TLDR
Kappa coefficients are measures of correlation between categorical variables, often used as reliability or validity coefficients. The development and definitions of the K (categories) by M (ratings) kappas (K x M) are recapitulated, and the use of the recommended kappas is illustrated with applications in medical research.
Abstract
Kappa coefficients are measures of correlation between categorical variables often used as reliability or validity coefficients. We recapitulate development and definitions of the K (categories) by M (ratings) kappas (K x M), discuss what they are well- or ill-designed to do, and summarize where kappas now stand with regard to their application in medical research. The 2 x M (M ≥ 2) intraclass kappa seems the ideal measure of binary reliability; a 2 x 2 weighted kappa is an excellent choice, though not a unique one, as a validity measure. For both the intraclass and weighted kappas, we address continuing problems with kappas. There are serious problems with using the K x M intraclass (K > 2) or the various K x M weighted kappas for K > 2 or M > 2 in any context, either because they convey incomplete and possibly misleading information, or because other approaches are preferable to their use. We illustrate the use of the recommended kappas with applications in medical research.
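
As a rough illustration of chance-corrected agreement in the 2 x 2 case, the Python sketch below computes two kappa-type coefficients from a table of counts. It is not taken from the article: the function names, the table layout (rows for the first rating, columns for the second), and the example counts are assumptions made for illustration. The first function is Cohen's unweighted kappa, which uses each rating's own marginal proportion. The second assumes the commonly used form of the intraclass kappa for two ratings, in which chance agreement is computed from the pooled marginal proportion.

```python
from typing import Sequence, Tuple


def _proportions(table: Sequence[Sequence[float]]) -> Tuple[float, float, float]:
    """Return (observed agreement, P(first rating positive), P(second rating positive)).

    Assumed layout: table[0][0] = both positive, table[0][1] = first positive only,
    table[1][0] = second positive only, table[1][1] = both negative.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    if n <= 0:
        raise ValueError("table must contain at least one observation")
    return (a + d) / n, (a + b) / n, (a + c) / n


def _chance_corrected(p_obs: float, p_exp: float) -> float:
    """Generic kappa form: (observed - expected) / (1 - expected)."""
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_obs - p_exp) / (1.0 - p_exp)


def cohen_kappa_2x2(table: Sequence[Sequence[float]]) -> float:
    """Cohen's unweighted kappa: chance agreement from separate marginals."""
    p_obs, p1, p2 = _proportions(table)
    p_exp = p1 * p2 + (1.0 - p1) * (1.0 - p2)
    return _chance_corrected(p_obs, p_exp)


def intraclass_kappa_2x2(table: Sequence[Sequence[float]]) -> float:
    """Intraclass-style kappa (assumed form): chance agreement from the pooled marginal."""
    p_obs, p1, p2 = _proportions(table)
    p_bar = (p1 + p2) / 2.0
    p_exp = p_bar ** 2 + (1.0 - p_bar) ** 2
    return _chance_corrected(p_obs, p_exp)


if __name__ == "__main__":
    # Hypothetical counts for 100 subjects rated twice.
    counts = [[40, 10], [5, 45]]
    print("Cohen kappa:", cohen_kappa_2x2(counts))
    print("Intraclass kappa:", intraclass_kappa_2x2(counts))
```

The two coefficients coincide when both ratings have the same marginal proportion and diverge as the marginals differ, which is one reason the choice between them depends on whether the ratings are treated as interchangeable.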

Citations
Journal Article

The Kappa Statistic in Reliability Studies: Use, Interpretation, and Sample Size Requirements

TL;DR: The issue of statistical testing of kappa is considered, including the use of confidence intervals, and appropriate sample sizes for reliability studies using kappa are tabulated.
Journal Article

The NimStim set of facial expressions: Judgments from untrained research participants

TL;DR: The results lend empirical support for the validity and reliability of this set of facial expressions as determined by accurate identification of expressions and high intra-participant agreement across two testing sessions, respectively.
Journal Article

Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed.

TL;DR: In this paper, the authors developed guidelines for reporting studies of interrater and intrarater reliability and agreement, and proposed 15 issues that should be addressed when reporting such studies.
Book

Bayesian Cognitive Modeling: A Practical Course

TL;DR: In this book, the basics of Bayesian analysis are discussed, and a WinBUGS-based approach to getting started is presented, with case studies that include the SIMPLE model of memory.
Journal Article

Clinical classification schemes for predicting hemorrhage: Results from the National Registry of Atrial Fibrillation (NRAF)

TL;DR: In this article, a new bleeding risk scheme, HEMORR2HAGES, was proposed to quantify the risk of hemorrhage in elderly patients with atrial fibrillation.
References
Journal Article

The measurement of observer agreement for categorical data

TL;DR: A general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies is presented. Tests for interobserver bias are presented in terms of first-order marginal homogeneity, and measures of interobserver agreement are developed as generalized kappa-type statistics.
Journal Article

A Coefficient of Agreement for Nominal Scales

TL;DR: In this article, the author presents a procedure for having two or more judges independently categorize a sample of units and for determining the degree and significance of their agreement, that is, the extent to which these judgments are reproducible (reliable).
Journal Article

Intraclass correlations: uses in assessing rater reliability.

TL;DR: In this article, the authors present guidelines for choosing among six different forms of the intraclass correlation for reliability studies in which n targets are rated by k judges, and the confidence intervals for each of the forms are reviewed.
Book

Statistical methods for rates and proportions

TL;DR: In this book, the basic theory of maximum likelihood estimation (MLE) is used to detect a difference between two proportions.
Journal Article

Bootstrap Methods: Another Look at the Jackknife

TL;DR: In this article, the authors discuss the problem of estimating the sampling distribution of a pre-specified random variable R(X, F) on the basis of the observed data x.