About: Reliability (statistics) is a research topic. Over its lifetime, 78,671 publications have been published on this topic, receiving 1,034,804 citations.
Papers published on a yearly basis
TL;DR: Provides a practical guideline for clinical researchers on choosing the correct form of ICC, and suggests best practices for reporting ICC parameters in scientific publications.
Abstract: Objective Intraclass correlation coefficient (ICC) is a widely used reliability index in test-retest, intrarater, and interrater reliability analyses. This article introduces the basic concept of ICC in the context of reliability analysis.
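The article above surveys the different forms of ICC without giving a formula. As a minimal illustrative sketch (not the authors' code), here is one common form, ICC(2,1) — two-way random effects, absolute agreement, single rater — computed from the standard sums-of-squares decomposition following the Shrout–Fleiss convention; the function name `icc2_1` is an assumption for this example.

```python
import numpy as np

def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: array-like of shape (n_subjects, k_raters).
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    subj_means = ratings.mean(axis=1)
    rater_means = ratings.mean(axis=0)

    # Partition total variability into subject, rater, and residual parts.
    ss_total = ((ratings - grand) ** 2).sum()
    ss_subj = k * ((subj_means - grand) ** 2).sum()    # between subjects
    ss_rater = n * ((rater_means - grand) ** 2).sum()  # between raters
    ss_err = ss_total - ss_subj - ss_rater             # residual

    msr = ss_subj / (n - 1)                # mean square, rows (subjects)
    msc = ss_rater / (k - 1)               # mean square, columns (raters)
    mse = ss_err / ((n - 1) * (k - 1))     # mean square, error

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

For example, on the six-subject, four-judge dataset commonly used to demonstrate these formulas, this definition yields an ICC(2,1) of roughly 0.29.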
TL;DR: While kappa is one of the most commonly used statistics for testing interrater reliability, it has limitations; thresholds for both kappa and percent agreement that should be demanded in healthcare studies are suggested.
Abstract: The kappa statistic is frequently used to test interrater reliability. The importance of rater reliability lies in the fact that it represents the extent to which the data collected in the study are correct representations of the variables measured. Measurement of the extent to which data collectors (raters) assign the same score to the same variable is called interrater reliability. While a variety of methods exist to measure interrater reliability, it was traditionally measured as percent agreement, calculated as the number of agreement scores divided by the total number of scores. In 1960, Jacob Cohen critiqued the use of percent agreement for its inability to account for chance agreement. He introduced Cohen's kappa, developed to account for the possibility that raters actually guess on at least some variables due to uncertainty. Like most correlation statistics, kappa can range from -1 to +1. While kappa is one of the most commonly used statistics to test interrater reliability, it has limitations. Judgments about what level of kappa should be acceptable for health research are questioned. Cohen's suggested interpretation may be too lenient for health-related studies because it implies that a score as low as 0.41 might be acceptable. Kappa and percent agreement are compared, and levels for both kappa and percent agreement that should be demanded in healthcare studies are suggested.
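The abstract above defines percent agreement (agreements divided by total scores) and Cohen's chance-corrected kappa. A minimal sketch of both, assuming two raters assigning nominal labels to the same items; the function names are illustrative, and kappa follows the standard definition (P_o - P_e) / (1 - P_e), with expected chance agreement P_e taken from each rater's marginal label frequencies.

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Fraction of items on which the two raters assigned the same label."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(r1)
    p_o = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement: product of the raters' marginal label proportions.
    p_e = sum(c1[cat] * c2[cat] for cat in set(r1) | set(r2)) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

Note how the correction matters: two raters agreeing on 8 of 10 binary labels have 80% agreement, but kappa can be substantially lower once the agreement expected by chance is subtracted.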
TL;DR: For diagnostic systems used to distinguish between two classes of events, analysis in terms of the "relative operating characteristic" of signal detection theory provides a precise and valid measure of diagnostic accuracy.
Abstract: Diagnostic systems of several kinds are used to distinguish between two classes of events, essentially "signals" and "noise". For them, analysis in terms of the "relative operating characteristic" of signal detection theory provides a precise and valid measure of diagnostic accuracy. It is the only measure available that is uninfluenced by decision biases and prior probabilities, and it places the performances of diverse systems on a common, easily interpreted scale. Representative values of this measure are reported here for systems in medical imaging, materials testing, weather forecasting, information retrieval, polygraph lie detection, and aptitude testing. Though the measure itself is sound, the values obtained from tests of diagnostic systems often require qualification because the test data on which they are based are of unsure quality. A common set of problems in testing is faced in all fields. How well these problems are handled, or can be handled in a given field, determines the degree of confidence that can be placed in a measured value of accuracy. Some fields fare much better than others.
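The abstract above describes the relative (receiver) operating characteristic as a measure of how well a system separates "signals" from "noise", uninfluenced by decision biases. One standard way to summarize an ROC in a single number is the area under it, which equals the probability that a randomly chosen signal case receives a higher score than a randomly chosen noise case (ties counted as one half). This equivalence is not stated in the paper itself; the sketch below assumes it, with illustrative names.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise-comparison equivalence.

    scores: diagnostic scores, higher = more "signal-like".
    labels: 1 for signal cases, 0 for noise cases.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive-negative pairs where the positive outscores the
    # negative; ties contribute half a win.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

An area of 0.5 corresponds to chance-level discrimination, and 1.0 to a system that ranks every signal case above every noise case.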
01 Nov 1979
TL;DR: The paper shows how reliability is assessed by the retest method, alternative-forms procedure, split-halves approach, and internal consistency method.
Abstract: Explains how social scientists can evaluate the reliability and validity of empirical measurements, discussing the three basic types of validity: criterion related, content, and construct. In addition, the paper shows how reliability is assessed by the retest method, alternative-forms procedure, split-halves approach, and internal consistency method.
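Among the four methods the paper names, the internal consistency method is most often operationalized as Cronbach's alpha. The abstract does not give the formula, so the sketch below assumes the conventional definition: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), for k items.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.

    items: array-like of shape (n_respondents, k_items).
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    # Sum of the variances of the individual items (sample variance).
    item_var = items.var(axis=0, ddof=1).sum()
    # Variance of each respondent's total score across all items.
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)
```

Intuitively, when items move together the total-score variance dwarfs the summed item variances and alpha approaches 1; when items are unrelated, the two are comparable and alpha falls toward 0.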
TL;DR: The use of reliability and validity is common in quantitative research and is now being reconsidered in the qualitative research paradigm, as discussed by the authors; triangulation can also illuminate ways to test or maximize the validity and reliability of a qualitative study.
Abstract: The use of reliability and validity is common in quantitative research, and it is now being reconsidered in the qualitative research paradigm. Since reliability and validity are rooted in the positivist perspective, they should be redefined for use in a naturalistic approach. Just as reliability and validity as used in quantitative research provide a springboard for examining what these two terms mean in the qualitative research paradigm, triangulation as used in quantitative research to test reliability and validity can also illuminate ways to test or maximize the validity and reliability of a qualitative study. Therefore, reliability, validity and triangulation, if they are to remain relevant research concepts, particularly from a qualitative point of view, have to be redefined in order to reflect the multiple ways of establishing truth. Key words: Reliability, Validity, Triangulation, Construct, Qualitative, and Quantitative. This article discusses the use of reliability and validity in the qualitative research paradigm. First, the meanings of quantitative and qualitative research are discussed. Secondly, reliability and validity as used in quantitative research are discussed as a way of providing a springboard for examining what these two terms mean and how they can be tested in the qualitative research paradigm. The paper concludes by drawing on the use of triangulation in the two paradigms (quantitative and qualitative) to show how these changes have influenced our understanding of reliability, validity and triangulation in qualitative studies.
Related Topics (5)
Competence (human resources): 53.5K papers, 988.8K citations
(topic name not captured): 213.2K papers, 3.8M citations
New product development: 41.5K papers, 1M citations
(topic name not captured): 35.5K papers, 878.9K citations
Pattern recognition (psychology): 26.1K papers, 722.8K citations