Benchmarking Physician Performance: Reliability of Individual and Composite Measures

Journal Article


Sarah Hudson Scholle¹, Joachim Roski¹,², John L. Adams³, Daniel L. Dunn, Eve A. Kerr, Donna Pillittere Dugan¹, Roxanne E. Jensen⁴

National Committee for Quality Assurance¹, Brookings Institution², RAND Corporation³, Johns Hopkins University⁴

01 Dec 2008 - The American Journal of Managed Care (NIH Public Access) - Vol. 14, Iss. 12, pp. 833-838

TL;DR: In typical health plan administrative data, most physicians do not have adequate numbers of quality events to support reliable quality measurement, and the reliability of quality measures should be taken into account when quality information is used for public reporting and accountability.


Abstract: Measuring physician performance is becoming commonplace as health plans and purchasers look for ways to drive quality improvement and to increase physicians' accountability and rewards for achieving quality goals. A recent study [1] reported that, among 89% of health maintenance organization plans using physician-oriented pay-for-performance programs, more than one-third measured and rewarded quality at the individual physician level. In addition, public and private purchasers are demanding more information about America's physicians and hospitals to aid in value-based purchasing and selection of health plans and providers [2]. However, concerns remain regarding the validity and reliability of such physician performance profiles. Several factors are needed to support fair and accurate comparisons among physicians. These include evidence-based quality measures, complete and accurate data sources, and standardized methods of data collection. Physician-level reliability of a quality measure is another key consideration in this measurement. Physician-level reliability refers to the ability of a quality measure to distinguish an individual physician's performance from the performance of physicians overall. Good physician-level reliability requires the following 2 factors: (1) a sufficient number of patients eligible for a given quality measure and (2) performance variation across physicians on that quality measure [3-5]. The greater the number of a physician's patients who are eligible for a quality measure, the more precise the estimate of the physician's performance. When performance variation for a given quality measure across physicians is limited, the likelihood that a physician's performance is statistically significantly different from that of his or her peers is also decreased. Hofer and colleagues [6] showed that not controlling for a quality measure's physician-level reliability significantly misrepresented performance differences across physicians.
However, adjusting performance profiles in such a manner is not commonplace across the healthcare industry. Ensuring that measurement results are valid and reliable is important when purchasers and plans (and potentially consumers) use the data to make decisions about which physicians get financial rewards or other benefits. The stakes are particularly high when profiling results are used for public reporting or eligibility for participation in a health plan network. Paying attention to the validity and reliability of data will help to ensure that these decisions are based on real differences in performance among physicians rather than any shortcomings of the measurement. Although performance results based on limited sample sizes could be adjusted for the reliability of individual measures [7-9], the creation of composite scores may also be a useful way to increase the reliability of physicians' performance scores [10]. Little is known about the extent to which constructing composite scores mitigates the limitations of sample size and reliability, while continuing to provide useful and understandable information [11]. To date, there have been few reports regarding the reliability of physician-level performance scores associated with commonly used practices and methods in the healthcare industry. To begin to address this deficiency, this study relied on a large data set that combined patient-level administrative data from 9 large health plans to compute performance for primary care physicians (PCPs) using 27 commonly measured quality indicators. This data set is typical of data sources often used by individual health plans to profile physician performance. Specifically, we examined for each quality measure and composite score the proportion of PCPs who could be evaluated given different minimum sample size criteria and the physician-level reliability under those minimum sample size criteria.
Our primary research questions were the following: (1) What is the physician-level reliability of commonly used performance measures calculated exclusively based on administrative data? (2) Can more physicians be reliably evaluated using a composite score?
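
The abstract names two drivers of physician-level reliability: the number of eligible patients and the variation in performance across physicians. It does not state a formula. The sketch below uses a common signal-to-noise formulation as an assumption, not as the authors' method: reliability is the between-physician variance divided by the sum of that variance and the sampling variance of one physician's rate. The function names, the binomial approximation for sampling variance, and all numeric inputs are hypothetical and chosen only for illustration.

```python
import math


def physician_reliability(between_var: float, mean_rate: float, n_patients: int) -> float:
    """Signal-to-noise reliability for a pass/fail quality measure (assumed model).

    between_var: variance of true performance rates across physicians (signal).
    mean_rate:   overall pass rate, used for a binomial approximation of the
                 sampling noise in a single physician's observed rate.
    n_patients:  number of the physician's patients eligible for the measure.
    """
    if n_patients <= 0:
        raise ValueError("n_patients must be positive")
    # Sampling variance of an observed rate shrinks as eligible patients increase.
    within_var = mean_rate * (1.0 - mean_rate) / n_patients
    return between_var / (between_var + within_var)


def min_patients_for(target: float, between_var: float, mean_rate: float) -> int:
    """Smallest eligible-patient count whose reliability reaches `target`.

    Obtained by solving between / (between + p(1-p)/n) >= target for n.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    noise = mean_rate * (1.0 - mean_rate)
    return math.ceil(target * noise / ((1.0 - target) * between_var))


if __name__ == "__main__":
    # Hypothetical inputs: 70% average pass rate, between-physician SD of 10 points.
    between_var = 0.10 ** 2
    for n in (10, 25, 50, 100):
        r = physician_reliability(between_var, 0.70, n)
        print(f"n={n:4d}  reliability={r:.2f}")
    print("patients needed for 0.70:", min_patients_for(0.70, between_var, 0.70))
```

Under this assumed model, both factors act as the abstract describes: more eligible patients reduce the noise term, and less variation across physicians reduces the signal term. A composite that pools quality events across several measures raises the effective event count per physician, which is one way to read the suggestion that composites can improve reliability, although the pooling rule here is not taken from the paper.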
