
Showing papers by "Benjamin D. Schalet" published in 2021


Journal ArticleDOI
TL;DR: In this article, the authors demonstrate multiple linking procedures (equipercentile, unidimensional IRT calibration, and calibrated projection) with the Patient-Reported Outcomes Measurement Information System Depression bank and the Patient Health Questionnaire-9.
Abstract: The psychometric process used to establish a relationship between the scores of two (or more) instruments is generically referred to as linking. When two instruments with the same content and statistical test specifications are linked, these instruments are said to be equated. Linking and equating procedures have long been used for practical benefit in educational testing. In recent years, health outcome researchers have increasingly applied linking techniques to patient-reported outcome (PRO) data. However, these applications have some noteworthy purposes and associated methodological questions. Purposes for linking health outcomes include the harmonization of data across studies or settings (enabling increased power in hypothesis testing), the aggregation of summed score data by means of score crosswalk tables, and score conversion in clinical settings where new instruments are introduced, but an interpretable connection to historical data is needed. When two PRO instruments are linked, assumptions for equating are typically not met and the extent to which those assumptions are violated becomes a decision point around how (and whether) to proceed with linking. We demonstrate multiple linking procedures—equipercentile, unidimensional IRT calibration, and calibrated projection—with the Patient-Reported Outcomes Measurement Information System Depression bank and the Patient Health Questionnaire-9. We validate this link across two samples and simulate different instrument correlation levels to provide guidance around which linking method is preferred. Finally, we discuss some remaining issues and directions for psychometric research in linking PRO instruments.
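
As a rough illustration of the simplest of these procedures, the sketch below performs unsmoothed equipercentile linking of summed scores in Python. It is not the authors' code; the score vectors are simulated placeholders, and the IRT-based methods (unidimensional calibration and calibrated projection) are not shown.

```python
import numpy as np

def equipercentile_link(scores_from, scores_to, possible_scores_from):
    """Map each possible summed score on the source instrument to the target
    instrument score at the same percentile rank (unsmoothed sketch)."""
    scores_from = np.asarray(scores_from)
    scores_to = np.asarray(scores_to)
    crosswalk = {}
    for x in possible_scores_from:
        # Percentile rank of x in the source distribution (midpoint convention)
        p = (np.mean(scores_from < x) + np.mean(scores_from <= x)) / 2
        # Target score at the same percentile rank
        crosswalk[x] = float(np.quantile(scores_to, p))
    return crosswalk

# Simulated single-group data: each respondent has a summed score on both measures
rng = np.random.default_rng(0)
phq9_totals = rng.integers(0, 28, 500)      # PHQ-9 summed scores range 0-27
promis_totals = rng.integers(8, 41, 500)    # e.g., an 8-item PROMIS short form (8-40)
table = equipercentile_link(phq9_totals, promis_totals, range(0, 28))
```

In practice the linked scores would then be checked against observed scores in an independent sample, as the authors do across two samples.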

20 citations


Journal ArticleDOI
TL;DR: In this article, the Edinburgh Postnatal Depression Scale (EPDS) was linked to the Patient-Reported Outcomes Measurement Information System (PROMIS®) T-score metric using equipercentile and item response theory (IRT) methods.
Abstract: Depression is a leading mental health concern across the U.S. and worldwide. There are myriad assessments to evaluate depressive symptoms, including the Edinburgh Postnatal Depression Scale (EPDS), which is widely used to evaluate women's pre- and postnatal depression but used less often at other timepoints in adulthood, limiting its utility for longitudinal research. As part of the National Institutes of Health's (NIH) Environmental influences on Child Health Outcomes (ECHO) Research Program, the current study sought to develop a common metric so that scores on the EPDS can be converted to the standardized Patient-Reported Outcomes Measurement Information System (PROMIS®) T-score metric. Drawing on data from the ECHO-Prenatal Alcohol in SIDS and Stillbirth cohort, this study used a single-group linking design in which 1,263 mothers completed the EPDS and PROMIS-Depression measures at the same time. Score linking was conducted using equipercentile and item response theory (IRT) methods. Results showed that both linking methods provided robust, congruent results, and subgroup invariance held across age, race, ethnicity, education, and geographic location. The IRT-based unidimensional fixed-parameter calibration was selected for its model simplicity, and a crosswalk table was established to convert scores from the EPDS to PROMIS T-scores. Overall, this study provides a way to aggregate data across various depression measures and timepoints, so that researchers and clinicians can now directly compare and combine EPDS data with PROMIS and other depression measures already score-linked to PROMIS.
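
To show how a crosswalk table like the one established here is applied, the sketch below converts EPDS summed scores to PROMIS Depression T-scores through a simple lookup. The table values are illustrative placeholders, not the published crosswalk.

```python
# Hypothetical excerpt of an EPDS -> PROMIS Depression T-score crosswalk.
# Values are placeholders; the published table covers every EPDS score (0-30).
EPDS_TO_PROMIS_T = {
    0: 38.2, 1: 41.0, 2: 43.1, 3: 44.8,
    28: 79.5, 29: 81.0, 30: 82.6,
}

def epds_to_promis_t(epds_total: int) -> float:
    """Look up the PROMIS Depression T-score equivalent of an EPDS summed score."""
    if epds_total not in EPDS_TO_PROMIS_T:
        raise ValueError(f"EPDS total {epds_total} is outside this example table")
    return EPDS_TO_PROMIS_T[epds_total]

# On the T-score metric (mean 50, SD 10 in the PROMIS calibration sample),
# EPDS data can be pooled with other measures already linked to PROMIS.
print(epds_to_promis_t(29))
```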

10 citations


Journal ArticleDOI
TL;DR: The authors created the PROsetta package to fill this gap and to disseminate a protocol that has been established as standard practice for linking PROs and other health outcomes, addressing the practical challenges that incommensurate score metrics pose for comparing effects across studies and samples.
Abstract: A common problem when using a variety of patient-reported outcomes (PROs) for diverse populations and subgroups is establishing a harmonized scale for the incommensurate outcomes. The lack of comparability in metrics (e.g., raw summed scores vs. scaled scores) among different PROs poses practical challenges in studies comparing effects across studies and samples. Linking has long been used for practical benefit in educational testing. Applying various linking techniques to PRO data has a relatively short history; however, in recent years, there has been a surge of published studies on linking PROs and other health outcomes, owing in part to concerted efforts such as the Patient-Reported Outcomes Measurement Information System (PROMIS®) project and the PRO Rosetta Stone (PROsetta Stone®) project (www.prosettastone.org). Many R packages have been developed for linking in educational settings; however, they are not tailored for linking PROs where harmonization of data across clinical studies or settings serves as the main objective. We created the PROsetta package to fill this gap and disseminate a protocol that has been established as a standard practice for linking PROs.
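
A minimal sketch of the harmonization use case the package targets, written here in Python rather than R and using made-up column names and placeholder crosswalk values: two studies that administered different depression measures are converted to the shared T-score metric and then pooled.

```python
import pandas as pd

# Placeholder crosswalk excerpts (illustrative values only)
PHQ9_TO_T = {0: 37.4, 5: 53.9, 10: 60.5, 15: 65.8, 20: 70.4, 27: 81.3}
EPDS_TO_T = {0: 38.2, 5: 52.1, 10: 59.0, 15: 64.4, 20: 69.7, 30: 82.6}

# Two studies with incommensurate raw metrics
study_a = pd.DataFrame({"study": "A", "phq9_total": [0, 5, 10, 20]})
study_b = pd.DataFrame({"study": "B", "epds_total": [5, 15, 30, 0]})

# Convert each to the common T-score metric, then pool for joint analysis
study_a["depression_t"] = study_a["phq9_total"].map(PHQ9_TO_T)
study_b["depression_t"] = study_b["epds_total"].map(EPDS_TO_T)
pooled = pd.concat([study_a, study_b], ignore_index=True)[["study", "depression_t"]]
print(pooled)
```

The PROsetta package itself implements the linking analyses that produce such crosswalk tables; the sketch above only illustrates the downstream pooling that harmonized scores make possible.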

9 citations


Journal ArticleDOI
TL;DR: In this article, equipercentile linking methods, using both log-linear smoothed and unsmoothed score distributions, were used to establish a common metric across the two measures.
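
For readers unfamiliar with the smoothing step, the sketch below fits a log-linear (polynomial Poisson) model to an observed summed-score frequency distribution, the presmoothing that precedes the equipercentile computation. The counts are hypothetical and this is a generic illustration, not the authors' code.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical observed frequencies for summed scores 0..20
scores = np.arange(21)
counts = np.array([2, 4, 9, 15, 22, 30, 41, 50, 55, 57,
                   54, 48, 40, 31, 22, 15, 9, 6, 3, 2, 1])

# Log-linear presmoothing: model log(expected count) as a cubic polynomial in
# the score, which preserves the low-order moments of the observed distribution.
X = np.column_stack([np.ones_like(scores), scores, scores**2, scores**3])
smoothed = sm.GLM(counts, X, family=sm.families.Poisson()).fit().fittedvalues

# Percentile ranks from the smoothed distribution feed the equipercentile step
probs = smoothed / smoothed.sum()
percentile_ranks = np.cumsum(probs) - probs / 2
```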

8 citations


Journal ArticleDOI
TL;DR: In this paper, the effect of choosing US or country-specific item parameters on PROMIS CAT T-scores was analyzed by comparing T-scores computed with (1) US item parameters, (2) Dutch item parameters, and (3) US item parameters for DIF-free items and Dutch item parameters, rescaled to the US metric, for all other items.
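
For context on what changes when a different calibration is plugged in, the sketch below computes a PROMIS-style T-score (T = 50 + 10 * theta) from an EAP theta estimate under the graded response model, given whichever item-parameter set (US, Dutch, or hybrid) is being evaluated. The item parameters and responses are hypothetical.

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Graded response model: P(X = k | theta) for one item, where a is the
    discrimination and b the ordered category thresholds."""
    p_star = 1 / (1 + np.exp(-a * (theta - np.asarray(b))))   # P(X >= k), k = 1..m
    upper = np.concatenate(([1.0], p_star))
    lower = np.concatenate((p_star, [0.0]))
    return upper - lower

def eap_t_score(responses, item_params, grid=np.linspace(-4, 4, 81)):
    """EAP theta under a standard normal prior, converted to the T metric
    (mean 50, SD 10). item_params is a list of (a, b) tuples taken from the
    calibration being compared (US, Dutch, or the hybrid set)."""
    prior = np.exp(-grid**2 / 2)
    likelihood = np.ones_like(grid)
    for x, (a, b) in zip(responses, item_params):
        likelihood *= np.array([grm_category_probs(t, a, b)[x] for t in grid])
    posterior = prior * likelihood
    theta_eap = np.sum(grid * posterior) / np.sum(posterior)
    return 50 + 10 * theta_eap

# Hypothetical 3-item calibration with 5 response categories (scored 0-4)
us_like_params = [(2.1, [-2.0, -1.0, 0.2, 1.3]),
                  (1.7, [-1.5, -0.4, 0.6, 1.8]),
                  (2.5, [-2.2, -1.1, 0.0, 1.1])]
print(round(eap_t_score([3, 2, 4], us_like_params), 1))
```

Running the same responses through two parameter sets and differencing the resulting T-scores is, in essence, the comparison the paper reports at scale.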

7 citations


Journal ArticleDOI
TL;DR: In this article, a crosswalk table between the KOOS Function in Activities of Daily Living (ADL) subscale and the PROMIS PF 10a was created so that scores can be converted between the two measures.
Abstract: Background: An increased focus on patient-reported outcome measures (PROMs) has led to a proliferation of these measures in orthopaedic surgery. Mandating a single PROM in clinical and research orthopaedics is not feasible given the breadth of data already collected with older measures and the emergence of psychometrically superior measures. Creating crosswalk tables between measures allows providers to maintain control of measure choice and permits them to compare scores collected with older outcome measures against newly collected ones. Given the widespread use of the newer Patient-Reported Outcomes Measurement Information System Physical Function (PROMIS PF) and the established Knee injury and Osteoarthritis Outcome Score (KOOS), it would be clinically useful to link these two measures.
Question/purpose: Can the KOOS Function in Activities of Daily Living (ADL) subscale be robustly linked to the PROMIS PF to create a crosswalk table of equivalent scores that accurately reflects a patient's reported physical function level on both scales?
Methods: We sought to establish a common standardized metric for responses to the PROMIS PF and the KOOS ADL and to develop equations for converting a PROMIS PF score to a KOOS-ADL score and vice versa. To do this, we performed a retrospective, observational study at two academic medical centers and two community hospitals in an urban and suburban healthcare system. Patients 18 years and older who underwent total knee arthroplasty (TKA) were identified. Between January 2017 and July 2020, we treated 8165 patients with a TKA, 93% of whom had a diagnosis of primary osteoarthritis. Of those, we considered patients who had completed a full KOOS and the PROMIS PF 10a on the same date as potentially eligible. Twenty-one percent (1708 of 8165) of patients were excluded because no PROMs were collected at any point, and another 67% (5454 of 8165) were excluded because they completed only one of the required PROMs, leaving 12% (1003 of 8165) for analysis. PROMs were collected each time patients visited the health system before and after their TKAs. Physical function was measured with the PROMIS PF version 1.0 SF 10a and the KOOS ADL scale. The crosswalk of equivalent scores between the measures was created using the equipercentile linking method with both unsmoothed and log-linear smoothed score distributions.
Results: Crosswalks were created, and validation analyses supported their adequacy; we also created tables that allow clinicians and clinician-scientists to convert individual patients' scores easily. The mean difference between observed PROMIS PF scores and scores converted from KOOS-ADL scores via the crosswalk was -0.08 ± 4.82. A sensitivity analysis confirmed that the crosswalks effectively link the scores of the two measures for patients both before and after surgery.
Conclusion: The PROMIS PF 10a can be robustly linked to the KOOS ADL measure. The developed crosswalk table can be used to convert KOOS ADL scores to PROMIS PF scores and vice versa.
Clinical relevance: The crosswalk, or concordance table, between the KOOS Function in ADL subscale and the PROMIS PF allows clinicians and researchers to easily convert scores between the measures, permitting greater choice in PROM selection while preserving comparability between patient cohorts and with PROM data collected using older outcome measures; it also facilitates comparison when pooling data for meta-analyses.
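
The reported validation statistic (a mean difference of -0.08 ± 4.82 between observed and crosswalk-converted scores) can be reproduced conceptually with a check like the sketch below; the crosswalk values and paired scores here are placeholders, not the published table.

```python
import numpy as np

# Placeholder KOOS-ADL -> PROMIS PF T-score crosswalk (KOOS subscales are scored 0-100)
koos_adl_to_promis_t = {k: 20 + 0.4 * k for k in range(0, 101)}

def crosswalk_check(koos_scores, observed_promis_t, crosswalk):
    """Compare observed PROMIS PF T-scores with scores converted from KOOS-ADL
    via the crosswalk; return the mean and SD of the differences."""
    converted = np.array([crosswalk[k] for k in koos_scores])
    diff = np.asarray(observed_promis_t, dtype=float) - converted
    return diff.mean(), diff.std(ddof=1)

# Hypothetical paired scores from patients who completed both measures on the same date
koos = [45, 60, 72, 88, 95]
promis_t = [38.5, 43.9, 49.2, 55.0, 58.8]
mean_diff, sd_diff = crosswalk_check(koos, promis_t, koos_adl_to_promis_t)
print(round(mean_diff, 2), round(sd_diff, 2))
```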

4 citations


Journal ArticleDOI
TL;DR: The authors did not find any significant recall-period effect on PF8c responses and recommend using the standard PROMIS physical function format, which specifies no recall period.

2 citations


Journal ArticleDOI
TL;DR: The authors fit a nominal response item response theory model to responses to the Health Care Engagement Measure (HEM) and found that higher response category distinctions, such as responding 3 (very true) versus 2 (mostly true), were considerably more discriminating than lower response category distinctions.
Abstract: As part of a scale development project, we fit a nominal response item response theory model to responses to the Health Care Engagement Measure (HEM). When the original 5-point response format was used, categories were not ordered as intended for six of the 23 items. For the remaining items, the boundary between Categories 0 (not at all true) and 1 (a little bit true) was only weakly discriminating, suggesting an uninformative category distinction. When the lowest two categories were collapsed, psychometric properties improved greatly. Category boundary discriminations within items, however, varied significantly. Specifically, higher response category distinctions, such as responding 3 (very true) versus 2 (mostly true), were considerably more discriminating than lower response category distinctions. Implications for HEM scoring and for improving measurement precision at lower levels of the construct are presented, as is the unique role of the nominal response model in category analysis.
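
For readers unfamiliar with the model behind this category analysis, the sketch below computes category probabilities under Bock's nominal response model for a single 5-category item; the parameter values are hypothetical, not the HEM calibration.

```python
import numpy as np

def nominal_response_probs(theta, a, c):
    """Bock's nominal response model for one item:
    P(X = k | theta) = exp(a_k * theta + c_k) / sum_j exp(a_j * theta + c_j).
    The slopes a_k are not constrained to be ordered, so their estimated order
    (and spacing) reveals how well adjacent categories are distinguished."""
    z = np.asarray(a) * theta + np.asarray(c)
    ez = np.exp(z - z.max())          # subtract the max for numerical stability
    return ez / ez.sum()

# Hypothetical parameters for a 5-category item scored 0-4. Nearly equal slopes
# for categories 0 and 1 imply a weak boundary between them, the pattern that
# motivated collapsing the two lowest HEM categories.
a = [0.0, 0.1, 0.9, 1.8, 2.6]
c = [0.0, 0.3, 0.5, 0.2, -0.4]
for theta in (-2.0, 0.0, 2.0):
    print(theta, np.round(nominal_response_probs(theta, a, c), 3))
```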

2 citations


Journal ArticleDOI
TL;DR: In this paper, the authors conducted cognitive interviews followed by a nationwide mail survey of US Veterans Affairs (VA) healthcare users, collecting data on 49 candidate healthcare engagement items as well as measures of self-efficacy for managing symptoms, provider communication, and perceived access. Items were subjected to exploratory bifactor, statistical learning, and IRT analyses. Cognitive interviews were completed by 56 patients, and 9552 VA healthcare users with chronic conditions completed the mail survey; participants were mostly white and male, but with sizable minority participation.
Abstract: Healthcare engagement is a core measurement target for efforts to improve healthcare systems. This construct is broadly defined as the extent to which healthcare services represent collaborative partnerships with patients. Previous qualitative work operationalized healthcare engagement as generalized self-efficacy in four related subdomains: self-management, collaborative communication, health information use, and healthcare navigation. Building on this work, our objective was to establish a healthcare engagement instrument that is sufficiently unidimensional to yield a single score. We conducted cognitive interviews followed by a nationwide mail survey of US Veterans Affairs (VA) healthcare users. Data were collected on 49 candidate healthcare engagement items, as well as measures of self-efficacy for managing symptoms, provider communication, and perceived access. Items were subjected to exploratory bifactor, statistical learning, and IRT analyses. Cognitive interviews were completed by 56 patients, and 9552 VA healthcare users with chronic conditions completed the mail survey. Participants were mostly white and male, but with sizable minority participation. Psychometric analyses and content considerations reduced the item pool to 23 items, which demonstrated a strong general factor (OmegaH of .89). IRT analyses revealed a high level of reliability across the trait range and little DIF across groups. Most health information use items were removed during analyses, suggesting a more independent role for this domain. We provide quantitative evidence for a relatively unidimensional measure of healthcare engagement. Although developed with VA healthcare users, the measure is intended for general use. Future work includes short-form development and validation with other patient groups.
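
The strength of the general factor is summarized by coefficient omega hierarchical (the OmegaH of .89 above). As a worked illustration of how OmegaH is obtained from a standardized bifactor solution, the sketch below uses hypothetical loadings, not the HEM estimates.

```python
import numpy as np

def omega_hierarchical(general_loadings, group_loadings):
    """OmegaH for a standardized bifactor model: the share of total-score
    variance attributable to the general factor.
    general_loadings: length-n loadings on the general factor.
    group_loadings: n x s loadings on the s group factors (zeros elsewhere)."""
    g = np.asarray(general_loadings, dtype=float)
    G = np.asarray(group_loadings, dtype=float)
    uniqueness = 1 - (g**2 + (G**2).sum(axis=1))           # standardized items
    total_variance = g.sum()**2 + (G.sum(axis=0)**2).sum() + uniqueness.sum()
    return g.sum()**2 / total_variance

# Hypothetical loadings: 6 items, a general factor, and 2 group factors
general = [0.70, 0.65, 0.72, 0.68, 0.60, 0.66]
groups = [[0.30, 0.00], [0.25, 0.00], [0.35, 0.00],
          [0.00, 0.28], [0.00, 0.32], [0.00, 0.22]]
print(round(omega_hierarchical(general, groups), 2))
```

Values in the neighborhood of .80 or higher are often read as evidence that a single total score is defensible despite some multidimensionality.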

2 citations