
Showing papers on "Intraclass correlation" published in 2018


Journal ArticleDOI
TL;DR: The results encourage using SF-12v2® to assess health-related quality of life in the Medicaid population with combined physical and behavioral conditions or similar cohorts.
Abstract: Although Short Form (SF)-12v2® has been extensively studied and used as a valid measure of health-related quality of life in a variety of population groups, no systematic studies have described the reliability of the measure in patients with behavioral conditions or serious mental illness (SMI). We assessed the internal consistency, split-half reliability and annual test-retest correlations in a sample of 1587 participants with either a combination of physical and behavioral conditions or SMI. The Mosier’s alpha was 0.70 for the Physical Composite Scale (PCS) and 0.69 for the Mental Health Composite Scale (MCS), indicating good internal consistency. We observed strong correlations between physical functioning, physical role and body pain scales (r = 0.55–0.56), and between social functioning, emotional role, and mental health (r = 0.53–0.58). We calculated split-half reliabilities to be 0.74 for physical functioning, 0.75 for physical role, 0.73 for emotional role and 0.65 for mental health, respectively. We assessed the annual test-retest correlation using intraclass correlation (ICC) and found an ICC of 0.61 for PCS and 0.57 for MCS composite scores, adjusting for age, sex, race/ethnicity, and CRG. We found no decline in the correlations between baseline and the following study years until year 3. Our results encourage using SF-12v2® to assess health-related quality of life in the Medicaid population with combined physical and behavioral conditions or similar cohorts. The WIN study was registered with clinicaltrials.gov on April 22, 2015. Trial registration number: NCT02440906. Retrospectively registered.

133 citations


Journal ArticleDOI
TL;DR: The quality of data extracted from a clinical information system widely used for critical care quality improvement and research is measured to validate the use of data repositories to support reliable and efficient use of high quality secondary use data.

84 citations



Journal ArticleDOI
TL;DR: At both population and individual level, HGS and KES showed a low to moderate agreement independently of age and health status, and HGS alone should not be assumed a proxy for overall muscle strength.

76 citations


Journal ArticleDOI
TL;DR: The modified Arabic version of the IPAQ showed acceptable validity and reliability for the assessment of physical activity among Lebanese adults and more studies are necessary in the future to assess its validity compared to a gold-standard criterion measure.
Abstract: The International Physical Activity Questionnaire (IPAQ) is a validated tool for physical activity assessment used in many countries; however, no Arabic version of the long-form of this questionnaire exists to this date. Hence, the aim of this study was to cross-culturally adapt and validate an Arabic version of the long International Physical Activity Questionnaire (A-IPAQ) equivalent to the French version (F-IPAQ) in a Lebanese population. The guidelines for cross-cultural adaptation provided by the World Health Organization and the International Physical Activity Questionnaire committee were followed. One hundred fifty-nine students and staff members from Saint Joseph University of Beirut were randomly recruited to participate in the study. Items of the A-IPAQ were compared to those from the F-IPAQ for concurrent validity using Spearman’s correlation coefficient. Content validity of the questionnaire was assessed using factor analysis for the A-IPAQ’s items. The physical activity indicators derived from the A-IPAQ were compared with the body mass index (BMI) of the participants for construct validity. The instrument was also evaluated for internal consistency reliability using Cronbach’s alpha and Intraclass Correlation Coefficient (ICC). Finally, thirty-one participants were asked to complete the A-IPAQ on two occasions three weeks apart to examine its test–retest reliability. Bland-Altman analyses were performed to evaluate the extent of agreement between the two versions of the questionnaire and its repeated administrations. A high correlation was observed between answers of the F-IPAQ and those of the A-IPAQ, with Spearman’s correlation coefficients ranging from 0.91 to 1.00 (p < 0.05). Bland-Altman analysis showed a high level of agreement between the two versions with all values scattered around the mean for total physical activity (mean difference = 5.3 min/week, 95% limits of agreement = −145.2 to 155.8).
Negative correlations were observed between MET values and BMI, independent of age, gender or university campus. The A-IPAQ showed a high internal consistency reliability with Cronbach’s alpha ranging from 0.769–1.00 (p < 0.001) and intraclass correlation coefficient (ICC) ranging from 0.625–0.999 (p < 0.001), except for a moderate agreement with the moderate garden/yard activity (alpha = 0.682; ICC = 0.518; p < 0.001). The A-IPAQ had moderate-to-good test-retest reliability for most of its items (ICC ranging from 0.66–0.96; p < 0.001) and the Bland-Altman analysis showed a satisfactory agreement between the two administrations of the A-IPAQ for total physical activity (mean difference = 99.8 min/week, 95% limits of agreement = −1105.3; 1304.9) and total vigorous and moderate physical activity (mean difference = −29.7 min/week, 95% limits of agreement = −777.6; 718.2). The modified Arabic version of the IPAQ showed acceptable validity and reliability for the assessment of physical activity among Lebanese adults. More studies are necessary in the future to assess its validity compared to a gold-standard criterion measure.

67 citations


Journal ArticleDOI
TL;DR: When researchers plan to assess the test-retest reliability of patient-reported outcome measures for older people, they need to consider an adequate time interval of approximately 13 days and a sample size of about 5 times the number of items.

53 citations


Journal ArticleDOI
TL;DR: This novel application provides a tool to record patient adherence to care processes and PROs, with high agreement with traditional clinical audit, high usability, and patient satisfaction.
Abstract: While patient engagement and clinical audit are key components of successful enhanced recovery programs (ERPs), they require substantial resource allocation. The objective of this study was to assess the validity and usability of a novel mobile device application for education and self-reporting of adherence for patients undergoing bowel surgery within an established ERP. Prospectively recruited patients undergoing bowel surgery within an ERP used a novel app specifically designed to provide daily recovery milestones and record adherence to 15 different ERP processes and six patient-reported outcomes (PROs). Validity was measured by the agreement index (Cohen’s kappa coefficient for categorical, and intraclass correlation coefficient (ICC) for continuous variables) between patient-reported data through the app and data recorded by a clinical auditor. Acceptability and usability of the app were measured by the System Usability Scale (SUS). Forty-five patients participated in the study (mean age 61, 64% male). Overall, patients completed 159 of 179 (89%) available questionnaires through the app. Median time to complete a questionnaire was 2 min 49 s (i.q.r. 2′32″–4′36″). Substantial (kappa > 0.6) or almost perfect agreement (kappa > 0.8) and strong correlation (ICC > 0.7) between data collected through the app and by the clinical auditor was found for 14 ERP processes and four PROs. Patient-reported usability was high; mean SUS score was 87 (95% CI 83–91). Only 6 (13%) patients needed technical support to use the app. Forty (89%) patients found the app was helpful to achieve their daily goals, and 34 (76%) thought it increased their motivation to recover after surgery. This novel application provides a tool to record patient adherence to care processes and PROs, with high agreement with traditional clinical audit, high usability, and patient satisfaction.
Future studies should investigate the use of mobile device apps as strategies to increase adherence to perioperative interventions.
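
The agreement index for categorical items mentioned in this abstract, Cohen's kappa, corrects raw agreement for the agreement expected by chance. The following is a minimal sketch of the standard unweighted formula; the function name `cohen_kappa` and the adherence flags are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    # Proportion of items on which the two sources give the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the marginal label frequencies of each source.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical adherence flags (1 = adherent), app versus clinical auditor.
app_reports = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
auditor_reports = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(app_reports, auditor_reports)
print(f"kappa = {kappa:.2f}")
```

A kappa computed this way can be read against the thresholds quoted in the abstract (above 0.6 substantial, above 0.8 almost perfect).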

48 citations


Journal ArticleDOI
TL;DR: All the robotic indices except one provided by a novel robotic device for upper limb rehabilitation are reliable, sensitive and strongly correlated both with motor and disability clinical scales.
Abstract: In the last few years, there has been an increasing interest in the use of robotic devices to objectively quantify motor performance of patients after brain damage. Although these robot-derived measures can potentially add meaningful information about the patient’s dexterity, as well as be used as outcome measurements after the rehabilitation treatment, they need to be validated before being used in clinical practice. The present work aims to evaluate the reliability, the validity and the discriminant ability of the metrics provided by a novel robotic device for upper limb rehabilitation. Forty-eight patients with sub-acute stroke and 40 age-matched healthy subjects were involved in this study. Clinical evaluation included: Fugl-Meyer Assessment for the upper limb, Action Research Arm Test, and Barthel Index. Robotic evaluation of the upper limb performance consisted of 14 measures of motor ability quantifying the dexterity in performing planar reaching movements. Patients were evaluated twice, one day apart, to assess the reliability of the robotic metrics, using the Intraclass Correlation Coefficient. Validity was assessed by analyzing the correlation of the robotic metrics with the clinical scales, by means of the Spearman’s Correlation Coefficient. Finally, the ability of the robotic metrics to distinguish between patients with stroke and healthy subjects was investigated with t-tests and the Effect Size. Reliability was found to be excellent for 12 measures and from moderate to good for the remaining 2. Most of the robotic indices were strongly correlated with the clinical scales, while a few showed a moderate correlation and only one was not correlated with the Barthel Index and weakly correlated with the remaining two. Finally, all but one of the provided metrics were able to discriminate between the two groups, with large effect sizes for most of them.
We found that all the robotic indices except one provided by a novel robotic device for upper limb rehabilitation are reliable, sensitive and strongly correlated both with motor and disability clinical scales. Therefore, this device is suitable as an evaluation tool for the upper limb motor performance of patients with sub-acute stroke in clinical practice. NCT02879279.

47 citations


Journal ArticleDOI
TL;DR: The questionnaire showed moderate to good test-retest reliability and a moderate level of validity for assessing SB in youth, similar to or slightly better than values previously published for this population.

47 citations


Journal ArticleDOI
TL;DR: A minimum of 5 consecutive days of monitoring was required for reliably estimating physical activity (PA) and sedentary behaviour (SB) from accelerometer data in older adults.
Abstract: The purpose of the study was to examine the minimum number of monitoring days for reliably estimating physical activity (PA) and sedentary behaviour (SB) from accelerometer data in older adults. Forty-two older adults from a local senior centre participated in this study. Participants wore an ActiGraph wGT3X-BT on the right hip for 7 consecutive days. Accelerometer data were downloaded to a computer and converted to activity count data in 60s epochs. Time spent in SB and different PA intensity categories were estimated with commonly used activity count cut-points. Participants with at least 7 valid days of monitoring (≥10 h.day−1) were included in the analysis. Intraclass correlation coefficients (ICC) were calculated for determining single-day monitoring reliability. The Spearman-Brown prophecy formula was used to estimate the minimum number of monitoring days required for achieving an ICC of 0.80. Single-day ICC values for time spent in SB and PA intensity categories ranged from 0.45 to 0.61. Mi...
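
The Spearman-Brown step described in this abstract can be sketched as follows, assuming the standard prophecy formula R_k = k·r / (1 + (k − 1)·r) solved for k. The function names are hypothetical, and the only inputs taken from the abstract are the single-day ICC range (0.45 to 0.61) and the target of 0.80.

```python
import math


def spearman_brown(single_reliability, k):
    """Predicted reliability of the mean of k parallel measurements (e.g. days)."""
    return k * single_reliability / (1.0 + (k - 1.0) * single_reliability)


def units_needed(single_reliability, target):
    """Smallest whole number of measurements predicted to reach the target."""
    if not (0.0 < single_reliability < 1.0 and 0.0 < target < 1.0):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    k = target * (1.0 - single_reliability) / (single_reliability * (1.0 - target))
    return math.ceil(k)


# Single-day ICC bounds reported in the abstract, with a target ICC of 0.80.
for icc in (0.45, 0.61):
    days = units_needed(icc, 0.80)
    print(f"single-day ICC {icc:.2f}: {days} days "
          f"(predicted ICC {spearman_brown(icc, days):.2f})")
```

Under these assumptions, the lowest single-day ICC is the one that drives the minimum number of monitoring days.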

46 citations


Journal ArticleDOI
TL;DR: Reliability for each index was greatest in the terminal ileum and poorest in the rectum, and future studies should assess responsiveness to treatment in order to confirm their utility as evaluative indices in clinical trials and clinical practice.
Abstract: Background: Magnetic resonance enterography is increasingly utilized for assessment of luminal Crohn's disease activity. The Magnetic Resonance Index of Activity and the London Index are the most commonly used outcome measures in clinical trials. We assessed the reliability of these indices and several additional items. Methods: A consensus process clarified scoring conventions and identified additional items based on face validity. Four experienced radiologists evaluated 50 images in triplicate, in random order, at least 1 month apart, using a central image management system. Intra- and interrater reliability were assessed by calculating and comparing intraclass correlation coefficients. Results: Intrarater intraclass correlation coefficients (95% confidence intervals) for the Magnetic Resonance Index of Activity, London, and London "extended" indices and a visual analogue scale were 0.89 (0.84 to 0.91), 0.87 (0.83 to 0.90), 0.89 (0.85 to 0.92), and 0.86 (0.81 to 0.90). Corresponding interrater intraclass correlation coefficients were 0.71 (0.61 to 0.77), 0.67 (0.55 to 0.75), 0.70 (0.61 to 0.76), and 0.71 (0.62 to 0.77). Reliability for each index was greatest in the terminal ileum and poorest in the rectum. All 3 indices were highly correlated with the visual analogue scale; 0.79 (0.71 to 0.85), 0.78 (0.71 to 0.84), and 0.79 (0.72 to 0.85) for the Magnetic Resonance Index of Activity, London, and the London "extended" indices, respectively. Conclusions: "Substantial" interrater reliability was observed for all 3 indices. Future studies should assess responsiveness to treatment in order to confirm their utility as evaluative indices in clinical trials and clinical practice.
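
The abstract does not state which form of the intraclass correlation coefficient was used. As an illustration only, the sketch below computes two common single-rater forms from a complete subjects-by-raters table using two-way ANOVA mean squares: absolute agreement with raters treated as random, often written ICC(2,1), and consistency, often written ICC(3,1). The function names and the ratings are hypothetical and are not data from the study.

```python
def two_way_mean_squares(ratings):
    """Mean squares for subjects (rows), raters (columns) and residual error."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    return ss_rows / (n - 1), ss_cols / (k - 1), ss_error / ((n - 1) * (k - 1))


def icc_agreement(ratings):
    """ICC(2,1): single rater, absolute agreement, raters as random effects."""
    n, k = len(ratings), len(ratings[0])
    msr, msc, mse = two_way_mean_squares(ratings)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def icc_consistency(ratings):
    """ICC(3,1): single rater, consistency, systematic rater offsets ignored."""
    k = len(ratings[0])
    msr, _, mse = two_way_mean_squares(ratings)
    return (msr - mse) / (msr + (k - 1) * mse)


# Illustrative scores: 6 subjects (rows) rated by 4 raters (columns).
scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(f"ICC(2,1) = {icc_agreement(scores):.2f}")
print(f"ICC(3,1) = {icc_consistency(scores):.2f}")
```

The gap between the two values in this toy table comes from systematic differences between raters, which the agreement form penalises and the consistency form does not.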

Journal ArticleDOI
TL;DR: The NBSS demonstrated good validity and reliability in a large cohort of people with a SCI, and is a suitable tool to assess neurogenic bladder symptoms.
Abstract: Prospective cross-sectional study. Validate the Neurogenic Bladder Symptom Score (NBSS) for people with spinal cord injury (SCI). United States (recruitment from community/tertiary neurourology clinics). We used data from a prospective observational study of people with a SCI who enrolled during December 2015–September 2016. Participants completed the NBSS and other measurement tools (SF-12 and SCI-QOL Bladder Management Complications tool). Data were used to determine the internal consistency (Cronbach’s alpha), validity (hypothesis testing), and test–re-test reliability (using an intraclass correlation coefficient). 609 people with a SCI had complete data. The median NBSS total score was 22 (IQR 15–30), and median quality of life was “mixed”. The Cronbach’s alpha of the total and the incontinence, storage/voiding, and consequences domains was 0.85, 0.93, 0.76, and 0.49 respectively. All item to domain correlations were ≥0.3, aside from 3/7 of the items from the consequences domain. Appropriate correlations between the NBSS domains and external variables and other questionnaires were observed, such as a moderate correlation between the SCI-QOL Bladder Management complications tool and the NBSS total score. For the reliability assessment, 174 people had 3 month followup data and did not have a significant change to their urologic health. The intraclass correlation coefficients were >0.75 for all subdomains and the overall score. The NBSS demonstrated good validity and reliability in a large cohort of people with a SCI, and is a suitable tool to assess neurogenic bladder symptoms. Patient-Centered Outcomes Research Institute (PCORI) Award CER14092138.
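
Internal consistency in this abstract is reported as Cronbach's alpha. A minimal sketch of the usual formula, alpha = k/(k − 1) · (1 − Σ item variances / variance of total scores), is shown below; the function name and the response matrix are hypothetical and unrelated to the NBSS data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table of numeric scores."""
    n_items = len(responses[0])
    if len(responses) < 2 or n_items < 2:
        raise ValueError("need at least two respondents and two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical answers: 4 respondents (rows) on a 3-item domain (columns).
answers = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 4],
]
alpha = cronbach_alpha(answers)
print(f"Cronbach's alpha = {alpha:.2f}")
```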

Journal ArticleDOI
TL;DR: The main objective of the present study was to translate, adapt and assess a French-language version of the FJS-12 in total hip arthroplasty (THA) patients, finding that the SHO-12 is a valid, reproducible self-administered questionnaire, comparable to the English-language version.
Abstract: Introduction The ability to “forget” a joint implant in everyday life is considered to be the ultimate objective in arthroplasty. Recently, a scoring system, the Forgotten Joint Score (FJS-12), was published based on a self-administered questionnaire comprising 12 questions assessing how far patients had been able to forget their hip or knee prosthesis. The main objective of the present study was to translate, adapt and assess a French-language version of the FJS-12 in total hip arthroplasty (THA) patients. Patients and methods The questionnaire was translated by 2 orthopedic surgeons and a medical physician, all bilingual, then back-translated into English by two native English-speaking translators unacquainted with the original. A concertation meeting adopted a beta-version of this Score de Hanche Oubliee (SHO-12), which was then tested on 15 randomly selected THA patients and adapted according to their comments. The final version was validated following the international COSMIN methodology. Data collection was prospective, included all patients operated on by a single surgeon using a single technique. Reference questionnaires comprised Oxford Hip Score (OHS-12) and modified Harris Hip Score (HHS). The 3 assessments were conducted with a minimum 1 year's follow-up. The SHO-12 was administered twice, with a 1-week interval. Statistical tests assessed construct validity (Pearson correlation test), internal coherence (Cronbach alpha), reliability (intraclass correlation coefficient) and feasibility (percentage missing values, administration time and ceiling and floor effects). Results Translation/back-translation encountered no particular linguistic problems. Fifty-eight patients (63 THAs) responded to all questionnaires: 22 female, 36 male; mean age, 62.7 ± 15.2 years. Mean follow-up was 1.6 ± 0.4 years. SHO-12 correlated strongly with OHS-12 and HHS. Internal coherence was good (alpha = 0.96) and reproducibility excellent. 
No floor or ceiling effects were found. Conclusion SHO-12, the French-language version of the FJS-12 in THA, is a valid, reproducible self-administered questionnaire, comparable to the English-language version. Level of evidence I, Testing of previously developed diagnostic criteria on consecutive patients – Diagnostic study.

Journal ArticleDOI
TL;DR: Although child and parent reports may both contribute important information, parent report is a valid proxy for child self-reported pain intensity and HRQOL after discharge from inpatient pediatric surgery, which may prove important for better understanding pain experiences and intervention needs.

Journal ArticleDOI
TL;DR: It is demonstrated that using a correlation coefficient is not appropriate for assessing the interchangeability of 2 such measurement methods, and an alternative approach is described, the since widely applied graphical Bland–Altman Plot, which is based on a simple estimation of the mean and standard deviation of differences between measurements by the 2 methods.
Abstract: Correlation and agreement are 2 concepts that are widely applied in the medical literature and clinical practice to assess for the presence and strength of an association. However, because correlation and agreement are conceptually distinct, they require the use of different statistics. Agreement is a concept that is closely related to but fundamentally different from and often confused with correlation. The idea of agreement refers to the notion of reproducibility of clinical evaluations or biomedical measurements. The intraclass correlation coefficient is a commonly applied measure of agreement for continuous data. The intraclass correlation coefficient can be validly applied specifically to assess intrarater reliability and interrater reliability. As its name implies, the Lin concordance correlation coefficient is another measure of agreement or concordance. In undertaking a comparison of a new measurement technique with an established one, it is necessary to determine whether they agree sufficiently for the new to replace the old. Bland and Altman demonstrated that using a correlation coefficient is not appropriate for assessing the interchangeability of 2 such measurement methods. They in turn described an alternative approach, the since widely applied graphical Bland-Altman Plot, which is based on a simple estimation of the mean and standard deviation of differences between measurements by the 2 methods. In reading a medical journal article that includes the interpretation of diagnostic tests and application of diagnostic criteria, attention is conventionally focused on aspects like sensitivity, specificity, predictive values, and likelihood ratios. However, if the clinicians who interpret the test cannot agree on its interpretation and resulting typically dichotomous or binary diagnosis, the test results will be of little practical use. 
Such agreement between observers (interobserver agreement) about a dichotomous or binary variable is often reported as the kappa statistic. Assessing the interrater agreement between observers, in the case of ordinal variables and data, also has important biomedical applicability. Typically, this situation calls for use of the Cohen weighted kappa. Questionnaires, psychometric scales, and diagnostic tests are widespread and increasingly used by not only researchers but also clinicians in their daily practice. It is essential that these questionnaires, scales, and diagnostic tests have a high degree of agreement between observers. It is therefore vital that biomedical researchers and clinicians apply the appropriate statistical measures of agreement to assess the reproducibility and quality of these measurement instruments and decision-making processes.
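
The Bland-Altman approach summarised in this abstract rests on the mean and standard deviation of the paired differences. A minimal sketch is given below, assuming the conventional 95% limits of agreement (mean difference ± 1.96 standard deviations); the function name and the measurements are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)        # systematic difference between the methods
    spread = stdev(differences)     # sample SD of the paired differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical paired readings from a new method and an established one.
new_method = [10, 12, 14, 16]
old_method = [9, 13, 13, 17]
bias, lower, upper = bland_altman(new_method, old_method)
print(f"bias = {bias:.2f}, 95% limits of agreement = {lower:.2f} to {upper:.2f}")
```

In the plot itself, each difference is drawn against the mean of the two readings, with horizontal lines at the bias and at both limits.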

Journal ArticleDOI
TL;DR: The PD-scale has good reliability and validity for early screening of PD in critically ill children and can be validly and reliably used by nurses to this aim.
Abstract: Reports of increasing incidence rates of delirium in critically ill children are reason for concern. We evaluated the measurement properties of the pediatric delirium component (PD-scale) of the Sophia Observation Withdrawal Symptoms scale Pediatric Delirium scale (SOS-PD scale). In a multicenter prospective observational study in four Dutch pediatric ICUs (PICUs), patients aged ≥ 3 months and admitted for ≥ 48 h were assessed with the PD-scale thrice daily. Criterion validity was assessed: if the PD-scale score was ≥ 4, a child psychiatrist clinically assessed the presence or absence of PD according to the Diagnostic and statistical manual of mental disorders (DSM)-IV. In addition, the child psychiatrist assessed a randomly selected group to establish the false-negative rate. The construct validity was assessed by calculating the Pearson coefficient (rp) for correlation between the PD-scale and Cornell Assessment Pediatric Delirium (CAP-D) scores. Interrater reliability was determined by comparing paired nurse-researcher PD-scale assessments and calculating the intraclass correlation coefficient (ICC). Four hundred eighty-five patients with a median age of 27.0 months (IQR 8–102) were included, of whom 48 patients were diagnosed with delirium by the child psychiatrist. The PD-scale had overall sensitivity of 92.3% and specificity of 96.5% compared to the psychiatrist diagnosis for a cutoff score ≥4 points. The rp between the PD-scale and the CAP-D was 0.89 (CI 95%, 0.82–0.93; p < 0.001). The ICC of 75 paired nurse-researcher observations was 0.99 (95% CI, 0.98–0.99). The PD-scale has good reliability and validity for early screening of PD in critically ill children. It can be validly and reliably used by nurses to this aim.

Journal ArticleDOI
TL;DR: Vessel density measurements showed good repeatability and reproducibility by OCT-A in the peripapillary retina, the vessel density was positively related to RNFL thickness and negatively related to optic disc area and rim area.
Abstract: PURPOSE To evaluate the reliability of vessel density measurements in the peripapillary retina using optical coherence tomography angiography (OCT-A) and to analyze the correlation with retinal nerve fiber layer (RNFL) thickness in healthy subjects. METHOD Thirty-five healthy volunteers were recruited in the study. The optic disc region was scanned three times with spectral-domain OCT (SD-OCT) and split-spectrum amplitude decorrelation angiography by two skilled examiners. Vessel density of the peripapillary retina was automatically calculated by the software RTVue-XR (version 2015.1.1.98). The RNFL thickness on the optic nerve head was measured by SD-OCT. The coefficient of variation (CV), coefficient of repeatability, and intraclass correlation coefficients (ICC) were calculated for intraobserver repeatability. The Bland-Altman analysis was used to determine interobserver reproducibility. Correlations between peripapillary retinal vessel density and RNFL thickness were analyzed using a multivariable linear regression. RESULTS The mean age of the volunteers was 47.0 ± 29.7 years. The intraobserver repeatability in different sectors of the peripapillary retina was good with a high coefficient of repeatability, low CV (< 0.2%), and high ICC (0.847-0.952). The interobserver reproducibility was also good in different sectors, but should be interpreted with caution due to the difference bias caused by different observers in some quadrants. There was a significant positive correlation between vessel density and RNFL thickness; optic disc rim area and disc area were negatively related to vessel density (p = 0.008 and p = 0.001, respectively). CONCLUSION Vessel density measurements showed good repeatability and reproducibility by OCT-A in the peripapillary retina, the vessel density was positively related to RNFL thickness and negatively related to optic disc area and rim area.

Journal ArticleDOI
TL;DR: The Component Timed-Up-and-Go is a reliable and valid clinical tool for detailed assessment of prosthetic mobility in people with non-vascular lower limb amputation and demonstrated excellent test–retest reliability with ICCs ranging from .98 to .86 for total and component times.
Abstract: Objective: Using a custom mobile application to evaluate the reliability and validity of the Component Timed-Up-and-Go test to assess prosthetic mobility in people with lower limb amputation. Design: Cross-sectional design. Setting: National conference for people with limb loss. Subjects: A total of 118 people with non-vascular cause of lower limb amputation participated. Subjects had a mean age of 48 (±13.7) years and were an average of 10 years post amputation. Of them, 54% (n = 64) of subjects were male. Intervention: None. Main measure: The Component Timed-Up-and-Go was administered using a mobile iPad application, generating a total time to complete the test and five component times capturing each subtask (sit to stand transitions, linear gait, turning) of the standard timed-up-and-go test. The outcome underwent test–retest reliability using intraclass correlation coefficients (ICCs) and convergent validity analyses through correlation with self-report measures of balance and mobility. Results: The Component Time...

Journal ArticleDOI
TL;DR: The SCI-FCS-I was found to be reliable and a valid outcome measure for assessing manual wheelchair concerns about falling in the Italian population.
Abstract: Psychometrics study. The objective of this study was to develop an Italian version of the Spinal Cord Injury-Falls Concern Scale (SCI-FCS) and examine its reliability and validity. Multicenter study in spinal units in Northern and Southern Italy. The scale also was administered to non-hospitalized outpatient clinic patients. The original scale was translated from English to Italian using the “Translation and Cultural Adaptation of Patient-Reported Outcomes Measures” guidelines. The reliability and validity of the culturally adapted scale were assessed following the “Consensus-Based Standards for the Selection of Health Status Measurement Instruments” checklist. The SCI-FCS-I internal consistency, inter-rater, and intra-rater reliability were examined using Cronbach’s alpha coefficient and the intraclass correlation coefficient, respectively. Concurrent validity was evaluated using Pearson’s correlation coefficient with the Italian version of the short form of the Wheelchair Use Confidence Scale for Manual Wheelchair Users (WheelCon-M-I-short form). The Italian version of the SCI-FCS-I was administered to 124 participants from 1 June to 30 September 2017. The mean ± SD of the SCI-FCS-I score was 16.73 ± 5.88. All SCI-FCS items were either identical or similar in meaning to the original version’s items. Cronbach’s α was 0.827 (p < 0.01), the inter-rater reliability was 0.972 (p < 0.01), and the intra-rater reliability was 0.973 (p < 0.01). Pearson’s correlation coefficient of the SCI-FCS-I scores with the WheelCon-M-I-short form was 0.56 (p < 0.01). The SCI-FCS-I was found to be reliable and a valid outcome measure for assessing manual wheelchair concerns about falling in the Italian population.

Journal ArticleDOI
TL;DR: ASL-PWI performs well in differentiating PCNSL from GBM in both qualitative and quantitative analyses, and the visual scoring template demonstrated good diagnostic performance, similar to quantitative analysis.
Abstract: To evaluate the diagnostic performance of arterial spin labelling perfusion weighted images (ASL-PWIs) to differentiate primary CNS lymphoma (PCNSL) from glioblastoma (GBM). ASL-PWIs of pathologically confirmed PCNSL (n = 21) or GBM (n = 93) were analysed. For qualitative analysis, tumours were visually scored into five categories based on ASL-CBF maps. For quantitative analysis, normalised CBF values were derived by contralateral grey matter (GM) in intra- and peritumoral areas (nCBFintratumoral and nCBFperitumoral, respectively). Visual scoring scales and quantitative parameters from PCNSL and GBM were compared. In addition, the area under the receiver-operating characteristic (ROC) curve was used to determine the diagnostic accuracy of ASL-PWI for differentiating PCNSL from GBM. Weighted kappa or intraclass correlation coefficients (ICCs) were used to assess reliability between two observers. In qualitative analysis, scores 5 (CBFintratumoral>CBFGM, 68.8% [64/93]) and 4 (CBFintratumoral ≈ CBFGM, 47.6% [10/21]) were the most frequently reported scores for GBM and PCNSL, respectively. In quantitative analysis, both nCBFintratumoral and nCBFperitumoral in PCNSL were significantly lower than those in the GBM (nCBFintratumoral, 0.89 ± 0.59 [mean and SD] vs. 2.68 ± 1.89, p 0.8) in differentiating PCNSL from GBM. • The visual scoring template demonstrated good diagnostic performance, similar to quantitative analysis. • nCBFperitumoral demonstrated better diagnostic performance than nCBFintratumoral or visual scoring.

Journal ArticleDOI
TL;DR: The PsAID is a reliable, feasible and discriminative measure in patients with PsA and strong correlation of individual items with other PROMS represent an opportunity to reduce questionnaire burden for patients in studies and clinical practice.
Abstract: Objectives The Psoriatic Arthritis Impact of Disease (PsAID) Questionnaire is a recently developed patient-reported outcome measure (PROM) of disease impact in psoriatic arthritis (PsA). We set out to assess the validity in an independent cohort of patients, estimate the minimally important difference for improvement and explore the potential of individual components of the PsAID in clinical practice. Methods Data were collected prospectively for a single-centre cohort of patients with PsA. Construct validity was assessed by Spearman correlation with other PROMs and reliability by intraclass correlation coefficient (ICC) at 1 week. Sensitivity to change at 3 months was determined by the standardised response mean (SRM) in those patients with active disease requiring a change in treatment. Results A total of 129 patients (mean ±SD age 52.1±13.3, 57% women, disease duration 10.2±8 years) completed the baseline questionnaires and assessments. The mean baseline PsAID12 score was 3.92±2.26 with an ICC of 0.91 (95%CI 0.87 to 0.94). The SE of measurement was 0.51 and the minimal detectable change was 1.41. There was strong correlation (r≥0.70) with most of the PROMs studied and moderate correlation with clinical outcomes (r=0.40–0.57). The SRM of the PsAID12 was 0.74 (95%CI 0.45 to 0.97). There was strong correlation with individual PsAID items and their corresponding PROM questionnaires (r≥0.67). Conclusion The PsAID is a reliable, feasible and discriminative measure in patients with PsA. The good responsiveness of the PsAID and strong correlation of individual items with other PROMS represent an opportunity to reduce questionnaire burden for patients in studies and clinical practice.
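
The standard error of measurement (SEM) and minimal detectable change (MDC) quoted in this abstract are linked by a common convention, MDC95 = 1.96 · √2 · SEM, and the reported pair (0.51 and 1.41) is consistent with it. The sketch below uses hypothetical function names. The `sem_from_icc` helper shows the frequently used definition SEM = SD · √(1 − ICC), but the abstract does not say which standard deviation the authors used, so that helper is an assumption and is not expected to reproduce the reported 0.51.

```python
import math


def sem_from_icc(sd, icc):
    """Common definition: SEM = SD * sqrt(1 - ICC). The SD choice is an assumption."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at the 95% level: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


# SEM value as reported in the abstract.
reported_sem = 0.51
print(f"MDC95 from SEM {reported_sem}: {mdc95(reported_sem):.2f}")
```

The factor √2 reflects that a change score involves measurement error from two administrations of the questionnaire.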

Journal ArticleDOI
TL;DR: Force platform measures provided reliable information on balance function in healthy older adults; however, small learning effects were evident, particularly for the SOT, and a relationship between FP measures, which assess underlying balance mechanisms, and clinical balance and gait measures was not strongly supported.
Abstract: Background and purpose Postural control declines with aging and is an independent risk factor for falls in older adults. Objective examination of balance function is warranted to direct fall prevention strategies. Force platform (FP) systems provide quantitative measures of postural control and analysis of different aspects of balance. The purpose of this study was to examine the reliability and validity of FP measures in healthy older adults. Methods This study enrolled 46 healthy elderly adults, mean age 67.67 (5.1) years, who had no history of falls. They were assessed on 3 standardized tests on the NeuroCom Equitest FP system: limits of stability (LOS), motor control test (MCT), and sensory organization test (SOT). The test battery was administered twice within a 10-day period for test-retest reliability; intraclass correlation coefficients (ICCs), standard error of measurement (SEM), and minimal detectable change based on a 95% confidence interval (MDC95) were calculated. FP measures were compared with criterion clinical balance (Mini-BESTest and Functional Gait Assessment) and gait (10-m walk and 6-minute walk) measures to examine concurrent validity using Pearson correlation coefficients. Multiple linear regression analysis examined whether age and activity level were associated with FP performance. The α level was set at P Results SOT composite equilibrium scores, MCT average latency, and LOS end point excursion measures all demonstrated excellent test-retest reliability (ICC = 0.90, 0.85, and 0.77, respectively), whereas moderate to good reliability was found for SOT vestibular ratio score (ICC = 0.71). There was large variability in performance in this healthy elderly cohort, resulting in relatively large MDC95 for these measures, especially for the LOS test. Fair correlations were found between LOS end point excursion and clinical balance and gait measures (r = 0.31-0.49), and between MCT average latency and gait measures only (r =-0.32). 
No correlations were found between SOT measures and clinical balance and gait measures. Age was only marginally significantly (P = .055) associated with LOS end point excursion but was not associated with SOT or MCT measures, and activity level was not associated with any of the FP measures. Conclusion FP measures provided reliable information on balance function in healthy older adults; however, small learning effects were evident, particularly for the SOT. The SEM and MDC95 for the LOS and SOT measures were relatively large for this healthy elderly cohort. A relationship between FP measures, which assess underlying balance mechanisms, and clinical balance and gait measures was not strongly supported in this study. Further research is needed to justify the value of adding FP measures to a test battery for balance assessment in older adults without a history of falls.
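
The abstract notes small learning effects alongside high ICCs. Whether a systematic shift between sessions lowers the ICC depends on the ICC form. The sketch below, which assumes the Shrout and Fleiss single-measure forms ICC(2,1) (absolute agreement) and ICC(3,1) (consistency), shows the difference on hypothetical data. The abstract does not say which form the authors used, and the function name and data are illustrative.

```python
def icc_single(scores):
    """Return (ICC(2,1), ICC(3,1)) for a subjects x sessions table.

    Uses the two-way ANOVA mean squares (Shrout & Fleiss forms).
    Which form the study used is not stated; this is an assumption.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of sessions (or raters)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects, sessions, and residual error
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    icc_agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    icc_consistency = (msr - mse) / (msr + (k - 1) * mse)
    return icc_agreement, icc_consistency


# Hypothetical data: every subject scores exactly 1 point higher on retest.
retest_table = [[1, 2], [2, 3], [3, 4]]
agreement, consistency = icc_single(retest_table)
print(agreement, consistency)
```

In this toy table the consistency form is 1.0 because subjects keep their rank order and spacing, while the absolute-agreement form drops to about 0.67 because the uniform retest gain counts as disagreement.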

Journal ArticleDOI
TL;DR: The PASE-I showed positive results for reliability and validity and will be of great use to clinicians and researchers in evaluating and managing physical activities in the Italian older adult population.
Abstract: Objective. The aim of the study was to translate and culturally adapt the Physical Activity Scale for the Elderly into Italian (PASE-I) and to evaluate its psychometric properties in the healthy Italian older adult population. Methods. For translation and cultural adaptation, the “Translation and Cultural Adaptation of Patient-Reported Outcomes Measures” guidelines were followed. Participants included healthy individuals between 55 and 75 years old. The reliability and validity were assessed following the “Consensus-Based Standards for the Selection of Health Status Measurement Instruments” checklist. To evaluate internal consistency and test-retest reliability, Cronbach’s α and the Intraclass Correlation Coefficient (ICC) were calculated, respectively. The Berg Balance Score (BBS) and the PASE-I were administered together, and Pearson’s correlation coefficient was calculated for validity. Results. All the PASE-I items were identical or similar to the original version. The scale was administered twice within a week to 94 healthy Italian older people. The mean PASE-I score in this study was 159±77.88. Cronbach’s α was 0.815 (p < 0.01) and ICC was 0.977 (p < 0.01). The correlation with the BBS was 0.817 (p < 0.01). Conclusions. The PASE-I showed positive results for reliability and validity. This scale will be of great use to clinicians and researchers in evaluating and managing physical activities in the Italian older adult population.

Journal ArticleDOI
TL;DR: Psychometric evidence supports the NEI VFQ-25 as a reliable and valid cross-sectional measure of the impact of GA on patient visual function and vision-related quality of life.

Journal ArticleDOI
02 Jan 2018-PLOS ONE
TL;DR: All four liver fibrosis phantoms could be differentiated by quantitative elastography, by all platforms (p<0.001) and in the Bland-Altman analysis the differences in measurements were larger for the phantoms with higher Young’s modulus.
Abstract: This study aimed to assess and validate the repeatability and agreement of quantitative elastography of novel shear wave methods on four individual tissue-mimicking liver fibrosis phantoms with different known Young's modulus. We used GE Logiq E9 2D-SWE, Philips iU22 ARFI (pSWE), Samsung TS80A SWE (pSWE), Hitachi Ascendus (SWM) and Transient Elastography (TE). Two individual investigators performed all measurements non-continued and in parallel. The methods were evaluated for inter- and intraobserver variability by intraclass correlation, coefficient of variation and limits of agreement using the median elastography value. All systems used in this study provided high repeatability in quantitative measurements in a liver fibrosis phantom and excellent inter- and intraclass correlations. All four elastography platforms showed excellent intra- and interobserver agreement (interclass correlation 0.981-1.000 and intraclass correlation 0.987-1.000) and no significant difference in mean elasticity measurements for all systems, except for TE on phantom 4. All four liver fibrosis phantoms could be differentiated by quantitative elastography, by all platforms (p<0.001). In the Bland-Altman analysis the differences in measurements were larger for the phantoms with higher Young's modulus. All platforms had a coefficient of variation in the range 0.00-0.21 for all four phantoms, equivalent to low variance and high repeatability.

Journal ArticleDOI
TL;DR: According to the results, self-reported measurements of weight and height can be used cautiously as valid alternatives to determine weight status and the weighted kappa coefficient showed substantial agreement among the weight status categories.
Abstract: Self-reported measures have been used to obtain weight and height information in some epidemiological surveys. The validation of such information is necessary to guarantee data quality. This study assessed the validity of self-reported weight and height to determine weight status. Data were obtained in the Brazilian National Health Survey, a Brazilian household-based nationwide survey carried out in 2013. In this survey, 40,366 individuals (aged ≥ 18 years) provided self-reported and measured information about weight and height. Student’s paired t-test was used to verify the differences between self-reported and measured data. The agreement between measurements was obtained using the intraclass correlation coefficient (ICC) and Bland-Altman method. To evaluate variations in weight status categorizations, the weighted kappa coefficient and exact agreement were used. Sensitivity and specificity were estimated for the self-reported information to classify overweight and obese individuals. There was high agreement between self-reported and measured weight, height, and body mass index (ICC > 0.88). The mean agreements estimated by the Bland-Altman method were 99.6% for weight and 100.6% for height. The weighted kappa coefficient showed substantial agreement among the weight status categories (> 0.66); the exact agreement was 77%. Sensitivity and specificity for overweight (83% and 87.5%, respectively) and obesity (73.4% and 96.7%, respectively) were considered high for the sociodemographic characteristics evaluated. According to our results, self-reported measurements of weight and height can be used cautiously as valid alternatives to determine weight status.
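
As a rough illustration of the Bland-Altman method named in this abstract, the sketch below computes the mean difference (bias) and 95% limits of agreement on the difference scale. This is an assumption about form: the abstract reports mean agreement as percentages, which suggests the authors used a ratio-based variant. The data and the function name are hypothetical.

```python
import statistics


def bland_altman(self_reported, measured):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Difference-based Bland-Altman analysis: bias is the mean of the paired
    differences, limits of agreement are bias +/- 1.96 * SD of differences.
    """
    diffs = [s - m for s, m in zip(self_reported, measured)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical weights in kg, not data from the survey.
bias, lower, upper = bland_altman([70, 82, 65, 90], [71, 80, 66, 92])
print(bias, lower, upper)
```

A negative bias here would mean self-reported values run lower than measured ones on average; the limits show how far an individual pair may plausibly differ.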

Journal ArticleDOI
TL;DR: The navicular drop test appears to be a reproducible, valid, and simple test for evaluating medial longitudinal arch height, having fewer disadvantages than using footprint parameters.

Journal ArticleDOI
TL;DR: The RGI‐C is valid and reliable for detecting clinically important changes in skeletal manifestations of severe HPP in newborns, infants, and children, including during asfotase alfa treatment.
Abstract: Hypophosphatasia (HPP) is the heritable metabolic disease characterized by impaired skeletal mineralization due to low activity of the tissue-nonspecific isoenzyme of alkaline phosphatase. Although HPP during growth often manifests with distinctive radiographic skeletal features, no validated method was available to quantify them, including changes over time. We created the Radiographic Global Impression of Change (RGI-C) scale to assess changes in the skeletal burden of pediatric HPP. Site-specific pairs of radiographs of newborns, infants, and children with HPP from three clinical studies of asfotase alfa, an enzyme replacement therapy for HPP, were obtained at baseline and during treatment. Each pair was scored by three pediatric radiologists ("raters"), with nine raters across the three studies. Intrarater and interrater agreement was determined by weighted Kappa coefficients. Interrater reliability was assessed using intraclass correlation coefficients (ICCs) and by two-way random effects analysis of variance (ANOVA) and a mixed-model repeated measures ANOVA. Pearson correlation coefficients evaluated relationships of the RGI-C to the Rickets Severity Scale (RSS), Pediatric Outcomes Data Collection Instrument Global Function Parent Normative Score, Childhood Health Assessment Questionnaire Disability Index, 6-Minute Walk Test percent predicted, and Z-score for height in patients aged 6 to 12 years at baseline. Eighty-nine percent (8/9) of raters showed substantial or almost perfect intrarater agreement of sequential RGI-C scores (weighted Kappa coefficients, 0.72 to 0.93) and moderate or substantial interrater agreement (weighted Kappa coefficients, 0.53 to 0.71) in patients aged 0 to 12 years at baseline. Moderate-to-good interrater reliability was observed (ICC, 0.57 to 0.65). 
RGI-C scores were significantly (p ≤ 0.0065) correlated with the RSS and with measures of global function, disability, endurance, and growth in the patients aged 6 to 12 years at baseline. Thus, the RGI-C is valid and reliable for detecting clinically important changes in skeletal manifestations of severe HPP in newborns, infants, and children, including during asfotase alfa treatment. © 2018 The Authors. Journal of Bone and Mineral Research Published by Wiley Periodicals Inc.

Journal ArticleDOI
TL;DR: Few studies included nerve repair in their sample for the psychometric analysis of outcome measures, so moderate evidence could be confirmed and Rosén-Lundborg score had emerging evidence of reliability and validity as a comprehensive outcome following nerve repair.
Abstract: Outcome after nerve repair of the hand needs standardized psychometrically robust measures. We aimed to systematically review the psychometric properties of available functional, motor, and sensory assessment instruments after nerve repair. This systematic review of health measurement instruments searched databases from 1966 to 2017. Pairs of raters conducted data extraction and quality assessment using a structured tool for clinical measurement studies. Kappa correlation was used to define the agreement prior to consensus for individual items, and intraclass correlation coefficient (ICC) was used to assess reliability between raters. A narrative synthesis described quality and content of the evidence. Sixteen studies were included for final critical appraisal scores. Kappa ranged from 0.31 to 0.82 and ICC was 0.81. Motor domain had manual muscle testing with Kappa from 0.72 to 0.93 and a dynamometer ICC reliability between 0.92 and 0.98. Sensory domain had touch threshold Semmes-Weinstein monofilaments (SWM) as the most responsive measure while two-point discrimination (2PD) was the least responsive (effect size 1.2 and 0.1). A stereognosis test, Shape and Texture Identification (STI), had Kappa test-retest reliability of 0.79 and inter-rater reliability of 0.61, with excellent sensitivity and specificity. Manual tactile test had moderate to mild correlation with 2PD and SWM. Function domain presented Rosen-Lundborg score with Spearman correlations of 0.83 for total score. Patient-reported outcomes measurements had ICC of 0.85 and internal consistency from 0.88 to 0.96 with Patient-Rated Wrist and Hand Evaluation with higher score for reliability and Spearman correlation between 0.38 and 0.89 for validity. Few studies included nerve repair in their sample for the psychometric analysis of outcome measures, so moderate evidence could be confirmed. 
Manual muscle test and Rotterdam Intrinsic Hand Myometer dynamometer had excellent reliability but insufficient data on validity or responsiveness. Touch threshold testing was more responsive than 2PD test. The locognosia test and STI had limited but positive supporting data related to validity. Rosen-Lundborg score had emerging evidence of reliability and validity as a comprehensive outcome following nerve repair. Few questionnaires were considered reliable and valid to assess cold intolerance. There is no patient-reported outcome measurement following nerve repair that provides comprehensive assessment of symptoms and function by patient perspective.

Journal ArticleDOI
TL;DR: Investigating test-retest and inter-rater reliability of FPI-6 total and individual scores for the assessment of foot posture of adults and older adults suggests that the current version of FPI-6 can be a useful tool to assess foot posture for adults and should be further examined.
Abstract: Background: Previous studies have suggested that the Foot Posture Index (FPI-6) is valid and reliable to evaluate foot posture of adults and children. However, studies with adults had some important limitations. In addition, it is not clear if FPI-6 is reliable for older adults. Variations in foot structure, such as edema, bone callosity and bunions, are more frequent in older adults, which may compromise FPI-6 reliability for this population. Objectives: To investigate test-retest and inter-rater reliability of FPI-6 total and individual scores for the assessment of foot posture of adults and older adults. Methods: Twenty-one adults and 19 older adults participated in this study. The examiners performed FPI-6 on two days of data collection. We used Cohen Weighted Kappa and Intraclass Correlation Coefficient for categorical and continuous variables, respectively. Results: For adults, FPI-6 scores demonstrated test-retest reliability varying from fair to substantial and inter-rater reliability varying from fair to almost perfect. For older adults, FPI-6 scores demonstrated test-retest reliability varying from not reliable to moderate and inter-rater reliability varying from fair to almost perfect. The examiners demonstrated more than 80% of agreement in all FPI-6 scores for adults and older adults. Conclusions: The relatively low reliability in light of this high level of agreement suggests that the current version of FPI-6 can be a useful tool to assess foot posture for adults and should be further examined. On the other hand, FPI-6 should be cautiously used for older adults.
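
The combination of high percentage agreement and low kappa described in this abstract can arise when almost all ratings fall into one category, so that agreement expected by chance is already high. The sketch below illustrates this with hypothetical ratings and a linearly weighted Cohen's kappa. The weighting scheme, the data, and the function names are assumptions; the abstract does not specify them.

```python
def percent_agreement(a, b):
    """Fraction of subjects given the identical category by both raters."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def linear_weighted_kappa(a, b, n_categories):
    """Cohen's kappa with linear disagreement weights |i - j| / (K - 1).

    Categories must be integers 0..n_categories-1. Undefined (division by
    zero) if expected disagreement is zero, e.g. all ratings identical.
    """
    n = len(a)
    k = n_categories
    # Observed joint proportions of (rater A category, rater B category)
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x][y] += 1.0 / n
    p_a = [sum(obs[i]) for i in range(k)]
    p_b = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        return abs(i - j) / (k - 1)

    d_obs = sum(weight(i, j) * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(weight(i, j) * p_a[i] * p_b[j] for i in range(k) for j in range(k))
    return 1.0 - d_obs / d_exp


# Hypothetical ordinal ratings (categories 0, 1, 2) for ten feet.
rater_a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
rater_b = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]
agreement = percent_agreement(rater_a, rater_b)
kappa = linear_weighted_kappa(rater_a, rater_b, n_categories=3)
print(agreement, kappa)
```

In this toy case the raters agree on 80% of feet, yet kappa is slightly negative (about -0.11), because with nine of ten ratings in one category the agreement expected by chance alone exceeds what was observed.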