To describe the development and calibration of the banks and scales of the Quality of Life in Neurological Disorders (Neuro-QOL) project, commissioned by the National Institute of Neurological Disorders and Stroke to develop a bilingual (English/Spanish), clinically relevant, and psychometrically robust health-related quality-of-life (HRQOL) assessment tool.
Classic and modern test construction methods were used, including input from essential stakeholder groups.
An online patient panel testing service and 11 academic medical centers and clinics from across the United States and Puerto Rico that treat major neurologic disorders.
Adult and pediatric patients representing different neurologic disorders specified in this study, proxy respondents for select conditions (stroke, pediatric conditions), and English- and Spanish-speaking participants from the general population.
Not applicable.
Multiple generic and condition-specific measures used to provide construct validity evidence for the new Neuro-QOL tool.
Neuro-QOL has developed 14 generic item banks and 8 targeted scales to assess HRQOL in 5 adult (stroke, multiple sclerosis, Parkinson's disease, epilepsy, amyotrophic lateral sclerosis) and 2 pediatric conditions (epilepsy, muscular dystrophies).
The Neuro-QOL system will continue to evolve, with validation efforts in clinical populations and new bank development in health domains not presently included. The potential for Neuro-QOL measures in rehabilitation research and clinical settings is discussed.
To develop a spinal cord injury (SCI)-specific patient-reported outcome (PRO) measure of health-related quality of life (QOL) covering multiple domains of functioning, including physical, emotional, and social health.
Focus groups.
Four SCI Model Systems rehabilitation hospitals.
Individuals with SCI (n=65) and clinicians (n=42).
Not applicable.
Spinal Cord Injury Quality of Life Measurement System (SCI-QOL).
Qualitative analysis yielded 3 domains of primary importance: physical-medical health, emotional health, and social participation. Results were used to guide domain and item decisions in the development of the SCI-QOL PRO measurement system. Qualitative data were used to develop item pools with item content specific to individuals with SCI across a wide spectrum of functioning. When possible, items from other major measurement initiatives were included verbatim in the item pools to link the measurement systems and facilitate cross-study and cross-population comparisons.
Issues that affect individuals' QOL after SCI are varied and several issues are unique to individuals who have had a traumatic injury. From these qualitative data, 3 major domains and 18 subdomains of functioning were identified. Item pools were developed in each of these 18 areas to measure functioning related to physical-medical issues, emotional status, and social participation.
Purpose We examined symptom variability in men and women with urological chronic pelvic pain syndrome. We describe symptom fluctuations as related to early symptom regression and its effect on estimated 1-year symptom change. We also describe a method to quantify patient specific symptom variability. Materials and Methods Symptoms were assessed biweekly in 424 subjects with urological chronic pelvic pain syndrome during 1 year. To evaluate the impact of early symptom regression subjects were classified as improved, no change or worse according to the rate of change using 1) all data, 2) excluding week 0 and 3) excluding weeks 0 and 2. Patient specific, time varying variability was calculated at each interval using a sliding window approach. Patients were classified as high, medium or low variability at each time and ultimately as high or low variability overall based on the variability for the majority of contacts. Results Prior to excluding early weeks to adjust for early symptom regression 25% to 38% and 5% to 6% of patients were classified as improved and worse, respectively. After adjustment the percent of patients who were improved or worse ranged from 15% to 25% and 6% to 9%, respectively. High and low variability phenotypes were each identified in 25% to 30% of participants. Conclusions Patients with urological chronic pelvic pain syndrome show symptom variability. At study enrollment patients had worse symptoms on average, resulting in a regression effect that influenced the estimated proportion of those who were improved or worse. Prospective studies should include a run-in period to account for regression to the mean and other causes of early symptom regression. Further, symptom variability may be quantified and used to characterize longitudinal symptom profiles of urological chronic pelvic pain syndrome.
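The sliding window approach described in this abstract can be illustrated with a minimal sketch. The abstract does not state the window length, the variability statistic, or the cut points, so the 3-contact window, the population standard deviation, and the `low_cut`/`high_cut` thresholds below are assumptions chosen only for illustration, and the function names are hypothetical.

```python
from statistics import pstdev


def sliding_variability(scores, window=3):
    """Standard deviation of each run of `window` consecutive symptom scores.

    The window length and the use of the standard deviation are assumptions;
    the abstract only says a sliding window approach was used.
    """
    if window < 2 or window > len(scores):
        raise ValueError("window must be between 2 and len(scores)")
    return [pstdev(scores[i:i + window]) for i in range(len(scores) - window + 1)]


def classify_overall(variabilities, low_cut, high_cut):
    """Label each window low/medium/high, then assign an overall label.

    A patient is 'high' (or 'low') overall only if that label applies to a
    majority of windows; otherwise no overall phenotype is assigned.
    The cut points are hypothetical inputs.
    """
    labels = [
        "low" if v < low_cut else "high" if v > high_cut else "medium"
        for v in variabilities
    ]
    for label in ("high", "low"):
        if labels.count(label) > len(labels) / 2:
            return label
    return "unclassified"


# Hypothetical biweekly symptom scores for one patient.
example_scores = [10, 10, 10, 20, 10, 20]
example_variability = sliding_variability(example_scores, window=3)
example_label = classify_overall(example_variability, low_cut=1.0, high_cut=4.0)
```

In a real analysis the cut points would typically be derived from the cohort (for example, from the distribution of window-level variability), which this sketch leaves to the caller.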
To examine agreement between patient and proxy responses on the Quality of Life in Neurological Disorders (Neuro-QoL) instruments after stroke.
Cross-sectional observational substudy of the longitudinal, multisite, multicondition Neuro-QoL validation study.
In-person, interview-guided, patient-reported outcomes.
Convenience sample of dyads (N=86) of community-dwelling persons with stroke and their proxy respondents.
Not applicable.
Dyads concurrently completed short forms of 8 or 9 items for the 13 Neuro-QoL adult domains using the patient-proxy perspective. Agreement was examined at the scale-level with difference scores, intraclass correlation coefficients (ICCs), effect size statistics, and Bland-Altman plots, and at the item-level with kappa coefficients.
We found no mean differences between patients and proxies on the Applied Cognition-General Concerns, Depression, Satisfaction With Social Roles and Activities, Stigma, and Upper Extremity Function (Fine Motor, activities of daily living) short forms. Patients rated themselves more favorably on the Applied Cognition-Executive Function, Ability to Participate in Social Roles and Activities, Lower Extremity Function (Mobility), Positive Affect and Well-Being, Anxiety, Emotional and Behavioral Dyscontrol, and Fatigue short forms. The largest mean patient-proxy difference observed was 3 T-score points on the Lower Extremity Function (Mobility). ICCs ranged from .34 to .59. However, limits of agreement showed dyad differences exceeding ±20 T-score points, and item-level agreement ranged from not significant to weighted kappa=.34.
Proxy responses on Neuro-QoL short forms can complement responses of moderate- to high-functioning community-dwelling persons with stroke and augment group-level analyses, but do not substitute for individual patient ratings. Validation is needed for other stroke populations.
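The limits of agreement reported in this abstract come from Bland-Altman analysis, which can be sketched as the mean patient-proxy difference plus or minus 1.96 standard deviations of the differences. The T-scores below are hypothetical and the function name is an assumption; the study's own data and software are not described in the abstract.

```python
from statistics import mean, stdev


def bland_altman_limits(patient, proxy):
    """Return (bias, lower limit, upper limit) for paired scores.

    bias is the mean patient-minus-proxy difference; the limits are
    bias +/- 1.96 * sample SD of the differences.
    """
    if len(patient) != len(proxy) or len(patient) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    diffs = [p - q for p, q in zip(patient, proxy)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical T-scores for four patient-proxy dyads.
patient_scores = [50, 55, 60, 45]
proxy_scores = [48, 57, 55, 46]
bias, lower, upper = bland_altman_limits(patient_scores, proxy_scores)
```

A small mean difference can coexist with wide limits, which is why group-level agreement does not imply that a proxy rating can stand in for an individual patient's rating.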
Abstract Objective To determine clinically meaningful changes (CMCs) for the Functional Assessment of Cancer Therapy–Prostate (FACT–P). Methods We obtained data from a Phase III trial of atrasentan in metastatic hormone-refractory prostate cancer patients (n = 809). We determined anchor-based differences using Karnofsky Performance Status (KPS), bone alkaline phosphatase (BAP), hemoglobin, time to disease progression (TTP), adverse events (AE), and survival. One-third and one-half standard deviation and standard error of measurement (SEM) were used as distribution-based criteria for CMCs. Comparisons across baseline FACT–P domains and derived scales (FACT–P total score, Trial Outcome Index (TOI) score, prostate cancer subscale (PCS) score, pain-related score, and FACT Advanced Prostate Symptom Index (FAPSI)) were conducted for KPS, BAP, and hemoglobin using Student's t tests. Twelve-week change scores were compared for TTP, AE, and survival using ANCOVA. Results CMCs were estimated as 6 to 10 for FACT–P total score, 5 to 9 for FACT–P TOI score, 2 to 3 for FACT–P PCS, 1 to 2 for the 4 PCS pain-related questions, and 2 to 3 for FAPSI. CMCs were also estimated using distribution-based criteria. Kappa statistics were computed to determine the degree of correspondence between the recommended guideline of 1.0 SEM and empirically derived standards. Most of the kappas for health-related quality of life domains and SEM standards had “substantial” to “almost perfect” concordance. Conclusions The significant relationship between clinical and quality of life data provides support for the use of CMCs to increase interpretability of FACT–P scores.
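The distribution-based criteria named in this abstract (one-third SD, one-half SD, and 1 SEM) follow from standard formulas, sketched below. The SEM is computed as SD × sqrt(1 − reliability); the baseline SD of 18 and the reliability of 0.91 are hypothetical inputs, not values reported in the abstract.

```python
from math import sqrt


def distribution_based_thresholds(sd, reliability):
    """Return the one-third SD, one-half SD, and 1 SEM change thresholds.

    sd          -- baseline standard deviation of the scale score
    reliability -- reliability coefficient of the scale, between 0 and 1
    """
    if sd < 0 or not 0 <= reliability <= 1:
        raise ValueError("sd must be non-negative and reliability within [0, 1]")
    return {
        "third_sd": sd / 3,
        "half_sd": sd / 2,
        "sem": sd * sqrt(1 - reliability),
    }


# Hypothetical inputs for illustration only.
thresholds = distribution_based_thresholds(sd=18.0, reliability=0.91)
```

With these assumed inputs the three criteria land at roughly 6, 9, and 5.4 points, showing how the criteria bracket a range rather than yielding a single value.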
To illustrate how measurement practices can be advanced by using as an example the fatigue item bank (FIB) and its applications (short forms and computerized adaptive testing [CAT]) that were developed through the National Institutes of Health Patient Reported Outcomes Measurement Information System (PROMIS) Cooperative Group.
Psychometric analysis of data collected by an Internet survey company using item response theory-related techniques.
A U.S. general population representative sample collected through the Internet.
Respondents used for dimensionality evaluation of the PROMIS FIB (N=603) and item calibrations (N=14,931).
Not applicable.
Fatigue items (112) developed by the PROMIS fatigue domain working group, 13-item Functional Assessment of Chronic Illness Therapy-Fatigue, and 4-item Medical Outcomes Study 36-Item Short Form Health Survey Vitality scale.
The PROMIS FIB version 1, which consists of 95 items, showed acceptable psychometric properties. CAT showed consistently better precision than short forms. However, all 3 short forms showed good precision for most participants in that more than 95% of the sample could be measured precisely with reliability greater than 0.9.
Measurement practice can be advanced by using a psychometrically sound measurement tool and its applications. This example shows that CAT and short forms derived from the PROMIS FIB can reliably estimate fatigue reported by the U.S. general population. Evaluation in clinical populations is warranted before the item bank can be used for clinical trials.
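The reliability threshold of 0.9 cited in this abstract can be related to a standard error of measurement with the usual identity reliability = 1 − SE²/SD². The sketch below assumes scores are expressed on a T-score metric with a reference-population SD of 10, which is the convention PROMIS uses but is not stated in the abstract; the function names are hypothetical.

```python
from math import sqrt


def se_for_reliability(reliability, sd=10.0):
    """Standard error implied by a reliability, given the population SD."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be within [0, 1]")
    return sd * sqrt(1 - reliability)


def reliability_for_se(se, sd=10.0):
    """Reliability implied by a standard error, given the population SD."""
    return 1 - (se / sd) ** 2


# A reliability of 0.9 on an SD-10 metric corresponds to an SE of about 3.16.
se_at_090 = se_for_reliability(0.9)
```

Under this assumption, a respondent is "measured precisely" in the abstract's sense when the standard error of the score estimate is below roughly 3.2 T-score points.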
The articles in this supplement present recent advances in the measurement of patient-reported health-related quality-of-life (HRQOL) outcomes. Specifically, these articles highlight the combined efforts of the National Institutes of Health, National Institute of Neurological Disorders and Stroke, National Center on Medical Rehabilitation Research, National Institute on Disability and Rehabilitation Research, and Department of Veterans Affairs Rehabilitation Research and Development Service to improve HRQOL measurement. In addition, this supplement is intended to provide rehabilitation professionals with information about these efforts and the implications that these advances in outcomes measurement have for rehabilitation medicine and clinical practice. These new measurement scales use state-of-the-art methods, including item response theory and computerized adaptive testing. In addition, scale development involves both qualitative and quantitative methods, as well as the administration of items to hundreds or even thousands of research participants. The scales deliberately have been built with overlap of items between scales so that linkages and equivalency scores can be computed. Ultimately, these scales should facilitate direct comparison of outcomes instruments across studies and will serve as standard data elements across research trials without compromising the specificity of disease- or condition-targeted measures. This supplement includes the initial publications for many of these new measurement initiatives, each of which provides researchers and clinicians with better tools for evaluation of the efficacy of their interventions.
Abstract Background Patient-reported outcomes (PROs) are the consequences of disease and/or its treatment as reported by the patient. The importance of PRO measures in clinical trials for new drugs, biological agents, and devices was underscored by the release of the US Food and Drug Administration's draft guidance for industry titled “Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims.” The intent of the guidance was to describe how the FDA will evaluate the appropriateness and adequacy of PRO measures used as effectiveness end points in clinical trials. In response to the expressed need of ISPOR members for further clarification of several aspects of the draft guidance, ISPOR's Health Science Policy Council created three task forces, one of which was charged with addressing the implications of the draft guidance for the collection of PRO data using electronic data capture modes of administration (ePRO). The objective of this report is to present recommendations from ISPOR's ePRO Good Research Practices Task Force regarding the evidence necessary to support the comparability, or measurement equivalence, of ePROs to the paper-based PRO measures from which they were adapted. Methods The task force was composed of the leadership team of ISPOR's ePRO Working Group and members of another group (i.e., ePRO Consensus Development Working Group) that had already begun to develop recommendations regarding ePRO good research practices. The resulting task force membership reflected a broad array of backgrounds, perspectives, and expertise that enriched the development of this report. The prior work became the starting point for the Task Force report. A subset of the task force members became the writing team that prepared subsequent iterations of the report that were distributed to the full task force for review and feedback. In addition, review beyond the task force was sought and obtained. 
Along with a presentation and discussion period at an ISPOR meeting, a draft version of the full report was distributed to roughly 220 members of a reviewer group. The reviewer group comprised individuals who had responded to an emailed invitation to the full membership of ISPOR. This Task Force report reflects the extensive internal and external input received during the 16-month good research practices development process. Results/Recommendations An ePRO questionnaire that has been adapted from a paper-based questionnaire ought to produce data that are equivalent or superior (e.g., higher reliability) to the data produced from the original paper version. Measurement equivalence is a function of the comparability of the psychometric properties of the data obtained via the original and adapted administration mode. This comparability is driven by the amount of modification to the content and format of the original paper PRO questionnaire required during the migration process. The magnitude of a particular modification is defined with reference to its potential effect on the content, meaning, or interpretation of the measure's items and/or scales. Based on the magnitude of the modification, evidence for measurement equivalence can be generated through combinations of the following: cognitive debriefing/testing, usability testing, equivalence testing, or, if substantial modifications have been made, full psychometric testing. As long as only minor modifications were made to the measure during the migration process, a substantial body of existing evidence suggests that the psychometric properties of the original measure will still hold for the ePRO version. Hence, an evaluation limited to cognitive debriefing and usability testing only may be sufficient. 
However, where more substantive changes in the migration process have occurred, confirming that the adaptation to the ePRO format did not introduce significant response bias and that the two modes of administration produce essentially equivalent results is necessary. Recommendations regarding the study designs and statistical approaches for assessing measurement equivalence are provided. Conclusions The electronic administration of PRO measures offers many advantages over paper administration. We provide a general framework for decisions regarding the level of evidence needed to support modifications that are made to PRO measures when they are migrated from paper to ePRO devices. The key issues include: 1) the determination of the extent of modification required to administer the PRO on the ePRO device and 2) the selection and implementation of an effective strategy for testing the measurement equivalence of the two modes of administration. We hope that these good research practice recommendations provide a path forward for researchers interested in migrating PRO measures to electronic data collection platforms.
Summary Background When the mechanism of action behind treatment toxicity reflects the intended effect on the treatment target, the toxicity might be a useful marker for efficacy. During endocrine treatment of breast cancer, the occurrence of symptoms related to oestrogen depletion or oestrogen blockade might thus be a predictor of treatment effectiveness. In this retrospective analysis, the relation between the reported incidence of vasomotor or joint symptoms and breast cancer recurrence in the Arimidex, Tamoxifen, Alone or in Combination (ATAC) trial is assessed. Methods Women with hormone-receptor-positive tumours who reported vasomotor or joint symptoms at the first follow-up visit (3 months) in the ATAC trial (which assessed tamoxifen or anastrozole for adjuvant treatment of postmenopausal breast cancer) were compared with women without these symptoms to see if there was a relation between these symptoms and subsequent recurrence. The ATAC trial is registered as an International Standard Randomised Controlled Trial, number ISRCTN18233230. Findings 1486 of 3964 (37·5%) eligible women reported newly emergent vasomotor symptoms at the 3-month follow-up visit and had lower subsequent recurrence than those who did not report these symptoms (223 during 10 752 women-years of follow-up vs 366 during 11 573 women-years of follow-up, respectively; hazard ratio [HR] 0·84 [95% CI 0·71–1·00], p=0·04; adjusted for age, body-mass index, previous hormone-replacement therapy, nodal status, tumour size, and tumour grade). A greater decrease in breast-cancer recurrence was seen for the 1245 of 3964 (31·4%) eligible women who reported new joint symptoms at the 3-month follow-up visit compared with those not reporting these symptoms (158 during 9242 women-years of follow-up vs 366 during 11 573 women-years of follow-up; adjusted HR 0·60 [0·50–0·72], p<0·0001). 
Interpretation The appearance of new vasomotor symptoms or joint symptoms within the first 3 months of treatment is a useful biomarker, suggesting a greater response to endocrine treatment compared with women without these symptoms. Awareness of the relation between early treatment-emergent symptoms and beneficial response to therapy might be useful when reassuring patients who present with them, and might help to improve long-term treatment adherence when symptoms cannot be alleviated effectively. Funding Cancer Research UK and AstraZeneca.
Abstract Objectives Health valuation studies enhance economic evaluations of treatments by estimating the value of health-related quality of life (HRQOL). The Patient-Reported Outcomes Measurement Information System (PROMIS) includes a 29-item short-form HRQOL measure, the PROMIS-29. Methods To value PROMIS-29 responses on a quality-adjusted life-year scale, we conducted a national survey (N = 7557) using quota sampling based on the US 2010 Census. Based on 541 paired comparisons with over 350 responses each, pair-specific probabilities were incorporated into a weighted least-squares estimator. Results All losses in HRQOL influenced choice; however, respondents valued losses in physical function, anxiety, depression, sleep, and pain more than those in fatigue and social functioning. Conclusions This article introduces a novel approach to valuing HRQOL for economic evaluations using paired comparisons and provides a tool to translate PROMIS-29 responses into quality-adjusted life-years.