Background Global health items offer an efficient way of gathering general perceptions of health. These items provide useful summary information about health and are predictive of health care utilization and subsequent mortality. Methods We analyzed 10 self-reported global health items obtained from an internet survey as part of the Patient-Reported Outcome Measurement Information System (PROMIS) project. We derived summary scores from the global health items. We estimated the associations of the summary scores with the EQ-5D index score and the PROMIS physical function, pain, fatigue, emotional distress, and social health domain scores. Results Exploratory and confirmatory factor analyses supported a two-factor model. Global physical health (GPH; 4 items on overall physical health, physical function, pain, and fatigue) and global mental health (GMH; 4 items on quality of life, mental health, satisfaction with social activities, and emotional problems) scales were created. The scales had internal consistency reliability coefficients of 0.81 and 0.86, respectively. GPH correlated more strongly with the EQ-5D than did GMH (r = 0.76 vs. 0.59). GPH correlated most strongly with pain impact (r = -0.75) whereas GMH correlated most strongly with depressive symptoms (r = -0.71). Conclusions Two dimensions representing physical and mental health underlie the global health items in PROMIS. These global health scales can be used to efficiently summarize physical and mental health in patient-reported outcome studies.
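The internal consistency reliability reported above can be illustrated with a minimal sketch. It assumes the coefficient is Cronbach's alpha, which the abstract does not state, and it uses hypothetical responses rather than PROMIS data; the function name cronbach_alpha is likewise illustrative.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Return Cronbach's alpha for a list of respondents' item-score lists.

    Assumption: alpha is the internal consistency coefficient of interest.
    """
    n_items = len(rows[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (columns) and of the summed scale score.
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return (n_items / (n_items - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 4 items, each scored 1-5.
demo_scores = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
    [5, 5, 4, 5],
]
alpha = cronbach_alpha(demo_scores)
print(f"alpha = {alpha:.2f}")
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why a 4-item scale can still reach values above 0.8.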
OBJECTIVE: The objective of this study was to develop the classification system for version 2 of the SF-6D (SF-6Dv2) from the SF-36v2. SF-6Dv2 is an improved version of SF-6D, one of the most widely used generic measures of health for the calculation of quality-adjusted life years.
STUDY DESIGN AND SETTING: A 3-step process was undertaken to generate a new classification system: (1) factor analysis to establish dimensionality; (2) Rasch analysis to understand item performance; and (3) tests of differential item functioning. To evaluate robustness, Rasch analyses were performed in multiple subsets of 2 large cross-sectional datasets from recently discharged hospital patients and online patient samples.
RESULTS: On the basis of factor analysis, other psychometric evidence, cross-cultural considerations, and amenability to valuation, the 6-dimension classification used in SF-6D was maintained. SF-6Dv2 resulted in the following modifications to SF-6D: a simpler classification of physical function with clearer separation between levels; a more detailed 5-level description of role limitations; using negative wording to describe vitality; and using pain severity rather than pain interference.
CONCLUSIONS: The SF-6Dv2 classification system describes more distinct levels of health than SF-6D, changes the descriptions used for a number of dimensions, and provides clearer wording for health state valuation. The second stage of the study has developed a utility value set using discrete choice methods so that the measure can be used in health technology assessment. Further work should investigate the psychometric characteristics of the new instrument.
Abstract Objective To document the development and psychometric evaluation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) item bank and static instruments. Study Design and Setting The items were evaluated using qualitative and quantitative methods. A total of 16,065 adults answered item subsets (n > 2,200/item) on the Internet, with oversampling of the chronically ill. Classical test and item response theory methods were used to evaluate 149 PROMIS PF items plus 10 Short Form-36 and 20 Health Assessment Questionnaire-Disability Index items. A graded response model was used to estimate item parameters, which were normed to a mean of 50 (standard deviation [SD] = 10) in a US general population sample. Results The final bank consists of 124 PROMIS items covering upper, central, and lower extremity functions and instrumental activities of daily living. In simulations, a 10-item computerized adaptive test (CAT) eliminated floor and decreased ceiling effects, achieving higher measurement precision than any comparable length static tool across four SDs of the measurement range. Improved psychometric properties were transferred to the CAT's superior ability to identify differences between age and disease groups. Conclusion The item bank provides a common metric and can improve the measurement of PF by facilitating the standardization of patient-reported outcome measures and implementation of CATs for more efficient PF assessments over a larger range.
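The norming to a mean of 50 and an SD of 10 can be sketched as a linear rescaling of the latent trait estimate. This is an illustration only: it assumes the estimate (theta) is expressed relative to a reference sample with known mean and SD, and the function name and default values are hypothetical, not taken from the PROMIS scoring software.

```python
def to_t_score(theta, ref_mean=0.0, ref_sd=1.0):
    """Rescale a latent trait estimate to a metric with mean 50 and SD 10.

    Assumption: ref_mean and ref_sd describe the reference (norming) sample.
    """
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    return 50.0 + 10.0 * (theta - ref_mean) / ref_sd


# A respondent one reference SD above the reference mean.
print(to_t_score(1.0))
# A respondent one and a half reference SDs below the reference mean.
print(to_t_score(-1.5))
```

On this metric a 10-point difference corresponds to one SD in the reference sample, which is what makes scores from different instruments on the same bank comparable.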
IMPORTANCE: It is well established that selected lifestyle factors are individually associated with lower risk of chronic diseases, but how combinations of these factors are associated with disease-free life-years is unknown. OBJECTIVE: To estimate the association between healthy lifestyle and the number of disease-free life-years. DESIGN, SETTING, AND PARTICIPANTS: A prospective multicohort study, including 12 European studies as part of the Individual-Participant-Data Meta-analysis in Working Populations Consortium, was performed. Participants included 116 043 people free of major noncommunicable disease at baseline from August 7, 1991, to May 31, 2006. Data analysis was conducted from May 22, 2018, to January 21, 2020. EXPOSURES: Four baseline lifestyle factors (smoking, body mass index, physical activity, and alcohol consumption) were each allocated a score based on risk status: optimal (2 points), intermediate (1 point), or poor (0 points) resulting in an aggregated lifestyle score ranging from 0 (worst) to 8 (best). Sixteen lifestyle profiles were constructed from combinations of these risk factors. MAIN OUTCOMES AND MEASURES: The number of years between ages 40 and 75 years without chronic disease, including type 2 diabetes, coronary heart disease, stroke, cancer, asthma, and chronic obstructive pulmonary disease. RESULTS: Of the 116 043 people included in the analysis, the mean (SD) age was 43.7 (10.1) years and 70 911 were women (61.1%). During 1.45 million person-years at risk (mean follow-up, 12.5 years; range, 4.9-18.6 years), 17 383 participants developed at least 1 chronic disease. There was a linear association between overall healthy lifestyle score and the number of disease-free years, such that a 1-point improvement in the score was associated with an increase of 0.96 (95% CI, 0.83-1.08) disease-free years in men and 0.89 (95% CI, 0.75-1.02) years in women.
Comparing the best lifestyle score with the worst lifestyle score was associated with 9.9 (95% CI, 6.7-13.1) additional years without chronic diseases in men and 9.4 (95% CI, 5.4-13.3) additional years in women (P < .001 for dose-response). All of the 4 lifestyle profiles that were associated with the highest number of disease-free years included a body mass index less than 25 (calculated as weight in kilograms divided by height in meters squared) and at least 2 of the following factors: never smoking, physical activity, and moderate alcohol consumption. Participants with 1 of these lifestyle profiles reached age 70.3 (95% CI, 69.9-70.8) to 71.4 (95% CI, 70.9-72.0) years disease free depending on the profile and sex. CONCLUSIONS AND RELEVANCE: In this multicohort analysis, various healthy lifestyle profiles appeared to be associated with gains in life-years without major chronic diseases.
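The aggregated lifestyle score described under EXPOSURES can be sketched as a simple sum over the four factors. The point values and the 0 to 8 range come from the abstract; the dictionary keys, the function name, and the example participant are hypothetical, and the rules that classify a participant as optimal, intermediate, or poor on each factor are not given in the abstract and are not modeled here.

```python
from typing import Mapping

# Points per risk status, as stated in the abstract.
POINTS = {"poor": 0, "intermediate": 1, "optimal": 2}

# The four baseline lifestyle factors (key names are illustrative).
FACTORS = ("smoking", "body_mass_index", "physical_activity", "alcohol_consumption")


def lifestyle_score(risk_status: Mapping[str, str]) -> int:
    """Sum the points for the four factors, giving 0 (worst) to 8 (best)."""
    missing = [factor for factor in FACTORS if factor not in risk_status]
    if missing:
        raise ValueError(f"missing factors: {missing}")
    return sum(POINTS[risk_status[factor]] for factor in FACTORS)


# Hypothetical participant.
participant = {
    "smoking": "optimal",
    "body_mass_index": "intermediate",
    "physical_activity": "optimal",
    "alcohol_consumption": "poor",
}
print(lifestyle_score(participant))
```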
Purpose
The minimal important change (MIC) is defined as the smallest within-individual change in a patient-reported outcome measure (PROM) that patients on average perceive as important. We describe a method to estimate this value based on longitudinal confirmatory factor analysis (LCFA). The method is evaluated and compared with a recently published method based on longitudinal item response theory (LIRT) in simulated and real data. We also examined the effect of sample size on bias and precision of the estimate.
Methods
We simulated 108 samples with various characteristics in which the true MIC was simulated as the mean of individual MICs, and estimated MICs based on LCFA and LIRT. Additionally, both MICs were estimated in existing PROMIS Pain Behavior data from 909 patients. In another set of 3888 simulated samples with sample sizes of 125, 250, 500, and 1000, we estimated LCFA-based MICs.
Results
The MIC was recovered equally well with the LCFA method and the LIRT method, but the LCFA analyses were more than 50 times faster. In the Pain Behavior data (with higher scores indicating more pain behavior), an LCFA-based MIC for improvement was estimated to be 2.85 points (on a simple sum scale ranging 14–42), whereas the LIRT-based MIC was estimated to be 2.60. The sample size simulations showed that smaller sample sizes decreased the precision of the LCFA-based MIC and increased the risk of model non-convergence.
Conclusion
The MIC can be estimated accurately using LCFA, but sample sizes should preferably be greater than 125.
Patient-reported transition ratings are supposed to reflect the change between a previous baseline health state and a present follow-up state, but may reflect the present state to a greater extent. This so-called “present state bias” (PSB) potentially threatens the validity of transition ratings. Several criteria have been proposed to assess PSB. We examined how well these criteria perform and to what extent confirmatory factor analysis (CFA) for categorical data provides an accurate assessment of the degree of PSB.
We simulated multiple samples with baseline and follow-up item responses to a hypothetical questionnaire, and transition ratings. The samples varied with respect to various distributional characteristics and the degree of PSB. The performance of criteria proposed in the literature, and a new CFA-based criterion, were evaluated by the proportion of explained variance in PSB. In addition, four real datasets were analyzed.
The known criteria explained 36–74% of the variance in PSB. A new CFA-based criterion, namely the ratio of the factor loadings of the transition ratings plus one, explained 81–98% of the variance in PSB across the samples.
Present state bias in transition ratings can be estimated accurately using CFA.
INTRODUCTION: The inclusion of reference values for common patient-reported outcome (PRO) measures in clinical care settings provides a clinically relevant context for an individual patient’s PRO scores. PRO reference values are currently not reported in clinical care settings. This is a missed opportunity, as clinicians are familiar with the presence and interpretation of reference values, commonly provided alongside laboratory test results. Incorporating PRO reference values into clinical PRO reporting requires an understanding of the clinical purpose, the availability of an appropriate reference value, and graphical representation.
METHODS FOR PRO SCORE INTERPRETATION: We present reference value terminology adapted for PROs and discuss important differences between using reference values in PRO score interpretation and in other types of clinical measures from clinical chemistry. We outline the basic methodological approaches in obtaining a PRO reference sample and calculating reference intervals. Lastly, we provide recommendations on how to present and use PRO reference values in clinical care settings.
DISCUSSION: There is a strong, long-standing discipline behind reference value development and application in psychology and medicine, allowing for both providers and patients to understand comparisons and identify what is “out of range.” PRO reference values can be communicated in a wide range of ways within clinical care settings and are adaptable as required to different patient populations or clinical care situations. However, a notable adoption barrier is the expense and methodological expertise needed to establish and apply PRO reference values effectively in clinical encounters.
Objectives The aim of this study was to describe the development and the content of the Danish Psychosocial Work Environment Questionnaire (DPQ) and to test its reliability and validity. Methods We describe the identification of dimensions, the development of items, and the qualitative and quantitative tests of the reliability and validity of the DPQ. Reliability and validity of a 150-item version of the DPQ were evaluated in a stratified sample of 8958 employees in 14 job groups, of whom 4340 responded. Reliability was investigated using internal consistency and test-retest reliability. The factorial validity was investigated using confirmatory factor analysis (CFA). For each multi-item scale, we undertook CFA within each job group and multi-group CFA to investigate factorial invariance across job groups. Finally, using multi-group multi-factor CFA, we investigated whether scales were empirically distinct. Results Internal consistency reliabilities and test-retest reliabilities were satisfactory. Factorial validity of the multi-item scales was satisfactory within each of the 14 job groups. Factorial invariance was demonstrated for 10 of the 28 multi-item scales. The hypothesis that the scales of the DPQ were empirically distinct was supported. The final DPQ version consisted of 119 items covering 38 different psychosocial work environment dimensions. Conclusions Overall, the DPQ is a reliable and valid instrument for assessing psychosocial working conditions in a variety of job groups. The results indicate, however, that questions about psychosocial working conditions may be understood differently across job groups, which may have implications for the comparability of questionnaire-based measures of psychosocial working conditions across job groups.
Purpose
Meaningful thresholds are needed to interpret patient-reported outcome measure (PROM) results. This paper introduces a new method, based on item response theory (IRT), to estimate such thresholds. The performance of the method is examined in simulated datasets and two real datasets, and compared with other methods.
Methods
The IRT method involves fitting an IRT model to the PROM items and an anchor item indicating the criterion state of interest. The difficulty parameter of the anchor item represents the meaningful threshold on the latent trait. The latent threshold is then linked to the corresponding expected PROM score. We simulated 4500 item response datasets to a 10-item PROM, and an anchor item. The datasets varied with respect to the mean and standard deviation of the latent trait, and the reliability of the anchor item. The real datasets consisted of a depression scale with a clinical depression diagnosis as anchor variable and a pain scale with a patient acceptable symptom state (PASS) question as anchor variable.
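The linking step, from the anchor item's difficulty on the latent trait to an expected PROM score, can be sketched as follows. This is a sketch under stated assumptions, not the authors' implementation: it assumes a graded response model for the PROM items, treats the item parameters as already estimated, and uses hypothetical parameter values and function names throughout.

```python
import math


def expected_item_score(theta, discrimination, thresholds):
    """Expected score of an item scored 0..len(thresholds).

    Assumption: graded response model, where the expected score equals the
    sum over categories k >= 1 of P(response >= k | theta).
    """
    return sum(
        1.0 / (1.0 + math.exp(-discrimination * (theta - b)))
        for b in thresholds
    )


def expected_prom_score(theta, items):
    """Expected sum score across all PROM items at a given latent trait level."""
    return sum(expected_item_score(theta, a, bs) for a, bs in items)


# Hypothetical 10-item PROM: each item has discrimination 1.5 and
# category thresholds at -1, 0, and 1 (so each item is scored 0-3).
items = [(1.5, (-1.0, 0.0, 1.0))] * 10

# Hypothetical difficulty of the anchor item: the latent trait level at which
# the probability of being in the criterion state is 0.5.
anchor_difficulty = 0.0

# The meaningful threshold expressed on the PROM sum-score scale.
threshold_score = expected_prom_score(anchor_difficulty, items)
print(round(threshold_score, 2))
```

Because the threshold is defined on the latent trait and only then mapped to the score scale, it does not depend on how common the criterion state is in the sample, which is consistent with the prevalence results reported for this method.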
Results
The new IRT method recovered the true thresholds accurately across the simulated datasets. The other methods, except one, produced biased threshold estimates if the state prevalence was smaller or greater than 0.5. The adjusted predictive modeling method matched the new IRT method (also in the real datasets) but showed some residual bias if the prevalence was smaller than 0.3 or greater than 0.7.
Conclusions
The new IRT method perfectly recovers meaningful (interpretational) thresholds for multi-item questionnaires, provided that the data satisfy the assumptions for IRT analysis.
This study aimed at developing a shortened version of the EORTC QLQ-C30, one of the most widely used health-related quality of life questionnaires in oncology, for palliative care research. The study included interviews with 41 patients and 66 health care professionals in palliative care to determine the appropriateness, relevance and importance of the various domains of the QLQ-C30. Item response theory methods were used to shorten scales. Patients and health care professionals rated pain, physical function, emotional function, fatigue, global health status/quality of life, nausea/vomiting, appetite, dyspnoea, constipation, and sleep as most important. Therefore, these scales/items were retained in the questionnaire. Four scales were shortened without reducing measurement precision. Important dimensions not covered by the questionnaire were identified. The resulting 15-item EORTC QLQ-C15-PAL is a ‘core questionnaire’ for palliative care. Depending on the research questions, it may be supplemented by additional items, modules or questionnaires.