The Professional Quality of Life (ProQOL) scale is one of the most widely used measures of compassion satisfaction and fatigue despite there being little publicly available evidence to support its validity. This study, conducted among a sample of 310 child protection workers, assessed the construct validity of this measure using confirmatory factor analysis (CFA) and bifactor modeling. The CFA failed to confirm the adequacy of the three‐factor structure proposed by Stamm (2010). In response, a bifactor model postulating a factor structure with a general factor in addition to independent factors (compassion satisfaction, job burnout, and secondary traumatic stress) was proposed, highlighting the unidimensionality of the ProQOL while allowing for each subscale to be used separately. Moreover, this bifactor model of the ProQOL was moderately correlated with the Posttraumatic Disorder Checklist, r = −.427, p < .001, and strongly correlated with scales of well‐being at work, r = .694, p < .001, and psychological distress at work, r = −.666, p < .001, thus supporting the ProQOL's convergent validity. No associations were found between the ProQOL and the Life Event Checklist, which supports the ProQOL's discriminant validity. Overall, the results indicated that compassion satisfaction and compassion fatigue represent higher and lower levels of the same construct rather than two different constructs. Researchers and clinicians could therefore compute a single score to rate professionals’ individual levels of professional quality of life.
Theories about the emotion elicitation process have been proposed as a scoring rationale for tests of emotional understanding (EU) – a subcomponent of emotional intelligence (EI). Theory-based scoring represents a considerable improvement over approaches that rely on rather subjective group judgements. The aim of this article is twofold: Firstly, we discuss an important limitation of appraisal theories for scoring EU tests. We argue that theory-based scoring is only unambiguous if the cognitive appraisals of the target persons are presented in the situational descriptions. Secondly, we provide a theory-based situational judgement test of EU, the Theory-Based Test of Emotional Understanding (TBEU), which takes this limitation into account. In a study of N = 200 we present initial validity evidence of this new measure with regard to its intended one-dimensional structure, its relations to classical intelligence (in terms of convergent validity evidence), and the Big Five personality traits (in terms of discriminant validity evidence) at the level of latent variables. Overall, the results support the usefulness of emotion theories for the assessment of EU.
• Development of a situational judgement test of emotional understanding
• Stronger theoretical foundation of scoring keys
• Latent variable correlations with intelligence and the Big Five personality traits
Objectives:
Disorder has been measured by various data sources; however, little attention has been given to comparing the construct validity of different measures obtained through various methods in capturing social disorder and related phenomena.
Methods:
The multitrait-multimethod approach was used to triangulate the consistency between social disorder, prostitution and drug activity across resident surveys, systematic social observations, and police calls for service data.
Results:
Prostitution and drug activity showed convergent validity, while there was little evidence that social disorder was consistently measured across the three methods. None of the three social problem measures showed high discriminant validity. Drug activity seems to have the highest trait-specific discriminant validity across measures, and prostitution is the most identifiable measure across data sources. Social disorder was found to have low discriminant validity. However, the agreement between databases varies across the types of social problems.
Conclusions:
Social disorder appears to be the most difficult concept to define and measure consistently. The lack of correspondence across data sources cautions against the use of a single source of information in studying disorder. Future studies should explore the factors that shape perceptions of disorder and how to best measure disorder to test the broken windows thesis and related concepts.
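The multitrait-multimethod logic described in the abstract above can be illustrated with a minimal sketch. It assumes each measure is a list of values over the same units (for example, street segments), keyed by a hypothetical `(trait, method)` pair. The function names and the toy data are illustrative assumptions and do not come from the study. The sketch compares the mean correlation for the same trait measured by different methods (convergent evidence) with the mean correlation between different traits (which should be lower for discriminant evidence).

```python
from itertools import combinations
from math import sqrt
from statistics import mean
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (trait, method); hypothetical labelling scheme


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mtmm_summary(measures: Dict[Key, List[float]]) -> Dict[str, float]:
    """Average the same-trait/different-method and different-trait correlations."""
    convergent, heterotrait = [], []
    for (k1, v1), (k2, v2) in combinations(measures.items(), 2):
        (trait1, method1), (trait2, method2) = k1, k2
        r = pearson_r(v1, v2)
        if trait1 == trait2 and method1 != method2:
            convergent.append(r)   # same trait, different method
        elif trait1 != trait2:
            heterotrait.append(r)  # different traits, any method
    return {"convergent": mean(convergent), "heterotrait": mean(heterotrait)}


# Toy data for five units; the values are invented for illustration only.
toy_measures = {
    ("drug_activity", "survey"): [1, 2, 3, 4, 5],
    ("drug_activity", "observation"): [2, 4, 6, 8, 10],
    ("prostitution", "survey"): [5, 3, 4, 1, 2],
    ("prostitution", "observation"): [10, 6, 8, 2, 4],
}
summary = mtmm_summary(toy_measures)
```

Under this reading, a measure shows convergent validity when `summary["convergent"]` is high, and discriminant validity when it clearly exceeds `summary["heterotrait"]`. A full multitrait-multimethod analysis would inspect the whole correlation matrix rather than two averages.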
Despite the wide use of the Strengths and Difficulties Questionnaire (SDQ) to assess adolescent mental health, its psychometric functionality is still under debate. This study investigated the structural validity and reliability of the SDQ scores, and the resemblance of the SDQ sum scores and factor scores. Factor one-dimensionality and competing multifactor structures were tested against data. With the best acceptable models, measurement invariance was tested between genders and over time. Subscale reliability and correspondence between subscale sum scores and factor scores were estimated. The nationally representative self-report data from 23,980 Finnish early (12-13 years) and mid- (15-16 years) adolescents (50.4% girls) were collected from two cohorts in 2008 and 2013. The results showed that among early adolescents, the revised SDQ with a controlled method effect had an excellent fit. In contrast, none of the tested models had an acceptable fit among the mid-adolescents. Among early adolescents, strong measurement invariance was achieved between genders and over time. Three of the five subscales were one-dimensional, and all subscales had low reliability. The resemblance between the subscale sum scores and factor scores was alarmingly low. Researchers should be cautious when using the SDQ Total Difficulties sum score or the subscale scores as they may be substantially biased, and practitioners should desist from using the SDQ as a screening tool in its current form. This study strongly supports the revision of the SDQ. In line with the previous findings, we suggest rewording the worst functioning items and revising the reverse-worded difficulties items.
Public Significance Statement
The self-reported SDQ contains method effects which can and should be controlled when the SDQ is used in research, and more research is needed to guarantee the reliable use of the SDQ sum scores for assessing adolescent mental health, because the sum scores in their current form may be substantially biased.
The authors evaluated the reliability and validity of a set of 7 behavioral decision-making tasks, measuring different aspects of the decision-making process. The tasks were administered to individuals from diverse populations. Participants showed relatively consistent performance within and across the 7 tasks, which were then aggregated into an Adult Decision-Making Competence (A-DMC) index that showed good reliability. The validity of the 7 tasks and of overall A-DMC emerges in significant relationships with measures of socioeconomic status, cognitive ability, and decision-making styles. Participants who performed better on the A-DMC were less likely to report negative life events indicative of poor decision making, as measured by the Decision Outcomes Inventory. Significant predictive validity remains when controlling for demographic measures, measures of cognitive ability, and constructive decision-making styles. Thus, A-DMC appears to be a distinct construct relevant to adults' real-world decisions.
Background:
The US Food and Drug Administration’s guidance for industry document on patient-reported outcomes (PRO) defines content validity as “the extent to which the instrument measures the concept of interest” (FDA, 2009, p. 12). According to Strauss and Smith (2009), construct validity “is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity” (p. 7). Hence, both qualitative and quantitative information are essential in evaluating the validity of measures.
Methods:
We review classical test theory and item response theory (IRT) approaches to evaluating PRO measures, including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which the hypothesized “difficulty” (severity) order of items is represented by observed responses.
Results:
If a researcher has few qualitative data and wants to get preliminary information about the content validity of the instrument, then descriptive assessments using classical test theory should be the first step. As the sample size grows during subsequent stages of instrument development, confidence in the numerical estimates from Rasch and other IRT models (as well as those of classical test theory) would also grow.
Conclusion:
Classical test theory and IRT can be useful in providing a quantitative assessment of items and scales during the content-validity phase of PRO-measure development. Depending on the particular type of measure and the specific circumstances, classical test theory and/or IRT should be considered to help maximize the content validity of PRO measures.
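Two of the descriptive classical test theory checks named in the abstract above, response-category frequencies and floor and ceiling effects, can be sketched in a few lines. This is an illustrative sketch only: the function names, the 0 to 4 response scale, and the example data are assumptions and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List


def category_frequencies(responses: List[int]) -> Dict[int, float]:
    """Proportion of respondents choosing each observed response category of one item."""
    counts = Counter(responses)
    n = len(responses)
    return {category: counts[category] / n for category in sorted(counts)}


def floor_ceiling(scale_scores: List[int], min_score: int, max_score: int) -> Dict[str, float]:
    """Proportion of respondents at the lowest and highest possible scale score."""
    n = len(scale_scores)
    return {
        "floor": sum(1 for s in scale_scores if s == min_score) / n,
        "ceiling": sum(1 for s in scale_scores if s == max_score) / n,
    }


# Hypothetical data: one item scored 0-4, and a scale with possible scores 0-20.
item_responses = [0, 0, 1, 2, 2, 2, 3, 4]
scale_scores = [0, 0, 5, 10, 12, 20, 20, 20]

item_freqs = category_frequencies(item_responses)
effects = floor_ceiling(scale_scores, min_score=0, max_score=20)
```

Sparsely used categories in `item_freqs`, or a large share of respondents at the floor or ceiling in `effects`, would flag items or scales that may not cover the intended range of the concept. What counts as "large" is a judgment call that this sketch does not settle.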
The Static-99, Static-99R, and STABLE-2007 are internationally well-established instruments for predicting static and dynamic risks of sexual recidivism in individuals convicted of sexual offenses. Previous meta-analyses assessed their predictive and incremental validity, but none has yet compared the two Static versions and the Static-STABLE combinations. Here, we implemented diagnostic test accuracy network meta-analysis (DTA-NMA) to compare all tests and identify optimal cutoffs in one comprehensive analysis. The DTA-NMA included 32 samples comprising 45,224 adult male individuals. More information was available on the Static-99 (22 samples; 34,316 individuals) and the Static-99R (13 samples; 27,243 individuals), compared to the Static-99/STABLE-2007 (three samples; 762 individuals), the Static-99R/STABLE-2007 (two samples; 2,972 individuals), and the STABLE-2007 (three samples; 816 individuals). The primary outcome was the area under the receiver operating characteristic curve (AUC). The secondary outcomes were sensitivity and specificity. Optimal cutoffs were determined using the Youden index. The AUC suggested moderate predictive validity for Static-99 and Static-99R, whereas STABLE-2007 had no predictive value. The optimal cutoff of Static-99R was suggested to have higher specificity than that of Static-99, whereas sensitivity was comparable between instruments. The notion of incremental validity for STABLE-2007 could not be confirmed. This work represents the first meta-analysis to compare Static-99, Static-99R, STABLE-2007, and their combinations in one analysis. Static-99R demonstrated the highest specificity in predicting the risk of sexual recidivism, indicating a potential advantage in detecting true nonrecidivists. The findings are discussed, considering the current recommendations for assessing the risk of sexual recidivism in the criminal justice system.
Public Significance Statement
This meta-analysis suggests an advantage of the Static-99R over the Static-99 in predictive validity and no incremental validity of the STABLE-2007 in assessing the risk of sexual recidivism in adult male individuals convicted of sexual offenses.
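The abstract above states that optimal cutoffs were determined using the Youden index, J = sensitivity + specificity − 1. The sketch below shows that selection rule on a single hypothetical sample. It is not the network meta-analytic procedure used in the study. The function name, the "score at or above the cutoff means high risk" convention, and the toy data are assumptions made for illustration.

```python
from typing import List, Tuple


def youden_optimal_cutoff(
    scores: List[float], recidivated: List[bool]
) -> Tuple[float, float, float, float]:
    """Return (cutoff, sensitivity, specificity, J) for the cutoff maximizing J.

    A case is classified as high risk when its score is >= cutoff.
    Ties are resolved in favour of the lowest cutoff.
    """
    positives = sum(recidivated)
    negatives = len(recidivated) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")

    best = None
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, recidivated) if y and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, recidivated) if not y and s < cutoff)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[3]:
            best = (cutoff, sensitivity, specificity, j)
    return best


# Invented risk scores and outcomes for eight individuals.
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8]
toy_outcomes = [False, False, False, True, False, True, True, True]
best_cutoff = youden_optimal_cutoff(toy_scores, toy_outcomes)
```

On this toy data the cutoffs 4 and 6 both reach J = 0.75, one with perfect sensitivity and one with perfect specificity. This shows why sensitivity and specificity are worth reporting alongside the cutoff itself.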
Organizational culture is an important predictor of organizational effectiveness, but it is also part of an organizational system that consists of highly interdependent elements such as strategy, structure, leadership, and high performance work practices (HPWPs). As such, accounting for the effect of culture's system correlates is important to specify more precisely organizational culture's predictive value for organizational outcomes. To date, however, efforts to connect culture with its system correlates have proceeded independently without integration. This trend is problematic because it raises questions about the strength of culture's association with its system correlates, and it casts uncertainty about organizational culture's predictive validity for organizational outcomes relative to other elements of an organization's system. We addressed these issues by conducting a meta-analysis based on 148 independent samples (N = 26,196 organizations and 556,945 informants). Results generally supported hypothesized predictions linking culture with strategy, structure, leadership, and HPWPs. Meta-analytic regressions and relative weight analyses further revealed that culture dimensions explained unique variance in effectiveness criteria after controlling for the effects of leadership and HPWPs but varied across effectiveness criteria in terms of relative importance. We discuss theoretical and practical implications and highlight several avenues for future research.
As research and clinical settings increasingly emphasize questions of change, it is crucial that our mechanistic and outcome variables are established as reliable and valid measures of such change. However, there is often a mismatch between the purposes for which symptom measures were developed and validated versus their application. Traditional psychometric theory has focused largely on between-person change, whereas increasingly research and clinical questions concern within-person change. We examined the psychometric properties of two commonly used measures of obsessive-compulsive symptoms (Yale-Brown Obsessive Compulsive Scale, YBOCS; Dimensional Obsessive-Compulsive Scale, DOCS) within a longitudinal treatment context (N = 570). Regarding reliability, we applied traditional (i.e., internal consistency at each week) and novel methods that allow for examination of the reliability of both within- and between-person change (i.e., variance partitioning based on generalizability theory). We examined longitudinal concurrent validity by correlating per-person slopes of obsessive-compulsive and depression symptom measures obtained via mixed-effects models. Within-person change reliability was acceptable or good for the YBOCS and DOCS total scores (.77, .83), suggesting that these measures are capable of capturing meaningful changes that exist within persons over time, and between-person change reliability was excellent (.99-1.0). Per-person slopes analyses supported the longitudinal concurrent validity of both measures. Our data support the continued use of the YBOCS and DOCS as measures of obsessive-compulsive symptoms for the purpose of many longitudinal research questions. The current study provides a template for reestablishing the psychometric properties of other commonly used measures in the context of longitudinal investigations.
Public Significance Statement
Within research and clinical settings, it is crucial that the measures used to assess change are established as reliable and valid (i.e., consistently measuring changes in the constructs expected to change during treatment). Our research using novel statistical methods suggests that two measures of obsessive-compulsive symptoms (YBOCS and DOCS) meaningfully capture changes that occur within individuals during treatment, as well as differences between individuals.
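The idea of correlating per-person slopes, as described in the abstract above, can be sketched as follows. The study obtained slopes from mixed-effects models. This sketch substitutes a separate ordinary least squares line per person, which is a deliberate simplification. All names and data are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def ols_slope(times: Sequence[float], values: Sequence[float]) -> float:
    """Least-squares slope of values regressed on times for one person."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented weekly scores for three people (weeks 0-3).
weeks = [0, 1, 2, 3]
ocd_scores = [[30, 26, 22, 18], [28, 27, 26, 25], [25, 25, 24, 24]]
depression_scores = [[20, 18, 16, 14], [15, 15, 14, 14], [18, 18, 18, 17]]

ocd_slopes: List[float] = [ols_slope(weeks, person) for person in ocd_scores]
dep_slopes: List[float] = [ols_slope(weeks, person) for person in depression_scores]
slope_correlation = pearson_r(ocd_slopes, dep_slopes)
```

A positive `slope_correlation` means that people whose obsessive-compulsive symptoms declined faster also tended to show faster declines in depression symptoms, which is the pattern that longitudinal concurrent validity predicts. Mixed-effects slopes would additionally shrink noisy individual estimates toward the group trend, which per-person least squares does not do.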
This paper introduces virtual reality as an experimental method for the language sciences and provides a review of recent studies using the method to answer fundamental, psycholinguistic research questions. It is argued that virtual reality demonstrates that ecological validity and experimental control should not be conceived of as two extremes on a continuum, but rather as two orthogonal factors. Benefits of using virtual reality as an experimental method include that in a virtual environment, as in the real world, there is no artificial spatial divide between participant and stimulus. Moreover, virtual reality experiments do not necessarily have to include a repetitive trial structure or an unnatural experimental task. Virtual agents outperform experimental confederates in terms of the consistency and replicability of their behavior, allowing for reproducible science across participants and research labs. The main promise of virtual reality as a tool for the experimental language sciences, however, is that it shifts theoretical focus towards the interplay between different modalities (e.g., speech, gesture, eye gaze, facial expressions) in dynamic and communicative real-world environments, complementing studies that focus on one modality (e.g., speech) in isolation.