We developed an algorithm for identifying U.S. veterans with a history of posttraumatic stress disorder (PTSD), using the Department of Veterans Affairs (VA) electronic medical record (EMR) system. This work was motivated by the need to create a valid EMR-based phenotype to identify thousands of cases and controls for a genome-wide association study of PTSD in veterans. We used manual chart review (n = 500) as the gold standard. For both the algorithm and chart review, three classifications were possible: likely PTSD, possible PTSD, and likely not PTSD. We used Lasso regression with cross-validation to select statistically significant predictors of PTSD from the EMR and then generate a predicted probability score of being a PTSD case for every participant in the study population (range: 0–1.00). Comparing the performance of our probabilistic approach (Lasso algorithm) to a rule-based approach (International Classification of Diseases [ICD] algorithm), the Lasso algorithm showed modestly higher overall percent agreement with chart review than the ICD algorithm (80% vs. 75%), higher sensitivity (0.95 vs. 0.84), and higher accuracy (AUC = 0.95 vs. 0.90). We applied a 0.7 probability cut-point to the Lasso results to determine final PTSD case-control status for the VA population. The final algorithm had a 0.99 sensitivity, 0.99 specificity, 0.95 positive predictive value, and 1.00 negative predictive value for PTSD classification (grouping possible PTSD and likely not PTSD) as determined by chart review. This algorithm may be useful for other research and quality improvement endeavors within the VA.
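The evaluation step described above (thresholding a predicted probability at 0.7 and comparing the result with chart review) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `diagnostic_metrics`, the toy scores, and the choice to treat a score exactly at the cut-point as a case are all assumptions.

```python
from typing import Dict, Sequence


def diagnostic_metrics(
    probabilities: Sequence[float],
    chart_review_positive: Sequence[bool],
    cut_point: float = 0.7,
) -> Dict[str, float]:
    """Compare thresholded probability scores with chart-review labels."""
    tp = fp = tn = fn = 0
    for prob, truth in zip(probabilities, chart_review_positive):
        # Assumption: a score equal to the cut-point is classified as a case.
        predicted = prob >= cut_point
        if predicted and truth:
            tp += 1
        elif predicted:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1

    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Made-up scores and chart-review labels, for illustration only.
scores = [0.95, 0.80, 0.72, 0.40, 0.75, 0.65, 0.10, 0.05]
labels = [True, True, True, True, False, False, False, False]
metrics = diagnostic_metrics(scores, labels)
print(metrics)
```

Moving `cut_point` trades sensitivity against specificity, which is why a single reported threshold is best read together with the AUC.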
Abstract
Spanish Abstracts by Asociación Chilena de Estrés Traumático (ACET)
Validation of an electronic medical record-based algorithm for identifying posttraumatic stress disorder in U.S. veterans
VALIDATION OF A PTSD ALGORITHM
We developed an algorithm to identify U.S. veterans with a history of posttraumatic stress disorder (PTSD), using the electronic medical record (EMR) system of the Department of Veterans Affairs (VA). This work was motivated by the need to create a valid EMR-based phenotype to identify thousands of cases and controls for a genome-wide association study of PTSD in veterans. We used manual chart review (n = 500) as the gold standard. For both the algorithm and the chart review, three classifications were possible: likely PTSD, possible PTSD, and likely not PTSD. We used Lasso regression with cross-validation to select statistically significant predictors of PTSD from the EMR and then generate a predicted probability score of being a PTSD case for each participant in the study population (range: 0–1.00). Comparing the performance of our probabilistic approach (Lasso algorithm) with a rule-based approach (International Classification of Diseases [ICD] algorithm), the Lasso algorithm showed modestly higher overall percent agreement with chart review than the ICD algorithm (80% vs. 75%), higher sensitivity (0.95 vs. 0.84), and higher accuracy (AUC = 0.95 vs. 0.90). We applied a probability cut-point of 0.7 to the Lasso results to determine final PTSD case-control status for the VA population. The final algorithm had a sensitivity of 0.99, a specificity of 0.99, a positive predictive value of 0.95, and a negative predictive value of 1.00 for PTSD classification (grouping possible PTSD and likely not PTSD) as determined by chart review. This algorithm may be useful for other research and quality improvement efforts within the VA.
Abstract
Traditional and Simplified Chinese Abstracts by the Asian Society for Traumatic Stress Studies (AsianSTSS)
Validation of an Electronic Medical Record-Based Algorithm for Identifying Posttraumatic Stress Disorder in U.S. Veterans
Traditional Chinese and Simplified Chinese
Title: Validation of an electronic health record-based algorithm for identifying U.S. veterans with posttraumatic stress disorder
Abstract: Using the U.S. Department of Veterans Affairs (VA) electronic medical record (EMR) system, we built an algorithm to identify U.S. veterans who have had posttraumatic stress disorder (PTSD). We undertook this study because genome-wide association research on PTSD in veterans needs a valid EMR-based phenotype to identify thousands of cases and controls. We used manual chart review as the gold standard (n = 500). Both the algorithm and the chart review allowed three classifications: likely PTSD, possible PTSD, and likely not PTSD. We used Lasso regression with cross-validation to select statistically significant predictors of PTSD from the EMR, and then predicted a probability score of having PTSD for each member of the study sample (range: 0–1.00). Compared with the rule-based approach (International Classification of Diseases [ICD] algorithm), our probabilistic approach (Lasso algorithm) showed slightly higher overall percent agreement with chart review (80% vs. 75%), along with higher sensitivity (0.95 vs. 0.84) and accuracy (AUC = 0.95 vs. 0.90). We applied a probability cut-point of 0.7 to the Lasso results to define final PTSD case-control status for the VA population. Against chart review for PTSD classification (grouping possible PTSD with likely not PTSD), the final algorithm had a sensitivity of 0.99, specificity of 0.99, positive predictive value of 0.95, and negative predictive value of 1.00. This algorithm may be useful for other research and quality improvement projects within the VA.
Previous studies of the relationship between fried food consumption and coronary artery disease (CAD) have yielded conflicting results. We tested the hypothesis that frequent fried food consumption is associated with a higher risk of incident CAD events in Million Veteran Program (MVP) participants.
Veterans Health Administration electronic health record data were linked to questionnaires completed at MVP enrollment. Self-reported fried food consumption at baseline was categorized as <1, 1–3, or 4–6 times per week, or daily. The outcome of interest was non-fatal myocardial infarction or CAD events. We fitted a Cox regression model adjusting for age, sex, race, education, exercise, smoking, and alcohol consumption.
Of 154,663 MVP enrollees with survey data, mean age was 64 years and 90% were men. During a mean follow-up of approximately 3 years, there were 6,725 CAD events. There was a positive linear relationship between frequency of fried food consumption and risk of CAD (p for trend 0.0015). Multivariable adjusted hazard ratios (95% CI) were 1.0 (ref), 1.07 (1.01–1.13), 1.08 (1.01–1.16), and 1.14 (1.03–1.27) across consecutive increasing categories of fried food intake.
In a large national cohort of U.S. Veterans, fried food consumption has a positive, dose-dependent association with CAD.
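As a reading aid for the hazard ratios reported above, the standard error of each log hazard ratio can be approximately recovered from its 95% confidence interval, assuming the interval is a symmetric Wald interval on the log scale. The sketch below is illustrative arithmetic on the rounded published values, not a reanalysis; the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def log_hr_standard_error(lower: float, upper: float) -> float:
    """Approximate SE of log(HR), assuming a Wald interval on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def wald_z(hazard_ratio: float, lower: float, upper: float) -> float:
    """Approximate Wald z statistic for testing HR = 1."""
    return math.log(hazard_ratio) / log_hr_standard_error(lower, upper)


# Hazard ratios and 95% CIs from the abstract (reference category: <1 time/week).
reported = {
    "1-3 times/week": (1.07, 1.01, 1.13),
    "4-6 times/week": (1.08, 1.01, 1.16),
    "daily": (1.14, 1.03, 1.27),
}

for category, (hr, lo, hi) in reported.items():
    se = log_hr_standard_error(lo, hi)
    print(f"{category}: SE(log HR) ~ {se:.4f}, z ~ {wald_z(hr, lo, hi):.2f}")
```

Because the published values are rounded to two decimals, the recovered standard errors are approximate.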
Post-COVID-19 condition (colloquially known as "long COVID-19"), characterized as postacute sequelae of SARS-CoV-2, has no universal clinical case definition. Recent efforts have focused on understanding long COVID-19 symptoms, and electronic health record (EHR) data provide a unique resource for understanding this condition. The introduction of the International Classification of Diseases, Tenth Revision (ICD-10) code U09.9 for "Post COVID-19 condition, unspecified" to identify patients with long COVID-19 has provided a method of evaluating this condition in EHRs; however, the accuracy of this code is unclear.
This study aimed to characterize the utility and accuracy of the U09.9 code across 3 health care systems (the Veterans Health Administration, the Beth Israel Deaconess Medical Center, and the University of Pittsburgh Medical Center) against patients identified with long COVID-19 via a chart review by operationalizing the World Health Organization (WHO) and Centers for Disease Control and Prevention (CDC) definitions.
Patients who were COVID-19 positive with either a U07.1 ICD-10 code or positive polymerase chain reaction test within these health care systems were identified for chart review. Among this cohort, we sampled patients based on two approaches: (1) with a U09.9 code and (2) without a U09.9 code but with a new onset long COVID-19-related ICD-10 code, which allows us to assess the sensitivity of the U09.9 code. To operationalize the long COVID-19 definition based on health agency guidelines, symptoms were grouped into a "core" cluster of 11 commonly reported symptoms among patients with long COVID-19 and an extended cluster that captured all other symptoms by disease domain. Patients having ≥2 symptoms persisting for ≥60 days that were new onset after their COVID-19 infection, with ≥1 symptom in the core cluster, were labeled as having long COVID-19 per chart review. The code's performance was compared across 3 health care systems and across different time periods of the pandemic.
Overall, 900 patient charts were reviewed across 3 health care systems. The prevalence of long COVID-19 among the cohort with the U09.9 ICD-10 code based on the operationalized WHO definition was between 23.2% and 62.4% across these health care systems. We also evaluated a less stringent version of the WHO definition and the CDC definition and observed an increase in the prevalence of long COVID-19 at all 3 health care systems.
This is one of the first studies to evaluate the U09.9 code against a clinical case definition for long COVID-19, as well as the first to apply this definition to EHR data using a chart review approach on a nationwide cohort across multiple health care systems. This chart review approach can be implemented at other EHR systems to further evaluate the utility and performance of the U09.9 code.
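The chart-review labeling rule described above (at least 2 new-onset symptoms persisting at least 60 days, with at least 1 in the core cluster) can be sketched as a small function. This is a hedged illustration: the `Symptom` record, its field names, and the example symptoms are hypothetical, and the sketch assumes the core symptom must itself be new onset and persist for 60 days or more, which the abstract does not state explicitly.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Symptom:
    name: str            # hypothetical label taken from the chart
    duration_days: int   # how long the symptom persisted
    new_onset: bool      # True if it first appeared after the COVID-19 infection
    core: bool           # True if it belongs to the 11-symptom core cluster


def has_long_covid(
    symptoms: Iterable[Symptom],
    min_symptoms: int = 2,
    min_days: int = 60,
    min_core: int = 1,
) -> bool:
    """Apply the operationalized definition to one patient's charted symptoms."""
    qualifying = [
        s for s in symptoms if s.new_onset and s.duration_days >= min_days
    ]
    core_count = sum(1 for s in qualifying if s.core)
    return len(qualifying) >= min_symptoms and core_count >= min_core


# Made-up patients, for illustration only.
patient_a = [
    Symptom("fatigue", 90, True, True),
    Symptom("rash", 70, True, False),
]
patient_b = [
    Symptom("rash", 70, True, False),
    Symptom("hair loss", 80, True, False),
]
print(has_long_covid(patient_a), has_long_covid(patient_b))
```

Exposing the thresholds as parameters makes it straightforward to express a less stringent variant of the definition, such as the one the abstract mentions evaluating.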
A significant proportion of SARS-CoV-2 infected individuals experience post-COVID-19 condition months after initial infection.
To determine the rates, clinical setting, risk factors, and symptoms associated with the documentation of the International Statistical Classification of Diseases, Tenth Revision (ICD-10) code U09.9 for post-COVID-19 condition after acute infection.
This retrospective cohort study was performed within the US Department of Veterans Affairs (VA) health care system. Veterans with a positive SARS-CoV-2 test result between October 1, 2021, the date ICD-10 code U09.9 was introduced, and January 31, 2023 (n = 388 980), and a randomly selected subsample of patients with the U09.9 code (n = 350) whose symptom prevalence was assessed by systematic medical record review, were included in the analysis.
Positive SARS-CoV-2 test result.
Rates, clinical setting, risk factors, and symptoms associated with ICD-10 code U09.9 in the medical record.
Among the 388 980 persons with a positive SARS-CoV-2 test, the mean (SD) age was 61.4 (16.1) years; 87.3% were men. In terms of race and ethnicity, 0.8% were American Indian or Alaska Native, 1.4% were Asian, 20.7% were Black, 9.3% were Hispanic or Latino, 1.0% were Native Hawaiian or Other Pacific Islander, and 67.8% were White. Cumulative incidence of U09.9 documentation was 4.79% (95% CI, 4.73%-4.87%) at 6 months and 5.28% (95% CI, 5.21%-5.36%) at 12 months after infection. Factors independently associated with U09.9 documentation included older age, female sex, Hispanic or Latino ethnicity, comorbidity burden, and severe acute infection manifesting by symptoms, hospitalization, or ventilation. Primary vaccination (adjusted hazard ratio [AHR], 0.80 [95% CI, 0.78-0.83]) and booster vaccination (AHR, 0.66 [95% CI, 0.64-0.69]) were associated with a lower likelihood of U09.9 documentation. Marked differences by geographic region and facility in U09.9 code documentation may reflect local screening and care practices. Among the 350 patients undergoing systematic medical record review, the most common symptoms documented in the medical records among patients with the U09.9 code were shortness of breath (130 [37.1%]), fatigue or exhaustion (78 [22.3%]), cough (63 [18.0%]), reduced cognitive function or brain fog (22 [6.3%]), and change in smell and/or taste (20 [5.7%]).
In this cohort study of 388 980 veterans, documentation of ICD-10 code U09.9 had marked regional and facility-level variability. Strong risk factors for U09.9 documentation were identified, while vaccination appeared to be protective. Accurate and consistent documentation of U09.9 is needed to maximize its utility in tracking patients for clinical care and research. Future studies should examine the long-term trajectory of individuals with U09.9 documentation.
IMPORTANCE: Data are limited regarding statin therapy for primary prevention of atherosclerotic cardiovascular disease (ASCVD) in adults 75 years and older.
OBJECTIVE: To evaluate the role of statin use for mortality and primary prevention of ASCVD in veterans 75 years and older.
DESIGN, SETTING, AND PARTICIPANTS: Retrospective cohort study that used Veterans Health Administration (VHA) data on adults 75 years and older, free of ASCVD, and with a clinical visit in 2002-2012. Follow-up continued through December 31, 2016. All data were linked to Medicare and Medicaid claims and pharmaceutical data. A new-user design was used, excluding those with any prior statin use. Cox proportional hazards models were fit to evaluate the association of statin use with outcomes. Analyses were conducted using propensity score overlap weighting to balance baseline characteristics.
EXPOSURES: Any new statin prescription.
MAIN OUTCOMES AND MEASURES: The primary outcomes were all-cause and cardiovascular mortality. Secondary outcomes included a composite of ASCVD events (myocardial infarction, ischemic stroke, and revascularization with coronary artery bypass graft surgery or percutaneous coronary intervention).
RESULTS: Of 326 981 eligible veterans (mean [SD] age, 81.1 [4.1] years; 97% men; 91% white), 57 178 (17.5%) newly initiated statins during the study period. During a mean follow-up of 6.8 (SD, 3.9) years, a total of 206 902 deaths occurred, including 53 296 cardiovascular deaths, with 78.7 and 98.2 total deaths/1000 person-years among statin users and nonusers, respectively (weighted incidence rate difference [IRD]/1000 person-years, –19.5 [95% CI, –20.4 to –18.5]). There were 22.6 and 25.7 cardiovascular deaths per 1000 person-years among statin users and nonusers, respectively (weighted IRD/1000 person-years, –3.1 [95% CI, –3.6 to –2.6]). For the composite ASCVD outcome there were 123 379 events, with 66.3 and 70.4 events/1000 person-years among statin users and nonusers, respectively (weighted IRD/1000 person-years, –4.1 [95% CI, –5.1 to –3.0]). After propensity score overlap weighting was applied, the hazard ratio was 0.75 (95% CI, 0.74-0.76) for all-cause mortality, 0.80 (95% CI, 0.78-0.81) for cardiovascular mortality, and 0.92 (95% CI, 0.91-0.94) for a composite of ASCVD events when comparing statin users with nonusers.
CONCLUSIONS AND RELEVANCE: Among US veterans 75 years and older and free of ASCVD at baseline, new statin use was significantly associated with a lower risk of all-cause and cardiovascular mortality. Further research, including from randomized clinical trials, is needed to more definitively determine the role of statin therapy in older adults for primary prevention of ASCVD.
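The weighted incidence rate differences above rest on propensity score overlap weighting. In the standard formulation of that technique, a treated person is weighted by one minus the propensity score and an untreated person by the propensity score itself. The sketch below shows that weighting applied to event rates per 1000 person-years; it is a minimal illustration under that assumption, with a hypothetical `Person` record and made-up data, not the study's analysis code.

```python
from typing import NamedTuple, Sequence


class Person(NamedTuple):
    statin_user: bool
    propensity: float    # estimated probability of starting a statin
    events: int          # number of outcome events observed
    person_years: float  # follow-up time contributed


def overlap_weight(person: Person) -> float:
    """Overlap weight: 1 - e(x) for the treated, e(x) for the untreated."""
    if person.statin_user:
        return 1.0 - person.propensity
    return person.propensity


def weighted_rate_per_1000(people: Sequence[Person], statin_user: bool) -> float:
    """Weighted events per 1000 person-years within one exposure group."""
    group = [p for p in people if p.statin_user == statin_user]
    weighted_events = sum(overlap_weight(p) * p.events for p in group)
    weighted_time = sum(overlap_weight(p) * p.person_years for p in group)
    return 1000.0 * weighted_events / weighted_time


def weighted_ird_per_1000(people: Sequence[Person]) -> float:
    """Weighted incidence rate difference, users minus nonusers."""
    return weighted_rate_per_1000(people, True) - weighted_rate_per_1000(people, False)


# Made-up cohort, for illustration only.
cohort = [
    Person(True, 0.5, 1, 10.0),
    Person(True, 0.8, 0, 10.0),
    Person(False, 0.5, 2, 10.0),
    Person(False, 0.2, 1, 10.0),
]
ird = weighted_ird_per_1000(cohort)
print(round(ird, 1))

# The reported all-cause mortality rates are consistent with the reported IRD.
print(round(78.7 - 98.2, 1))
```

Overlap weights emphasize people whose propensity scores are near 0.5, that is, those for whom either treatment choice was plausible.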
The Million Veteran Program (MVP) was established in 2011 as a national research initiative to determine how genetic variation influences the health of US military veterans. Here we genotyped 312,571 MVP participants using a custom biobank array and linked the genetic data to laboratory and clinical phenotypes extracted from electronic health records covering a median of 10.0 years of follow-up. Among 297,626 veterans with at least one blood lipid measurement, including 57,332 black and 24,743 Hispanic participants, we tested up to around 32 million variants for association with lipid levels and identified 118 novel genome-wide significant loci after meta-analysis with data from the Global Lipids Genetics Consortium (total n > 600,000). Through a focus on mutations predicted to result in a loss of gene function and a phenome-wide association study, we propose novel indications for pharmaceutical inhibitors targeting PCSK9 (abdominal aortic aneurysm), ANGPTL4 (type 2 diabetes) and PDE3B (triglycerides and coronary disease).
Phenotypes are the foundation for clinical and genetic studies of disease risk and outcomes. The growth of biobanks linked to electronic medical record (EMR) data has both facilitated and increased the demand for efficient, accurate, and robust approaches for phenotyping millions of patients. Challenges to phenotyping with EMR data include variation in the accuracy of codes, as well as the high level of manual input required to identify features for the algorithm and to obtain gold standard labels. To address these challenges, we developed PheCAP, a high-throughput semi-supervised phenotyping pipeline. PheCAP begins with data from the EMR, including structured data and information extracted from the narrative notes using natural language processing (NLP). The standardized steps integrate automated procedures, which reduce the level of manual input, and machine learning approaches for algorithm training. PheCAP itself can be executed in 1-2 days if all data are available; however, the timing is largely dependent on the chart review stage, which typically requires at least 2 weeks. The final products of PheCAP include a phenotype algorithm, the probability of the phenotype for all patients, and a phenotype classification (yes or no).
Habitual alcohol use can be an indicator of alcohol dependence, which is associated with a wide range of serious health problems.
We completed a genome-wide association study in 126,936 European American and 17,029 African American subjects in the Veterans Affairs Million Veteran Program for a quantitative phenotype based on maximum habitual alcohol consumption.
ADH1B, on chromosome 4, was the lead locus for both populations: for the European American sample, rs1229984 (p = 4.9 × 10^-47); for the African American sample, rs2066702 (p = 2.3 × 10^-12). In the European American sample, we identified three additional genome-wide significant loci for maximum habitual alcohol consumption: on chromosome 17, rs77804065 (p = 1.5 × 10^-12) at CRHR1 (corticotropin-releasing hormone receptor 1), whose protein product is involved in stress and immune responses, and on chromosomes 8 and 10. European American and African American samples were then meta-analyzed; the associated region at CRHR1 increased in significance to p = 1.02 × 10^-13, and we identified two additional genome-wide significant loci, FGF14 (p = 9.86 × 10^-9) on chromosome 13 and a locus on chromosome 11. Besides ADH1B, none of the five loci have prior genome-wide significant support. Post–genome-wide association study analysis identified genetic correlation to other alcohol-related traits, smoking-related traits, and many others. Replications were observed in UK Biobank data. Genetic correlation between maximum habitual alcohol consumption and alcohol dependence was 0.87 (p = 4.78 × 10^-9). Enrichment for cell types included dopaminergic and gamma-aminobutyric acidergic neurons in midbrain, and pancreatic delta cells.
The present study supports five novel alcohol-use risk loci, with particularly strong statistical support for CRHR1. Additionally, we provide novel insight regarding the biology of harmful alcohol use.
Abstract only. Background: Responses to new anti-hypertensive therapy (AHT) can differ. Cluster analysis of longitudinal systolic blood pressure (SBP) data allows identification of individuals with similar trajectories, which enables hypothesis generation to potentially explain the observed patterns. We examined electronic health data from U.S. Veterans initiating AHT from 2002-2009. Methods: SBP was tracked for 1 year before and up to 2 years after AHT initiation. Subjects were clustered using a K-means approach with the mean of the squared Euclidean distances as a distance metric and evaluated for the optimal number of clusters. Thin-plate splines were used to display the smoothed trajectories of each group with predicted SBP measures over time. Results: A total of 45,598 subjects contributed 783,852 SBP measurements. Five clusters of subjects with similar trajectories were produced, providing visualization of SBP (Figure 1) prior to initiation, immediately after starting therapy, and after longer treatment duration. For example, Group 1 had the lowest mean age, lowest HDL-C, lowest LDL-C, highest triglycerides, lowest baseline SBP, lowest insulin use, and highest statin use. Conclusions: Trajectory clustering for SBP identifies distinct response groups that differ in response to therapy, laboratory measures, and medication use. Future analyses can examine anti-hypertensive medication use, compliance, and genetic factors to identify potential causes for these trajectories. These cluster analyses can provide new analytical approaches related to risk factor diagnosis and treatment.
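The clustering step described above can be illustrated with a plain implementation of K-means (Lloyd's algorithm) using squared Euclidean distance. This is a simplified sketch, not the study's method as implemented: it assumes every subject's SBP trajectory has already been resampled to the same fixed time points, and the function names and toy trajectories are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    points: List[List[float]], k: int, iterations: int = 100, seed: int = 0
) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for step in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if step > 0 and new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up SBP trajectories (mm Hg) at three aligned time points.
trajectories = [
    [160.0, 140.0, 130.0],  # falls after therapy starts
    [158.0, 142.0, 128.0],
    [162.0, 138.0, 132.0],
    [150.0, 150.0, 149.0],  # stays roughly flat
    [152.0, 151.0, 150.0],
    [148.0, 149.0, 151.0],
]
labels, centroids = kmeans(trajectories, k=2)
print(labels)
```

In practice the number of clusters would be chosen by comparing fits across several values of `k`, as the abstract notes, and irregularly timed measurements would need alignment or smoothing first.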
Abstract
The development of phenotypes using electronic health records is a resource-intensive process. Therefore, the cataloging of phenotype algorithm metadata for reuse is critical to accelerate clinical research. The Department of Veterans Affairs (VA) has developed a standard for phenotype metadata collection which is currently used in the VA phenomics knowledgebase library, CIPHER (Centralized Interactive Phenomics Resource), to capture over 5000 phenotypes. The CIPHER standard improves upon existing phenotype library metadata collection by capturing the context of algorithm development, phenotyping method used, and approach to validation. While the standard was iteratively developed with VA phenomics experts, it is applicable to the capture of phenotypes across healthcare systems. We describe the framework of the CIPHER standard for phenotype metadata collection, the rationale for its development, and its current application to the largest healthcare system in the United States.