To estimate the positive predictive value (PPV) of International Classification of Diseases, Tenth Revision (ICD-10) code U07.1, COVID-19 virus identified, in the Department of Veterans Affairs (VA).
Records of ICD-10 code U07.1 from inpatient, outpatient, and emergency/urgent care settings were extracted from VA medical record data from 4/01/2020 to 3/31/2021. A weighted, random sample of 1500 records from each quarter of the one-year observation period was reviewed by study personnel to confirm active COVID-19 infection at the time of diagnosis and classify reasons for false positive records. PPV was estimated overall and compared across clinical setting and quarters.
We identified 664,406 records of U07.1. Among the 1500 reviewed, 237 were false positives (PPV: 84.2%, 95% CI: 82.4-86.0). PPV ranged from 77.7% in outpatient settings to 93.8% in inpatient settings and was 83.3% in quarter 1, 80.5% in quarter 2, 86.1% in quarter 3, and 83.6% in quarter 4. The most common reasons for false positive records were history of COVID-19 (44.3%) and orders for laboratory tests (21.5%).
The PPV of ICD-10 code U07.1 is low, especially in outpatient settings. Directed training may improve accuracy of coding to levels that are deemed adequate for future use in surveillance efforts.
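The PPV and interval reported above can be approximated from the counts in the abstract (1500 reviewed records, 237 false positives). The sketch below is a minimal illustration, assuming a simple unweighted normal-approximation (Wald) interval; the study used a weighted sample, so the authors' actual estimation method may differ. The function name `ppv_with_wald_ci` is hypothetical.

```python
import math

def ppv_with_wald_ci(reviewed, false_positives, z=1.96):
    """Return (ppv, lower, upper) using an unweighted normal-approximation interval.

    Assumption: sampling weights are ignored, unlike in the study itself.
    """
    true_positives = reviewed - false_positives
    ppv = true_positives / reviewed
    # Standard error of a binomial proportion
    se = math.sqrt(ppv * (1 - ppv) / reviewed)
    return ppv, ppv - z * se, ppv + z * se

# Counts taken from the abstract
ppv, lower, upper = ppv_with_wald_ci(reviewed=1500, false_positives=237)
print(f"PPV = {ppv:.1%} (95% CI: {lower:.1%} to {upper:.1%})")
```

With these counts the unweighted calculation lands close to the reported 84.2% (82.4-86.0), which suggests the weighting had little effect on the overall estimate.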
The epidemiology and prognostic impact of increased pulmonary pressure among HIV-infected individuals in the antiretroviral therapy era is not well described.
To examine the prevalence, clinical features, and outcomes of increased echocardiographic pulmonary pressure in HIV-infected and -uninfected individuals.
This study evaluated 8,296 veterans referred for echocardiography with reported pulmonary artery systolic pressure (PASP) estimates from the Veterans Aging Cohort study, an observational cohort of HIV-infected and -uninfected veterans matched by age, sex, race/ethnicity, and clinical site. The primary outcome was adjusted mortality by HIV status.
PASP was reported in 2,831 HIV-infected and 5,465 HIV-uninfected veterans (follow-up mean ± SD, 3.8 ± 2.6 yr). As compared with uninfected veterans, HIV-infected veterans with HIV viral load greater than 500 copies/ml (odds ratio, 1.27; 95% confidence interval [CI], 1.05-1.54) and those with CD4 cell count less than 200 cells/μl (odds ratio, 1.28; 95% CI, 1.02-1.60) had a higher prevalence of PASP greater than or equal to 40 mm Hg. As compared with uninfected veterans with a PASP less than 40 mm Hg, HIV-infected veterans with a PASP greater than or equal to 40 mm Hg had an increased risk of death (adjusted hazard ratio, 1.78; 95% CI, 1.57-2.01). This risk persisted even among participants without prevalent comorbidities (adjusted hazard ratio, 3.61; 95% CI, 2.17-6.01). The adjusted risk of mortality in HIV-infected veterans was higher at all PASP values than in uninfected veterans, including at values currently considered to be normal.
HIV-infected people with high HIV viral loads or low CD4 cell counts have a higher prevalence of increased PASP than uninfected people. Mortality risk in HIV-infected veterans increases at lower values of PASP than previously recognized and is present even among those without prevalent comorbidities. These findings may inform clinical decision-making regarding screening and surveillance of pulmonary hypertension in HIV-infected individuals.
Introduction
Identifying occurrences of medication side effects and adverse drug events (ADEs) is an important and challenging task because they are frequently only mentioned in clinical narrative and are not formally reported.
Methods
We developed a natural language processing (NLP) system that aims to identify mentions of symptoms and drugs in clinical notes and label the relationship between the mentions as indications or ADEs. The system leverages an existing word embeddings model with induced word clusters for dimensionality reduction. It employs a conditional random field (CRF) model for named entity recognition (NER) and a random forest model for relation extraction (RE).
Results
Final performance of each model was evaluated separately and then combined on a manually annotated evaluation set. The micro-averaged F1 score was 80.9% for NER, 88.1% for RE, and 61.2% for the integrated systems. Outputs from our systems were submitted to the NLP Challenges for Detecting Medication and Adverse Drug Events from Electronic Health Records (MADE 1.0) competition (Yu et al. in http://bio-nlp.org/index.php/projects/39-nlp-challenges, 2018). System performance was evaluated in three tasks (NER, RE, and complete system) with multiple teams submitting output from their systems for each task. Our RE system placed first in Task 2 of the challenge and our integrated system achieved third place in Task 3.
Conclusion
Adding to the growing number of publications that utilize NLP to detect occurrences of ADEs, our study illustrates the benefits of employing innovative feature engineering.
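The micro-averaged F1 score used to report NER and RE performance pools true positives, false positives, and false negatives across all labels before computing precision and recall. The following is a minimal sketch of that calculation; the label names and counts are invented for illustration and are not taken from the study or from the MADE 1.0 evaluation scripts.

```python
def micro_f1(counts):
    """Micro-averaged F1 from per-label counts.

    counts maps label -> (true_positives, false_positives, false_negatives).
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, invented for illustration only
example_counts = {
    "Drug": (80, 10, 20),
    "ADE": (10, 5, 5),
    "Indication": (10, 5, 5),
}
score = micro_f1(example_counts)
print(f"micro-averaged F1 = {score:.3f}")
```

Because the counts are pooled, frequent labels dominate the micro-average, which is one reason a system can score well overall while still struggling on rarer classes.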
Background
This study aims to assess the impact of healthy lifestyle on prostate cancer (PCa) risk in a diverse population.
Methods
Data for 281,923 men from the Million Veteran Program (MVP), a nationwide, health system–based cohort study, were analyzed. Self‐reported information at enrollment included smoking status, exercise, diet, family history of PCa, and race/ethnicity. Body mass index (BMI) was obtained from clinical records. Genetic risk was assessed via a validated polygenic score. Cox proportional hazards models were used to assess associations with PCa outcomes.
Results
After accounting for ancestry, family history, and genetic risk, smoking was associated with an increased risk of metastatic PCa (hazard ratio [HR], 1.83; 95% confidence interval [CI], 1.64–2.02; p < 10⁻¹⁶) and fatal PCa (HR, 2.73; 95% CI, 2.36–3.25; p < 10⁻¹⁶). Exercise was associated with a reduced risk of fatal PCa (HR, 0.86; 95% CI, 0.76–0.98; p = .03). Higher BMI was associated with a slightly reduced risk of fatal PCa, and diet score was not independently associated with any end point. Association with exercise was strongest among those who had nonmetastatic PCa at MVP enrollment. Absolute reductions in the risk of fatal PCa via lifestyle factors were greatest among men of African ancestry (1.7% for nonsmokers vs. 6.1% for smokers) or high genetic risk (1.4% for nonsmokers vs. 4.3% for smokers).
Conclusions
Healthy lifestyle is minimally related to the overall risk of developing PCa but is associated with a substantially reduced risk of dying from PCa. In multivariable analyses, both exercise and not smoking remain independently associated with reduced metastatic and fatal PCa.
In this Million Veteran Program study, exercise and not smoking are shown to be minimally related to the overall risk of developing prostate cancer but are associated with a substantially reduced risk of dying from prostate cancer.
To investigate the mechanisms of cardiovascular disease in HIV-infected and -uninfected patients, an analysis of echocardiogram reports is required for a large longitudinal multi-center study.
A natural language processing system using a dictionary lookup, rules, and patterns was developed to extract heart function measurements that are typically recorded in echocardiogram reports as measurement-value pairs. Curated semantic bootstrapping was used to create a custom dictionary that extends existing terminologies based on terms that actually appear in the medical record. A novel disambiguation method based on semantic constraints was created to identify and discard erroneous alternative definitions of the measurement terms. The system was built utilizing a scalable framework, making it available for processing large datasets.
The system was developed for and validated on notes from three sources: general clinic notes, echocardiogram reports, and radiology reports. The system achieved F-scores of 0.872, 0.844, and 0.877 with precision of 0.936, 0.982, and 0.969 for each dataset respectively averaged across all extracted values. Left ventricular ejection fraction (LVEF) is the most frequently extracted measurement. The precision of extraction of the LVEF measure ranged from 0.968 to 1.0 across different document types.
This system illustrates the feasibility and effectiveness of a large-scale information extraction on clinical data. New clinical questions can be addressed in the domain of heart failure using retrospective clinical data analysis because key heart function measurements can be successfully extracted using natural language processing.
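To make the idea of extracting measurement-value pairs with a dictionary lookup and patterns more concrete, here is a heavily simplified sketch. It is not the authors' system: the term dictionary, the regular expression, and the function name `extract_measurements` are all hypothetical, and the sketch omits the curated semantic bootstrapping and the disambiguation step described above.

```python
import re

# Hypothetical mini-dictionary: surface forms mapped to a canonical measurement name
TERM_TO_CONCEPT = {
    "lvef": "LVEF",
    "ejection fraction": "LVEF",
    "ef": "LVEF",
    "pasp": "PASP",
}

# Longest terms first so "ejection fraction" is tried before "ef"
_terms = sorted(TERM_TO_CONCEPT, key=len, reverse=True)
PATTERN = re.compile(
    r"\b(?P<term>" + "|".join(re.escape(t) for t in _terms) + r")\b"
    r"[^\d\n]{0,20}?"                               # short gap such as " is " or ": "
    r"(?P<low>\d+(?:\.\d+)?)"                       # value, or lower bound of a range
    r"(?:\s*(?:-|to)\s*(?P<high>\d+(?:\.\d+)?))?"   # optional upper bound, e.g. 55-60
    r"\s*(?P<unit>%|mm\s?hg)?",                     # optional unit
    re.IGNORECASE,
)

def extract_measurements(text):
    """Return a list of dicts, one per measurement-value pair found in text."""
    results = []
    for m in PATTERN.finditer(text):
        results.append({
            "concept": TERM_TO_CONCEPT[m.group("term").lower()],
            "low": float(m.group("low")),
            "high": float(m.group("high")) if m.group("high") else None,
            "unit": m.group("unit"),
        })
    return results

# Invented example sentence, not taken from any real report
found = extract_measurements("LVEF is 55-60%. Estimated PASP: 42 mmHg.")
for item in found:
    print(item)
```

A pattern this naive would mislabel many real-world mentions (for example, an abbreviation with a different meaning in context), which is the kind of error the semantic-constraint disambiguation in the described system is meant to discard.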
Rationale, aims and objectives
As quality measurement becomes increasingly reliant on the availability of structured electronic medical record (EMR) data, clinicians are asked to perform documentation using tools that facilitate data capture. These tools may not be available, feasible, or acceptable in all clinical scenarios. Alternative methods of assessment, including natural language processing (NLP) of clinical notes, may improve the completeness of quality measurement in real‐world practice. Our objective was to measure the quality of care for a set of evidence‐based practices using structured EMR data alone, and then supplement those measures with additional data derived from NLP.
Method
As a case example, we studied the quality of care for posttraumatic stress disorder (PTSD) in the United States Department of Veterans Affairs (VA) over a 20‐year period. We measured two aspects of PTSD care, including delivery of evidence‐based psychotherapy (EBP) and associated use of measurement‐based care (MBC), using structured EMR data. We then recalculated these measures using additional data derived from NLP of clinical note text.
Results
There were 2 098 389 VA patients with a diagnosis of PTSD between 2000 and 2019, 72% (n = 1 515 345) of whom had not previously received EBP for PTSD and were treated after a 2015 mandate to document EBP using templates that generate structured EMR data. Using structured EMR data, we determined that 3.2% (n = 48 004) of those patients met our EBP for PTSD quality standard between 2015 and 2019, and 48.1% (n = 23 088) received associated MBC. With the addition of NLP‐derived data, estimates increased to 4.1% (n = 62 789) and 58.0% (n = 36 435), respectively.
Conclusion
Healthcare quality data can be significantly improved by supplementing structured EMR data with NLP‐derived data. By using NLP, health systems may be able to fill the gaps in documentation when structured tools are not yet available or there are barriers to using them in clinical practice.
The Veterans Health Administration (VA) is the largest single integrated healthcare system in the US and is likely the largest healthcare provider for people with minoritized sexual orientations (e.g., gay, lesbian, bisexual). The purpose of this study was to use electronic health record (EHR) data to replicate self-reported survey findings from the general US population and assess whether sexual orientation is associated with diagnosed physical health conditions that may elevate risk of COVID-19 severity among veterans who utilize the VA.
This was a retrospective analysis of VA EHR data from January 10, 1999 to January 07, 2019, analyzed in 2021. Veterans with minoritized sexual orientations were included if they had documentation of a minoritized sexual orientation within clinical notes identified via natural language processing. Veterans without minoritized sexual orientation documentation comprised the comparison group. Adjusted prevalence and adjusted prevalence ratios (aPR) were calculated overall and by race/ethnicity while accounting for differences in distributions of sex assigned at birth, age, calendar year of first VA visit, volumes of healthcare utilization, and VA priority group.
Data from 108,401 veterans with minoritized sexual orientation and 6,511,698 controls were analyzed. After adjustment, veterans with minoritized sexual orientations had a statistically significant elevated prevalence of 10 of the 11 conditions. Among the largest disparities observed were COPD (aPR: 1.24 [95% confidence interval: 1.23–1.26]), asthma (1.22 [1.20–1.24]), and stroke (1.26 [1.24–1.28]).
Findings largely corroborated patterns among the general US population. Further research is needed to determine if these disparities translate to poorer COVID-19 outcomes for individuals with minoritized sexual orientation.
Prolonged exposure therapy (PE) is an effective treatment for posttraumatic stress disorder (PTSD). Identifying metrics of treatment response can guide treatment delivery. The median effective dose represents the number of sessions at which there is a 50% probability of clinically meaningful improvement (i.e., 10-point reduction in PTSD checklist). The goal of the current study was to investigate the median effective dose of PE. We identified a cohort of Iraq and Afghanistan war veterans who received psychotherapy for PTSD in the Veterans Health Administration between 2001 and 2017. From this cohort, 10,234 veterans who received PE (as identified using natural language processing) and had ≥2 PTSD symptom measures were included in analyses. To determine how the number of PE sessions and covariates affected clinically meaningful improvement, we utilized a Cox proportional hazards regression, followed by Kaplan-Meier curves to determine the median effective dose. The median effective dose of PE was four sessions. Although some covariates were found to be statistically significant predictors of clinically meaningful improvement (e.g., age, gender, PTSD medications, and depressive disorder comorbidity), these effects were small. Clinicians and patients should consider evaluating treatment response after four sessions to determine preliminary effectiveness of PE.
• The median effective dose of prolonged exposure therapy (PE) was 4 sessions.
• Demographics and comorbidities had a small effect on median effective dose.
• Session four may be an optimal session to evaluate response to PE.
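The median effective dose described above is read from a Kaplan-Meier curve as the first session count at which the estimated probability of not yet having improved falls to 50% or below. The sketch below is a minimal pure-Python illustration of that idea. It assumes that "time" is the number of sessions, that the "event" is clinically meaningful improvement, and that patients who stop without improving are censored; the data are invented and the function names are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival)] at each time where at least one event occurred.

    durations: number of sessions observed per patient
    events: True if improvement was observed at that session,
            False if the patient was censored (stopped without improvement)
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        n_events = sum(1 for dur, ev in zip(durations, events) if dur == t and ev)
        n_leaving = sum(1 for dur in durations if dur == t)
        if n_events:
            # Product-limit update: multiply by the fraction still unimproved
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both improved and censored patients leave the risk set after time t
        at_risk -= n_leaving
    return curve

def median_time(curve):
    """First time at which the survival estimate is 0.5 or lower, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Invented toy data for ten patients, for illustration only
sessions = [2, 3, 3, 4, 4, 5, 6, 6, 8, 10]
improved = [True, True, False, True, True, True, False, True, True, False]

curve = kaplan_meier(sessions, improved)
median_dose = median_time(curve)
print("median effective dose (toy data):", median_dose)
```

In practice a survival analysis library would normally be used instead of a hand-written estimator, and covariate effects would be modeled separately, as the study did with Cox proportional hazards regression.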