Precise phenotype information is needed to understand the effects of genetic and epigenetic changes on tumor behavior and responsiveness. Extraction and representation of cancer phenotypes is currently mostly performed manually, making it difficult to correlate phenotypic data to genomic data. In addition, genomic data are being produced at an increasingly fast pace, exacerbating the problem. The DeepPhe software enables automated extraction of detailed phenotype information from electronic medical records of cancer patients. The system implements advanced Natural Language Processing and knowledge engineering methods within a flexible modular architecture, and was evaluated using a manually annotated dataset of breast cancer patients from the University of Pittsburgh Medical Center. The resulting platform provides critical and missing computational methods for computational phenotyping. Working in tandem with advanced analysis of high-throughput sequencing, these approaches will further accelerate the transition to precision cancer treatment.
We examined the comparative performance of structured diagnostic codes vs. natural language processing (NLP) of unstructured text for screening suicidal behavior among pregnant women in electronic medical records (EMRs).
Women aged 10-64 years with at least one diagnostic code related to pregnancy or delivery (N = 275,843) from Partners HealthCare were included as our "datamart." Diagnostic codes related to suicidal behavior were applied to the datamart to screen women for suicidal behavior. Among women without any diagnostic codes related to suicidal behavior (n = 273,410), 5880 women were randomly sampled, of whom 1120 had at least one mention of terms related to suicidal behavior in clinical notes. NLP was then used to process clinical notes for the 1120 women. Chart reviews were performed for subsamples of women.
Using diagnostic codes, 196 pregnant women were screened positive for suicidal behavior, among whom 149 (76%) had confirmed suicidal behavior by chart review. Using NLP among those without diagnostic codes, 486 pregnant women were screened positive for suicidal behavior, among whom 146 (30%) had confirmed suicidal behavior by chart review.
The use of NLP substantially improves the sensitivity of screening suicidal behavior in EMRs. However, the prevalence of confirmed suicidal behavior was lower among women who did not have diagnostic codes for suicidal behavior but screened positive by NLP. NLP should be used together with diagnostic codes for future EMR-based phenotyping studies for suicidal behavior.
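The positive predictive values reported above follow directly from the screened-positive and chart-confirmed counts. The following is a minimal Python sketch of that arithmetic. The function names and the Wilson score interval are illustrative additions and are not part of the analysis reported in the abstract.

```python
import math

def ppv(confirmed, screened_positive):
    """Positive predictive value: confirmed cases / screened-positive cases."""
    return confirmed / screened_positive

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    Illustrative addition: the abstract reports point estimates only.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Counts taken from the abstract.
ppv_codes = ppv(149, 196)   # diagnostic codes
ppv_nlp = ppv(146, 486)     # NLP among women without diagnostic codes

ci_codes = wilson_interval(149, 196)
ci_nlp = wilson_interval(146, 486)

print(f"Diagnostic codes PPV: {ppv_codes:.0%}")
print(f"NLP PPV: {ppv_nlp:.0%}")
```

The two point estimates round to the 76% and 30% quoted in the results. The intervals give a rough sense of how precisely each proportion is estimated at these sample sizes.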
Morphological factors of intracranial aneurysms and the surrounding vasculature could affect aneurysm rupture risk in a location specific manner. Our goal was to identify image-based morphological parameters that correlated with ruptured basilar tip aneurysms. Three-dimensional morphological parameters obtained from CT-angiography (CTA) or digital subtraction angiography (DSA) from 200 patients with basilar tip aneurysms diagnosed at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016 were evaluated. We examined aneurysm wall irregularity, the presence of daughter domes, hypoplastic, aplastic or fetal PCoAs, vertebral dominance, maximum height, perpendicular height, width, neck diameter, aspect and size ratio, height/width ratio, and diameters and angles of surrounding parent and daughter vessels. Univariable and multivariable statistical analyses were performed to determine statistical significance. In multivariable analysis, presence of a daughter dome, aspect ratio, and larger flow angle were significantly associated with rupture status. We also introduced two new variables, diameter size ratio and parent-daughter angle ratio, which were both significantly inversely associated with ruptured basilar tip aneurysms. Notably, multivariable analyses also showed that larger diameter size ratio was associated with higher Hunt-Hess score while smaller flow angle was associated with higher Fisher grade. These easily measurable parameters, including a new parameter that is unlikely to be affected by the formation of the aneurysm, could aid in screening strategies in high-risk patients with basilar tip aneurysms. One should note, however, that the changes in parameters related to aneurysm morphology may be secondary to aneurysm rupture rather than causal.
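Several of the parameters listed above are simple ratios of linear measurements. The abstract does not define them, so the sketch below assumes definitions commonly used in the aneurysm morphology literature: aspect ratio as perpendicular height over neck diameter, and size ratio as maximum height over parent vessel diameter. The function names and the example measurements are hypothetical.

```python
def aspect_ratio(perpendicular_height_mm, neck_diameter_mm):
    """Assumed definition: perpendicular height / neck diameter."""
    return perpendicular_height_mm / neck_diameter_mm

def size_ratio(max_height_mm, parent_vessel_diameter_mm):
    """Assumed definition: maximum height / parent vessel diameter."""
    return max_height_mm / parent_vessel_diameter_mm

def height_width_ratio(height_mm, width_mm):
    """Height / width, as named in the parameter list."""
    return height_mm / width_mm

# Hypothetical measurements in millimetres, for illustration only.
example = {
    "aspect_ratio": aspect_ratio(6.0, 4.0),
    "size_ratio": size_ratio(6.0, 3.0),
    "height_width_ratio": height_width_ratio(6.0, 5.0),
}
print(example)
```

Because each value is dimensionless, the ratios can be compared across patients whose scans differ in absolute scale, provided the same measurement convention is used throughout.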
We present a cohort of patients with anterior communicating artery (ACoA) aneurysms to investigate morphological characteristics and clinical factors associated with rupture of the aneurysms. 505 patients with ACoA aneurysms were identified at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016, with available CT angiography (CTA). Three-dimensional (3D) reconstructions were performed to evaluate aneurysmal morphologic features, including location, projection, irregularity, the presence of daughter dome, height, height/width ratio, and relationships between surrounding vessels. Patient risk factors assessed included patient age, sex, tobacco use, alcohol use, and family history of aneurysms and aneurysmal subarachnoid hemorrhage. Logistic regression was used to build a predictive ACoA score for rupture. Morphologic features associated with ruptured ACoA aneurysms were the presence of a daughter dome (OR 21.4, 95% CI 10.6-43.1), smaller neck diameter (OR 0.55, 95% CI 0.42-0.71), larger aspect ratio (OR 3.57, 95% CI 2.05-6.24), larger flow angle (OR 1.03, 95% CI 1.02-1.05), and smaller ipsilateral A2-ACoA angle (OR 0.98, 95% CI 0.97-1.00). Tobacco use was predominantly associated with morphological factors intrinsic to the aneurysm that were associated with rupture while younger age was also associated with morphologic features extrinsic to the aneurysm that were associated with rupture. The ACoA score had good predictive capacity for rupture with AUC = 0.92 using the 0.632 bootstrap cross-validation for correction of overfitting bias. Ruptured ACoA aneurysms were associated with morphological features that are simple to assess using a simple scoring system. Tobacco use and younger age were predominantly associated with intrinsic and extrinsic morphological features characteristic of rupture, respectively.
Risk of intracranial aneurysm rupture could be affected by geometric features of intracranial aneurysms and the surrounding vasculature in a location specific manner. Our goal is to investigate the morphological characteristics associated with ruptured posterior communicating artery (PCoA) aneurysms, as well as patient factors associated with the morphological parameters. Three-dimensional morphological parameters in 409 patients with 432 PCoA aneurysms diagnosed at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016 who had available CT angiography (CTA) or digital subtraction angiography (DSA) were evaluated. Morphological parameters examined included aneurysm wall irregularity, presence of a daughter dome, presence of hypoplastic or aplastic A1 arteries and hypoplastic or fetal PCoA, perpendicular height, width, neck diameter, aspect and size ratio, height/width ratio, and diameters and angles of surrounding parent and daughter vessels. Univariable and multivariable statistical analyses were performed to determine the association of morphological parameters with rupture of PCoA aneurysms. Additional analyses were performed to determine the association of patient factors with the morphological parameters. Irregular, multilobed PCoA aneurysms with larger height/width ratios and larger flow angles were associated with ruptured PCoA aneurysms, whereas perpendicular height was inversely associated with rupture in a multivariable model. Older age was associated with lower aspect ratio, with a trend towards lower height/width ratio and smaller flow angle, features that are associated with a lower rupture risk. Morphological parameters are easy to assess and could help in risk stratification in patients with unruptured PCoA aneurysms. PCoA aneurysms diagnosed at older age have morphological features associated with lower risk.
Objective
No relapse risk prediction tool is currently available to guide treatment selection for multiple sclerosis (MS). Leveraging electronic health record (EHR) data readily available at the point of care, we developed a clinical tool for predicting MS relapse risk.
Methods
Using data from a clinic‐based research registry and linked EHR system between 2006 and 2016, we developed models predicting relapse events from the registry in a training set (n = 1435) and tested the model performance in an independent validation set of MS patients (n = 186). This iterative process identified prior 1‐year relapse history as a key predictor of future relapse, but ascertaining relapse history through labor‐intensive chart review is impractical. We therefore pursued two‐stage algorithm development: (1) L1‐regularized logistic regression (LASSO) to phenotype past 1‐year relapse status from contemporaneous EHR data, and (2) LASSO to predict future 1‐year relapse risk using imputed prior 1‐year relapse status and other algorithm‐selected features.
Results
The final model, comprising age, disease duration, and imputed prior 1‐year relapse history, achieved a predictive AUC and F score of 0.707 and 0.307, respectively. The performance was significantly better than the baseline model (age, sex, race/ethnicity, and disease duration) and noninferior to a model containing actual prior 1‐year relapse history. The predicted risk probability declined with disease duration and age.
Conclusion
Our novel machine‐learning algorithm predicts 1‐year MS relapse with accuracy comparable to other clinical prediction tools and has applicability at the point of care. This EHR‐based two‐stage approach of outcome prediction may have application to neurological disease beyond MS.
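The two‐stage design described in the Methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the three stage‐one features are hypothetical stand‐ins for EHR variables, the L1 penalty is fitted by a simple proximal gradient loop, the imputed relapse status is passed to stage two as a probability, and both stages are fitted and scored on the same records for brevity, whereas the study used separate training and validation sets.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_l1_logistic(X, y, lam=0.01, lr=0.1, n_iter=300):
    """L1-regularized logistic regression via proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        for j in range(d):
            step = w[j] - lr * grad_w[j]
            # Soft-thresholding: the proximal operator of the L1 penalty.
            w[j] = math.copysign(max(abs(step) - lr * lam, 0.0), step)
    return w, b

def predict_proba(model, X):
    w, b = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

# Synthetic, hypothetical cohort (not study data).
rng = random.Random(0)
n = 200
ehr = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]  # stand-in EHR features
age = [rng.gauss(0, 1) for _ in range(n)]        # standardized age
duration = [rng.gauss(0, 1) for _ in range(n)]   # standardized disease duration
prior = [1 if rng.random() < sigmoid(1.5 * x[0] - 0.5) else 0 for x in ehr]
future = [1 if rng.random() < sigmoid(p - 0.5 * a - 0.5 * d - 1.0) else 0
          for p, a, d in zip(prior, age, duration)]

# Stage 1: phenotype prior 1-year relapse status from EHR features.
stage1 = fit_l1_logistic(ehr, prior)
imputed_prior = predict_proba(stage1, ehr)

# Stage 2: predict future 1-year relapse from the imputed status, age, and duration.
stage2_X = [[p, a, d] for p, a, d in zip(imputed_prior, age, duration)]
stage2 = fit_l1_logistic(stage2_X, future)
risk = predict_proba(stage2, stage2_X)
```

The point of the design is that stage two never needs chart‐reviewed relapse history at prediction time: it consumes only the stage‐one output and routinely available covariates.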
Alcohol consumption may be a modifiable risk factor for rupture of intracranial aneurysms. Our aim is to evaluate the association between ruptured aneurysms and alcohol consumption, intensity, and cessation. The medical records of 4701 patients with 6411 radiographically confirmed intracranial aneurysms diagnosed at the Brigham and Women’s Hospital and Massachusetts General Hospital between 1990 and 2016 were reviewed. Individuals were divided into cases with ruptured aneurysms and controls with unruptured aneurysms. Univariable and multivariable logistic regression analyses were performed to determine the association between alcohol consumption and rupture of intracranial aneurysms. In multivariable analysis, current alcohol use (OR 1.36, 95% CI 1.17–1.58) was associated with rupture status compared with never drinkers, whereas former alcohol use was not significant (OR 1.23, 95% CI 0.92–1.63). In addition, the number of alcoholic beverages per day among current alcohol users (OR 1.13, 95% CI 1.04–1.23) was significantly associated with rupture status, whereas alcohol use intensity was not significant among former users (OR 1.02, 95% CI 0.94–1.11). Current alcohol use and intensity are significantly associated with intracranial aneurysm rupture. However, this increased risk does not persist in former alcohol users, emphasizing the potential importance of alcohol cessation in patients harboring unruptured aneurysms.
To use natural language processing (NLP) in conjunction with the electronic medical record (EMR) to accurately identify patients with cerebral aneurysms and their matched controls.
ICD-9 and Current Procedural Terminology codes were used to obtain an initial data mart of potential aneurysm patients from the EMR. NLP was then used to train a classification algorithm with .632 bootstrap cross-validation used for correction of overfitting bias. The classification rule was then applied to the full data mart. Additional validation was performed on 300 patients classified as having aneurysms. Controls were obtained by matching age, sex, race, and healthcare use.
We identified 55,675 patients of 4.2 million patients with ICD-9 and Current Procedural Terminology codes consistent with cerebral aneurysms. Of those, 16,823 patients had the term aneurysm occur near relevant anatomic terms. After training, a final algorithm consisting of 8 coded and 14 NLP variables was selected, yielding an overall area under the receiver-operating characteristic curve of 0.95. After the final algorithm was applied, 5,589 patients were classified as having aneurysms, and 54,952 controls were matched to those patients. The positive predictive value based on a validation cohort of 300 patients was 0.86.
We harnessed the power of the EMR by applying NLP to obtain a large cohort of patients with intracranial aneurysms and their matched controls. Such algorithms can be generalized to other diseases for epidemiologic and genetic studies.
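The .632 bootstrap mentioned in the methods above corrects the optimism of an error estimate computed on the training data by blending it with the error on records left out of each bootstrap resample, using weights 0.368 and 0.632. The sketch below shows the estimator on synthetic data. The one-feature threshold classifier, the data, and all function names are hypothetical stand-ins for the study's coded and NLP variables and its classification algorithm.

```python
import random

def fit_threshold(xs, ys):
    """Toy classifier: threshold at the midpoint between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def error_rate(threshold, xs, ys):
    """Fraction misclassified when values above the threshold are called positive."""
    wrong = sum((x > threshold) != (y == 1) for x, y in zip(xs, ys))
    return wrong / len(xs)

def bootstrap_632(xs, ys, n_boot=200, seed=0):
    """Return (apparent error, out-of-bag error, .632 estimate)."""
    rng = random.Random(seed)
    n = len(xs)
    apparent = error_rate(fit_threshold(xs, ys), xs, ys)
    oob_errors = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]
        boot_y = [ys[i] for i in idx]
        if not oob or len(set(boot_y)) < 2:
            continue  # skip degenerate resamples
        threshold = fit_threshold([xs[i] for i in idx], boot_y)
        oob_errors.append(
            error_rate(threshold, [xs[i] for i in oob], [ys[i] for i in oob])
        )
    oob_error = sum(oob_errors) / len(oob_errors)
    return apparent, oob_error, 0.368 * apparent + 0.632 * oob_error

# Synthetic data: positives are centred at 1.0, negatives at 0.0.
data_rng = random.Random(1)
ys = [i % 2 for i in range(100)]
xs = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in ys]

apparent, oob_error, err_632 = bootstrap_632(xs, ys)
print(f"apparent={apparent:.3f} oob={oob_error:.3f} .632={err_632:.3f}")
```

The same weighting can be applied to other performance measures, such as an area under the curve, by replacing `error_rate` with the measure of interest.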
Although smoking is a known risk factor for intracranial aneurysm (IA) rupture, the exact relationship between IA rupture and smoking intensity and duration, as well as duration of smoking cessation, remains unknown.
In this case-control study, we analyzed 4,701 patients with 6,411 IAs diagnosed at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016. We divided individuals into patients with ruptured aneurysms and controls with unruptured aneurysms. We performed univariable and multivariable logistic regression analyses to determine the association between smoking status and ruptured IAs at presentation. In a subgroup analysis among former and current smokers, we assessed the association between ruptured aneurysms and number of packs per day, duration of smoking, and duration since smoking cessation.
In multivariable analysis, current (odds ratio [OR] 2.21, 95% confidence interval [CI] 1.89-2.59) and former smoking status (OR 1.56, 95% CI 1.31-1.86) were associated with rupture status at presentation compared with never smokers. In a subgroup analysis among current and former smokers, years smoked (OR 1.02, 95% CI 1.01-1.03) and packs per day (OR 1.46, 95% CI 1.25-1.70) were significantly associated with ruptured aneurysms at presentation, whereas duration since cessation among former smokers was not significant (OR 1.00, 95% CI 0.99-1.02).
Current cigarette smoking, smoking intensity, and smoking duration are significantly associated with ruptured IAs at presentation. However, the significantly increased risk persists after smoking cessation, and smoking cessation does not confer a reduced risk of aneurysmal subarachnoid hemorrhage beyond that of reducing the cumulative dose.
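The odds ratios reported above come from logistic regression, so each one is the exponential of a model coefficient, and a per-unit odds ratio compounds multiplicatively across units. The sketch below recovers the approximate coefficient and standard error for current smoking from its reported interval and compounds the per-year odds ratio over a hypothetical 20 years of smoking. It assumes a Wald-type 95% interval, that the 1.02 estimate is per year smoked, and that the effect is log-linear in years; none of these is stated in the abstract.

```python
import math

def log_odds_and_se(odds_ratio, lower, upper, z=1.96):
    """Back-calculate a logistic coefficient and its standard error
    from an odds ratio and a Wald-type 95% confidence interval."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Current smoking vs. never smokers, values from the abstract.
beta_current, se_current = log_odds_and_se(2.21, 1.89, 2.59)

# Per-year odds ratio from the abstract, compounded over a hypothetical 20 years.
or_per_year = 1.02
or_20_years = or_per_year ** 20

print(f"beta={beta_current:.3f}, se={se_current:.3f}")
print(f"OR over 20 years: {or_20_years:.2f}")
```

Under these assumptions the coefficient for current smoking is about 0.79 with a standard error of about 0.08, and 20 years of smoking corresponds to an odds ratio of roughly 1.49, which illustrates how a per-year estimate close to 1 can still accumulate into a sizeable association.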