Determining transcriptional factor binding sites (TFBSs) is critical for understanding the molecular mechanisms regulating gene expression in different biological conditions. Biological assays designed to directly map TFBSs require large sample sizes and intensive resources. As an alternative, the ATAC-seq assay is simple to conduct and provides genomic cleavage profiles that contain rich information for imputing TFBSs indirectly. Previous footprint-based tools are inherently limited by the accuracy of their bias correction algorithms and the efficiency of their feature extraction models. Here we introduce TAMC (Transcriptional factor binding prediction from ATAC-seq profile at Motif-predicted binding sites using Convolutional neural networks), a deep-learning approach for predicting motif-centric TF binding activity from paired-end ATAC-seq data. TAMC does not require bias correction during signal processing. By leveraging a one-dimensional convolutional neural network (1D-CNN) model, TAMC makes predictions based on both footprint and non-footprint features at binding sites for each TF and outperforms existing footprinting tools in TFBS prediction, particularly for ATAC-seq data with limited sequencing depth.
SARS-CoV-2 infection has been shown to trigger a wide spectrum of immune responses and clinical manifestations in human hosts. Here, we sought to elucidate novel aspects of the host response to SARS-CoV-2 infection through RNA sequencing of peripheral blood samples from 46 subjects with COVID-19 and directly comparing them to subjects with seasonal coronavirus, influenza, bacterial pneumonia, and healthy controls. Early SARS-CoV-2 infection triggers a powerful transcriptomic response in peripheral blood with conserved components that are heavily interferon-driven but also marked by indicators of early B-cell activation and antibody production. Interferon responses during SARS-CoV-2 infection demonstrate unique patterns of dysregulated expression compared to other infectious and healthy states. Heterogeneous activation of coagulation and fibrinolytic pathways is present in early COVID-19, as are IL1 and JAK/STAT signaling pathways, which persist into late disease. Classifiers based on differentially expressed genes accurately distinguished SARS-CoV-2 infection from other acute illnesses (auROC 0.95; 95% CI 0.92-0.98). The transcriptome in peripheral blood reveals both diverse and conserved components of the immune response in COVID-19 and provides for potential biomarker-based approaches to diagnosis.
Abstract
Methods used to predict surgical case time often rely upon the current procedural terminology (CPT) code as a nominal variable to train machine-learned models; however, this limits the ability of the model to incorporate new procedures and adds complexity as the number of unique procedures increases. The relative value unit (RVU, a consensus-derived billing indicator) can serve as a proxy for procedure workload and could replace the CPT code as a primary feature for models that predict surgical case length. Using 11,696 surgical cases from Duke University Health System electronic health records data, we compared boosted decision tree models that predict individual case length, varying the method by which the model coded procedure type: CPT, RVU, or CPT–RVU combined. Performance of each model was assessed by inference time, mean absolute error (MAE), and root mean squared error (RMSE) compared to the actual case length on a test set. Models were compared to each other and to the manual scheduler method that currently exists. RMSE for the RVU model (60.8 min) was similar to that of the CPT model (61.9 min), both of which were lower than that of the scheduler (90.2 min). Of our RVU model’s predictions, 65.2% (compared to 43.2% from the current human scheduler method) fell within 20% of actual case time. Using RVUs reduced model prediction time by ninefold and reduced the number of training features from 485 to 44. Replacing pre-operative CPT codes with RVUs maintains model performance while decreasing overall model complexity in the prediction of surgical case length.
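The three accuracy measures reported in this abstract (MAE, RMSE, and the share of predictions within 20% of actual case time) can be illustrated with a minimal sketch. The function name `case_time_metrics`, the `tolerance` parameter, and the toy durations are assumptions made for illustration; they are not taken from the study.

```python
import math


def case_time_metrics(predicted, actual, tolerance=0.20):
    """Return (MAE, RMSE, fraction of predictions within `tolerance` of actual).

    `predicted` and `actual` are equal-length sequences of case lengths in
    minutes. The 20% default mirrors the threshold quoted in the abstract.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - a for p, a in zip(predicted, actual)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # A prediction counts as a hit when its absolute error is at most
    # `tolerance` times the actual case length.
    hits = sum(1 for e, a in zip(errors, actual) if abs(e) <= tolerance * a)
    return mae, rmse, hits / n


# Hypothetical toy data (minutes), not drawn from the study.
mae, rmse, within = case_time_metrics([100, 130, 55, 200], [100, 100, 50, 300])
print(f"MAE={mae:.2f} min, RMSE={rmse:.2f} min, within 20%={within:.0%}")
```

RMSE penalizes large misses more heavily than MAE, which is one reason both are commonly reported together with a tolerance-based hit rate.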
The coronavirus disease 2019 (COVID-19) pandemic brought about abrupt changes in the way health care is delivered, and the impact of transitioning outpatient clinic visits to telehealth visits on processes of care and outcomes is unclear.
We evaluated ordering patterns during cardiovascular telehealth clinic visits in the Duke University Health System between March 15 and June 30, 2020 and 30-day outcomes compared with in-person visits in the same time frame in 2020 and in 2019.
Within the Duke University Health System, there was a 33.1% decrease in the number of outpatient cardiovascular visits conducted in the first 15 weeks of the COVID-19 pandemic, compared with the same time period in 2019. As a proportion of total visits initially booked, 53% of visits were cancelled in 2020 compared to 35% in 2019. However, patients with cancelled visits had similar demographics and comorbidities in 2019 and 2020. Telehealth visits comprised 9.3% of total visits initially booked in 2020, with younger and healthier patients utilizing telehealth compared with those utilizing in-person visits. Compared with in-person visits in 2020, telehealth visits were associated with fewer new (31.6% for telehealth vs 44.6% for in person) or refill (12.9% vs 15.6%, respectively) medication prescriptions, electrocardiograms (4.3% vs 31.4%), laboratory orders (5.9% vs 21.8%), echocardiograms (7.3% vs 98%), and stress tests (4.4% vs 6.6%). When adjusted for age, race, and insurance status, those who had a telehealth visit or cancelled their visit were less likely to have an emergency department or hospital encounter within 30 days compared with those who had in-person visits (adjusted rate ratio [aRR] 0.76, 95% CI 0.65, 0.89 and aRR 0.71, 95% CI 0.65, 0.78, respectively).
In response to the perceived risks of routine medical care affected by the COVID-19 pandemic, different phenotypes of patients chose different types of outpatient cardiology care. A better understanding of these differences could help define the necessary and appropriate mode of care for cardiology patients.
Background and Aims
Whether glycemic control, as opposed to diabetes status, is associated with the severity of NAFLD is open for study. We aimed to evaluate whether degree of glycemic control in the years preceding liver biopsy predicts the histological severity of NASH.
Approach and Results
Using the Duke NAFLD Clinical Database, we examined patients with biopsy‐proven NAFLD/NASH (n = 713) and the association of liver injury with glycemic control as measured by hemoglobin A1c (HbA1c). The study cohort was predominantly female (59%) and White (84%) with median (interquartile range) age of 50 (42, 58) years; 49% had diabetes (n = 348). Generalized linear regression models adjusted for age, sex, race, diabetes, body mass index, and hyperlipidemia were used to assess the association between mean HbA1c over the year preceding liver biopsy and severity of histological features of NAFLD/NASH. Histological features were graded and staged according to the NASH Clinical Research Network system. Group‐based trajectory analysis was used to examine patients with at least three HbA1c measures (n = 298) over the 5 years preceding clinically indicated liver biopsy. Higher mean HbA1c was associated with higher grade of steatosis and ballooned hepatocytes, but not lobular inflammation. Every 1% increase in mean HbA1c was associated with 15% higher odds of increased fibrosis stage (OR, 1.15; 95% CI, 1.01, 1.31). As compared with good glycemic control, moderate control was significantly associated with increased severity of ballooned hepatocytes (OR, 1.74; 95% CI, 1.01, 3.01; P = 0.048) and hepatic fibrosis (HF; OR, 4.59; 95% CI, 2.33, 9.06; P < 0.01).
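The per-unit odds ratio reported above (OR 1.15 per 1% increase in mean HbA1c) can be extrapolated to larger differences with a small worked sketch. It assumes the log-odds are linear in mean HbA1c, which is the usual reading of a per-unit odds ratio but is not stated explicitly in the abstract; the helper name `scaled_odds_ratio` is hypothetical.

```python
def scaled_odds_ratio(or_per_unit, delta):
    """Odds ratio implied for a `delta`-unit change, given a per-unit OR.

    Assumes log-odds are linear in the predictor, so ORs multiply:
    OR(delta) = OR(1) ** delta.
    """
    return or_per_unit ** delta


# Point estimate and 95% CI bounds as reported for fibrosis stage.
or_point, ci_low, ci_high = 1.15, 1.01, 1.31

for delta in (1, 2, 3):
    point = scaled_odds_ratio(or_point, delta)
    low = scaled_odds_ratio(ci_low, delta)
    high = scaled_odds_ratio(ci_high, delta)
    print(f"+{delta}% HbA1c: OR {point:.2f} (95% CI {low:.2f}, {high:.2f})")
```

Under that assumption, a 2% higher mean HbA1c corresponds to roughly 32% higher odds (1.15 squared), not 30%, because odds ratios compound multiplicatively.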
Conclusions
Glycemic control predicts severity of ballooned hepatocytes and HF in NAFLD/NASH, and thus optimizing glycemic control may be a means of modifying risk of NASH‐related fibrosis progression.
• Machine-learning-based thyroid-malignancy prediction from cytopathology whole slides.
• Beyond multiple instance learning: incorporating multiple global and local labels.
• Weakly supervised method derived from a lower bound of a maximum likelihood estimator.
• Ordinal regression framework for multi-label predictions augments human decisions.
We consider machine-learning-based thyroid-malignancy prediction from cytopathology whole-slide images (WSIs). Multiple instance learning (MIL) approaches, typically used for the analysis of WSIs, divide the image (bag) into patches (instances), which are used to predict a single bag-level label. These approaches perform poorly in cytopathology slides due to a unique bag structure: sparsely located informative instances with varying characteristics of abnormality. We address these challenges by considering multiple types of labels: bag-level malignancy and ordered diagnostic scores, as well as instance-level informativeness and abnormality labels. We study their contribution beyond the MIL setting by proposing a maximum likelihood estimation (MLE) framework, from which we derive a two-stage deep-learning-based algorithm. The algorithm identifies informative instances and assigns them local malignancy scores that are incorporated into a global malignancy prediction. We derive a lower bound of the MLE, leading to an improved training strategy based on weak supervision, which we motivate through statistical analysis. The lower bound further allows us to extend the proposed algorithm to simultaneously predict multiple bag- and instance-level labels from a single output of a neural network. Experimental results demonstrate that the proposed algorithm provides competitive performance compared to several competing methods, achieves (expert) human-level performance, and allows augmentation of human decisions.
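The two-stage idea described above (select informative instances, then combine their local malignancy scores into a bag-level prediction) can be sketched as follows. This is only an illustration: the top-k selection rule, the mean-of-logits aggregation, and the names `predict_bag` and `top_k` are assumptions, not the MLE-derived algorithm of the paper, and the per-patch scores would in practice come from trained networks.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_bag(instances, top_k=3):
    """Bag-level malignancy probability from per-instance scores.

    `instances` is a list of (informativeness_score, malignancy_logit)
    pairs, one per image patch.
    Stage 1: keep the `top_k` most informative instances.
    Stage 2: average their local malignancy logits and squash to a
    probability. Both rules are illustrative assumptions.
    """
    if not instances:
        raise ValueError("a bag must contain at least one instance")
    ranked = sorted(instances, key=lambda inst: inst[0], reverse=True)
    selected = ranked[:top_k]
    mean_logit = sum(logit for _, logit in selected) / len(selected)
    return sigmoid(mean_logit), selected


# Hypothetical patch scores: two informative patches, two background patches.
patches = [(0.9, 2.0), (0.1, -5.0), (0.8, 0.0), (0.2, -4.0)]
probability, used = predict_bag(patches, top_k=2)
print(f"bag-level malignancy probability: {probability:.3f} from {len(used)} patches")
```

Restricting aggregation to informative instances reflects the sparse bag structure noted in the abstract: averaging over every patch would let uninformative background dominate the bag-level score.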
Improved risk stratification and prognosis prediction in sepsis is a critical unmet need. Clinical severity scores and available assays such as blood lactate reflect global illness severity with suboptimal performance, and do not specifically reveal the underlying dysregulation of sepsis. Here, we present prognostic models for 30-day mortality generated independently by three scientific groups by using 12 discovery cohorts containing transcriptomic data collected from primarily community-onset sepsis patients. Predictive performance is validated in five cohorts of community-onset sepsis patients in which the models show summary AUROCs ranging from 0.765-0.89. Similar performance is observed in four cohorts of hospital-acquired sepsis. Combining the new gene-expression-based prognostic models with prior clinical severity scores leads to significant improvement in prediction of 30-day mortality as measured via AUROC and net reclassification improvement index. These models provide an opportunity to develop molecular bedside tests that may improve risk stratification and mortality prediction in patients with sepsis.
Acute respiratory infections caused by bacterial or viral pathogens are among the most common reasons for seeking medical care. Despite improvements in pathogen-based diagnostics, most patients receive inappropriate antibiotics. Host response biomarkers offer an alternative diagnostic approach to direct antimicrobial use. This observational cohort study determined whether host gene expression patterns discriminate noninfectious from infectious illness and bacterial from viral causes of acute respiratory infection in the acute care setting. Peripheral whole blood gene expression from 273 subjects with community-onset acute respiratory infection (ARI) or noninfectious illness, as well as 44 healthy controls, was measured using microarrays. Sparse logistic regression was used to develop classifiers for bacterial ARI (71 probes), viral ARI (33 probes), or a noninfectious cause of illness (26 probes). Overall accuracy was 87% (238 of 273 concordant with clinical adjudication), which was more accurate than procalcitonin (78%, P < 0.03) and three published classifiers of bacterial versus viral infection (78 to 83%). The classifiers developed here were externally validated in five publicly available data sets (AUC, 0.90 to 0.99). A sixth publicly available data set included 25 patients with co-identification of bacterial and viral pathogens. Applying the ARI classifiers defined four distinct groups: a host response to bacterial ARI, viral ARI, coinfection, and neither a bacterial nor a viral response. These findings create an opportunity to develop and use host gene expression classifiers as diagnostic platforms to combat inappropriate antibiotic use and emerging antibiotic resistance.
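One way to see how separate bacterial and viral classifiers can yield the four groups named above is to threshold each classifier's output independently. The sketch below does this; the 0.5 threshold, the function name `classify_host_response`, and the example probabilities are assumptions for illustration, since the abstract does not state how the groups were delineated.

```python
def classify_host_response(p_bacterial, p_viral, threshold=0.5):
    """Assign one of four host-response groups from two classifier outputs.

    `p_bacterial` and `p_viral` are probabilities from the bacterial-ARI and
    viral-ARI classifiers. The shared 0.5 cutoff is a hypothetical choice.
    """
    bacterial = p_bacterial >= threshold
    viral = p_viral >= threshold
    if bacterial and viral:
        return "coinfection"
    if bacterial:
        return "bacterial ARI"
    if viral:
        return "viral ARI"
    return "neither"


# Hypothetical classifier outputs for four patients.
for p_b, p_v in [(0.9, 0.1), (0.2, 0.8), (0.7, 0.6), (0.1, 0.2)]:
    print(p_b, p_v, "->", classify_host_response(p_b, p_v))
```

Treating the two responses as separate binary calls, rather than a single bacterial-versus-viral contest, is what makes the coinfection and neither groups expressible at all.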
Abstract
We consider machine-learning-based lesion identification and malignancy prediction from clinical dermatological images, which can be acquired interchangeably via smartphone or dermoscopy capture. Additionally, we do not assume that images contain single lesions; thus, the framework supports both focal and wide-field images. Specifically, we propose a two-stage approach that first identifies all lesions present in the image regardless of sub-type or likelihood of malignancy, then estimates their likelihood of malignancy, and, through aggregation, also generates an image-level likelihood of malignancy that can be used for high-level screening processes. Further, we consider augmenting the proposed approach with clinical covariates (from electronic health records) and publicly available data (the ISIC dataset). Comprehensive experiments validated on an independent test dataset demonstrate that (1) the proposed approach outperforms alternative model architectures; (2) the model based on images outperforms a pure clinical model by a large margin, and the combination of images and clinical data does not significantly improve over the image-only model; and (3) the proposed framework offers comparable performance in terms of malignancy classification relative to three board-certified dermatologists with different levels of experience.
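The aggregation step, which turns per-lesion likelihoods into an image-level likelihood, can be illustrated with a noisy-OR rule. This is one plausible choice, not necessarily the one used in the paper: it assumes lesions are independent, and the name `image_level_malignancy` and the example values are hypothetical.

```python
def image_level_malignancy(lesion_probs):
    """Noisy-OR aggregation of per-lesion malignancy probabilities.

    Returns the probability that at least one detected lesion is malignant,
    assuming lesions are independent (an illustrative assumption). An image
    with no detected lesions is scored 0.0.
    """
    p_all_benign = 1.0
    for p in lesion_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each lesion probability must lie in [0, 1]")
        p_all_benign *= 1.0 - p
    return 1.0 - p_all_benign


# Hypothetical wide-field image with three detected lesions.
print(image_level_malignancy([0.05, 0.10, 0.40]))
```

A simpler alternative is to take the maximum lesion probability; noisy-OR differs in that several moderately suspicious lesions raise the image-level score more than any single one would.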
Background
The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) comprises 6 categories used for the diagnosis of thyroid fine‐needle aspiration biopsy (FNAB). Each category has an associated risk of malignancy, which is important in the management of a thyroid nodule. More accurate predictions of malignancy may help to reduce unnecessary surgery. A machine learning algorithm (MLA) was developed to evaluate thyroid FNAB via whole slide images (WSIs) to predict malignancy.
Methods
Files were searched for all thyroidectomy specimens with preceding FNAB over 8 years. All cytologic and surgical pathology diagnoses were recorded and correlated for each nodule. One representative slide from each case was scanned to create a WSI. An MLA was designed to identify follicular cells and predict the malignancy of the final pathology. The test set comprised cases blindly reviewed by a cytopathologist who assigned a TBSRTC category. The area under the receiver operating characteristic curve was used to assess the MLA performance.
Results
Nine hundred eight FNABs met the criteria. The MLA predicted malignancy with a sensitivity and specificity of 92.0% and 90.5%, respectively. The areas under the curve for the prediction of malignancy by the cytopathologist and the MLA were 0.931 and 0.932, respectively.
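The evaluation measures reported here (sensitivity, specificity, and area under the ROC curve) can be computed from scores and ground-truth labels as in the sketch below. The AUC uses the standard rank-based (Mann-Whitney) formulation; the function names, the 0.5 operating threshold, and the toy data are assumptions for illustration, not values from the study.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when calling `score >= threshold` malignant."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores; label 1 = malignant on final pathology.
scores = [0.9, 0.4, 0.6, 0.2]
labels = [1, 1, 0, 0]
print(auroc(scores, labels), sensitivity_specificity(scores, labels, 0.5))
```

Sensitivity and specificity depend on the chosen operating threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the two kinds of measure are reported side by side.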
Conclusions
The performance of the MLA in predicting thyroid malignancy from FNAB WSIs is comparable to the performance of an expert cytopathologist. When the MLA and electronic medical record diagnoses are combined, the performance is superior to the performance of either alone. An MLA may be used as an adjunct to FNAB to assist in refining the indeterminate categories.
The machine learning algorithm performed at human levels in the prediction of thyroid malignancy using whole slide images of thyroid FNABs with an AUC of 0.931. When the machine predictions were combined with human decisions, the AUC increased to 0.962 and helped reduce the number of indeterminate cases.