Advancements in machine learning algorithms have had a beneficial impact on representation learning, classification, and prediction models built using electronic health record (EHR) data. Effort has been put both into increasing models' overall performance and into improving their interpretability, particularly regarding the decision-making process. In this study, we present a temporal deep learning model that performs bidirectional representation learning on EHR sequences with a transformer architecture to predict future diagnosis of depression. The model aggregates five heterogeneous, high-dimensional data sources from the EHR and processes them in a temporal manner for chronic disease prediction at various prediction windows. We applied the current trend of pretraining and fine-tuning on EHR data to outperform the current state of the art in chronic disease prediction, and to demonstrate the underlying relations between EHR codes in the sequence. The model produced the largest improvement in precision-recall area under the curve (PRAUC) for depression prediction, from 0.70 to 0.76, over the best baseline model. Furthermore, the self-attention weights in each sequence quantitatively capture the relationships among codes, improving the model's interpretability. These results demonstrate the model's ability to utilize heterogeneous EHR data to predict depression with high accuracy and interpretability, which may facilitate the construction of clinical decision support systems for chronic disease screening and early detection.
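As an illustration of how self-attention yields per-code weights, here is a minimal scaled dot-product attention sketch over a toy sequence of code embeddings. The random embeddings, and the use of the embeddings themselves as queries and keys (rather than learned projections), are simplifications for illustration, not the paper's model:

```python
import numpy as np

def self_attention_weights(x):
    """Scaled dot-product self-attention weights for a sequence of
    code embeddings x with shape (seq_len, d). Row i shows how much
    code i attends to every other code in the sequence."""
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=1, keepdims=True)

# Toy sequence of 4 EHR code embeddings (random stand-ins)
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
w = self_attention_weights(emb)
```

Each row of `w` is a probability distribution over the sequence; inspecting its largest entries is the kind of quantitative code-to-code relationship that attention weights expose.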
Prostate cancer is the most common and second most deadly form of cancer in men in the United States. The classification of prostate cancers based on Gleason grading using histological images is important in risk assessment and treatment planning for patients. Here, we demonstrate a new region-based convolutional neural network framework for multi-task prediction using an epithelial network head and a grading network head. Compared with a single-task model, our multi-task model can provide complementary contextual information, which contributes to better performance. Our model achieves state-of-the-art performance on the epithelial cell detection and Gleason grading tasks simultaneously. Using fivefold cross-validation, our model achieved an epithelial cell detection accuracy of 99.07% with an average area under the curve of 0.998. For Gleason grading, our model obtained a mean intersection over union of 79.56% and an overall pixel accuracy of 89.40%.
Current clinical practice relies on clinical history to determine the time since stroke (TSS) onset. Imaging-based determination of acute stroke onset time could provide critical information to clinicians in deciding stroke treatment options, such as thrombolysis. Patients with unknown or unwitnessed TSS are usually excluded from thrombolysis, even if their symptoms began within the therapeutic window. In this paper, we demonstrate a machine learning approach for TSS classification using routinely acquired imaging sequences. We develop imaging features from the magnetic resonance (MR) images and train machine learning models to classify the TSS. We also propose a deep-learning model to extract hidden representations for the MR perfusion-weighted images and demonstrate classification improvement by incorporating these additional deep features. The cross-validation results show that our best classifier achieved an area under the curve of 0.765, with a sensitivity of 0.788 and a negative predictive value of 0.609, outperforming existing methods. We show that the features generated by our deep-learning algorithm correlate with the MR imaging features, and validate the robustness of the model to imaging parameter variations (e.g., year of imaging). This paper advances magnetic resonance imaging analysis one step closer to an operational decision support tool for stroke treatment guidance.
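For reference, the sensitivity and negative predictive value quoted above follow directly from the confusion matrix; the counts below are hypothetical, chosen only to illustrate the formulas:

```python
def sensitivity_npv(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); negative predictive value
    = TN / (TN + FN). These are the operating-point metrics
    reported alongside the AUC."""
    return tp / (tp + fn), tn / (tn + fn)

# Hypothetical confusion-matrix counts for illustration
sens, npv = sensitivity_npv(tp=8, fp=3, tn=6, fn=2)  # → (0.8, 0.75)
```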
Large numbers of histopathological images have been digitized into high resolution whole slide images, opening opportunities in developing computational image analysis tools to reduce pathologists' workload and potentially improve inter- and intra-observer agreement. Most previous work on whole slide image analysis has focused on classification or segmentation of small pre-selected regions-of-interest, which requires fine-grained annotation and is non-trivial to extend to large-scale whole slide analysis. In this paper, we propose a multi-resolution multiple instance learning model that leverages saliency maps to detect suspicious regions for fine-grained grade prediction. Instead of relying on expensive region- or pixel-level annotations, our model can be trained end-to-end with only slide-level labels. The model is developed on a large-scale prostate biopsy dataset containing 20,229 slides from 830 patients. The model achieved 92.7% accuracy and 81.8% Cohen's Kappa for benign, low grade (i.e. Grade group 1), and high grade (i.e. Grade group ≥ 2) prediction, an area under the receiver operating characteristic curve (AUROC) of 98.2%, and an average precision (AP) of 97.4% for differentiating malignant and benign slides. The model obtained an AUROC of 99.4% and an AP of 99.8% for cancer detection on an external dataset.
•A multi-resolution multiple instance learning model is developed for Gleason grade group classification.
•The model can localize suspicious regions, and then classify cancer grade at a higher magnification with selected tiles.
•The model doesn't require fine-grained annotations and can be trained with slide-level labels from pathology reports.
•The model was evaluated on a large independent test set and an external dataset, and achieved promising results.
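The two-stage idea in the highlights, finding suspicious regions at low magnification and then grading selected tiles at higher magnification, can be caricatured as a tiny multiple-instance aggregation. The saliency values, scores, max pooling, and k below are illustrative stand-ins, not the paper's architecture:

```python
def mil_slide_score(low_res_saliency, high_res_scores, k=2):
    """Select the k most salient tiles at low magnification, then
    aggregate their high-magnification scores by max pooling to get
    a slide-level score (a slide is as suspicious as its worst tile)."""
    top = sorted(range(len(low_res_saliency)),
                 key=lambda i: low_res_saliency[i], reverse=True)[:k]
    return max(high_res_scores[i] for i in top)

# Four toy tiles: tile 0 scores high but is not salient, so it is skipped
score = mil_slide_score([0.1, 0.9, 0.5, 0.2], [0.99, 0.2, 0.7, 0.1])
```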
Recent developments in machine learning algorithms have enabled models to exhibit impressive performance in healthcare tasks using electronic health record (EHR) data. However, the heterogeneous nature and sparsity of EHR data remain challenging. In this work, we present a model that utilizes heterogeneous data and addresses sparsity by representing diagnosis, procedure, and medication codes with temporal Hierarchical Clinical Embeddings combined with Topic modeling (HCET) on clinical notes. HCET aggregates various categories of EHR data and learns inherent structure based on hospital visits for an individual patient. We demonstrate the potential of the approach in the task of predicting depression at various time points prior to a clinical diagnosis. We found that HCET outperformed all baseline methods, with a maximum improvement of 0.07 in precision-recall area under the curve (PRAUC). Furthermore, applying attention weights across EHR data modalities significantly improved the performance as well as the model's interpretability by revealing the relative weight of each data modality. Our results demonstrate the model's ability to utilize heterogeneous EHR information to predict depression, which may have future implications for screening and early detection.
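The modality-level attention described above amounts to a softmax over per-modality relevance followed by a weighted sum of modality embeddings. The scores below are stand-ins for learned values, and the fusion is a minimal sketch rather than HCET itself:

```python
import math

def modality_attention(vectors, scores):
    """Softmax over per-modality scores (e.g. diagnoses, procedures,
    medications, note topics), then a weighted sum of the modality
    embedding vectors. The returned weights are what make each data
    modality's contribution inspectable."""
    m = max(scores)
    exp = [math.exp(s - m) for s in scores]
    z = sum(exp)
    w = [e / z for e in exp]
    fused = [sum(w[i] * v[j] for i, v in enumerate(vectors))
             for j in range(len(vectors[0]))]
    return w, fused

# Two toy modalities with equal scores share the weight evenly
w, fused = modality_attention([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```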
Bone age assessment (BAA) is clinically important as it can be used to diagnose endocrine and metabolic disorders during child development. Existing deep learning based methods for classifying bone age use the global image as input, or exploit local information by annotating extra bounding boxes or key points. However, training with the global image underutilizes discriminative local information, while providing extra annotations is expensive and subjective. In this paper, we propose an attention-guided approach to automatically localize the discriminative regions for BAA without any extra annotations. Specifically, we first train a classification model to learn the attention maps of the discriminative regions, finding the hand region, the most discriminative region (the carpal bones), and the next most discriminative region (the metacarpal bones). Guided by those attention maps, we then crop the informative local regions from the original image and aggregate different regions for BAA. Instead of taking BAA as a general regression task, which is suboptimal due to the label ambiguity problem in the age label space, we propose using joint age distribution learning and expectation regression, which makes use of the ordinal relationship among hand images with different individual ages and leads to more robust age estimation. Extensive experiments are conducted on the RSNA pediatric bone age data set. Without using extra manual annotations, our method achieves competitive results compared with existing state-of-the-art deep learning-based methods that require manual annotation. Code is available at https://github.com/chenchao666/Bone-Age-Assessment.
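The joint age-distribution learning and expectation regression described above can be sketched as a softmax over discrete age bins followed by an expectation; the monthly bin layout and the toy logits below are assumptions for illustration:

```python
import numpy as np

def expected_age(logits, ages):
    """Softmax over discrete age bins, then expectation regression:
    the predicted age is sum_i p_i * age_i, which exploits the
    ordinal structure of the age label space."""
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return float((p * ages).sum())

ages = np.arange(229)                 # hypothetical bins in months
logits = -(ages - 100.0) ** 2 / 50.0  # toy distribution peaked at 100
est = expected_age(logits, ages)      # ≈ 100 months
```

Because the prediction is an expectation over neighboring bins rather than an argmax, small shifts in the distribution move the estimate smoothly, which is the robustness the abstract refers to.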
Around 50% of hospital readmissions due to heart failure are preventable, with lack of adherence to prescribed self-care as a driving factor. Remote tracking and reminders issued by mobile health devices could help to promote self-care, which could potentially reduce these readmissions.
We sought to investigate two factors: (1) feasibility of enrolling heart failure patients in a remote monitoring regimen that uses wireless sensors and patient-reported outcome measures; and (2) their adherence to using the study devices and completing patient-reported outcome measures.
Twenty heart failure patients participated in piloting a remote monitoring regimen. Data collection included: (1) physical activity using wrist-worn activity trackers; (2) body weight using bathroom scales; (3) medication adherence using smart pill bottles; and (4) patient-reported outcomes using patient-reported outcome measures.
We evaluated 150 hospitalized heart failure patients and enrolled 20 individuals. Two factors contributed to 50% (65/130) being excluded from the study: smartphone ownership and patient discharge. Over the course of the study, 60.0% of the subjects wore the activity tracker for at least 70% of the hours, and 45.0% used the scale for more than 70% of the days. The pill bottle was used less than 10% of the days by 55.0% of the subjects.
Our method of recruiting heart failure patients prior to hospital discharge may not be feasible as the enrollment rate was low. Once enrolled, the majority of subjects maintained a high adherence to wearing the activity tracker but low adherence to using the pill bottle and completing the follow-up surveys. Scale usage was fair, but it received positive reviews from most subjects. Given the observed usage and feedback, we suggest mobile health-driven interventions consider including an activity tracker and bathroom scale. We also recommend administering a shorter survey more regularly and through an easier interface.
Abstract
Objective
To demonstrate enabling multi-institutional training without centralizing or sharing the underlying physical data via federated learning (FL).
Materials and Methods
Deep learning models were trained at each participating institution using local clinical data, and an additional model was trained using FL across all of the institutions.
Results
We found that the FL model exhibited superior performance and generalizability to the models trained at single institutions, with an overall performance level that was significantly better than that of any of the institutional models alone when evaluated on held-out test sets from each institution and an outside challenge dataset.
Discussion
The power of FL was successfully demonstrated across 3 academic institutions while avoiding the privacy risk associated with the transfer and pooling of patient data.
Conclusion
Federated learning is an effective methodology that merits further study to enable accelerated development of models across institutions, enabling greater generalizability in clinical use.
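The aggregation at the heart of FL can be sketched as federated averaging (FedAvg): each institution trains locally, and only parameter vectors are pooled, here weighted by assumed local sample counts (the study's exact aggregation scheme may differ):

```python
def federated_average(models, sample_counts):
    """Weighted average of per-institution parameter vectors,
    proportional to each site's sample count. Only parameters cross
    institutional boundaries; patient data never does."""
    total = sum(sample_counts)
    n_params = len(models[0])
    return [sum(n * m[i] for m, n in zip(models, sample_counts)) / total
            for i in range(n_params)]

# Three hypothetical institutional models, two parameters each
global_model = federated_average([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
                                 sample_counts=[100, 100, 200])  # → [3.5, 4.5]
```

In practice this averaging step alternates with rounds of local training at each site, but the privacy-preserving property rests on exactly this exchange of parameters instead of data.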
Deep neural networks, in particular convolutional networks, have rapidly become a popular choice for analyzing histopathology images. However, training these models relies heavily on a large number of samples manually annotated by experts, which is cumbersome and expensive. In addition, it is difficult to obtain a perfect set of labels due to the variability between expert annotations. This paper presents a novel active learning (AL) framework for histopathology image analysis, named PathAL. To reduce the required number of expert annotations, PathAL selects two groups of unlabeled data in each training iteration: "informative" samples that require additional expert annotation, and "confident predictive" samples that are automatically added to the training set using the model's pseudo-labels. To reduce the impact of noisy-labeled samples in the training set, PathAL systematically identifies noisy samples and excludes them to improve the generalization of the model. Our framework advances existing AL methods for medical image analysis in two ways. First, we present a selection strategy that improves classification performance with fewer manual annotations. Unlike traditional methods that focus only on finding the most uncertain samples with low prediction confidence, we discover a large number of high-confidence samples in the unlabeled set and automatically add them to training with assigned pseudo-labels. Second, we design a heuristic method to distinguish noisy samples from hard samples, excluding the noisy samples while preserving the hard ones to improve model performance. Extensive experiments demonstrate that our proposed PathAL framework achieves promising results on a prostate cancer Gleason grading task, obtaining performance similar to the fully supervised learning scenario with 40% fewer annotations.
An ablation study is provided to analyze the effectiveness of each component in PathAL, and a pathologist reader study is conducted to validate our proposed algorithm.
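The two selection groups PathAL draws in each iteration can be sketched with confidence thresholds on the model's predicted probabilities. The thresholds and probabilities below are illustrative, not the paper's values, and the noisy-sample filter is omitted:

```python
def split_unlabeled(probs, low=0.6, high=0.95):
    """Partition unlabeled samples by maximum predicted probability:
    low-confidence samples are 'informative' (routed to the expert
    for annotation); high-confidence samples are pseudo-labeled with
    the argmax class and added to the training set automatically."""
    informative, pseudo_labeled = [], []
    for i, p in enumerate(probs):
        conf = max(p)
        if conf < low:
            informative.append(i)
        elif conf >= high:
            pseudo_labeled.append((i, p.index(conf)))
    return informative, pseudo_labeled

probs = [[0.55, 0.45], [0.97, 0.03], [0.30, 0.70]]
informative, pseudo_labeled = split_unlabeled(probs)
# sample 0 goes to the expert; sample 1 is pseudo-labeled as class 0;
# sample 2 falls between the thresholds and stays unlabeled
```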
Developing large-scale datasets with research-quality annotations is challenging due to the high cost of refining clinically generated markup into high precision annotations. We evaluated the direct use of a large dataset with only clinically generated annotations in the development of high-performance segmentation models for small research-quality challenge datasets.
We used a large retrospective dataset from our institution comprised of 1,620 clinically generated segmentations, and two challenge datasets (PROMISE12: 50 patients, ProstateX-2: 99 patients). We trained a 3D U-Net convolutional neural network (CNN) segmentation model using our entire dataset, and used that model as a template to train models on the challenge datasets. We also trained versions of the template model using ablated proportions of our dataset, and evaluated the relative benefit of those templates for the final models. Finally, we trained a version of the template model using an out-of-domain brain cancer dataset, and evaluated the relevant benefit of that template for the final models. We used five-fold cross-validation (CV) for all training and evaluation across our entire dataset.
Our model achieves state-of-the-art performance on our large dataset (mean overall Dice 0.916, average Hausdorff distance 0.135 across CV folds). Using this model as a pre-trained template for refining on two external datasets significantly enhanced performance (30% and 49% enhancement in Dice scores respectively). Mean overall Dice and mean average Hausdorff distance were 0.912 and 0.15 for the ProstateX-2 dataset, and 0.852 and 0.581 for the PROMISE12 dataset. Using even small quantities of data to train the template enhanced performance, with significant improvements using 5% or more of the data.
We trained a state-of-the-art model using unrefined clinical prostate annotations and found that its use as a template model significantly improved performance in other prostate segmentation tasks, even when trained with only 5% of the original dataset.
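For reference, the Dice overlap used throughout the results above can be computed for a pair of binary masks as follows (flattened lists are used here for simplicity, and the convention of 1.0 for two empty masks is assumed):

```python
def dice(pred, truth):
    """Dice coefficient between two binary masks: twice the
    intersection over the sum of mask sizes. 1.0 means perfect
    overlap, 0.0 means no overlap."""
    inter = sum(p and t for p, t in zip(pred, truth))
    denom = sum(pred) + sum(truth)
    return 2 * inter / denom if denom else 1.0

dice([1, 1, 0, 0], [1, 0, 1, 0])  # → 0.5
```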