Subsurface stratigraphic modeling is crucial for a variety of environmental, societal, and economic challenges. However, the need for specific sedimentological skills in sediment core analysis may constitute a limitation. Methods based on machine learning and deep learning can play a central role in automating this time-consuming procedure. In this work, using a robust dataset of high-resolution digital images from continuous sediment cores of Holocene age that reflect a wide spectrum of continental to shallow-marine depositional environments, we outline a novel deep-learning-based approach to perform automatic semantic segmentation directly on core images, leveraging the power of convolutional neural networks. To optimize the interpretation process and maximize scientific value, we use six sedimentary facies associations as target classes in lieu of ineffective classification methods based solely on lithology. We propose an automated model that can rapidly characterize sediment cores, providing immediate guidance for stratigraphic correlation and subsurface reconstructions.
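A common way to score this kind of multi-class facies segmentation is per-class intersection-over-union (IoU). The abstract does not state its evaluation metric, so the following is only an illustrative sketch of per-class IoU on integer class maps, using a tiny synthetic example:

```python
import numpy as np

def per_class_iou(pred, truth, n_classes=6):
    """Intersection-over-union per facies class for a segmented core image.
    pred/truth: integer class maps of equal shape (values 0..n_classes-1)."""
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, truth == c).sum()
        union = np.logical_or(pred == c, truth == c).sum()
        ious.append(inter / union if union else np.nan)
    return ious

# Toy 4x4 map with two facies classes and one mis-labeled pixel.
truth = np.zeros((4, 4), dtype=int)
truth[2:] = 1
pred = truth.copy()
pred[0, 0] = 1
print(per_class_iou(pred, truth, n_classes=2))  # [0.875, 0.888...]
```

Averaging these per-class scores (macro IoU) keeps rare facies from being swamped by abundant ones.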
Wound management is a fundamental task in standard clinical practice. Automated solutions already exist for humans, but applications for wound management in pets are lacking. Precise and efficient wound assessment helps improve diagnosis and increase the effectiveness of treatment plans for chronic wounds. In this work, we introduced a novel pipeline for the segmentation of pet wound images. Starting from a model pre-trained on human wound images, we applied a combination of transfer learning (TL) and active semi-supervised learning (ASSL) to automatically label a large dataset. Additionally, we provided a guideline for future applications of the TL+ASSL training strategy on image datasets. We compared the effectiveness of the proposed training strategy by monitoring the performance of an EfficientNet-b3 U-Net model against the lighter solution provided by a MobileNet-v2 U-Net model. We obtained 80% correctly segmented images after five rounds of ASSL training. The EfficientNet-b3 U-Net model significantly outperformed the MobileNet-v2 one. We showed that the number of available samples is a key factor for the correct usage of ASSL training. The proposed approach is a viable solution to reduce the time required for the generation of a segmentation dataset.
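The round-based ASSL labeling described above can be sketched as a confidence-triage step: images the current model segments with high confidence are auto-labeled, the rest are queued for a human annotator. The threshold, scores, and function name below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def assl_round(unlabeled_scores, accept_thr=0.9):
    """One active semi-supervised learning round (illustrative sketch).

    unlabeled_scores: per-image mean segmentation confidence from the
    current model. Confident images are auto-labeled with the model's
    own predictions; the rest are routed to a human for annotation."""
    auto = [i for i, s in enumerate(unlabeled_scores) if s >= accept_thr]
    review = [i for i, s in enumerate(unlabeled_scores) if s < accept_thr]
    return auto, review

scores = rng.uniform(0.5, 1.0, size=200)  # stand-in for model confidences
auto, review = assl_round(scores, accept_thr=0.9)
print(len(auto), len(review))             # auto-labeled vs. sent to annotator
```

After each round the newly labeled images are added to the training set and the model is retrained, which is what makes the dataset size a key factor for ASSL.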
Background: Radiomics is a field of research at the intersection of medicine and data science in which quantitative imaging features are extracted from medical images and subsequently analyzed to develop models providing diagnostic, prognostic, and predictive information. The purpose of this work was to develop a machine learning model to predict the survival probability of 85 cervical cancer patients using PET and CT radiomic features as predictors. Methods: Initially, the patients were divided into two mutually exclusive sets: a training set containing 80% of the data and a testing set containing the remaining 20%. The entire analysis was conducted separately for CT and PET features. Genetic algorithms and LASSO regression were used to perform feature selection on the initial PET and CT feature sets. Two different survival models were employed: the Cox proportional hazards model and the random survival forest. The Cox model was built using the subset of features obtained with the feature selection process, while all the available features were used for the random survival forest model. The models were trained on the training set; cross-validation was used to fine-tune the models and to obtain a preliminary measurement of the performance. The models were then validated on the test set, using the concordance index as the metric. In addition, alternative versions of the models were developed using tumor recurrence as an adjunct feature to evaluate its impact on predictive performance. Finally, the selected CT and PET features were combined to build a further Cox model. Results: The genetic algorithm was superior to the LASSO regression for feature selection. The best-performing model was the Cox model built using the selected CT features; it achieved a concordance index score of 0.707. With the addition of tumor recurrence as a predictive feature, the Cox CT model reached a concordance index score of 0.776.
PET features, however, proved to be inadequate for survival prediction. The CT model performed better than the model with combined PET and CT features. Conclusions: The results showed that radiomic features can be used to successfully predict survival probability in cervical cancer patients. In particular, CT radiomic features proved to be better predictors than PET radiomic features in this specific case.
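The concordance index used for validation can be computed directly from survival times, event indicators, and predicted risks. Below is a minimal sketch of Harrell's C-index; the toy times, events, and risk scores are invented for illustration:

```python
import numpy as np

def concordance_index(times, events, risk):
    """Harrell's concordance index.

    A pair (i, j) is comparable if the subject with the shorter time
    had an observed event; it is concordant if that subject also has
    the higher predicted risk. Ties in risk count as 0.5."""
    n_conc, n_comp = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                n_comp += 1
                if risk[i] > risk[j]:
                    n_conc += 1
                elif risk[i] == risk[j]:
                    n_conc += 0.5
    return n_conc / n_comp

times = np.array([5.0, 8.0, 3.0, 9.0, 6.0])   # follow-up times
events = np.array([1, 1, 1, 0, 1])            # 0 = censored
risk = np.array([2.0, 1.0, 3.0, 0.5, 1.5])    # predicted risk scores
print(concordance_index(times, events, risk))  # → 1.0 (perfect ordering)
```

A score of 0.5 corresponds to random ordering, so the reported 0.707 and 0.776 indicate a usefully ranked, though imperfect, risk ordering.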
Background: Microvascular invasion (MVI) is a necessary step in the metastatic evolution of hepatocellular carcinoma liver tumors. Predicting the onset of MVI in the initial stages of the tumor could improve patient survival and quality of life. In this study, the possibility of using radiomic features to predict the presence/absence of MVI was evaluated. Methods: Multiphase contrast-enhanced computed tomography (CECT) images were collected from 49 patients, and radiomic features were extracted from the tumor region and the zone of transition (ZOT). The most relevant features were selected, the dataset was balanced, and the presence/absence of MVI was classified. The dataset was split into training and test sets in three ways using cross-validation: the first applied feature selection and dataset balancing outside cross-validation; the second applied dataset balancing outside and feature selection inside; the third applied the entire pipeline inside the cross-validation procedure. Results: The features from the tumor areas on CECT in both the portal and the arterial phases proved the most predictive. The three pipelines showed receiver operating characteristic area under the curve (ROC AUC) scores of 0.89, 0.84, and 0.61, respectively. Conclusions: The results confirmed the effectiveness of multiphase CECT and the ZOT in detecting MVI. They also showed a significant difference in the performance of the three pipelines, highlighting that a non-rigorous pipeline design can lead to overly optimistic estimates of model performance and generalization capability.
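The gap between the three pipelines is a textbook case of data leakage from performing feature selection outside cross-validation. A minimal sketch with scikit-learn on purely random data (so any apparent signal is leakage: the leaky variant typically scores far above chance, while the rigorous one hovers near 0.5):

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(42)
X = rng.normal(size=(49, 500))      # 49 patients, many candidate features
y = rng.integers(0, 2, size=49)     # MVI present/absent (random: no real signal)

# Leaky pipeline: features selected on ALL data before cross-validation.
X_sel = SelectKBest(f_classif, k=10).fit_transform(X, y)
auc_leaky = cross_val_score(LogisticRegression(max_iter=1000), X_sel, y,
                            cv=5, scoring="roc_auc").mean()

# Rigorous pipeline: selection happens inside each training fold only.
pipe = Pipeline([("select", SelectKBest(f_classif, k=10)),
                 ("clf", LogisticRegression(max_iter=1000))])
auc_proper = cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean()

print(round(auc_leaky, 2), round(auc_proper, 2))
```

The estimator, selector, and feature counts here are illustrative stand-ins, not the study's actual models; the point is only the relative ordering of the two estimates.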
Background: Hematological malignancies are rare and complex diseases; as a consequence, multimodal data (ranging from clinical and genomic information to images) are required to improve diagnosis, prognosis and personalized treatments. However, collecting all these layers of information is challenging, in particular when collecting cytological and histological images from the bone marrow (BM) that reproduce the morphologic features of the disease. Synthetic data generation by Artificial Intelligence (AI) can circumvent these issues by generating images conditioned on textual inputs (i.e. reports from pathologists), which are widely available and contain much useful clinical information. This technology can enrich data with synthetic images, thus boosting translational research and improving the performance of precision medicine strategies based on multimodal information. Aims: This project was conducted by the GenoMed4all and Synthema EU consortia, with the aim to: 1) Apply generative models to a real-world dataset with histological images of patients with myeloid neoplasms (MN). 2) Develop a Synthetic Images Validation Framework (SIVF) to evaluate the utility and fidelity of generated images. 3) Verify the capability of synthetic images to accelerate research and to improve clinical models. Methods: We implemented a Stable Diffusion (SD) generative model fine-tuned on hematological data to generate Hematoxylin and Eosin (H&E) images of MN patients. We implemented a domain-specific language model (HematoBERT) to encode textual input as the condition for the generation process. Use cases were Myelodysplastic Syndrome (MDS), Acute Myeloid Leukemia (AML) and Myeloproliferative Neoplasm (MPN) patients with available BM biopsies and their reports from pathologists, genomic and clinical data. We applied the SIVF to evaluate distributions of morphological features extracted from real and synthetic images.
Clinical validation was performed on disease classification and survival probability prediction, using real and synthetic image features (the experimental setting is reported in Figure 1). Results: We trained the SD model on 200 patients with available BM biopsies and associated reports. We first applied the SIVF to compare extracted morphological features (geometrical, color and texture features of cell nuclei) from synthetic and real images of 55 patients never seen by the model. Results showed that feature distributions and correlations in both datasets were comparable. Similar results were obtained applying the SIVF to each single patient's data. We then verified whether synthetic data augmentation could improve performance on MN classification (i.e. models able to correctly assign a single patient to a specific clinical entity according to the 2022 WHO classification criteria). We implemented three XGBOOST models to classify patients' disease. Classifiers were trained and validated on morphological features extracted from images of a real set of patients (n=614), a synthetic group (n=396) and a mixed dataset (n=1010). Data augmentation improved classification performance by 10% (F1 score) when tested on the three different validation sets. Finally, demographics, clinical features and genomics (cytogenetics and gene mutations) were included as covariates together with morphological features extracted from BM biopsies in L1-penalized Cox proportional hazards models, considering Overall Survival as the primary endpoint. Models were fitted on two different cohorts of real patients (n=182, n=294). We then added 112 synthetic patients to both sets and refitted the models. We observed an improvement in performance of >10% (C-index) in both cases (Figure 2), with morphological features (such as the “major axis” of nuclei) being selected among the best predictors.
All these results confirmed that data augmentation through synthetic data is a viable approach and can significantly improve the models' capability to capture clinical outcomes at the individual patient level. Conclusion: AI-generated images preserve the properties of real-world images, replicating the cell morphological features relevant to identifying hematological diseases and their clinical status. This approach, based on widely available textual data, allows effective data augmentation and effortless data sharing, thus accelerating and improving precision medicine research in hematology.
Background: The availability of multimodal patient data, such as demographics, clinical, imaging, treatment, quality-of-life, outcome and wearables data, as well as genome sequencing, has paved the way for the development of multimodal clinical solutions that introduce personalized or precision medicine. The clinical report is an information layer that contains relevant information about the disease in addition to the patient's point of view. Natural language processing (NLP) is a branch of artificial intelligence (AI), and its pre-trained language models are the key technology for extracting value from this data layer. Aims: This project was conducted by the GenoMed4all and Synthema EU consortia, with the aim to: 1) Build an AI language model specific to the hematology domain. 2) Use NLP technology to extract relevant information from clinical reports and perform unsupervised stratification of patients, in order to 3) demonstrate that the clinical report provides early access to data on the disease's clinical phenotype and biology and provides important information for patient stratification and prediction of clinical outcomes. Methods: To translate text sentences into numerical embeddings, we implemented the bidirectional encoder representations from transformers (BERT) framework. To learn text representations and correlations within the data, we performed domain adaptation by fine-tuning a pre-trained model on hematological clinical reports of patients with myeloproliferative neoplasms (MPN), myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML). Patient stratification was performed by HDBSCAN clustering on the text embeddings encoded by BERT (HematoBERT). Cluster validation was performed by assessing patients' diagnoses and survival probability. Finally, we compared the domain-tuned HematoBERT with pre-trained non-contextualized models. Results: We implemented HematoBERT based on the bert-base-multilingual-uncased version of BERT.
Training data were hematological text reports of 1,328 patients. During fine-tuning, texts were tokenized; we then randomly replaced 15% of the tokens with masked tokens, training the model to predict them. We performed stratification using clinical reports from a validation cohort of 360 patients. We identified 7 clusters, each defined by words similar in meaning that were assigned to a specific topic. We extracted the most important words and concepts for each cluster (topic) and summarized them into effective descriptions for each group of patients. Two clusters included MDS patients with excess blasts, and without excess blasts with ring sideroblasts and del5q (n=69, n=115). One cluster included patients with excess blasts and MDS/MPN (n=33). Two clusters included MPN patients with primary and secondary myelofibrosis, and MPN patients mostly comprising subjects affected by polycythemia vera and essential thrombocythemia (n=35, n=46). Two clusters included patients with AML from MDS and therapy-related AML, and patients with de novo AML (n=22, n=42). Clinical validation was performed based on the diagnosis and survival probability of patients assigned to the clusters. Patients' diagnoses were compatible with the cluster assignments (Figure 1). The frequency of gene mutations (as assessed by targeted Next-Generation Sequencing) among different clusters reflected the well-known genotype-phenotype associations in MDS, MPN and AML. Kaplan-Meier curves indicated significant risk stratification across clusters in terms of survival probability (Figure 2), similar to stratifications performed on clinical and genomic data. Finally, we evaluated the domain adaptation by comparing the model to other pre-trained non-contextualized ones. Pseudo-perplexity score (PPS), accuracy and F1 score were calculated to quantify how well the models handle unseen data, predicting a masked word given the context of the sentence.
HematoBERT obtained strong PPS, accuracy and F1 scores, outperforming the other models, including those trained on generic clinical domains. Conclusion: Domain-adapted language models are able to understand context and correlations in documents. HematoBERT can be used to extract relevant features from clinical reports. This data layer is relevant for performing disease stratification of patients based on clinical and genomic information and could be integrated into next-generation multimodal models of personalized medicine.
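The 15% token-masking step used during fine-tuning can be sketched as follows. This is a simplified illustration of BERT-style masked-language-model corruption (real BERT additionally replaces a fraction of selected tokens with random or unchanged ids; the mask id used below follows the standard BERT vocabulary convention and is an assumption here):

```python
import numpy as np

rng = np.random.default_rng(7)
MASK_ID = 103  # [MASK] token id in the standard BERT vocabulary (assumed)

def mask_tokens(token_ids, mask_prob=0.15):
    """BERT-style masked-LM corruption (simplified sketch): pick ~15% of
    positions, replace them with [MASK], and return the original ids at
    those positions as the prediction targets."""
    ids = np.array(token_ids)
    n_mask = max(1, int(round(mask_prob * len(ids))))
    pos = rng.choice(len(ids), size=n_mask, replace=False)
    targets = ids[pos].copy()
    ids[pos] = MASK_ID
    return ids, pos, targets

corrupted, pos, targets = mask_tokens(list(range(1000, 1020)))
print(len(pos))  # → 3 (15% of 20 tokens)
```

The model is then trained to recover `targets` at positions `pos`, which is exactly the objective the fine-tuning described above optimizes on the hematological reports.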
Many automated approaches have been proposed in the literature to quantify clinically relevant wound features based on image processing, aiming to remove human subjectivity and accelerate clinical practice. In this work we present a fully automated image processing pipeline that leverages deep learning and a large wound segmentation dataset to perform wound detection and subsequent prediction of the Photographic Wound Assessment Tool (PWAT), automating the clinical judgement of adequate wound healing. Starting from images acquired by smartphone cameras, a series of textural and morphological features are extracted from the wound areas, aiming to mimic the typical clinical considerations for wound assessment. The resulting features can be easily interpreted by the clinician and allow a quantitative estimation of the PWAT score. The features extracted from the regions of interest detected by our pre-trained neural network model correctly predict the PWAT scale values with a Spearman's correlation coefficient of 0.85 on a set of unseen images. The obtained results agree with the current state of the art and provide a benchmark for future artificial intelligence applications in this research field.
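Agreement between predicted and expert PWAT scores is measured above with Spearman's rank correlation, which rewards correct ordering of wound severity rather than exact score values. A minimal sketch on synthetic data (the feature construction and coefficients below are invented purely for illustration):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(3)

# Hypothetical wound descriptors (e.g. area, redness) for 60 images.
features = rng.uniform(size=(60, 2))
pwat_true = 8 + 20 * features[:, 0] + 4 * features[:, 1]  # toy expert scores
pwat_pred = pwat_true + rng.normal(scale=2.0, size=60)    # noisy model output

# Spearman's rho compares the rank orderings of the two score lists.
rho, pval = spearmanr(pwat_pred, pwat_true)
print(round(rho, 2))
```

Because Spearman's rho is rank-based, a model that consistently orders wounds from mild to severe scores well even if its absolute PWAT estimates are biased.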
Aim: Machine learning (ML) and deep learning (DL) predictive models have been widely employed in clinical settings. By providing an objective measure that can be shared among different centers, they support the clinician and enable the construction of more robust multicentric studies. This study aimed to propose a user-friendly and low-cost tool for COVID-19 mortality prediction using both an ML and a DL approach. Method: We enrolled 2348 patients from several hospitals in the Province of Reggio Emilia. Overall, 19 clinical features were provided by the Radiology Units of Azienda USL-IRCCS of Reggio Emilia, and 5892 radiomic features were extracted from each COVID-19 patient’s high-resolution computed tomography. We built and trained two classifiers to predict COVID-19 mortality: a machine learning algorithm, a support vector machine (SVM), and a deep learning model, a feedforward neural network (FNN). To evaluate the impact of the different feature sets on the final performance of the classifiers, we repeated the training session three times: first using only clinical features, then only radiomic features, and finally combining both. Results: We obtained similar performances for the machine learning and deep learning algorithms, with the best area under the receiver operating characteristic (ROC) curve (AUC) obtained by exploiting both clinical and radiomic information: 0.803 for the machine learning model and 0.864 for the deep learning model. Conclusions: Our work, performed on large and heterogeneous datasets (i.e., data from different CT scanners), confirms the results obtained in the recent literature. Such algorithms have the potential to be included in a clinical practice framework, since they can be applied not only to COVID-19 mortality prediction but also to other classification problems such as diabetes prediction, asthma prediction, and cancer metastasis prediction.
Our study shows that the lesion’s inhomogeneity depicted by radiomic features, combined with clinical information, is relevant for COVID-19 mortality prediction.
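A minimal sketch of the clinical-plus-radiomic fusion setup: concatenate the two feature blocks (early fusion) and train an SVM, scoring with ROC AUC. The data below is synthetic and the model configuration is an illustrative assumption, not the study's actual pipeline:

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 400
clinical = rng.normal(size=(n, 19))   # e.g. 19 clinical features
radiomic = rng.normal(size=(n, 50))   # stand-in for selected radiomic features

# Toy outcome driven by both modalities, so combining them helps.
logit = clinical[:, 0] + radiomic[:, 0]
y = (logit + rng.normal(scale=0.5, size=n) > 0).astype(int)

X = np.hstack([clinical, radiomic])   # early fusion: concatenate modalities
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

model = make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))
model.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(round(auc, 2))
```

Standardizing before the SVM matters here because clinical and radiomic features typically live on very different scales.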
• A decentralized computational framework for generator coherency analysis is designed.
• Spatially distributed measurements acquired by phasor measurement units (PMUs) are processed.
• Detailed results are presented and discussed in order to prove the effectiveness.
In this paper, generator coherency analysis of power systems is investigated via signal processing techniques. The sensor data analysis designed here is based on the fusion of advanced signal processing techniques for sensing-based coherency identification, including k-means and fuzzy k-means clustering, agglomerative hierarchical cluster trees, and Independent Component Analysis (ICA). Detailed results are presented and discussed in order to prove the effectiveness of the techniques and carry out a comparative assessment.
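A minimal sketch of clustering-based coherency identification, using simulated rotor-angle swings in place of real PMU data (the signal model and noise level are illustrative assumptions): generators whose post-disturbance waveforms oscillate nearly in phase land in the same k-means cluster.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
t = np.linspace(0, 2, 200)  # 2 s of post-disturbance swing samples

# Two coherent groups: generators within a group swing nearly in phase,
# while the groups oscillate in anti-phase against each other.
group_a = [np.sin(2 * np.pi * 1.0 * t) + 0.05 * rng.normal(size=t.size)
           for _ in range(4)]
group_b = [np.sin(2 * np.pi * 1.0 * t + np.pi) + 0.05 * rng.normal(size=t.size)
           for _ in range(3)]
signals = np.vstack(group_a + group_b)  # rows = generator rotor-angle traces

# Cluster the raw waveforms: coherent generators have similar traces.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(signals)
print(labels)
```

The fuzzy k-means and hierarchical variants mentioned above differ mainly in assigning soft memberships or a full merge tree instead of hard labels, which is useful when a generator's coherency is borderline.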