Abstract
The coronavirus disease 2019 (COVID-19) pandemic has clearly shown that major challenges and threats for humankind need to be addressed with global answers and shared decisions. Data and their analytics are crucial components of such decision-making activities. Rather interestingly, one of the most difficult aspects is the reuse and sharing of accurate and detailed clinical data collected by Electronic Health Records (EHR), even though these data are of paramount importance. EHR data, in fact, are not only essential for supporting day-to-day activities, but can also be leveraged for research and support critical decisions about the effectiveness of drugs and therapeutic strategies. In this paper, we concentrate on collaborative data infrastructures to support COVID-19 research and on the open issues of data sharing and data governance that COVID-19 has brought to light. Data interoperability, healthcare process modelling and representation, shared procedures to deal with different data privacy regulations, and data stewardship and governance are seen as the most important aspects to boost collaborative research. Lessons learned from the COVID-19 pandemic can be a strong element to improve international research and our future capability of dealing with fast-developing emergencies and needs, which are likely to become more frequent in our connected and intertwined world.
In recent years, high-throughput sequencing technologies have provided an unprecedented opportunity to depict cancer samples at multiple molecular levels. The integration and analysis of these multi-omics datasets is a crucial and critical step to gain actionable knowledge in a precision medicine framework. This paper explores recent data-driven methodologies that have been developed and applied to respond to major challenges of stratified medicine in oncology, including patients' phenotyping, biomarker discovery, and drug repurposing. We systematically retrieved peer-reviewed journal articles published from 2014 to 2019, then selected and thoroughly described the tools presenting the most promising innovations regarding the integration of heterogeneous data, the machine learning methodologies that successfully tackled the complexity of multi-omics data, and the frameworks to deliver actionable results for clinical practice. The review is organized according to the applied methods: Deep Learning, Network-based Methods, Clustering, Feature Extraction and Transformation, and Factorization. We provide an overview of the tools available in each methodological group and underline the relationships among the different categories. Our analysis revealed how multi-omics datasets could be exploited to drive precision oncology, but also current limitations in the development of multi-omics data integration.
Genomic variant interpretation is a critical step of the diagnostic procedure, often supported by the application of tools that may predict the damaging impact of each variant or provide a guidelines-based classification. We propose the application of Machine Learning methodologies, in particular Penalized Logistic Regression, to support variant classification and prioritization. Our approach combines ACMG/AMP guidelines for germline variant interpretation with variant annotation features and provides a probabilistic score of pathogenicity, thus supporting the prioritization and classification of variants that would be interpreted as uncertain by the ACMG/AMP guidelines. We compared different approaches in terms of variant prioritization and classification on different datasets, showing that our data-driven approach is able to resolve more cases of variants of uncertain significance (VUS) than guidelines-based approaches and in silico prediction tools.
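As a rough illustration of how a penalized logistic regression can turn variant features into a probabilistic score and a ranking, the following minimal sketch fits an L2-penalized model by gradient descent in plain Python. It is not the authors' implementation: the penalty type (L2), the two toy features, the data values, and all function names are assumptions made only for this example.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_penalized_logistic(X, y, l2=0.1, lr=0.5, epochs=2000):
    """Fit logistic regression with an L2 penalty by batch gradient descent.

    Minimizes mean log-loss + (l2 / 2) * ||w||^2. The intercept is not penalized.
    Returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            w[j] -= lr * (grad_w[j] / n + l2 * w[j])
        b -= lr * grad_b / n
    return w, b


def pathogenicity_score(x, w, b):
    # Probability-like score in (0, 1); higher means "more likely pathogenic".
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical toy data: [guideline evidence score, in silico score], both in [0, 1].
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y = [1, 1, 1, 0, 0, 0]  # 1 = pathogenic, 0 = benign (made-up labels)

w, b = fit_penalized_logistic(X, y)

# Variants that a rule-based classification might leave as uncertain (made-up values).
uncertain = [[0.55, 0.7], [0.45, 0.2], [0.5, 0.5]]
ranked = sorted(uncertain, key=lambda x: pathogenicity_score(x, w, b), reverse=True)
```

The penalty keeps the weights finite even when the training classes are perfectly separable, which is what makes the resulting scores usable for ranking uncertain variants rather than only for a hard classification.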
One of the areas where Artificial Intelligence is having the most impact is machine learning, which develops algorithms able to learn patterns and decision rules from data. Machine learning algorithms have been embedded into data mining pipelines, which can combine them with classical statistical strategies to extract knowledge from data. Within the EU-funded MOSAIC project, a data mining pipeline has been used to derive a set of predictive models of type 2 diabetes mellitus (T2DM) complications based on electronic health record data of nearly one thousand patients. This pipeline comprises clinical center profiling, predictive model targeting, predictive model construction, and model validation. After handling missing data by means of random forest (RF) and applying suitable strategies to handle class imbalance, we used Logistic Regression with stepwise feature selection to predict the onset of retinopathy, neuropathy, or nephropathy in different time scenarios: at 3, 5, and 7 years from the first visit at the Hospital Center for Diabetes (not from the diagnosis). The variables considered are gender, age, time from diagnosis, body mass index (BMI), glycated hemoglobin (HbA1c), hypertension, and smoking habit. The final models, tailored to each complication, provided an accuracy of up to 0.838. Different variables were selected for each complication and time scenario, leading to specialized models that are easy to translate to clinical practice.
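The time scenarios described above (onset within 3, 5, or 7 years from the first visit) can be sketched as a labelling step that produces one binary outcome per horizon. This is only a sketch under stated assumptions: the function names are hypothetical, and the rule that patients without onset and with follow-up shorter than the horizon are left unlabelled (and presumably excluded) is an assumption, not something the text specifies.

```python
from typing import Dict, Optional

HORIZONS_YEARS = (3, 5, 7)  # time scenarios named in the text


def horizon_label(onset_years: Optional[float],
                  followup_years: float,
                  horizon: int) -> Optional[int]:
    """Label one patient for one horizon, counting time from the first visit.

    Returns 1 if the complication appeared within `horizon` years,
    0 if the patient was followed for at least `horizon` years without onset
    by that time, and None if follow-up is too short to tell (assumption:
    such patients are excluded from that scenario).
    """
    if onset_years is not None and onset_years <= horizon:
        return 1
    if followup_years >= horizon:
        return 0
    return None


def label_patient(onset_years: Optional[float],
                  followup_years: float) -> Dict[int, Optional[int]]:
    # One label per time scenario, keyed by the horizon in years.
    return {h: horizon_label(onset_years, followup_years, h) for h in HORIZONS_YEARS}


# Hypothetical patient: complication observed 4 years after the first visit.
example = label_patient(onset_years=4.0, followup_years=6.0)
```

For the hypothetical patient above, the 3-year scenario gets a negative label while the 5- and 7-year scenarios get positive ones, which is why a separate model per scenario can end up selecting different variables.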
Most cardiomyopathies are familial diseases. Cascade family screening identifies asymptomatic patients and family members with early traits of disease. The inheritance is autosomal dominant in a majority of cases, and recessive, X-linked, or matrilinear in the remaining. For the last 50 years, cardiomyopathy classifications have been based on the morphofunctional phenotypes, allowing cardiologists to conveniently group them in broad descriptive categories. However, the phenotype may not always conform to the genetic characteristics, may not allow risk stratification, and may not provide pre-clinical diagnoses in the family members. Because genetic testing is now increasingly becoming a part of clinical work-up, and based on the genetic heterogeneity, numerous new names are being coined for the description of cardiomyopathies associated with mutations in different genes; a comprehensive nosology is needed that could inform the clinical phenotype and involvement of organs other than the heart, as well as the genotype and the mode of inheritance. The recently proposed MOGE(S) nosology system embodies all of these characteristics, and describes the morphofunctional phenotype (M), organ(s) involvement (O), genetic inheritance pattern (G), etiological annotation (E) including genetic defect or underlying disease/substrate, and the functional status (S) of the disease using both the American College of Cardiology/American Heart Association stage and New York Heart Association functional class. The proposed nomenclature is supported by a web-assisted application and assists in the description of cardiomyopathy in symptomatic or asymptomatic patients and family members in the context of genetic testing. It is expected that such a nomenclature would help group cardiomyopathies on their etiological basis, describe complex genetics, and create collaborative registries.
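The five MOGE(S) attributes listed above map naturally onto a small record type. The sketch below is only an illustration of that structure: the class name, the field names, the bracketed string layout, and the example values are all hypothetical and do not reproduce the official MOGE(S) notation or the web-assisted application mentioned in the text. The functional status is treated as optional here, on the assumption that this is what the parentheses around S indicate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MogesDescriptor:
    """Illustrative container for the five MOGE(S) attributes."""
    morphofunctional: str        # M: morphofunctional phenotype
    organs: str                  # O: organ(s) involvement
    genetic: str                 # G: genetic inheritance pattern
    etiology: str                # E: etiological annotation
    functional_status: str = ""  # S: ACC/AHA stage and NYHA class (optional here)

    def notation(self) -> str:
        # Hypothetical layout chosen for readability, not the official syntax.
        parts = [
            f"M[{self.morphofunctional}]",
            f"O[{self.organs}]",
            f"G[{self.genetic}]",
            f"E[{self.etiology}]",
        ]
        if self.functional_status:
            parts.append(f"S[{self.functional_status}]")
        return " ".join(parts)


# Made-up example values, for illustration only.
relative = MogesDescriptor("DCM", "heart", "AD", "genetic")
proband = MogesDescriptor("DCM", "heart", "AD", "genetic", "stage C, NYHA II")
```

Keeping the record immutable (`frozen=True`) makes descriptors hashable, so family members with the same description can be grouped or counted directly, for instance when assembling a registry.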
Defective cell migration causes delayed wound healing (WH) and chronic skin lesions. Autologous micrograft (AMG) therapies have recently emerged as a new, effective, and affordable treatment able to improve wound healing capacity. However, the precise molecular mechanism through which AMG exerts its beneficial effects remains unknown. Herein we show that AMG improves skin re-epithelialization by accelerating the migration of fibroblasts and keratinocytes. More specifically, AMG-treated wounds showed improvement of indispensable events associated with successful wound healing such as granulation tissue formation, organized collagen content, and newly formed blood vessels. We demonstrate that AMG is enriched with a pool of WH-associated growth factors that may provide the starting signal for a faster endogenous wound healing response. This work links the increased cell migration rate to the activation of the extracellular signal-regulated kinase (ERK) signaling pathway, which is followed by an increase in matrix metalloproteinase expression and extracellular enzymatic activity. Overall, we reveal the AMG-mediated wound healing transcriptional signature and shed light on the AMG molecular mechanism, supporting its potential to trigger a highly improved wound healing process. In this way, we present a framework for future improvements in AMG therapy for skin tissue regeneration applications.
A substantial increase in the knowledge of the genetic basis of cardiomyopathy has occurred, and noninvasive phenotypic characterization has become significantly more sophisticated. ...the American Heart Association (AHA) (7) and the European Society of Cardiology (ESC) (8) in the last decade have proposed revisions to the classification of cardiomyopathic disorders. In the ESC 2008 classification, the cardiomyopathy was defined as familial when present in more than 1 member of the family.
Table 5. Major Lipid Storage Disorders With Possible Myocardial Involvement
Disease | MIM# Phenotype | Inheritance | Age of Onset | Disease Gene | Cardiac Phenotype | Extracardiac Markers/Involvement of Other Organs | Treatment
Multiple acyl-CoA dehydrogenase deficiency:
Glutaric acidemia IIA | 231680 | AR | Neonatal | ETFA | DCM, neonatal | Nervous, skeletal, muscle, liver, kidney (often polycystic), metabolic acidosis, hypoglycemia |
Glutaric acidemia IIB | 231680 | AR | Neonatal, childhood | ETFB | Sudden neonatal death | Nervous, skeletal, muscle, liver |
Glutaric acidemia IIC | 231680 | AR | Childhood to adult | ETFDH | DCM | Nervous, skeletal, muscle, liver, kidney (often polycystic), lung, metabolic acidosis, hypoglycemia |
Primary, systemic, carnitine transporter deficiency | 212140 | AR | Childhood to adult | SLC22A5 | DCM, HCM | < Total plasma carnitine, hypoketotic hypoglycemia, hepatomegaly, elevated transaminases, and hyperammonemia in infants; skeletal myopathy, > creatine kinase, in childhood; cardiomyopathy, arrhythmias, or fatigability in adulthood | Carnitine supplementation
Chanarin-Dorfman syndrome (NLSD-I) | 275630 | AR | Childhood to adult | ABHD5 | DCM | Skin (ichthyosiform erythroderma), liver, muscle, nervous (with possible MR), ocular | Suggested: diet low in long-chain fatty acids; retinoids for skin in patients w/o liver dysfunction
Neutral lipid storage disease with myopathy (NLSD-M) | 610717 | AR | Childhood to adult | PNPLA2low * | DCM | Myopathy |
MR = mental retardation; other abbreviations as in Table 1.
The integration of data and knowledge from heterogeneous sources can be a key success factor in drug design, drug repurposing, and multi-target therapies. In this context, biological networks provide a useful instrument to highlight the relationships and to model the phenomena underlying therapeutic action in cancer. In our work, we applied network-based modeling within a novel bioinformatics pipeline to identify promising multi-target drugs. Given a certain tumor type/subtype, we derive a disease-specific Protein-Protein Interaction (PPI) network by combining different databases and knowledge repositories. Next, the application of suitable graph-based algorithms allows selecting a set of potentially interesting combinations of drug targets. A list of drug candidates is then extracted by applying a recent data fusion approach based on matrix tri-factorization. Available knowledge about the selected drugs' mechanisms of action is finally exploited to identify the most promising candidates for planning in vitro studies. We applied this approach to the case of Triple Negative Breast Cancer (TNBC), a subtype of breast cancer whose biology is poorly understood and that lacks specific molecular targets. Our in silico findings have been confirmed by a number of in vitro experiments, whose results demonstrated the ability of the method to select candidates for drug repurposing.
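The first two steps of the pipeline described above (combining interaction sources into a disease-specific PPI network, then applying a graph-based criterion to nominate targets) can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the text does not name the graph algorithms used, so node degree stands in here as an assumed ranking criterion, and the function names and protein labels are hypothetical.

```python
from collections import defaultdict


def merge_ppi_sources(*edge_lists):
    """Union of undirected interactions from several sources.

    Duplicates (in either orientation) and self-loops are dropped.
    """
    edges = set()
    for edge_list in edge_lists:
        for a, b in edge_list:
            if a != b:
                edges.add(frozenset((a, b)))
    return edges


def disease_subnetwork(edges, disease_proteins):
    # Keep only interactions whose two endpoints are both disease-associated.
    keep = set(disease_proteins)
    return {e for e in edges if e <= keep}


def degree_ranking(edges):
    """Rank proteins by number of interaction partners (ties broken by name)."""
    degree = defaultdict(int)
    for edge in edges:
        for node in edge:
            degree[node] += 1
    return sorted(degree, key=lambda n: (-degree[n], n))


# Hypothetical interaction lists from two repositories.
source_a = [("A", "B"), ("B", "C")]
source_b = [("B", "A"), ("B", "D"), ("D", "E"), ("C", "C")]

merged = merge_ppi_sources(source_a, source_b)
subnet = disease_subnetwork(merged, {"A", "B", "C", "D"})
candidates = degree_ranking(subnet)
```

Storing each interaction as a `frozenset` makes the network undirected by construction, so the same pair reported as (A, B) in one repository and (B, A) in another is counted once.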
Deregulation of chromatin modifiers, including DNA helicases, is emerging as one of the mechanisms underlying the transformation of anaplastic lymphoma kinase negative (ALK−) anaplastic large cell lymphoma (ALCL). We recently identified the DNA-helicase HELLS as central for proficient ALK− ALCL proliferation and progression. Here we assessed in detail its function by performing RNA-sequencing profiling coupled with bioinformatic prediction to identify HELLS targets and transcriptional cooperators. We demonstrated that HELLS, together with the transcription factor YY1, contributes to an appropriate cytokinesis via the transcriptional regulation of genes involved in cleavage furrow regulation. Binding target promoters, HELLS primes YY1 recruitment and transcriptional activation of cytoskeleton genes including the small GTPases RhoA and RhoU and their effector kinase Pak2. Single or multiple knockdowns of these genes reveal that RhoA and RhoU mediate HELLS effects on cell proliferation and cell division of ALK− ALCLs. Collectively, our work demonstrates the transcriptional role of HELLS in orchestrating a complex transcriptional program sustaining neoplastic features of ALK− ALCL.
The COVID-19 pandemic has been a catastrophic event that has seriously endangered the world's population. Governments have largely been unprepared to deal with such an unprecedented calamity, partially due to the lack of sufficient or adequately fine-grained data necessary for forecasting the pandemic's evolution. To fill this gap, researchers worldwide have been collecting data about different aspects of COVID-19's evolution and government responses to it, so as to provide the foundation for informative models and tools that can be used to mitigate the current pandemic and possibly prevent future ones. Indeed, since the early stages of the pandemic, a number of research initiatives were launched with this goal, including the PERISCOPE (Pan-European Response to the ImpactS of COVID-19 and future Pandemics and Epidemics) Project, funded by the European Commission. PERISCOPE aims to investigate the broad socio-economic and behavioral impacts of the COVID-19 pandemic, with the goal of making Europe more resilient and prepared for future large-scale risks. The purpose of this study, carried out as part of the PERISCOPE project, is to provide a first European-level analysis of the effect of government policies on the spread of the virus. To do so, we assessed the relationship between a novel index, the Policy Intensity Index, and four epidemiological variables collected by the European Centre for Disease Prevention and Control, and then applied a comprehensive Pan-European population model based on Multilevel Vector Autoregression. This model aims at identifying effects that are common to some European countries while treating country-specific policies as covariates, explaining the different evolution of the pandemic in nine countries selected on the basis of data availability: Spain, France, Netherlands, Latvia, Slovenia, Greece, Ireland, Cyprus, and Estonia.
Results show that the effectiveness of specific policies tends to vary consistently across the different countries, although in general policies related to Health Monitoring and Health Resources are the most effective for all countries.