The Protein Data Bank (PDB) is the global archive for structural information on macromolecules, and a popular resource for researchers, teachers, and students, amassing more than one million unique users each year. Crystallographic structure models in the PDB (more than 100,000 entries) are optimized against the crystal diffraction data and geometrical restraints. This process of crystallographic refinement has typically ignored hydrogen bond (H‐bond) distances as a source of information. However, H‐bond restraints can improve structures at low resolution where diffraction data are limited. To improve low‐resolution structure refinement, we present methods for deriving H‐bond information either globally from well‐refined high‐resolution structures from the PDB‐REDO databank, or specifically from on‐the‐fly constructed sets of homologous high‐resolution structures. Refinement incorporating HOmology DErived Restraints (HODER) improves geometrical quality and the fit to the diffraction data for many low‐resolution structures. To make these improvements readily available to the general public, we applied our new algorithms to all crystallographic structures in the PDB: using massively parallel computing, we constructed a new instance of the PDB‐REDO databank (https://pdb-redo.eu). This resource is useful for researchers to gain insight into individual structures, into specific protein families (as we demonstrate with examples), and into general features of protein structure using data mining approaches on a uniformly treated dataset.
Rare monogenic disorders often share molecular etiologies involved in the pathogenesis of common diseases. Congenital disorders of glycosylation (CDG) and deglycosylation (CDDG) are rare pediatric disorders with symptoms that range from mild to life threatening. A biological mechanism shared among CDG and CDDG as well as more common neurodegenerative diseases such as Alzheimer's disease and amyotrophic lateral sclerosis, is endoplasmic reticulum (ER) stress. We developed isogenic human cellular models of two types of CDG and the only known CDDG to discover drugs that can alleviate ER stress. Systematic phenotyping confirmed ER stress and identified elevated autophagy among other phenotypes in each model. We screened 1049 compounds and scored their ability to correct aberrant morphology in each model using an agnostic cell-painting assay based on >300 cellular features. This primary screen identified multiple compounds able to correct morphological phenotypes. Independent validation shows they also correct cellular phenotypes and alleviate each of the ER stress markers identified in each model. Many of the active compounds are associated with microtubule dynamics, which points to new therapeutic opportunities for both rare and more common disorders presenting with ER stress, such as Alzheimer's disease and amyotrophic lateral sclerosis.
MicroRNAs are a class of noncoding RNA molecules that co-regulate the expression of multiple genes via mRNA transcript degradation or translation inhibition. Since they often target entire pathways, they may be better drug targets than genes or proteins. MicroRNAs are known to be dysregulated in many tumours and associated with aggressive or poor prognosis phenotypes. Since they regulate mRNA in a tissue-specific manner, their functional mRNA targets are poorly understood. In previous work, we developed a method that identifies direct mRNA targets of microRNA from patient-matched microRNA/mRNA expression data by means of an anti-correlation signature. This method, applied to clear cell Renal Cell Carcinoma (ccRCC), revealed many new regulatory pathways compromised in ccRCC. In the present paper, we apply this method to identify dysregulated microRNA/mRNA mechanisms in ovarian cancer using data from The Cancer Genome Atlas (TCGA).
TCGA microarray data were normalized, and samples whose class labels (tumour or normal) were ambiguous with respect to consensus ensemble K-Means clustering were removed. Significantly anti-correlated and correlated gene/microRNA pairs that were differentially expressed between tumour and normal samples were identified. TargetScan was used to identify gene targets of microRNA.
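The anti-correlation screen described above can be illustrated with a minimal sketch. It assumes Pearson correlation as the measure and a cutoff of -0.7; neither is stated in the abstract. The names `miR-A`, `GENE1` and `GENE2`, the toy expression values, and the `predicted_targets` mapping (standing in for a target-prediction result such as TargetScan output) are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: correlation undefined, treated as none
    return sxy / sqrt(sxx * syy)


def anticorrelated_pairs(mirna_expr, mrna_expr, predicted_targets, threshold=-0.7):
    """Return (microRNA, gene, r) for predicted pairs with r <= threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    hits = []
    for mirna, genes in predicted_targets.items():
        for gene in genes:
            if mirna in mirna_expr and gene in mrna_expr:
                r = pearson(mirna_expr[mirna], mrna_expr[gene])
                if r <= threshold:
                    hits.append((mirna, gene, r))
    # Strongest anti-correlation first.
    return sorted(hits, key=lambda h: h[2])


# Hypothetical patient-matched expression profiles (one value per sample).
mirna_expr = {"miR-A": [5.0, 4.0, 3.0, 2.0, 1.0]}
mrna_expr = {
    "GENE1": [1.0, 2.0, 3.0, 4.0, 5.0],  # rises as miR-A falls
    "GENE2": [2.0, 1.0, 2.0, 1.0, 2.0],  # unrelated to miR-A
}
predicted_targets = {"miR-A": ["GENE1", "GENE2"]}

hits = anticorrelated_pairs(mirna_expr, mrna_expr, predicted_targets)
```

In this toy input only the `miR-A`/`GENE1` pair passes the cutoff. A real analysis would also need the normalization, sample filtering and significance testing that the abstract mentions, none of which are shown here.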
We identified novel microRNA/mRNA mechanisms in ovarian cancer. For example, the expression level of RAD51AP1 was found to be strongly anti-correlated with the expression of hsa-miR-140-3p, which was significantly down-regulated in the tumour samples. The anti-correlation signature was present separately in the tumour and normal samples, suggesting a direct causal dysregulation of RAD51AP1 by hsa-miR-140-3p in the ovary. Other pairs of potential biological relevance include hsa-miR-145/E2F3, hsa-miR-139-5p/TOP2A, and hsa-miR-133a/GCLC. We also identified sets of positively correlated microRNA/mRNA pairs that most likely result from indirect regulatory mechanisms.
Our findings identify novel microRNA/mRNA relationships that can be verified experimentally. We identify both generic microRNA/mRNA regulation mechanisms in the ovary and specific microRNA/mRNA controls that are turned on or off in ovarian tumours. Our results suggest that the disease process uses specific mechanisms that may be useful as early detection biomarkers or in the development of microRNA therapies for ovarian cancers. The positively correlated microRNA/mRNA pairs suggest the existence of novel regulatory mechanisms that proceed via intermediate states (indirect regulation) in ovarian tumorigenesis.
The tumor suppressor p53: Cancer and aging
Feng, Zhaohui; Hu, Wenwei; Rajagopal, Gunaretnam ...
Cell cycle (Georgetown, Tex.), 20/4/1/, Volume: 7, Issue: 7
Journal Article
Peer reviewed
Open access
Aging, like many other biological processes, is subject to regulation by genes that reside in pathways that have been conserved during evolution. The insulin/IGF-1 pathway, mTOR pathway and p53 pathway are among those conserved pathways that impact upon longevity and aging-related diseases such as cancer. Most cancers arise in the last quarter of life span with the frequency increasing exponentially with time, and mutation accumulation in critical genes (e.g. p53) in individual cells over a lifetime is thought to be the reason. Recently, we found that the efficiency of the p53 response to stress declines significantly with age in mice, and the time of onset of this decreased p53 response correlates with the life span of mice. Given the crucial role of p53 in tumor prevention, this decline in p53 activity at older ages in animals could contribute to the observed dramatic increases in cancer frequency, and provides a plausible explanation for the correlation between tumorigenesis and aging in addition to the accumulation of DNA mutations over a lifetime. We discuss here the coordination and communication between the p53 pathway and the IGF-1-mTOR pathways, and their possible impact on cancer and longevity.
The synthesis of the gonadotropin subunits is directed by pulsatile gonadotropin-releasing hormone (GnRH) from the hypothalamus, with the frequency of GnRH pulses governing the differential expression of the common alpha-subunit, luteinizing hormone beta-subunit (LHbeta) and follicle-stimulating hormone beta-subunit (FSHbeta). Three mitogen-activated protein kinases (MAPKs), ERK1/2, JNK and p38, contribute uniquely and combinatorially to the expression of each of these subunit genes. In this study, using both experimental and computational methods, we found that dual specificity phosphatase regulation of the activity of the three MAPKs through negative feedback is required, and forms the basis for decoding the frequency of pulsatile GnRH. A fourth MAPK, ERK5, was also shown to be activated by GnRH. ERK5 was found to stimulate FSHbeta promoter activity, to increase FSHbeta mRNA levels, and to enhance the preference of FSHbeta for low GnRH pulse frequencies. The latter is achieved through boosting the ultrasensitive behavior of FSHbeta gene expression by increasing the number of MAPK dependencies, and through modulating the feedforward effects of JNK activation on the GnRH receptor (GnRH-R). Our findings contribute to understanding the role of changing GnRH pulse-frequency in controlling transcription of the pituitary gonadotropins, which is a crucial aspect of regulating reproduction. Pulsatile stimuli and oscillating signals are integral to many biological processes, and elucidation of the mechanisms through which the pulsatility is decoded explains how the same stimulant can lead to various outcomes in a single cell.
The Li–Fraumeni syndrome (LFS) and its variant form (LFL) is a familial predisposition to multiple forms of childhood, adolescent, and adult cancers associated with germ-line mutation in the TP53 tumor suppressor gene. Individual disparities in tumor patterns are compounded by acceleration of cancer onset with successive generations. It has been suggested that this apparent anticipation pattern may result from germ-line genomic instability in TP53 mutation carriers, causing increased DNA copy-number variations (CNVs) with successive generations. To address the genetic basis of phenotypic disparities of LFS/LFL, we performed whole-genome sequencing (WGS) of 13 subjects from two generations of an LFS kindred. Neither de novo CNV nor significant difference in total CNV was detected in relation to successive generations or to age at cancer onset. These observations were consistent with an experimental mouse model system showing that trp53 deficiency in the germ line of father or mother did not increase CNV occurrence in the offspring. On the other hand, individual records on 1,771 TP53 mutation carriers from 294 pedigrees were compiled to assess genetic anticipation patterns (International Agency for Research on Cancer TP53 database). No strictly defined anticipation pattern was observed. Rather, in multigeneration families, cancer onset was delayed in older compared with recent generations. These observations support an alternative model for apparent anticipation in which rare variants from noncarrier parents may attenuate constitutive resistance to tumorigenesis in the offspring of TP53 mutation carriers with late cancer onset.
Significance: Germ-line mutation in the tumor suppressor TP53 causes Li–Fraumeni syndrome (LFS), a complex predisposition to multiple cancers. Types of cancers and ages at diagnosis vary among subjects and families, with apparent genetic anticipation: i.e., earlier cancer onset with successive generations. It has been proposed that anticipation is caused by accumulation of copy-number variations (CNV) in a context of TP53 haploinsufficiency. Using genome/exome sequencing, we found no evidence of increased rates of CNVs in two successive generations of TP53 mutation carriers and in successive generations of Trp53-deficient mice. We propose a stochastic model called “genetic regression” to explain apparent anticipation in LFS, caused by segregation of rare SNP and de novo mutations rather than by cumulative DNA damage.
The advent of genotype data from large-scale efforts that catalog the genetic variants of different populations has given rise to new avenues for multifactorial disease association studies. Recent work shows that genotype data from the International HapMap Project have a high degree of transferability to the wider population. This implies that the design of genotyping studies on local populations may be facilitated through inferences drawn from information contained in HapMap populations.
To facilitate analysis of HapMap data for characterizing the haplotype structure of genes or any chromosomal region, we have developed an integrated web-based resource, iHAP. In addition to incorporating genotype and haplotype data from the International HapMap Project and gene information from the UCSC Genome Browser Database, iHAP also provides capabilities for inferring haplotype blocks and selecting tag SNPs that are representative of haplotype patterns. These include block partitioning algorithms, block definitions, tag SNP definitions, as well as SNPs to be "force included" as tags. Based on the parameters defined at the input stage, iHAP performs on-the-fly analysis and displays the result graphically as a webpage. To facilitate analysis, intermediate and final result files can be downloaded.
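To make the idea of tag SNP selection with "force included" SNPs concrete, here is a minimal sketch of one common approach: greedy pairwise tagging on the r² measure of linkage disequilibrium. The abstract does not say which algorithms iHAP implements, so this is an illustration rather than iHAP's method. The function names, the 0.8 threshold and the toy haplotypes are assumptions.

```python
def r_squared(haps, i, j):
    """r^2 linkage disequilibrium between biallelic SNPs i and j.

    `haps` is a list of phased haplotypes, each a string of '0'/'1' alleles.
    """
    n = len(haps)
    pa = sum(h[i] == "1" for h in haps) / n
    pb = sum(h[j] == "1" for h in haps) / n
    pab = sum(h[i] == "1" and h[j] == "1" for h in haps) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (pab - pa * pb) ** 2 / denom


def greedy_tag_snps(haps, threshold=0.8, forced=()):
    """Pick tag SNPs so every SNP has r^2 >= threshold with some tag.

    SNP indices in `forced` are always included as tags first.
    """
    m = len(haps[0])
    # covers[i] = set of SNPs that SNP i can represent (always includes i).
    covers = {
        i: {j for j in range(m) if i == j or r_squared(haps, i, j) >= threshold}
        for i in range(m)
    }
    tags = list(dict.fromkeys(forced))  # keep order, drop duplicates
    uncovered = set(range(m))
    for t in tags:
        uncovered -= covers[t]
    while uncovered:
        # Choose the SNP covering the most uncovered SNPs; lowest index on ties.
        best = max(range(m), key=lambda i: (len(covers[i] & uncovered), -i))
        tags.append(best)
        uncovered -= covers[best]
    return tags


# Toy phased haplotypes: SNP 0 and SNP 1 are in perfect LD, SNP 2 is independent.
haps = ["000", "001", "110", "111"]
tags = greedy_tag_snps(haps)
tags_forced = greedy_tag_snps(haps, forced=(1,))
```

With the toy input, SNP 0 tags SNP 1, and SNP 2 has to be a tag on its own. Forcing SNP 1 swaps it in for SNP 0. The greedy choice is a heuristic and does not guarantee the smallest possible tag set.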
The iHAP resource, available at http://ihap.bii.a-star.edu.sg, provides a convenient yet flexible approach for the user community to analyze HapMap data and identify candidate targets for genotyping studies.
Major depressive disorder (MDD) is a leading cause of disability affecting ∼322M people worldwide, yet treatments offer limited efficacy across symptoms. Patients with MDD are clinically and biologically heterogeneous, which complicates the identification of causal mechanisms that can inform therapeutic development. Our study aimed to leverage machine learning (ML) and the vast constellation of data available in the UK Biobank (UKB) to identify subtypes of individuals with probable MDD (pMDD) and assess differences in their neurobiological and genetic architecture.
The study included UKB data from health records, blood and urine biomarkers, self-reported questionnaires, cognitive assessments, neuroimaging data (T1-weighted MRI, diffusion tensor imaging (DTI)), and genetics in up to 500K individuals. pMDD cases and controls were defined by inclusion/exclusion criteria based on self-reported information, clinical diagnoses, and medication use (Howard et al. 2018). An XGBoost classifier and an explainable AI framework were used to select key features that classified pMDD cases vs. controls. We applied k-means clustering to the top 50 most informative features, reduced with an autoencoder, for patient subtyping. Genetic characterization of subtypes included genome-wide variant-level association analyses using REGENIE and genetic correlation analyses using HDL. Neuroanatomic characterization included Firth logistic regression and analysis of covariance to identify significant T1-weighted MRI and DTI brain regions of interest associated with pMDD subtypes.
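The k-means clustering step mentioned above can be sketched in plain Python as Lloyd's algorithm. This is only an illustration of the clustering technique: feature selection and autoencoder reduction are not shown. The deterministic first-k initialization, the two-dimensional toy points and the choice of k=2 are assumptions and do not reflect the study's settings.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm; returns (labels, centroids).

    Centroids start at the first k points. That keeps the sketch
    deterministic; real analyses usually use random or k-means++ seeding.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical low-dimensional embeddings of four individuals.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(points, k=2)
```

On this toy input the two well-separated groups are recovered after two assignment passes. With real data the result depends on initialization, so several restarts are advisable.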
Analysis included 60,813 pMDD cases and 231,787 controls. Predictive performance of the pMDD classifier was 73%. Eight distinct clusters representing subtypes of pMDD were observed. For example, Cluster 0 was driven by mental disorders, substance abuse, and higher testosterone levels, and Cluster 2 by lipid metabolism, peripheral nerve disorders, and higher LDL cholesterol and testosterone. Cluster 7 was driven by suicidal ideation and substance abuse, reflecting a more severe MDD subtype. GWAS of pMDD subtypes revealed 6 independent, cluster-specific, significant loci, with genetic correlation revealing differences in genetic architecture across the eight clusters. Further genetic characterization of subtypes against external summary statistics for suicide attempts (Mullins et al., 2022) revealed highest genetic correlation with Cluster 7, supporting its identification as a severe subtype driven by shared genetic underpinnings for suicide risk. Neuromorphometric differences across subtypes were observed for measures of structural connectivity (DTI), but not for measures of brain thickness, area, or volume. Significant alterations in fractional anisotropy were observed in the anterior corona radiata of Cluster 2, the posterior thalamic radiation and superior corona radiata of Cluster 0, and the tapetum of Cluster 7, all relative to controls.
Our findings demonstrate the utility of ML-driven approaches in stratifying complex diseases into genetically and neurobiologically distinct subtypes to improve our understanding of disease etiology and reveal the underlying mechanisms driving clinical heterogeneity. Insights from research applying ML-driven approaches hold potential to enable precision psychiatry in drug development, facilitating novel target identification and informing clinical trials that ultimately pair the patient to the right drug at the right time.
Many arrhythmias are triggered by abnormal electrical activity at the ionic channel and cell level, and then evolve spatio-temporally within the heart. To understand arrhythmias better and to diagnose them more precisely by their ECG waveforms, a whole-heart model is required to explore the association between the massively parallel activities at the channel/cell level and the integrative electrophysiological phenomena at organ level.
We have developed a method to build large-scale electrophysiological models by using extended cellular automata, and to run such models on a cluster of shared memory machines. We describe here the method, including the extension of a language-based cellular automaton to implement quantitative computing, the building of a whole-heart model with Visible Human Project data, the parallelization of the model on a cluster of shared memory computers with OpenMP and MPI hybrid programming, and a simulation algorithm that links cellular activity with the ECG.
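As a rough illustration of how a cellular automaton can carry an excitation wave through tissue, the sketch below implements a generic three-state excitable-medium automaton (a Greenberg-Hastings-style rule) on a one-dimensional strand of cells. It is not the authors' extended, language-based automaton. It has no ionic-channel quantities, no Visible Human geometry, no ECG computation and no OpenMP/MPI parallelization. All names and the update rule are assumptions.

```python
REST, EXCITED, REFRACTORY = 0, 1, 2


def step(cells):
    """Advance a 1-D strand of cells by one time step.

    Rule: EXCITED -> REFRACTORY -> REST; a REST cell becomes EXCITED
    if at least one direct neighbour is currently EXCITED.
    """
    n = len(cells)
    nxt = []
    for i, state in enumerate(cells):
        if state == EXCITED:
            nxt.append(REFRACTORY)
        elif state == REFRACTORY:
            nxt.append(REST)
        else:
            left = cells[i - 1] if i > 0 else REST
            right = cells[i + 1] if i < n - 1 else REST
            nxt.append(EXCITED if EXCITED in (left, right) else REST)
    return nxt


def simulate(n_cells, n_steps, stimulus_at=0):
    """Stimulate one cell at t=0 and record the strand state at every step."""
    cells = [REST] * n_cells
    cells[stimulus_at] = EXCITED
    history = [cells]
    for _ in range(n_steps):
        cells = step(cells)
        history.append(cells)
    return history


history = simulate(n_cells=5, n_steps=3)
for t, cells in enumerate(history):
    print(t, cells)
```

The refractory state is what keeps the wave moving in one direction: a cell that has just fired cannot be re-excited by the neighbour it activated.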
We demonstrate that electrical activities at channel, cell, and organ levels can be traced and captured conveniently in our extended cellular automaton system. Examples of some ECG waveforms simulated with a 2-D slice are given to support the ECG simulation algorithm. A performance evaluation of the 3-D model on a four-node cluster is also given.
Quantitative multicellular modeling with extended cellular automata is a highly efficient and widely applicable method to weave experimental data at different levels into computational models. This process can be used to investigate complex and collective biological activities that can be described neither by their governing differential equations nor by discrete parallel computation. Transparent cluster computing is a convenient and effective method to make time-consuming simulation feasible. Arrhythmias, as a typical case, can be effectively simulated with the methods described.