R-loops are DNA-RNA hybrids enriched at CpG islands (CGIs) that can regulate chromatin states. How R-loops are recognized and interpreted by specific epigenetic readers is unknown. Here we show that GADD45A (growth arrest and DNA damage protein 45A) binds directly to R-loops and mediates local DNA demethylation by recruiting TET1 (ten-eleven translocation 1). Studying the tumor suppressor TCF21, we find that the antisense long noncoding RNA (lncRNA) TARID (TCF21 antisense RNA inducing promoter demethylation) forms an R-loop at the TCF21 promoter. Binding of GADD45A to the R-loop triggers local DNA demethylation and TCF21 expression. TARID transcription, R-loop formation, DNA demethylation, and TCF21 expression proceed sequentially during the cell cycle. Oxidized DNA demethylation intermediates are enriched at genomic R-loops and their levels increase upon RNase H1 depletion. Genomic profiling in embryonic stem cells identifies thousands of R-loop-dependent TET1 binding sites at CGIs. We propose that GADD45A is an epigenetic R-loop reader that recruits the demethylation machinery to promoter CGIs.
Exome sequencing studies of autism spectrum disorders (ASDs) have identified many de novo mutations but few recurrently disrupted genes. We therefore developed a modified molecular inversion probe method enabling ultra-low-cost candidate gene resequencing in very large cohorts. To demonstrate the power of this approach, we captured and sequenced 44 candidate genes in 2,446 ASD probands. We discovered 27 de novo events in 16 genes, 59% of which are predicted to truncate proteins or disrupt splicing. We estimate that recurrent disruptive mutations in six genes (CHD8, DYRK1A, GRIN2B, TBR1, PTEN, and TBL1XR1) may contribute to 1% of sporadic ASDs. Our data support associations between specific genes and reciprocal subphenotypes (CHD8-macrocephaly and DYRK1A-microcephaly) and replicate the importance of a β-catenin/chromatin-remodeling network to ASD etiology.
Cervical cancer remains one of the leading causes of cancer-related deaths worldwide. Here we report the extensive molecular characterization of 228 primary cervical cancers, one of the largest comprehensive genomic studies of cervical cancer to date. We observed notable APOBEC mutagenesis patterns and identified SHKBP1, ERBB3, CASP8, HLA-A and TGFBR2 as novel significantly mutated genes in cervical cancer. We also discovered amplifications in immune targets CD274 (also known as PD-L1) and PDCD1LG2 (also known as PD-L2), and the BCAR4 long non-coding RNA, which has been associated with response to lapatinib. Integration of human papillomavirus (HPV) was observed in all HPV18-related samples and 76% of HPV16-related samples, and was associated with structural aberrations and increased target-gene expression. We identified a unique set of endometrial-like cervical cancers, composed predominantly of HPV-negative tumours with relatively high frequencies of KRAS, ARID1A and PTEN mutations. Integrative clustering of 178 samples identified keratin-low squamous, keratin-high squamous and adenocarcinoma-rich subgroups. These molecular analyses reveal new potential therapeutic targets for cervical cancers.
Most of the current knowledge on the genetic basis of adaptive evolution is based on the analysis of single nucleotide polymorphisms (SNPs). Despite increasing evidence for their causal role, the contribution of structural variants to adaptive evolution remains largely unexplored. In this work, we analyzed the population frequencies of 1,615 transposable element (TE) insertions annotated in the reference genome of Drosophila melanogaster, in 91 samples from 60 worldwide natural populations. We identified a set of 300 polymorphic TEs that are present at high population frequencies and located in genomic regions with a high recombination rate, where the efficiency of natural selection is high. The age and the length of these 300 TEs are consistent with relatively young and long insertions reaching high frequencies due to the action of positive selection. In addition, we identified a set of 21 fixed TEs also likely to be adaptive. Indeed, we and others found evidence of selection for 84 of these reference TE insertions. The analysis of the genes located near these 84 candidate adaptive insertions suggested that the functional response to selection is related to the GO categories of response to stimulus, behavior, and development. We further showed that a subset of the candidate adaptive TEs affects expression of nearby genes, and five of them have already been linked to an ecologically relevant phenotypic effect. Our results provide a more complete understanding of the genetic variation and the fitness-related traits relevant for adaptive evolution. Similar studies should help uncover the importance of TE-induced adaptive mutations in other species as well.
Anophthalmia and microphthalmia (AM) are the most severe malformations of the eye, corresponding respectively to an absent or reduced-size ocular globe. Wide genetic heterogeneity has been reported and different genes have been demonstrated to be causative of syndromic and non‐syndromic forms of AM. We screened seven AM genes, GDF6 (growth differentiation factor 6), FOXE3 (forkhead box E3), OTX2 (orthodenticle protein homolog 2), PAX6 (paired box 6), RAX (retina and anterior neural fold homeobox), SOX2 (SRY sex determining region Y‐box 2), and VSX2 (visual system homeobox 2), in a cohort of 150 patients with isolated or syndromic AM. The causative genetic defect was identified in 21% of the patients (32/150). Point mutations were identified by direct sequencing of these genes in 25 patients (13 in SOX2, 4 in RAX, 3 in OTX2, 2 in FOXE3, 1 in VSX2, 1 in PAX6, and 1 in GDF6). In addition, eight gene deletions (five SOX2, two OTX2 and one RAX) were identified using a semi‐quantitative multiplex polymerase chain reaction (PCR) method, quantitative multiplex PCR amplification of short fluorescent fragments (QMPSF). This result contributes to our knowledge of the molecular basis of AM, and will facilitate accurate genetic counselling.
Genetic studies of blood pressure (BP) to date have mainly analyzed common variants (minor allele frequency > 0.05). In a meta-analysis of up to ~1.3 million participants, we discovered 106 new BP-associated genomic regions and 87 rare (minor allele frequency ≤ 0.01) variant BP associations (P < 5 × 10⁻⁸), of which 32 were in new BP-associated loci and 55 were independent BP-associated single-nucleotide variants within known BP-associated regions. Average effects of rare variants (44% coding) were ~8 times larger than common variant effects and indicate potential candidate causal genes at new and known loci (for example, GATA5 and PLCB3). BP-associated variants (including rare and common) were enriched in regions of active chromatin in fetal tissues, potentially linking fetal development with BP regulation in later life. Multivariable Mendelian randomization suggested possible inverse effects of elevated systolic and diastolic BP on large artery stroke. Our study demonstrates the utility of rare-variant analyses for identifying candidate genes and the results highlight potential therapeutic targets.
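The frequency thresholds quoted in the abstract above (common: minor allele frequency > 0.05; rare: ≤ 0.01) can be illustrated with a minimal Python sketch. It is not taken from the study: the function names, the genotype counts, and the "low-frequency" label for the range between the two thresholds are assumptions made for illustration only.

```python
def minor_allele_frequency(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """Minor allele frequency from diploid genotype counts at one biallelic site."""
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    if total_alleles == 0:
        raise ValueError("at least one genotyped individual is required")
    alt_count = n_het + 2 * n_hom_alt
    # The minor allele is whichever of the two alleles is less frequent.
    minor_count = min(alt_count, total_alleles - alt_count)
    return minor_count / total_alleles


def frequency_class(maf: float) -> str:
    """Bin a variant using the thresholds quoted in the abstract."""
    if maf > 0.05:
        return "common"
    if maf <= 0.01:
        return "rare"
    # Label for the intermediate range is an assumption, not from the abstract.
    return "low-frequency"


# Hypothetical genotype counts (hom-ref, het, hom-alt) for three variants.
for counts in [(80, 20, 0), (99, 1, 0), (97, 3, 0)]:
    maf = minor_allele_frequency(*counts)
    print(counts, round(maf, 4), frequency_class(maf))
```

Counting alleles as integers before dividing avoids floating-point edge cases at the threshold boundaries.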
The widespread use of elite sires by means of artificial insemination in livestock breeding leads to the frequent emergence of recessive genetic defects, which cause significant economic and animal welfare concerns. Here we show that the availability of genome-wide, high-density SNP panels, combined with the typical structure of livestock populations, markedly accelerates the positional identification of genes and mutations that cause inherited defects. We report the fine-scale mapping of five recessive disorders in cattle and the molecular basis for three of these: congenital muscular dystony (CMD) types 1 and 2 in Belgian Blue cattle and ichthyosis fetalis in Italian Chianina cattle. Identification of these causative mutations has an immediate translation into breeding practice, allowing marker-assisted selection against the defects through avoidance of at-risk matings.
Identifying and understanding changes in cancer genomes is essential for the development of targeted therapeutics. Here we systematically analyse more than 70 pairs of primary human colon tumours by applying next-generation sequencing to characterize their exomes, transcriptomes and copy-number alterations. We have identified 36,303 protein-altering somatic changes that include several new recurrent mutations in the Wnt pathway gene TCF7L2, chromatin-remodelling genes such as TET2 and TET3 and receptor tyrosine kinases including ERBB3. Our analysis for significantly mutated cancer genes identified 23 candidates, including the cell cycle checkpoint kinase ATM. Copy-number and RNA-seq data analysis identified amplifications and corresponding overexpression of IGF2 in a subset of colon tumours. Furthermore, using RNA-seq data we identified multiple fusion transcripts including recurrent gene fusions involving R-spondin family members RSPO2 and RSPO3 that together occur in 10% of colon tumours. The RSPO fusions were mutually exclusive with APC mutations, indicating that they probably have a role in the activation of Wnt signalling and tumorigenesis. Consistent with this, we show that the RSPO fusion proteins were capable of potentiating Wnt signalling. The R-spondin gene fusions and several other gene mutations identified in this study provide new potential opportunities for therapeutic intervention in colon cancer.
Objectives: The purpose of this study was to investigate the effects of variants in the apolipoprotein(a) gene (LPA) on vascular diseases with different atherosclerotic and thrombotic components. Background: It is unclear whether the LPA variants rs10455872 and rs3798220, which correlate with lipoprotein(a) levels and coronary artery disease (CAD), confer susceptibility predominantly via atherosclerosis or thrombosis. Methods: The 2 LPA variants were combined and examined as LPA scores for the association with ischemic stroke (and TOAST [Trial of Org 10172 in Acute Stroke Treatment] subtypes) (effective sample size [n_e] = 9,396); peripheral arterial disease (n_e = 5,215); abdominal aortic aneurysm (n_e = 4,572); venous thromboembolism (n_e = 4,607); intracranial aneurysm (n_e = 1,328); CAD (n_e = 12,716); carotid intima-media thickness (n = 3,714); and angiographic CAD severity (n = 5,588). Results: LPA score was associated with ischemic stroke subtype large artery atherosclerosis (odds ratio [OR]: 1.27; p = 6.7 × 10⁻⁴), peripheral artery disease (OR: 1.47; p = 2.9 × 10⁻¹⁴), and abdominal aortic aneurysm (OR: 1.23; p = 6.0 × 10⁻⁵), but not with the ischemic stroke subtypes cardioembolism (OR: 1.03; p = 0.69) or small vessel disease (OR: 1.06; p = 0.52). Although the LPA variants were not associated with carotid intima-media thickness, they were associated with the number of obstructed coronary vessels (p = 4.8 × 10⁻¹²). Furthermore, CAD cases carrying LPA risk variants had increased susceptibility to atherosclerotic manifestations outside of the coronary tree (OR: 1.26; p = 0.0010) and had earlier onset of CAD (-1.58 years/allele; p = 8.2 × 10⁻⁸) than CAD cases not carrying the risk variants. There was no association of LPA score with venous thromboembolism (OR: 0.97; p = 0.63) or intracranial aneurysm (OR: 0.85; p = 0.15). Conclusions: LPA sequence variants were associated with atherosclerotic burden, but not with primarily thrombotic phenotypes.
Rare genetic disorders (RGDs) often exhibit significant clinical variability among affected individuals, a disease characteristic termed variable expressivity. Recently, the aggregate effect of common variation, quantified as polygenic scores (PGSs), has emerged as an effective tool for predictions of disease risk and trait variation in the general population. Here, we measure the effect of PGSs on 11 RGDs including four sex-chromosome aneuploidies (47,XXX; 47,XXY; 47,XYY; 45,X) that affect height; two copy-number variant (CNV) disorders (16p11.2 deletions and duplications) and a Mendelian disease (melanocortin 4 receptor deficiency (MC4R)) that affect BMI; and two Mendelian diseases affecting cholesterol: familial hypercholesterolemia (FH; LDLR and APOB) and familial hypobetalipoproteinemia (FHBL; PCSK9 and APOB). Our results demonstrate that common, polygenic factors of relevant complex traits frequently contribute to variable expressivity of RGDs and that PGSs may be a useful metric for predicting clinical severity in affected individuals and for risk stratification.
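The abstract above describes a polygenic score as the aggregate effect of common variation. A minimal Python sketch of the usual formulation follows, in which an individual's score is the sum of effect-allele dosages weighted by per-variant effect sizes. The abstract does not describe the study's actual pipeline, so the function names, weights, and genotypes here are hypothetical toy values.

```python
from typing import List, Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele dosages (0, 1 or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores: Sequence[float]) -> List[float]:
    """Convert raw scores to z-scores within a cohort (population SD)."""
    mean = sum(scores) / len(scores)
    sd = (sum((s - mean) ** 2 for s in scores) / len(scores)) ** 0.5
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mean) / sd for s in scores]


# Hypothetical per-variant effect sizes and genotypes for three individuals.
weights = [0.5, -0.25, 1.0]
cohort = [
    [0, 1, 2],
    [2, 0, 1],
    [1, 1, 1],
]

scores = [polygenic_score(genotype, weights) for genotype in cohort]
z_scores = standardize(scores)
print(scores)
print(z_scores)
```

Standardizing within a cohort is one common way to make scores comparable across individuals. Whether the study compared scores this way is not stated in the abstract.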