Technological and other advances over the past decades have led to the discovery of thousands of gene-disease associations for autosomal and X-linked recessive Mendelian disorders. Combined with recent improvements in assessing individual variants in each human genome, these developments offer the possibility of testing populations for all known severe recessive genetic disorders. Past experience has provided the framework for expanded carrier screening, but many challenges remain regarding which disorders to include, how to interpret variants and how to incorporate newly discovered gene-disease links into existing screening programmes.
The complete, ungapped sequence of the short arms of human acrocentric chromosomes (SAACs) is still unknown almost 20 years after the near completion of the Human Genome Project. Yet these short arms of Chromosomes 13, 14, 15, 21, and 22 contain the ribosomal DNA (rDNA) genes, which are of paramount importance for human biology. The sequences of SAACs show extensive variation in the copy number of the various repetitive elements, the full extent of which is currently unknown. In addition, the full spectrum of repeated sequences, their organization, and the low copy number functional elements are also unknown. The Telomere-to-Telomere (T2T) Project, using mainly long-read sequencing technology, has recently completed the assembly of the genome from a hydatidiform mole, CHM13, and has thus established a baseline reference for further studies on the organization, variation, functional annotation, and impact on human disorders of all the previously unknown genomic segments, including the SAACs. The publication of the initial results of the T2T Project will update and improve the reference genome for a better understanding of the evolution and function of the human genome.
Understanding how genetic variation affects distinct cellular phenotypes, such as gene expression levels, alternative splicing and DNA methylation levels, is essential for a better understanding of complex diseases and traits. Furthermore, how inter-individual variation of DNA methylation is associated with gene expression is just starting to be studied. In this study, we use the GenCord cohort: lymphoblastoid cell lines, T-cells and fibroblasts derived from the umbilical cords of 204 newborn Europeans. The samples were previously genotyped for 2.5 million SNPs, mRNA-sequenced, and assayed for methylation levels at 482,421 CpG sites. We observe that methylation sites associated with expression levels are enriched in enhancers, gene bodies and CpG island shores. We show that while the correlation between DNA methylation and gene expression can be positive or negative, it is very consistent across cell-types. However, this epigenetic association with gene expression appears more tissue-specific than the genetic effects on gene expression or DNA methylation (observed in both sharing estimations based on P-values and effect size correlations between cell-types). This predominance of genetic effects is also reflected by the observation that allele-specific expression differences between individuals dominate over tissue-specific effects. Additionally, we discover genetic effects on alternative splicing and, interestingly, a large amount of DNA methylation correlating with alternative splicing, both in a tissue-specific manner. The locations of the SNPs and methylation sites involved in these associations highlight the participation of promoter-proximal and distant regulatory regions in alternative splicing. Overall, our results provide high-resolution analyses showing how genome sequence variation has a broad effect on cellular phenotypes across cell-types, whereas epigenetic factors provide a secondary layer of variation that is more tissue-specific.
Furthermore, the details of how this tissue-specificity may vary across inter-relations of molecular traits, and where these are occurring, can yield further insights into gene regulation and cellular biology as a whole.
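The positive or negative methylation-expression associations described in this abstract can be illustrated with a minimal sketch. It computes a plain Pearson correlation between methylation at one CpG site and expression of one gene across individuals; this is a simple stand-in chosen for illustration, not the statistic or pipeline used in the study. The function names and the toy values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns None when either sequence has zero variance.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def association_direction(r):
    """Label the sign of a methylation-expression correlation."""
    if r is None or r == 0:
        return "none"
    return "positive" if r > 0 else "negative"


# Toy values (hypothetical): methylation fraction at one CpG site and
# expression of one gene, measured in the same four individuals.
methylation = [0.1, 0.2, 0.3, 0.4]
expression = [8.0, 6.0, 4.0, 2.0]

r = pearson(methylation, expression)
direction = association_direction(r)
```

Running the same calculation separately in each cell type and comparing the signs would be one simple way to check the cross-cell-type consistency that the abstract reports.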
The consistent and unambiguous description of sequence variants is essential to report and exchange information on the analysis of a genome. In particular, DNA diagnostics critically depends on accurate and standardized description and sharing of the variants detected. The sequence variant nomenclature system proposed in 2000 by the Human Genome Variation Society has been widely adopted and has developed into an internationally accepted standard. The recommendations are currently commissioned through a Sequence Variant Description Working Group (SVD‐WG) operating under the auspices of three international organizations: the Human Genome Variation Society (HGVS), the Human Variome Project (HVP), and the Human Genome Organization (HUGO). Requests for modifications and extensions go through the SVD‐WG following a standard procedure including a community consultation step. Version numbers are assigned to the nomenclature system to allow users to specify the version used in their variant descriptions. Here, we present the current recommendations, HGVS version 15.11, and briefly summarize the changes that were made since the 2000 publication. Most focus has been on removing inconsistencies and tightening definitions allowing automatic data processing. An extensive version of the recommendations is available online, at http://www.HGVS.org/varnomen.
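As a rough illustration of why tight definitions help automatic data processing, the sketch below parses one narrow class of variant descriptions: a single-nucleotide substitution at a plain integer position on a versioned RefSeq-style reference sequence. The regular expression, the function name and the example string are assumptions made for this sketch; it covers only a small fraction of the nomenclature (no intronic or UTR offsets, deletions, duplications, or protein-level descriptions) and is not an HGVS-provided tool.

```python
import re

# Deliberately narrow pattern: <accession>.<version>:<c|g>.<position><ref>><alt>
# Anything outside this shape is rejected rather than guessed at.
SUBSTITUTION = re.compile(
    r"^(?P<reference>[A-Z]{2}_\d+\.\d+)"      # e.g. NM_004006.2
    r":(?P<coordinate_type>[cg])\."           # coding DNA or genomic
    r"(?P<position>\d+)"                      # plain integer position only
    r"(?P<ref_base>[ACGT])>(?P<alt_base>[ACGT])$"
)


def parse_simple_substitution(description):
    """Return the parts of a simple substitution, or None if it does not match."""
    match = SUBSTITUTION.match(description)
    if match is None:
        return None
    fields = match.groupdict()
    fields["position"] = int(fields["position"])
    return fields


# Illustrative description string.
parsed = parse_simple_substitution("NM_004006.2:c.4375C>T")
# A description without a reference sequence is ambiguous and is rejected.
rejected = parse_simple_substitution("c.4375C>T")
```

Rejecting descriptions that lack a reference sequence reflects the abstract's point that unambiguous descriptions are what make variants exchangeable between laboratories.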
HIV-1 Nef, a protein important for the development of AIDS, has well-characterized effects on host membrane trafficking and receptor downregulation. By an unidentified mechanism, Nef increases the intrinsic infectivity of HIV-1 virions in a host-cell-dependent manner. Here we identify the host transmembrane protein SERINC5, and to a lesser extent SERINC3, as a potent inhibitor of HIV-1 particle infectivity that is counteracted by Nef. SERINC5 localizes to the plasma membrane, where it is efficiently incorporated into budding HIV-1 virions and impairs subsequent virion penetration of susceptible target cells. Nef redirects SERINC5 to a Rab7-positive endosomal compartment and thereby excludes it from HIV-1 particles. The ability to counteract SERINC5 was conserved in Nef encoded by diverse primate immunodeficiency viruses, as well as in the structurally unrelated glycosylated Gag from murine leukaemia virus. These examples of functional conservation and convergent evolution emphasize the fundamental importance of SERINC5 as a potent anti-retroviral factor.
The past 45 years have witnessed a triumph in the discovery of genes and genetic variation that cause Mendelian disorders due to high impact variants. Important discoveries and organized projects have provided the necessary tools and infrastructure for the identification of gene defects leading to thousands of monogenic phenotypes. This endeavor can be divided into three phases in which different laboratory strategies were employed for the discovery of disease‐related genes: (i) the biochemical phase, (ii) the genetic linkage followed by positional cloning phase, and (iii) the sequence identification phase. However, much more work is needed to identify all the high impact genomic variation that substantially contributes to the phenotypic variation.
Large intergenic noncoding RNAs (lincRNAs) are still poorly functionally characterized. We analyzed the genetic and epigenetic regulation of human lincRNA expression in the GenCord collection by using three cell types from 195 unrelated European individuals. We detected a considerable number of cis expression quantitative trait loci (cis-eQTLs) and demonstrated that the genetic regulation of lincRNA expression is independent of the regulation of neighboring protein-coding genes. lincRNAs have relatively more cis-eQTLs than do equally expressed protein-coding genes with the same exon number. lincRNA cis-eQTLs are located closer to transcription start sites (TSSs) and their effect sizes are higher than those of cis-eQTLs found for protein-coding genes, suggesting that lincRNA expression levels are less constrained than those of protein-coding genes. Additionally, lincRNA cis-eQTLs can influence the expression level of nearby protein-coding genes and thus could be considered as QTLs for enhancer activity. Enrichment of expressed lincRNA promoters in enhancer marks provides an additional argument for the involvement of lincRNAs in the regulation of transcription in cis. By investigating the epigenetic regulation of lincRNAs, we observed both positive and negative correlations between DNA methylation and gene expression (expression quantitative trait methylation, eQTMs), as expected, and found that the landscapes of passive and active roles of DNA methylation in gene regulation are similar to those of protein-coding genes. However, lincRNA eQTMs are located closer to TSSs than are protein-coding gene eQTMs. These similarities and differences in genetic and epigenetic regulation between lincRNAs and protein-coding genes contribute to the elucidation of potential functions of lincRNAs.
X-chromosome inactivation (XCI) provides a dosage compensation mechanism where, in each female cell, one of the two X chromosomes is randomly silenced. However, some genes on the inactive X chromosome and outside the pseudoautosomal regions escape from XCI and are expressed from both alleles (escapees). We investigated XCI at single-cell resolution combining deep single-cell RNA sequencing with whole-genome sequencing to examine allele-specific expression in 935 primary fibroblast and 48 lymphoblastoid single cells from five female individuals. In this framework we integrated an original method to identify and exclude doublets of cells. In fibroblast cells, we have identified 55 genes as escapees, including five undescribed escapee genes. Moreover, we observed that all genes exhibit a variable propensity to escape XCI in each cell and cell type and that each cell displays a distinct expression profile of the escapee genes. A metric, the Inactivation Score (defined as the mean of the allelic expression profiles of the escapees per cell), enables us to discover a heterogeneous and continuous degree of cellular XCI with extremes represented by “inactive” cells, i.e., cells exclusively expressing the escaping genes from the active X chromosome, and “escaping” cells expressing the escapees from both alleles. We found that this effect is associated with cell-cycle phases and, independently, with the XIST expression level, which is higher in the quiescent phase (G0). Single-cell allele-specific expression is a powerful tool to identify novel escapees in different tissues and provides evidence of an unexpected cellular heterogeneity of XCI.
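The Inactivation Score described in this abstract, a per-cell mean over the allelic expression of the escapee genes, might be sketched as follows. The abstract does not specify how the allelic expression profile of a gene is encoded; this sketch assumes it is the fraction of reads coming from the inactive-X allele, so that 0 means expression from the active X only. The function names, gene labels and read counts are hypothetical.

```python
from statistics import mean


def allelic_ratio(active_reads, inactive_reads):
    """Fraction of reads from the inactive-X allele (assumed encoding).

    Returns None when the gene has no allele-informative reads in the cell.
    """
    total = active_reads + inactive_reads
    if total == 0:
        return None
    return inactive_reads / total


def inactivation_score(cell_counts, escapees):
    """Mean allelic ratio over the escapee genes observed in one cell.

    cell_counts maps gene -> (active_reads, inactive_reads).
    Returns None if no escapee has informative reads in this cell.
    """
    ratios = []
    for gene in escapees:
        if gene not in cell_counts:
            continue
        ratio = allelic_ratio(*cell_counts[gene])
        if ratio is not None:
            ratios.append(ratio)
    return mean(ratios) if ratios else None


# Hypothetical escapee labels and per-cell allele read counts.
escapees = ["GENE_A", "GENE_B"]
inactive_like_cell = {"GENE_A": (10, 0), "GENE_B": (8, 0)}
mixed_cell = {"GENE_A": (10, 0), "GENE_B": (5, 5)}

score_inactive = inactivation_score(inactive_like_cell, escapees)
score_mixed = inactivation_score(mixed_cell, escapees)
```

Under this assumed encoding, a score of 0 corresponds to an "inactive" cell and larger values indicate a cell closer to the "escaping" extreme; a different encoding of the allelic profile would shift the scale but not the idea of averaging over escapees within each cell.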
APOBEC3A and APOBEC3B, cytidine deaminases of the APOBEC family, are among the main factors causing mutations in human cancers. APOBEC deaminates cytosines in single-stranded DNA (ssDNA). A fraction of the APOBEC-induced mutations occur as clusters ("kataegis") in single-stranded DNA produced during repair of double-stranded breaks (DSBs). However, the properties of the remaining 87% of nonclustered APOBEC-induced mutations, the source and the genomic distribution of the ssDNA where they occur, are largely unknown. By analyzing genomic and exomic cancer databases, we show that >33% of dispersed APOBEC-induced mutations occur on the lagging strand during DNA replication, thus unraveling the major source of ssDNA targeted by APOBEC in cancer. Although methylated cytosine is generally more mutation-prone than nonmethylated cytosine, we report that methylation reduces the rate of APOBEC-induced mutations by a factor of roughly two. Finally, we show that in cancers with extensive APOBEC-induced mutagenesis, there is almost no increase in mutation rates in late replicating regions (contrary to other cancers). Because late-replicating regions are depleted in exons, this results in a 1.3-fold higher fraction of mutations residing within exons in such cancers. This study provides novel insight into the APOBEC-induced mutagenesis and describes the peculiarity of the mutational processes in cancers with the signature of APOBEC-induced mutations.
The gonad is a unique biological system for studying cell-fate decisions. However, major questions remain regarding the identity of somatic progenitor cells and the transcriptional events driving cell differentiation. Using time-series single-cell RNA sequencing on XY mouse gonads during sex determination, we identified a single population of somatic progenitor cells prior to sex determination. A subset of these progenitors differentiates into Sertoli cells, a process characterized by a highly dynamic genetic program consisting of sequential waves of gene expression. Another subset of multipotent cells maintains their progenitor state but undergoes significant transcriptional changes restricting their competence toward a steroidogenic fate required for the differentiation of fetal Leydig cells. Our findings confirm the presence of a unique multipotent progenitor population in the gonadal primordium that gives rise to both supporting and interstitial lineages. They also provide the most granular analysis of the transcriptional events occurring during testicular cell-fate commitment.
• A single Nr5a1+ progenitor cell population was detected prior to sex determination
• Progenitors are able to give rise first to Sertoli and later to fetal Leydig cells
• Sertoli cell differentiation is characterized by a highly dynamic genetic program
• The remaining interstitial progenitors gradually acquire a steroidogenic fate
Using single-cell RNA sequencing of gonadal somatic cells during male sex determination, Stévant et al. identify a single Nr5a1-expressing progenitor cell population before sex determination that undergoes temporal fate specification with competence windows to differentiate first toward Sertoli cells or later to fetal Leydig cells.