The number of human genomes being genotyped or sequenced increases exponentially, and efficient haplotype estimation methods able to handle this amount of data are now required. Here we present a method, SHAPEIT4, which substantially improves upon other methods to process large genotype and high coverage sequencing datasets. It notably exhibits sub-linear running times with sample size, provides highly accurate haplotypes and allows integrating external phasing information such as large reference panels of haplotypes, collections of pre-phased variants and long sequencing reads. We provide SHAPEIT4 in an open source format and demonstrate its performance in terms of accuracy and running times on two gold standard datasets: the UK Biobank data and the Genome In A Bottle.
In order to discover quantitative trait loci, multi-dimensional genomic datasets combining DNA-seq and ChIP-/RNA-seq require methods that rapidly correlate tens of thousands of molecular phenotypes with millions of genetic variants while appropriately controlling for multiple testing.
We have developed FastQTL, a method that implements a popular cis-QTL mapping strategy in a user- and cluster-friendly tool. FastQTL also proposes an efficient permutation procedure to control for multiple testing. The outcome of permutations is modeled using beta distributions trained from a few permutations and from which adjusted P-values can be estimated at any level of significance with little computational cost. The Geuvadis and GTEx pilot datasets can now be easily analyzed an order of magnitude faster than previous approaches.
Source code, binaries and comprehensive documentation of FastQTL are freely available to download at http://fastqtl.sourceforge.net/
emmanouil.dermitzakis@unige.ch or olivier.delaneau@unige.ch
Supplementary data are available at Bioinformatics online.
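The beta approximation mentioned in the FastQTL abstract above can be illustrated with a small sketch. It assumes that, for one phenotype, the smallest nominal P-value from each of a few permutations is available, fits a beta distribution to those values, and reads the adjusted P-value off the fitted cumulative distribution. The function names are hypothetical, and the fit uses the method of moments as a simplification; this is not FastQTL's code, and the abstract does not specify how the beta parameters are estimated.

```python
import math
import random


def _beta_continued_fraction(x, a, b, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((a + m2 - 1.0) * (a + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + m2 + 1.0))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def beta_cdf(x, a, b):
    """Cumulative distribution function of Beta(a, b) evaluated at x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    # Pick the side on which the continued fraction converges quickly.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(x, a, b) / a
    return 1.0 - front * _beta_continued_fraction(1.0 - x, b, a) / b


def fit_beta_moments(null_min_pvalues):
    """Method-of-moments estimates (a, b) from permutation minimum P-values."""
    n = len(null_min_pvalues)
    mean = sum(null_min_pvalues) / n
    var = sum((p - mean) ** 2 for p in null_min_pvalues) / (n - 1)
    if var <= 0.0:
        raise ValueError("permutation P-values have zero variance")
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common


def adjusted_pvalue(observed_min_pvalue, null_min_pvalues):
    """Adjusted P-value: fitted beta CDF at the observed minimum P-value."""
    a, b = fit_beta_moments(null_min_pvalues)
    return beta_cdf(observed_min_pvalue, a, b)


if __name__ == "__main__":
    # Toy null: 50 independent tests per permutation, 200 permutations.
    rng = random.Random(0)
    null = [min(rng.random() for _ in range(50)) for _ in range(200)]
    print(adjusted_pvalue(0.001, null))
```

Because the adjusted P-value comes from a fitted distribution rather than from counting permutations that beat the observed value, it is not bounded below by one over the number of permutations, which is the property the abstract refers to as estimation "at any level of significance".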
Expression quantitative trait loci: present and future
Nica, Alexandra C.; Dermitzakis, Emmanouil T.
Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 06/2013, Volume 368, Issue 1620
Journal article, peer reviewed, open access
The last few years have seen the development of large efforts for the analysis of genome function, especially in the context of genome variation. One of the most prominent directions has been the extensive set of studies on expression quantitative trait loci (eQTLs), namely, the discovery of genetic variants that explain variation in gene expression levels. Such studies have offered promise not just for the characterization of functional sequence variation but also for the understanding of basic processes of gene regulation and interpretation of genome-wide association studies. In this review, we discuss some of the key directions of eQTL research and its implications.
Sex determination is a unique process that allows the study of multipotent progenitors and their acquisition of sex-specific fates during differentiation of the gonad into a testis or an ovary. Using time series single-cell RNA sequencing (scRNA-seq) on ovarian Nr5a1-GFP+ somatic cells during sex determination, we identified a single population of early progenitors giving rise to both pre-granulosa cells and potential steroidogenic precursor cells. By comparing time series single-cell RNA sequencing of XX and XY somatic cells, we provide evidence that gonadal supporting cells are specified from these early progenitors by a non-sex-specific transcriptomic program before pre-granulosa and Sertoli cells acquire their sex-specific identity. In XX and XY steroidogenic precursors, similar transcriptomic profiles underlie the acquisition of cell fate but with XX cells exhibiting a relative delay. Our data provide an important resource, at single-cell resolution, for further interrogation of the molecular and cellular basis of mammalian sex determination.
• XX Nr5a1+ progenitors give rise to pre-granulosa and steroidogenic precursor cells
• Supporting lineages commit similarly with the exception of Sry in XY cells
• Sertoli and granulosa cell differentiation is temporally asymmetric
• XY and XX progenitors progressively acquire a steroidogenic precursor fate
Using single-cell RNA sequencing of Nr5a1-expressing gonadal somatic cells during female and male sex determination, Stévant et al. deconvoluted the cell lineage specification process and sex-specific differentiation of both the supporting and the steroidogenic cell lineages at a transcriptomic level.
With the advent of RNA-sequencing technology, we can detect different types of alternative splicing and determine how DNA variation regulates splicing. However, given the short read lengths used in most population-based RNA-sequencing experiments, quantifying transcripts accurately remains a challenge. Here we present a method, Altrans, for discovery of alternative splicing quantitative trait loci (asQTLs). To assess the performance of Altrans, we compared it to Cufflinks and MISO in simulations and Cufflinks for asQTL discovery. Simulations show that in the presence of unannotated transcripts, Altrans performs better in quantifications than Cufflinks and MISO. We have applied Altrans and Cufflinks to the Geuvadis dataset, which comprises samples from European and African populations, and discovered (FDR = 1%) 1,427 and 166 asQTLs with Altrans and 1,737 and 304 asQTLs with Cufflinks for Europeans and Africans, respectively. We show that, by discovering a set of asQTLs in a smaller subset of European samples and replicating these in the remaining larger subset of Europeans, both methods achieve similar replication levels (95% for both methods). We find many Altrans-specific asQTLs, which replicate to a high degree (93%). This is mainly due to junctions absent from the annotations and hence not tested with Cufflinks. The asQTLs are significantly enriched for biochemically active regions of the genome, functional marks, and variants in splicing regions, highlighting their biological relevance. We present an approach for discovering asQTLs that is a more direct assessment of splicing compared to other methods and is complementary to other transcript quantification methods.
Understanding how genetic variation affects distinct cellular phenotypes, such as gene expression levels, alternative splicing and DNA methylation levels, is essential for better understanding of complex diseases and traits. Furthermore, how inter-individual variation of DNA methylation is associated with gene expression is just starting to be studied. In this study, we use the GenCord cohort of 204 newborn Europeans' lymphoblastoid cell lines, T-cells and fibroblasts derived from umbilical cords. The samples were previously genotyped for 2.5 million SNPs, mRNA-sequenced, and assayed for methylation levels in 482,421 CpG sites. We observe that methylation sites associated with expression levels are enriched in enhancers, gene bodies and CpG island shores. We show that while the correlation between DNA methylation and gene expression can be positive or negative, it is very consistent across cell-types. However, this epigenetic association with gene expression appears more tissue-specific than the genetic effects on gene expression or DNA methylation (observed in both sharing estimations based on P-values and effect size correlations between cell-types). This predominance of genetic effects can also be reflected by the observation that allele specific expression differences between individuals dominate over tissue-specific effects. Additionally, we discover genetic effects on alternative splicing and, interestingly, a large amount of DNA methylation correlating with alternative splicing, both in a tissue-specific manner. The locations of the SNPs and methylation sites involved in these associations highlight the participation of promoter proximal and distant regulatory regions in alternative splicing. Overall, our results provide high-resolution analyses showing how genome sequence variation has a broad effect on cellular phenotypes across cell-types, whereas epigenetic factors provide a secondary layer of variation that is more tissue-specific.
Furthermore, the details of how this tissue-specificity may vary across inter-relations of molecular traits, and where these are occurring, can yield further insights into gene regulation and cellular biology as a whole.
This article is based on the address given by the author at the 2021 virtual meeting of the American Society of Human Genetics (ASHG). The video of the original address can be found at the ASHG website.
Population scale studies combining genetic information with molecular phenotypes (for example, gene expression) have become a standard to dissect the effects of genetic variants on organismal phenotypes. These kinds of data sets require powerful, fast and versatile methods able to discover molecular Quantitative Trait Loci (molQTL). Here we propose such a solution, QTLtools, a modular framework that contains multiple new and well-established methods to prepare the data, to discover proximal and distal molQTLs and, finally, to integrate them with GWAS variants and functional annotations of the genome. We demonstrate its utility by performing a complete expression QTL study in a few easy-to-perform steps. QTLtools is open source and available at https://qtltools.github.io/qtltools/.
The genetic basis of gene expression variation has long been studied with the aim to understand the landscape of regulatory variants, but also more recently to assist in the interpretation and elucidation of disease signals. To date, many studies have looked in specific tissues and population-based samples, but there has been limited assessment of the degree of inter-population variability in regulatory variation. We analyzed genome-wide gene expression in lymphoblastoid cell lines from a total of 726 individuals from 8 global populations from the HapMap3 project and correlated gene expression levels with HapMap3 SNPs located in cis to the genes. We describe the influence of ancestry on gene expression levels within and between these diverse human populations and uncover a non-negligible impact on global patterns of gene expression. We further dissect the specific functional pathways differentiated between populations. We also identify 5,691 expression quantitative trait loci (eQTLs) after controlling for both non-genetic factors and population admixture and observe that half of the cis-eQTLs are replicated in one or more of the populations. We highlight patterns of eQTL-sharing between populations, which are partially determined by population genetic relatedness, and discover significant sharing of eQTL effects between Asians, European-admixed, and African subpopulations. Specifically, we observe that both the effect size and the direction of effect for eQTLs are highly conserved across populations. We observe an increasing proximity of eQTLs toward the transcription start site as sharing of eQTLs among populations increases, highlighting that variants close to TSS have stronger effects and therefore are more likely to be detected across a wider panel of populations.
Together these results offer a unique picture and resource of the degree of differentiation among human populations in functional regulatory variation and provide an estimate for the transferability of complex trait variants across populations.
Vaccines are thought to be the best available solution for controlling the ongoing SARS-CoV-2 pandemic. However, the emergence of vaccine-resistant strains may come too rapidly for current vaccine developments to alleviate the health, economic and social consequences of the pandemic. To quantify and characterize the risk of such a scenario, we created a SIR-derived model with initial stochastic dynamics of the vaccine-resistant strain to study the probability of its emergence and establishment. Using parameters realistically resembling SARS-CoV-2 transmission, we model a wave-like pattern of the pandemic and consider the impact of the rate of vaccination and the strength of non-pharmaceutical intervention measures on the probability of emergence of a resistant strain. As expected, we found that a fast rate of vaccination decreases the probability of emergence of a resistant strain. Counterintuitively, when a relaxation of non-pharmaceutical interventions happened at a time when most individuals of the population had already been vaccinated, the probability of emergence of a resistant strain was greatly increased. Consequently, we show that a period of transmission reduction close to the end of the vaccination campaign can substantially reduce the probability of resistant strain establishment. Our results suggest that policymakers and individuals should consider maintaining non-pharmaceutical interventions and transmission-reducing behaviours throughout the entire vaccination period.
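For readers unfamiliar with SIR-type models, the sketch below shows the deterministic backbone such models build on: susceptible, infected and recovered fractions, extended here with a vaccinated compartment and a factor that scales transmission to mimic non-pharmaceutical interventions. It is not the authors' model: it omits the resistant strain and its stochastic emergence, and the function name, the Euler time stepping and every parameter value are illustrative assumptions.

```python
def simulate_sir_with_vaccination(beta, gamma, vacc_rate, npi_factor,
                                  days, dt=0.1, initial_infected=1e-4):
    """Euler integration of a toy SIR model with vaccination.

    beta       : transmission rate per day without interventions (assumed)
    gamma      : recovery rate per day (assumed)
    vacc_rate  : fraction of the whole population vaccinated per day (assumed)
    npi_factor : multiplier in [0, 1] applied to beta to mimic interventions
    Returns a list of (time, susceptible, infected, recovered, vaccinated).
    """
    s = 1.0 - initial_infected
    i = initial_infected
    r = 0.0
    v = 0.0
    trajectory = [(0.0, s, i, r, v)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = npi_factor * beta * s * i * dt
        new_recoveries = gamma * i * dt
        # Vaccination cannot remove more people than remain susceptible.
        new_vaccinated = min(vacc_rate * dt, max(s - new_infections, 0.0))
        s -= new_infections + new_vaccinated
        i += new_infections - new_recoveries
        r += new_recoveries
        v += new_vaccinated
        trajectory.append((step * dt, s, i, r, v))
    return trajectory


if __name__ == "__main__":
    # Compare the infection peak without and with vaccination.
    for rate in (0.0, 0.01):
        run = simulate_sir_with_vaccination(0.3, 0.1, rate, 1.0, days=200)
        print(rate, max(point[2] for point in run))
```

Lowering `npi_factor` or raising `vacc_rate` reduces the infection peak in this toy setting; the counterintuitive effect reported in the abstract concerns the resistant strain, which this sketch does not model.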