Germline copy number variants (CNVs) and single-nucleotide polymorphisms (SNPs) form the basis of inter-individual genetic variation. Although the phenotypic effects of SNPs have been extensively investigated, the effects of CNVs are less well understood. To better characterize the mechanisms by which CNVs affect cellular phenotype, we tested their association with variable CpG methylation in a genome-wide manner. Using paired CNV and methylation data from the 1000 Genomes and HapMap projects, we identified genome-wide associations by methylation quantitative trait locus (mQTL) analysis. We found that individual CNVs were associated with methylation of multiple CpGs and vice versa. CNV-associated methylation changes were correlated with gene expression. CNV-mQTLs were enriched for regulatory regions and transcription factor-binding sites (TFBSs), and were involved in long-range physical interactions with associated CpGs. Some CNV-mQTLs were associated with methylation of imprinted genes. Several CNV-mQTLs and/or associated genes were among those previously reported by genome-wide association studies (GWASs). We demonstrate that germline CNVs in the genome are associated with CpG methylation. Our findings suggest that structural variation together with methylation may affect cellular phenotype.
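As a rough illustration of what a single CNV-CpG association test might look like, the sketch below regresses methylation level on copy number for one CNV-CpG pair. The abstract does not state which statistical model was used, so the choice of simple linear regression, the function name `mqtl_association`, and the toy values are all assumptions made for illustration only.

```python
import math

def mqtl_association(copy_number, methylation):
    """Return (slope, pearson_r) for methylation regressed on copy number.

    Assumption: a plain least-squares fit for one CNV-CpG pair; the
    abstract does not specify the model actually used.
    """
    n = len(copy_number)
    mean_x = sum(copy_number) / n
    mean_y = sum(methylation) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(copy_number, methylation))
    sxx = sum((x - mean_x) ** 2 for x in copy_number)
    syy = sum((y - mean_y) ** 2 for y in methylation)
    slope = sxy / sxx                   # change in methylation per extra copy
    r = sxy / math.sqrt(sxx * syy)      # strength and direction of the association
    return slope, r

# Hypothetical toy data: copy number per sample and methylation (beta value) at one CpG.
toy_copy_number = [1, 2, 2, 3, 2, 1, 3, 2]
toy_methylation = [0.30, 0.45, 0.50, 0.70, 0.48, 0.28, 0.66, 0.52]
slope, r = mqtl_association(toy_copy_number, toy_methylation)
```

In a genome-wide setting a test of this kind would be repeated over many CNV-CpG pairs and corrected for multiple testing, which this sketch leaves out.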
Proteins destined for import into the nucleus contain nuclear localization signals (NLSs) that are recognized by import receptors termed karyopherins or importins. Until recently, the only nuclear import sequence that had been well defined and characterized was the classical NLS (cNLS), which is recognized by importin α. However, Chook and coworkers (Lee, B. J., Cansizoglu, A. E., Süel, K. E., Louis, T. H., Zhang, Z., and Chook, Y. M. (2006) Cell 126, 543–558) have provided new insight into nuclear targeting with their identification of a novel NLS, termed the PY-NLS, that is recognized by the human karyopherin β2/transportin (Kapβ2) receptor. Here, we demonstrate that the PY-NLS is conserved in Saccharomyces cerevisiae and show for the first time that the PY-NLS is a functional nuclear targeting sequence in vivo. The apparent ortholog of Kapβ2 in yeast, Kap104, has two known cargos, the mRNA-binding proteins Hrp1 and Nab2, which both contain putative PY-NLS-like sequences. We find that the PY-NLS-like sequence within Hrp1, which closely matches the PY-NLS consensus, is both necessary and sufficient for nuclear import and is also required for receptor binding and protein function. In contrast, the PY-NLS-like sequences in Nab2, which vary from the PY-NLS consensus, are not required for proper import or protein function, suggesting that Kap104 may interact with different cargos using multiple mechanisms. Dissection of the PY-NLS consensus reveals that the minimal PY-NLS in yeast consists of the C-terminal portion of the human consensus, R/H/KX2–5PY, with upstream basic or hydrophobic residues enhancing the targeting function. Finally, we apply this analysis to a bioinformatic search of the yeast proteome as a preliminary search for new potential Kap104 cargos.
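The minimal motif quoted above, R/H/KX2–5PY, translates directly into a regular expression. The following is a minimal sketch of how a proteome scan for that pattern could be written; it is not the authors' search pipeline, the function name `find_py_nls` is hypothetical, and the sequences are invented toy strings rather than real yeast proteins. It also ignores the upstream basic or hydrophobic residues that the abstract says enhance targeting.

```python
import re

# Minimal yeast PY-NLS from the abstract: R, H, or K, then 2-5 arbitrary
# residues, then P-Y. Wrapping the pattern in a lookahead lets a candidate
# be reported at every start position, even when candidates overlap.
PY_NLS = re.compile(r"(?=([RHK].{2,5}PY))")

def find_py_nls(sequence):
    """Return (0-based start, matched motif) for each candidate minimal PY-NLS."""
    return [(m.start(), m.group(1)) for m in PY_NLS.finditer(sequence.upper())]

# Hypothetical toy sequences for illustration only.
toy_proteome = {
    "toyA": "MSTGGRSGAPYQQL",   # contains R-SGA-PY
    "toyB": "MAAALLLPYEE",      # has PY but no R/H/K at the right distance
}
hits = {name: find_py_nls(seq) for name, seq in toy_proteome.items()}
```

A match to this pattern is only a candidate: the abstract itself notes that the PY-NLS-like sequences in Nab2 are not required for import, so sequence hits would still need experimental follow-up.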
Gene expression differences are shaped by selective pressures and contribute to phenotypic differences between species. We identified 964 copy number differences (CNDs) of conserved sequences across three primate species and examined their potential effects on gene expression profiles. Samples with copy number-different genes had significantly different expression than samples with neutral copy number. Genes encoding regulatory molecules differed in copy number and were associated with significant expression differences. Additionally, we identified 127 CNDs that were processed pseudogenes, some of which were expressed. Furthermore, there were copy number-different regulatory regions such as ultraconserved elements and long intergenic noncoding RNAs with the potential to affect expression. We postulate that CNDs of these conserved sequences fine-tune developmental pathways by altering the levels of RNA.
Current cell-free DNA (cfDNA) next generation sequencing (NGS) precision oncology workflows are typically limited to targeted and/or disease-specific applications. In advanced cancer, disease burden and cfDNA tumor content are often elevated, yielding unique precision oncology opportunities. We sought to demonstrate the utility of a pan-cancer, rapid, inexpensive, whole genome NGS of cfDNA approach (PRINCe) as a precision oncology screening strategy via ultra-low coverage (~0.01x) tumor content determination through genome-wide copy number alteration (CNA) profiling. We applied PRINCe to a retrospective cohort of 124 cfDNA samples from 100 patients with advanced cancers, including 76 men with metastatic castration-resistant prostate cancer (mCRPC), enabling cfDNA tumor content approximation and actionable focal CNA detection, while facilitating concordance analyses between cfDNA and tissue-based NGS profiles and assessment of cfDNA alteration associations with mCRPC treatment outcomes. Therapeutically relevant focal CNAs were present in 42 (34%) cfDNA samples, including 36 of 93 (39%) mCRPC patient samples harboring AR amplification. PRINCe identified pre-treatment cfDNA CNA profiles facilitating disease monitoring. Combining PRINCe with routine targeted NGS of cfDNA enabled mutation and CNA assessment with coverages tuned to cfDNA tumor content. In mCRPC, genome-wide PRINCe cfDNA and matched tissue CNA profiles showed high concordance (median Pearson correlation = 0.87), and PRINCe-detectable AR amplifications predicted reduced time on therapy, independent of therapy type (Kaplan-Meier log-rank test, chi-square = 24.9, p < 0.0001). Our screening approach enables robust, broadly applicable cfDNA-based precision oncology for patients with advanced cancer through scalable identification of therapeutically relevant CNAs and pre-/post-treatment genomic profiles, enabling cfDNA- or tissue-based precision oncology workflow optimization.
Copy number variants (CNVs), defined as losses and gains of segments of genomic DNA, are a major source of genomic variation.
In this study, we identified over 2,000 human CNVs that overlap with orthologous chimpanzee or orthologous macaque CNVs. Of these, 170 CNVs overlap with both chimpanzee and macaque CNVs, and these were collapsed into 34 hotspot regions of CNV formation. Many of these hotspot regions of CNV formation are functionally relevant, with a bias toward genes involved in immune function, some of which were previously shown to evolve under balancing selection in humans. The genes in these primate CNV formation hotspots have significant differential expression levels between species and show evidence for positive selection, indicating that they have evolved under species-specific, directional selection.
These hotspots of primate CNV formation provide a novel perspective on divergence and selective pressures acting on these genomic regions.
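The step of collapsing overlapping CNVs into hotspot regions can be pictured as a standard interval merge. The sketch below shows one way this might be done; the abstract does not describe the actual collapsing procedure, so the overlap rule (closed coordinates, any shared base counts as overlap), the function name `collapse_intervals`, and the coordinates are assumptions for illustration.

```python
def collapse_intervals(intervals):
    """Merge overlapping (chrom, start, end) intervals into non-overlapping regions.

    Assumption: closed coordinates, so intervals that share at least one
    position on the same chromosome are merged.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous region on the same chromosome: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged

# Hypothetical CNV coordinates, not taken from the study.
toy_cnvs = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 400, 500),
    ("chr2", 120, 180),
]
hotspots = collapse_intervals(toy_cnvs)
```

Sorting first means each interval only has to be compared with the most recently built region, so the merge runs in a single pass after the sort.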
Genome diversity in Ukraine
Oleksyk, Taras K; Wolfsberger, Walter W; Weber, Alexandra M ...
GigaScience, 01/2021, Volume 10, Issue 1
Journal Article, peer reviewed, open access
Abstract
Background
The main goal of this collaborative effort is to provide genome-wide data for the previously underrepresented population in Eastern Europe, and to provide cross-validation of the data from genome sequences and genotypes of the same individuals acquired by different technologies. We collected 97 genome-grade DNA samples from individuals representing major regions of Ukraine who consented to public data release. BGISEQ-500 sequence data and genotypes by an Illumina GWAS chip were cross-validated on multiple samples and additionally referenced to 1 sample that has been resequenced by Illumina NovaSeq6000 S4 at high coverage.
Results
The genome data have been searched for genomic variation represented in this population, and a number of variants have been reported: large structural variants, indels, copy number variations, single-nucleotide polymorphisms, and microsatellites. To our knowledge, this study provides the largest survey to date of genetic variation in Ukraine, creating a public reference resource aiming to provide data for medical research in a large understudied population.
Conclusions
Our results indicate that the genetic diversity of the Ukrainian population is uniquely shaped by evolutionary and demographic forces and cannot be ignored in future genetic and biomedical studies. These data will contribute a wealth of new information, bringing forth many novel, endemic, and medically related alleles.
In recent years there has been a growing interest in the role of copy number variations (CNVs) in genetic diseases. Though there has been rapid development of technologies and statistical methods devoted to the detection of CNVs from array data, the inherent data quality issues associated with most hybridization techniques remain a challenging problem in CNV association studies.
To help address these data quality issues in the context of family-based association studies, we introduce a statistical framework for intensity-based array data that takes family information into account for copy-number assignment. The method is an adaptation of traditional methods for modeling SNP genotype data that assume a Gaussian mixture model, whereby CNV calling is performed for all family members simultaneously, leveraging within-family data to reduce CNV calls that are incompatible with Mendelian inheritance while still allowing de novo CNVs. Applying this method to simulation studies and a genome-wide association study in asthma, we find that our approach significantly improves CNV call accuracy and reduces Mendelian inconsistency rates and false positive genotype calls. The results were validated using qPCR experiments.
In conclusion, we have demonstrated that the use of family information can improve the quality of CNV calling and may yield more powerful association tests of CNVs.
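To make the idea of joint, family-aware calling concrete, the toy sketch below scores every (father, mother, child) copy-number configuration with Gaussian intensity likelihoods and down-weights configurations that violate Mendelian inheritance, without excluding them, so de novo events stay possible. This is a deliberately simplified illustration and not the authors' framework: the component means, the shared standard deviation, the down-weighting factor, and the unphased consistency rule are all hypothetical choices.

```python
import math
from itertools import product

STATES = (0, 1, 2, 3, 4)                             # candidate integer copy numbers
MEANS = {0: -2.0, 1: -0.6, 2: 0.0, 3: 0.4, 4: 0.7}   # hypothetical mean intensity per state
SD = 0.25                                            # hypothetical shared standard deviation
DE_NOVO_WEIGHT = 1e-3                                # hypothetical prior weight for inconsistent trios

def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mendelian_consistent(father, mother, child):
    # Each parent transmits one haplotype carrying between 0 and all of its
    # copies, so without phase information a child total is explainable by
    # inheritance whenever it does not exceed father + mother.
    return child <= father + mother

def call_independent(intensities):
    """Call each family member separately by maximum likelihood."""
    return tuple(
        max(STATES, key=lambda cn: gaussian_pdf(x, MEANS[cn], SD))
        for x in intensities
    )

def call_trio(intensities):
    """Call (father, mother, child) jointly, down-weighting inconsistent trios."""
    best, best_score = None, -1.0
    for states in product(STATES, repeat=3):
        score = 1.0
        for x, cn in zip(intensities, states):
            score *= gaussian_pdf(x, MEANS[cn], SD)
        if not mendelian_consistent(*states):
            score *= DE_NOVO_WEIGHT    # still allowed, but needs strong evidence
        if score > best_score:
            best, best_score = states, score
    return best

# Hypothetical intensities: the child's value lies between the 1- and 2-copy means.
toy_trio = (-2.0, -0.6, -0.28)
independent = call_independent(toy_trio)   # calls each member on its own
joint = call_trio(toy_trio)                # uses the parents to resolve the child
```

With these toy values the child's ambiguous intensity is called as two copies in isolation but as one copy jointly, because two copies cannot be inherited from parents carrying zero and one copy.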
The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome, and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three- to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community, allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.