CRISPR-Cas9 genome editing has the potential to cure diseases that lack current treatments, but therapies must be safe. Here we show that CRISPR-Cas9 editing can introduce unintended mutations in vivo, which are passed on to the next generation. By editing fertilized zebrafish eggs using four guide RNAs selected for off-target activity in vitro, followed by long-read sequencing of DNA from >1100 larvae, juvenile and adult fish across two generations, we find that structural variants (SVs), i.e., insertions and deletions ≥50 bp, represent 6% of editing outcomes in founder larvae. These SVs occur at both on-target and off-target sites. Our results also illustrate that adult founder zebrafish are mosaic in their germ cells, and that 26% of their offspring carry an off-target mutation and 9% an SV. Hence, pre-testing for off-target activity and SVs using patient material is advisable in clinical applications, to reduce the risk of unanticipated effects with potentially large implications.
Over the past decade, the Database of Genomic Variants (DGV; http://dgv.tcag.ca/) has provided a publicly accessible, comprehensive curated catalogue of structural variation (SV) found in the genomes of control individuals from worldwide populations. Here, we describe updates and new features, which have expanded the utility of DGV for both the basic research and clinical diagnostic communities. The current version of DGV consists of 55 published studies, comprising >2.5 million entries identified in >22,300 genomes. Studies included in DGV are selected from the accessioned data sets in the archival SV databases dbVar (NCBI) and DGVa (EBI), and then further curated for accuracy and validity. The core visualization tool (gbrowse) has been upgraded with additional functions to facilitate data analysis and comparison, and a new query tool has been developed to provide flexible and interactive access to the data. The content from DGV is regularly incorporated into other large-scale genome reference databases and represents a standard data resource for new product and database development, in particular for copy number variation testing in clinical labs. The accurate cataloguing of variants in DGV will continue to enable medical genetics and genome sequencing research.
Significant advances have been made over the past 5 years in mapping and characterizing structural variation in the human genome. Despite this progress, our understanding of inversion variants is still very restricted. While unbalanced variants such as copy number variations can be mapped using array-based approaches, strategies for characterization of inversion variants have been limited and underdeveloped. Traditional cytogenetic approaches have long been able to identify microscopic inversion events, but discovery of submicroscopic events has remained elusive and largely ignored. With the advent of paired-end sequencing approaches, it is now possible to map inversions across the human genome. Based on the paired-end sequencing studies published to date, it is now feasible to make a first map of inversions across the human genome and to use this map to explore the characteristics and distribution of this form of variation. The current map of inversions indicates that many remain to be identified, especially in the smaller size ranges. This review provides an overview of the current knowledge about human inversions and their contribution to human phenotypes. Further characterization of inversions should be considered as an important step towards a deeper understanding of human variation and genome dynamics.
Transcriptome analysis has mainly relied on analyzing RNA sequencing data from whole cells, overlooking the impact of subcellular RNA localization and its influence on our understanding of gene function and on the interpretation of gene expression signatures in cells. Here, we separated cytosolic and nuclear RNA from human fetal and adult brain samples and performed a comprehensive analysis of cytosolic and nuclear transcriptomes. There are significant differences in RNA expression for protein-coding and lncRNA genes between cytosol and nucleus. We show that transcripts encoding the nuclear-encoded mitochondrial proteins are significantly enriched in the cytosol compared to the rest of protein-coding genes. Differential expression analysis between fetal and adult frontal cortex shows that results obtained from the cytosolic RNA differ from results using nuclear RNA both at the level of transcript types and the number of differentially expressed genes. Our data provide a resource for the subcellular localization of thousands of RNA transcripts in the human brain and highlight differences in using the cytosolic or the nuclear transcriptomes for expression analysis.
Chromoanagenesis is a genomic event responsible for the formation of complex structural chromosomal rearrangements (CCRs). Germline chromoanagenesis is rare and the majority of reported cases are associated with an affected phenotype. Here, we report a healthy female carrying two de novo CCRs involving chromosomes 4, 19, 21 and X and chromosomes 7 and 11, respectively, with a total of 137 breakpoint junctions (BPJs). We characterized the CCRs using a hybrid-sequencing approach, combining short-read sequencing, nanopore sequencing, and optical mapping. The results were validated using multiple cytogenetic methods, including fluorescence in situ hybridization, spectral karyotyping, and Sanger sequencing. We identified 137 BPJs, which to our knowledge is the highest number of reported breakpoint junctions in germline chromoanagenesis. We also performed a statistical assessment of the positioning of the breakpoints, revealing a significant enrichment of BPJ-affecting genes (96 intragenic BPJs, 26 genes, p < 0.0001), indicating that the CCRs formed during active transcription of these genes. In addition, we find that the DNA fragments are unevenly and non-randomly distributed across the derivative chromosomes, indicating a multistep process of scattering and re-joining of DNA fragments. In summary, we report a new maximum number of BPJs (137) in germline chromoanagenesis. We also show that a hybrid sequencing approach is necessary for the correct characterization of complex CCRs. Through in-depth statistical assessment, it was found that the CCRs were most likely formed through an event resembling chromoplexy, a catastrophic event caused by erroneous transcription factor binding.
Schizophrenia is a complex neurodevelopmental disorder with a high rate of morbidity and mortality. While the heritability is high, the precise etiology is still unknown. Although schizophrenia is a central nervous system disorder, studies using peripheral tissues have also been established to search for patient-specific biomarkers and to increase understanding of schizophrenia etiology. Among all peripheral tissues, fibroblasts stand out as they are easy to obtain and culture. Furthermore, they remain genetically stable for long periods and exhibit molecular similarities to cells from the nervous system. Using a unique set of fibroblast samples from a genetically isolated population in northern Sweden, we performed whole transcriptome sequencing to compare differentially expressed genes in seven controls and nine patients. We found differential fibroblast expression between cases and controls for 48 genes, including eight genes previously implicated in schizophrenia or schizophrenia-related pathways: HGF, PRRT2, EGR1, EGR3, C11orf87, TLR3, PLEKHH2 and PIK3CD. Weighted gene correlation network analysis identified three differentially co-expressed networks of genes significantly associated with schizophrenia. All three modules were significantly suppressed in patients compared to controls, with one module highly enriched in genes involved in synaptic plasticity, behavior and synaptic transmission. In conclusion, our results support the use of fibroblasts for identification of differentially expressed genes in schizophrenia and highlight dysregulation of synaptic networks as an important mechanism in schizophrenia.
Structural variations of DNA greater than 1 kilobase in size account for most bases that vary among human genomes, but are still relatively under-ascertained. Here we use tiling oligonucleotide microarrays, comprising 42 million probes, to generate a comprehensive map of 11,700 copy number variations (CNVs) greater than 443 base pairs, of which most (8,599) have been validated independently. For 4,978 of these CNVs, we generated reference genotypes from 450 individuals of European, African or East Asian ancestry. The predominant mutational mechanisms differ among CNV size classes. Retrotransposition has duplicated and inserted some coding and non-coding DNA segments randomly around the genome. Furthermore, by correlation with known trait-associated single nucleotide polymorphisms (SNPs), we identified 30 loci with CNVs that are candidates for influencing disease susceptibility. Despite this, having assessed the completeness of our map and the patterns of linkage disequilibrium between CNVs and SNPs, we conclude that, for complex traits, the heritability void left by genome-wide association studies will not be accounted for by common CNVs.
We identified 255 loci across the human genome that contain genomic imbalances among unrelated individuals. Twenty-four variants are present in >10% of the individuals that we examined. Half of these regions overlap with genes, and many coincide with segmental duplications or gaps in the human genome assembly. This previously unappreciated heterogeneity may underlie certain human phenotypic variation and susceptibility to disease and argues for a more dynamic human genome structure.
The COVID-19 pandemic highlighted the need for a rapid, convenient, and scalable diagnostic method for detecting a novel pathogen. While command-line interface tools offer automation for SARS-CoV-2 Oxford Nanopore Technology sequencing data analysis, they are impractical for users with limited programming skills. A solution is to establish such automated workflows within graphical user interface software. We developed two workflows in the software Geneious Prime 2022.1.1, adapted for data obtained from the Midnight and Artic’s nCoV-2019 sequencing protocols. Both workflows perform trimming, read mapping, consensus generation, and annotation on SARS-CoV-2 Nanopore sequencing data. Additionally, one workflow includes phylogenetic assignment using the bioinformatic tools pangolin and Nextclade as plugins. The basic workflow was validated in 2020, adhering to the requirements of the European Centre for Disease Prevention and Control for SARS-CoV-2 sequencing and analysis. The enhanced workflow, providing phylogenetic assignment, underwent validation at Uppsala University Hospital by analysing 96 clinical samples. It provided accurate diagnoses matching the original results of the basic workflow while also reducing manual clicks and analysis time. These bioinformatic workflows streamline SARS-CoV-2 Nanopore data analysis in Geneious Prime, saving time and manual work for operators lacking programming knowledge.
Cervical carcinoma has a heritable genetic component, but the genetic basis of cervical cancer is still not well understood.
We performed a genome-wide association study of 731 422 single nucleotide polymorphisms (SNPs) in 1075 cervical cancer case subjects and 4014 control subjects and replicated it in 1140 case subjects and 1058 control subjects. The association between top SNPs and cervical cancer was estimated by odds ratios (ORs) and 95% confidence intervals (CIs) with unconditional logistic regression. All statistical tests were two-sided.
Three independent loci in the major histocompatibility complex (MHC) region at 6p21.3 were associated with cervical cancer: the first is adjacent to the MHC class I polypeptide-related sequence A gene (MICA) (rs2516448; OR = 1.42, 95% CI = 1.31 to 1.54; P = 1.6 × 10⁻¹⁸); the second is between HLA-DRB1 and HLA-DQA1 (rs9272143; OR = 0.67, 95% CI = 0.62 to 0.72; P = 9.3 × 10⁻²⁴); and the third is at HLA-DPB2 (rs3117027; OR = 1.25, 95% CI = 1.15 to 1.35; P = 4.9 × 10⁻⁸). We also confirmed previously reported associations of B*0702 and DRB1*1501-DQB1*0602 with susceptibility to, and of DRB1*1301-DQA1*0103-DQB1*0603 with protection against, cervical cancer. The three new loci are statistically independent of these specific human leukocyte antigen alleles/haplotypes. MICA encodes a membrane-bound protein that acts as a ligand for NKG2D to activate antitumor effects. The risk allele of rs2516448 is in perfect linkage disequilibrium with a frameshift mutation (A5.1) of MICA, which results in a truncated protein. Functional analysis shows that women carrying this mutation have lower levels of membrane-bound MICA.
Three novel loci in the MHC may affect susceptibility to cervical cancer in situ, including the MICA-A5.1 allele that may cause impaired immune activation and increased risk of tumor development.
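As a rough illustration of how an odds ratio and its 95% confidence interval relate, the sketch below computes an unadjusted OR with a Wald interval from a 2×2 table of allele-carrier counts. This is a simplification and not the study's method: the abstract reports ORs from unconditional logistic regression, and the function name and cell counts here are hypothetical.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    Hypothetical 2x2 layout (not taken from the abstract):
      a = carrier cases,    b = non-carrier cases,
      c = carrier controls, d = non-carrier controls.
    z = 1.96 corresponds to a two-sided 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    or_, lo, hi = odds_ratio_wald_ci(20, 10, 10, 20)
    print(f"OR = {or_:.2f}, 95% CI = {lo:.2f} to {hi:.2f}")
```

Because the interval is symmetric on the log scale, an OR above 1 whose interval excludes 1 (as for rs2516448) indicates increased risk, while one below 1 whose interval excludes 1 (as for rs9272143) indicates protection.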