The epithelial-mesenchymal signaling involving SHH-FOXF1, TBX4-FGF10, and TBX2 pathways is an essential transcriptional network operating during early lung organogenesis. However, the precise regulatory interactions between the genes and proteins in this network are incompletely understood.
To identify TBX2 and TBX4 genome-wide binding sites, we performed chromatin immunoprecipitation followed by next-generation sequencing (ChIP-seq) in human fetal lung fibroblasts IMR-90.
We identified 14,322 and 1,862 sites strongly enriched for binding of TBX2 and TBX4, respectively, 43.95% and 18.79% of which are located in gene promoter regions. Gene Ontology, pathway enrichment, and DNA binding motif analyses revealed a number of overrepresented cues and transcription factor binding motifs relevant for lung branching that can be transcriptionally regulated by TBX2 and/or TBX4. In addition, TBX2 and TBX4 binding sites were found enriched around and within FOXF1 and its antisense long noncoding RNA FENDRR, indicating that the TBX4-FGF10 cascade may directly interact with SHH-FOXF1 signaling.
We highlight the complexity of the transcriptional network driven by TBX2 and TBX4 and show that disruption of this crosstalk during morphogenesis can play a substantial role in the etiology of lung developmental disorders.
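The share of binding sites that fall in promoter regions, as reported above, can be illustrated with a minimal sketch. It is not the authors' pipeline: the promoter window (1 kb on either side of the transcription start site), the half-open coordinates, and all peak and TSS values are assumptions made only for illustration.

```python
def in_promoter(peak, tss_list, upstream=1000, downstream=1000):
    """Return True if a peak overlaps any promoter window.

    peak is (chrom, start, end) in half-open coordinates.
    tss_list holds (chrom, position, strand) tuples.
    The window sizes are hypothetical defaults, not values from the study.
    """
    chrom, start, end = peak
    for tss_chrom, tss_pos, strand in tss_list:
        if chrom != tss_chrom:
            continue
        if strand == "+":
            lo, hi = tss_pos - upstream, tss_pos + downstream
        else:
            # On the minus strand, "upstream" lies at higher coordinates.
            lo, hi = tss_pos - downstream, tss_pos + upstream
        if start < hi and lo < end:
            return True
    return False


def promoter_fraction(peaks, tss_list):
    """Fraction of peaks that overlap at least one promoter window."""
    if not peaks:
        return 0.0
    return sum(in_promoter(p, tss_list) for p in peaks) / len(peaks)


# Made-up peaks and transcription start sites.
peaks = [
    ("chr16", 900, 1100),
    ("chr16", 50000, 50200),
    ("chr17", 4800, 5000),
    ("chr17", 9000, 9100),
]
tss_list = [("chr16", 1000, "+"), ("chr17", 5500, "-")]
fraction = promoter_fraction(peaks, tss_list)
```

A real analysis would use a gene annotation and an interval index rather than the linear scan shown here.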
There are over 25 tools dedicated to the detection of Copy Number Variants (CNVs) from Whole Exome Sequencing (WES) data based on read depth analysis. These tools proceed in several steps, including: (i) calculation of read depth for each sequencing target, (ii) normalization, (iii) segmentation and (iv) actual CNV calling. The essential aspect of the entire process is the normalization stage, in which systematic errors and biases are removed and the reference sample set is used to increase the signal-to-noise ratio. Although some CNV calling tools use dedicated algorithms to obtain the optimal reference sample set, most of the advanced CNV callers do not include this feature. To our knowledge, this work is the first attempt to assess the impact of reference sample set selection on CNV detection performance.
We used WES data from the 1000 Genomes project to evaluate the impact of various methods of reference sample set selection on CNV calling performance of three chosen state-of-the-art tools: CODEX, CNVkit and exomeCopy. Two naive solutions (all samples as reference set and random selection) as well as two clustering methods (k-means and k nearest neighbours (kNN) with a variable number of clusters or group sizes) have been evaluated to discover the best performing sample selection method.
The experiments showed that appropriate selection of the reference sample set may greatly improve the CNV detection rate. In particular, we found that smart reduction of the reference sample set size may significantly increase the algorithms' precision while having a negligible negative effect on sensitivity. We also observed that a complete CNV calling process with the k-means algorithm as the selection method has significantly better time complexity than the kNN-based solution.
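The idea of choosing a reference sample set by nearest neighbours can be sketched as follows. This is an illustrative example only, not the procedure used by CODEX, CNVkit, or exomeCopy: the centred log2 depth profile, the Euclidean distance, and the sample names and depths are all assumptions.

```python
import math


def log_depth_profile(depths):
    """Convert raw per-target read depths to a mean-centred log2 profile.

    Centring removes overall coverage differences between samples, so that
    distance reflects the shape of the profile (an assumed normalization).
    """
    logs = [math.log2(d + 1.0) for d in depths]
    mean = sum(logs) / len(logs)
    return [x - mean for x in logs]


def euclidean(a, b):
    """Euclidean distance between two equally long profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_reference_knn(depth_by_sample, target_sample, k):
    """Return the k samples whose depth profiles are closest to the target."""
    target = log_depth_profile(depth_by_sample[target_sample])
    distances = []
    for name, depths in depth_by_sample.items():
        if name == target_sample:
            continue  # a sample is never its own reference
        distances.append((euclidean(target, log_depth_profile(depths)), name))
    distances.sort()
    return [name for _, name in distances[:k]]


# Hypothetical read depths for four targets in four samples.
depth_by_sample = {
    "S1": [100, 200, 150, 80],
    "S2": [110, 210, 160, 85],
    "S3": [300, 50, 400, 20],
    "S4": [95, 190, 140, 75],
}
reference = select_reference_knn(depth_by_sample, "S1", k=2)
```

In this toy data set, S3 has a very different coverage profile and is left out of the reference set for S1, which is the kind of reduction the abstract associates with higher precision.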
The goal of this study was to assess the scale of low-level parental mosaicism in exome sequencing (ES) databases.
We analyzed approximately 2000 family trio ES data sets from the Baylor-Hopkins Center for Mendelian Genomics (BHCMG) and Baylor Genetics (BG). Among apparent de novo single-nucleotide variants identified in the affected probands, we selected rare unique variants with variant allele fraction (VAF) between 30% and 70% in the probands and lower than 10% in one of the parents.
Of 102 candidate mosaic variants validated using amplicon-based next-generation sequencing, droplet digital polymerase chain reaction, or blocker displacement amplification, 27 (26.4%) were confirmed to be low- (VAF between 1% and 10%) or very low (VAF <1%) level mosaic. Detection precision in parental samples with two or more alternate reads was 63.6% (BHCMG) and 43.6% (BG). In nine investigated individuals, we observed variability of mosaic ratios among blood, saliva, fibroblast, buccal, hair, and urine samples.
Our computational pipeline enables robust discrimination between true and false positive candidate mosaic variants and efficient detection of low-level mosaicism in ES samples. We confirm that the presence of two or more alternate reads in the parental sample is a reliable predictor of low-level parental somatic mosaicism.
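The selection criteria and VAF classes described above can be written down as a minimal sketch. It is not the authors' pipeline: the function names, the handling of values exactly at the thresholds, and the example read counts are assumptions, while the 30% to 70% proband window, the 10% parental ceiling, the 1% boundary, and the two-alternate-read rule come from the text.

```python
def vaf(alt_reads, total_reads):
    """Variant allele fraction from read counts (0.0 when there is no coverage)."""
    return alt_reads / total_reads if total_reads else 0.0


def is_candidate(proband, parents, min_parent_alt_reads=2):
    """Flag an apparent de novo variant as a candidate for parental mosaicism.

    proband and each entry of parents are (alt_reads, total_reads) tuples.
    The proband VAF must lie between 30% and 70%, and at least one parent
    must carry two or more alternate reads at a VAF below 10%.
    """
    if not 0.30 <= vaf(*proband) <= 0.70:
        return False
    return any(
        alt >= min_parent_alt_reads and vaf(alt, total) < 0.10
        for alt, total in parents
    )


def mosaic_level(parent_vaf):
    """Classify a parental VAF using the classes named in the text."""
    if parent_vaf <= 0:
        return "not detected"
    if parent_vaf < 0.01:
        return "very low"
    if parent_vaf < 0.10:  # treatment of exactly 10% is an assumption
        return "low"
    return "above low-level range"


# Hypothetical read counts: proband, then mother and father.
proband = (48, 100)
parents = [(3, 120), (0, 110)]
candidate = is_candidate(proband, parents)
level = mosaic_level(vaf(*parents[0]))
```

Candidates flagged this way would still need orthogonal validation, as done in the study with amplicon-based sequencing, droplet digital PCR, or blocker displacement amplification.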
Although mosaic variation has been known to cause disease for decades, high-throughput sequencing technologies with the analytical sensitivity to consistently detect variants at reduced allelic fractions have only recently emerged as routine clinical diagnostic tests. To date, few systematic analyses of mosaic variants detected by diagnostic exome sequencing for diverse clinical indications have been performed.
To investigate the frequency, type, allelic fraction, and phenotypic consequences of clinically relevant somatic mosaic single nucleotide variants (SNVs) and characteristics of the corresponding genes, we retrospectively queried reported mosaic variants from a cohort of ~ 12,000 samples submitted for clinical exome sequencing (ES) at Baylor Genetics.
We found 120 mosaic variants involving 107 genes, including 80 mosaic SNVs in proband samples and 40 in parental/grandparental samples. Average mosaic alternate allele fraction (AAF) detected in autosomes and in X-linked disease genes in females was 18.2% compared with 34.8% in X-linked disease genes in males. Of these mosaic variants, 74 variants (61.7%) were classified as pathogenic or likely pathogenic and 46 (38.3%) as variants of uncertain significance. Mosaic variants occurred in disease genes associated with autosomal dominant (AD) or AD/autosomal recessive (AR) (67/120, 55.8%), X-linked (33/120, 27.5%), AD/somatic (10/120, 8.3%), and AR (8/120, 6.7%) inheritance. Of note, 1.7% (2/120) of variants were found in genes in which only somatic events have been described. Nine genes had recurrent mosaic events in unrelated individuals which accounted for 18.3% (22/120) of all detected mosaic variants in this study. The proband group was enriched for mosaicism affecting Ras signaling pathway genes.
In sum, an estimated 1.5% of all molecular diagnoses made in this cohort could be attributed to a mosaic variant detected in the proband, while parental mosaicism was identified in 0.3% of families analyzed. As ES design favors breadth over depth of coverage, this estimate of the prevalence of mosaic variants likely represents an underestimate of the total number of clinically relevant mosaic variants in our cohort.
Megacystis-microcolon-intestinal hypoperistalsis syndrome (MMIHS) is a rare disorder of enteric smooth muscle function affecting the intestine and bladder. Patients with this severe phenotype are dependent on total parenteral nutrition and urinary catheterization. The cause of this syndrome has remained a mystery since Berdon's initial description in 1976. No genes have been clearly linked to MMIHS. We used whole-exome sequencing for gene discovery followed by targeted Sanger sequencing in a cohort of patients with MMIHS and intestinal pseudo-obstruction. We identified heterozygous ACTG2 missense variants in 15 unrelated subjects, ten being apparent de novo mutations. Ten unique variants were detected, of which six affected CpG dinucleotides and resulted in missense mutations at arginine residues, perhaps related to biased usage of CpG containing codons within actin genes. We also found some of the same heterozygous mutations that we observed as apparent de novo mutations in MMIHS segregating in families with intestinal pseudo-obstruction, suggesting that ACTG2 is responsible for a spectrum of smooth muscle disease. ACTG2 encodes γ2 enteric actin and is the first gene to be clearly associated with MMIHS, suggesting an important role for contractile proteins in enteric smooth muscle disease.
Alveolar capillary dysplasia with misalignment of pulmonary veins (ACDMPV) is a rare lethal congenital lung disorder in neonates characterized by severe progressive respiratory failure and refractory pulmonary hypertension, resulting from underdevelopment of the peripheral pulmonary tree. Causative heterozygous single nucleotide variants (SNVs) or copy-number variant (CNV) deletions involving FOXF1 or its distant lung-specific enhancer on chromosome 16q24.1 have been identified in 80-90% of ACDMPV patients. FOXF1 maps closely to and regulates the oppositely oriented FENDRR, with which it also shares regulatory elements.
To better understand the transcriptional networks downstream of FOXF1 that are relevant for lung organogenesis, using RNA-seq, we have examined lung transcriptomes in 12 histopathologically verified ACDMPV patients with or without pathogenic variants in the FOXF1 locus and analyzed gene expression profile in FENDRR-depleted fetal lung fibroblasts, IMR-90.
RNA-seq analyses in ACDMPV neonates revealed changes in the expression of several genes, including semaphorins (SEMAs), neuropilin 1 (NRP1), and plexins (PLXNs), essential for both epithelial branching and vascular patterning. In addition, we have found deregulation of the vascular endothelial growth factor (VEGF) signaling that also controls pulmonary vasculogenesis and a lung-specific endothelial gene TMEM100 known to be essential in vascular morphogenesis. Interestingly, we have observed a substantial difference in gene expression profiles between the ACDMPV samples with different types of FOXF1 defect. Moreover, partial overlap between transcriptome profiles of ACDMPV lungs with FOXF1 SNVs and FENDRR-depleted IMR-90 cells suggests contribution of FENDRR to ACDMPV etiology.
Our transcriptomic data imply potential crosstalk between several lung developmental pathways, including interactions between FOXF1-SHH and SEMA-NRP or VEGF/VEGFR2 signaling, and provide further insight into complexity of lung organogenesis in humans.
Due to the limitations of the current routine diagnostic methods, low-level somatic mosaicism with variant allele fraction (VAF) < 10% is often undetected in clinical settings. To date, only a few studies have attempted to analyze tissue distribution of low-level parental mosaicism in a large clinical exome sequencing (ES) cohort.
Using a customized bioinformatics pipeline, we analyzed apparent de novo single-nucleotide variants or indels identified in the affected probands in ES trio data at Baylor Genetics clinical laboratories. Clinically relevant variants with VAFs between 30 and 70% in probands and lower than 10% in one parent were studied. DNA samples extracted from saliva, buccal cells, redrawn peripheral blood, urine, hair follicles, and nail, representing all three germ layers, were tested using PCR amplicon next-generation sequencing (amplicon NGS) and droplet digital PCR (ddPCR).
In a cohort of 592 clinical ES trios, we found 61 trios, each with one parent suspected of low-level mosaicism. In 21 parents, the variants were validated using amplicon NGS and seven of them by ddPCR in peripheral blood DNA samples. The parental VAFs in blood samples varied between 0.08% and 9%. The distribution of VAFs in additional tissues ranged from 0.03% in hair follicles to 9% in redrawn peripheral blood.
Our study illustrates the importance of analyzing ES data using sensitive computational and molecular methods for low-level parental somatic mosaicism for clinically relevant variants previously diagnosed in routine clinical diagnostics as apparent de novo.
The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ~1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR-mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease.
Application of whole genome sequencing (WGS) enables identification of non-coding variants that play a phenotype-modifying role and are undetectable by exome sequencing. Recently, non-coding regulatory single nucleotide variants (SNVs) have been reported in patients with lethal lung developmental disorders (LLDDs) or congenital scoliosis with recurrent copy-number variant (CNV) deletions at 17q23.1q23.2 or 16p11.2, respectively.
Here, we report a deceased newborn with pulmonary hypertension and pulmonary interstitial emphysema with features suggestive of pulmonary hypoplasia, resulting in respiratory failure and neonatal death soon after birth. Using array comparative genomic hybridization and WGS, we identified two heterozygous recurrent CNV deletions, ~ 2.2 Mb on 17q23.1q23.2 involving TBX4 and ~ 600 kb on 16p11.2 involving TBX6, both of which arose de novo on maternal chromosomes. In the predicted lung-specific enhancer upstream of TBX4, we detected seven novel putative regulatory non-coding SNVs that were absent in 13 control individuals with the overlapping deletions but without any structural lung anomalies.
Our findings further support a recently reported model of complex compound inheritance of LLDD in which both non-coding and coding heterozygous TBX4 variants contribute to the lung phenotype. In addition, this is the first report of a patient with combined de novo heterozygous recurrent 17q23.1q23.2 and 16p11.2 CNV deletions.
ABSTRACT
Inverse paralogous low‐copy repeats (IP‐LCRs) can cause genome instability by nonallelic homologous recombination (NAHR)‐mediated balanced inversions. When disrupting a dosage‐sensitive gene(s), balanced inversions can lead to abnormal phenotypes. We delineated the genome‐wide distribution of IP‐LCRs >1 kb in size with >95% sequence identity and mapped the genes, potentially intersected by an inversion, that overlap at least one of the IP‐LCRs. Remarkably, our results show that 12.0% of the human genome is potentially susceptible to such inversions and 942 genes, 99 of which are on the X chromosome, are predicted to be disrupted secondary to such an inversion! In addition, IP‐LCRs larger than 800 bp with at least 98% sequence identity (duplication/triplication facilitating IP‐LCRs, DTIP‐LCRs) were recently implicated in the formation of complex genomic rearrangements with a duplication‐inverted triplication–duplication (DUP‐TRP/INV‐DUP) structure by a replication‐based mechanism involving a template switch between such inverted repeats. We identified 1,551 DTIP‐LCRs that could facilitate DUP‐TRP/INV‐DUP formation. Remarkably, 1,445 disease‐associated genes are at risk of undergoing copy‐number gain as they map to genomic intervals susceptible to the formation of DUP‐TRP/INV‐DUP complex rearrangements. We implicate inverted LCRs as a human genome architectural feature that could potentially be responsible for genomic instability associated with many human disease traits.
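The IP‐LCR filtering and gene mapping described above can be sketched in a few lines. This is an illustrative example rather than the authors' method: the record layout, the half-open coordinates, the restriction to repeat copies on the same chromosome, and the gene names (GENE_A, GENE_B, GENE_C) are assumptions, while the >1 kb and >95% identity thresholds come from the text.

```python
from typing import NamedTuple


class LcrPair(NamedTuple):
    """A pair of paralogous repeat copies on one chromosome (assumed layout)."""
    chrom: str
    start1: int
    end1: int
    start2: int
    end2: int
    inverted: bool
    identity: float  # fraction between 0 and 1


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


def is_ip_lcr(pair, min_size=1000, min_identity=0.95):
    """Inverted pair with both copies >1 kb and sequence identity >95%."""
    size1 = pair.end1 - pair.start1
    size2 = pair.end2 - pair.start2
    return (
        pair.inverted
        and min(size1, size2) > min_size
        and pair.identity > min_identity
    )


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def genes_at_risk(pairs, genes):
    """Names of genes overlapping at least one copy of a qualifying IP-LCR."""
    at_risk = set()
    for pair in filter(is_ip_lcr, pairs):
        for gene in genes:
            if gene.chrom != pair.chrom:
                continue
            if overlaps(gene.start, gene.end, pair.start1, pair.end1) or overlaps(
                gene.start, gene.end, pair.start2, pair.end2
            ):
                at_risk.add(gene.name)
    return sorted(at_risk)


# Hypothetical repeat pairs and genes.
pairs = [
    LcrPair("chr1", 1000, 3500, 90000, 92500, True, 0.97),       # qualifies
    LcrPair("chr1", 200000, 200500, 300000, 300500, True, 0.99),  # too short
    LcrPair("chr2", 5000, 9000, 50000, 54000, False, 0.98),       # direct repeat
]
genes = [
    Gene("GENE_A", "chr1", 3000, 8000),
    Gene("GENE_B", "chr1", 200100, 200400),
    Gene("GENE_C", "chr2", 6000, 7000),
]
result = genes_at_risk(pairs, genes)
```

Swapping the defaults for a minimum size of 800 bp and a minimum identity of 98% would approximate the DTIP‐LCR criteria, although the text specifies "at least 98%", so the identity comparison would then need to be inclusive.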