Significance: Whole-exome sequencing (WES) is gradually being optimized to identify mutations in increasing proportions of the protein-coding exome, but whole-genome sequencing (WGS) is becoming an attractive alternative. WGS is currently more expensive than WES, but its cost should decrease more rapidly than that of WES. We compared WES and WGS on six unrelated individuals. The distribution of quality parameters for single-nucleotide variants (SNVs) and insertions/deletions (indels) was more uniform for WGS than for WES. The vast majority of SNVs and indels were identified by both techniques, but an estimated 650 high-quality coding SNVs (∼3% of coding variants) were detected by WGS and missed by WES. WGS is therefore slightly more efficient than WES for detecting mutations in the targeted exome.
We compared whole-exome sequencing (WES) and whole-genome sequencing (WGS) in six unrelated individuals. In the regions targeted by WES capture (81.5% of the consensus coding genome), the mean numbers of single-nucleotide variants (SNVs) and small insertions/deletions (indels) detected per sample were 84,192 and 13,325, respectively, for WES, and 84,968 and 12,702, respectively, for WGS. For both SNVs and indels, the distributions of coverage depth, genotype quality, and minor read ratio were more uniform for WGS than for WES. After filtering, a mean of 74,398 (95.3%) high-quality (HQ) SNVs and 9,033 (70.6%) HQ indels were called by both platforms. A mean of 105 coding HQ SNVs and 32 indels was identified exclusively by WES whereas 692 HQ SNVs and 105 indels were identified exclusively by WGS. We Sanger-sequenced a random selection of these exclusive variants. For SNVs, the proportion of false-positive variants was higher for WES (78%) than for WGS (17%). The estimated mean number of real coding SNVs (656 variants, ∼3% of all coding HQ SNVs) identified by WGS and missed by WES was greater than the number of SNVs identified by WES and missed by WGS (26 variants). For indels, the proportions of false-positive variants were similar for WES (44%) and WGS (46%). Finally, WES was not reliable for the detection of copy-number variations, almost all of which extended beyond the targeted regions. Although currently more expensive, WGS is more powerful than WES for detecting potential disease-causing mutations within WES regions, particularly those due to SNVs.
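The platform comparison described above comes down to set operations on variant calls. The following is a minimal sketch under stated assumptions: the `Variant` key, the `compare_callsets` helper, and the toy calls are all hypothetical and are not the study's pipeline or data. The shared fraction is computed over the union of both call sets, which is only one possible definition and may differ from the one behind the percentages reported in the abstract.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical key identifying a call: chromosome, position, alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def compare_callsets(wes, wgs):
    """Count calls shared by both platforms and exclusive to each one."""
    shared = wes & wgs
    union = wes | wgs
    return {
        "shared": len(shared),
        "wes_only": len(wes - wgs),
        "wgs_only": len(wgs - wes),
        # Assumption: concordance is measured against the union of calls.
        "shared_fraction": len(shared) / len(union) if union else 0.0,
    }


# Toy data for illustration only (not taken from the study).
wes_calls = {
    Variant("1", 100, "A", "G"),
    Variant("1", 250, "C", "T"),
    Variant("2", 40, "G", "A"),
}
wgs_calls = {
    Variant("1", 100, "A", "G"),
    Variant("1", 250, "C", "T"),
    Variant("2", 90, "T", "C"),
    Variant("3", 7, "A", "AT"),
}

summary = compare_callsets(wes_calls, wgs_calls)
print(summary)
```

In practice, the exclusive sets (`wes_only`, `wgs_only`) are the ones that would be sampled for orthogonal validation, as the abstract describes for Sanger sequencing.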
Human genes governing innate immunity provide a valuable tool for the study of the selective pressure imposed by microorganisms on host genomes. A comprehensive, genome-wide study of how selective constraints and adaptations have driven the evolution of innate immunity genes is missing. Using full-genome sequence variation from the 1000 Genomes Project, we first show that innate immunity genes have globally evolved under stronger purifying selection than the remainder of protein-coding genes. We identify a gene set under the strongest selective constraints, mutations in which are likely to predispose individuals to life-threatening disease, as illustrated by STAT1 and TRAF3. We then evaluate the occurrence of local adaptation and detect 57 high-scoring signals of positive selection at innate immunity genes, variation in which has been associated with susceptibility to common infectious or autoimmune diseases. Furthermore, we show that most adaptations targeting coding variation have occurred in the last 6,000–13,000 years, the period during which populations shifted from hunting and gathering to farming. Finally, we show that innate immunity genes present higher Neandertal introgression than the remainder of the coding genome. Notably, among the genes presenting the highest Neandertal ancestry, we find the TLR6-TLR1-TLR10 cluster, which also contains functional adaptive variation in Europeans. This study identifies highly constrained genes that fulfill essential, non-redundant functions in host survival and reveals others that are more permissive to change (containing variation acquired from archaic hominins or adaptive variants in specific populations), improving our understanding of the relative biological importance of innate immunity pathways in natural conditions.
Whole exome sequencing (WES) has proven an effective tool for the discovery of genetic defects in patients with primary immunodeficiencies (PIDs). However, success in dissecting the genetic etiology of common variable immunodeficiency (CVID) has been limited. We outline a practical framework for using WES to identify causative genetic defects in these subjects. WES was performed on 50 subjects diagnosed with CVID who met at least one of the following criteria: early onset, autoimmune/inflammatory manifestations, low B lymphocytes, and/or familial history of hypogammaglobulinemia. Following alignment and variant calling, exomes were screened for mutations in 269 PID-causing genes. Variants were filtered based on the mode of inheritance and reported frequency in the general population. Each variant was assessed by study of familial segregation and computational predictions of deleteriousness. Out of 433 variations in PID-associated genes, we identified 17 probable disease-causing mutations in 15 patients (30%). These variations were rare or private and included monoallelic mutations in NFKB1, STAT3, CTLA4, PIK3CD, and IKZF1, and biallelic mutations in LRBA and STXBP2. Forty-two other damaging variants were found but were not considered likely disease-causing based on the mode of inheritance and/or patient phenotype. WES combined with analysis of PID-associated genes is a cost-effective approach to identify disease-causing mutations in CVID patients with severe phenotypes and was successful in 30% of our cohort. As targeted therapeutics are becoming the mainstay of treatment for non-infectious manifestations in CVID, this approach will improve management of patients with more severe phenotypes.
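The filtering logic described above (gene panel, population frequency, predicted deleteriousness, mode of inheritance) can be sketched in a few lines. This is an illustrative sketch only, not the authors' pipeline. The `Call` record, the `candidate_calls` helper, the 1% frequency threshold, and the toy calls are assumptions. The panel is a four-gene excerpt whose monoallelic (AD) or biallelic (AR) labels follow the abstract. Two heterozygous calls in the same recessive gene are treated as possibly compound heterozygous, which is a simplification because phase is not checked.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Call:
    """Hypothetical per-patient variant record."""
    gene: str
    zygosity: str            # "het" or "hom"
    pop_freq: float          # reported frequency in the general population
    predicted_damaging: bool


# Excerpt of a gene panel; AD = monoallelic, AR = biallelic (per the abstract).
INHERITANCE = {"NFKB1": "AD", "CTLA4": "AD", "LRBA": "AR", "STXBP2": "AR"}
MAX_FREQ = 0.01  # assumed threshold, not stated in the source


def candidate_calls(calls, max_freq=MAX_FREQ):
    """Keep rare, predicted-damaging panel calls that fit the inheritance mode."""
    kept = [
        c for c in calls
        if c.gene in INHERITANCE and c.pop_freq <= max_freq and c.predicted_damaging
    ]
    per_gene = Counter(c.gene for c in kept)
    result = []
    for c in kept:
        if INHERITANCE[c.gene] == "AD":
            result.append(c)
        elif c.zygosity == "hom" or per_gene[c.gene] >= 2:
            # Assumption: two het calls in one AR gene may be compound het.
            result.append(c)
    return result


# Toy calls for illustration only.
calls = [
    Call("NFKB1", "het", 0.0, True),     # kept: rare het in an AD gene
    Call("LRBA", "het", 0.0001, True),   # dropped: single het in an AR gene
    Call("CTLA4", "het", 0.05, True),    # dropped: too common
    Call("TTN", "het", 0.0, True),       # dropped: not in the panel
    Call("STXBP2", "hom", 0.0, True),    # kept: homozygous in an AR gene
]
print([c.gene for c in candidate_calls(calls)])
```

Calls that survive such a filter are still only candidates; the abstract additionally relies on familial segregation and patient phenotype before calling a variant probably disease-causing.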
Network biology has the capability to integrate, represent, interpret, and model complex biological systems by collectively accommodating biological omics data, biological interactions and associations, graph theory, statistical measures, and visualizations. Biological networks have recently been shown to be very useful for studies that decipher biological mechanisms and disease etiologies and for studies that predict therapeutic responses, at both the molecular and system levels. In this review, we briefly summarize the general framework of biological network studies, including data resources, network construction methods, statistical measures, network topological properties, and visualization tools. We also introduce several recent biological network applications and methods for the studies of rare diseases.
Brown adipose tissue (BAT) is a promising therapeutic target against obesity. Therefore, research on the genetic architecture of BAT could be key for the development of successful therapies against this complex phenotype. Hypothesis-driven candidate gene association studies are useful for studying genetic determinants of complex traits, but they depend on previous knowledge to select candidate genes. Here, we predicted 107 novel BAT candidate genes in silico using uncoupling protein 1 (UCP1) as the hallmark of BAT activity. We first identified the top 1% of human genes predicted by the human gene connectome to be biologically closest to UCP1, estimating 167 additional pathway genes (BAT connectome). We validated this prediction by showing that 60 genes already associated with BAT were included in the connectome and that they were biologically closer to each other than expected by chance (p < 2.2 × 10^…). The remaining 107 genes are potential candidates for BAT, being also closer to known BAT genes and more highly expressed in BAT biopsies than expected by chance (p < 2.2 × 10^…; p = 4.39 × 10^…). The resulting new list of predicted human BAT genes should be useful for the discovery of novel BAT genes and metabolic pathways.
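The selection step described above (rank all genes by biological distance to a core gene, keep the top 1%, then separate known from novel genes) can be illustrated with a short sketch. Everything here is hypothetical: the `closest_percent` helper, the gene names, and the distances are toy stand-ins and do not reproduce the human gene connectome or its distance metric.

```python
def closest_percent(distances, top_percent=1):
    """Return the genes in the closest `top_percent` percent by distance."""
    ranked = sorted(distances, key=distances.get)
    # Integer arithmetic avoids floating-point rounding; keep at least one gene.
    n_keep = max(1, len(ranked) * top_percent // 100)
    return ranked[:n_keep]


# Toy distances to a core gene: 200 hypothetical genes, smaller = closer.
distances = {f"GENE{i}": float(i) for i in range(1, 201)}

connectome = closest_percent(distances, top_percent=1)

# Hypothetical set of genes already associated with the trait.
known = {"GENE2", "GENE150"}
already_known = [g for g in connectome if g in known]
novel = [g for g in connectome if g not in known]

print(connectome, already_known, novel)
```

The same bookkeeping applies to the figures in the abstract: 167 connectome genes minus the 60 already associated with BAT leaves the 107 novel candidates.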
Over the last decade, next-generation sequencing (NGS) has been extensively used to identify new pathogenic mutations and genes causing rare genetic diseases. The efficient analysis of NGS data is not trivial and requires a technically and biologically rigorous pipeline that addresses data quality control, accurate variant filtration to minimize false positives and false negatives, and prioritization of the remaining genes based on disease genomics and physiological knowledge. This review provides a pipeline including all these steps, describes popular software for each step of the analysis, and proposes a general framework for the identification of causal mutations and genes in individual patients with rare genetic diseases.
Background: Myocarditis is inflammation of the heart muscle that can follow various viral infections. Why children only rarely develop life-threatening acute viral myocarditis (AVM), given that the causal viral infections are common, is unknown. Genetic lesions might underlie such susceptibilities. Mouse genetic studies demonstrated that interferon (IFN)-α/β immunity defects increased susceptibility to virus-induced myocarditis. Moreover, variations in human TLR3, a potent inducer of IFNs, were proposed to underlie AVM. Objectives: This study sought to evaluate the hypothesis that human genetic factors may underlie AVM in previously healthy children. Methods: We tested the role of TLR3-IFN immunity using human induced pluripotent stem cell-derived cardiomyocytes. We then performed whole-exome sequencing of 42 unrelated children with acute myocarditis (AM), some with proven viral causes. Results: We found that TLR3- and STAT1-deficient cardiomyocytes were not more susceptible to Coxsackie virus B3 (CVB3) infection than control cells. Moreover, CVB3 did not induce IFN-α/β and IFN-α/β-stimulated genes in control cardiomyocytes. Finally, exogenous IFN-α did not substantially protect cardiomyocytes against CVB3. We did not observe a significant enrichment of rare variations in TLR3- or IFN-α/β-related genes. Surprisingly, we found that homozygous but not heterozygous rare variants in genes associated with inherited cardiomyopathies were significantly enriched in AM-AVM patients compared with healthy individuals (p = 2.22E-03) or patients with other diseases (p = 1.08E-04). Seven of 42 patients (16.7%) carried rare biallelic (homozygous or compound heterozygous) nonsynonymous or splice-site variations in 6 cardiomyopathy-associated genes (BAG3, DSP, PKP2, RYR2, SCN5A, or TNNI3). Conclusions: Previously silent recessive defects of the myocardium may predispose to acute heart failure presenting as AM, notably after common viral infections in children.
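An enrichment comparison of the kind reported above can be illustrated with a one-sided Fisher exact test on a 2×2 table of carriers and non-carriers. The abstract does not state which test was used, so this is a stand-in technique, and it is not meant to reproduce the reported p-values. The patient counts (7 carriers of 42) come from the text; the control counts are invented purely for illustration, and `fisher_one_sided` is a hypothetical helper name.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing `a` or more
    carriers in the first row, given fixed row and column totals.
    """
    row1 = a + b          # size of the first group (e.g. patients)
    col1 = a + c          # total number of carriers
    n = a + b + c + d     # total number of individuals
    upper = min(row1, col1)
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, row1)


# Patients: 7 carriers, 35 non-carriers (7 of 42, from the text).
# Controls: 10 carriers, 990 non-carriers (INVENTED for illustration).
p_value = fisher_one_sided(7, 35, 10, 990)
print(f"one-sided p = {p_value:.3g}")
```

Exact integer arithmetic with `math.comb` keeps the sketch dependency-free; for large cohorts a statistics library would be the more practical choice.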
Severe influenza disease strikes otherwise healthy children and remains unexplained. We report compound heterozygous null mutations in IRF7, which encodes the transcription factor interferon regulatory factor 7, in an otherwise healthy child who suffered life-threatening influenza during primary infection. In response to influenza virus, the patient's leukocytes and plasmacytoid dendritic cells produced very little type I and III interferons (IFNs). Moreover, the patient's dermal fibroblasts and induced pluripotent stem cell (iPSC)–derived pulmonary epithelial cells produced reduced amounts of type I IFN and displayed increased influenza virus replication. These findings suggest that IRF7-dependent amplification of type I and III IFNs is required for protection against primary infection by influenza virus in humans. They also show that severe influenza may result from single-gene inborn errors of immunity.
The advent of next-generation sequencing (NGS) in 2010 has transformed medicine, particularly the growing field of inborn errors of immunity. NGS has facilitated the discovery of novel disease-causing genes and the genetic diagnosis of patients with monogenic inborn errors of immunity. Whole-exome sequencing (WES) is presently the most cost-effective approach for research and diagnostics, although whole-genome sequencing offers several advantages. The scientific or diagnostic challenge consists in selecting 1 or 2 candidate variants among thousands of NGS calls. Variant- and gene-level computational methods, as well as immunologic hypotheses, can help narrow down this genome-wide search. The key to success is a well-informed genetic hypothesis on 3 key aspects: mode of inheritance, clinical penetrance, and genetic heterogeneity of the condition. This determines the search strategy and selection criteria for candidate alleles. Subsequent functional validation of the disease-causing effect of the candidate variant is critical. Even the most up-to-date dry lab cannot clinch this validation without a seasoned wet lab. The multifariousness of variations entails an experimental rigor even greater than traditional Sanger sequencing–based approaches in order not to assign a condition to an irrelevant variant. Finding the needle in the haystack takes patience, prudence, and discernment.