De novo mutations (DNMs) cause a large proportion of severe rare diseases of childhood. DNMs that occur early may result in mosaicism of both somatic and germ cells. Such early mutations can cause recurrence of disease. We scanned 1,007 sibling pairs from 251 families and identified 878 DNMs shared by siblings (ssDNMs) at 448 genomic sites. We estimated DNM recurrence probability based on parental mosaicism, sharing of DNMs among siblings, parent-of-origin, mutation type and genomic position. We detected 57.2% of ssDNMs in the parental blood. The recurrence probability of a DNM decreases by 2.27% per year for paternal DNMs and 1.78% per year for maternal DNMs. Maternal ssDNMs are more likely to be T>C mutations than paternal ssDNMs, and less likely to be C>T mutations. Depending on the properties of the DNM, the recurrence probability ranges from 0.011% to 28.5%. We have launched an online calculator to allow estimation of DNM recurrence probability for research purposes.
Mycobacterium tuberculosis infections cause 9 million new tuberculosis cases and 1.5 million deaths annually. To identify variants conferring risk of tuberculosis, we tested 28.3 million variants identified through whole-genome sequencing of 2,636 Icelanders for association with tuberculosis (8,162 cases and 277,643 controls), pulmonary tuberculosis (PTB) and M. tuberculosis infection. We found association of three variants in the region harboring genes encoding the class II human leukocyte antigens (HLAs): rs557011T (minor allele frequency (MAF) = 40.2%), associated with M. tuberculosis infection (odds ratio (OR) = 1.14, P = 3.1 × 10^-13) and PTB (OR = 1.25, P = 5.8 × 10^-12), and rs9271378G (MAF = 32.5%), associated with PTB (OR = 0.78, P = 2.5 × 10^-12)--both located between HLA-DQA1 and HLA-DRB1--and a missense variant encoding p.Ala210Thr in HLA-DQA1 (MAF = 19.1%, rs9272785), associated with M. tuberculosis infection (OR = 1.14, P = 9.3 × 10^-9). We replicated association of these variants with PTB in samples of European ancestry from Russia and Croatia (P < 5.9 × 10^-4). These findings show that the HLA class II region contributes to genetic risk of tuberculosis, possibly through reduced presentation of protective M. tuberculosis antigens to T cells.
Marfan syndrome (MFS) is an autosomal dominant condition characterized by aortic aneurysm, skeletal abnormalities, and lens dislocation, and is caused by variants in the FBN1 gene. To explore causes of MFS and the prevalence of the disease in Iceland, we collected information from all living individuals with a clinical diagnosis of MFS in Iceland (n = 32) and performed whole-genome sequencing of those who did not have a confirmed genetic diagnosis (27/32). Moreover, to assess potential underdiagnosis of MFS in Iceland, we attempted a genotype-based approach to identify individuals with MFS. We interrogated deCODE genetics' database of 35,712 whole-genome sequenced individuals to search for rare sequence variants in FBN1. Overall, we identified 15 pathogenic or likely pathogenic variants in FBN1 in 44 individuals, only 22 of whom were previously diagnosed with MFS. The most common of these variants, NM_000138.4:c.8038C>T p.(Arg2680Cys), is present in a multi-generational pedigree and was found to stem from a single forefather born around 1840. The p.(Arg2680Cys) variant associates with a form of MFS that seems to be enriched for abdominal aortic aneurysm, suggesting that this may be a particularly common feature of p.(Arg2680Cys)-associated MFS. Based on these combined genetic and clinical data, we show that MFS prevalence in Iceland could be as high as 1/6,600, compared to 1/10,000 based on clinical diagnosis alone, which indicates underdiagnosis of this actionable genetic disorder.
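The gap between the two prevalence figures can be made concrete with a little arithmetic. The sketch below uses only numbers quoted in the abstract (1/10,000, 1/6,600, and 22 of 44 carriers previously diagnosed); the variable names are illustrative, and the calculation is not the authors' estimation procedure.

```python
from fractions import Fraction

# Prevalence figures quoted in the abstract.
clinical_prevalence = Fraction(1, 10_000)   # clinical diagnosis alone
combined_prevalence = Fraction(1, 6_600)    # genetic and clinical data combined

# How many times higher the combined estimate is than the clinical one.
ratio = combined_prevalence / clinical_prevalence

# Share of the combined estimate that clinical diagnosis alone would miss.
missed_share = 1 - clinical_prevalence / combined_prevalence

# Share of FBN1 variant carriers who already had an MFS diagnosis.
carriers, previously_diagnosed = 44, 22
diagnosed_share = Fraction(previously_diagnosed, carriers)

print(f"combined / clinical prevalence: {float(ratio):.2f}")
print(f"share missed by clinical diagnosis: {float(missed_share):.0%}")
print(f"carriers previously diagnosed: {float(diagnosed_share):.0%}")
```

Under these figures the combined estimate is roughly one and a half times the clinical one, which is consistent with half of the identified carriers lacking a prior diagnosis.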
Understanding of sequence diversity is the cornerstone of analysis of genetic disorders, population genetics, and evolutionary biology. Here, we present an update of our sequencing set to 15,220 Icelanders whom we sequenced to an average genome-wide coverage of 34X. We identified 39,020,168 autosomal variants passing GATK filters: 31,079,378 SNPs and 7,940,790 indels. Calling de novo mutations (DNMs) is a formidable challenge given the high false positive rate in sequencing datasets relative to the mutation rate. We addressed this issue by using segregation of alleles in three-generation families. Using this transmission assay, we controlled the false positive rate and identified 108,778 high-quality DNMs. Furthermore, we used our extended family structure and read pair tracing of DNMs to a panel of phased SNPs to determine the parent of origin of 42,961 DNMs.
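As a quick consistency check on the counts above, the following sketch confirms that the SNP and indel counts sum to the reported total and derives two proportions. All numbers come from the abstract; the variable names are illustrative only.

```python
# Counts quoted in the abstract.
snps = 31_079_378
indels = 7_940_790
total_variants = 39_020_168

dnms = 108_778            # high-quality de novo mutations
dnms_with_parent = 42_961 # DNMs with parent of origin determined

# The two variant classes should account for every variant reported.
counts_consistent = (snps + indels == total_variants)

# Proportion of variants that are SNPs, and of DNMs that were phased.
snp_fraction = snps / total_variants
phased_fraction = dnms_with_parent / dnms

print(f"SNPs + indels match total: {counts_consistent}")
print(f"SNP share of variants: {snp_fraction:.1%}")
print(f"DNMs with parent of origin: {phased_fraction:.1%}")
```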
In this paper the effect of SNPs on expression levels in Nimblegen RNA expression microarrays is investigated. A vast number of replicates of probe pairs representing both alleles of SNPs on 14 loci allows accurate estimation of the difference in signal intensities both within and between probe pairs. The majority of probe pairs with sufficiently high expression have significant differences in expression levels within the pair, and the difference shows concordance with the genotype of the samples. With two or more replicates of each probe, the allele-to-allele variance dominates the error in estimating the difference within the probe pair; ten replicates are needed for adequate power in calling a true difference within a single probe pair. Using the expression level of the probe within the probe pair that has the higher value gives more accurate estimates. When using probes at loci containing known SNPs, one should use probes containing both alleles of the SNP.
In this paper the TileShuffle method is evaluated as a search method for candidate lncRNAs at 8q24.2. The method is run on three microarrays, all of which contained the same sample and repeated copies of tiled probes. This allows the coherence of the selection method within and between microarrays to be estimated by Monte Carlo simulations on the repeated probes.
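One way such a coherence estimate could be set up is sketched below. This is a minimal illustration under stated assumptions, not the paper's procedure: the `select` function is a hypothetical threshold rule standing in for TileShuffle, the Jaccard index is an assumed coherence measure, and the data layout (a dictionary of replicate signals per probe) is invented for the example.

```python
import random

def jaccard(a, b):
    """Overlap of two selected probe sets; 1.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def select(signal_by_probe, threshold):
    # Hypothetical stand-in for the selection step; this is NOT TileShuffle.
    return {p for p, s in signal_by_probe.items() if s > threshold}

def coherence(replicates, threshold, n_draws=1000, seed=0):
    """Monte Carlo estimate of selection coherence within one array.

    replicates: dict mapping probe id -> list of signals from repeated copies.
    Each draw picks one random copy per probe, twice, runs the selection on
    both pseudo-arrays, and records how much the two selections overlap.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        draw_a = {p: rng.choice(v) for p, v in replicates.items()}
        draw_b = {p: rng.choice(v) for p, v in replicates.items()}
        total += jaccard(select(draw_a, threshold), select(draw_b, threshold))
    return total / n_draws

# Toy data (assumed): three probes with two repeated copies each.
toy = {"p1": [5.0, 5.2], "p2": [1.0, 0.8], "p3": [2.9, 3.4]}
print(coherence(toy, threshold=3.0))
```

Coherence between arrays could be estimated the same way by drawing the two pseudo-arrays from the repeated probes of different microarrays rather than from the same one.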