Estimating individual genome-wide autozygosity is important both in the identification of recessive disease variants via homozygosity mapping and in the investigation of the effects of genome-wide homozygosity on traits of biomedical importance. Approaches have tended to involve either single-point estimates or rather complex multipoint methods of inferring individual autozygosity, all on the basis of limited marker data. Now, with the availability of high-density genome scans, a multipoint, observational method of estimating individual autozygosity is possible. Using data from a 300,000 SNP panel in 2618 individuals from two isolated and two more-cosmopolitan populations of European origin, we explore the potential of estimating individual autozygosity from data on runs of homozygosity (ROHs). Termed F_roh, this is defined as the proportion of the autosomal genome in runs of homozygosity above a specified length. Mean F_roh distinguishes clearly between subpopulations classified in terms of grandparental endogamy and population size. With the use of good pedigree data for one of the populations (Orkney), F_roh was found to correlate strongly with the inbreeding coefficient estimated from pedigrees (r = 0.86). Using pedigrees to identify individuals with no shared maternal and paternal ancestors in five, and probably at least ten, generations, we show that ROHs measuring up to 4 Mb are common in demonstrably outbred individuals. Given the stochastic variation in ROH number, length, and location and the fact that ROHs are important whether ancient or recent in origin, approaches such as this will provide a more useful description of genomic autozygosity than has hitherto been possible.
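The definition of F_roh given above can be illustrated with a minimal sketch. It assumes ROH calls are already available as non-overlapping (start, end) intervals in base pairs; the function name, the default length threshold, and the example segments are hypothetical and are not taken from the study.

```python
def f_roh(roh_segments, autosome_length_bp, min_length_bp=1_500_000):
    """Proportion of the autosomal genome lying in ROHs longer than min_length_bp.

    roh_segments: iterable of (start_bp, end_bp) pairs, assumed non-overlapping.
    autosome_length_bp: total autosomal length covered by the marker panel.
    min_length_bp: length threshold (a hypothetical default, chosen for illustration).
    """
    covered = sum(end - start
                  for start, end in roh_segments
                  if end - start > min_length_bp)
    return covered / autosome_length_bp


# Hypothetical ROH calls for one individual on a toy 100 Mb genome.
segments = [(1_000_000, 3_000_000),      # 2 Mb, counted
            (10_000_000, 10_400_000),    # 0.4 Mb, below threshold
            (50_000_000, 55_000_000)]    # 5 Mb, counted
print(f_roh(segments, autosome_length_bp=100_000_000))  # 0.07
```

Raising the threshold restricts the estimate to longer runs, which matters here because runs of up to 4 Mb are reported as common even in outbred individuals.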
Placental growth factor (PlGF) is a member of the vascular endothelial growth factor family and is involved in bone marrow-derived cell activation, endothelial stimulation and pathological angiogenesis. High levels of PlGF have been observed in several pathological conditions, especially in cancer, cardiovascular, autoimmune and inflammatory diseases. Little is known about the genetics of circulating PlGF levels. Indeed, although the heritability of circulating PlGF levels is around 40%, no studies have assessed the relation between PlGF plasma levels and genetic variants at a genome-wide level. In the current study, PlGF plasma levels were measured in a population-based sample of 2085 adult individuals from three isolated populations of South Italy. A GWAS was performed in a discovery cohort (N = 1600), followed by a de novo replication (N = 468) from the same populations. The meta-analysis of the discovery and replication samples revealed one signal significantly associated with PlGF circulating levels. This signal was mapped to the PlGF co-receptor coding gene NRP1, indicating its important role in modulating the PlGF plasma levels. Two additional signals, at the PlGF receptor coding gene FLT1 and RAPGEF5 gene, were identified at a suggestive level. Pathway and TWAS analyses highlighted genes known to be involved in angiogenesis and immune response, supporting the link between these processes and PlGF regulation. Overall, these data improve our understanding of the genetic variation underlying circulating PlGF levels. This in turn could lead to new preventive and therapeutic strategies for a wide variety of PlGF-related pathologies.
ABSTRACT
In the search for genetic associations with complex traits, population isolates offer the advantage of reduced genetic and environmental heterogeneity. In addition, cost‐efficient next‐generation association approaches have been proposed in these populations where only a subsample of representative individuals is sequenced and then genotypes are imputed into the rest of the population. Gene mapping in such populations thus requires high‐quality genetic imputation and preliminary phasing. To identify an effective study design, we compare by simulation a range of phasing and imputation software and strategies. We simulated 1,115,604 variants on chromosome 10 for 477 members of the large complex pedigree of Campora, a village within the established isolate of Cilento in southern Italy. We assessed the phasing performance of the identity‐by‐descent‐based software ALPHAPHASE and SLRP, the LD‐based software SHAPEIT2, SHAPEIT3, and BEAGLE, and the new software EAGLE that combines both methodologies. For imputation we compared IMPUTE2, IMPUTE4, MINIMAC3, BEAGLE, and the new software PBWT. Genotyping errors and missing genotypes were simulated to observe their effects on the performance of each software. Highly accurate phased data were achieved by all software, with SHAPEIT2, SHAPEIT3, and EAGLE2 providing the most accurate results. MINIMAC3, IMPUTE4, and IMPUTE2 all performed strongly as imputation software, and our study highlights the considerable gain in imputation accuracy provided by a genome sequenced reference panel specific to the population isolate.
Background
Variants of COL4A1/COL4A2 genes have been reported in fetal intracranial hemorrhage (ICH) cases but their prevalence and characteristics have not been established in a large series of fetuses. Fetal neonatal alloimmune thrombocytopenia is a major acquired ICH factor but the prevalence and characteristics of inherited platelet disorder (IPD) gene variants leading to thrombocytopenia are unknown. Herein, we screened COL4A1/COL4A2 and IPD genes in a large series of ICH fetuses.
Methods
A cohort of 194 consecutive ICH fetuses was first screened for COL4A1/COL4A2 variants. We manually curated a list of 64 genes involved in IPD and investigated them in COL4A1/COL4A2 negative fetuses, using exome sequencing data from 101 of these fetuses.
Result
Pathogenic variants of COL4A1/COL4A2 genes were identified in 36 fetuses (19%). They occurred de novo in 70% of the 32 fetuses for whom parental DNA was available. Pathogenic variants in two megakaryopoiesis genes (MPL and MECOM genes) were identified in two families with recurrent and severe fetal ICH, with variable extraneurological pathological features.
Conclusion
Our study emphasizes the genetic heterogeneity of fetal ICH and the need to screen both COL4A1/COL4A2 and IPD genes in the etiological investigation of fetal ICH to allow proper genetic counseling.
Key points
What's already known about this topic?
COL4A1/COL4A2 pathogenic variants have been reported in several fetal intracranial hemorrhage (ICH) case reports but their prevalence and characteristics in a large series of fetal ICH is lacking.
Fetal neonatal alloimmune thrombocytopenia is a well‐known cause of thrombocytopenia and ICH in infants and fetuses but very little is known regarding the role in fetal ICH of variants of inherited platelet disorder genes leading to thrombocytopenia.
What does this study add?
Fetal ICH is a highly heterogeneous condition with COL4A1/COL4A2 pathogenic variants accounting for 19% of cases with a very high de novo rate.
Albeit rare, pathogenic variants of megakaryopoiesis genes are associated with ICH and screening of these genes should be performed in fetal ICH etiological investigation.
In the human genome, about 750 genes contain one intron excised by the minor spliceosome. This spliceosome comprises its own set of snRNAs, including U4atac. Its noncoding gene, RNU4ATAC, has been found mutated in Taybi-Linder (TALS/microcephalic osteodysplastic primordial dwarfism type 1), Roifman (RFMN), and Lowry-Wood (LWS) syndromes. These rare developmental disorders, whose physiopathological mechanisms remain unsolved, associate ante- and post-natal growth retardation, microcephaly, skeletal dysplasia, intellectual disability, retinal dystrophy, and immunodeficiency. Here, we report bi-allelic RNU4ATAC mutations in five patients presenting with traits suggestive of the Joubert syndrome (JBTS), a well-characterized ciliopathy. These patients also present with traits typical of TALS/RFMN/LWS, thus widening the clinical spectrum of RNU4ATAC-associated disorders and indicating ciliary dysfunction as a mechanism downstream of minor splicing defects. Intriguingly, all five patients carry the n.16G>A mutation, in the Stem II domain, either at the homozygous or compound heterozygous state. A gene ontology term enrichment analysis on minor intron-containing genes reveals that the cilium assembly process is over-represented, with no less than 86 cilium-related genes containing at least one minor intron, among which there are 23 ciliopathy-related genes. The link between RNU4ATAC mutations and ciliopathy traits is supported by alterations of primary cilium function in TALS and JBTS-like patient fibroblasts, as well as by a u4atac zebrafish model, which exhibits ciliopathy-related phenotypes and ciliary defects. These phenotypes could be rescued by WT but not by pathogenic variants-carrying human U4atac. Altogether, our data indicate that alteration of cilium biogenesis is part of the physiopathological mechanisms of TALS/RFMN/LWS, secondarily to defects of minor intron splicing.
The 1000 Genomes Project provides a unique source of whole genome sequencing data for studies of human population genetics and human diseases. The last release of this project includes more than 2,500 sequenced individuals from 26 populations. Although relationships among individuals have been investigated in some of the populations, inbreeding has never been studied. In this article, we estimated the genomic inbreeding coefficient of each individual and found an unexpectedly high level of inbreeding in 1000 Genomes data: nearly a quarter of the individuals were inbred and around 4% of them had inbreeding coefficients similar to or greater than the ones expected for first-cousin offspring. Inbred individuals were found in each of the 26 populations, with some populations showing proportions of inbred individuals above 50%. We also detected 227 previously unreported pairs of close relatives (up to and including first-cousins). Thus, we propose subsets of unrelated and outbred individuals, for use by the scientific community. In addition, because admixed populations are present in the 1000 Genomes Project, we performed simulations to study the robustness of inbreeding coefficient estimates in the presence of admixture. We found that our multi-point approach (FSuite) was quite robust to admixture, unlike single-point methods (PLINK).
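For reference, the inbreeding coefficient "expected for first-cousin offspring" mentioned above can be worked out from a pedigree with Wright's path formula. The sketch below computes that pedigree expectation only; it is not the genomic estimator used by FSuite, and the function name and input layout are hypothetical.

```python
from fractions import Fraction


def path_inbreeding(paths):
    """Wright's path formula for the pedigree inbreeding coefficient.

    paths: list of (n_father, n_mother, f_ancestor) tuples, one per common
    ancestor, where n_father and n_mother are the numbers of generations
    separating each parent from that ancestor and f_ancestor is the
    ancestor's own inbreeding coefficient.
    """
    total = Fraction(0)
    for n_father, n_mother, f_ancestor in paths:
        total += Fraction(1, 2) ** (n_father + n_mother + 1) * (1 + f_ancestor)
    return total


# Offspring of first cousins: the parents share two grandparents, each of
# which is two generations above both parents and assumed non-inbred.
first_cousin_offspring = [(2, 2, Fraction(0)), (2, 2, Fraction(0))]
print(path_inbreeding(first_cousin_offspring))  # 1/16
```

Each shared grandparent contributes (1/2)^5 = 1/32, giving 1/16 = 0.0625 in total.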
Biallelic variants in RNU4ATAC, a non-coding gene transcribed into the minor spliceosome component U4atac snRNA, are responsible for three rare recessive developmental diseases, namely Taybi-Linder/MOPD1, Roifman and Lowry-Wood syndromes. Next-generation sequencing of clinically heterogeneous cohorts (children with either a suspected genetic disorder or a congenital microcephaly) recently identified mutations in this gene, illustrating how profoundly these technologies are modifying genetic testing and assessment. As RNU4ATAC has a single non-coding exon, the bioinformatic prediction algorithms assessing the effect of sequence variants on splicing or protein function are irrelevant, which makes variant interpretation challenging for molecular diagnostic laboratories. In order to facilitate and improve clinical diagnostic assessment and genetic counseling, we present i) an update of the previously reported RNU4ATAC mutations and an analysis of the genetic variations affecting this gene using the Genome Aggregation Database (gnomAD) resource; ii) the pathogenicity prediction performances of scores computed based on an RNA structure prediction tool and of those produced by the Combined Annotation Dependent Depletion tool for the 285 RNU4ATAC variants identified in patients or in large-scale sequencing projects; iii) a method, based on a cellular assay, that allows measurement of the effect of RNU4ATAC variants on splicing efficiency of a minor (U12-type) reporter intron. Lastly, the concordance of bioinformatic predictions and cellular assay results was investigated.
Vascular Endothelial Growth Factor (VEGF) is the main player in angiogenesis. Because of its crucial role in this process, the study of the genetic factors controlling VEGF variability may be of particular interest for many angiogenesis-associated diseases. Although some polymorphisms in the VEGF gene have been associated with a susceptibility to several disorders, no genome-wide search on VEGF serum levels has been reported so far. We carried out a genome-wide linkage analysis in three isolated populations and we detected a strong linkage between VEGF serum levels and the 6p21.1 VEGF region in all samples. A new locus on chromosome 3p26.3 significantly linked to VEGF serum levels was also detected in a combined population sample. A sequencing of the gene followed by an association study identified three common single nucleotide polymorphisms (SNPs) influencing VEGF serum levels in one population (Campora), two already reported in the literature (rs3025039, rs25648) and one new signal (rs3025020). A fourth SNP (rs41282644) was found to affect VEGF serum levels in another population (Cardile). All the identified SNPs contribute to the related population linkages (35% of the linkage explained in Campora and 15% in Cardile). Interestingly, none of the SNPs influencing VEGF serum levels in one population was found to be associated in the two other populations. These results allow us to exclude the hypothesis that the common variants located in the exons, intron-exon junctions, promoter and regulative regions of the VEGF gene may have a causal effect on the VEGF variation. The data support the alternative hypothesis of a multiple rare variant model, possibly consisting in distinct variants in different populations, influencing VEGF serum levels.
Mutations in LRRK2 were recently identified in autosomal dominant Parkinson's disease (PD), including the G2019S mutation. To evaluate its frequency, we analyzed 198 probands with autosomal dominant PD, mostly from France and North Africa. Surprisingly, the frequency in North African families (7/17, 41%) was greater than in those from Europe (5/174, 2.9%). The clinical features in 21 patients, including 1 with a homozygous mutation, were those of typical PD, with lower Mini‐Mental State Examination scores. There were also 15 unaffected mutation carriers, aged 32 to 74 years. LRRK2 mutations appear to be a common cause of autosomal dominant PD, particularly in North Africa. Ann Neurol 2005;58:784–787
Many linkage studies are performed in inbred populations, either small isolated populations or large populations with a long tradition of marriages between relatives. In such populations, there exist very complex genealogies with unknown loops. Therefore, the true inbreeding coefficient of an individual is often unknown. Good estimators of the inbreeding coefficient (f) are important, since it has been shown that underestimation of f may lead to false linkage conclusions. When an individual is genotyped for markers spanning the whole genome, it should be possible to use this genomic information to estimate that individual's f. To do so, we propose a maximum-likelihood method that takes marker dependencies into account through a hidden Markov model. This methodology also allows us to infer the full probability distribution of the identity-by-descent (IBD) status of the two alleles of an individual at each marker along the genome (posterior IBD probabilities) and provides a variance for the estimates. We simulate a full genome scan mimicking the true autosomal genome for (1) a first-cousin pedigree and (2) a quadruple-second-cousin pedigree. In both cases, we find that our method accurately estimates f for different marker maps. We also find that the proportion of genome IBD in an individual with a given genealogy is very variable. The approach is illustrated with data from a study of demyelinating autosomal recessive Charcot-Marie-Tooth disease.
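The idea of estimating f by maximum likelihood through a hidden Markov model can be sketched with a simplified two-state chain (IBD versus non-IBD) along a set of biallelic markers. This is an illustrative sketch only, not the authors' implementation: the transition form (governed by an assumed rate parameter a), the emission probabilities, the error rate eps, the grid search, and all names below are assumptions made for the example.

```python
import math


def emission(g, p, ibd, eps=1e-3):
    """P(genotype | IBD state) at a biallelic marker.

    g: count (0, 1 or 2) of the reference allele; p: its frequency.
    eps: assumed small probability of seeing a heterozygote inside an
    IBD segment (genotyping error or mutation).
    """
    if ibd:
        return eps if g == 1 else (1 - eps) * (p if g == 2 else 1 - p)
    return (p * p, 2 * p * (1 - p), (1 - p) ** 2)[2 - g]


def log_likelihood(f, a, genos, freqs, dists):
    """Scaled forward algorithm over states (non-IBD, IBD).

    dists[k] is the genetic distance (Morgans) between markers k and k + 1.
    """
    alpha = [(1 - f) * emission(genos[0], freqs[0], False),
             f * emission(genos[0], freqs[0], True)]
    loglik = 0.0
    for k in range(1, len(genos)):
        scale = alpha[0] + alpha[1]
        loglik += math.log(scale)
        alpha = [x / scale for x in alpha]
        stay = math.exp(-a * dists[k - 1])      # no state change since last marker
        into_ibd_from_non = (1 - stay) * f
        into_ibd_from_ibd = stay + (1 - stay) * f
        non = alpha[0] * (1 - into_ibd_from_non) + alpha[1] * (1 - into_ibd_from_ibd)
        ibd = alpha[0] * into_ibd_from_non + alpha[1] * into_ibd_from_ibd
        alpha = [non * emission(genos[k], freqs[k], False),
                 ibd * emission(genos[k], freqs[k], True)]
    return loglik + math.log(alpha[0] + alpha[1])


def estimate_f(genos, freqs, dists, a=1.0, grid=99):
    """Grid-search maximum-likelihood estimate of f for a fixed, assumed a."""
    candidates = [i / (grid + 1) for i in range(1, grid + 1)]
    return max(candidates, key=lambda f: log_likelihood(f, a, genos, freqs, dists))


# Toy data: 40 markers spaced 1 cM apart, allele frequency 0.5 everywhere.
n = 40
freqs = [0.5] * n
dists = [0.01] * (n - 1)
mixed_genotypes = [1, 2, 1, 0] * 10   # frequent heterozygotes
homozygous_run = [2] * n              # one long run of homozygosity
print(estimate_f(mixed_genotypes, freqs, dists),
      estimate_f(homozygous_run, freqs, dists))
```

On these toy inputs the long homozygous run yields a larger estimate than the heterozygote-rich genotype vector. A fuller treatment would also maximize over a and compute posterior IBD probabilities with a backward pass.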