Mutations in genes encoding subunits of the phagocyte NADPH oxidase complex are recognized to cause chronic granulomatous disease (CGD), a severe primary immunodeficiency. Here we describe how deficiency of CYBC1, a previously uncharacterized protein in humans (C17orf62), leads to reduced expression of NADPH oxidase's main subunit (gp91phox) and results in CGD. Analyzing two brothers diagnosed with CGD, we identify a homozygous loss-of-function mutation, p.Tyr2Ter, in CYBC1. Imputation of p.Tyr2Ter into 155K chip-genotyped Icelanders reveals six additional homozygotes, all with signs of CGD, manifesting as colitis, rare infections, or a severely impaired PMA-induced neutrophil oxidative burst. Homozygosity for p.Tyr2Ter consequently associates with inflammatory bowel disease (IBD) in Iceland (P = 8.3 × 10^[…]; OR = 67.6), as well as reduced height (P = 3.3 × 10^[…]; -8.5 cm). Overall, we find that CYBC1 deficiency results in CGD characterized by colitis and a distinct profile of infections indicative of macrophage dysfunction.
The HapMap Web site at http://www.hapmap.org is the primary portal to genotype data produced as part of the International Haplotype Map Project. In phase I of the project, >1.1 million SNPs were genotyped in 270 individuals from four worldwide populations. The HapMap Web site provides researchers with a number of tools that allow them to analyze the data as well as download data for local analyses. This paper presents step-by-step guides to using those tools, including guides for retrieving genotype and frequency data, picking tag-SNPs for use in association studies, viewing haplotypes graphically, and examining marker-to-marker LD patterns.
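As a rough illustration of what "marker-to-marker LD" means for data downloaded for local analysis, the following Python sketch computes the standard pairwise statistics D, D' and r² for two biallelic markers. It assumes phased haplotypes coded 0/1; the function name and the input layout are hypothetical and are not taken from the HapMap tools themselves.

```python
def pairwise_ld(haplotypes):
    """Return (D, D_prime, r_squared) for two biallelic markers.

    `haplotypes` is a list of (allele_at_marker_1, allele_at_marker_2)
    tuples, one per phased chromosome, with alleles coded 0 or 1.
    This input layout is an assumption made for the example.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")

    # Allele and haplotype frequencies for the '1' alleles.
    p_a = sum(1 for a, _ in haplotypes if a == 1) / n
    p_b = sum(1 for _, b in haplotypes if b == 1) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("LD is undefined when a marker is monomorphic")

    # D: deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b
    r_squared = d * d / denom

    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max

    return d, d_prime, r_squared


# Two markers whose alleles always travel together are in complete LD.
linked = [(1, 1)] * 5 + [(0, 0)] * 5
# Four equally frequent haplotypes correspond to no LD at all.
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
```

A tag-SNP picker would typically use a statistic such as r² to decide which markers are redundant, although the thresholds and algorithms used by the site's own tools are not described here.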
There is a huge demand on bioinformaticians to provide their biologists with user-friendly and scalable software infrastructures to capture, exchange, and exploit the unprecedented amounts of new *omics data. Here we present MOLGENIS, a generic, open source software toolkit to quickly produce the bespoke MOLecular GENetics Information Systems needed.
The MOLGENIS toolkit provides bioinformaticians with a simple language to model biological data structures and user interfaces. At the push of a button, MOLGENIS' generator suite automatically translates these models into a feature-rich, ready-to-use web application including database, user interfaces, exchange formats, and scriptable interfaces. Each generator is a template of SQL, JAVA, R, or HTML code that would require much effort to write by hand. This 'model-driven' method ensures reuse of best practices and improves quality because the modeling language and generators are shared between all MOLGENIS applications, so that errors are found quickly and improvements are shared easily by a re-generation. A plug-in mechanism ensures that both the generator suite and generated product can be customized just as much as hand-written software.
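The model-driven idea can be sketched in a few lines of Python: one declarative model is read once and several small generators each emit a different artifact from it. The XML vocabulary, type mapping, and function names below are invented for illustration and are not the MOLGENIS modeling language or its generator API.

```python
import xml.etree.ElementTree as ET

# Hypothetical model format, made up for this example only.
MODEL_XML = """
<model name="demo">
  <entity name="Sample">
    <field name="id" type="int" key="true"/>
    <field name="label" type="string"/>
  </entity>
  <entity name="Measurement">
    <field name="id" type="int" key="true"/>
    <field name="value" type="decimal"/>
  </entity>
</model>
"""

# Assumed mapping from model types to SQL column types.
SQL_TYPES = {"int": "INTEGER", "string": "VARCHAR(255)", "decimal": "DECIMAL(18,6)"}


def generate_sql(model_xml):
    """Generator 1: emit one CREATE TABLE statement per entity."""
    root = ET.fromstring(model_xml)
    statements = []
    for entity in root.findall("entity"):
        columns = []
        for field in entity.findall("field"):
            column = f'{field.get("name")} {SQL_TYPES[field.get("type")]}'
            if field.get("key") == "true":
                column += " PRIMARY KEY"
            columns.append(column)
        body = ",\n  ".join(columns)
        statements.append(f'CREATE TABLE {entity.get("name")} (\n  {body}\n);')
    return "\n\n".join(statements)


def generate_tsv_headers(model_xml):
    """Generator 2: emit a tab-delimited header line per entity."""
    root = ET.fromstring(model_xml)
    return {
        entity.get("name"): "\t".join(f.get("name") for f in entity.findall("field"))
        for entity in root.findall("entity")
    }


sql = generate_sql(MODEL_XML)
headers = generate_tsv_headers(MODEL_XML)
print(sql)
```

Because both generators read the same model, adding a field to the XML and re-running them updates the database schema and the exchange format together, which is the reuse benefit the paragraph above describes.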
In recent years we have successfully evaluated the MOLGENIS toolkit for the rapid prototyping of many types of biomedical applications, including next-generation sequencing, GWAS, QTL, proteomics and biobanking. Writing 500 lines of model XML typically replaces 15,000 lines of hand-written programming code, which allows for quick adaptation if the information system is not yet to the biologist's satisfaction. Each application generated with MOLGENIS comes with an optimized database back-end, user interfaces for biologists to manage and exploit their data, programming interfaces for bioinformaticians to script analysis tools in R, Java, SOAP, REST/JSON and RDF, a tab-delimited file format to ease upload and exchange of data, and detailed technical documentation. Existing databases can be quickly enhanced with MOLGENIS generated interfaces using the 'ExtractModel' procedure.
The MOLGENIS toolkit provides bioinformaticians with a simple model to quickly generate flexible web platforms for all possible genomic, molecular and phenotypic experiments with a richness of interfaces not provided by other tools. All the software and manuals are available free as LGPLv3 open source at http://www.molgenis.org.
Autoimmune thyroid disease (AITD) is a common autoimmune disease. In a GWAS meta-analysis of 110,945 cases and 1,084,290 controls, 290 sequence variants at 225 loci are associated with AITD. Of these variants, 115 are previously unreported. Multiomics analysis yields 235 candidate genes outside the MHC region and the findings highlight the importance of genes involved in T-cell regulation. A rare 5’-UTR variant (rs781745126-T, MAF = 0.13% in Iceland) in LAG3 has the largest effect (OR = 3.42, P = 2.2 × 10^−16) and generates a novel start codon for an open reading frame upstream of the canonical protein translation initiation site. rs781745126-T reduces mRNA and surface expression of the inhibitory immune checkpoint LAG-3 co-receptor on activated lymphocyte subsets and halves LAG-3 levels in plasma among heterozygotes. All three homozygous carriers of rs781745126-T have AITD, of whom one also has two other T-cell mediated diseases, namely vitiligo and type 1 diabetes. rs781745126-T associates nominally with vitiligo (OR = 5.1, P = 6.5 × 10^−3) but not with type 1 diabetes. Thus, the effect of rs781745126-T is akin to drugs that inhibit LAG-3, which unleash immune responses and can have thyroid dysfunction and vitiligo as adverse events. This illustrates how a multiomics approach can reveal potential drug targets and safety concerns.
The flow of research data concerning the genetic basis of health and disease is rapidly increasing in speed and complexity. In response, many projects are seeking to ensure that there are appropriate informatics tools, systems and databases available to manage and exploit this flood of information. Previous solutions, such as central databases, journal-based publication and manually intensive data curation, are now being enhanced with new systems for federated databases, database publication, and more automated management of data flows and quality control. Along with emerging technologies that enhance connectivity and data retrieval, these advances should help to create a powerful knowledge environment for genotype-phenotype information.
Rare missense mutations in the gene encoding coatomer subunit alpha (COPA) have recently been shown to cause autoimmune interstitial lung, joint and kidney disease, also known as COPA syndrome, under a dominant mode of inheritance.
Here we describe an Icelandic family with three affected individuals over two generations with a rare clinical presentation of lung and joint disease and a histological diagnosis of follicular bronchiolitis. We performed whole-genome sequencing (WGS) of the three affected as well as three unaffected members of the family, and searched for rare genotypes associated with disease using 30,067 sequenced Icelanders as a reference population. We assessed all coding and splicing variants, prioritizing variants in genes known to cause interstitial lung disease. We detected a heterozygous missense mutation, p.Glu241Lys, in the COPA gene, private to the affected family members. The mutation occurred de novo in the paternal germline of the index case and was absent from 30,067 Icelandic genomes and 141,353 individuals from the genome Aggregation Database (gnomAD). The mutation occurs within the conserved and functionally important WD40 domain of the COPA protein.
This is the second report of the p.Glu241Lys mutation in COPA, indicating the recurrent nature of the mutation. The mutation was reported to co-segregate with COPA syndrome in a large family from the USA with five affected members, and classified as pathogenic. The two separate occurrences of the p.Glu241Lys mutation in cases and its absence from a large number of sequenced genomes confirm its role in the pathogenesis of COPA syndrome.
Epileptic encephalopathies are a group of childhood epilepsies that display high phenotypic and genetic heterogeneity. The recent, extensive use of next-generation sequencing has identified a large number of genes in epileptic encephalopathies, including UBA5, in which biallelic mutations were first described as pathogenic in 2016 (Colin E et al., Am J Hum Genet 99(3):695-703, 2016; Muona M et al., Am J Hum Genet 99(3):683-694, 2016). UBA5 encodes an activating enzyme for a post-translational modification mechanism known as ufmylation, and is the first gene from the ufmylation pathway to be linked to disease.
We sequenced the genomes of two sisters with early-onset epileptic encephalopathy along with their unaffected parents in an attempt to find a genetic cause for their condition. The sisters, born in 2004 and 2006, presented with infantile spasms at six months of age, which later progressed to recurrent, treatment-resistant seizures. We detected a compound heterozygous genotype in UBA5 in the sisters, a genotype not seen elsewhere in an Icelandic reference set of 30,067 individuals nor in public databases. One of the mutations, c.684G>A, is a paternally inherited exonic splicing mutation, occurring at the last nucleotide of exon 7 of UBA5. The mutation is predicted to disrupt the splice site, resulting in loss-of-function of one allele of UBA5. The second mutation is a maternally inherited missense mutation, p.Ala371Thr, which was previously reported as pathogenic when in compound heterozygosity with a loss-of-function mutation in UBA5 and is believed to produce a hypomorphic allele. In support of this, we have identified three adult Icelanders homozygous for the p.Ala371Thr mutation who show no signs of neurological disease.
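The inheritance pattern described above (one variant from each parent, both present in each affected child) can be expressed as a simple filter over family genotypes. The Python sketch below is only an illustration of that logic, not the pipeline used in the study: the genotype coding (alternate-allele counts), the data layout, and the function names are assumptions, and the third variant in the example is a made-up control.

```python
def compound_het_pairs(variants, child):
    """Return (paternal_id, maternal_id) pairs consistent with compound
    heterozygosity in `child`, for variants located in the same gene.

    Each variant is a dict with an 'id' and alternate-allele counts (0, 1, 2)
    keyed by family member. Phase is inferred from the parents: a variant is
    called paternal if only the father carries it, maternal if only the mother does.
    """
    paternal = [v["id"] for v in variants
                if v[child] == 1 and v["father"] == 1 and v["mother"] == 0]
    maternal = [v["id"] for v in variants
                if v[child] == 1 and v["mother"] == 1 and v["father"] == 0]
    return [(p, m) for p in paternal for m in maternal]


def shared_compound_het_pairs(variants, children):
    """Keep only the pairs found in every affected child."""
    per_child = [set(compound_het_pairs(variants, c)) for c in children]
    return sorted(set.intersection(*per_child)) if per_child else []


# Genotypes encode what the text states for the two UBA5 variants;
# 'hypothetical_variant' is invented to show a pair being rejected.
uba5_variants = [
    {"id": "c.684G>A", "father": 1, "mother": 0, "sister1": 1, "sister2": 1},
    {"id": "p.Ala371Thr", "father": 0, "mother": 1, "sister1": 1, "sister2": 1},
    {"id": "hypothetical_variant", "father": 1, "mother": 0, "sister1": 1, "sister2": 0},
]

pairs = shared_compound_het_pairs(uba5_variants, ["sister1", "sister2"])
print(pairs)
```

A filter of this kind only proposes candidates; a real analysis would also need the frequency check against a reference population and the functional evidence discussed in the text.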
We describe compound heterozygous mutations in the UBA5 gene in two sisters with early-onset epileptic encephalopathy. To our knowledge, this is the first description of mutations in UBA5 since the initial discovery that pathogenic biallelic variants in the gene cause early-onset epileptic encephalopathy. We further provide confirmatory evidence that p.Ala371Thr is a hypomorphic mutation, by presenting three adult homozygotes who show no signs of neurological disease.
The SNP Consortium website (http://snp.cshl.org) has undergone many changes since its initial conception three years ago. The database back end has been changed from the venerable ACeDB to the more scalable MySQL engine. Users can access the data via gene or single nucleotide polymorphism (SNP) keyword searches and browse or dump SNP data to text files. A graphical genome browsing interface shows SNPs mapped onto the genome assembly in the context of externally available gene predictions and other features. SNP allele frequency and genotype data are available via FTP download and on individual SNP report web pages. SNP linkage maps are available for download and for browsing in a comparative map viewer. All software components of the data coordinating center (DCC) website are open source.