The accurate typing of human leukocyte antigen (HLA) alleles is critical for a variety of medical applications, such as genomic studies of multifactorial diseases, including immune system and inflammation‐related disorders, and donor selection in organ transplantation and regenerative medicine. Here, we developed a new algorithm for determining HLA alleles using next‐generation sequencing (NGS) results. The method consists of constructing an extensive dictionary of HLA alleles, precise mapping of the NGS reads, and calculating a score based on weighted read counts to select the most suitable pair of alleles. The developed algorithm compares the score of all allele pairs, taking into account variation not only in the domain for antigen presentation (G‐DOMAIN), but also outside this domain. Using this method, HLA alleles could be determined with 6‐digit precision. We showed that our method was more accurate than other NGS‐based methods and revealed limitations of the conventional HLA typing technologies. Furthermore, we determined the complete genomic sequence of an HLA‐A‐like‐pseudogene when we assembled NGS reads that had caused arguable typing, and found its identity with HLA‐Y*02:01. The accuracy of the HLA‐A allele typing was improved after the HLA‐Y*02:01 sequence was included in the HLA allele dictionary.
HLA‐HD is a new algorithm for determining HLA alleles using next‐generation sequencing (NGS) data. The method consists of constructing an extensive dictionary of HLA alleles, precise mapping of the NGS reads, and calculating a score based on weighted read counts to select the most suitable pair of alleles. HLA‐HD compares the score of all allele pairs, taking into account variation not only in the domain for antigen presentation, but also outside this domain.
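The pair-selection step described above can be illustrated with a minimal sketch. It is not the HLA‐HD implementation: the weighting rule (each read contributes 1 divided by the number of alleles it maps to), the function name `best_allele_pair`, and the toy read data are all assumptions made for illustration. The abstract states only that the score is based on weighted read counts and is compared across all allele pairs.

```python
from itertools import combinations_with_replacement


def best_allele_pair(read_hits):
    """Return the allele pair with the highest weighted read count.

    read_hits: list of non-empty sets; each set holds the alleles one read maps to.
    Assumed weighting (hypothetical): a read mapping to k alleles has weight 1/k.
    """
    alleles = sorted(set().union(*read_hits))
    best_pair, best_score = None, float("-inf")
    # Homozygous pairs (a, a) are included by combinations_with_replacement.
    for a, b in combinations_with_replacement(alleles, 2):
        score = 0.0
        for hits in read_hits:
            # A read supports the pair if it maps to either allele.
            if a in hits or b in hits:
                score += 1.0 / len(hits)
        if score > best_score:
            best_pair, best_score = (a, b), score
    return best_pair, best_score


# Toy data: five reads and the alleles each one maps to (illustrative only).
reads = [
    {"A*01:01:01"},
    {"A*01:01:01", "A*02:01:01"},
    {"A*02:01:01"},
    {"A*02:01:01", "A*03:01:01"},
    {"A*01:01:01"},
]
pair, score = best_allele_pair(reads)
print(pair, score)
```

Down-weighting reads that map to many alleles is one simple way to keep ambiguous reads from dominating the score; the published method may weight reads differently.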
Retinitis pigmentosa (RP), a major cause of blindness in developed countries, has multiple causative genes; its prevalence differs by ethnicity. Usher syndrome is the most common form of syndromic RP and is accompanied by hearing impairment. Although molecular diagnosis is challenging, recent technological advances such as targeted high-throughput resequencing are efficient screening tools.
We performed comprehensive molecular testing in 329 Japanese RP and Usher syndrome patients by using a custom capture panel that covered the coding exons and exon/intron boundaries of all 193 known inherited eye disease genes combined with Illumina HiSeq 2500. Candidate variants were screened using systematic data analyses, and their potential pathogenicity was assessed according to the frequency of the variants in normal populations, in silico prediction tools, and compatibility with known phenotypes or inheritance patterns.
Molecular diagnoses were made in 115/317 RP patients (36.3%) and 6/12 Usher syndrome patients (50%). We identified 104 distinct mutations, including 66 novel mutations. EYS, USH2A, and RHO were common causative genes. In particular, mutations in EYS accounted for 15.0% of the autosomal recessive/simplex RP patients or 10.7% of the entire RP cohort. Among the 189 previously reported mutations detected in the current study, 55 (29.1%) were found commonly in Japanese or other public databases and were excluded from molecular diagnoses.
By screening a large cohort of patients, this study catalogued the genetic variations involved in RP and Usher syndrome in a Japanese population and highlighted the different distribution of causative genes among populations.
Next-generation sequencing (NGS) has greatly advanced the study of causative genes and variants of inherited diseases. While it is sometimes challenging to determine the pathogenicity of variants identified by NGS, the American College of Medical Genetics and Genomics established guidelines to aid their interpretation. However, none of the previous genetic screening studies of patients with retinitis pigmentosa (RP) in Japan used these guidelines. Considering that EYS is the major causative gene of RP in Japan, we conducted stepwise genetic screening of 220 Japanese patients with RP using the guidelines. Steps 1 to 4 comprised the following, in order: Sanger sequencing for two major EYS founder mutations; targeted sequencing of all coding regions of EYS; whole genome sequencing; Sanger sequencing for Alu element insertion in RP1, a recently determined founder mutation for RP. Among the detected variants, 2, 19, 173, and 1 variant(s) were considered pathogenic and 8, 41, 44, and 5 patients were genetically solved in steps 1, 2, 3, and 4, respectively. In total, 44.5% (98/220) of the patients were genetically solved, of whom 50 (51.0%) were EYS-associated and 5 (5.1%) were Alu element-associated. Among the 122 unsolved patients, 22 had at least one possible pathogenic variant.
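The per-step counts and percentages reported above are mutually consistent, which a short arithmetic check makes explicit. The variable names are illustrative; all numbers are taken from the abstract.

```python
# Patients genetically solved in each screening step, as reported in the abstract.
solved_per_step = {1: 8, 2: 41, 3: 44, 4: 5}
cohort = 220            # total patients screened
eys_associated = 50     # solved patients with EYS-associated disease
alu_associated = 5      # solved patients with the RP1 Alu insertion

solved = sum(solved_per_step.values())
unsolved = cohort - solved
pct_solved = round(100 * solved / cohort, 1)
pct_eys = round(100 * eys_associated / solved, 1)
pct_alu = round(100 * alu_associated / solved, 1)

print(solved, unsolved, pct_solved, pct_eys, pct_alu)
```

Note that the 51.0% and 5.1% figures are fractions of the 98 solved patients, not of the full cohort of 220.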
Profiles of sequence variants that influence gene transcription are very important for understanding mechanisms that affect phenotypic variation and disease susceptibility. Using genotypes at 1.4 million SNPs and a comprehensive transcriptional profile of 15,454 coding genes and 6,113 lincRNA genes obtained from peripheral blood cells of 298 Japanese individuals, we mapped expression quantitative trait loci (eQTLs). We identified 3,804 cis-eQTLs (within 500 kb from target genes) and 165 trans-eQTLs (>500 kb away or on different chromosomes). Cis-eQTLs were often located in transcribed or adjacent regions of genes; among these regions, 5' untranslated regions and 5' flanking regions had the largest effects. Epigenetic evidence for regulatory potential accumulated in public databases explained the magnitude of the effects of our eQTLs. Cis-eQTLs were often located near the respective target genes, if not within genes. Large effect sizes were observed with eQTLs near target genes, and effect sizes were clearly attenuated as the eQTL distance from the gene increased. Using a very stringent significance threshold, we identified 165 large-effect trans-eQTLs. We used our eQTL map to assess 8,069 disease-associated SNPs identified in 1,436 genome-wide association studies (GWAS). For 148 of the GWAS-identified SNPs, we identified genes that might be truly causative but that the GWAS might have failed to identify; for example, TUFM (P = 3.3E-48) was identified for inflammatory bowel disease (early onset); ZFP90 (P = 4.4E-34) for ulcerative colitis; and IDUA (P = 2.2E-11) for Parkinson's disease. We identified four genes (P<2.0E-14) that might be related to three diseases and two hematological traits; the expression of each is regulated by trans-eQTLs on a different chromosome than the gene.
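The cis/trans definition used above (within 500 kb of the target gene versus farther away or on a different chromosome) can be sketched as a small classifier. Measuring distance from the nearest gene boundary, treating SNPs inside the gene as distance zero, and the function name `classify_eqtl` are assumptions for illustration; the abstract does not specify the reference point for the distance.

```python
CIS_WINDOW_BP = 500_000  # 500 kb threshold stated in the abstract


def classify_eqtl(snp_chrom, snp_pos, gene_chrom, gene_start, gene_end):
    """Label a SNP-gene pair as 'cis' or 'trans'.

    Assumption: distance is measured to the nearest gene boundary,
    and a SNP located inside the gene has distance 0.
    """
    if snp_chrom != gene_chrom:
        return "trans"
    if gene_start <= snp_pos <= gene_end:
        distance = 0
    else:
        distance = min(abs(snp_pos - gene_start), abs(snp_pos - gene_end))
    return "cis" if distance <= CIS_WINDOW_BP else "trans"


# Hypothetical coordinates, for illustration only.
print(classify_eqtl("chr1", 1_200_000, "chr1", 1_000_000, 1_050_000))  # 150 kb from the gene
print(classify_eqtl("chr1", 2_000_000, "chr1", 1_000_000, 1_050_000))  # 950 kb from the gene
print(classify_eqtl("chr2", 1_020_000, "chr1", 1_000_000, 1_050_000))  # different chromosome
```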
Whole-genome and -exome resequencing using next-generation sequencers is a powerful approach for identifying genomic variations that are associated with diseases. However, systematic strategies for prioritizing causative variants from many candidates to explain the disease phenotype are still far from being established, because the population-specific frequency spectrum of genetic variation has not been characterized. Here, we have collected exomic genetic variation from 1208 Japanese individuals through a collaborative effort, and aggregated the data into a prevailing catalog. In total, we identified 156,622 previously unreported variants. The majority (88.8%) had allele frequencies lower than 0.5% and were predicted to be functionally deleterious. In addition, we have constructed a Japanese-specific major allele reference genome, with which the number of uniquely mapped short reads in our data increased by 0.045% on average. Our results illustrate the importance of constructing an ethnicity-specific reference genome for identifying rare variants. All the collected data were centralized to a newly developed database to serve as useful resources for exploring pathogenic variations. Public access to the database is available at http://www.genome.med.kyoto-u.ac.jp/SnpDB/.
Mitochondrial disorders have the highest incidence among congenital metabolic disorders characterized by biochemical respiratory chain complex deficiencies. They occur at a rate of 1 in 5,000 births, and show phenotypic and genetic heterogeneity. Mutations in about 1,500 nuclear encoded mitochondrial proteins may cause mitochondrial dysfunction of energy production and mitochondrial disorders. More than 250 genes that cause mitochondrial disorders have been reported to date. However, the exact genetic diagnosis for patients remains largely unknown. To reveal this heterogeneity, we performed comprehensive genomic analyses for 142 patients with childhood-onset mitochondrial respiratory chain complex deficiencies. The approach includes whole mtDNA and exome analyses using high-throughput sequencing, and chromosomal aberration analyses using high-density oligonucleotide arrays. We identified 37 novel mutations in known mitochondrial disease genes and 3 mitochondria-related genes (MRPS23, QRSL1, and PNPLA4) as novel causative genes. We also identified 2 genes known to cause monogenic diseases (MECP2 and TNNI3) and 3 chromosomal aberrations (6q24.3-q25.1, 17p12, and 22q11.21) as causes in this cohort. Our approaches enhance the ability to identify pathogenic gene mutations in patients with biochemically defined mitochondrial respiratory chain complex deficiencies in clinical settings. They also underscore clinical and genetic heterogeneity and will improve patient care of this complex disorder.
The Japanese Archipelago is widely covered with acidic soil made of volcanic ash, an environment which is detrimental to the preservation of ancient biomolecules. More than 10,000 Palaeolithic and Neolithic sites have been discovered nationwide, but few skeletal remains exist and preservation of DNA is poor. Despite these challenging circumstances, we succeeded in obtaining a complete mitogenome (mitochondrial genome) sequence from Palaeolithic human remains. We also obtained those of Neolithic (the hunting-gathering Jomon and the farming Yayoi cultures) remains, and over 2,000 present-day Japanese. The Palaeolithic mitogenome sequence was not found to be a direct ancestor of any of Jomon, Yayoi, and present-day Japanese people. However, it was an ancestral type of haplogroup M, a basal group of the haplogroup M. Therefore, our results indicate continuity in the maternal gene pool from the Palaeolithic to present-day Japanese. We also found that a vast increase of population size happened and has continued since the Yayoi period, characterized by paddy rice farming. This means that the cultural transition, i.e. rice agriculture, had a significant impact on the demographic history of the Japanese population.
Thiazoline-related innate fear-eliciting compounds (tFOs) orchestrate hypothermia, hypometabolism, and anti-hypoxia, which enable survival in lethal hypoxic conditions. Here, we show that most of these effects are severely attenuated in transient receptor potential ankyrin 1 (Trpa1) knockout mice. TFO-induced hypothermia involves the Trpa1-mediated trigeminal/vagal pathways and non-Trpa1 olfactory pathway. TFOs activate Trpa1-positive sensory pathways projecting from trigeminal and vagal ganglia to the spinal trigeminal nucleus (Sp5) and nucleus of the solitary tract (NTS), and their artificial activation induces hypothermia. TFO presentation activates the NTS-Parabrachial nucleus pathway to induce hypothermia and hypometabolism; this activation was suppressed in Trpa1 knockout mice. TRPA1 activation is insufficient to trigger tFO-mediated anti-hypoxic effects; Sp5/NTS activation is also necessary. Accordingly, we find a novel molecule that enables mice to survive in a lethal hypoxic condition ten times longer than known tFOs. Combinations of appropriate tFOs and TRPA1 command intrinsic physiological responses relevant to survival fate.
Previous studies have reported genome-wide mutation profile analyses in ovarian clear cell carcinomas (OCCCs). This study aims to identify specific novel molecular alterations by combined analyses of somatic mutation and copy number variation. We performed whole exome sequencing of 39 OCCC samples with 16 matching blood tissue samples. Four hundred twenty-six genes had recurrent somatic mutations. Among the 39 samples, ARID1A (62%) and PIK3CA (51%) were frequently mutated, as were genes such as KRAS (10%), PPP2R1A (10%), and PTEN (5%), that have been reported in previous OCCC studies. We also detected mutations in MLL3 (15%), ARID1B (10%), and PIK3R1 (8%), which are associations not previously reported. Gene interaction analysis and functional assessment revealed that mutated genes were clustered into groups pertaining to chromatin remodeling, cell proliferation, DNA repair and cell cycle checkpointing, and cytoskeletal organization. Copy number variation analysis identified frequent amplification in chr8q (64%), chr20q (54%), and chr17q (46%) loci as well as deletion in chr19p (41%), chr13q (28%), chr9q (21%), and chr18q (21%) loci. Integration of the analyses uncovered that frequently mutated or amplified/deleted genes were involved in the KRAS/phosphatidylinositol 3-kinase (82%) and MYC/retinoblastoma (75%) pathways as well as the critical chromatin remodeling complex switch/sucrose nonfermentable (85%). The individual and integrated analyses contribute details about the OCCC genomic landscape, which could lead to enhanced diagnostics and therapeutic options.
Human immune systems are very complex, and the basis for individual differences in immune phenotypes is largely unclear. One reason is that the phenotype of the immune system is so complex that it is very difficult to describe its features and quantify differences between samples. To identify the genetic factors that cause individual differences in whole lymphocyte profiles and their changes after vaccination without having to rely on biological assumptions, we performed a genome-wide association study (GWAS), using cytometry data. Here, we applied computational analysis to the cytometry data of 301 people before receiving an influenza vaccine, and 1, 7, and 90 days after the vaccination to extract the feature statistics of the lymphocyte profiles in a nonparametric and data-driven manner. We analyzed two types of cytometry data: measurements of six markers for B cell classification and seven markers for T cell classification. The coordinate values calculated by this method can be treated as feature statistics of the lymphocyte profile. Next, we examined the genetic basis of individual differences in human immune phenotypes with a GWAS for the feature statistics, and we newly identified seven significant and 36 suggestive single-nucleotide polymorphisms associated with the individual differences in lymphocyte profiles and their change after vaccination. This study provides a new workflow for performing combined analyses of cytometry data and other types of genomics data.