Ganoderma lucidum is a widely used medicinal macrofungus in traditional Chinese medicine that produces a diverse set of bioactive compounds. Here we report its 43.3-Mb genome, encoding 16,113 predicted genes, obtained using next-generation sequencing and optical mapping approaches. The sequence analysis reveals an impressive array of genes encoding cytochrome P450s (CYPs), transporters and regulatory proteins that cooperate in secondary metabolism. The genome also encodes one of the richest sets of wood degradation enzymes among all of the sequenced basidiomycetes. In all, 24 physical CYP gene clusters are identified. Moreover, 78 CYP genes are coexpressed with lanosterol synthase, and 16 of these show high similarity to fungal CYPs that specifically hydroxylate testosterone, suggesting their possible roles in triterpenoid biosynthesis. The elucidation of the G. lucidum genome makes this organism a potential model system for the study of secondary metabolic pathways and their regulation in medicinal fungi.
Summary
Following earlier incomplete and fragmented versions of a genome sequence for the grey mould Botrytis cinerea, a gapless, near-finished genome sequence for B. cinerea strain B05.10 is reported. The assembly comprised 18 chromosomes and was confirmed by an optical map and a genetic map based on approximately 75 000 single nucleotide polymorphism (SNP) markers. All chromosomes contained fully assembled centromeric regions, and 10 chromosomes had telomeres on both ends. The genetic map consisted of 4153 cM and a comparison of the genetic distances with the physical distances identified 40 recombination hotspots. The linkage map also identified two mutations, located in the previously described genes Bos1 and BcsdhB, that conferred resistance to the fungicides boscalid and iprodione. The genome was predicted to encode 11 701 proteins. RNAseq data from >20 different samples were used to validate and improve gene models. Manual curation of chromosome 1 revealed interesting features, such as the occurrence of a dicistronic transcript and fully overlapping genes in opposite orientations, as well as many spliced antisense transcripts. Manual curation also revealed that the untranslated regions (UTRs) of genes can be complex and long, with many UTRs exceeding lengths of 1 kb and possessing multiple introns. Community annotation is in progress.
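The comparison of genetic and physical distances mentioned above can be illustrated with a minimal sketch. It assumes markers are given as (physical position in bp, genetic position in cM) pairs and flags intervals whose local rate in cM/Mb exceeds a multiple of the median rate. The function names, the fold threshold, and the toy data are hypothetical and are not taken from the study, which does not state its hotspot criterion here.

```python
from statistics import median

def interval_rates(markers):
    """Return (start_bp, end_bp, cM_per_Mb) for each pair of adjacent markers.

    `markers` is a list of (physical_bp, genetic_cM) tuples sorted by
    physical position (a hypothetical input layout).
    """
    rates = []
    for (bp0, cm0), (bp1, cm1) in zip(markers, markers[1:]):
        span_mb = (bp1 - bp0) / 1e6
        if span_mb <= 0:
            continue  # skip duplicate or mis-ordered markers
        rates.append((bp0, bp1, (cm1 - cm0) / span_mb))
    return rates

def hotspots(markers, fold=5.0):
    """Flag intervals whose rate exceeds `fold` times the median rate.

    The fold threshold is an illustrative assumption.
    """
    rates = interval_rates(markers)
    if not rates:
        return []
    baseline = median(r for _, _, r in rates)
    if baseline <= 0:
        return []
    return [iv for iv in rates if iv[2] > fold * baseline]

# Toy data (invented): one interval gains 10 cM over 0.1 Mb,
# the others gain 1 cM over 0.1 Mb.
toy = [(0, 0.0), (100_000, 1.0), (200_000, 2.0),
       (300_000, 12.0), (400_000, 13.0), (500_000, 14.0)]

for start, end, rate in hotspots(toy):
    print(f"hotspot candidate: {start}-{end} bp, {rate:.0f} cM/Mb")
```

A median baseline is used so that a few extreme intervals do not inflate the reference rate; other baselines, such as a chromosome-wide mean, would be equally defensible.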
Background
Rice research has been enabled by access to the high quality reference genome sequence generated in 2005 by the International Rice Genome Sequencing Project (IRGSP). To further facilitate genomic-enabled research, we have updated and validated the genome assembly and sequence for the Nipponbare cultivar of Oryza sativa (japonica group).
Results
The Nipponbare genome assembly was updated by revising and validating the minimal tiling path of clones with the optical map for rice. Sequencing errors in the revised genome assembly were identified by re-sequencing the genome of two different Nipponbare individuals using the Illumina Genome Analyzer II/IIx platform. A total of 4,886 sequencing errors were identified in 321 Mb of the assembled genome indicating an error rate in the original IRGSP assembly of only 0.15 per 10,000 nucleotides. A small number (five) of insertions/deletions were identified using longer reads generated using the Roche 454 pyrosequencing platform. As the re-sequencing data were generated from two different individuals, we were able to identify a number of allelic differences between the original individual used in the IRGSP effort and the two individuals used in the re-sequencing effort. The revised assembly, termed Os-Nipponbare-Reference-IRGSP-1.0, is now being used in updated releases of the Rice Annotation Project and the Michigan State University Rice Genome Annotation Project, thereby providing a unified set of pseudomolecules for the rice community.
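The reported error rate follows directly from the two figures in the paragraph above. As a quick check, here is a minimal sketch; the function name is illustrative, and 321 Mb is taken as 321,000,000 bases.

```python
def errors_per_10k(n_errors: int, assembled_bases: int) -> float:
    """Return the number of errors per 10,000 nucleotides."""
    return n_errors / assembled_bases * 10_000

# Figures from the text: 4,886 errors in 321 Mb of assembled sequence.
rate = errors_per_10k(4_886, 321_000_000)
print(f"{rate:.2f} errors per 10,000 nucleotides")  # prints 0.15 ...
```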
Conclusions
A revised, error-corrected, and validated assembly of the Nipponbare cultivar of rice was generated using optical map data, re-sequencing data, and manual curation that will facilitate on-going and future research in rice. Detection of polymorphisms between three different Nipponbare individuals highlights that allelic differences between individuals should be considered in diversity studies.
The ascomycete fungus Colletotrichum higginsianum causes anthracnose disease of brassica crops and the model plant Arabidopsis thaliana. Previous versions of the genome sequence were highly fragmented, causing errors in the prediction of protein-coding genes and preventing the analysis of repetitive sequences and genome architecture.
Here, we re-sequenced the genome using single-molecule real-time (SMRT) sequencing technology and, in combination with optical map data, this provided a gapless assembly of all twelve chromosomes except for the ribosomal DNA repeat cluster on chromosome 7. The more accurate gene annotation made possible by this new assembly revealed a large repertoire of secondary metabolism (SM) key genes (89) and putative biosynthetic pathways (77 SM gene clusters). The two mini-chromosomes differed from the ten core chromosomes in being repeat- and AT-rich and gene-poor but were significantly enriched with genes encoding putative secreted effector proteins. Transposable elements (TEs) were found to occupy 7% of the genome by length. Certain TE families showed a statistically significant association with effector genes and SM cluster genes and were transcriptionally active at particular stages of fungal development. All 24 subtelomeres were found to contain one of three highly-conserved repeat elements which, by providing sites for homologous recombination, were probably instrumental in four segmental duplications.
The gapless genome of C. higginsianum provides access to repeat-rich regions that were previously poorly assembled, notably the mini-chromosomes and subtelomeres, and allowed prediction of the complete SM gene repertoire. It also provides insights into the potential role of TEs in gene and genome evolution and host adaptation in this asexual pathogen.
In the human genome, heterozygous sites refer to genomic positions with a different allele or nucleotide variant on the maternal and paternal chromosomes. Resolving these allelic differences by chromosomal copy, also known as phasing, is achievable on a short-read sequencer when using a library preparation method that captures long-range genomic information. TELL-Seq is a library preparation that captures long-range genomic information with the aid of molecular identifiers (barcodes). The same barcode is used to tag the reads derived from the same long DNA fragment within a range of up to 200 kilobases (kb), generating linked-reads. This strategy can be used to phase an entire genome. Here, we introduce a TELL-Seq protocol developed for targeted applications, enabling the phasing of enriched loci of varying sizes, purity levels, and heterozygosity. To validate this protocol, we phased 2-200 kb loci enriched with different methods: CRISPR/Cas9-mediated excision coupled with pulsed-field electrophoresis for the longest fragments, CRISPR/Cas9-mediated protection from exonuclease digestion for mid-size fragments, and long PCR for the shortest fragments. All selected loci have known clinical relevance: BRCA1, BRCA2, MLH1, MSH2, MSH6, APC, PMS2, SCN5A-SCN10A, and PIK3CA. Collectively, the analyses show that TELL-Seq can accurately phase 2-200 kb targets using a short-read sequencer.
Very large DNA molecules enable comprehensive analysis of complex genomes, such as those of humans, cancers, and plants, because they span sequence repeats and complex somatic events. When physically manipulated, or analyzed as single molecules, long polyelectrolytes are problematic because of mechanical considerations that include shear-mediated breakage, dealing with the massive size of these coils, or the length of stretched DNAs using common experimental techniques and fluidic devices. Accordingly, we harness analyte “issues” as exploitable advantages through our invention and characterization of the “molecular gate,” which controls and synchronizes formation of stretched DNA molecules as DNA dumbbells within nanoslit geometries. Molecular gate geometries comprise micro- and nanoscale features designed to synergize very low ionic strength conditions in ways we show effectively create an “electrostatic bottle.” This effect greatly enhances molecular confinement within large slit geometries and supports facile, synchronized electrokinetic loading of nanoslits, even without dumbbell formation. Device geometries were considered at the molecular and continuum scales through computer simulations, which also guided our efforts to optimize design and functionalities. In addition, we show that the molecular gate may govern DNA separations because DNA molecules can be electrokinetically triggered, by varying applied voltage, to enter slits in a size-dependent manner. Lastly, mapping the Mesoplasma florum genome, via synchronized dumbbell formation, validates our nascent approach as a viable starting point for advanced development that will build an integrated system capable of large-scale genome analysis.
Variation in genome structure is an important source of human genetic polymorphism: It affects a large proportion of the genome and has a variety of phenotypic consequences relevant to health and disease. In spite of this, human genome structure variation is incompletely characterized due to a lack of approaches for discovering a broad range of structural variants in a global, comprehensive fashion. We addressed this gap with Optical Mapping, a high-throughput, high-resolution single-molecule system for studying genome structure. We used Optical Mapping to create genome-wide restriction maps of a complete hydatidiform mole and three lymphoblast-derived cell lines, and we validated the approach by demonstrating a strong concordance with existing methods. We also describe thousands of new variants with sizes ranging from kb to Mb.
Multiple myeloma (MM), a malignancy of plasma cells, is characterized by widespread genomic heterogeneity and, consequently, differences in disease progression and drug response. Although recent large-scale sequencing studies have greatly improved our understanding of MM genomes, our knowledge about genomic structural variation in MM remains limited because of the limitations of commonly used sequencing approaches. In this study, we present the application of optical mapping, a single-molecule, whole-genome analysis system, to discover new structural variants in a primary MM genome. Through our analysis, we have identified and characterized widespread structural variation in this tumor genome. Additionally, we describe our efforts toward comprehensive characterization of genome structure and variation by integrating our findings from optical mapping with those from DNA sequencing-based genomic analysis. Finally, by studying this MM genome at two time points during tumor progression, we have demonstrated an increase in mutational burden with tumor progression at all length scales of variation.
Significance
In the last several years, we have seen significant progress toward personalized cancer genomics and therapy. Although we routinely discern and understand genomic variation at single base pair and chromosomal levels, comprehensive analysis of genome variation, particularly structural variation, remains a challenge. We present an integrated approach using optical mapping (a single-molecule, whole-genome analysis system) and DNA sequencing to comprehensively identify genomic structural variation in sequential samples from a multiple myeloma patient. Through our analysis, we have identified widespread structural variation and an increase in mutational burden with tumor progression. Our findings highlight the need to routinely incorporate structural variation analysis at many length scales to understand cancer genomes more comprehensively.
Fusarium oxysporum is a cross-kingdom fungal pathogen that infects plants and humans. Horizontally transferred lineage-specific (LS) chromosomes were reported to determine host-specific pathogenicity among phytopathogenic F. oxysporum. However, the existence and functional importance of LS chromosomes among human pathogenic isolates are unknown. Here we report four unique LS chromosomes in a human pathogenic strain NRRL 32931, isolated from a leukemia patient. These LS chromosomes were devoid of housekeeping genes, but were significantly enriched in genes encoding metal ion transporters and cation transporters. Homologs of NRRL 32931 LS genes, including a homolog of ceruloplasmin and the genes that contribute to the expansion of the alkaline pH-responsive transcription factor PacC/Rim1p, were also present in the genome of NRRL 47514, a strain associated with a Fusarium keratitis outbreak. This study provides the first evidence, to our knowledge, for genomic compartmentalization in two human pathogenic fungal genomes and suggests an important role of LS chromosomes in niche adaptation.
About 85% of the maize genome consists of highly repetitive sequences that are interspersed by low-copy, gene-coding sequences. The maize community has dealt with this genomic complexity by the construction of an integrated genetic and physical map (iMap), but this resource alone was not sufficient for ensuring the quality of the current sequence build. For this purpose, we constructed a genome-wide, high-resolution optical map of the maize inbred line B73 genome containing >91,000 restriction sites (averaging approximately 1 site per 23 kb) accrued from mapping genomic DNA molecules. Our optical map comprises 66 contigs, averaging 31.88 Mb in size and spanning 91.5% (2,103.93 Mb of approximately 2,300 Mb) of the maize genome. A new algorithm was created that considered both optical map and unfinished BAC sequence data for placing 60 of the 66 optical map contigs (2,032.42 Mb) onto the maize iMap. The alignment of optical maps against numerous data sources yielded comprehensive results that proved revealing and productive. For example, gaps were uncovered and characterized within the iMap, the FPC (fingerprinted contigs) map, and the chromosome-wide pseudomolecules. Such alignments also suggested amended placements of FPC contigs on the maize genetic map and proactively guided the assembly of chromosome-wide pseudomolecules, especially within complex genomic regions. Lastly, we think that the full integration of B73 optical maps with the maize iMap would greatly facilitate maize sequence finishing efforts that would make it a valuable reference for comparative studies among cereals, or other maize inbred lines and cultivars.
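The summary statistics quoted for the maize optical map are mutually consistent, as the following minimal sketch shows. It uses only the figures given above; the variable names are illustrative, the genome size is the approximate 2,300 Mb stated in the text, and the site count is the lower bound of 91,000, so the computed site spacing is an upper bound.

```python
# Figures quoted in the text.
spanned_mb = 2_103.93   # total length of optical map contigs
genome_mb = 2_300.0     # approximate maize genome size
n_contigs = 66
n_sites = 91_000        # lower bound (">91,000 restriction sites")
placed_mb = 2_032.42    # length of the 60 contigs placed on the iMap

coverage_pct = spanned_mb / genome_mb * 100
mean_contig_mb = spanned_mb / n_contigs
kb_per_site = spanned_mb * 1_000 / n_sites
placed_pct = placed_mb / spanned_mb * 100

print(f"genome spanned:      {coverage_pct:.1f}%")       # 91.5%
print(f"mean contig size:    {mean_contig_mb:.2f} Mb")   # 31.88 Mb
print(f"site spacing:        {kb_per_site:.1f} kb/site") # 23.1 kb/site
print(f"placed on the iMap:  {placed_pct:.1f}% of mapped length")
```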