Summary
CRISPR/Cas9 and Cas12a (Cpf1) nucleases are two of the most powerful genome editing tools in plants. In this work, we compared their activities by targeting the maize glossy2 gene coding region that has overlapping sequences recognized by both nucleases. We introduced constructs carrying SpCas9‐guide RNA (gRNA) and LbCas12a‐CRISPR RNA (crRNA) into maize inbred B104 embryos using Agrobacterium‐mediated transformation. On‐target mutation analysis showed that 90%–100% of the Cas9‐edited T0 plants carried indel mutations and 63%–77% of them were homozygous or biallelic mutants. In contrast, 0%–60% of Cas12a‐edited T0 plants had on‐target mutations. We then conducted CIRCLE‐seq analysis to identify genome‐wide potential off‐target sites for Cas9. A total of 18 and 67 potential off‐targets were identified for the two gRNAs, respectively, with an average of five mismatches compared to the target sites. Sequencing analysis of a selected subset of the off‐target sites revealed no detectable level of mutations in the T1 plants, which constitutively express Cas9 nuclease and gRNAs. In conclusion, our results suggest that the CRISPR/Cas9 system used in this study is highly efficient and specific for genome editing in maize, while CRISPR/Cas12a needs further optimization for improved editing efficiency.
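The "mismatches compared to the target sites" reported above can be read as a position-by-position count of differing bases between a target sequence and a candidate off-target site. The following is a minimal Python sketch of that count, assuming equal-length sequences with no insertions or deletions and ignoring the PAM; the function name and both example sequences are hypothetical and are not taken from the study.

```python
def count_mismatches(target: str, candidate: str) -> int:
    """Count positions where two equal-length DNA sequences differ."""
    if len(target) != len(candidate):
        raise ValueError("sequences must have the same length")
    # Compare case-insensitively, base by base.
    return sum(a != b for a, b in zip(target.upper(), candidate.upper()))


# Hypothetical 20-nt sequences, for illustration only.
target_site = "GACGTTACCGGATCAAGCTT"
off_target = "GACGTAACCGGTTCAAGCAT"
print(count_mismatches(target_site, off_target))
```

Real off-target sites may also involve bulges (gaps) relative to the guide, which a simple positional comparison like this one does not handle.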
Core Ideas
Plant CRISPR‐Cas9 genome editing may generate unintended off‐target mutations.
Potential for off‐target mutation is an important regulatory question for genome‐edited plants.
Validated design approaches to discriminate target and potential off‐target edits are needed.
The CRISPR‐Cas9 system (clustered regularly interspaced short palindromic repeats with associated Cas9 protein) has been used to generate targeted changes for direct modification of endogenous genes in an increasing number of plant species, but development of plant genome editing has not yet fully considered potential off‐target mismatches that may lead to unintended changes within the genome. Assessing the specificity of CRISPR‐Cas9 for increasing editing efficiency as well as the potential for unanticipated downstream effects from off‐target mutations is an important regulatory consideration for agricultural applications. Increasing genome‐editing specificity entails developing improved design methods that better predict the prevalence of off‐target mutations as a function of genome composition and design of the engineered ribonucleoprotein (RNP). Early results from CRISPR‐Cas9 genome editing in plant systems indicate that off‐target mutation frequencies are quite low; however, by analyzing CRISPR‐edited plant lines and improving both computational tools and reagent design, it may be possible to further decrease unanticipated effects at potential mismatch sites within the genome. This will provide assurance that CRISPR‐Cas9 reagents can be designed and targeted with a high degree of specificity. Improved and experimentally validated design tools for discriminating target and potential off‐target positions that incorporate consideration of the designed nuclease fidelity and selectivity will help to increase confidence for regulatory decision making for genome‐edited plants.
Abstract
Annotating gene structures and functions to genome assemblies is necessary to make assembly resources useful for biological inference. Gene Ontology (GO) term assignment is the most used functional annotation system, and new methods for GO assignment have improved the quality of GO-based function predictions. The Gene Ontology Meta Annotator for Plants (GOMAP) is an optimized, high-throughput, and reproducible pipeline for genome-scale GO annotation of plants. We containerized GOMAP to increase portability and reproducibility and also optimized its performance for HPC environments. Here we report on the pipeline’s availability and performance for annotating large, repetitive plant genomes and describe how GOMAP was used to annotate multiple maize genomes as a test case. Assessment shows that GOMAP expands and improves the number of genes annotated and annotations assigned per gene as well as the quality (based on $F_{\max}$) of GO assignments in maize. GOMAP has been deployed to annotate other species including wheat, rice, barley, cotton, and soy. Instructions and access to the GOMAP Singularity container are freely available online at https://bioinformapping.com/gomap/. A list of annotated genomes and links to data is maintained at https://dill-picl.org/projects/gomap/.
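For readers unfamiliar with the quality measure named above: in CAFA-style evaluations, $F_{\max}$ is the maximum, over score thresholds, of the harmonic mean of precision and recall. The Python sketch below illustrates that definition under simplifying assumptions: precision is averaged over genes with at least one prediction at the threshold, recall is averaged over all genes in the gold standard, and no GO-hierarchy propagation is done. The function name, data layout, and example values are hypothetical, and this is not GOMAP's own evaluation code.

```python
def f_max(predictions, truth, thresholds=None):
    """Simplified F_max.

    predictions: {gene: {go_term: score in [0, 1]}}
    truth:       {gene: set of true go_terms}
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for gene, true_terms in truth.items():
            if not true_terms:
                continue  # skip genes without gold-standard terms
            scored = predictions.get(gene, {})
            predicted = {term for term, s in scored.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                # Precision only counts genes with a prediction at t.
                precisions.append(hits / len(predicted))
            recalls.append(hits / len(true_terms))
        if not precisions or not recalls:
            continue
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Toy data, for illustration only.
predictions = {"g1": {"GO:1": 0.9, "GO:2": 0.4}, "g2": {"GO:3": 0.6}}
truth = {"g1": {"GO:1"}, "g2": {"GO:3", "GO:4"}}
print(round(f_max(predictions, truth), 3))
```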
We created a new high‐coverage, robust, and reproducible functional annotation of maize protein‐coding genes based on Gene Ontology (GO) term assignments. Whereas the existing Phytozome and Gramene maize GO annotation sets only cover 41% and 56% of maize protein‐coding genes, respectively, this study provides annotations for 100% of the genes. We also compared the quality of our newly derived annotations with the existing Gramene and Phytozome functional annotation sets by comparing all three to a manually annotated gold standard set of 1,619 genes where annotations were primarily inferred from direct assay or mutant phenotype. Evaluations based on the gold standard indicate that our new annotation set is measurably more accurate than those from Phytozome and Gramene. To derive this new high‐coverage, high‐confidence annotation set, we used sequence similarity and protein domain presence methods as well as mixed‐method pipelines that were developed for the Critical Assessment of Function Annotation (CAFA) challenge. Our project to improve maize annotations is called maize‐GAMER (GO Annotation Method, Evaluation, and Review), and the newly derived annotations are accessible via MaizeGDB (http://download.maizegdb.org/maize-GAMER) and CyVerse (B73 RefGen_v3 5b+ at doi.org/10.7946/P2S62P and B73 RefGen_v4 Zm00001d.2 at doi.org/10.7946/P2M925).
The accuracy of machine learning tasks critically depends on high quality ground truth data. Therefore, producing good ground truth data typically involves trained professionals; however, this can be costly in time, effort, and money. Here we explore the use of crowdsourcing to generate a large amount of good-quality training data. We explore an image analysis task involving the segmentation of corn tassels from images taken in a field setting. We investigate the accuracy, speed and other quality metrics when this task is performed by students for academic credit, Amazon MTurk workers, and Master Amazon MTurk workers. We conclude that the Amazon MTurk and Master MTurk workers perform significantly better than the for-credit students, but with no significant difference between the two MTurk worker types. Furthermore, the quality of the segmentation produced by Amazon MTurk workers rivals that of an expert worker. We provide best practices to assess the quality of ground truth data, and to compare data quality produced by different sources. We conclude that properly managed crowdsourcing can be used to establish large volumes of viable ground truth data at low cost and high quality, especially in the context of high throughput plant phenotyping. We also provide several metrics for assessing the quality of the generated datasets.
Genome assemblies are foundational for understanding the biology of a species. They provide a physical framework for mapping additional sequences, thereby enabling characterization of, for example, genomic diversity and differences in gene expression across individuals and tissue types. Quality metrics for genome assemblies gauge both the completeness and contiguity of an assembly and help provide confidence in downstream biological insights. To compare quality across multiple assemblies, a set of common metrics is typically calculated and then compared to one or more gold standard reference genomes. While several tools exist for calculating individual metrics, applications providing comprehensive evaluations of multiple assembly features are, perhaps surprisingly, lacking. Here, we describe a new toolkit that integrates multiple metrics to characterize both assembly and gene annotation quality in a way that enables comparison across multiple assemblies and assembly types.
Our application, named GenomeQC, is an easy-to-use and interactive web framework that integrates various quantitative measures to characterize genome assemblies and annotations. GenomeQC provides researchers with a comprehensive summary of these statistics and allows for benchmarking against gold standard reference assemblies.
The GenomeQC web application is implemented in R/Shiny version 1.5.9 and Python 3.6 and is freely available at https://genomeqc.maizegdb.org/ under the GPL license. All source code and a containerized version of the GenomeQC pipeline are available in the GitHub repository https://github.com/HuffordLab/GenomeQC.
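As an illustration of what a contiguity metric looks like in practice, the Python sketch below computes N50, a widely used assembly statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The abstract does not list the specific metrics GenomeQC reports, so N50 is chosen here as an assumption; the function name and contig lengths are hypothetical, and this is not GenomeQC's code.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical contig lengths (in kb), for illustration only.
contigs = [80, 70, 50, 40, 30, 20]
print(n50(contigs))
```

Computing the same statistic for a gold standard reference and for a new assembly gives one simple basis for the kind of benchmarking comparison described above.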
MaizeGDB is a highly curated, community-oriented database and informatics service to researchers focused on the crop plant and model organism Zea mays ssp. mays. Although some form of the maize community database has existed over the last 25 years, there have only been two major releases. In 1991, the original maize genetics database MaizeDB was created. In 2003, the combined contents of MaizeDB and the sequence data from ZmDB were made accessible as a single resource named MaizeGDB. Over the next decade, MaizeGDB became more sequence driven while still maintaining traditional maize genetics datasets. This enabled the project to meet the continually growing and evolving needs of the maize research community, yet the interface and underlying infrastructure remained unchanged. In 2015, the MaizeGDB team completed a multi-year effort to update the MaizeGDB resource by reorganizing existing data, upgrading hardware and infrastructure, creating new tools, incorporating new data types (including diversity data, expression data, gene models, and metabolic pathways), and developing and deploying a modern interface. In addition to coordinating a data resource, the MaizeGDB team coordinates activities and provides technical support to the maize research community. MaizeGDB is accessible online at http://www.maizegdb.org.
Remarkable productivity has been achieved in crop species through artificial selection and adaptation to modern agronomic practices. Whether intensive selection has changed the ability of improved cultivars to maintain high productivity across variable environments is unknown. Understanding the genetic control of phenotypic plasticity and genotype by environment (G × E) interaction will enhance crop performance predictions across diverse environments. Here we use data generated from the Genomes to Fields (G2F) Maize G × E project to assess the effect of selection on G × E variation and characterize polymorphisms associated with plasticity. Genomic regions putatively selected during modern temperate maize breeding explain less variability for yield G × E than unselected regions, indicating that improvement by breeding may have reduced G × E of modern temperate cultivars. Trends in genomic position of variants associated with stability reveal fewer genic associations and enrichment of variants 0-5000 base pairs upstream of genes, hypothetically due to control of plasticity by short-range regulatory elements.
An important advantage of delivering CRISPR reagents into cells as a ribonucleoprotein (RNP) complex is the ability to edit genes without reagents being integrated into the genome. Transient presence of RNP molecules in cells can reduce undesirable off-target effects. One method for RNP delivery into plant cells is the use of a biolistic gun. To facilitate selection of transformed cells during RNP delivery, a plasmid carrying a selectable marker gene can be co-delivered with the RNP to enrich for transformed/edited cells. In this work, we compare targeted mutagenesis in rice using three different delivery platforms: biolistic RNP/DNA co-delivery; biolistic DNA delivery; and Agrobacterium-mediated delivery. All three platforms were successful in generating desired mutations at the target sites. However, we observed a high frequency (over 14%) of random plasmid or chromosomal DNA fragment insertion at the target sites in transgenic events generated from both biolistic delivery platforms. In contrast, integration of random DNA fragments was not observed in transgenic events generated from the Agrobacterium-mediated method. These data reveal important insights that must be considered when selecting the method for genome-editing reagent delivery in plants, and emphasize the importance of employing appropriate molecular screening methods to detect unintended alterations following genome engineering.