Machine learning (ML) is an intelligent data mining technique that builds a prediction model based on the learning of prior knowledge to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning-based differential network analysis (mlDNA) and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. mlDNA first used an ML-based filtering process to remove nonexpressed, constitutively expressed, or non-stress-responsive "noninformative" genes prior to network construction, through learning the patterns of 32 expression characteristics of known stress-related genes. The retained "informative" genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing-based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress-related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes.
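The idea of comparing control and stress coexpression networks can be illustrated with a minimal Python sketch. It is not the method of the R package described above: it assumes a Pearson-correlation network with a hard cutoff and tracks only one topological characteristic (node degree), whereas the abstract refers to 33 such characteristics fed to an ML model. The function names, the 0.8 cutoff, and the toy expression values are all hypothetical.

```python
import numpy as np

def coexpression_adjacency(expr: np.ndarray, threshold: float = 0.8) -> np.ndarray:
    """Boolean gene-by-gene adjacency from a genes x samples expression matrix.

    Two genes are linked when |Pearson r| >= threshold (an assumed cutoff).
    """
    corr = np.corrcoef(expr)          # rows are treated as variables (genes)
    adj = np.abs(corr) >= threshold
    np.fill_diagonal(adj, False)      # ignore self-correlation
    return adj

def degree_change(control: np.ndarray, stress: np.ndarray,
                  threshold: float = 0.8) -> np.ndarray:
    """Per-gene change in node degree (stress minus control)."""
    deg_control = coexpression_adjacency(control, threshold).sum(axis=1)
    deg_stress = coexpression_adjacency(stress, threshold).sum(axis=1)
    return deg_stress - deg_control

# Hypothetical toy data: 3 genes x 4 samples per condition.
control = np.array([[1.0, 2.0, 3.0, 4.0],
                    [2.0, 4.0, 6.0, 8.0],
                    [1.0, -1.0, 1.0, -1.0]])
stress = np.array([[1.0, 2.0, 3.0, 4.0],
                   [4.0, 1.0, 3.0, 2.0],
                   [3.0, 5.0, 7.0, 9.0]])

print(degree_change(control, stress))
```

A gene whose degree shifts strongly between conditions would be a candidate for closer inspection; in a fuller pipeline, such features would be inputs to a classifier rather than used directly.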
► Identifying genetic markers for yield requires rapid quantification of crop traits.
► Proximal sensing offers promise for field-based phenotyping (FBP).
► Efficient data integration and modeling-assisted analysis are key for FBP.
► FBP scaled to thousands of field plots is a feasible, attainable goal.
► FBP systems require new, integrative collaborations that cross disciplines.
A major challenge for crop research in the 21st century is how to predict crop performance as a function of genetic architecture. Advances in “next generation” DNA sequencing have greatly improved genotyping efficiency and reduced genotyping costs. Methods for characterizing plant traits (phenotypes), however, have progressed much more slowly over the past 30 years, and constraints in phenotyping capability limit our ability to dissect the genetics of quantitative traits, especially those related to harvestable yield and stress tolerance. As a case in point, mapping populations for major crops may consist of 20 or more families, each represented by as many as 200 lines, necessitating field trials with over 20,000 plots at a single location. Investing in the resources and labor needed to quantify even a few agronomic traits for linkage with genetic markers in such massive populations is currently impractical for most breeding programs. Herein, we define key criteria, experimental approaches, equipment and data analysis tools required for robust, high-throughput field-based phenotyping (FBP). The focus is on simultaneous proximal sensing for spectral reflectance, canopy temperature, and plant architecture where a vehicle carrying replicated sets of sensors records data on multiple plots, with the potential to record data throughout the crop life cycle. The potential to assess traits, such as adaptations to water deficits or acute heat stress, several times during a single diurnal cycle is especially valuable for quantifying stress recovery. Simulation modeling and related tools can help estimate physiological traits such as canopy conductance and rooting capacity. Many of the underlying techniques and requisite instruments are available and in use for precision crop management. Further innovations are required to better integrate the functions of multiple instruments and to ensure efficient, robust analysis of the large volumes of data that are anticipated.
A complement to the core proximal sensing is high-throughput phenotyping of specific traits such as nutrient status, seed composition, and other biochemical characteristics, as well as underground root architecture. The ability to “ground truth” results with conventional measurements is also necessary. The development of new sensors and imaging systems undoubtedly will continue to improve our ability to phenotype very large experiments or breeding nurseries, with the core FBP abilities achievable through strong interdisciplinary efforts that assemble and adapt existing technologies in novel ways.
Genes controlling hormone levels have been used to increase grain yields in wheat (Triticum aestivum) and rice (Oryza sativa). We created transgenic rice plants expressing maize (Zea mays), rice, or Arabidopsis thaliana genes encoding sterol C-22 hydroxylases that control brassinosteroid (BR) hormone levels using a promoter that is active in only the stems, leaves, and roots. The transgenic plants produced more tillers and more seed than wild-type plants. The seed were heavier as well, especially the seed at the bases of the spikes that fill the least. These phenotypic changes brought about 15 to 44% increases in grain yield per plant relative to wild-type plants in greenhouse and field trials. Expression of the Arabidopsis C-22 hydroxylase in the embryos or endosperms themselves had no apparent effect on seed weight. These results suggested that BRs stimulate the flow of assimilate from the source to the sink. Microarray and photosynthesis analysis of transgenic plants revealed evidence of enhanced CO₂ assimilation, enlarged glucose pools in the flag leaves, and increased assimilation of glucose to starch in the seed. These results further suggested that BRs stimulate the flow of assimilate. Plants have not been bred directly for seed filling traits, suggesting that genes that control seed filling could be used to further increase grain yield in crop plants.
The third, or wobble, position in a codon provides a high degree of possible degeneracy and is an elegant fault-tolerance mechanism. Nucleotide biases between organisms at the wobble position have been documented and correlated with the abundances of the complementary tRNAs. We and others have noticed a bias for cytosine and guanine at the third position in a subset of transcripts within a single organism. The bias is present in some plant species and warm-blooded vertebrates but not in all plants, or in invertebrates or cold-blooded vertebrates.
Here we demonstrate that in certain organisms the amount of GC at the wobble position (GC3) can be used to distinguish two classes of genes. We highlight the following features of genes with high GC3 content: they (1) provide more targets for methylation, (2) exhibit more variable expression, (3) more frequently possess upstream TATA boxes, (4) are predominant in certain classes of genes (e.g., stress responsive genes) and (5) have a GC3 content that increases from 5' to 3'. These observations led us to formulate a hypothesis to explain GC3 bimodality in grasses.
Our findings suggest that high levels of GC3 typify a class of genes whose expression is regulated through DNA methylation or are a legacy of accelerated evolution through gene conversion. We discuss the three most probable explanations for GC3 bimodality: biased gene conversion, transcriptional and translational advantage and gene methylation.
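The GC3 measure discussed above is straightforward to compute. The following Python sketch is an illustration only, not the authors' pipeline. It assumes the input is an in-frame coding sequence starting at the first codon position and counts every complete codon it is given, including any stop codon. The function name and the two toy sequences are hypothetical.

```python
def gc3_content(cds: str) -> float:
    """Fraction of G or C at the third (wobble) position of complete codons.

    Assumes `cds` is an in-frame coding sequence; a trailing partial
    codon is ignored because it has no third position.
    """
    third_positions = cds.upper()[2::3]   # indices 2, 5, 8, ...
    if not third_positions:
        raise ValueError("sequence contains no complete codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)

# Hypothetical toy sequences (not real genes).
high_gc3 = "ATGGCCGCGCTCGAC"   # codons: ATG GCC GCG CTC GAC
low_gc3 = "ATGGCTGCATTAGAT"    # codons: ATG GCT GCA TTA GAT

print(gc3_content(high_gc3), gc3_content(low_gc3))
```

Applied to every coding sequence in a genome, the resulting distribution of values could then be inspected for the two-class (bimodal) pattern the abstract describes.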
Summary
Recent studies on jasmonic acid (JA) biosynthetic mutants have shown that jasmonates play essential roles in pollen maturation and dehiscence and wound‐induced defence against biotic attacks. To better understand the biosynthetic mechanisms of this essential plant hormone, we isolated an Arabidopsis knock‐out mutant defective in the JA biosynthetic gene CYP74A (allene oxide synthase, AOS) using reverse genetics screening methods. This enzyme catalyses dehydration of the hydroperoxide to an unstable allene oxide in the JA biosynthetic pathway. Endogenous JA levels, which increase 100‐fold 1 h after wounding in wild‐type plants, do not increase after wounding in the aos mutant. In addition, the mutant showed severe male sterility due to defects in anther and pollen development. The male‐sterile phenotype was completely rescued by exogenous application of methyl jasmonate and by complementation with constitutive expression of the AOS gene. RT–PCR analysis showed that the induction of transcripts for vegetative storage protein and lipoxygenase genes, previously shown to be inducible by wound and jasmonate application in the wild‐type, was absent in the aos mutant. In transgenic plants constitutively expressing AOS, wound‐induced JA levels were 50–100% higher compared to wild‐type plants. Taken together with JA deficiency in the aos mutant, our results show that AOS is critical for the biosynthesis of all biologically active jasmonates. Our results also suggest that AOS expression is limiting JA levels in wounded plants, but that the AOS hydroperoxide substrate levels, controlled by upstream enzymes (lipoxygenase and phospholipase), determine JA levels in unwounded plants.
Auxins are growth regulators involved in virtually all aspects of plant development. However, little is known about how plants synthesize these essential compounds. We propose that the level of indole-3-acetic acid is regulated by the flux of indole-3-acetaldoxime through a cytochrome P450, CYP83B1, to the glucosinolate pathway. A T-DNA insertion in the CYP83B1 gene leads to plants with a phenotype that suggests severe auxin overproduction, whereas CYP83B1 overexpression leads to loss of apical dominance typical of auxin deficit. CYP83B1 N-hydroxylates indole-3-acetaldoxime to the corresponding aci-nitro compound, 1-aci-nitro-2-indolyl-ethane, with a $K_{\rm m}$ of 3 μM and a turnover number of $53\ {\rm min}^{-1}$. The aci-nitro compound formed reacts non-enzymatically with thiol compounds to produce an N-alkyl-thiohydroximate adduct, the committed precursor of glucosinolates. Thus, indole-3-acetaldoxime is the metabolic branch point between the primary auxin indole-3-acetic acid and indole glucosinolate biosynthesis in Arabidopsis.
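To give a feel for the reported kinetic constants, the sketch below evaluates the per-enzyme rate under the standard Michaelis-Menten rate law, rate = k_cat [S] / (K_m + [S]). That the enzyme follows simple Michaelis-Menten kinetics is an assumption made here for illustration. The function name is hypothetical, and the defaults are the 3 μM and 53 min⁻¹ values quoted above.

```python
def turnover_rate(substrate_um: float,
                  km_um: float = 3.0,
                  kcat_per_min: float = 53.0) -> float:
    """Catalytic events per enzyme per minute at a given substrate level.

    Assumes simple Michaelis-Menten kinetics; concentrations are in uM.
    """
    if substrate_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    return kcat_per_min * substrate_um / (km_um + substrate_um)

# At [S] = K_m the rate is half the turnover number by definition.
for s in (0.3, 3.0, 30.0):
    print(f"[S] = {s:5.1f} uM -> {turnover_rate(s):.1f} per min")
```

Under this assumption, a low micromolar K_m means the enzyme would already run near its maximal rate at modest substrate concentrations.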
Summary
Plants unable to synthesize or perceive brassinosteroids (BRs) are dwarfs. Arabidopsis dwf4 was shown to be defective in a steroid 22α hydroxylase (CYP90B1) step that is the putative rate‐limiting step in the BR biosynthetic pathway. To better understand the role of DWF4 in BR biosynthesis, transgenic Arabidopsis plants ectopically overexpressing DWF4 (AOD4) were generated, using the cauliflower mosaic virus 35S promoter, and their phenotypes were characterized. The hypocotyl length of both light‐ and dark‐grown AOD4 seedlings was increased dramatically as compared to wild type. At maturity, inflorescence height increased >35% in AOD4 lines and >14% in tobacco DWF4 overexpressing lines (TOD4), relative to controls. The total number of branches and siliques increased more than twofold in AOD4 plants, leading to a 59% increase in the number of seeds produced. Analysis of endogenous BR levels in dwf4, Ws‐2 and AOD4 revealed that dwf4 accumulated the precursors of the 22α‐hydroxylation steps, whereas overexpression of DWF4 resulted in increased levels of downstream compounds relative to Ws‐2, indicative of facilitated metabolic flow through the step. Both the levels of DWF4 transcripts and BR phenotypic effects were progressively increased in dwf4, wild‐type and AOD4 plants, respectively. This suggests that it will be possible to control plant growth by engineering DWF4 transcription in plants.
Seven dwarf mutants resembling brassinosteroid (BR)-biosynthetic dwarfs were isolated that did not respond significantly to the application of exogenous BRs. Genetic and molecular analyses revealed that these were novel alleles of BRI1 (Brassinosteroid-Insensitive 1), which encodes a receptor kinase that may act as a receptor for BRs or be involved in downstream signaling. The results of morphological and molecular analyses indicated that these represent a range of alleles from weak to null. The endogenous BRs were examined from 5-week-old plants of a null allele (bri1-4) and two weak alleles (bri1-5 and bri1-6). Previous analysis of endogenous BRs in several BR-biosynthetic dwarf mutants revealed that active BRs are deficient in these mutants. However, bri1-4 plants accumulated very high levels of brassinolide, castasterone, and typhasterol (57-, 128-, and 33-fold higher, respectively, than those of wild-type plants). Weaker alleles (bri1-5 and bri1-6) also accumulated considerable levels of brassinolide, castasterone, and typhasterol, but less than the null allele (bri1-4). The levels of 6-deoxoBRs in bri1 mutants were comparable to those of wild type. The accumulation of biologically active BRs may result from the inability to utilize these active BRs, the inability to regulate BR biosynthesis in bri1 mutants, or both. Therefore, BRI1 is required for the homeostasis of endogenous BR levels.
We present a large portion of the transcriptome of Zea mays, including ESTs representing 484,032 cDNA clones from 53 libraries and 36,565 fully sequenced cDNA clones, out of which 31,552 clones are non-redundant. These and other previously sequenced transcripts have been aligned with available genome sequences and have provided new insights into the characteristics of gene structures and promoters within this major crop species. We found that although the average number of introns per gene is about the same in corn and Arabidopsis, corn genes have more alternatively spliced isoforms. Examination of the nucleotide composition of coding regions reveals that corn genes, as well as genes of other Poaceae (Grass family), can be divided into two classes according to the GC content at the third position in the amino acid encoding codons. Many of the transcripts that have lower GC content at the third position have dicot homologs but the high GC content transcripts tend to be more specific to the grasses. The high GC content class is also enriched with intronless genes. Together this suggests that an identifiable class of genes in plants is associated with the Poaceae divergence. Furthermore, because many of these genes appear to be derived from ancestral genes that do not contain introns, this evolutionary divergence may be the result of horizontal gene transfer from species not only with different codon usage but possibly that did not have introns, perhaps outside of the plant kingdom. By comparing the cDNAs described herein with the non-redundant set of corn mRNAs in GenBank, we estimate that there are about 50,000 different protein coding genes in Zea. All of the sequence data from this study have been submitted to DDBJ/GenBank/EMBL under accession numbers EU940701-EU977132 (FLI cDNA) and FK944382-FL482108 (EST).
From screening a population of Arabidopsis overexpression lines, two Arabidopsis genes were identified, EFO1 (EARLY FLOWERING BY OVEREXPRESSION 1) and EFO2, that confer early flowering when overexpressed. The two genes encode putative WD-domain proteins which share high sequence similarity and constitute a small subfamily. Interestingly, the efo2-1 loss-of-function mutant also flowered earlier in short days and slightly earlier in long days than the wild type, while no flowering-time or morphological differences were observed in efo1-1 relative to the wild type. In addition, the efo2-1 mutation perturbed hypocotyl elongation, leaf expansion and formation, and stem elongation. EFO1 and EFO2 are both regulated by the circadian clock. Expression and genetic analyses revealed that EFO2 suppresses flowering largely through the action of CONSTANS (CO) and FLOWERING LOCUS T (FT), suggesting that EFO2 is a negative regulator of photoperiodic flowering. The growth defects in efo2-1 were augmented in efo1 efo2, but the induction of FT in the double mutant was comparable to that in efo2-1. Thus, while EFO2 acts as a floral repressor, EFO1 may not be directly involved in flowering, but the two genes do have overlapping roles in regulating other developmental processes. EFO1 and EFO2 may function collectively to serve as one of the converging points where the signals of growth and flowering intersect.