Protein function can be tuned using laboratory evolution, in which one rapidly searches through a library of proteins for the properties of interest. In site-directed recombination, n crossovers are ...chosen in an alignment of p parents to define a set of p(n + 1) peptide fragments. These fragments are then assembled combinatorially to create a library of p^(n+1) proteins. We have developed a computational algorithm to enrich these libraries in folded proteins while maintaining an appropriate level of diversity for evolution. For a given set of parents, our algorithm selects crossovers that minimize the average energy of the library, subject to constraints on the length of each fragment. This problem is equivalent to finding the shortest path between nodes in a network, for which the global minimum can be found efficiently. Our algorithm has a running time of O(N^3 p^2 + N^2 n) for a protein of length N. Adjusting the constraints on fragment length generates a set of optimized libraries with varying degrees of diversity. By comparing these optima for different sets of parents, we rapidly determine which parents yield the lowest energy libraries.
SCHEMA-guided protein recombination
Silberg, Jonathan J; Endelman, Jeffrey B; Arnold, Frances H
Methods in Enzymology, 2004, Volume 388
Journal Article
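The crossover-selection problem described in the site-directed recombination abstract above can be illustrated with a small dynamic program over a layered graph. This is only a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, and `frag_cost(start, end)` is an assumed additive per-fragment cost standing in for the library energy, whose actual form the abstract does not give. Each node is a pair (number of fragments so far, position), each arc is one fragment of permitted length, and the cheapest path from the start to the end of the sequence gives the crossovers.

```python
from math import inf


def library_size(p, n):
    """Number of chimeras assembled from p parents and n crossovers: p^(n+1)."""
    return p ** (n + 1)


def best_crossovers(N, n, frag_cost, min_len, max_len):
    """Pick n crossovers in a length-N alignment that minimize total cost.

    frag_cost(start, end) is a HYPOTHETICAL additive cost for the fragment
    covering residues [start, end); it is an assumption of this sketch.
    Returns (total_cost, crossover_positions), or (inf, None) if the
    fragment-length constraints cannot be met.
    """
    # best[k][j]: cheapest way to split residues [0, j) into k fragments.
    best = [[inf] * (N + 1) for _ in range(n + 2)]
    back = [[None] * (N + 1) for _ in range(n + 2)]
    best[0][0] = 0.0

    for k in range(1, n + 2):
        for j in range(1, N + 1):
            # The k-th fragment is [i, j) with min_len <= j - i <= max_len.
            for i in range(max(0, j - max_len), j - min_len + 1):
                if best[k - 1][i] == inf:
                    continue
                cost = best[k - 1][i] + frag_cost(i, j)
                if cost < best[k][j]:
                    best[k][j] = cost
                    back[k][j] = i

    if best[n + 1][N] == inf:
        return inf, None

    # Walk the back-pointers from the end of the sequence to position 0.
    starts = []
    j = N
    for k in range(n + 1, 0, -1):
        j = back[k][j]
        starts.append(j)
    crossovers = starts[:-1][::-1]  # drop the leading 0, restore order
    return best[n + 1][N], crossovers


if __name__ == "__main__":
    # Toy cost that penalizes long fragments (illustrative only).
    toy_cost = lambda i, j: (j - i) ** 2
    print(library_size(3, 2))
    print(best_crossovers(12, 2, toy_cost, min_len=2, max_len=6))
```

The triple loop examines on the order of N^2 n arcs, which is in line with the N^2 n term of the running time quoted in the abstract; the N^3 p^2 term would presumably come from computing the arc weights, which this sketch leaves to the caller. Tightening or loosening `min_len` and `max_len` is the knob the abstract describes for generating libraries with different degrees of diversity.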
Remote sensing is revolutionizing the phenotyping of agricultural field trials, but for many researchers, the extraction of plot‐level results is a bottleneck. We have developed the R package ...FIELDimageR as a user‐friendly tool to analyze orthomosaic images containing many plots. The basic workflow involves cropping and rotating the image, followed by the creation of a shapefile based on the experimental design. The package includes functions to calculate the number of plants per plot, canopy cover percentage, vegetation indices, and plant height. FIELDimageR is publicly available as a GitHub repository (https://github.com/filipematias23/FIELDimageR).
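As a rough illustration of two of the plot-level quantities mentioned above, a vegetation index and canopy cover percentage, the following Python sketch works on a plain list of RGB pixels. It does not use or reproduce the FIELDimageR API (which is an R package); the function names, the choice of the normalized green-red difference index (NGRDI), and the zero threshold are all assumptions made for this example.

```python
def ngrdi(red, green):
    """Normalized green-red difference index: (G - R) / (G + R)."""
    total = red + green
    return (green - red) / total if total else 0.0


def canopy_cover(pixels, threshold=0.0):
    """Percentage of (R, G, B) pixels whose NGRDI exceeds the threshold.

    The threshold is a HYPOTHETICAL cut-off separating vegetation from soil;
    in practice it would be tuned per image.
    """
    if not pixels:
        return 0.0
    vegetated = sum(1 for r, g, _b in pixels if ngrdi(r, g) > threshold)
    return 100.0 * vegetated / len(pixels)


if __name__ == "__main__":
    # Four made-up pixels from one plot: two greenish, two brownish.
    plot = [(50, 120, 40), (60, 130, 50), (120, 100, 80), (130, 110, 90)]
    print(canopy_cover(plot))
```

In a real workflow the pixel list for each plot would come from clipping the orthomosaic with the plot polygons of the shapefile, a step this sketch omits.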
Processed potato products, such as chips and fries, contribute to the dietary intake of acrylamide, a suspected human carcinogen. One of the most promising approaches for reducing acrylamide ...consumption is to develop and commercialize new potato varieties with low acrylamide-forming potential. To facilitate this effort, a National Fry Processing Trial (NFPT) was conducted from 2011-2013 in five states. More than 140 advanced breeding lines were evaluated for tuber agronomic traits and biochemical properties from harvest through eight months of storage. Several dozen entries had significantly less acrylamide than the check varieties Russet Burbank and Ranger Russet, with reductions in excess of 50%. As in previous studies, the glucose content of raw tubers was highly predictive of acrylamide in finished fries (R^2 = 0.64–0.77). Despite its role in acrylamide formation, tuber free asparagine was not predictive of acrylamide, potentially because it showed relatively little variation in the NFPT population. Even when glucose was included in the model as a covariate, genotype was highly significant (p = 0.001) for predicting acrylamide, indicating there may be as yet unidentified genetic loci to target in breeding. The NFPT has demonstrated that many elite US clones with low acrylamide-forming potential exist, as a by-product of breeding for resistance to cold-induced sweetening. Our ongoing challenge is to combine this trait with the complex quality attributes required by the food industry.
Recombination generates chimeric proteins whose ability to fold depends on minimizing structural perturbations that result when portions of the sequence are inherited from different parents. These ...chimeric sequences can display functional properties characteristic of the parents or acquire entirely new functions. Seventeen chimeras were generated from two CYP102 members of the functionally diverse cytochrome P450 family. Chimeras predicted to have limited structural disruption, as defined by the SCHEMA algorithm, displayed CO binding spectra characteristic of folded P450s. Even this small population exhibited significant functional diversity: chimeras displayed altered substrate specificities, a wide range in thermostabilities, up to a 40-fold increase in peroxidase activity, and ability to hydroxylate a substrate toward which neither parent heme domain shows detectable activity. These results suggest that SCHEMA-guided recombination can be used to generate diverse P450s for exploring function evolution within the P450 structural framework.
New sources of genetic diversity must be incorporated into plant breeding programs if they are to continue increasing grain yield and quality, and tolerance to abiotic and biotic stresses. Germplasm ...collections provide a source of genetic and phenotypic diversity, but characterization of these resources is required to increase their utility for breeding programs. We used a barley SNP iSelect platform with 7,842 SNPs to genotype 2,417 barley accessions sampled from the USDA National Small Grains Collection of 33,176 accessions. Most of the accessions in this core collection are categorized as landraces or cultivars/breeding lines and were obtained from more than 100 countries. Both STRUCTURE and principal component analysis identified five major subpopulations within the core collection, mainly differentiated by geographical origin and spike row number (an inflorescence architecture trait). Different patterns of linkage disequilibrium (LD) were found across the barley genome and many regions of high LD contained traits involved in domestication and breeding selection. The genotype data were used to define 'mini-core' sets of accessions capturing the majority of the allelic diversity present in the core collection. These 'mini-core' sets can be used for evaluating traits that are difficult or expensive to score. Genome-wide association studies (GWAS) of 'hull cover', 'spike row number', and 'heading date' demonstrate the utility of the core collection for locating genetic factors determining important phenotypes. The GWAS results were referenced to a new barley consensus map containing 5,665 SNPs. Our results demonstrate that GWAS and high-density SNP genotyping are effective tools for plant breeders interested in accessing genetic diversity in large germplasm collections.
As part of the USDA Barley Coordinated Agricultural Project, 899 two-row spring barley lines were genotyped at 3072 SNP markers and phenotyped for four food quality traits: kernel hardness, ...polyphenol oxidase (PPO) activity, total phenolics, and amylose content. Analysis of the existing consensus map for this marker set revealed significant distortion in the fine structure of the Hardness locus on chromosome 5H. To create a revised barley consensus map with improved fine structure, new algorithms for the simplification and linearization of consensus graphs were developed and implemented in the R package DAGGER. DAGGER uses quadratic programming to generate a consensus map with minimum error relative to the linkage maps while remaining ordinally consistent with them. The root-mean-squared error between the barley linkage maps and the DAGGER map was 0.82 cM per marker interval compared to 2.28 cM for the existing consensus map. Association mapping of the barley food quality traits identified significant features corresponding to the PPO locus on chromosome 2H and the Wax locus on 7H, but the majority of the genetic variation was unaccounted for. While the polygenic nature of the food quality traits makes traditional marker-assisted selection difficult, genomic selection is well suited for this complexity because all markers are included in the prediction model. A key method for the genomic prediction of breeding values is ridge regression (RR), which is equivalent to BLUP when the genetic covariance between lines is proportional to their similarity in genotype space. This additive model can be broadened to include epistatic effects by using other kernels, such as the Gaussian, which represent inner products in a complex feature space. To facilitate the use of RR and non-additive kernels in plant breeding, a new software package for R called rrBLUP was developed. 
When applied to the barley food quality traits, the cross-validation accuracy between phenotype and predicted breeding value ranged from 0.31 for total phenolics to 0.56 for kernel hardness. Although further research is needed, these results suggest genomic selection of barley food quality may be viable in the near future.
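The stated equivalence between ridge regression on markers and a kernel (BLUP-style) prediction can be checked numerically on a toy example. The sketch below is written in plain Python and is not the rrBLUP package or its API; all function names, the toy genotype matrix, the penalty `lam`, and the Gaussian scale parameter `theta` are assumptions for illustration, and phenotypes are assumed to be centered so that no intercept is fitted. With the linear kernel K = X X', the fitted values K (K + lam I)^-1 y coincide with the marker-effect form X (X'X + lam I)^-1 X'y; swapping in a Gaussian kernel gives the non-additive variant.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def linear_kernel(X):
    """K = X X': similarity of lines in genotype space."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def gaussian_kernel(X, theta):
    """K_ij = exp(-d_ij^2 / theta^2), with d the Euclidean distance."""
    def d2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [[math.exp(-d2(xi, xj) / theta ** 2) for xj in X] for xi in X]


def kernel_fit(K, y, lam):
    """Fitted values K (K + lam I)^-1 y."""
    n = len(y)
    A = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(A, y)
    return [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]


def ridge_fit(X, y, lam):
    """Fitted values X (X'X + lam I)^-1 X'y (marker-effect form)."""
    m = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) + (lam if a == b else 0.0)
            for b in range(m)] for a in range(m)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(m)]
    beta = solve(XtX, Xty)
    return [sum(xa * ba for xa, ba in zip(row, beta)) for row in X]


if __name__ == "__main__":
    # Made-up data: 4 lines x 3 markers coded -1/0/1, centered phenotypes.
    X = [[1, -1, 0], [1, 1, -1], [-1, 0, 1], [0, 1, 1]]
    y = [0.5, 1.2, -0.9, -0.8]
    print(ridge_fit(X, y, lam=1.0))
    print(kernel_fit(linear_kernel(X), y, lam=1.0))
    print(kernel_fit(gaussian_kernel(X, theta=2.0), y, lam=1.0))
```

The first two printed vectors should agree to rounding error, which is the ridge regression and BLUP equivalence in miniature. The kernel form only needs an n x n solve over lines rather than an m x m solve over markers, which is convenient when markers far outnumber lines.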