The amino acid sequence of a protein affects both its structure and its function. Thus, the ability to modify the sequence, and hence the structure and activity, of individual proteins in a systematic way opens up many opportunities, both scientifically and (as we focus on here) for exploitation in biocatalysis. Modern methods of synthetic biology, whereby increasingly large sequences of DNA can be synthesised de novo, allow an unprecedented ability to engineer proteins with novel functions. However, the number of possible proteins is far too large to test individually, so we need means for navigating the 'search space' of possible protein sequences efficiently and reliably in order to find desirable activities and other properties. Enzymologists distinguish binding (Kd) and catalytic (kcat) steps. In a similar way, judicious strategies have blended design (for binding, specificity and active site modelling) with the more empirical methods of classical directed evolution (DE) for improving kcat (where natural evolution rarely seeks the highest values), especially with regard to residues distant from the active site and where the functional linkages underpinning enzyme dynamics are both unknown and hard to predict. Epistasis (where the 'best' amino acid at one site depends on that or those at others) is a notable feature of directed evolution. The aim of this review is to highlight some of the approaches that are being developed to allow us to use directed evolution to improve enzyme properties, often dramatically. We note that directed evolution differs in a number of ways from natural evolution, including in particular the available mechanisms and the likely selection pressures. Thus, we stress the opportunities afforded by techniques that enable one to map sequence to (structure and) activity in silico, as an effective means of modelling and exploring protein landscapes.
Because known landscapes may be assessed and reasoned about as a whole, simultaneously, this offers opportunities for protein improvement not readily available to natural evolution on rapid timescales. Intelligent landscape navigation, informed by sequence-activity relationships and coupled to the emerging methods of synthetic biology, offers scope for the development of novel biocatalysts that are both highly active and robust.
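To make the scale of the 'search space' and the notion of epistasis concrete, the following is a minimal Python sketch. It is not taken from the review: the 300-residue length and all fitness values are hypothetical, chosen only for illustration, and the function names are our own.

```python
import math

N_AMINO_ACIDS = 20  # canonical proteinogenic amino acids


def log10_sequence_space(length):
    """log10 of the number of possible sequences of this length (20**length)."""
    return length * math.log10(N_AMINO_ACIDS)


def variants_at_distance(length, n_mutations):
    """Count sequences differing from a parent at exactly n_mutations positions."""
    return math.comb(length, n_mutations) * (N_AMINO_ACIDS - 1) ** n_mutations


def pairwise_epistasis(f_wt, f_a, f_b, f_ab):
    """Deviation of the double mutant from the additive expectation.

    Zero means the two substitutions act independently; a non-zero value
    means the effect of one substitution depends on the other.
    """
    return f_ab - f_a - f_b + f_wt


if __name__ == "__main__":
    length = 300  # hypothetical enzyme length (assumption)
    print(f"log10(sequence space) = {log10_sequence_space(length):.1f}")
    print("single mutants:", variants_at_distance(length, 1))
    print("double mutants:", variants_at_distance(length, 2))
    # Made-up fitness values: each single mutant improves on the wild type,
    # but the double mutant is worse than either (negative epistasis).
    print("epistasis:", pairwise_epistasis(f_wt=1.0, f_a=1.5, f_b=1.2, f_ab=0.8))
```

Even the double mutants of a single parent already number in the millions, which is why exhaustive testing is impractical and why epistatic interactions cannot simply be mapped site by site.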
Abstract
The pathogenic fungus Aspergillus fumigatus is a major etiological agent of invasive and chronic fungal diseases affecting tens of millions of individuals worldwide. Draft genome sequences of two clinical isolates (Af293 and A1163) are commonly used as reference genomes for analyses of clinical and environmental strains. However, the reference sequences lack coverage of centromeres, an accurate sequence for ribosomal repeats, and a comprehensive annotation of chromosomal rearrangements such as translocations and inversions. Here, we used PacBio Single Molecule Real-Time (SMRT), Oxford Nanopore and Illumina HiSeq sequencing for de novo genome assembly and polishing of two laboratory reference strains of A. fumigatus, CEA10 (parental isolate of A1163) and its descendant A1160. We generated full-length chromosome assemblies and comprehensive telomere-to-telomere coverage for CEA10, and a near-complete assembly of A1160, including ribosomal repeats and the sequences of centromeres, which we discovered to be composed of long transposon elements. We envision these high-quality reference genomes will become fundamental resources to study A. fumigatus biology, pathogenicity and virulence, and to discover more effective treatments against diseases caused by this fungus.
The ability to engineer biological systems, whether to introduce novel functionality or improved performance, is a cornerstone of biotechnology and synthetic biology. Typically, this requires the generation of genetic diversity to explore variations in phenotype, a process that can be performed at many levels, from single molecule targets (e.g., in directed evolution of enzymes) to whole organisms (e.g., in chassis engineering). Recent advances in DNA synthesis technology and automation have enhanced our ability to create variant libraries with greater control and throughput. This review highlights the latest developments in approaches to create such a hierarchy of diversity from the enzyme level to entire pathways in vitro, with a focus on the creation of combinatorial libraries that are required to navigate a target's vast design space successfully to uncover significant improvements in function.
Transcription factor-based biosensors are useful tools for the detection of metabolites and industrially valuable molecules, and present many potential applications in biotechnology and biomedicine. However, the most common approach to develop biosensors relies on employing a limited set of naturally occurring allosteric transcription factors (aTFs). Therefore, altering the ligand specificity of aTFs towards the detection of new effectors is an important goal.
Here, the PcaV repressor, a member of the MarR aTF family, was used to develop a biosensor for the detection of hydroxyl-substituted benzoic acids, including protocatechuic acid (PCA). The PCA biosensor was further subjected to directed evolution to alter its ligand specificity towards vanillin and other closely related aromatic aldehydes, to generate the Van2 biosensor. Ligand recognition of Van2 was explored in vitro using a range of biochemical and biophysical analyses, and extensive in vivo genetic-phenotypic analysis was performed to determine the role of each amino acid change upon biosensor performance.
This is the first study to report directed evolution of a member of the MarR aTF family, and demonstrates the plasticity of the PCA biosensor by altering its ligand specificity to generate a biosensor for aromatic aldehydes.
Abstract
Despite its greener credentials, biomanufacturing remains financially uncompetitive compared with the higher-carbon-emitting, hydrocarbon-based chemical industry. Replacing traditional chassis such as E. coli with novel robust organisms is a route to cost reduction for biomanufacturing. Extremophile bacteria such as the halophilic Halomonas bluephagenesis TD01 exemplify this potential by thriving in environments inherently inimical to other organisms, so reducing sterilisation costs. Novel chassis are inevitably less well annotated than established organisms. Rapid characterisation along with community data sharing will facilitate adoption of such organisms for biomanufacturing. The data record comprises a newly sequenced genome for the organism and evidence via LC-MS-based proteomics for expression of 1160 proteins (30% of the proteome), including baseline quantification of 1063 proteins (27% of the proteome), and a spectral library enabling re-use for targeted LC-MS proteomics assays. Protein data are annotated with KEGG Orthology, enabling rapid matching of quantitative data to pathways of interest to biomanufacturing.
Epstein-Barr virus (EBV) is a human herpesvirus that persists as a largely subclinical infection in the vast majority of adults worldwide. Recent evidence indicates that an important component of the persistence strategy involves active interference with the MHC class I antigen processing pathway during the lytic replication cycle. We have now identified a novel role for the lytic cycle gene, BILF1, which encodes a glycoprotein with the properties of a constitutive signaling G-protein-coupled receptor (GPCR). BILF1 reduced the levels of MHC class I at the cell surface and inhibited CD8(+) T cell recognition of endogenous target antigens. The underlying mechanism involves physical association of BILF1 with MHC class I molecules, an increased turnover from the cell surface, and enhanced degradation via lysosomal proteases. The BILF1 protein of the closely related CeHV15 gamma(1)-herpesvirus of the Rhesus Old World primate (80% amino acid sequence identity) downregulated surface MHC class I similarly to EBV BILF1. Amongst the human herpesviruses, the GPCR encoded by the ORF74 of the KSHV gamma(2)-herpesvirus is most closely related to EBV BILF1 (15% amino acid sequence identity) but did not affect levels of surface MHC class I. An engineered mutant of BILF1 that was unable to activate G protein signaling pathways retained the ability to downregulate MHC class I, indicating that the immune-modulating and GPCR-signaling properties are two distinct functions of BILF1. These findings extend our understanding of the normal biology of an important human pathogen. The discovery of a third EBV lytic cycle gene that cooperates to interfere with MHC class I antigen processing underscores how important it is for EBV to evade CD8(+) T cell responses during the lytic replication cycle, at a time when such a large number of potential viral targets are expressed.
Abstract
Kazachstania bulderi is a non-conventional yeast species able to grow efficiently on glucose and δ-gluconolactone at low pH. These unique traits make K. bulderi an ideal candidate for use in sustainable biotechnology processes, including low pH fermentations and the production of green chemicals such as organic acids. To accelerate strain development with this species, detailed information on its genetics is needed. Here, by employing long read sequencing, we report a high-quality phased genome assembly for three strains of K. bulderi, including the type strain. The sequences were assembled into 12 chromosomes with a total length of 14 Mb, and the genome was fully annotated at structural and functional levels, including allelic and structural variants, ribosomal array and mating type locus. This high-quality reference genome provides a resource to advance our fundamental knowledge of biotechnologically relevant non-conventional yeasts and to support the development of genetic tools for manipulating such strains towards their use as production hosts in biotechnological processes.
Synthetic biology utilizes the Design-Build-Test-Learn pipeline for the engineering of biological systems. Typically, this requires the construction of specifically designed, large and complex DNA assemblies. The availability of cheap DNA synthesis and automation enables high-throughput assembly approaches, which generates a heavy demand for DNA sequencing to verify correctly assembled constructs. Next-generation sequencing is ideally positioned to perform this task; however, with expensive hardware and bespoke data analysis requirements, few laboratories utilize this technology in-house. Here a workflow for highly multiplexed sequencing is presented, capable of fast and accurate sequence verification of DNA assemblies using nanopore technology. A novel sample barcoding system using polymerase chain reaction is introduced, and sequencing data are analyzed through a bespoke analysis algorithm. Crucially, this algorithm overcomes the problem of high-error-rate nanopore data (which typically prevents identification of single nucleotide variants) through statistical analysis of strand bias, permitting accurate sequence analysis with single-base resolution. As an example, 576 constructs (6 × 96-well plates) were processed in a single workflow in 72 h (from colonies to analyzed data). Given our procedure's low hardware costs and highly multiplexed capability, this provides cost-effective access to powerful DNA sequencing for any laboratory, with applications beyond synthetic biology including directed evolution, single nucleotide polymorphism analysis and gene synthesis.
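The abstract does not describe the analysis algorithm in detail, so the following Python sketch only illustrates the general idea behind using strand information: a genuine single nucleotide variant should be supported by reads from both strands, whereas a systematic sequencing error often appears on one strand only. The `StrandCounts` type, the `call_variant` function and both thresholds are hypothetical and are not the published method.

```python
from dataclasses import dataclass


@dataclass
class StrandCounts:
    """Read counts at one position, split by strand and by allele."""
    fwd_ref: int
    fwd_alt: int
    rev_ref: int
    rev_alt: int


def alt_fraction(alt, ref):
    """Fraction of reads supporting the alternative base (0.0 if no reads)."""
    total = alt + ref
    return alt / total if total else 0.0


def call_variant(counts, min_fraction=0.6, min_depth=10):
    """Call a variant only if BOTH strands support it.

    min_fraction and min_depth are illustrative thresholds (assumptions),
    not values taken from the paper.
    """
    fwd_depth = counts.fwd_ref + counts.fwd_alt
    rev_depth = counts.rev_ref + counts.rev_alt
    if fwd_depth < min_depth or rev_depth < min_depth:
        return False  # too little evidence on at least one strand
    return (
        alt_fraction(counts.fwd_alt, counts.fwd_ref) >= min_fraction
        and alt_fraction(counts.rev_alt, counts.rev_ref) >= min_fraction
    )


if __name__ == "__main__":
    # Supported on both strands: treated as a real variant.
    print(call_variant(StrandCounts(fwd_ref=5, fwd_alt=45, rev_ref=4, rev_alt=46)))
    # Seen almost only on the forward strand: treated as a strand-biased error.
    print(call_variant(StrandCounts(fwd_ref=10, fwd_alt=40, rev_ref=48, rev_alt=2)))
```

A production pipeline would more likely replace the fixed thresholds with a statistical test on the per-strand counts; the simple cut-offs here are used only to keep the sketch short.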
CodonGenie, freely available from http://codon.synbiochem.co.uk, is a simple web application for designing ambiguous codons to support protein mutagenesis applications. Ambiguous codons are derived from specific heterogeneous nucleotide mixtures, which create sequence degeneracy when synthesised in a DNA library. In directed evolution studies, such codons are carefully selected to encode multiple amino acids. For example, the codon NTN, where the code N denotes a mixture of all four nucleotides, will encode a mixture of phenylalanine, leucine, isoleucine, methionine and valine. Given a user-defined target collection of amino acids matched to an intended host organism, CodonGenie designs and analyses all ambiguous codons that encode the required amino acids. The codons are ranked according to their efficiency in encoding the required amino acids while minimising the inclusion of additional amino acids and stop codons. Organism-specific codon usage is also considered.
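The NTN example can be reproduced with a short Python sketch that expands an ambiguous codon using the IUPAC nucleotide codes and the standard genetic code. This is not CodonGenie's implementation: the function names are our own, the ranking score (fraction of expanded codons that encode a target amino acid) is a simplified stand-in for the tool's ranking, and organism-specific codon usage is ignored.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with bases ordered T, C, A, G at each position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(ambiguous_codon):
    """Return every concrete codon covered by an ambiguous codon."""
    pools = [IUPAC[base] for base in ambiguous_codon.upper()]
    return ["".join(codon) for codon in product(*pools)]


def encoded_amino_acids(ambiguous_codon):
    """Map each encoded amino acid ('*' = stop) to its number of codons."""
    counts = {}
    for codon in expand(ambiguous_codon):
        aa = CODON_TABLE[codon]
        counts[aa] = counts.get(aa, 0) + 1
    return counts


def design(targets):
    """Rank ambiguous codons that cover all targets.

    The score is the on-target fraction of expanded codons, an illustrative
    measure only (assumption), not CodonGenie's actual scoring.
    """
    targets = set(targets)
    ranked = []
    for letters in product(IUPAC, repeat=3):
        codon = "".join(letters)
        counts = encoded_amino_acids(codon)
        if targets.issubset(counts):
            on_target = sum(n for aa, n in counts.items() if aa in targets)
            ranked.append((codon, on_target / sum(counts.values())))
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    print(encoded_amino_acids("NTN"))
    print(design("FLIMV")[:5])
```

Expanding NTN gives 16 codons that encode only F, L, I, M and V, in agreement with the example in the text; several other ambiguous codons cover the same five amino acids without off-target residues, which is why a ranking step is needed.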
Directed evolution enables the improvement and optimisation of enzymes for particular applications and is a valuable tool for biotechnology and synthetic biology. However, studies are often limited in their scope by the inability to screen very large numbers of variants to identify improved enzymes. One class of enzyme for which a universal, operationally simple ultra-high throughput (>10^6 variants per day) assay is not available is flavin adenine dinucleotide (FAD)-dependent oxidases. The current high throughput assay involves a visual, colourimetric, colony-based screen; however, this is not suitable for very large libraries and does not enable quantification of the relative fitness of variants. To address this, we describe an optimised method for the sensitive detection of oxidase activity within single Escherichia coli (E. coli) cells, using the monoamine oxidase from Aspergillus niger, MAO-N, as a model system. In contrast to other methods for the screening of oxidase activity in vivo, this method does not require cell surface expression, emulsion formation or the addition of an extracellular peroxidase. Furthermore, we show that fluorescence-activated cell sorting (FACS) of large libraries derived from MAO-N under the assay conditions can enrich the library in functional variants at much higher rates than via the colony-based method. We demonstrate its use for directed evolution by identifying a new mutant of MAO-N with improved activity towards a novel secondary amine substrate. This work demonstrates, for the first time, an ultra-high throughput screening methodology widely applicable for the directed evolution of FAD-dependent oxidases in E. coli.