Life is that which replicates and evolves, but there is no consensus on how life emerged. We advocate a systems protobiology view, whereby the first replicators were assemblies of spontaneously accreting, heterogeneous and mostly non-canonical amphiphiles. This view is substantiated by rigorous chemical kinetics simulations of the graded autocatalysis replication domain (GARD) model, based on the notion that the replication or reproduction of compositional information predated that of sequence information. GARD reveals the emergence of privileged non-equilibrium assemblies (composomes), which portray catalysis-based homeostatic (concentration-preserving) growth. Such a process, along with occasional assembly fission, embodies cell-like reproduction. GARD pre-RNA evolution is evidenced in the selection of different composomes within a sparse fitness landscape, in response to environmental chemical changes. These observations refute claims that GARD assemblies (or other mutually catalytic networks in the metabolism-first scenario) cannot evolve. Composomes represent both a genotype and a selectable phenotype, anteceding present-day biology in which the two are mostly separated. Detailed GARD analyses show attractor-like transitions from random assemblies to self-organized composomes, with negative entropy change, thus establishing composomes as dissipative systems, a hallmark of life. We show a preliminary new version of our model, metabolic GARD (M-GARD), in which lipid covalent modifications are orchestrated by non-enzymatic lipid catalysts, themselves compositionally reproduced. M-GARD addresses the lack of true metabolism in basic GARD, and is supported by a published experimental instance of a lipid-based mutually catalytic network. Anticipating near-future far-reaching progress of molecular dynamics, M-GARD is slated to quantitatively depict elaborate protocells, with orchestrated reproduction of both lipid bilayer and lumenal content.
Finally, a GARD analysis in a whole-planet context offers the potential for estimating the probability of life's emergence. The invigorated GARD scrutiny presented in this review enhances the validity of autocatalytic sets as a bona fide early evolution scenario and provides essential infrastructure for a paradigm shift towards a systems protobiology view of life's origin.
Systems chemistry has been a key component of origin of life research, invoking models of life's inception based on evolving molecular networks. One such model is the graded autocatalysis replication domain (GARD) formalism embodied in a lipid world scenario, which offers rigorous computer simulation based on defined chemical kinetics equations. GARD suggests that the first pre-RNA life-like entities could have been homeostatically growing assemblies of amphiphiles, undergoing compositional replication and mutations, as well as rudimentary selection and evolution. Recent progress in molecular dynamics has provided an experimental tool to study complex biological phenomena such as protein folding, ligand-receptor interactions, and micellar formation, growth, and fission. The detailed molecular definition of GARD and its inter-molecular catalytic interactions make it highly compatible with molecular dynamics analyses. We present a roadmap for simulating GARD's kinetic and thermodynamic behavior using various molecular dynamics methodologies. We review different approaches for testing the validity of the GARD model by following micellar accretion and fission events and examining compositional changes over time. Near-future computational advances could provide empirical delineation for further system complexification, from simple compositional non-covalent assemblies towards more life-like protocellular entities with covalent chemistry that underlies metabolism and genetic encoding.
Early steps in the origin of life were necessarily connected to the unlikely formation of self-reproducing structures from chaotic chemistry. Simulations of chemical kinetics based on the graded autocatalysis replication domain (GARD) model demonstrate the ability of a micellar system to form self-reproducing units away from equilibrium. Even though such units may be very rare in the initial state of the system, the fact that their endogenous mutually catalytic networks are dynamic attractors greatly enhances their reproduction propensity, revealing their potential for selection and Darwinian evolution processes. In parallel, order and complexity have been shown to be crucial parameters in successful evolution. Here, we probe these parameters in the dynamics of GARD-governed entities in an attempt to identify characteristic mechanisms of their development in non-covalent molecular assemblies. Using a virtual random walk perspective, a value for consecutive order is defined based on statistical thermodynamics. The complexity, on the other hand, is determined by the size of a minimal algorithm fully describing the statistical properties of the random walk. By referring to a previously published diagonal line in an order/complexity diagram that represents the progression of evolution, it is shown that the GARD model has the potential to advance in this direction. These results can serve as a solid foundation for identifying general criteria for future analyses of evolving systems.
Olfactory receptors (ORs) are G protein-coupled receptors with a crucial role in odor detection. A typical mammalian genome harbors ~1000 OR genes and pseudogenes; however, different gene duplication/deletion events have occurred in each species, resulting in complex orthology relationships. While the human OR nomenclature is widely accepted and based on phylogenetic classification into 18 families and further into subfamilies, for other mammals different and multiple nomenclature systems are currently in use, thus concealing important evolutionary and functional insights.
Here, we describe the Mutual Maximum Similarity (MMS) algorithm, a systematic classifier for assigning a human-centric nomenclature to any OR gene based on inter-species hierarchical pairwise similarities. MMS was applied to the OR repertoires of seven mammals and zebrafish. Altogether, we assigned symbols to 10,249 ORs. This nomenclature is supported by both phylogenetic and synteny analyses. The availability of a unified nomenclature provides a framework for diverse studies, where textual symbol comparison allows immediate identification of potential ortholog groups as well as species-specific expansions/deletions; for example, Or52e5 and Or52e5b represent a rat-specific duplication of OR52E5. Another example is the complete absence of OR subfamily OR6Z among primate OR symbols. In other mammals, OR6Z members are located in one genomic cluster, suggesting a large deletion in the great ape lineage. An additional 14 mammalian OR subfamilies are missing from the primate genomes. While in chimpanzee 87% of the symbols were identical to human symbols, this number decreased to ~ 50% in dog and cow and to ~ 30% in rodents, reflecting the adaptive changes of the OR gene superfamily across diverse ecological niches. Application of the proposed nomenclature to zebrafish revealed similarity to mammalian ORs that could not be detected from the current zebrafish olfactory receptor gene nomenclature.
We have consolidated a unified standard nomenclature system for the vertebrate OR superfamily. The new nomenclature system will be applied to cow, horse, dog and chimpanzee by the Vertebrate Gene Nomenclature Committee and its implementation is currently under consideration by other relevant species-specific nomenclature committees.
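The mutual-maximum idea behind MMS can be illustrated with a small sketch. This is not the published algorithm: it assumes a precomputed table of pairwise similarity scores and covers only the simplest case, in which a query-species gene receives a human symbol when each gene is the other's best match. The hierarchical family/subfamily logic and the suffix rules for duplicates are not reproduced, and the gene identifiers and scores below are hypothetical.

```python
from typing import Dict, Tuple


def mutual_best_pairs(similarity: Dict[Tuple[str, str], float]) -> Dict[str, str]:
    """Return {query_gene: human_gene} for pairs that are each other's best hit.

    `similarity` maps (query_gene, human_gene) to a similarity score.
    Ties are resolved in favor of the pair seen first (an arbitrary choice).
    """
    best_for_query: Dict[str, Tuple[str, float]] = {}
    best_for_human: Dict[str, Tuple[str, float]] = {}
    for (query, human), score in similarity.items():
        if query not in best_for_query or score > best_for_query[query][1]:
            best_for_query[query] = (human, score)
        if human not in best_for_human or score > best_for_human[human][1]:
            best_for_human[human] = (query, score)
    # Keep only pairs where the best hit is reciprocal.
    return {
        query: human
        for query, (human, _) in best_for_query.items()
        if best_for_human[human][0] == query
    }


# Hypothetical scores: two rat genes both resemble OR52E5 most closely.
scores = {
    ("ratA", "OR52E5"): 0.95,
    ("ratB", "OR52E5"): 0.90,
    ("ratA", "OR1A1"): 0.40,
    ("ratC", "OR1A1"): 0.80,
}
pairs = mutual_best_pairs(scores)
```

In this toy input, ratB has OR52E5 as its best human hit but is not the best hit of OR52E5, so it is left unassigned; a full classifier would need an additional rule for naming such species-specific duplicates.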
GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from ∼120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and classification in GeneCards. First, we constructed the Human Integrated Protein Expression Database (HIPED), a unified database of protein abundance in human tissues, based on the publicly available mass spectrometry (MS)-based proteomics sources ProteomicsDB, Multi-Omics Profiling Expression Database, Protein Abundance Across Organisms and The MaxQuant DataBase. The integrated database, residing within GeneCards, compares favourably with its individual sources, covering nearly 90% of human protein-coding genes. For gene annotation and comparisons, we first defined a protein expression vector for each gene, based on normalized abundances in 69 normal human tissues. This vector is portrayed in the GeneCards expression section as a bar graph, allowing visual inspection and comparison. These data are juxtaposed with transcriptome bar graphs. Using the protein expression vectors, we further defined a pairwise metric that helps assess expression-based pairwise proximity. This new metric for finding functional partners complements eight others, including sharing of pathways, gene ontology (GO) terms and domains, implemented in the GeneCards Suite. In parallel, we calculated proteome-based differential expression, highlighting a subset of tissues that overexpress a gene and subserving gene classification. This textual annotation allows users of VarElect, the suite's next-generation phenotyper, to more effectively discover causative disease variants. Finally, we define the protein-RNA expression ratio and correlation as yet another attribute of every gene in each tissue, adding further annotative information.
The results constitute a significant enhancement of several GeneCards sections and help promote and organize the genome-wide structural and functional knowledge of the human proteome. Database URL: http://www.genecards.org/.
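As an illustration of comparing genes by their protein expression vectors, the sketch below normalizes each vector and uses the Pearson correlation as a stand-in proximity measure. The text above does not specify the actual pairwise metric, so the choice of Pearson correlation, the function names, and the toy three-tissue vectors are all assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a non-negative abundance vector so its entries sum to 1."""
    total = sum(vector)
    if total == 0:
        return [0.0] * len(vector)
    return [x / total for x in vector]


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length expression vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_a * var_b)


# Toy vectors over three tissues (the real vectors span 69 tissues).
gene_x = normalize([1.0, 2.0, 3.0])
gene_y = normalize([2.0, 4.0, 6.0])
gene_z = normalize([3.0, 2.0, 1.0])
similar = pearson(gene_x, gene_y)
opposite = pearson(gene_x, gene_z)
```

The same `pearson` helper could in principle be applied to a gene's protein and RNA vectors to obtain a protein-RNA correlation, again as an assumption about how such an attribute might be computed.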
Olfactory receptors (ORs) are the largest gene family in the human genome. Although they are expected to be expressed specifically in olfactory tissues, some ectopic expression has been reported, with special emphasis on sperm and testis. The present study systematically explores the expression patterns of OR genes in a large number of tissues and assesses the potential functional implication of such ectopic expression.
We analyzed the expression of hundreds of human and mouse OR transcripts, via EST and microarray data, in several dozens of human and mouse tissues. Different tissues had specific, relatively small OR gene subsets which had particularly high expression levels. In testis, average expression was not particularly high, and very few highly expressed genes were found, none corresponding to ORs previously implicated in sperm chemotaxis. Higher expression levels were more common for genes with a non-OR genomic neighbor. Importantly, no correlation in expression levels was detected for human-mouse orthologous pairs. Also, no significant difference in expression levels was seen between intact and pseudogenized ORs, except for the pseudogenes of subfamily 7E which has undergone a human-specific expansion.
The OR superfamily as a whole shows widespread, locus-dependent and heterogeneous expression, in agreement with a neutral or near-neutral evolutionary model for transcription control. These results cannot reject the possibility that small OR subsets might play functional roles in different tissues; however, considerable care should be exercised when offering a functional interpretation for ectopic OR expression based only on transcription information.
The sense of smell is a complex molecular device, encompassing several hundred olfactory receptor proteins (ORs). These receptors, encoded by the largest human gene superfamily, integrate odorant signals into an accurate ‘odor image’ in the brain. Widespread phenotypic diversity in human olfaction is, in part, attributable to prevalent genetic variation in OR genes, owing to copy number variation, deletion alleles and deleterious single nucleotide polymorphisms. The development of new genomic tools, including next generation sequencing and CNV assays, provides opportunities to characterize the genetic variations of this system. The advent of large-scale functional screens of expressed ORs, combined with genetic association studies, has the potential to link variations in ORs to human chemosensory phenotypes. This promises to provide a genome-wide view of human olfaction, resulting in a deeper understanding of personalized odor coding, with the potential to decipher flavor and fragrance preferences.
Mixed lipid micelles were proposed to facilitate life through their documented growth dynamics and catalytic properties. Our previous research predicted that micellar self-reproduction involves catalyzed accretion of lipid molecules by the residing lipids, leading to compositional homeostasis. Here, we employ atomistic Molecular Dynamics simulations, beginning with 54 lipid monomers, tracking an entire course of micellar accretion. This was done to examine the self-assembly of variegated lipid clusters, allowing us to measure entry and exit rates of monomeric lipids into pre-micelles with different compositions and sizes. We observe considerable rate-modifications that depend on the assembly composition and scrutinize the underlying mechanisms as well as the energy contributions. Lastly, we describe the measured potential for compositional homeostasis in our simulated mixed micelles. This affirms the basis for micellar self-reproduction, with implications for the study of the origin of life.
It is widely accepted that autocatalysis constitutes a crucial facet of effective replication and evolution (e.g., in Eigen's hypercycle model). Other models for early evolution (e.g., by Dyson, Gánti, Varela, and Kauffman) invoke catalytic networks, where cross-catalysis is more apparent. A key question is how the balance between auto- (self-) and cross- (mutual) catalysis shapes the behavior of model evolving systems. This is investigated using the graded autocatalysis replication domain (GARD) model, previously shown to capture essential features of reproduction, mutation, and evolution in compositional molecular assemblies. We have performed numerical simulations of an ensemble of GARD networks, each with a different set of lognormally distributed catalytic values. We asked how the catalytic content of such networks influences beneficial evolution. Importantly, a clear trend was observed, wherein only networks with high mutual catalysis propensity allowed for an augmented diversity of composomes, quasi-stationary compositions that exhibit high replication fidelity. We have reexamined a recent analysis that showed meager selection in a single GARD instance and for a few nonstationary target compositions. In contrast, when we focused here on compotypes (clusters of composomes) as targets for selection in populations of compositional assemblies, appreciable selection response was observed for a large portion of the networks simulated. Further, stronger selection response was seen for high values of the mutual catalysis propensity. Our simulations thus demonstrate that GARD can help analyze important facets of evolving systems, and indicate that excess mutual catalysis over self-catalysis is likely to be important for the emergence of molecular systems capable of evolution-like behavior.
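The kind of simulation described above can be sketched in a few lines. The sketch assumes the GARD rate equation in the form commonly quoted in the GARD literature, dn_i/dt = (k_f ρ_i N − k_b n_i)(1 + (1/N) Σ_j β_ij n_j), which is not stated in the text itself. The lognormal parameters, rate constants, Euler time step, and the `mutual_fraction` measure (off-diagonal share of total catalysis) are hypothetical illustration choices and are not the paper's definition of mutual catalysis propensity.

```python
import random
from typing import List

Matrix = List[List[float]]


def make_beta(n_types: int, mu: float = -4.0, sigma: float = 4.0, seed: int = 0) -> Matrix:
    """Draw an n_types x n_types catalytic matrix from a lognormal distribution."""
    rng = random.Random(seed)
    return [[rng.lognormvariate(mu, sigma) for _ in range(n_types)] for _ in range(n_types)]


def mutual_fraction(beta: Matrix) -> float:
    """Share of total catalysis carried by off-diagonal (mutual) terms (illustrative)."""
    total = sum(sum(row) for row in beta)
    diagonal = sum(beta[i][i] for i in range(len(beta)))
    return (total - diagonal) / total


def gard_step(counts: List[float], beta: Matrix, rho: List[float],
              kf: float = 1e-2, kb: float = 1e-5, dt: float = 0.1) -> List[float]:
    """Advance assembly composition by one explicit Euler step of the assumed equation."""
    size = sum(counts)
    if size <= 0:
        raise ValueError("assembly must contain at least one molecule")
    updated = []
    for i, n_i in enumerate(counts):
        # Rate enhancement exerted on species i by the current composition.
        enhancement = 1.0 + sum(beta[i][j] * counts[j] for j in range(len(counts))) / size
        rate = (kf * rho[i] * size - kb * n_i) * enhancement
        updated.append(max(n_i + dt * rate, 0.0))
    return updated


# Two species with equal external concentrations; species 0 is self-catalytic.
no_catalysis = gard_step([10.0, 10.0], [[0.0, 0.0], [0.0, 0.0]], [0.5, 0.5])
with_catalysis = gard_step([10.0, 10.0], [[5.0, 0.0], [0.0, 0.0]], [0.5, 0.5])
```

A fuller simulation would iterate `gard_step` until the assembly doubles in size, split it at random into two daughters, and compare daughter and parent compositions across an ensemble of `make_beta` draws; those steps are omitted here to keep the sketch short.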
Motivation: Genes are often characterized dichotomously as either housekeeping or single-tissue specific. We conjectured that crucial functional information resides in genes with midrange profiles of expression. Results: To obtain such novel information genome-wide, we have determined the mRNA expression levels for one of the largest sets analyzed to date, comprising 62 839 probesets in 12 representative normal human tissues. Indeed, when using a newly defined graded tissue specificity index τ, valued between 0 for housekeeping genes and 1 for tissue-specific genes, genes with midrange profiles having 0.15 < τ < 0.85 were found to constitute >50% of all expression patterns. We developed a binary classification, indicating for every gene the I_B tissues in which it is overly expressed, and the 12 − I_B tissues in which it shows low expression. The 85 dominant midrange patterns with I_B = 2–11 were found to be bimodally distributed, and to contribute most significantly to the definition of tissue specification dendrograms. Our analyses provide a novel route to infer expression profiles for presumed ancestral nodes in the tissue dendrogram. Such definition has uncovered an unsuspected correlation, whereby de novo enhancement and diminution of gene expression go hand in hand. These findings highlight the importance of gene suppression events, with implications for the course of tissue specification in ontogeny and phylogeny. Availability: All data and analyses are publicly available at the GeneNote website, http://genecards.weizmann.ac.il/genenote/, and under GEO accession GSE803. Contact: doron.lancet@weizmann.ac.il Supplementary information: Four tables available at the above site.
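A graded tissue specificity index of the kind described above can be sketched as follows. The abstract does not give the formula, so the sketch assumes the widely used form τ = Σ_i (1 − x_i / x_max) / (N − 1), where x_i is the expression level in tissue i and N is the number of tissues. The 0.15 and 0.85 cutoffs come from the text; the category labels and the toy profiles are illustrative.

```python
from typing import Sequence


def tissue_specificity_tau(expression: Sequence[float]) -> float:
    """Return tau in [0, 1]: 0 for uniform expression, 1 for a single-tissue profile."""
    n_tissues = len(expression)
    if n_tissues < 2:
        raise ValueError("at least two tissues are required")
    x_max = max(expression)
    if x_max <= 0:
        raise ValueError("profile must contain a positive expression value")
    return sum(1.0 - x / x_max for x in expression) / (n_tissues - 1)


def classify(tau: float, low: float = 0.15, high: float = 0.85) -> str:
    """Label a profile using the midrange window 0.15 < tau < 0.85 from the text."""
    if tau <= low:
        return "housekeeping-like"
    if tau >= high:
        return "tissue-specific"
    return "midrange"


# Toy 12-tissue profiles (non-negative expression levels).
uniform = tissue_specificity_tau([5.0] * 12)
single = tissue_specificity_tau([10.0] + [0.0] * 11)
mixed = tissue_specificity_tau([10.0] * 4 + [0.0] * 8)
```

Under this assumed formula, a gene expressed equally in 4 of 12 tissues and absent from the rest gets τ = 8/11, which falls inside the midrange window.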