Stress variation associated with folding is one of the controlling factors in the development of tectonic fractures; however, little attention has been paid to the influence of neutral surfaces during folding on fracture distribution in a fault-related fold. In this study, we take the Cretaceous Bashijiqike Formation in the Kuqa Depression as an example and analyze the distribution of tectonic fractures in fault-related folds by core observation and logging data analysis. Three fracture zones are identified in a fault-related fold: a tensile zone, a transition zone and a compressive zone, which may be constrained by two neutral surfaces of the fold. Well correlation reveals that the tensile zone and the transition zone reach their maximum thickness at the fold hinge and thin toward the fold limbs. A 2D viscoelastic stress field model of a fault-related fold was constructed to further investigate the mechanism of fracturing. Statistical and numerical analyses reveal that the tensile zone and the transition zone become thicker with decreasing interlimb angle. Stress variation associated with folding is the first-order control on the general pattern of fracture distribution, while faulting is a secondary control on the development of local fractures in a fault-related fold.
• Three fracture zones are identified in a fault-related fold.
• A 2D viscoelastic stress field model of a fault-related fold was constructed.
• Folding plays a crucial role in controlling the general distribution of fractures.
• Faulting has a local influence on the development of fractures around the fault.
Motivation: Ancestral gene order reconstruction problems, including the median problem, quartet construction, small phylogeny, guided genome halving and genome aliquoting, are NP-hard. Available heuristics dedicated to each of these problems are computationally costly for even small instances. Results: We present a data structure enabling rapid heuristic solution of all these ancestral genome reconstruction problems. A generic greedy algorithm with look-ahead based on an automatically generated priority system suffices for all the problems using this data structure. The efficiency of the algorithm is due to fast updating of the structure during run time and to the simplicity of the priority scheme. We illustrate with the first rapid algorithm for quartet construction and apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny. Availability: http://albuquerque.bioinformatics.uottawa.ca/pathgroup/Quartet.html Contact: chunfang313@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
Genome median and genome halving are combinatorial optimization problems that aim at reconstructing ancestral genomes as well as the evolutionary events leading from the ancestor to extant species. Exploring complexity issues is a first step towards devising efficient algorithms. The complexity of the median problem for unichromosomal genomes (permutations) has been settled for both the breakpoint distance and the reversal distance. Although the multichromosomal case has often been assumed to be a simple generalization of the unichromosomal case, it is also a relaxation so that complexity in this context does not follow from existing results, and is open for all distances.
We settle here the complexity of several genome median and halving problems, including a surprising polynomial result for the breakpoint median and guided halving problems in genomes with circular and linear chromosomes, showing that the multichromosomal problem is actually easier than the unichromosomal problem. Still other variants of these problems are NP-complete, including the DCJ double distance problem, previously mentioned as an open question. We list the remaining open problems.
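To make the breakpoint distance on genomes with circular and linear chromosomes concrete, the following minimal Python sketch computes it under one commonly used definition, d = n - a - e/2, where n is the number of genes, a the number of shared adjacencies and e the number of shared telomeres. The data representation (signed integers for genes, a `(genes, circular)` tuple per chromosome) and all function names are assumptions made for illustration; the abstract itself does not specify them, and this is not the median or halving algorithm.

```python
def extremities(gene):
    """Return (left, right) extremities of a signed gene in reading order."""
    g = abs(gene)
    # A gene read forward shows tail then head; read reversed, head then tail.
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def adjacencies_and_telomeres(genome):
    """Collect adjacencies and telomeres of a genome.

    genome: list of (genes, circular) tuples; genes is a non-empty list of
    signed integers (hypothetical representation, chosen for this sketch).
    """
    adjacencies, telomeres = set(), set()
    for genes, circular in genome:
        ends = [extremities(g) for g in genes]
        # Each pair of consecutive genes contributes one adjacency.
        for (_, right), (left, _) in zip(ends, ends[1:]):
            adjacencies.add(frozenset((right, left)))
        if circular:
            # A circular chromosome closes up: last gene is adjacent to first.
            adjacencies.add(frozenset((ends[-1][1], ends[0][0])))
        else:
            # A linear chromosome has two telomeres.
            telomeres.add(ends[0][0])
            telomeres.add(ends[-1][1])
    return adjacencies, telomeres


def breakpoint_distance(genome_a, genome_b):
    """Breakpoint distance d = n - a - e/2 for two genomes on the same genes."""
    adj_a, tel_a = adjacencies_and_telomeres(genome_a)
    adj_b, tel_b = adjacencies_and_telomeres(genome_b)
    n = sum(len(genes) for genes, _ in genome_a)
    return n - len(adj_a & adj_b) - len(tel_a & tel_b) / 2


linear = [([1, 2, 3], False)]
inverted = [([1, -2, 3], False)]   # gene 2 reversed
circular = [([1, 2, 3], True)]     # same order, chromosome circularized

d_same = breakpoint_distance(linear, linear)
d_inv = breakpoint_distance(linear, inverted)
d_circ = breakpoint_distance(linear, circular)
```

In this sketch, shared telomeres count for half an adjacency each, so circularizing a linear chromosome costs less than breaking an internal adjacency.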
This theoretical study clears up a wide swathe of the algorithmic study of genome rearrangements with multiple multichromosomal genomes.
A basic tool for studying the polyploidization history of a genome, especially in plants, is the distribution of duplicate gene similarities in syntenically aligned regions of a genome. This distribution can usually be decomposed into two or more components identifiable by peaks, or local maxima, each representing a different polyploidization event. The distributions may be generated by means of a discrete time branching process, followed by a sequence divergence model. The branching process, as well as the inference of fractionation rates based on it, requires knowledge of the ploidy level of each event, which cannot be directly inferred from the pair similarity distribution.
For a sequence of two events of unknown ploidy, either tetraploid, giving rise to whole genome doubling (WGD), or hexaploid, giving rise to whole genome tripling (WGT), we base our analysis on triples of similar genes. We calculate the probability of the four triplet types with origins in one or the other event, or both, and impose a mutational model so that the distribution resembles the original data. Using a maximum likelihood (ML) transition point in the similarities between the two events as a discriminator for the hypothesized origin of each similarity, we calculate the predicted number of triplets of each type for each model combining WGT and/or WGD. This yields a predicted profile of triplet types for each model. We compare the observed and predicted triplet profiles for each model to confirm the polyploidization history of durian, poplar and cabbage.
We have developed a way of inferring the ploidy of up to three successive WGD and/or WGT events by estimating the time of origin of each of the similarities in triples of genes. This may be generalized to a larger number of events and to higher ploidies.
To reconstruct the ancestral genome of a set of phylogenetically related descendant species, we use the RACCROCHE pipeline for organizing a large number of generalized gene adjacencies into contigs and then into chromosomes. Separate reconstructions are carried out for each ancestral node of the phylogenetic tree for focal taxa. The ancestral reconstructions are monoploids; they each contain at most one member of each gene family constructed from descendants, ordered along the chromosomes. We design and implement a new computational technique for solving the problem of estimating the ancestral monoploid number of chromosomes x. This involves a "g-mer" analysis to resolve a bias due to long contigs, and gap statistics to estimate x. We find that the monoploid number of all the rosid and asterid orders is Formula: see text. We show that this is not an artifact of our method by deriving Formula: see text for the metazoan ancestor.
We outline a principled approach to the analysis of duplicate gene similarity distributions, based on a model integrating sequence divergence and the process of fractionation of duplicate genes resulting from whole genome duplication (WGD). This model allows us to predict duplicate gene similarity distributions for a series of two or three WGD, for whole genome triplication followed by a WGD, and for triplication, followed by speciation, followed by WGD. We calculate the probabilities of all possible fates of a gene pair as its two members proliferate or are lost, predicting the number of surviving pairs from each event. We discuss how to calculate maximum likelihood estimators for the parameters of these models, illustrating with an analysis of the distribution of paralog similarities in the poplar genome.
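The idea of predicting the number of surviving pairs from each event can be illustrated with a deliberately simplified toy calculation for two successive WGDs. The assumptions here are ours, not the abstract's: every gene is doubled at each event, both copies are retained with probability u1 (first event) or u2 (second event), otherwise exactly one copy survives, and all retention outcomes are independent. The function name and parameters are hypothetical, and the sketch omits sequence divergence and parameter estimation entirely.

```python
def expected_pairs_two_wgd(n_genes, u1, u2):
    """Expected gene and pair counts after two WGDs in a toy retention model.

    n_genes: number of genes before the first WGD (assumed input).
    u1, u2:  probability that both copies of a gene are retained after the
             first and second WGD; otherwise exactly one copy survives.
    """
    # After each event a gene leaves 2 copies with probability u, else 1,
    # so the expected number of descendants per gene is 1 + u.
    genes_after_first = n_genes * (1 + u1)
    genes_after_second = genes_after_first * (1 + u2)

    # Pairs dating to the second event: one per gene present before that
    # event whose two copies are both retained.
    pairs_from_second = genes_after_first * u2

    # Pairs dating to the first event: both first-event copies are retained
    # (probability u1), and each lineage independently leaves 1 or 2 genes
    # after the second event, giving (1 + u2)**2 expected cross pairs.
    pairs_from_first = n_genes * u1 * (1 + u2) ** 2

    return {
        "genes": genes_after_second,
        "pairs_first_event": pairs_from_first,
        "pairs_second_event": pairs_from_second,
    }


result = expected_pairs_two_wgd(n_genes=1000, u1=0.5, u2=0.2)
```

Even in this toy setting, pairs from the older event are amplified by the later doubling, which is why the older peak in a similarity distribution need not be the smaller one.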
Plant basic helix-loop-helix (bHLH) transcription factors play pivotal roles in responding to stress, including cold and drought. However, it remains unclear how bHLH family genes respond to these stresses in Kandelia obovata. In this study, we identified 75 bHLH members in K. obovata, classified into 11 subfamilies and unevenly distributed across its 18 chromosomes. Collinearity analysis revealed that segmental duplication primarily drove the expansion of KobHLH genes. The KobHLH promoters were enriched with elements associated with light response. Through RNA-seq, we identified several cold/drought-associated KobHLH genes. This correlated with decreased net photosynthetic rates (Pn) in the leaves of cold/drought-treated plants. Weighted gene co-expression network analysis (WGCNA) confirmed that 11 KobHLH genes were closely linked to photoinhibition in photosystem II (PS II). Among them, four Phytochrome Interacting Factors (PIFs) involved in chlorophyll metabolism were significantly down-regulated. Subcellular localization showed that KobHLH52 and KobHLH30 were located in the nucleus. Overall, we have comprehensively analyzed the KobHLH family and identified several members associated with photoinhibition under cold or drought stress, which may be helpful for further cold/drought-tolerance enhancement and molecular breeding through genetic engineering in K. obovata.
Genome amplification through duplication or proliferation of transposable elements has its counterpart in genome reduction, by elimination of DNA or by gene inactivation. Whether loss is primarily due to excision of random length DNA fragments or the inactivation of one gene at a time is controversial. Reduction after whole genome duplication (WGD) represents an inexorable collapse in gene complement.
We compare fifteen genomes descending from six eukaryotic WGD events 20-450 Mya. We characterize the collapse over time through the distribution of runs of reduced paralog pairs in duplicated segments. Descendant genomes of the same WGD event behave as replicates. Choice of paralog pairs to be reduced is random except for some resistant regions of contiguous pairs. For those paralog pairs that are reduced, conserved copies tend to concentrate on one chromosome.
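As a rough illustration of how runs of reduced paralog pairs along a duplicated segment might be tabulated, the Python sketch below counts run lengths from a sequence of retained/reduced flags and gives a geometric reference distribution. The boolean encoding, the function names, and the choice of independent per-pair reduction as the "random" null model are assumptions made for this example; the abstract does not describe the authors' actual procedure.

```python
from collections import Counter
from itertools import groupby


def reduced_run_lengths(retained):
    """Lengths of maximal runs of reduced pairs along a duplicated segment.

    retained: sequence of booleans in chromosomal order, True where both
    paralogs are still present and False where the pair has been reduced
    to a single copy (hypothetical encoding for this sketch).
    """
    return [sum(1 for _ in run) for kept, run in groupby(retained) if not kept]


def run_length_distribution(retained):
    """Frequency of each run length of reduced pairs."""
    return Counter(reduced_run_lengths(retained))


def geometric_run_pmf(k, p_reduced):
    """P(run length = k) if each pair is reduced independently with
    probability p_reduced (assumed null model, ignoring segment ends)."""
    return p_reduced ** (k - 1) * (1 - p_reduced)


segment = [True, False, False, True, False, True, True, False, False, False]
runs = reduced_run_lengths(segment)
distribution = run_length_distribution(segment)
```

Comparing an observed distribution of this kind with the geometric reference is one simple way to see whether reduced pairs cluster more or less than independent loss would predict.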
Both the contiguous regions of reduction-resistant pairs and the concentration of runs of single copy genes on a single chromosome are evidence of transcriptional co-regulation, dosage sensitivity or other functional interaction constraining the reduction process. These constraints and their evolution over time show a consistent pattern across evolutionary domains and a highly reproducible pattern, as replicates, for the several descendants of a single WGD.
Abstract
Betula L. (birch) is a pioneer hardwood tree species with ecological, economic, and evolutionary importance in the Northern Hemisphere. We sequenced the Betula platyphylla genome and assembled the sequences into 14 chromosomes. The Betula genome lacks evidence of recent whole-genome duplication and has the same paleoploidy level as Vitis vinifera and Prunus mume. Phylogenetic analysis of lignin pathway genes coupled with tissue-specific expression patterns provided clues for understanding the formation of higher ratios of syringyl to guaiacyl lignin observed in Betula species. Our transcriptome analysis of leaf tissues under a time-series cold stress experiment revealed the presence of the MEKK1–MKK2–MPK4 cascade and six additional mitogen-activated protein kinases that can be linked to a gene regulatory network involving many transcription factors and cold tolerance genes. Our genomic and transcriptome analyses provide insight into the structures, features, and evolution of the B. platyphylla genome. The chromosome-level genome and gene resources of B. platyphylla obtained in this study will facilitate the identification of important and essential genes governing important traits of trees and genetic improvement of B. platyphylla.
Fractionation is the genome-wide process of losing one gene per duplicate pair following whole genome multiplication (doubling, tripling, …). This is important in the evolution of plants over tens of millions of years, because of their repeated cycles of genome multiplication and fractionation. One type of evidence in the study of these processes is the frequency distribution of similarities between the two genes, over all the duplicate pairs in the genome.
We study modeling and inference problems around the processes of fractionation and whole genome multiplication, focusing first on the frequency distribution of similarities of duplicate pairs in the genome. Our birth-and-death model accounts for repeated duplication, triplication or other multiplication events, as well as fractionation rates among multiple progeny of a single gene specific to each event. It also has a biologically and combinatorially well-motivated way of handling the tendency for at least one sibling to survive fractionation. The method settles previously unexplored questions about the expected number of gene pairs tracing their ancestry back to each multiplication event. We exemplify the algebraic concepts inherent in our models on Brassica rapa, whose evolutionary history is well known. We demonstrate the quantitative analysis of high-similarity gene pairs and triples to confirm the known ploidies of events in the lineage of B. rapa.
Our birth-and-death model accounts for the similarity distribution of paralogs in terms of multiple rounds of whole genome multiplication and fractionation. An analysis of high-similarity gene triples confirms the recent Brassica triplication.