Small RNAs (sRNAs) associated with the RNA chaperone protein Hfq are key posttranscriptional regulators of gene expression in bacteria. Deciphering the sRNA-target interactome is an essential step toward understanding the roles of sRNAs in the cellular networks. We developed a broadly applicable methodology termed RIL-seq (RNA interaction by ligation and sequencing), which integrates experimental and computational tools for in vivo transcriptome-wide identification of interactions involving Hfq-associated sRNAs. By applying this methodology to Escherichia coli we discovered an extensive network of interactions involving RNA pairs showing sequence complementarity. We expand the ensemble of targets for known sRNAs, uncover additional Hfq-bound sRNAs encoded in various genomic regions along with their trans-encoded targets, and provide insights into binding and possible cycling of RNAs on Hfq. Comparison of the sRNA interactome under various conditions has revealed changes in the sRNA repertoire as well as substantial re-wiring of the network between conditions.
• A widely applicable method for in vivo global mapping of small RNA interactome
• Substantial re-wiring of the network upon changes in cellular conditions
• Regulatory circuits involving two regulators derived from the same transcript
• sRNAs acting in trans are encoded within almost every possible genomic element
Melamed et al. describe RIL-seq, an approach that can identify Hfq-bound pairs of small RNAs (sRNAs) and their targets. They apply RIL-seq to E. coli grown in different conditions, identifying new sRNAs and describing sRNA interactome changes upon change in conditions.
Genome-wide association studies are now identifying disease-associated chromosome regions. However, even after convincing replication, the localization of the causal variant(s) requires comprehensive resequencing, extensive genotyping and statistical analyses in large sample sets leading to targeted functional studies. Here, we have localized the type 1 diabetes (T1D) association in the interleukin 2 receptor alpha (IL2RA) gene region to two independent groups of SNPs, spanning overlapping regions of 14 and 40 kb, encompassing IL2RA intron 1 and the 5' regions of IL2RA and RBM17 (odds ratio = 2.04, 95% confidence interval = 1.70-2.45; P = 1.92 × 10⁻²⁸; control frequency = 0.635). Furthermore, we have associated IL2RA T1D susceptibility genotypes with lower circulating levels of the biomarker, soluble IL-2RA (P = 6.28 × 10⁻²⁸), suggesting that an inherited lower immune responsiveness predisposes to T1D.
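As a reading aid, the odds ratio and confidence interval quoted above can be checked for internal consistency. The sketch below assumes a Wald-type 95% interval that is symmetric on the log-odds scale (z = 1.96); the abstract does not state how the interval was computed, so this is an illustration, not the authors' procedure. The variable names are hypothetical.

```python
import math

# Values reported in the abstract above.
odds_ratio = 2.04
ci_low, ci_high = 1.70, 2.45

# Assumption: a Wald-type 95% interval on the log-odds scale (z = 1.96).
z_95 = 1.96

# If the interval is symmetric in log space, the geometric mean of its
# bounds should land on the point estimate.
center = math.sqrt(ci_low * ci_high)

# Standard error of log(OR) implied by the interval width.
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * z_95)

print(f"geometric midpoint of CI: {center:.3f}")
print(f"implied SE of log(OR):    {se_log_or:.4f}")
```

Under that assumption the geometric midpoint of the bounds agrees with the reported odds ratio to two decimal places. The reported P value may come from a different test, so it is deliberately not recomputed here.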
The laboratory mouse is the most widely used mammalian model organism in biomedical research. The 2.6 × 10⁹ bases of the mouse genome possess a high degree of conservation with the human genome, so a thorough annotation of the mouse genome will be of significant value to understanding the function of the human genome. So far, most of the functional sequences in the mouse genome have yet to be found, and the cis-regulatory sequences in particular are still poorly annotated. Comparative genomics has been a powerful tool for the discovery of these sequences, but on its own it cannot resolve their temporal and spatial functions. Recently, ChIP-Seq has been developed to identify cis-regulatory elements in the genomes of several organisms including humans, Drosophila melanogaster and Caenorhabditis elegans. Here we apply the same experimental approach to a diverse set of 19 tissues and cell types in the mouse to produce a map of nearly 300,000 murine cis-regulatory sequences. The annotated sequences add up to 11% of the mouse genome, and include more than 70% of conserved non-coding sequences. We define tissue-specific enhancers and identify potential transcription factors regulating gene expression in each tissue or cell type. Finally, we show that much of the mouse genome is organized into domains of coordinately regulated enhancers and promoters. Our results provide a resource for the annotation of functional elements in the mammalian genome and for the study of mechanisms regulating tissue-specific gene expression.
• C-TALE is a new many-versus-all C-method based on in-solution hybridization.
• Probes for C-TALE hybridization are prepared from BAC DNA.
• C-TALE enables analysis of DNA folding of megabase-scale genomic regions.
• C-TALE is well suited for studying individual chromatin loops.
• High-resolution contact maps are achievable in C-TALE at a low sequencing depth.
Studies performed using Hi-C and other high-throughput whole-genome C-methods have demonstrated that 3D organization of eukaryotic genomes is functionally relevant. Unfortunately, ultra-deep sequencing of Hi-C libraries necessary to detect loop structures in large vertebrate genomes remains rather expensive. However, many studies are in fact aimed at determining the fine-scale 3D structure of comparatively small genomic regions up to several Mb in length. Such studies typically focus on the spatial structure of domains of coregulated genes, molecular mechanisms of loop formation, and interrogation of functional significance of GWAS-revealed polymorphisms. Therefore, a handful of molecular techniques based on Hi-C have been developed to address such issues. These techniques commonly rely on in-solution hybridization of Hi-C/3C-seq libraries with pools of biotinylated baits covering the region of interest, followed by deep sequencing of the enriched library. Here, we describe a new protocol of this kind, C-TALE (Chromatin TArget Ligation Enrichment). Preparation of hybridization probes from bacterial artificial chromosomes and an additional round of enrichment make C-TALE a cost-effective alternative to existing many-versus-all C-methods.
Despite advances in sequencing, the goal of obtaining a comprehensive view of genetic variation in populations is still far from reached. We sequenced 180 lines of A. thaliana from Sweden to obtain as complete a picture as possible of variation in a single region. Whereas simple polymorphisms in the unique portion of the genome are readily identified, other polymorphisms are not. The massive variation in genome size identified by flow cytometry seems largely to be due to 45S rDNA copy number variation, with lines from northern Sweden having particularly large numbers of copies. Strong selection is evident in the form of long-range linkage disequilibrium (LD), as well as in LD between nearby compensatory mutations. Many footprints of selective sweeps were found in lines from northern Sweden, and a massive global sweep was shown to have involved a 700-kb transposition.
Recombination is a major force that shapes genetic diversity. Determination of recombination rate is important and can theoretically be improved by increasing the sample size. However, it is nearly impossible to estimate recombination rates using traditional population genetics methods when the sample size is large because these methods are highly computationally demanding. In this study, we used a refined machine learning approach to estimate the recombination rate of the human genome using the UK10K human genomic dataset with 7,562 genomic sequences and its three subsets with 200, 400 and 2,000 genomic sequences. The estimation was performed under the human Out-of-Africa demographic model. We not only obtained an accurate human genetic map, but also found that the fluctuation of the estimated recombination rate along the human genome is reduced when the sample size increases. The recombination rate heterogeneity estimated from the full UK10K dataset is lower than that estimated from its subsets. Our results demonstrate how the sample size affects the estimated recombination rate, and show that analyses of a larger number of genomes result in a more precise estimation of recombination rate. The accurate genetic map based on the UK10K dataset is also expected to benefit other human biology research.
The merging of distinct genomes, allopolyploidization, is a widespread phenomenon in plants. It generates adaptive potential through increased genetic diversity, but examples demonstrating its exploitation remain scarce. White clover (Trifolium repens) is a ubiquitous temperate allotetraploid forage crop derived from two European diploid progenitors confined to extreme coastal or alpine habitats. We sequenced and assembled the genomes and transcriptomes of this species complex to gain insight into the genesis of white clover and the consequences of allopolyploidization. Based on these data, we estimate that white clover originated ∼15,000 to 28,000 years ago during the last glaciation when alpine and coastal progenitors were likely colocated in glacial refugia. We found evidence of progenitor diversity carryover through multiple hybridization events and show that the progenitor subgenomes have retained integrity and gene expression activity as they traveled within white clover from their original confined habitats to a global presence. At the transcriptional level, we observed remarkably stable subgenome expression ratios across tissues. Among the few genes that show tissue-specific switching between homeologous gene copies, we found flavonoid biosynthesis genes strongly overrepresented, suggesting an adaptive role of some allopolyploidy-associated transcriptional changes. Our results highlight white clover as an example of allopolyploidy-facilitated niche expansion, where two progenitor genomes, adapted and confined to disparate and highly specialized habitats, expanded to a ubiquitous global presence after glaciation-associated allopolyploidization.
Abstract
Medicinal plants have garnered significant attention in ethnomedicine and traditional medicine due to their potential antitumor, anti-inflammatory and antioxidant properties. Recent advancements in genome sequencing and synthetic biology have revitalized interest in natural products. Despite the availability of sequenced genomes and transcriptomes of these plants, the absence of publicly accessible gene annotations and tabular formatted gene expression data has hindered their effective utilization. To address this pressing issue, we have developed IMP (Integrated Medicinal Plantomics), a freely accessible platform at https://www.bic.ac.cn/IMP. IMP curates a total of 8 565 672 genes from 84 high-quality genome assemblies, and 2156 transcriptome sequencing samples encompassing various organs, tissues, developmental stages and stimulations. With the 10 integrated analysis modules, users can readily examine gene annotations, sequences, functions, distributions and expression in IMP in a one-stop mode. We firmly believe that IMP will play a vital role in enhancing the understanding of molecular metabolic pathways in medicinal plants or plants with medicinal benefits, thereby driving advancements in synthetic biology and facilitating the exploration of natural sources of valuable chemical constituents for drug discovery and drug production.
Citrus breeding programs have many limitations associated with the species biology and physiology, requiring the incorporation of new biotechnological tools to provide new breeding possibilities. Diversity Arrays Technology (DArT) markers, combined with next-generation sequencing, have wide applicability in the construction of high-resolution genetic maps and in quantitative trait locus (QTL) mapping. This study aimed to construct an integrated genetic map using full-sib progeny derived from Murcott tangor and Pera sweet orange and DArTseq™ molecular markers and to perform QTL mapping of twelve fruit quality traits. A controlled Murcott × Pera cross was made in 1997 at the Citrus Germplasm Repository at the Sylvio Moreira Citrus Centre of the Agronomic Institute (IAC), located in Cordeirópolis, SP. In 2012, 278 F1 individuals out of a family of 312 confirmed hybrid individuals were analyzed for fruit traits and genotyped using the DArTseq markers. Using OneMap software to obtain the integrated genetic map, we considered only the DArT loci that showed no segregation deviation. The likelihood ratio and the genomic information from the available Citrus sinensis L. Osbeck genome were used to determine the linkage groups (LGs).
The resulting integrated map contained 661 markers in 13 LGs, with a genomic coverage of 2,774 cM and a mean density of 0.23 markers/cM. The groups were assigned to the nine Citrus haploid chromosomes; however, some of the chromosomes were represented by two LGs due to the lack of information for a single integration, as in cases where markers segregated in a 3:1 fashion. A total of 19 QTLs were identified through composite interval mapping (CIM) of the 12 analyzed fruit characteristics: fruit diameter (cm), height (cm), height/diameter ratio, weight (g), rind thickness (cm), segments per fruit, total soluble solids (TSS, %), total titratable acidity (TTA, %), juice content (%), number of seeds, TSS/TTA ratio and number of fruits per box. The genomic sequence (pseudochromosomes) of C. sinensis was compared to the genetic map, and synteny was clearly identified. Further analysis of the map regions with the highest LOD scores enabled the identification of putative genes that could be associated with the fruit quality characteristics.
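The map statistics quoted above are related by simple arithmetic. The sketch below recomputes the mean marker density from the reported marker count and map length and also derives the average map length per marker. The variable names are hypothetical, and averaging over the whole map (ignoring the boundaries between linkage groups) is a simplifying assumption.

```python
# Values reported in the abstract above.
n_markers = 661          # markers placed on the integrated map
map_length_cm = 2774.0   # total genomic coverage in centimorgans

# Mean density in markers per cM.
density = n_markers / map_length_cm

# Average map length per marker in cM, the reciprocal of the density.
# Simplification: linkage-group boundaries are ignored.
cm_per_marker = map_length_cm / n_markers

print(f"density:       {density:.3f} markers/cM")
print(f"cM per marker: {cm_per_marker:.1f}")
```

The computed density is about 0.238 markers/cM, which matches the reported 0.23 if the figure was truncated instead of rounded.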
An integrated linkage map of Murcott tangor and Pera sweet orange was established using DArTseq™ molecular markers and proved useful for QTL mapping of twelve fruit quality traits. The next-generation sequencing data allowed comparison of the linkage map with the genomic sequence (pseudochromosomes) of C. sinensis and the identification of genes that may be responsible for phenotypic traits in Citrus. The linkage map was also used to place sequences that had not previously been assigned a position in the reference genome.
Maize (Zea mays L.) is one of the most important cereal crops and a model for the study of genetics, evolution, and domestication. To better understand maize genome organization and to build a framework for genome sequencing, we constructed a sequence-ready fingerprinted contig-based physical map that covers 93.5% of the genome, of which 86.1% is aligned to the genetic map. The fingerprinted contig map contains 25,908 genic markers that enabled us to align nearly 73% of the anchored maize genome to the rice genome. The distribution pattern of expressed sequence tags correlates to that of recombination. In collinear regions, 1 kb in rice corresponds to an average of 3.2 kb in maize, yet maize has a 6-fold genome size expansion. This can be explained by the fact that most rice regions correspond to two regions in maize as a result of its recent polyploid origin. Inversions account for the majority of chromosome structural variations during subsequent maize diploidization. We also find clear evidence of ancient genome duplication predating the divergence of the progenitors of maize and rice. Reconstructing the paleoethnobotany of the maize genome indicates that the progenitors of modern maize contained ten chromosomes.
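The reasoning above about genome size expansion can be made explicit with a short calculation. The sketch assumes, as a simplification, that every collinear rice region corresponds to exactly two maize regions; the abstract says only that most do. The variable names are hypothetical.

```python
import math

# Values reported in the abstract above.
kb_maize_per_kb_rice = 3.2    # average expansion within collinear regions
reported_fold_expansion = 6   # overall maize-versus-rice genome expansion

# Assumption: each rice region maps to exactly two maize regions
# (the abstract states this holds for "most" regions, not all).
maize_regions_per_rice_region = 2

# Overall expansion implied by local expansion times duplication.
implied_fold_expansion = kb_maize_per_kb_rice * maize_regions_per_rice_region

print(f"implied expansion:  {implied_fold_expansion:.1f}-fold")
print(f"reported expansion: {reported_fold_expansion}-fold")
```

Under that assumption, local expansion and duplication together imply roughly a 6.4-fold expansion, in line with the reported 6-fold figure.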