To characterize the role of the circadian clock in mouse physiology and behavior, we used RNA-seq and DNA arrays to quantify the transcriptomes of 12 mouse organs over time. We found that 43% of all protein-coding genes showed circadian rhythms in transcription somewhere in the body, largely in an organ-specific manner. In most organs, we noticed the expression of many oscillating genes peaked during transcriptional "rush hours" preceding dawn and dusk. Looking at the genomic landscape of rhythmic genes, we saw that they clustered together, were longer, and had more spliceforms than nonoscillating genes. Systems-level analysis revealed intricate rhythmic orchestration of gene pathways throughout the body. We also found oscillations in the expression of more than 1,000 known and novel noncoding RNAs (ncRNAs). Supporting their potential role in mediating clock function, ncRNAs conserved between mouse and human showed rhythmic expression in proportions similar to those of protein-coding genes. Importantly, we also found that the majority of best-selling drugs and World Health Organization essential medicines directly target the products of rhythmic genes. Many of these drugs have short half-lives and may benefit from timed dosage. In sum, this study highlights critical, systemic, and surprising roles of the mammalian circadian clock and provides a blueprint for advancement in chronotherapy.
Cotton is an important natural fiber crop; however, a comprehensive, high-resolution gene map is lacking. Here we integrate four complementary high-throughput techniques, including PacBio long-read Iso-seq, strand-specific RNA-seq, CAGE-seq, and PolyA-seq, to systematically explore the transcription landscape across 16 tissues or different organ types in Gossypium arboreum. We devise a computational pipeline, named IGIA, to reconstruct accurate gene structures from the integrated data. Our results reveal a dynamic and diverse transcriptional map in cotton: tissue-specific gene expression, alternative usage of TSSs and polyadenylation sites, hotspots of alternative splicing, and transcriptional read-through. These regulated events affect many genes in various ways, such as gain or loss of functional RNA motifs and protein domains, fine-tuning of DNA binding activity, and co-regulation of genes in the same complex or pathway. The methods and findings provide valuable resources for further functional genomic studies, such as understanding natural SNP variations, for the plant community.
Since the first half of the twentieth century, evolutionary theory has been dominated by the idea that mutations occur randomly with respect to their consequences. Here we test this assumption with large surveys of de novo mutations in the plant Arabidopsis thaliana. In contrast to expectations, we find that mutations occur less often in functionally constrained regions of the genome: mutation frequency is reduced by half inside gene bodies and by two-thirds in essential genes. With independent genomic mutation datasets, including from the largest Arabidopsis mutation accumulation experiment conducted to date, we demonstrate that epigenomic and physical features explain over 90% of variance in the genome-wide pattern of mutation bias surrounding genes. Observed mutation frequencies around genes in turn accurately predict patterns of genetic polymorphisms in natural Arabidopsis accessions (r = 0.96). That mutation bias is the primary force behind patterns of sequence evolution around genes in natural accessions is supported by analyses of allele frequencies. Finally, we find that genes subject to stronger purifying selection have a lower mutation rate. We conclude that epigenome-associated mutation bias reduces the occurrence of deleterious mutations in Arabidopsis, challenging the prevailing paradigm that mutation is a directionless force in evolution.
A large number of putative cis-regulatory sequences have been annotated in the human genome, but the genes they control remain poorly defined. To bridge this gap, we generate maps of long-range chromatin interactions centered on 18,943 well-annotated promoters for protein-coding genes in 27 human cell/tissue types. We use this information to infer the target genes of 70,329 candidate regulatory elements and suggest potential regulatory function for 27,325 noncoding sequence variants associated with 2,117 physiological traits and diseases. Integrative analysis of these promoter-centered interactome maps reveals widespread enhancer-like promoters involved in gene regulation and common molecular pathways underlying distinct groups of human traits and diseases.
Pathway enrichment analysis helps researchers gain mechanistic insight into gene lists generated from genome-scale (omics) experiments. This method identifies biological pathways that are enriched in a gene list more than would be expected by chance. We explain the procedures of pathway enrichment analysis and present a practical step-by-step guide to help interpret gene lists resulting from RNA-seq and genome-sequencing experiments. The protocol comprises three major steps: definition of a gene list from omics data, determination of statistically enriched pathways, and visualization and interpretation of the results. We describe how to use this protocol with published examples of differentially expressed genes and mutated cancer genes; however, the principles can be applied to diverse types of omics data. The protocol describes innovative visualization techniques, provides comprehensive background and troubleshooting guidelines, and uses freely available and frequently updated software, including g:Profiler, Gene Set Enrichment Analysis (GSEA), Cytoscape and EnrichmentMap. The complete protocol can be performed in ~4.5 h and is designed for use by biologists with no prior bioinformatics training.
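The notion of a pathway being enriched "more than would be expected by chance" can be illustrated with a one-sided hypergeometric (over-representation) test. The following is a minimal sketch of that generic statistic, not the implementation used by the tools named above; the function names, the toy gene identifiers, and the pathway names are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_list, n_overlap):
    """Return P(X >= n_overlap) for a hypergeometric draw.

    n_universe: number of genes in the background set
    n_pathway:  number of background genes annotated to the pathway
    n_list:     number of genes in the gene list of interest
    n_overlap:  number of gene-list genes annotated to the pathway
    """
    upper = min(n_pathway, n_list)
    if not (0 <= n_overlap <= upper and max(n_pathway, n_list) <= n_universe):
        raise ValueError("inconsistent counts")
    total = comb(n_universe, n_list)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def rank_pathways(gene_list, pathways, universe):
    """Score each pathway (name -> set of genes) and sort by p-value.

    Genes outside the universe are ignored (an assumption of this sketch).
    """
    universe = set(universe)
    genes = set(gene_list) & universe
    results = []
    for name, members in pathways.items():
        members = set(members) & universe
        overlap = len(genes & members)
        p = hypergeom_enrichment_p(len(universe), len(members), len(genes), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda row: row[2])


# Toy data: 20 hypothetical genes, two hypothetical pathways.
universe = [f"g{i}" for i in range(20)]
pathways = {
    "pathway_A": {"g0", "g1", "g2", "g3"},
    "pathway_B": {"g10", "g11", "g12", "g13"},
}
gene_list = ["g0", "g1", "g2", "g10"]
ranking = rank_pathways(gene_list, pathways, universe)
```

In this toy example `pathway_A` shares three of its four genes with the list and therefore ranks ahead of `pathway_B`. A real analysis would also need to correct the resulting p-values for the number of pathways tested.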
Salmonella enterica subsp. enterica serovar Typhimurium (S. Typhimurium) is one of the most important foodborne pathogens that infect humans globally. The gastrointestinal tracts of animals like pigs, poultry or cattle are the main reservoirs of Salmonella serotypes. Guinea pig meat is an important protein source for Andean countries, but this animal is commonly infected by S. Typhimurium, producing high mortality rates and generating economic losses. Despite its impact on human health, food security, and economy, there is no genomic information about the S. Typhimurium responsible for the guinea pig infections in Peru. Here, we sequence and characterize 11 S. Typhimurium genomes isolated from guinea pigs from four farms in Lima, Peru. We were able to identify two genetic clusters (HC100_9460 and HC100_9757) distinguishable at the H100 level of the Hierarchical Clustering of Core Genome Multi-Locus Sequence Typing (HierCC-cgMLST) scheme with an average distance of 608 SNPs. All sequences belonged to sequence type 19 (ST19) and HC100_9460 isolates were typed in silico as monophasic variants (1,4,5,12:i:-) lacking the fljA and fljB genes. Phylogenomic analysis showed that human isolates from Peru were located within the same genetic clusters as guinea pig isolates, suggesting that these lineages can infect both hosts. We identified a genetic antimicrobial resistance cassette carrying the ant(3)-Ia, dfrA15, qacE, and sul1 genes associated with transposons TnAs3 and IS21 within an IncI1 plasmid in one guinea pig isolate, while antimicrobial resistance genes (ARGs) for β-lactam (blaCTX-M-65) and colistin (mcr-1) resistance were detected in Peruvian human-derived isolates. The presence of a virulence plasmid highly similar to the pSLT plasmid (LT2 reference strain) containing the spvRABCD operon was found in all guinea pig isolates.
Finally, seven phage sequences (STGP_Φ1 to STGP_Φ7) were identified in guinea pig isolates, distributed according to the genetic lineage (H50 clusters level) and forming part of the specific gene content of each cluster. This study presents, for the first time, the genomic characteristics of S. Typhimurium isolated from guinea pigs in South America, showing particular diversity and genetic elements (plasmids and prophages) that require special attention and also broader studies in different periods of time and locations to determine their impact on human health.
B chromosomes (Bs) are supernumerary, dispensable parts of the nuclear genome, which appear in many different species of eukaryote. So far, Bs have been considered to be genetically inert elements without any functional genes.
Our comparative transcriptome analysis and the detection of active RNA polymerase II (RNAPII) in the proximity of B chromatin demonstrate that the Bs of rye (Secale cereale) contribute to the transcriptome. In total, 1954 and 1218 B-derived transcripts with an open reading frame were expressed in generative and vegetative tissues, respectively. In addition to B-derived transposable element transcripts, a high percentage of short transcripts without detectable similarity to known proteins and gene fragments from A chromosomes (As) were found, suggesting an ongoing gene erosion process.
In vitro analysis of the A- and B-encoded AGO4B protein variants demonstrated that both possess RNA slicer activity. These data demonstrate unambiguously the presence of a functional AGO4B gene on Bs and that these Bs carry both functional protein coding genes and pseudogene copies.
Thus, B-encoded genes may provide an additional level of gene control and complexity in combination with their related A-located genes. Hence, physiological effects, associated with the presence of Bs, may partly be explained by the activity of B-located (pseudo)genes.
Nitrate is a nutrient signal that triggers complex regulation of transcriptional networks to modulate nutrient-dependent growth and development in plants. This includes time- and nitrate concentration-dependent regulation of nitrate-related gene expression. However, the underlying mechanisms remain poorly understood. Here we identify NIGT1 transcriptional repressors as negative regulators of the Arabidopsis NRT2.1 nitrate transporter gene, and show antagonistic regulation by NLP primary transcription factors for nitrate signalling and the NLP-NIGT1 transcriptional cascade-mediated repression. This antagonistic regulation provides a resolution to the complexity of nitrate-induced transcriptional regulations. Genome-wide analysis reveals that this mechanism is applicable to NRT2.1 and other genes involved in nitrate assimilation, hormone biosynthesis and transcription. Furthermore, the PHR1 master regulator of the phosphorus-starvation response also directly promotes expression of NIGT1 family genes, leading to reductions in nitrate uptake. NIGT1 repressors thus act in two transcriptional cascades, forming a direct link between phosphorus and nitrogen nutritional regulation.
Although the response of plants exposed to severe drought stress has been studied extensively, little is known about how plants adapt their growth under mild drought stress conditions. Here, we analyzed the leaf and rosette growth response of six Arabidopsis (Arabidopsis thaliana) accessions originating from different geographic regions when exposed to mild drought stress. The automated phenotyping platform WIWAM was used to impose stress early during leaf development, when the third leaf emerges from the shoot apical meristem. Analysis of growth-related phenotypes showed differences in leaf development between the accessions. In all six accessions, mild drought stress reduced both leaf pavement cell area and number without affecting the stomatal index. Genome-wide transcriptome analysis (using RNA sequencing) of early developing leaf tissue identified 354 genes differentially expressed under mild drought stress in the six accessions. Our results indicate the existence of a robust response over different genetic backgrounds to mild drought stress in developing leaves. The processes involved in the overall mild drought stress response comprised abscisic acid signaling, proline metabolism, and cell wall adjustments. In addition to these known severe drought-related responses, 87 genes were found to be specific for the response of young developing leaves to mild drought stress.
The Gene Ontology (GO) knowledgebase (http://geneontology.org) is a comprehensive resource concerning the functions of genes and gene products (proteins and noncoding RNAs). GO annotations cover genes from organisms across the tree of life as well as viruses, though most gene function knowledge currently derives from experiments carried out in a relatively small number of model organisms. Here, we provide an updated overview of the GO knowledgebase, as well as the efforts of the broad, international consortium of scientists that develops, maintains, and updates the GO knowledgebase. The GO knowledgebase consists of three components: (1) the GO, a computational knowledge structure describing the functional characteristics of genes; (2) GO annotations, evidence-supported statements asserting that a specific gene product has a particular functional characteristic; and (3) GO Causal Activity Models (GO-CAMs), mechanistic models of molecular “pathways” (GO biological processes) created by linking multiple GO annotations using defined relations. Each of these components is continually expanded, revised, and updated in response to newly published discoveries and receives extensive QA checks, reviews, and user feedback. For each of these components, we provide a description of the current contents, recent developments to keep the knowledgebase up to date with new discoveries, and guidance on how users can best make use of the data that we provide. We conclude with future directions for the project.
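The relationship between annotations and causal models described above can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions: the class names, field names, gene identifiers (`geneA`, `geneB`), and placeholder references are hypothetical and do not reflect the actual GO file formats or software; the relation label is given as plain text for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """An evidence-supported statement about one gene product."""
    gene_product: str  # hypothetical identifier
    go_term: str       # GO term identifier
    evidence: str      # evidence code, e.g. "IDA" (inferred from direct assay)
    reference: str     # supporting publication (placeholder here)


@dataclass
class CausalModel:
    """Annotations linked by named relations, as a toy stand-in for a GO-CAM."""
    annotations: list = field(default_factory=list)
    links: list = field(default_factory=list)  # (source index, relation, target index)

    def add(self, annotation):
        """Store an annotation and return its index in the model."""
        self.annotations.append(annotation)
        return len(self.annotations) - 1

    def link(self, source, relation, target):
        """Connect two stored annotations with a relation label."""
        for index in (source, target):
            if not 0 <= index < len(self.annotations):
                raise IndexError(f"no annotation at index {index}")
        self.links.append((source, relation, target))

    def downstream(self, source):
        """Return the annotations directly linked from the given one."""
        return [self.annotations[t] for s, _, t in self.links if s == source]


model = CausalModel()
kinase = model.add(Annotation("geneA", "GO:0004672", "IDA", "PMID:placeholder"))
factor = model.add(Annotation("geneB", "GO:0003700", "IDA", "PMID:placeholder"))
model.link(kinase, "directly positively regulates", factor)
```

Here `model.downstream(kinase)` returns the single annotation for `geneB`, showing how individual statements become a small mechanistic chain once relations are attached.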