The multifaceted control of gene expression requires tight coordination of regulatory mechanisms at the transcriptional and post-transcriptional levels. Here, we studied the interdependence of transcription initiation, splicing and polyadenylation events on single mRNA molecules by full-length mRNA sequencing.
In MCF-7 breast cancer cells, we find 2700 genes with interdependent alternative transcription initiation, splicing and polyadenylation events, both in proximal and distant parts of mRNA molecules, including examples of coupling between transcription start sites and polyadenylation sites. The analysis of three human primary tissues (brain, heart and liver) reveals similar patterns of interdependency between transcription initiation and mRNA processing events. We predict thousands of novel open reading frames from full-length mRNA sequences and obtain evidence for their translation by shotgun proteomics. The mapping database rescues 358 previously unassigned peptides and improves the assignment of others. By recognizing sample-specific amino-acid changes and novel splicing patterns, full-length mRNA sequencing improves proteogenomics analysis of MCF-7 cells.
Our findings demonstrate that our understanding of transcriptome complexity is far from complete and provide a basis to reveal largely unresolved mechanisms that coordinate transcription initiation and mRNA processing.
The use of pharmacogenomics in clinical practice is becoming standard of care. However, due to the complex genetic makeup of pharmacogenes, not all genetic variation is currently accounted for. Here, we show the utility of long-read sequencing to resolve complex pharmacogenes by analyzing a well-characterised sample. The data consist of long reads that were processed to resolve phased haploblocks. 73% of pharmacogenes were fully covered in one phased haploblock, including 9/15 genes that are 100% complex. Variant calling accuracy in the pharmacogenes was high, with 99.8% recall and 100% precision for SNVs and 98.7% precision and 98.0% recall for indels. For the majority of gene-drug interactions in the DPWG and CPIC guidelines, the associated genes could be fully resolved (62% and 63%, respectively). Together, these findings suggest that long-read sequencing data offer promising opportunities for elucidating complex pharmacogenes and haplotype phasing while maintaining accurate variant calling.
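The precision and recall figures quoted above follow the usual definitions for variant calling against a truth set. A minimal sketch of that arithmetic is given below; the function name and the counts are hypothetical and chosen only for illustration, they are not taken from the study.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) for a variant call set.

    tp: called variants that are present in the truth set
    fp: called variants that are absent from the truth set
    fn: truth-set variants that were not called
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative counts only (assumption, not study data).
precision, recall = precision_recall(tp=90, fp=10, fn=30)
```

Precision penalises spurious calls and recall penalises missed variants, which is why the two values are reported separately for SNVs and indels.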
Folsomia candida is a model in soil biology, belonging to the family Isotomidae, subclass Collembola. It reproduces parthenogenetically in the presence of Wolbachia, and exhibits remarkable physiological adaptations to stress. To better understand these features and adaptations to life in the soil, we studied its genome in the context of its parthenogenetic lifestyle.
We applied Pacific Biosciences sequencing and assembly to generate a reference genome for F. candida of 221.7 Mbp, comprising only 162 scaffolds. The complete genome of its endosymbiont Wolbachia was also assembled and turned out to be the largest strain identified so far. Substantial gene family expansions and lineage-specific gene clusters were linked to stress response. A large number of genes (809) were acquired by horizontal gene transfer. A substantial fraction of these genes are involved in lignocellulose degradation. Also, the presence of genes involved in antibiotic biosynthesis was confirmed. Intra-genomic rearrangements of collinear gene clusters were observed, of which 11 were organized as palindromes. The Hox gene cluster of F. candida showed major rearrangements compared to the arthropod consensus cluster, resulting in a disorganized cluster.
The expansion of stress response gene families suggests that stress defense was important to facilitate colonization of soils. The large number of HGT genes related to lignocellulose degradation could be beneficial to unlock carbohydrate sources in soil, especially those contained in decaying plant and fungal organic matter. Intra- as well as inter-scaffold duplications of gene clusters may be a consequence of the parthenogenetic lifestyle of F. candida. This high-quality genome will be instrumental for evolutionary biologists investigating deep phylogenetic lineages among arthropods and will provide the basis for a more mechanistic understanding in soil ecology and ecotoxicology.
During the transition from sexual to asexual reproduction, a suite of reproduction‐related sexual traits becomes superfluous, and may be selected against if costly. Female functional virginity refers to asexual females resisting mating or not fertilizing eggs after mating. These traits appear to be among the first that evolve during transitions from sexual to asexual reproduction. The genetic basis of female functional virginity remains elusive. Previously, we reported that female functional virginity segregates as expected for a single recessive locus in the asexual parasitoid wasp Asobara japonica. Here, we investigate the genetic basis of this trait by quantitative trait loci (QTL) mapping and candidate gene analyses. Consistent with the segregation of phenotypes, we found a single QTL of large effect, spanning over 4.23 Mb and comprising at least 131 protein‐coding genes, of which 15 featured sex‐biased expression in the related sexual species Asobara tabida. Two of the 15 sex‐biased genes were previously identified to differ between related sexual and asexual populations/species: CD151 antigen and nuclear pore complex protein Nup50. A third gene, hormone receptor 4, is involved in steroid hormone mediated mating behaviour. Overall, our results are consistent with a single locus, or a cluster of closely linked loci, underlying rapid evolution of female functional virginity in the transition to asexuality. Once this variant, which causes females to reject mating, has swept through a population, the flanking region does not get smaller owing to lack of recombination in asexuals.
Lichens are valuable models in symbiosis research and promising sources of biosynthetic genes for biotechnological applications. Most lichenized fungi grow slowly, resist aposymbiotic cultivation, and are poor candidates for experimentation. Obtaining contiguous, high-quality genomes for such symbiotic communities is technically challenging. Here, we present the first assembly of a lichen holo-genome from metagenomic whole-genome shotgun data comprising both PacBio long reads and Illumina short reads. The nuclear genomes of the two primary components of the lichen symbiosis, the fungus Umbilicaria pustulata (33 Mb) and the green alga Trebouxia sp. (53 Mb), were assembled at contiguities comparable to single-species assemblies. The analysis of the read coverage pattern revealed a relative abundance of fungal to algal nuclei of ∼20:1. Gap-free, circular sequences for all organellar genomes were obtained. The bacterial community is dominated by Acidobacteriaceae and encompasses strains closely related to bacteria isolated from other lichens. Gene set analyses showed no evidence of horizontal gene transfer from algae or bacteria into the fungal genome. Our data suggest a lineage-specific loss of a putative gibberellin-20-oxidase in the fungus, a gene fusion in the fungal mitochondrion, and a relocation of an algal chloroplast gene to the algal nucleus. The major technical obstacle during reconstruction of the holo-genome was that coverage differences among individual genomes surpassed three orders of magnitude. Moreover, we show that GC-rich inverted repeats paired with nonrandom sequencing error in PacBio data can result in missing gene predictions. This likely poses a general problem for genome assemblies based on long reads.
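The abstract does not say how the fungal-to-algal ratio was derived from read coverage. One plausible reading, sketched here purely as an assumption, is that the ratio of typical read depth on fungal versus algal nuclear contigs approximates the ratio of nuclei, provided both nuclear genomes have the same ploidy. The function name and the depth values are hypothetical.

```python
from statistics import median


def nuclei_ratio(fungal_depths: list[float], algal_depths: list[float]) -> float:
    """Estimate the fungal:algal nuclei ratio from per-contig read depths.

    Assumption: median depth over nuclear contigs is proportional to the
    number of nuclei of that organism in the sample (equal ploidy).
    """
    algal_median = median(algal_depths)
    if algal_median == 0:
        raise ValueError("algal median depth is zero; ratio is undefined")
    return median(fungal_depths) / algal_median


# Made-up per-contig depths for illustration (not data from the study).
ratio = nuclei_ratio([400.0, 410.0, 390.0], [20.0, 21.0, 19.0])
```

The median is used instead of the mean so that a few collapsed repeats or contaminated contigs with extreme depth do not dominate the estimate.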
Anaerobic ammonium-oxidizing (anammox) bacteria are a group of strictly anaerobic chemolithoautotrophic microorganisms. They are capable of oxidizing ammonium to nitrogen gas using nitrite as a terminal electron acceptor, thereby facilitating the release of fixed nitrogen into the atmosphere. The anammox process is thought to exert a profound impact on the global nitrogen cycle and has been harnessed as an environment-friendly method for nitrogen removal from wastewater. In this study, we present the first closed genome sequence of an anammox bacterium, Kuenenia stuttgartiensis MBR1. It was obtained through Single-Molecule Real-Time (SMRT) sequencing of an enrichment culture constituting a mixture of at least two highly similar Kuenenia strains. The genome of the novel MBR1 strain is different from the previously reported Kuenenia KUST reference genome as it contains numerous structural variations and unique genomic regions. We find new proteins, such as a type 3b (sulf)hydrogenase and an additional copy of the hydrazine synthase gene cluster. Moreover, multiple copies of ammonium transporters and proteins regulating nitrogen uptake were identified, suggesting functional differences in metabolism. This assembly, including the genome-wide methylation profile, provides a new foundation for comparative and functional studies aiming to elucidate the biochemical and metabolic processes of these organisms.
Bacillus subtilis strains BS49 and BS34A, both derived from a common ancestor, carry one or more copies of Tn916, an extremely common mobile genetic element capable of transfer to and from a broad range of microorganisms. Here, we report the complete genome sequence of BS49 and the draft genome sequence of BS34A, which have repeatedly been used as donors to transfer Tn916, Tn916 derivatives or oriTTn916-containing plasmids to clinically important pathogens.
The genomes of two Bacillus subtilis strains with a common mobile element (Tn916) have been sequenced.
Oculopharyngeal muscular dystrophy (OPMD) is an adult-onset disorder characterized by ptosis, dysphagia and proximal limb weakness. Autosomal-dominant OPMD is caused by short (GCG)8–13 expansions within the first exon of the poly(A)-binding protein nuclear 1 gene (PABPN1), leading to an expanded polyalanine tract in the mutated protein. Expanded PABPN1 forms insoluble aggregates in the nuclei of skeletal muscle fibres. In order to gain insight into the different physiological processes affected in OPMD muscles, we have used a transgenic mouse model of OPMD (A17.1) and performed transcriptomic studies combined with a detailed phenotypic characterization of this model at three time points. The transcriptomic analysis revealed a massive gene deregulation in the A17.1 mice, among which we identified a significant deregulation of pathways associated with muscle atrophy. Using a mathematical model for progression, we have identified that one-third of the progressive genes were also associated with muscle atrophy. Functional and histological analysis of the skeletal muscle of this mouse model confirmed a severe and progressive muscular atrophy associated with a reduction in muscle strength. Moreover, muscle atrophy in the A17.1 mice was restricted to fast glycolytic fibres, containing a large number of intranuclear inclusions (INIs). The soleus muscle and, in particular, oxidative fibres were spared, even though they contained INIs, albeit to a lesser degree. These results demonstrate a fibre-type specificity of muscle atrophy in this OPMD model. This study improves our understanding of the biological pathways modified in OPMD to identify potential biomarkers and new therapeutic targets.
It has been postulated that aging is the consequence of an accelerated accumulation of somatic DNA mutations and that subsequent errors in the primary structure of proteins ultimately reach levels sufficient to affect organismal functions. The technical limitations of detecting somatic changes and the lack of insight about the minimum level of erroneous proteins needed to cause an error catastrophe have hampered any firm conclusions on these theories. In this study, we sequenced whole genomes from whole-blood DNA of two pairs of monozygotic (MZ) twins, 40 and 100 years old, by two independent next-generation sequencing (NGS) platforms (Illumina and Complete Genomics). Potentially discordant single-base substitutions supported by both platforms were validated extensively by Sanger, Roche 454, and Ion Torrent sequencing. We demonstrate that the genomes of the two twin pairs are germ-line identical between co-twins, and that the genomes of the 100-year-old MZ twins are distinguished by eight confirmed somatic single-base substitutions, five of which are within introns. Putative somatic variation between the 40-year-old twins was not confirmed in the validation phase. We conclude from this systematic effort that by using two independent NGS platforms, somatic single nucleotide substitutions can be detected, and that a century of life did not result in a large number of detectable somatic mutations in blood. The low number of somatic variants observed by using two NGS platforms might provide a framework for detecting disease-related somatic variants in phenotypically discordant MZ twins.
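The selection of "potentially discordant single-base substitutions supported by both platforms" can be pictured as two set operations: find sites that differ between co-twins on each platform, then keep only those seen on every platform. The sketch below illustrates that idea under stated assumptions; the variant representation, function name and example calls are hypothetical, and the study's actual filtering pipeline is not described in the abstract.

```python
# A variant is represented as (chromosome, position, reference, alternate).
Variant = tuple[str, int, str, str]


def candidate_somatic(
    twin1: dict[str, set[Variant]],
    twin2: dict[str, set[Variant]],
) -> set[Variant]:
    """Return variants discordant between co-twins on every platform.

    twin1 / twin2 map a platform name to that twin's call set.
    Both dictionaries are assumed to share the same platform names.
    """
    # Symmetric difference: calls present in exactly one of the two twins.
    per_platform = [twin1[name] ^ twin2[name] for name in twin1]
    if not per_platform:
        return set()
    # Keep only discordant sites that every platform agrees on.
    return set.intersection(*per_platform)


# Made-up calls for illustration (not data from the study).
shared = ("chr1", 100, "A", "G")
twin1_calls = {
    "platform_a": {shared, ("chr2", 200, "C", "T")},
    "platform_b": {shared, ("chr2", 200, "C", "T"), ("chr5", 50, "T", "C")},
}
twin2_calls = {
    "platform_a": {shared},
    "platform_b": {shared, ("chr5", 50, "T", "C")},
}
candidates = candidate_somatic(twin1_calls, twin2_calls)
```

Candidates that pass this kind of cross-platform filter would still need independent validation, as the abstract describes with Sanger, Roche 454 and Ion Torrent sequencing.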