Background: Next-generation sequencing is transforming our understanding of transcriptomes. It can determine the expression level of transcripts with a dynamic range of over six orders of magnitude ...from multiple tissues, developmental stages or conditions. Patterns of gene expression provide insight into functions of genes with unknown annotation. Results: The RNA-Seq atlas presented here provides a record of high-resolution gene expression in a set of fourteen diverse tissues. Hierarchical clustering of transcriptional profiles for these tissues suggests three clades with similar profiles: aerial, underground and seed tissues. We also investigate the relationship between gene structure and gene expression and find a correlation between gene length and expression. Additionally, we find dramatic tissue-specific gene expression of both the most highly expressed genes and the genes specific to legumes in seed development and nodule tissues. Analysis of the gene expression profiles of over 2,000 genes with preferential gene expression in seed suggests there are more than 177 genes with functional roles that are involved in the economically important seed filling process. Finally, the RNA-Seq atlas also provides a means of evaluating existing gene model annotations for the Glycine max genome. Conclusions: This RNA-Seq atlas extends the analyses of previous gene expression atlases performed using Affymetrix GeneChip technology and provides an example of new methods to accommodate the increase in transcriptome data obtained from next-generation sequencing. Data contained within this RNA-Seq atlas of Glycine max can be explored at http://www.soybase.org/soyseq.
Cas12a (Cpf1) is an RNA-guided endonuclease in the bacterial type V-A CRISPR-Cas anti-phage immune system that can be repurposed for genome editing. Cas12a can bind and cut dsDNA targets with high ...specificity in vivo, making it an ideal candidate for expanding the arsenal of enzymes used in precise genome editing. However, this reported high specificity contradicts Cas12a's natural role as an immune effector against rapidly evolving phages. Here, we employed high-throughput in vitro cleavage assays to determine and compare the native cleavage specificities and activities of three different natural Cas12a orthologs (FnCas12a, LbCas12a, and AsCas12a). Surprisingly, we observed pervasive sequence-specific nicking of randomized target libraries, with strong nicking of DNA sequences containing up to four mismatches in the Cas12a-targeted DNA-RNA hybrid sequences. We also found that these nicking and cleavage activities depend on mismatch type and position and vary with Cas12a ortholog and CRISPR RNA sequence. Our analysis further revealed robust nonspecific nicking of dsDNA when Cas12a is activated by binding to a target DNA. Together, our findings reveal that Cas12a has multiple nicking activities against dsDNA substrates and that these activities vary among different Cas12a orthologs.
Abstract
Motivation
As the cost of sequencing decreases, the amount of data being deposited into public repositories is increasing rapidly. Public databases rely on the user to provide metadata for ...each submission, and this metadata is prone to user error. Unfortunately, most public databases, such as the non-redundant (NR) database, rely on user input and do not have methods for identifying errors in the provided metadata, leading to the potential for error propagation. Previous research on a small subset of the NR database analyzed misclassification based on sequence similarity. To the best of our knowledge, the amount of misclassification in the entire database has not been quantified. We propose a heuristic method to detect potentially misclassified taxonomic assignments in the NR database. We applied a curation technique and quality control to find the most probable taxonomic assignment. Our method incorporates provenance and frequency of each annotation from manually and computationally created databases and clustering information at 95% similarity.
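To make the frequency idea concrete, the following is a minimal sketch of one way a cluster-level consensus could flag suspicious taxonomic labels. It is not the authors' implementation: it ignores annotation provenance entirely, and the function name `flag_misclassified`, the input format (protein ID mapped to a taxon label) and the `min_support` threshold are all assumptions made for illustration.

```python
from collections import Counter


def flag_misclassified(cluster, min_support=0.8):
    """Return (consensus_label, flagged_ids) for one sequence cluster.

    cluster: dict mapping protein ID -> taxonomic label (hypothetical format).
    min_support: assumed minimum fraction of members that must share the
    most frequent label before any member is flagged.
    """
    counts = Counter(cluster.values())
    consensus, n_votes = counts.most_common(1)[0]
    # Without a clear majority there is no basis for calling anything an outlier.
    if n_votes / len(cluster) < min_support:
        return None, []
    flagged = sorted(pid for pid, label in cluster.items() if label != consensus)
    return consensus, flagged


# Toy cluster: four members agree, one carries a different label.
example = {
    "P1": "Escherichia coli",
    "P2": "Escherichia coli",
    "P3": "Escherichia coli",
    "P4": "Escherichia coli",
    "P5": "Homo sapiens",
}
consensus, flagged = flag_misclassified(example)
print(consensus, flagged)
```

A label that disagrees with a strong cluster consensus is only a candidate for review; a real pipeline would also need to weigh where each annotation came from, as the abstract describes.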
Results
We found more than two million potentially taxonomically misclassified proteins in the NR database. Using simulated data, we show a high precision of 97% and a recall of 87% for detecting taxonomically misclassified proteins. The proposed approach and findings could also be applied to other databases.
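For readers less familiar with the two metrics, the short sketch below shows how precision and recall are computed from confusion-matrix counts. The counts are hypothetical, chosen only so that the result rounds to the reported 97% and 87%; they are not taken from the study.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp)  # fraction of flagged proteins that are truly misclassified
    recall = tp / (tp + fn)     # fraction of truly misclassified proteins that were flagged
    return precision, recall


# Hypothetical counts for illustration only (not from the paper).
p, r = precision_recall(tp=870, fp=27, fn=130)
print(f"precision={p:.2f} recall={r:.2f}")
```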
Availability and implementation
Source code, dataset, documentation, Jupyter notebooks and Docker container are available at https://github.com/boalang/nr.
Supplementary information
Supplementary data are available at Bioinformatics online.
The rapid pace of development for hybrid perovskite photovoltaics has recently resulted in promising figures of merit being obtained with regard to device stability. Rather than relying upon ...expensive barrier materials, realizing market‐competitive lifetimes is likely to require the development of intrinsically stable devices, and to this end accelerated aging tests can help identify degradation mechanisms that arise over the long term. Here, oxygen‐induced degradation of archetypal perovskite solar cells under operation is observed, even in dry conditions. With prolonged aging, this process ultimately drives decomposition of the perovskite. It is deduced that this is related to charge build‐up in the perovskite layer, and it is shown that by efficiently extracting charge this degradation can be mitigated. The results confirm the importance of high charge‐extraction efficiency in maximizing the tolerance of perovskite solar cells to oxygen.
Key to the development of perovskite photovoltaics is the mitigation of long‐term degradation mechanisms. When aging these solar cells in the presence of oxygen, two stages of degradation are evidenced that drive perovskite decomposition. This damage is coupled to the average density of charge within the perovskite, highlighting the need to maximize charge extraction efficiency when designing stable devices.
Comparative genomics of social insects has been intensely pursued in recent years with the goal of providing insights into the evolution of social behaviour and its underlying genomic and epigenomic ...basis. However, the comparative approach has been hampered by a paucity of data on some of the most informative social forms (e.g. incipiently and primitively social) and taxa (especially members of the wasp family Vespidae) for studying social evolution. Here, we provide a draft genome of the primitively eusocial model insect Polistes dominula, accompanied by analysis of caste‐related transcriptome and methylome sequence data for adult queens and workers. Polistes dominula possesses a fairly typical hymenopteran genome, but shows very low genomewide GC content and some evidence of reduced genome size. We found numerous caste‐related differences in gene expression, with evidence that both conserved and novel genes are related to caste differences. Most strikingly, these –omics data reveal a major reduction in one of the major epigenetic mechanisms that has been previously suggested to be important for caste differences in social insects: DNA methylation. Along with a conspicuous loss of a key gene associated with environmentally responsive DNA methylation (the de novo DNA methyltransferase Dnmt3), these wasps have greatly reduced genomewide methylation to almost zero. In addition to providing a valuable resource for comparative analysis of social insect evolution, our integrative –omics data for this important behavioural and evolutionary model system call into question the general importance of DNA methylation in caste differences and evolution in social insects.
Heterodera glycines, commonly referred to as the soybean cyst nematode (SCN), is an obligatory and sedentary plant parasite that causes over a billion dollars in yield loss to soybean production ...annually. Although there are genetic determinants that render soybean plants resistant to certain nematode genotypes, resistant soybean cultivars are increasingly ineffective because their multi-year usage has selected for virulent H. glycines populations. The parasitic success of H. glycines relies on the comprehensive re-engineering of an infection site into a syncytium, as well as the long-term suppression of host defense to ensure syncytial viability. At the forefront of these complex molecular interactions are effectors, the proteins secreted by H. glycines into host root tissues. The mechanisms of effector acquisition, diversification, and selection need to be understood before effective control strategies can be developed, but the lack of an annotated genome has been a major roadblock.
Here, we use PacBio long-read technology to assemble a 123 Mb H. glycines genome in 738 contigs, with annotations for 29,769 genes. The genome contains significant numbers of repeats (34%), tandem duplicates (18.7 Mb), and horizontal gene transfer events (151 genes). A large number of putative effectors (431 genes) were identified in the genome, many of which were found in transposons.
This advance provides a glimpse into the host and parasite interplay by revealing a diversity of mechanisms that give rise to virulence genes in the soybean cyst nematode, including: tandem duplications containing over a fifth of the total gene count, virulence genes hitchhiking in transposons, and 107 horizontal gene transfers not reported in other plant parasitic nematodes thus far. Through extensive characterization of the H. glycines genome, we provide new insights into H. glycines biology and shed light onto the mystery underlying complex host-parasite interactions. This genome sequence is an important prerequisite to enable work towards generating new resistance or control measures against H. glycines.
The assembly and annotation of a genome is a valuable resource for a species, with applications ranging from conservation genomics to gene discovery. Genomic resource development is especially ...important for species in culture, such as the California Yellowtail (Seriola dorsalis), the likely candidate for the establishment of commercial offshore aquaculture production in southern California. Genomic resource development for this species will improve the understanding of sex and other phenotypic traits, and enable rapid genetic improvement of, and economic gain from, culture production.
We describe the assembly and annotation of the S. dorsalis genome, and present resequencing data from 45 male and 45 female wild-caught S. dorsalis used to identify a sex-determining region and marker in this species. The genome assembly captured approximately 93% of the total 685 Mb genome with an average coverage depth of 180×. Using the assembled genome, resequencing data from the 90 fish were aligned to place boundaries on the sex-determining region. Sex-specific markers were developed based on a female-specific, 61-nucleotide deletion identified in that region. We hypothesize that Estradiol 17-beta-dehydrogenase is the putative sex-determining gene and propose a plausible genetic mechanism for ZW sex determination in S. dorsalis involving a female-specific deletion of a transcription factor binding motif that may be targeted by Sox3.
Understanding the mechanism of sex determination and developing assays to determine sex are critical both for management of wild fisheries and for development of efficient and sustainable aquaculture practices. In addition, this genome assembly for S. dorsalis will be a substantial resource for a variety of future research applications.
A fully assembled spirochaete genome was identified as a contaminating scaffold in our red abalone (Haliotis rufescens) genome assembly. In this paper, we describe the analysis of this bacterial genome. The assembled ...spirochaete genome is 3.25 Mb in size with 48.5 mol% G+C content. The proteomes of 38 species were compared with the spirochaete genome and it was discovered to form an independent branch within the family
on the phylogenetic tree. The comparison of 16S rRNA sequences and average nucleotide identity scores between the spirochaete genome and known species of different families in
indicates that it is an unknown species. Further, the percentage of conserved proteins compared to neighbouring taxa confirms that it does not belong to a known genus within
. We propose the name
Haliotispira prima gen. nov., sp. nov. based on its taxonomic placement and origin. We also tested for the presence of this species in different species of abalone and found that it is also present in white abalone (Haliotis sorenseni). In addition, we highlight the need for better classification of taxa within the class
.
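The G+C figure quoted for the spirochaete genome is the share of guanine and cytosine among the unambiguous bases of the assembly. A minimal sketch of that calculation follows; the function name `gc_content` and the toy sequence are illustrative assumptions, and ambiguous bases such as N are simply ignored.

```python
def gc_content(seq):
    """Return G+C content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


# Toy sequence for illustration only: 4 of 10 bases are G or C.
toy = "ATGCGCATTA"
print(gc_content(toy))
```

For a full assembly, the same function can be applied to the concatenated contig sequences read from a FASTA file.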
Cross-talk between the gut microbiota and neurochemicals affects health and well-being of animals. However, little is known about this interaction in chickens despite their importance in food ...production. Probiotics and live Salmonella vaccines are microbial products commonly given orally to layer pullets to improve health and ensure food safety. This study's objective was to determine how these oral treatments, individually or in combination, would impact the gut environment of chickens. White Leghorn chicks were either non-treated (CON) or orally given probiotics (PRO), a recombinant attenuated Salmonella vaccine (RASV; VAX), or both (P+V). Birds were fed probiotics daily beginning at 1 day old, orally immunized with RASV at 4 days old, and boosted 2 weeks post-primary vaccination. At 5 weeks, ceca content, ceca tissues, and small intestinal scrapings (SISs) were collected from ten birds/group post-euthanasia for analyses. Catecholamine, but not serotonergic, metabolism was affected by treatments. Dopamine metabolism, indicated by L-DOPA and DOPAC levels, was increased in P+V birds versus CON and PRO birds. Based on 16S sequencing, beta diversity was more similar among vaccinated birds versus birds given probiotics, suggesting live Salmonella vaccination has a major selective pressure on microbial diversity. Abundances of
and Enterobacteriaceae positively correlated with levels of tyrosine and norepinephrine, respectively. Both enumeration and 16S sequencing determined that PRO exhibited the greatest levels of Enterobacteriaceae in the ceca and feces, which was associated with greater IgA production against
virulence factors as tested by ELISA. In summary, we demonstrate that using probiotics alone versus in combination with a live vaccine has major implications in catecholamine production and the microbiota of layer pullets. Additionally, unique correlations between changes in some neurochemicals and specific bacteria have been shown.