Abstract
The PathoSystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center funded by the National Institute of Allergy and Infectious Diseases (https://www.patricbrc.org). PATRIC supports bioinformatic analyses of all bacteria with a special emphasis on pathogens, offering a rich comparative analysis environment that provides users with access to over 250 000 uniformly annotated and publicly available genomes with curated metadata. PATRIC offers web-based visualization and comparative analysis tools, a private workspace in which users can analyze their own data in the context of the public collections, services that streamline complex bioinformatic workflows and command-line tools for bulk data analysis. Over the past several years, as genomic and other omics-related experiments have become more cost-effective and widespread, we have observed considerable growth in the usage of and demand for easy-to-use, publicly available bioinformatic tools and services. Here we report the recent updates to the PATRIC resource, including new web-based comparative analysis tools, eight new services and the release of a command-line interface to access, query and analyze data.
The Pathosystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center (https://www.patricbrc.org). Recent changes to PATRIC include a redesign of the web interface and some new services that provide users with a platform that takes them from raw reads to an integrated analysis experience. The redesigned interface allows researchers direct access to tools and data, and the emphasis has changed to user-created genome-groups, with detailed summaries and views of the data that researchers have selected. Perhaps the biggest change has been the enhanced capability for researchers to analyze their private data and compare it to the available public data. Researchers can assemble their raw sequence reads and annotate the contigs using RASTtk. PATRIC also provides services for RNA-Seq, variation, model reconstruction and differential expression analysis, all delivered through an updated private workspace. Private data can be compared by 'virtual integration' to any of PATRIC's public data. The number of genomes available for comparison in PATRIC has expanded to over 80 000, with a special emphasis on genomes with antimicrobial resistance data. PATRIC uses this data to improve both subsystem annotation and k-mer classification, and tags new genomes as having signatures that indicate susceptibility or resistance to specific antibiotics.
Chromosomes are hierarchically folded within cell nuclei into territories, domains and subdomains, but the functional importance and evolutionary dynamics of these hierarchies are poorly defined. Here, we comprehensively profile genome organizations of five Anopheles mosquito species and show how different levels of chromatin architecture influence each other. Patterns observed on Hi-C maps are associated with known cytological structures, epigenetic profiles, and gene expression levels. Evolutionary analysis reveals conservation of chromatin architecture within synteny blocks for tens of millions of years and enrichment of synteny breakpoints in regions with increased genomic insulation. However, in-depth analysis shows a confounding effect of gene density on both insulation and distribution of synteny breakpoints, suggesting limited causal relationship between breakpoints and regions with increased genomic insulation. At the level of individual loci, we identify specific, extremely long-ranged looping interactions, conserved for ~100 million years. We demonstrate that the mechanisms underlying these looping contacts differ from previously described Polycomb-dependent interactions and clustering of active chromatin.
The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types, e.g. genomics, transcriptomics, protein-protein interactions (PPIs), three-dimensional protein structures and sequence typing data, together with associated metadata. Data types are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10,000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to PATRIC since its initial report in the 2007 NAR Database Issue.
We previously showed that Guy1, a primary signal expressed from the Y chromosome, is a strong candidate for a male-determining factor that confers female-specific lethality in Anopheles stephensi (Criscione et al., 2016). Here, we present evidence that Guy1 increases X gene expression in Guy1-transgenic females from two independent lines, providing a mechanism underlying the Guy1-conferred female lethality. The median level gene expression (MGE) of X-linked genes is significantly higher than autosomal genes in Guy1-transgenic females while there is no significant difference in MGE between X and autosomal genes in wild-type females. Furthermore, Guy1 significantly upregulates at least 40% of the 996 genes across the X chromosome in transgenic females. Guy1-conferred female-specific lethality is remarkably stable and completely penetrant. These findings indicate that Guy1 regulates dosage compensation in An. stephensi and components of dosage compensation may be explored to develop novel strategies to control mosquito-borne diseases.
We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks represent extensions of the event extraction model introduced in the BioNLP Shared Task 2009 (ST'09) to two new areas of biomedical scientific literature, each motivated by the needs of specific biocuration tasks. The ID task concerns the molecular mechanisms of infection, virulence and resistance, focusing in particular on the functions of a class of signaling systems that are ubiquitous in bacteria. The EPI task is dedicated to the extraction of statements regarding chemical modifications of DNA and proteins, with particular emphasis on changes relating to the epigenetic control of gene expression. By contrast to these two application-oriented main tasks, the REL task seeks to support extraction in general by separating challenges relating to part-of relations into a subproblem that can be addressed by independent systems. Seven groups participated in each of the two main tasks and four groups in the supporting task. The participating systems indicated advances in the capability of event extraction methods and demonstrated generalization in many aspects: from abstracts to full texts, from previously considered subdomains to new ones, and from the ST'09 extraction targets to other entities and events. The highest performance achieved in the supporting task REL, 58% F-score, is broadly comparable with levels reported for other relation extraction tasks. For the ID task, the highest-performing system achieved 56% F-score, comparable to the state-of-the-art performance at the established ST'09 task.
In the EPI task, the best result was 53% F-score for the full set of extraction targets and 69% F-score for a reduced set of core extraction targets, approaching a level of performance sufficient for user-facing applications. In this study, we extend previously reported results and perform further analyses of the outputs of the participating systems. We place specific emphasis on aspects of system performance relating to real-world applicability, considering alternate evaluation metrics and performing additional manual analysis of system outputs. We further demonstrate that the strengths of extraction systems can be combined to improve on the performance achieved by any system in isolation. The manually annotated corpora, supporting resources, and evaluation tools for all tasks are available from http://www.bionlp-st.org and the tasks continue as open challenges for all interested parties.
We have developed a highly curated bacterial virulence factor (VF) library in PATRIC (Pathosystems Resource Integration Center, www.patricbrc.org) to support infectious disease research. Although several VF databases are available, there is still a need to incorporate new knowledge found in published experimental evidence and integrate these data with other information known for these specific VF genes, including genomic and other omics data. This integration supports the identification of VFs, comparative studies and hypothesis generation, which facilitates the understanding of virulence and pathogenicity.
We have manually curated VFs from six prioritized NIAID (National Institute of Allergy and Infectious Diseases) category A-C bacterial pathogen genera, Mycobacterium, Salmonella, Escherichia, Shigella, Listeria and Bartonella, using published literature. This curated information on virulence has been integrated with data from genomic functional annotations, transcriptomic experiments, protein-protein interactions and disease information already present in PATRIC. Such integration gives researchers access to a broad array of information about these individual genes, and also to a suite of tools, available at PATRIC, for performing comparative genomic and transcriptomic analyses.
All tools and data are freely available at PATRIC (http://patricbrc.org).
Supplementary data are available at Bioinformatics online.
The Pathosystems Resource Integration Center (PATRIC, www.patricbrc.org) is designed to provide researchers with the tools and services that they need to perform genomic and other ‘omic’ data analyses. In response to mounting concern over antimicrobial resistance (AMR), the PATRIC team has been developing new tools that help researchers understand AMR and its genetic determinants. To support comparative analyses, we have added AMR phenotype data to over 15 000 genomes in the PATRIC database, often assembling genomes from reads in public archives and collecting their associated AMR panel data from the literature to augment the collection. We have also been using this collection of AMR metadata to build machine learning-based classifiers that can predict the AMR phenotypes and the genomic regions associated with resistance for genomes being submitted to the annotation service. Likewise, we have undertaken a large AMR protein annotation effort by manually curating data from the literature and public repositories. This collection of 7370 AMR reference proteins, which contains many protein annotations (functional roles) that are unique to PATRIC and RAST, has been manually curated so that it projects stably across genomes. The collection currently projects to 1 610 744 proteins in the PATRIC database. Finally, the PATRIC Web site has been expanded to enable AMR-based custom page views so that researchers can easily explore AMR data and design experiments based on whole genomes or individual genes.
Salmonella enterica pathogenicity island 1 (SPI-1) encodes proteins required for invasion of gut epithelial cells. The timing of invasion is tightly controlled by a complex regulatory network. The transcription factor (TF) HilD is the master regulator of this process and senses environmental signals associated with invasion. HilD activates transcription of genes within and outside SPI-1, including six other TFs. Thus, the transcriptional program associated with host cell invasion is controlled by at least 7 TFs. However, very few of the regulatory targets are known for these TFs, and the extent of the regulatory network is unclear. In this study, we used complementary genomic approaches to map the direct regulatory targets of all 7 TFs. Our data reveal a highly complex and interconnected network that includes many previously undescribed regulatory targets. Moreover, the network extends well beyond the 7 TFs, due to the inclusion of many additional TFs and noncoding RNAs. By comparing gene expression profiles of regulatory targets for the 7 TFs, we identified many uncharacterized genes that are likely to play direct roles in invasion. We also uncovered cross talk between SPI-1 regulation and other regulatory pathways, which, in turn, identified gene clusters that likely share related functions. Our data are freely available through an intuitive online browser and represent a valuable resource for the bacterial research community.
Invasion of epithelial cells is an early step during infection by Salmonella enterica and requires secretion of specific proteins into host cells via a type III secretion system (T3SS). Most T3SS-associated proteins required for invasion are encoded in a horizontally acquired genomic locus known as Salmonella pathogenicity island 1 (SPI-1). Multiple regulators respond to environmental signals to ensure appropriate timing of SPI-1 gene expression. In particular, there are seven transcription regulators that are known to be involved in coordinating expression of SPI-1 genes. We have used complementary genome-scale approaches to map the gene targets of these seven regulators. Our data reveal a highly complex and interconnected regulatory network that includes many previously undescribed target genes. Moreover, our data functionally implicate many uncharacterized genes in the invasion process and reveal cross talk between SPI-1 regulation and other regulatory pathways. All datasets are freely available through an intuitive online browser.