VectorBase is a National Institute of Allergy and Infectious Diseases supported Bioinformatics Resource Center (BRC) for invertebrate vectors of human pathogens. Now in its 11th year, VectorBase currently hosts the genomes of 35 organisms including a number of non-vectors for comparative analysis. Hosted data range from genome assemblies with annotated gene features, transcript and protein expression data to population genetics including variation and insecticide-resistance phenotypes. Here we describe improvements to our resource and the set of tools available for interrogating and accessing BRC data including the integration of Web Apollo to facilitate community annotation and providing Galaxy to support user-based workflows. VectorBase also actively supports our community through hands-on workshops and online tutorials. All information and data are freely available from our website at https://www.vectorbase.org/.
Abstract
The Eukaryotic Pathogen, Vector and Host Informatics Resource (VEuPathDB, https://veupathdb.org) represents the 2019 merger of VectorBase with the EuPathDB projects. As a Bioinformatics Resource Center funded by the National Institutes of Health, with additional support from the Wellcome Trust, VEuPathDB supports >500 organisms comprising invertebrate vectors, eukaryotic pathogens (protists and fungi) and relevant free-living or non-pathogenic species or hosts. Designed to empower researchers with access to Omics data and bioinformatic analyses, VEuPathDB projects integrate >1700 pre-analysed datasets (and associated metadata) with advanced search capabilities, visualizations, and analysis tools in a graphic interface. Diverse data types are analysed with standardized workflows including an in-house OrthoMCL algorithm for predicting orthology. Comparisons are easily made across datasets, data types and organisms in this unique data mining platform. A new site-wide search facilitates access for both experienced and novice users. Upgraded infrastructure and workflows support numerous updates to the web interface, tools, searches and strategies, and Galaxy workspace where users can privately analyse their own data. Forthcoming upgrades include cloud-ready application architecture, expanded support for the Galaxy workspace, tools for interrogating host-pathogen interactions, improved interactions with affiliated databases (ClinEpiDB, MicrobiomeDB) and other scientific resources, and increased interoperability with the Bacterial & Viral BRC.
Opsins are light-sensitive receptors associated with visual processes. Insects typically possess opsins that are stimulated by ultraviolet, short and long wavelength (LW) radiation. Six putative LW-sensitive opsins predicted in the yellow fever mosquito, Aedes aegypti and malaria mosquito, Anopheles gambiae, and eight in the southern house mosquito, Culex quinquefasciatus, suggest gene expansion in the Family Culicidae (mosquitoes) relative to other insects. Here we report the first detailed molecular and evolutionary analyses of LW opsins in three mosquito vectors, with the goal of understanding the molecular basis of opsin-mediated visual processes that could be exploited for mosquito control.
Time of divergence estimates suggest that the mosquito LW opsins originated from 18 or 19 duplication events between 166.9/197.5 and 1.07/0.94 million years ago (MY) and that these likely occurred following the predicted divergence of the lineages Anophelinae and Culicinae 145-226 MY. Fitmodel analyses identified nine amino acid residues in the LW opsins that may be under positive selection. Of these, eight amino acids occur in the N and C termini and are shared among all three species, and one residue in TMIII was unique to culicine species. Alignment of 5' non-coding regions revealed potential Conserved Non-coding Sequences (CNS) and transcription factor binding sites (TFBS) in seven pairs of LW opsin paralogs.
Our analyses suggest opsin gene duplication and residues possibly associated with spectral tuning of LW-sensitive photoreceptors. We explore two mechanisms - positive selection and differential expression mediated by regulatory units in CNS - that may have contributed to the retention of LW opsin genes in Culicinae and Anophelinae. We discuss the evolution of mosquito LW opsins in the context of major Earth events and possible adaptation of mosquitoes to LW-dominated photo environments, and implications for mosquito control strategies based on disrupting vision-mediated behaviors.
•VectorBase (VB) integrates diverse data for invertebrates related to human health.
•VB provides tools to mine ‘omics and population data.
•Develop new hypotheses by mining data using VB’s Search Strategy system.
•Visualize data across VB with custom graphics, for example, JBrowse and MapVEu.
•Analyze and integrate your own data using Galaxy workflows and VB integration.
VectorBase (VectorBase.org) is part of the VEuPathDB Bioinformatics Resource Center, providing free online access to multi-omics and population biology data, focusing on arthropod vectors and invertebrates of importance to human health. VectorBase includes genomics and functional genomics data from bed bugs, biting midges, body lice, kissing bugs, mites, mosquitoes, sand flies, ticks, tsetse flies, stable flies, house flies, fruit flies, and a snail intermediate host. Tools include the Search Strategy system and MapVEu, enabling users to interrogate and visualize diverse ‘omics and population-level data using a graphical interface (no programming experience required). Users can also analyze their own private data, such as transcriptomic sequences, exploring their results in the context of other publicly-available information in the database. Help Desk: help@vectorbase.org.
Background Many neglected tropical infectious diseases affecting humans are transmitted by arthropods such as mosquitoes and ticks. New mode-of-action chemistries are urgently sought to enhance vector management practices in countries where arthropod-borne diseases are endemic, especially where vector populations have acquired widespread resistance to insecticides. Methodology/Principal Findings We describe a "genome-to-lead" approach for insecticide discovery that incorporates the first reported chemical screen of a G protein-coupled receptor (GPCR) mined from a mosquito genome. A combination of molecular and pharmacological studies was used to functionally characterize two dopamine receptors (AaDOP1 and AaDOP2) from the yellow fever mosquito, Aedes aegypti. Sequence analyses indicated that these receptors are orthologous to arthropod D1-like (Gαs-coupled) receptors, but share less than 55% amino acid identity in conserved domains with mammalian dopamine receptors. Heterologous expression of AaDOP1 and AaDOP2 in HEK293 cells revealed dose-dependent responses to dopamine (EC50: AaDOP1 = 3.1±1.1 nM; AaDOP2 = 240±16 nM). Interestingly, only AaDOP1 exhibited sensitivity to epinephrine (EC50 = 5.8±1.5 nM) and norepinephrine (EC50 = 760±180 nM), while neither receptor was activated by other biogenic amines tested. Differential responses were observed between these receptors regarding their sensitivity to dopamine agonists and antagonists, level of maximal stimulation, and constitutive activity. Subsequently, a chemical library screen was implemented to discover lead chemistries active at AaDOP2. Fifty-one compounds were identified as "hits," and follow-up validation assays confirmed the antagonistic effect of selected compounds at AaDOP2. In vitro comparison studies between AaDOP2 and the human D1 dopamine receptor (hD1) revealed markedly different pharmacological profiles and identified amitriptyline and doxepin as AaDOP2-selective compounds. In subsequent Ae. aegypti larval bioassays, significant mortality was observed for amitriptyline (93%) and doxepin (72%), confirming these chemistries as "leads" for insecticide discovery. Conclusions/Significance This research provides a "proof-of-concept" for a novel approach toward insecticide discovery, in which genome sequence data are utilized for functional characterization and chemical compound screening of GPCRs. We provide a pipeline useful for future prioritization, pharmacological characterization, and expanded chemical screening of additional GPCRs in disease-vector arthropods. The differential molecular and pharmacological properties of the mosquito dopamine receptors highlight the potential for the identification of target-specific chemistries for vector-borne disease management, and we report the first study to identify dopamine receptor antagonists with in vivo toxicity toward mosquitoes.
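To make the reported dopamine EC50 values concrete, the following minimal sketch evaluates a standard Hill dose-response curve at a few concentrations. It is an illustration only and not the authors' analysis: the function name `hill_response`, the Hill coefficient of 1, and the chosen test concentrations are assumptions; only the two EC50 values (3.1 nM and 240 nM) come from the abstract.

```python
def hill_response(conc_nm: float, ec50_nm: float, hill_coeff: float = 1.0) -> float:
    """Fraction of maximal response (0..1) under a Hill dose-response model.

    hill_coeff = 1.0 is an assumption for illustration; the abstract does not
    report Hill slopes.
    """
    if conc_nm < 0 or ec50_nm <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    numerator = conc_nm ** hill_coeff
    return numerator / (ec50_nm ** hill_coeff + numerator)


# Dopamine EC50 values reported in the abstract (nM).
EC50_DOPAMINE_NM = {"AaDOP1": 3.1, "AaDOP2": 240.0}

# Hypothetical dopamine concentrations (nM), chosen only to span both EC50s.
for receptor, ec50 in EC50_DOPAMINE_NM.items():
    for conc in (1.0, 10.0, 100.0, 1000.0):
        fraction = hill_response(conc, ec50)
        print(f"{receptor}: {conc:7.1f} nM dopamine -> {fraction:.2f} of max response")
```

By definition, each receptor reaches half of its maximal response at its own EC50, so under this model AaDOP1 responds at much lower dopamine concentrations than AaDOP2.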
Aedes aegypti is the primary mosquito vector of several human arboviruses, including the dengue virus (DENV). Vector control is the principal intervention to decrease the transmission of these viruses. The characterization of molecules involved in the mosquito physiological responses to blood-feeding may help identify novel targets useful in designing effective control strategies. In this study, we evaluated the in vivo effect of feeding adult female mosquitoes with human red blood cells reconstituted with either heat-inactivated (IB) or normal plasma (NB). The RNA-seq based transcript expression of IB and NB mosquitoes was compared against sugar-fed (SF) mosquitoes. In in vitro experiments, we treated Aag2 cells with a recombinant version of complement proteins (hC3 or hC5a) and compared transcript expression to untreated control cells after 24 h. The transcript expression analysis revealed that human complement proteins modulate approximately 2300 transcripts involved in multiple biological functions, including immunity. We also found 161 upregulated and 168 downregulated transcripts differentially expressed when human complement protein C3 (hC3) and human complement protein C5a (hC5a) treated cells were compared to the control untreated cells. We conclude that active human complement induces significant changes to the transcriptome of Aedes aegypti mosquitoes, which may influence the physiology of these arthropods.
RNA-Seq is a method for profiling transcription using high-throughput sequencing and is an important component of many research projects that wish to study transcript isoforms, condition specific expression and transcriptional structure. The methods, tools and technologies used to perform RNA-Seq analysis continue to change, creating a bioinformatics challenge for researchers who wish to exploit these data. Resources that bring together genomic data, analysis tools, educational material and computational infrastructure can minimize the overhead required of life science researchers.
RNA-Rocket is a free service that provides access to RNA-Seq and ChIP-Seq analysis tools for studying infectious diseases. The site makes available thousands of pre-indexed genomes, their annotations and the ability to stream results to the bioinformatics resources VectorBase, EuPathDB and PATRIC. The site also provides a combination of experimental data and metadata, examples of pre-computed analysis, step-by-step guides and a user interface designed to enable both novice and experienced users of RNA-Seq data.
RNA-Rocket is available at rnaseq.pathogenportal.org. Source code for this project can be found at github.com/cidvbi/PathogenPortal.
anwarren@vt.edu
Supplementary materials are available at Bioinformatics online.
High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium's minimal information (MIxS) and NCBI's BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards.
The use of this metadata standard by all ongoing and future GSCID sequencing projects will provide a consistent representation of these data in the BRC resources and other repositories that leverage these data, allowing investigators to identify relevant genomic sequences and perform comparative genomics analyses that are both statistically meaningful and biologically relevant.
Arthropods play a dominant role in natural and human-modified terrestrial ecosystem dynamics. Spatially-explicit arthropod population time-series data are crucial for statistical or mathematical models of these dynamics and assessment of their veterinary, medical, agricultural, and ecological impacts. Such data have been collected world-wide for over a century, but remain scattered and largely inaccessible. In particular, with the ever-present and growing threat of arthropod pests and vectors of infectious diseases, there are numerous historical and ongoing surveillance efforts, but the data are not reported in consistent formats and typically lack sufficient metadata to make reuse and re-analysis possible. Here, we present the first-ever minimum information standard for arthropod abundance, Minimum Information for Reusable Arthropod Abundance Data (MIReAD). Developed with broad stakeholder collaboration, it balances sufficiency for reuse with the practicality of preparing the data for submission. It is designed to optimize data (re)usability in line with the "FAIR" (Findable, Accessible, Interoperable, and Reusable) principles of public data archiving (PDA). This standard will facilitate data unification across research initiatives and communities dedicated to surveillance for detection and control of vector-borne diseases and pests.
Rhodnius prolixus not only has served as a model organism for the study of insect physiology, but also is a major vector of Chagas disease, an illness that affects approximately seven million people worldwide. We sequenced the genome of R. prolixus, generated assembled sequences covering 95% of the genome (∼ 702 Mb), including 15,456 putative protein-coding genes, and completed comprehensive genomic analyses of this obligate blood-feeding insect. Although immune-deficiency (IMD)-mediated immune responses were observed, R. prolixus putatively lacks key components of the IMD pathway, suggesting a reorganization of the canonical immune signaling network. Although both Toll and IMD effectors controlled intestinal microbiota, neither affected Trypanosoma cruzi, the causal agent of Chagas disease, implying the existence of evasion or tolerance mechanisms. R. prolixus has experienced an extensive loss of selenoprotein genes, with its repertoire reduced to only two proteins, one of which is a selenocysteine-based glutathione peroxidase, the first found in insects. The genome contained actively transcribed, horizontally transferred genes from Wolbachia sp., which showed evidence of codon use evolution toward the insect use pattern. Comparative protein analyses revealed many lineage-specific expansions and putative gene absences in R. prolixus, including tandem expansions of genes related to chemoreception, feeding, and digestion that possibly contributed to the evolution of a blood-feeding lifestyle. The genome assembly and these associated analyses provide critical information on the physiology and evolution of this important vector species and should be instrumental for the development of innovative disease control methods.