eXamine is a Cytoscape app that displays set membership as contours on top of a node-link layout of a small graph. In addition to facilitating interpretation of enriched gene sets of small biological networks, eXamine can be used in other domains such as the visualization of communities in small social networks. eXamine was made available on the Cytoscape App Store in March 2014, has since registered more than 7,700 downloads, and has been highly rated by more than 25 users. In this paper, we present eXamine's new automation features that enable researchers to compose reproducible analysis workflows to generate visualizations of small, set-annotated graphs.
The understanding of molecular processes involved in a specific biological system can be significantly improved by combining and comparing different data sets and knowledge resources. However, these information sources often use different identification systems, and an identifier conversion step is required before any integration effort. Mapping between identifiers is often provided by the reference information resources, and several tools have been implemented to simplify their use. However, most of these tools do not combine the information provided by individual resources to increase the completeness of the mapping process. Also, deprecated identifiers from former versions of databases are not taken into account. Finally, automatically finding the most relevant path to map identifiers from one scope to another is often not trivial. The Biological Entity Dictionary (BED) addresses these three challenges by relying on a graph data model describing possible relationships between entities and their identifiers. This model has been implemented using Neo4j, and an R package provides functions to query the graph and also to create and feed a custom instance of the database. This design, combined with a local installation of the graph database and a cache system, makes BED very efficient at converting large lists of identifiers.
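The idea of finding a conversion path through a graph of identifier scopes can be illustrated with a minimal Python sketch. It is not BED's implementation or API: the scope names and edges are hypothetical, and "most relevant path" is assumed here to mean the path with the fewest conversion steps, found by breadth-first search.

```python
from collections import deque


def shortest_conversion_path(edges, source, target):
    """Return the shortest list of scopes linking source to target, or None.

    `edges` is an iterable of (scope_a, scope_b) pairs; each pair is treated
    as an undirected "can be converted" relationship (an assumption).
    """
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if source not in graph or target not in graph:
        return None

    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        # Sorted iteration keeps the result deterministic.
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical identifier scopes and the direct mappings between them.
SCOPE_EDGES = [
    ("entrez_gene", "ensembl_gene"),
    ("ensembl_gene", "ensembl_transcript"),
    ("ensembl_transcript", "ensembl_protein"),
    ("ensembl_protein", "uniprot"),
]

path = shortest_conversion_path(SCOPE_EDGES, "entrez_gene", "uniprot")
print(" -> ".join(path))
```

A real resource would also have to weigh edge types and handle deprecated identifiers, which this sketch leaves out.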
The PubMLST.org website hosts a collection of open-access, curated databases that integrate population sequence data with provenance and phenotype information for over 100 different microbial species and genera. Although the PubMLST website was conceived as part of the development of the first multi-locus sequence typing (MLST) scheme in 1998, the software it uses, the Bacterial Isolate Genome Sequence database (BIGSdb, published in 2010), enables PubMLST to include all levels of sequence data, from single gene sequences up to and including complete, finished genomes. Here we describe developments in the BIGSdb software made from publication to June 2018 and show how the platform realises microbial population genomics for a wide range of applications. The system is based on the gene-by-gene analysis of microbial genomes, with each deposited sequence annotated and curated to identify the genes present and systematically catalogue their variation. Originally intended as a means of characterising isolates with typing schemes, the synthesis of sequences and records of genetic variation with provenance and phenotype data permits highly scalable (whole genome sequence data for tens of thousands of isolates) means of addressing a wide range of functional questions, including: the prediction of antimicrobial resistance; likely cross-reactivity with vaccine antigens; and the functional activities of different variants that lead to key phenotypes. There are no limitations to the number of sequences, genetic loci, allelic variants or schemes (combinations of loci) that can be included, enabling each database to represent an expanding catalogue of the genetic variation of the population in question.
In addition to providing web-accessible analyses and links to third-party analysis and visualisation tools, the BIGSdb software includes a RESTful application programming interface (API) that enables access to all the underlying data for third-party applications and data analysis pipelines.
Functional characterisation of gene lists using Gene Ontology (GO) enrichment analysis is a common approach in computational biology, since many analysis methods end up with a list of genes as a result. Often there can be hundreds of functional terms that are significantly associated with a single list of genes and proper interpretation of such results can be a challenging endeavour. There are methods to visualise and aid the interpretation of these results, but most of them are limited to the results associated with one list of genes. However, in practice the number of gene lists can be considerably higher and common tools are not effective in such situations.
We introduce a novel R package, 'GOsummaries', that visualises the GO enrichment results as concise word clouds that can be combined when there are several gene lists. By also adding the graphs of corresponding raw experimental data, GOsummaries can create informative summary plots for various analyses such as differential expression or clustering. The case studies show that the GOsummaries plots allow rapid functional characterisation of complex sets of gene lists. The GOsummaries approach is particularly effective for Principal Component Analysis (PCA).
By adding functional annotation to the principal components, GOsummaries significantly improves the interpretability of PCA results. The GOsummaries layout for PCA can be effective even in situations where we cannot directly apply the GO analysis. For example, in the case of metabolomics or metagenomics data, it is possible to show the features with significant associations to the components instead of GO terms.
The GOsummaries package is available under GPL-2 licence at Bioconductor (http://www.bioconductor.org/packages/release/bioc/html/GOsummaries.html).
We present a comprehensive review of modelling approaches and associated software tools that address district-level energy systems. Buildings play an important role in urban energy systems regarding both the demand and supply of energy. It is no longer sufficient to simulate building energy use assuming isolation from the microclimate and energy system in which they operate, or to model an urban energy system without consideration of the buildings that it serves. This review complements previous studies by focussing on models that address district-level interactions in energy systems, and by assessing the capabilities of the software tools available alongside the theory of the modelling approaches used.
New models and tools that address these district-level interactions are reviewed and their competences assessed. These are divided into the following sections: district energy systems (including heat networks, multi-energy systems and low-temperature networks), renewable energy generation (including solar, bioenergy, wind and the related topic of seasonal storage), and the urban microclimate as it relates to energy demands. The scope and detail covered by twenty cross-disciplinary tools are summarised in a matrix; many other tools that focus on specific areas are also discussed. We end by summarising the current state of district-scale urban energy modelling as it relates to the built environment, along with our perspective on future challenges and research directions.
• Hybrid renewable energy systems with different energy sources are classified.
• Evaluation indicators for sizing of hybrid renewable energy systems are presented.
• Various sizing methodologies of hybrid renewable energy systems are summarized.
• Crucial challenges and findings for optimum design of hybrid systems are discussed.
On account of the continuously increasing electricity consumption and concern for environmental issues, renewable energy sources have been widely utilized to generate electricity, and they present advantages such as cleanness, easy availability, low cost, and abundance. In 2017, the installed capacity of solar and wind power worldwide amounted to 903.1 GW, which represented 41.4% of the total installed capacity of renewable energy. Hybrid renewable energy systems have been proposed to overcome the variability and randomness of a single renewable energy source such as solar and wind power, and more than 80% of them are off-grid systems. Meanwhile, it is necessary to determine the size of each component to design a reliable and cost-effective hybrid renewable energy system. Therefore, this paper mainly reviews the recent classification, evaluation indicators, and sizing methodologies of hybrid renewable energy systems (stand-alone and grid-connected). Further optimization research is still required to improve the overall performance of hybrid renewable energy systems. Decision makers can explore and develop hybrid systems including hydropower and/or pumped hydro storage based on their superiority, and they should also pay attention to the development of hybrid energy storage. In addition to reliability and economic indicators, which have applications above 80%, more attention should be paid to environmental and social indicators to determine the system capacity, and some new indicators should be disseminated. The features of traditional, artificial intelligence, and hybrid methods, in addition to software tools, were assessed. Moreover, hybrid methods with high accuracy and fast convergence, which can surmount the defects of single methods, are the most promising compared with the other three sizing methods. This review is valuable for understanding the current status and development trends of optimal sizing for hybrid renewable energy systems.
Clustering scientific publications is an important problem in bibliometric research. We demonstrate how two software tools, CitNetExplorer and VOSviewer, can be used to cluster publications and to analyze the resulting clustering solutions. CitNetExplorer is used to cluster a large set of publications in the field of astronomy and astrophysics. The publications are clustered based on direct citation relations. CitNetExplorer and VOSviewer are used together to analyze the resulting clustering solutions. Both tools use visualizations to support the analysis of the clustering solutions, with CitNetExplorer focusing on the analysis at the level of individual publications and VOSviewer focusing on the analysis at an aggregate level. The demonstration provided in this paper shows how a clustering of publications can be created and analyzed using freely available software tools. Using the approach presented in this paper, bibliometricians are able to carry out sophisticated cluster analyses without the need to have a deep knowledge of clustering techniques and without requiring advanced computer skills.
The khmer package is a freely available software library for working efficiently with fixed length DNA words, or k-mers. khmer provides implementations of a probabilistic k-mer counting data structure, a compressible De Bruijn graph representation, De Bruijn graph partitioning, and digital normalization. khmer is implemented in C++ and Python, and is freely available under the BSD license at
https://github.com/dib-lab/khmer/.
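To make the notion of a probabilistic k-mer counting data structure concrete, the following is a minimal pure-Python sketch of a Count-Min Sketch over k-mers. It is an illustration under stated assumptions, not khmer's code or API: the class name, the table sizes, and the use of salted BLAKE2b hashes are all choices made for this example.

```python
import hashlib


class KmerCountMinSketch:
    """Approximate k-mer counter; counts may be overestimated, never underestimated."""

    def __init__(self, k, width=1009, depth=4):
        self.k = k
        self.width = width    # counters per row (illustrative size)
        self.depth = depth    # number of independent hash rows
        self.tables = [[0] * width for _ in range(depth)]

    def _cells(self, kmer):
        # One salted hash per row gives `depth` different bucket choices.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                kmer.encode("ascii"), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, kmer):
        for row, col in self._cells(kmer):
            self.tables[row][col] += 1

    def count(self, kmer):
        # The minimum over rows is the least collision-inflated estimate.
        return min(self.tables[row][col] for row, col in self._cells(kmer))

    def consume(self, sequence):
        """Add every overlapping k-mer of `sequence` to the sketch."""
        for i in range(len(sequence) - self.k + 1):
            self.add(sequence[i:i + self.k])


sketch = KmerCountMinSketch(k=3)
sketch.consume("ACGTACGT")  # k-mers: ACG, CGT, GTA, TAC, ACG, CGT
print(sketch.count("ACG"), sketch.count("TTT"))
```

Memory use is fixed by `width` and `depth` regardless of how many distinct k-mers are seen; the price is that hash collisions can inflate an estimate, so a reported count is an upper bound on the true count.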
Docker has become a very popular container-based virtualization platform for software distribution that has revolutionized the way in which scientific software and software dependencies (software stacks) can be packaged, distributed, and deployed. Docker makes the complex and time-consuming installation procedures needed for scientific software a one-time process. Because it enables platform-independent installation, versioning of software environments, and easy redeployment and reproducibility, Docker is an ideal candidate for the deployment of identical software stacks on different compute environments such as XSEDE and Amazon AWS. CyVerse's Discovery Environment also uses Docker for integrating its powerful, community-recommended software tools into CyVerse's production environment for public use. This paper will help users bring their tools into the CyVerse Discovery Environment (DE), which will not only allow users to integrate their tools with relative ease compared to the earlier method of tool deployment in DE but will also help users to share their apps with collaborators and release them for public use.