Motivation: In silico experiments in bioinformatics involve the co-ordinated use of computational tools and information repositories. A growing number of these resources are being made available with programmatic access in the form of Web services. Bioinformatics scientists will need to orchestrate these Web services in workflows as part of their analyses. Results: The Taverna project has developed a tool for the composition and enactment of bioinformatics workflows for the life sciences community. The tool includes a workbench application which provides a graphical user interface for the composition of workflows. These workflows are written in a new language called the simple conceptual unified flow language (Scufl), whereby each step within a workflow represents one atomic task. Two examples are used to illustrate the ease with which in silico experiments can be represented as Scufl workflows using the workbench application. Availability: The Taverna workflow system is available as open source and can be downloaded with example Scufl workflows from http://taverna.sourceforge.net
Abstract
Background
Probabilistic functional integrated networks (PFINs) are designed to aid our understanding of cellular biology and can be used to generate testable hypotheses about protein function. PFINs are generally created by scoring the quality of interaction datasets against a Gold Standard dataset, usually chosen from a separate high-quality data source, prior to their integration. Use of an external Gold Standard has several drawbacks, including data redundancy, data loss and the need for identifier mapping, which can complicate the network build and impact on PFIN performance. Additionally, there typically are no Gold Standard data for non-model organisms.
Results
We describe the development of an integration technique, ssNet, that scores and integrates both high-throughput and low-throughput data from a single source database in a consistent manner without the need for an external Gold Standard dataset. Using data from Saccharomyces cerevisiae, we show that ssNet is easier and faster, overcoming the challenges of data redundancy, Gold Standard bias and ID mapping. In addition, ssNet results in less loss of data and produces a more complete network.
Conclusions
The ssNet method allows PFINs to be built successfully from a single database, while producing comparable network performance to networks scored using an external Gold Standard source and with reduced data loss.
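The abstract above does not say how a dataset is scored against a Gold Standard. As a rough illustration only, the sketch below uses a log-likelihood score, a scheme commonly used for this purpose; the formula choice, the function name `log_likelihood_score` and the toy protein pairs are all assumptions, not details taken from the ssNet work.

```python
import math


def pair(a, b):
    """Represent an undirected protein pair so that (A, B) equals (B, A)."""
    return frozenset((a, b))


def log_likelihood_score(dataset_pairs, gold_positive, gold_negative):
    """Score one interaction dataset against a Gold Standard (assumed scheme).

    LLS = ln( (positives in dataset / negatives in dataset)
              / (all positives / all negatives) )
    Pairs absent from the Gold Standard cannot contribute to the score.
    """
    pos_in = len(dataset_pairs & gold_positive)
    neg_in = len(dataset_pairs & gold_negative)
    if pos_in == 0 or neg_in == 0:
        raise ValueError("dataset needs overlap with both positives and negatives")
    dataset_odds = pos_in / neg_in
    prior_odds = len(gold_positive) / len(gold_negative)
    return math.log(dataset_odds / prior_odds)


# Hypothetical toy data: 3 positive and 6 negative Gold Standard pairs.
gold_positive = {pair("A", "B"), pair("A", "C"), pair("B", "C")}
gold_negative = {
    pair("A", "D"), pair("B", "D"), pair("C", "D"),
    pair("A", "E"), pair("B", "E"), pair("C", "E"),
}
# The pair (X, Y) is not covered by the Gold Standard at all.
dataset = {pair("A", "B"), pair("A", "C"), pair("A", "D"), pair("X", "Y")}

score = log_likelihood_score(dataset, gold_positive, gold_negative)
```

In this toy case the pair (X, Y) plays no part in the score because neither protein appears in the Gold Standard, which is one way the coverage problems listed in the Background can arise.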
The Bioperl project is an international open-source collaboration of biologists, bioinformaticians, and computer scientists that has evolved over the past 7 years into the most comprehensive library of Perl modules available for managing and manipulating life-science information. Bioperl provides an easy-to-use, stable, and consistent programming interface for bioinformatics application programmers. The Bioperl modules have been successfully and repeatedly used to reduce otherwise complex tasks to only a few lines of code. The Bioperl object model has been proven to be flexible enough to support enterprise-level applications such as EnsEMBL, while maintaining an easy learning curve for novice Perl programmers. Bioperl is capable of executing analyses and processing results from programs such as BLAST, ClustalW, or the EMBOSS suite. Interoperation with modules written in Python and Java is supported through the evolving BioCORBA bridge. Bioperl provides access to data stores such as GenBank and SwissProt via a flexible series of sequence input/output modules, and to the emerging common sequence data storage format of the Open Bioinformatics Database Access project. This study describes the overall architecture of the toolkit and the problem domains that it addresses, and gives specific examples of how the toolkit can be used to solve common life-sciences problems. We conclude with a discussion of how the open-source nature of the project has contributed to the development effort.
There has been a dramatic increase in the amount of quantitative data derived from the measurement of changes at different levels of biological complexity during the post-genomic era. However, there are a number of issues associated with the use of computational tools employed for the analysis of such data. For example, computational tools such as R and MATLAB require prior knowledge of their programming languages in order to implement statistical analyses on data. Combining two or more tools in an analysis may also be problematic since data may have to be manually copied and pasted between separate user interfaces for each tool. Furthermore, this transfer of data may require a reconciliation step in order for there to be interoperability between computational tools.
Developments in the Taverna workflow system have enabled pipelines to be constructed and enacted for generic and ad hoc analyses of quantitative data. Here, we present an example of such a workflow involving the statistical identification of differentially-expressed genes from microarray data followed by the annotation of their relationships to cellular processes. This workflow makes use of customised maxdBrowse web services, a system that allows Taverna to query and retrieve gene expression data from the maxdLoad2 microarray database. These data are then analysed by R to identify differentially-expressed genes using the Taverna RShell processor which has been developed for invoking this tool when it has been deployed as a service using the RServe library. In addition, the workflow uses Beanshell scripts to reconcile mismatches of data between services as well as to implement a form of user interaction for selecting subsets of microarray data for analysis as part of the workflow execution. A new plugin system in the Taverna software architecture is demonstrated by the use of renderers for displaying PDF files and CSV formatted data within the Taverna workbench.
Taverna can be used by data analysis experts as a generic tool for composing ad hoc analyses of quantitative data by combining the use of scripts written in the R programming language with tools exposed as services in workflows. When these workflows are shared with colleagues and the wider scientific community, they provide an approach for other scientists wanting to use tools such as R without having to learn the corresponding programming language to analyse their own data.
Synthetic biology open language visual (SBOL visual) version 2.2 Baig, Hasan; Fontanarrosa, Pedro; Kulkarni, Vishwesh ...
Berichte aus der medizinischen Informatik und Bioinformatik/Journal of integrative bioinformatics,
06/2020, Volume 17, Issue 2
Journal Article
Peer reviewed
Open access
People who are engineering biological organisms often find it useful to communicate in diagrams, both about the structure of the nucleic acid sequences that they are engineering and about the functional relationships between sequence features and other molecular species. Some typical practices and conventions have begun to emerge for such diagrams. The Synthetic Biology Open Language Visual (SBOL Visual) has been developed as a standard for organizing and systematizing such conventions in order to produce a coherent language for expressing the structure and function of genetic designs. This document details version 2.2 of SBOL Visual, which builds on the prior SBOL Visual 2.1 in several ways. First, the grounding of molecular species glyphs is changed from BioPAX to SBO, aligning with the use of SBO terms for interaction glyphs. Second, new glyphs are added for proteins, introns, and polypeptide regions (e.g., protein domains), the prior recommended macromolecule glyph is deprecated in favor of its alternative, and small polygons are introduced as alternative glyphs for simple chemicals.
Synthetic Biology Open Language (SBOL) Version 2.2.0 Cox, Robert Sidney; Madsen, Curtis; McLaughlin, James Alastair ...
Berichte aus der medizinischen Informatik und Bioinformatik/Journal of integrative bioinformatics,
4/2018, Volume 15, Issue 1
Journal Article
Peer reviewed
Open access
Synthetic biology builds upon the techniques and successes of genetics, molecular biology, and metabolic engineering by applying engineering principles to the design of biological systems. The field still faces substantial challenges, including long development times, high rates of failure, and poor reproducibility. One method to ameliorate these problems would be to improve the exchange of information about designed systems between laboratories. The synthetic biology open language (SBOL) has been developed as a standard to support the specification and exchange of biological design information in synthetic biology, filling a need not satisfied by other pre-existing standards. This document details version 2.2.0 of SBOL that builds upon version 2.1.0 published in last year’s JIB special issue. In particular, SBOL 2.2.0 includes improved description and validation rules for genetic design provenance, an extension to support combinatorial genetic designs, a new class to add non-SBOL data as attachments, a new class for genetic design implementations, and a description of a methodology to describe the entire design-build-test-learn cycle within the SBOL data model.
Synthetic Biology Open Language (SBOL) Version 2.3 Madsen, Curtis; Goñi Moreno, Angel; P, Umesh ...
Berichte aus der medizinischen Informatik und Bioinformatik/Journal of integrative bioinformatics,
06/2019, Volume 16, Issue 2
Journal Article
Peer reviewed
Open access
Synthetic biology builds upon the techniques and successes of genetics, molecular biology, and metabolic engineering by applying engineering principles to the design of biological systems. The field still faces substantial challenges, including long development times, high rates of failure, and poor reproducibility. One method to ameliorate these problems is to improve the exchange of information about designed systems between laboratories. The synthetic biology open language (SBOL) has been developed as a standard to support the specification and exchange of biological design information in synthetic biology, filling a need not satisfied by other pre-existing standards. This document details version 2.3.0 of SBOL, which builds upon version 2.2.0 published in last year's JIB Standards in Systems Biology special issue. In particular, SBOL 2.3.0 includes means of succinctly representing sequence modifications, such as insertion, deletion, and replacement, an extension to support organization and attachment of experimental data derived from designs, and an extension for describing numerical parameters of design elements. The new version also includes specifying types of synthetic biology activities, unambiguous locations for sequences with multiple encodings, refinement of a number of validation rules, improved figures and examples, and clarification on a number of issues related to the use of external ontology terms.
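To make the three kinds of sequence modification named above concrete, here is a minimal sketch of applying an insertion, a deletion and a replacement to a plain nucleotide string. It does not reproduce the SBOL 2.3 data model: the `Edit` class, the `apply_edits` function and the 0-based, end-exclusive coordinates are all hypothetical choices made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """One hypothetical sequence modification on the ORIGINAL sequence."""
    kind: str       # "insertion", "deletion" or "replacement"
    start: int      # 0-based index where the edit begins
    end: int        # 0-based exclusive end; equals start for an insertion
    new: str = ""   # inserted or replacing bases; empty for a deletion


def _check(edit, length):
    """Reject edits whose coordinates or payload do not match their kind."""
    if not 0 <= edit.start <= edit.end <= length:
        raise ValueError(f"coordinates out of range: {edit}")
    if edit.kind == "insertion":
        ok = edit.start == edit.end and edit.new != ""
    elif edit.kind == "deletion":
        ok = edit.end > edit.start and edit.new == ""
    elif edit.kind == "replacement":
        ok = edit.end > edit.start and edit.new != ""
    else:
        ok = False
    if not ok:
        raise ValueError(f"malformed edit: {edit}")


def apply_edits(sequence, edits):
    """Apply non-overlapping edits, rightmost first, so that coordinates
    of the remaining edits still refer to untouched parts of the sequence."""
    result = sequence
    right_limit = len(sequence)
    for edit in sorted(edits, key=lambda e: (e.start, e.end), reverse=True):
        _check(edit, len(sequence))
        if edit.end > right_limit:
            raise ValueError(f"overlapping edit: {edit}")
        result = result[:edit.start] + edit.new + result[edit.end:]
        right_limit = edit.start
    return result


# Hypothetical example on an 8-base sequence.
original = "ATGCCGTA"
edits = [
    Edit("insertion", 3, 3, "GG"),    # insert GG before index 3
    Edit("deletion", 5, 7),           # remove the bases at indices 5 and 6
    Edit("replacement", 0, 1, "C"),   # replace the first base with C
]
edited = apply_edits(original, edits)
```

Recording edits against the original coordinates, rather than rewriting the whole sequence, is what keeps such a representation succinct; the sketch only checks that edits are well formed and do not overlap.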
Synthetic biology open language visual (SBOL Visual) version 2.3 Baig, Hasan; Fontanarrosa, Pedro; Kulkarni, Vishwesh ...
Berichte aus der medizinischen Informatik und Bioinformatik/Journal of integrative bioinformatics,
06/2021, Volume 18, Issue 3
Journal Article
Peer reviewed
Open access
People who are engineering biological organisms often find it useful to communicate in diagrams, both about the structure of the nucleic acid sequences that they are engineering and about the functional relationships between sequence features and other molecular species. Some typical practices and conventions have begun to emerge for such diagrams. The Synthetic Biology Open Language Visual (SBOL Visual) has been developed as a standard for organizing and systematizing such conventions in order to produce a coherent language for expressing the structure and function of genetic designs. This document details version 2.3 of SBOL Visual, which builds on the prior SBOL Visual 2.2 in several ways. First, the specification now includes higher-level “interactions with interactions,” such as an inducer molecule stimulating a repression interaction. Second, binding with a nucleic acid backbone can be shown by overlapping glyphs, as with other molecular complexes. Finally, a new “unspecified interaction” glyph is added for visualizing interactions whose nature is unknown, the “insulator” glyph is deprecated in favor of a new “inert DNA spacer” glyph, and the polypeptide region glyph is recommended for showing 2A sequences.
Saint is a web application which provides a lightweight annotation integration environment for quantitative biological models. The system enables modellers to rapidly mark up models with biological information derived from a range of data sources. Availability and Implementation: Saint is freely available for use on the web at http://www.cisban.ac.uk/saint. The web application is implemented in Google Web Toolkit and Tomcat, with all major browsers supported. The Java source code is freely available for download at http://saint-annotate.sourceforge.net. The Saint web server requires an installation of libSBML and has been tested on Linux (32-bit Ubuntu 8.10 and 9.04). Contact: helpdesk@cisban.ac.uk; a.l.lister@ncl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.