Drug development is increasing in cost whilst decreasing in productivity. There is a general acceptance that the current paradigm of R&D needs to change. One alternative approach is drug repositioning. With target-based approaches utilised heavily in the field of drug discovery, it becomes increasingly necessary to have a systematic method to rank gene-disease associations. Although methods already exist to collect, integrate and score these associations, they are often not a reliable reflection of expert knowledge. Furthermore, the amount of data available in all areas covered by bioinformatics is increasing dramatically year on year. It thus makes sense to move away from more generalised, hypothesis-driven approaches to research towards one that allows data to generate their own hypotheses. We introduce an integrated, data-driven approach to drug repositioning. We first apply a Bayesian statistics approach to rank 309,885 gene-disease associations using existing knowledge. Ranked associations are then integrated with other biological data to produce a semantically-rich drug discovery network. Using this network, we show how our approach identifies diseases of the central nervous system (CNS) as an area of interest. CNS disorders are identified due to the low numbers of such disorders that currently have marketed treatments, in comparison to other therapeutic areas. We then systematically mine our network for semantic subgraphs that allow us to infer drug-disease relations that are not captured in the network. We identify and rank 275,934 drug-disease has_indication associations after filtering out those that are more likely to be side effects, whilst commenting on the top ranked associations in more detail. The dataset has been created in Neo4j and is available for download at https://bitbucket.org/ncl-intbio/genediseaserepositioning along with a Java implementation of the searching algorithm.
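The idea of inferring drug-disease relations that are absent from the network can be illustrated with a minimal Python sketch. It is not the authors' Java search algorithm or their Neo4j schema: the drug, gene and disease identifiers, the scores, and the function name `infer_indications` are all hypothetical, and the subgraph is assumed to be a simple drug → target gene → disease path ranked by the gene-disease association score.

```python
# Toy illustration only: all identifiers and scores below are hypothetical.

# Drug -> target genes (assumed edge type: drug "binds_to" gene product).
drug_targets = {
    "drug_A": ["GENE1", "GENE2"],
    "drug_B": ["GENE2"],
}

# (gene, disease) -> ranked association score (higher means stronger).
gene_disease_score = {
    ("GENE1", "disease_X"): 0.9,
    ("GENE2", "disease_X"): 0.4,
    ("GENE2", "disease_Y"): 0.7,
}

# Drug-disease indications already captured in the network.
known_indications = {("drug_A", "disease_X")}


def infer_indications(drug_targets, gene_disease_score, known):
    """Follow drug -> gene -> disease paths and keep, for each new
    drug-disease pair, the best supporting gene-disease score."""
    best = {}
    for drug, genes in drug_targets.items():
        for (gene, disease), score in gene_disease_score.items():
            if gene in genes and (drug, disease) not in known:
                key = (drug, disease)
                best[key] = max(best.get(key, 0.0), score)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))


ranked = infer_indications(drug_targets, gene_disease_score, known_indications)
for (drug, disease), score in ranked:
    print(f"{drug} has_indication {disease} (score {score})")
```

In this toy network, pairs that are already known indications are skipped, so only new candidate has_indication relations are ranked. A real pipeline would also need the side-effect filtering step described in the abstract, which is not modelled here.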
RNA sequencing in situ allows for whole-transcriptome characterization at high resolution, while retaining spatial information. These data present an analytical challenge for bioinformatics: how to leverage spatial information effectively? Properties of data with a spatial dimension require special handling, which necessitates a different set of statistical and inferential considerations when compared to non-spatial data. The geographical sciences primarily use spatial data and have developed methods to analyse them. Here we discuss the challenges associated with spatial analysis and examine how we can take advantage of practice from the geographical sciences to realize the full potential of spatial information in transcriptomic datasets.
Abstract
Background
Probabilistic functional integrated networks (PFINs) are designed to aid our understanding of cellular biology and can be used to generate testable hypotheses about protein function. PFINs are generally created by scoring the quality of interaction datasets against a Gold Standard dataset, usually chosen from a separate high-quality data source, prior to their integration. Use of an external Gold Standard has several drawbacks, including data redundancy, data loss and the need for identifier mapping, which can complicate the network build and impact on PFIN performance. Additionally, there typically are no Gold Standard data for non-model organisms.
Results
We describe the development of an integration technique, ssNet, that scores and integrates both high-throughput and low-throughput data from a single source database in a consistent manner without the need for an external Gold Standard dataset. Using data from Saccharomyces cerevisiae, we show that ssNet is easier and faster, overcoming the challenges of data redundancy, Gold Standard bias and ID mapping. In addition, ssNet results in less loss of data and produces a more complete network.
Conclusions
The ssNet method allows PFINs to be built successfully from a single database, while producing comparable network performance to networks scored using an external Gold Standard source and with reduced data loss.
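For context, the conventional Gold Standard scoring step described in the Background can be sketched as follows. The abstract does not name a scoring function, so this sketch assumes the widely used log-likelihood score, LLS = ln[(P(L|E)/P(¬L|E)) / (P(L)/P(¬L))], where L denotes a true linkage and E the evidence dataset. It does not reproduce ssNet itself, and the function name and toy protein pairs are hypothetical.

```python
import math


def log_likelihood_score(dataset_pairs, gold_positive, gold_negative):
    """Score one interaction dataset against a Gold Standard.

    Posterior odds: Gold Standard positives vs negatives found in the dataset.
    Prior odds: positives vs negatives in the Gold Standard as a whole.
    Returns the natural log of posterior odds over prior odds.
    """
    pos = len(dataset_pairs & gold_positive)
    neg = len(dataset_pairs & gold_negative)
    if pos == 0 or neg == 0:
        # Odds are undefined without overlap; real pipelines handle this
        # case with pseudocounts or by discarding the dataset.
        raise ValueError("dataset must overlap both Gold Standard sets")
    prior_odds = len(gold_positive) / len(gold_negative)
    return math.log((pos / neg) / prior_odds)


def pair(a, b):
    """Unordered protein pair, so that (a, b) and (b, a) compare equal."""
    return frozenset((a, b))


# Hypothetical toy data: 2 positive and 4 negative Gold Standard pairs.
gold_positive = {pair("P1", "P2"), pair("P3", "P4")}
gold_negative = {pair("P1", "P5"), pair("P2", "P6"),
                 pair("P3", "P7"), pair("P4", "P8")}
dataset = {pair("P2", "P1"), pair("P3", "P4"), pair("P1", "P5"),
           pair("P9", "P10")}

score = log_likelihood_score(dataset, gold_positive, gold_negative)
print(round(score, 3))
```

Note that the pair P9/P10 contributes nothing to the score because it is absent from the Gold Standard; this is one form of the data loss that the abstract attributes to external Gold Standards.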
Interactome analyses have traditionally been applied to yeast, human and other model organisms due to the availability of protein-protein interaction data for these species. Recently, these techniques have been applied to more diverse species using computational interaction prediction from genome sequence and other data types. This review describes the various types of computational interactome networks that can be created and how they have been used in diverse eukaryotic species, highlighting some of the key interactome studies in non-model organisms.
The transcription error rate estimated from mistakes in end product RNAs is 10⁻³–10⁻⁵. We analyzed the fidelity of nascent RNAs from all actively transcribing elongation complexes (ECs) in Escherichia coli and Saccharomyces cerevisiae and found that 1–3% of all ECs in wild-type cells, and 5–7% of all ECs in cells lacking proofreading factors are, in fact, misincorporated complexes. With the exception of a number of sequence-dependent hotspots, most misincorporations are distributed relatively randomly. Misincorporation at hotspots does not appear to be stimulated by pausing. Since misincorporation leads to a strong pause of transcription due to backtracking, our findings indicate that misincorporation could be a major source of transcriptional pausing and lead to conflicts with other RNA polymerases and replication in bacteria and eukaryotes. This observation implies that physical resolution of misincorporated complexes may be the main function of the proofreading factors Gre and TFIIS. Although misincorporation mechanisms between bacteria and eukaryotes appear to be conserved, the results suggest the existence of a bacteria-specific mechanism(s) for reducing misincorporation in protein-coding regions. The links between transcription fidelity, human disease, and phenotypic variability in genetically-identical cells can be explained by the accumulation of misincorporated complexes, rather than mistakes in mature RNA.
Single-cell sequencing technologies have emerged as a revolutionary tool with transformative new methods to profile genetic, epigenetic, spatial, and lineage information in individual cells. Single-cell RNA sequencing (scRNA-Seq) allows researchers to collect large datasets detailing the transcriptomes of individual cells in space and time and is increasingly being applied to reveal cellular heterogeneity in retinal development, normal physiology, and disease, and provide new insights into cell-type specific markers and signaling pathways. In recent years, scRNA-Seq datasets have been generated from retinal tissue and pluripotent stem cell-derived retinal organoids. Their cross-comparison enables staging of retinal organoids, identification of specific cells in developing and adult human neural retina and provides deeper insights into cell-type sub-specification and geographical differences. In this article, we review the recent rapid progress in scRNA-Seq analyses of retina and retinal organoids, the questions that remain unanswered and the technical challenges that need to be overcome to achieve consistent results that reflect the complexity, functionality, and interactions of all retinal cell types.
Abnormalities of the arterial valves, including bicuspid aortic valve (BAV), are amongst the most common congenital defects and are a significant cause of morbidity as well as predisposition to disease in later life. Despite this, and compounded by their small size and relative inaccessibility, there is still much to understand about how the arterial valves form and remodel during embryogenesis, both at the morphological and genetic level. Here we set out to address this in human embryos, using Spatial Transcriptomics (ST). We show that ST can be used to investigate the transcriptome of the developing arterial valves, circumventing the problems of accurately dissecting out these tiny structures from the developing embryo. We show that the transcriptomes of CS16 and CS19 arterial valves overlap considerably, despite being several days apart in terms of human gestation, and that expression data confirm that the great majority of the most differentially expressed genes are valve-specific. Moreover, we show that the transcriptome of the human arterial valves overlaps with that of mouse atrioventricular valves from a range of gestations, validating our dataset but also highlighting novel genes, including four that are not found in the mouse genome and have not previously been linked to valve development. Importantly, our data suggest that valve transcriptomes are under-represented when using commonly used databases to filter for genes important in cardiac development; this means that causative variants in valve-related genes may be excluded when filtering genomic data in analyses of, for example, BAV. Finally, we highlight "novel" pathways that likely play important roles in arterial valve development, showing that mouse knockouts of RBP1 have arterial valve defects. Thus, this study has confirmed the utility of ST for studies of the developing heart valves and broadens our knowledge of the genes and signalling pathways important in human valve development.
• Bacterial genomes are far more complex and dynamic than previously thought.
• Sequencing has quickly taken over from array-based methods for the study of transcriptomics.
• Prokaryote-specific deep sequencing methods have been developed to investigate transcription in vivo.
• There are many variations and options to these sequencing protocols and analyses.
• These analyses require different processing than RNA-seq gene expression studies.
The identification of the protein-coding regions of a genome is straightforward due to the universality of start and stop codons. However, the boundaries of the transcribed regions, conditional operon structures, non-coding RNAs and the dynamics of transcription, such as pausing of elongation, are non-trivial to identify, even in the comparatively simple genomes of prokaryotes. Traditional methods for the study of these areas, such as tiling arrays, are noisy, labour-intensive and lack the resolution required for densely-packed bacterial genomes. Recently, deep sequencing has become increasingly popular for the study of the transcriptome due to its lower costs, higher accuracy and single nucleotide resolution. These methods have revolutionised our understanding of prokaryotic transcriptional dynamics. Here, we review the deep sequencing and data analysis techniques that are available for the study of transcription in prokaryotes, and discuss the bioinformatic considerations of these analyses.
There are two major pathways leading to induction of NF-κB subunits. The classical (or canonical) pathway typically leads to the induction of RelA or c-Rel containing complexes, and involves the degradation of IκBα in a manner dependent on IκB kinase (IKK) β and the IKK regulatory subunit NEMO. The alternative (or non-canonical) pathway involves the inducible processing of p100 to p52, leading to the induction of NF-κB2(p52)/RelB containing complexes, and is dependent on IKKα and NF-κB inducing kinase (NIK). Here we demonstrate that in primary human fibroblasts, the alternative NF-κB pathway subunits NF-κB2 and RelB have multiple, but distinct, effects on the expression of key regulators of the cell cycle, reactive oxygen species (ROS) generation and protein stability. Specifically, following siRNA knockdown, quantitative PCR, western blot analyses and chromatin immunoprecipitation (ChIP) show that NF-κB2 regulates the expression of CDK4 and CDK6, while RelB, through the regulation of genes such as PSMA5 and ANAPC1, regulates the stability of p21WAF1 and the tumour suppressor p53. These combine to regulate the activity of the retinoblastoma protein, Rb, leading to induction of polycomb protein EZH2 expression. Moreover, our ChIP analysis demonstrates that EZH2 is also a direct NF-κB target gene. Microarray analysis revealed that in fibroblasts, EZH2 antagonizes a subset of p53 target genes previously associated with the senescent cell phenotype, including DEK and RacGAP1. We show that this pathway provides the major route of crosstalk between the alternative NF-κB pathway and p53, a consequence of which is to suppress cell senescence. Importantly, we find that activation of NF-κB also induces EZH2 expression in CD40L stimulated cells from Chronic Lymphocytic Leukemia patients. We therefore propose that this pathway provides a mechanism through which microenvironment induced NF-κB can inhibit tumor suppressor function and promote tumorigenesis.