Abstract
Motivation
Proteins usually perform their functions by interacting with other proteins, which is why accurately predicting protein–protein interaction (PPI) binding sites is a fundamental problem. Experimental methods are slow and expensive. Therefore, great efforts are being made towards increasing the performance of computational methods.
Results
We propose DEep Learning Prediction of Highly probable protein Interaction sites (DELPHI), a new sequence-based deep learning suite for PPI-binding site prediction. DELPHI has an ensemble structure that combines a CNN and an RNN component with a fine-tuning technique. Three novel features, HSP, position information and ProtVec, are used in addition to nine existing ones. We comprehensively compare DELPHI to nine state-of-the-art programmes on five datasets, and DELPHI outperforms the competing methods in all metrics even though its training dataset shares the least similarities with the testing datasets. In the most important metrics, AUPRC and MCC, it surpasses the second best programmes by as much as 18.5% and 27.7%, respectively. We also demonstrate that the improvement is essentially due to using the ensemble model and, especially, the three new features. Using DELPHI, we show that there is a strong correlation between protein-binding residues (PBRs) and sites with strong evolutionary conservation. In addition, DELPHI’s predicted PBR sites closely match known data from Pfam. DELPHI is available as open-source standalone software and as a web server.
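For readers unfamiliar with MCC, one of the two headline metrics above, the following is a minimal sketch of how the Matthews correlation coefficient is computed from per-residue binary predictions. It is not taken from the DELPHI code base; the function names and the 0.5 score threshold are assumptions made for illustration.

```python
import math


def confusion_counts(labels, scores, threshold=0.5):
    """Count TP, FP, TN, FN for binary labels and predicted scores.

    The threshold is a hypothetical choice for this sketch, not a value
    reported in the abstract.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and not label:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # By convention, MCC is set to 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example: 1 marks a binding residue, 0 a non-binding residue.
labels = [1, 1, 0, 0]
scores = [0.9, 0.8, 0.2, 0.1]
print(mcc(*confusion_counts(labels, scores)))
```

MCC uses all four cells of the confusion matrix, which makes it better suited than accuracy to data in which binding residues are a small minority.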
Availability and implementation
The DELPHI web server can be found at delphi.csd.uwo.ca/, with all datasets and results in this study. The trained models, the DELPHI standalone source code, and the feature computation pipeline are freely available at github.com/lucian-ilie/DELPHI.
Supplementary information
Supplementary data are available at Bioinformatics online.
The functional effects of most amino acid replacements accumulated during molecular evolution are unknown, because most are not observed naturally and the possible combinations are too numerous. We created 168 single mutations in wild-type Escherichia coli isopropylmalate dehydrogenase (IMDH) that match the differences found in wild-type Pseudomonas aeruginosa IMDH. 104 mutant enzymes performed similarly to E. coli wild-type IMDH, one was functionally enhanced, and 63 were functionally compromised. The transition from E. coli IMDH, or an ancestral form, to the functional wild-type P. aeruginosa IMDH requires extensive epistasis to ameliorate the combined effects of the deleterious mutations. This result stands in marked contrast with a basic assumption of molecular phylogenetics, that sites in sequences evolve independently of each other. Residues that affect function are scattered haphazardly throughout the IMDH structure. We screened for compensatory mutations at three sites, all of which lie near the active site and all of which are among the least active mutants. No compensatory mutations were found at two sites, indicating that a single site may engage in compound epistatic interactions. One complete and three partial compensatory mutations of the third site are remote and lie in a different domain. This demonstrates that epistatic interactions can occur between distant (>20 Å) sites. Phylogenetic analysis shows that incompatible mutations were fixed in different lineages.
Technological advances in DNA recovery and sequencing have drastically expanded the scope of genetic analyses of ancient specimens to the extent that full genomic investigations are now feasible and are quickly becoming standard. This trend has important implications for infectious disease research because genomic data from ancient microbes may help to elucidate mechanisms of pathogen evolution and adaptation for emerging and re-emerging infections. Here we report a reconstructed ancient genome of Yersinia pestis at 30-fold average coverage from Black Death victims securely dated to episodes of pestilence-associated mortality in London, England, 1348-1350. Genetic architecture and phylogenetic analysis indicate that the ancient organism is ancestral to most extant strains and sits very close to the ancestral node of all Y. pestis commonly associated with human infection. Temporal estimates suggest that the Black Death of 1347-1351 was the main historical event responsible for the introduction and widespread dissemination of the ancestor to all currently circulating Y. pestis strains pathogenic to humans, and further indicate that contemporary Y. pestis epidemics have their origins in the medieval era. Comparisons against modern genomes reveal no unique derived positions in the medieval organism, indicating that the perceived increased virulence of the disease during the Black Death may not have been due to bacterial phenotype. These findings support the notion that factors other than microbial genetics, such as environment, vector dynamics and host susceptibility, should be at the forefront of epidemiological discussions regarding emerging Y. pestis infections.
Encompassing the breadth of biodiversity in biomonitoring programmes has been frustrated by an inability to simultaneously identify large numbers of species accurately and in a timely fashion. Biomonitoring infers the state of an ecosystem from samples collected and identified using the best available taxonomic knowledge. The advent of DNA barcoding has now given way to the extraction of bulk DNA from mixed samples of organisms in environmental samples through the development of high-throughput sequencing (HTS). This DNA metabarcoding approach allows an unprecedented view of the true breadth and depth of biodiversity, but its adoption poses two important challenges. First, bioinformatics techniques must simultaneously perform complex analyses of large datasets and translate the results of these analyses to a range of users. Second, the insights gained from HTS need to be amalgamated with concepts such as Linnaean taxonomy and indicator species, which are less comprehensive but more intuitive. It is clear that we are moving beyond proof-of-concept studies to address the challenge of implementation of this new approach for environmental monitoring and regulation. Interpreting Darwin's ‘tangled bank’ through a DNA lens is now a reality, but the question remains: how can this information be generated and used reliably, and how does it relate to accepted norms in ecosystem study?
This article is part of the themed issue ‘From DNA barcodes to biomes’.
This book is about making weather warnings more effective in saving lives, property, infrastructure and livelihoods, but the underlying theme of the book is partnership. The book represents the warning process as a pathway linking observations to weather forecasts to hazard forecasts to socio-economic impact forecasts to warning messages to the protective decision, via a set of five bridges that cross the divides between the relevant organisations and areas of expertise. Each bridge represents the communication, translation and interpretation of information as it passes from one area of expertise to another and ultimately to the decision maker, who may be a professional or a member of the public. The authors explore the partnerships upon which each bridge is built, assess the expertise and skills that each partner brings and the challenges of communication between them, and discuss the structures and methods of working that build effective partnerships. The book is ordered according to the “first mile” paradigm, in which the decision maker comes first and the production chain, through the warning and forecast to the observations, is considered second. This approach emphasizes the importance of co-design and co-production throughout the warning process. The book is targeted at professionals and trainee professionals with a role in the warning chain, i.e. in weather services, emergency management agencies, disaster risk reduction agencies and risk management sections of infrastructure agencies. This is an open access book.
Gene expression in bacteria is a remarkably controlled and intricate process impacted by many factors. One such factor is the genomic position of a gene within a bacterial genome. Genes located near the origin of replication generally have a higher expression level, increased dosage, and are often more conserved than genes located farther from the origin of replication. The majority of the studies involved with these findings have only noted this phenomenon in a single gene or cluster of genes that was re-located to pre-determined positions within a bacterial genome. In this work, we look at the overall expression levels from eleven bacterial data sets from Escherichia coli, Bacillus subtilis, Streptomyces, and Sinorhizobium meliloti. We have confirmed that gene expression tends to decrease when moving away from the origin of replication in the majority of the replicons analysed in this study. This study sheds light on the impact of genomic location on molecular trends such as gene expression and highlights the importance of accounting for spatial trends in bacterial molecular analysis.
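The spatial trend described above can be illustrated with a minimal sketch that measures each gene's distance from the origin of replication and fits a straight line of expression against that distance. This is an illustration only: the abstract does not state which statistical method the study used, and the function names, coordinates and expression values below are hypothetical.

```python
def distance_from_origin(position, origin, replicon_length, circular=True):
    """Distance in base pairs between a gene and the origin of replication.

    On a circular replicon the shorter of the two arcs is used; on a linear
    replicon (assumed when circular=False) it is the plain coordinate gap.
    """
    offset = abs(position - origin)
    if not circular:
        return offset
    offset %= replicon_length
    return min(offset, replicon_length - offset)


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical toy data: gene start coordinates and expression values
# on a 1,000 bp circular replicon whose origin is at coordinate 0.
positions = [10, 250, 500, 750, 990]
expression = [9.0, 6.0, 3.0, 6.0, 9.0]
distances = [distance_from_origin(p, 0, 1000) for p in positions]
print(least_squares_slope(distances, expression))
```

A negative slope corresponds to expression decreasing with distance from the origin. The `circular` flag matters because not every replicon is circular; Streptomyces chromosomes, for example, are typically linear.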
The 14th-18th century pandemic of Yersinia pestis caused devastating disease outbreaks in Europe for almost 400 years. The reasons for plague's persistence and abrupt disappearance in Europe are poorly understood, but could have been due to either the presence of now-extinct plague foci in Europe itself, or successive disease introductions from other locations. Here we present five Y. pestis genomes from one of the last European outbreaks of plague, from 1722 in Marseille, France. The lineage identified has not been found in any extant Y. pestis foci sampled to date, and has its ancestry in strains obtained from victims of the 14th century Black Death. These data suggest the existence of a previously uncharacterized historical plague focus that persisted for at least three centuries. We propose that this disease source may have been responsible for the many resurgences of plague in Europe following the Black Death.
In recent years there has been a growing appreciation of the potential advantages of using a seamless approach to weather and climate prediction. However, what exactly should this mean in practice? To help address this question, we document some of the experiences already gathered over 25 years of developing and using the Met Office Unified Model (MetUM) for both weather and climate prediction. Overall, taking a unified approach has given enormous benefits, both scientific and in terms of efficiency, but we also detail some of the challenges it has presented and the approaches taken to overcome them.
Large-scale genome arrangement plays an important role in bacterial genome evolution. A substantial number of genes can be inserted into, deleted from, or rearranged within genomes during evolution. Detecting or inferring gene insertions/deletions is of interest because such information provides insights into bacterial genome evolution and speciation. However, efficient inference of genome events is difficult because genome comparisons alone do not generally supply enough information to distinguish insertions, deletions, and other rearrangements. In this study, homologous genes from the complete genomes of 13 closely related bacteria were examined. The presence or absence of genes from each genome was cataloged, and a maximum likelihood method was used to infer insertion/deletion rates according to the phylogenetic history of the taxa. It was found that whole gene insertions/deletions in genomes occur at rates comparable to or greater than the rate of nucleotide substitution and that higher insertion/deletion rates are often inferred to be present at the tips of the phylogeny with lower rates on more ancient interior branches. Recently transferred genes are under faster and relaxed evolution compared with more ancient genes. Together, this implies that many of the lineage-specific insertions are lost quickly during evolution and that perhaps a few of the genes inserted by lateral transfer are niche specific.
Smallpox holds a unique position in the history of medicine. It was the first disease for which a vaccine was developed and remains the only human disease eradicated by vaccination. Although there have been claims of smallpox in Egypt, India, and China dating back millennia 1–4, the timescale of emergence of the causative agent, variola virus (VARV), and how it evolved in the context of increasingly widespread immunization, have proven controversial 4–9. In particular, some molecular-clock-based studies have suggested that key events in VARV evolution only occurred during the last two centuries 4–6 and are hence in apparent conflict with anecdotal historical reports, although it is difficult to distinguish smallpox from other pustular rashes by description alone. To address these issues, we captured, sequenced, and reconstructed a draft genome of an ancient strain of VARV, sampled from a Lithuanian child mummy dating between 1643 and 1665 and close to the time of several documented European epidemics 1, 2, 10. When compared to vaccinia virus, this archival strain contained the same pattern of gene degradation as 20th century VARVs, indicating that such loss of gene function had occurred before ca. 1650. Strikingly, the mummy sequence fell basal to all currently sequenced strains of VARV on phylogenetic trees. Molecular-clock analyses revealed a strong clock-like structure and that the timescale of smallpox evolution is more recent than often supposed, with the diversification of major viral lineages only occurring within the 18th and 19th centuries, concomitant with the development of modern vaccination.
• Variola virus genome was reconstructed from a 17th century mummified child
• The archival strain is basal to all 20th century strains, with same gene degradation
• Molecular-clock analyses show that much of variola virus evolution occurred recently
Using ancient DNA sequences of variola virus recovered from the mummified remains of a 17th century child, Duggan et al. reconstruct the evolutionary history of smallpox. With the ancient strain, the genetic diversification of the smallpox virus is found to be more recent than previously supposed and concurrent with the onset of widespread vaccination.