How well can a QSAR model predict the activity of a molecule not in the training set used to create the model? A set of retrospective cross-validation experiments using 20 diverse in-house activity sets was carried out to find a good discriminator of prediction accuracy, as measured by the root-mean-square difference between observed and predicted activity. Among the measures we tested, two seem useful: the similarity of the molecule to be predicted to the nearest molecule in the training set and/or the number of neighbors in the training set, where neighbors are those more similar than a user-chosen cutoff. The molecules with the highest similarity and/or the most neighbors are the best-predicted. This trend holds true for narrow training sets and, to a lesser degree, for many diverse training sets, and it does not depend on which QSAR method or descriptor is used. One may define the similarity using a different descriptor than that used for the QSAR model. The similarity dependence for diverse training sets is somewhat unexpected. It appears to be greater for those data sets where the association of similar activities with similar structures (as encoded in the Patterson plot) is stronger. We propose a way to estimate the reliability of the prediction of an arbitrary chemical structure on a given QSAR model, given the training set from which the model was derived.
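The two measures named above (similarity to the nearest training molecule, and the number of training-set neighbors above a cutoff) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each molecule is represented as a set of integer feature identifiers and uses Tanimoto similarity, neither of which is specified in the abstract. The function names and the cutoff of 0.7 are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of feature identifiers (assumed metric)."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def reliability_measures(query, training_set, cutoff=0.7):
    """Return (similarity to nearest training molecule, number of neighbors).

    A neighbor is a training molecule more similar to the query than `cutoff`;
    the cutoff is user-chosen, and 0.7 here is only a placeholder.
    """
    sims = [tanimoto(query, fp) for fp in training_set]
    nearest = max(sims, default=0.0)
    neighbors = sum(1 for s in sims if s > cutoff)
    return nearest, neighbors


# Toy fingerprints: sets of made-up feature identifiers.
training = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({7, 8, 9}),
]
query = frozenset({1, 2, 3, 4, 5})

nearest, neighbors = reliability_measures(query, training, cutoff=0.7)
print(nearest, neighbors)
```

Under the trend described above, a query with a high nearest-neighbor similarity and many neighbors would be expected to have a smaller prediction error than one with low values for both.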
Deep neural networks (DNNs) are complex computational models that have found great success in many artificial intelligence applications, such as computer vision and natural language processing. In the past four years, DNNs have also generated promising results for quantitative structure–activity relationship (QSAR) tasks. Previous work showed that DNNs can routinely make better predictions than traditional methods, such as random forests, on a diverse collection of QSAR data sets. It was also found that multitask DNN models (those trained on and predicting multiple QSAR properties simultaneously) outperform DNNs trained separately on the individual data sets in many, but not all, tasks. To date there has been no satisfactory explanation of why the QSAR of one task embedded in a multitask DNN can borrow information from other unrelated QSAR tasks. Thus, using multitask DNNs in a way that consistently provides a predictive advantage becomes a challenge. In this work, we explored why multitask DNNs make a difference in predictive performance. Our results show that during prediction a multitask DNN does borrow “signal” from molecules with similar structures in the training sets of the other tasks. However, whether this borrowing leads to better or worse predictive performance depends on whether the activities are correlated. On the basis of this, we have developed a strategy to use multitask DNNs that incorporate prior domain knowledge to select training sets with correlated activities, and we demonstrate its effectiveness on several examples.
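One way to picture the idea of pairing a task only with training sets whose activities are correlated is a simple filter over candidate auxiliary tasks. The sketch below is an illustration only: the abstract describes selection based on prior domain knowledge, whereas this example substitutes a Pearson correlation computed on molecules shared between tasks. The function names, the thresholds, and the choice to accept negative as well as positive correlation are all assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_auxiliary_tasks(primary, candidates, min_abs_r=0.5, min_shared=3):
    """Keep candidate tasks whose activities correlate with the primary task.

    `primary` and each candidate map molecule IDs to measured activities.
    Correlation is computed only over molecules measured in both tasks.
    """
    selected = []
    for name, activities in candidates.items():
        shared = sorted(set(primary) & set(activities))
        if len(shared) < min_shared:
            continue  # too little overlap to judge correlation
        r = pearson([primary[m] for m in shared],
                    [activities[m] for m in shared])
        if abs(r) >= min_abs_r:
            selected.append(name)
    return selected


# Made-up activities keyed by hypothetical molecule IDs.
primary = {"m1": 5.0, "m2": 6.0, "m3": 7.0, "m4": 8.0}
candidates = {
    "task_b": {"m1": 5.1, "m2": 6.2, "m3": 6.9, "m4": 8.3},  # tracks primary
    "task_c": {"m1": 6.0, "m2": 4.0, "m3": 4.0, "m4": 6.0},  # uncorrelated
    "task_d": {"m1": 5.0, "m9": 6.0},                        # too few shared
}

selected = select_auxiliary_tasks(primary, candidates)
print(selected)
```

Tasks that pass the filter would then be trained together with the primary task in a single multitask model; tasks that fail it would be left out rather than risk degrading the prediction.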
TRAIL (also called Apo2L) belongs to the tumor necrosis factor family, activates rapid apoptosis in tumor cells, and binds to the death-signaling receptor DR4. Two additional TRAIL receptors were identified. The receptor designated death receptor 5 (DR5) contained a cytoplasmic death domain and induced apoptosis much like DR4. The receptor designated decoy receptor 1 (DcR1) displayed properties of a glycophospholipid-anchored cell surface protein. DcR1 acted as a decoy receptor that inhibited TRAIL signaling. Thus, a cell surface mechanism exists for the regulation of cellular responsiveness to pro-apoptotic stimuli.
A DNA vaccine against infectious haematopoietic necrosis virus (IHNV) is effective at protecting rainbow trout, Oncorhynchus mykiss, against disease, but intramuscular injection is required, which makes the vaccine impractical for use in the freshwater rainbow trout farming industry. Poly (D,L‐lactic‐co‐glycolic acid) (PLGA) is a U.S. Food and Drug Administration (FDA) approved polymer that can be used to deliver DNA vaccines. We evaluated the in vivo absorption of PLGA nanoparticles containing coumarin‐6 when added to a fish food pellet. We demonstrated that rainbow trout will eat PLGA nanoparticle-coated feed and that these nanoparticles can be detected in the epithelial cells of the lower intestine within 96 h after feeding. We also detected low levels of gene expression and anti‐IHNV neutralizing antibodies when fish were fed or intubated with PLGA nanoparticles containing IHNV G gene plasmid. A virus challenge evaluation suggested a slight increase in survival at 6 weeks post‐vaccination in fish that received a high dose of the oral vaccine, but there was no difference when additional fish were challenged at 10 weeks post‐vaccination. The results of this study suggest that it is possible to induce an immune response using an orally delivered DNA vaccine, but the current system needs improvement.
Multitask deep learning has emerged as a powerful tool for computational drug discovery. However, despite a number of preliminary studies, multitask deep networks have yet to be widely deployed in the pharmaceutical and biotech industries. This lack of acceptance stems from both software difficulties and a lack of understanding of the robustness of multitask deep networks. Our work aims to resolve both of these barriers to adoption. We introduce a high-quality open-source implementation of multitask deep networks as part of the DeepChem open-source platform. Our implementation enables simple Python scripts to construct, fit, and evaluate sophisticated deep models. We use our implementation to analyze the performance of multitask deep networks and related deep models on four collections of pharmaceutical data (three of which have not previously been analyzed in the literature). We split these data sets into train/valid/test using time and neighbor splits to test multitask deep learning performance under challenging conditions. Our results demonstrate that multitask deep networks are surprisingly robust and can offer strong improvement over random forests. Our analysis and open-source implementation in DeepChem provide an argument that multitask deep networks are ready for widespread use in commercial drug discovery.
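The time split mentioned above can be illustrated with a short, library-free sketch: compounds are ordered by date, the earliest go to training, and the most recent are held out for testing, so the model is always evaluated on molecules made after those it learned from. This is not the DeepChem API. The function name, the record layout, and the 80/10/10 fractions are assumptions chosen for illustration; the abstract does not state the proportions used.

```python
from datetime import date


def time_split(records, frac_train=0.8, frac_valid=0.1):
    """Split records into train/valid/test by ascending date.

    Each record is a dict with a "date" key (assumed layout). The oldest
    records form the training set and the newest form the test set.
    """
    ordered = sorted(records, key=lambda r: r["date"])
    n_train = int(len(ordered) * frac_train)
    n_valid = int(len(ordered) * frac_valid)
    train = ordered[:n_train]
    valid = ordered[n_train:n_train + n_valid]
    test = ordered[n_train + n_valid:]
    return train, valid, test


# Ten made-up compounds registered on consecutive months, listed out of order.
records = [
    {"id": f"cmpd{m:02d}", "date": date(2015, m, 1)}
    for m in (7, 2, 10, 1, 5, 9, 3, 8, 4, 6)
]

train, valid, test = time_split(records)
print([r["id"] for r in test])
```

A neighbor split follows the same spirit but holds out molecules by structural dissimilarity rather than by date; both are intended to be harder than a random split because the test compounds differ systematically from the training compounds.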
A critical link missing from our understanding of the nursery role of specific marine habitats is the evidence of connectivity between juvenile and adult habitats. This paper reviews and evaluates evidence of, and spatial scales for, movements from juvenile to adult habitats, and it summarises the methods used to study movements. Examples include many fish families but few invertebrate taxa, and most are species of economic importance for the USA and Australia. The types of juvenile habitat range from the entire estuary or shallow open coastal waters to specific habitats within estuaries or coastal waters; in some cases juvenile habitats include habitats not traditionally regarded as nursery areas (e.g. the surf zone). The duration of time spent in juvenile habitats averages 13 mo (range 8 d to 5 yr). The majority of organisms move distances of kilometres to hundreds of kilometres from juvenile to adult habitats, although the scale of movements ranges from metres to thousands of kilometres. Changes in abundance among separate habitats and the progression of size classes among separate habitats are the main methods used to infer movement and habitat connectivity. Spatial partitioning of stages of maturity, natural parasites, and a variety of artificial tagging methods have also been used. The latter will become more useful with continued developments in the miniaturisation of artificial tags. More recent studies have used natural tags (e.g. trace elements and stable isotopes) and these methods show great promise for determining movements from juvenile to adult habitats. Few studies provide good evidence for movement from specific juvenile habitats to adult habitats. Future studies need to focus on this movement to supplement data on density, growth and survival of organisms in putative nursery habitats. Such information will allow management and conservation efforts to focus on those habitats that make the greatest contribution to adult populations.