Abstract Motivation Deep graph learning (DGL) has been widely employed in the realm of ligand-based virtual screening. Within this field, a key hurdle is the existence of activity cliffs (ACs), where minor chemical alterations can lead to significant changes in bioactivity. In response, several DGL models have been developed to enhance ligand bioactivity prediction in the presence of ACs. Yet the potential of ACs for optimizing ligand bioactivity remains largely unexplored. Results We present a novel approach to simultaneously predict and optimize ligand bioactivities through DGL and ACs (OLB-AC). OLB-AC can optimize ligand molecules located near ACs, providing a direct reference for optimizing ligand bioactivities with the matching of original ligands. To accomplish this, a novel attentive graph reconstruction neural network and ligand optimization scheme are proposed. The attentive graph reconstruction neural network reconstructs original ligands and optimizes them through adversarial representations derived from their bioactivity prediction process. Experimental results on nine drug targets reveal that out of the 667 molecules generated through OLB-AC optimization on datasets comprising 974 low-activity, noninhibitor, or highly toxic ligands, 49 are recognized as known highly active, inhibitor, or nontoxic ligands beyond the datasets’ scope. Of the 49 matched molecular pairs generated by OLB-AC, 27 reveal novel transformations not present in their training sets. The adversarial representations employed for ligand optimization originate from the gradients of bioactivity predictions. Therefore, we also assess OLB-AC’s prediction accuracy across 33 different bioactivity datasets. Results show that OLB-AC achieves the best Pearson correlation coefficient (r²) on 27/33 datasets, with an average improvement of 7.2%–22.9% against the state-of-the-art bioactivity prediction methods.
Availability and implementation The code and dataset developed in this work are available at github.com/Yueming-Yin/OLB-AC.
Although the human microbiome plays a key role in health and disease, the biological mechanisms underlying the interaction between the microbiome and its host are incompletely understood. Integration with other molecular profiling data offers an opportunity to characterize the role of the microbiome and elucidate therapeutic targets. However, this remains challenging due to the high dimensionality, compositionality, and rare features found in microbiome profiling data. These challenges necessitate the use of methods that can achieve structured sparsity in learning cross-platform association patterns.
We propose Tree-Aggregated factor RegressiOn (TARO) for the integration of microbiome and metabolomic data. We leverage information on the taxonomic tree structure to flexibly aggregate rare features. We demonstrate through simulation studies that TARO accurately recovers a low-rank coefficient matrix and identifies relevant features. We applied TARO to microbiome and metabolomic profiles gathered from subjects being screened for colorectal cancer to understand how gut microorganisms shape intestinal metabolite abundances.
The R package TARO implementing the proposed methods is available online at https://github.com/amishra-stats/taro-package.
Abstract Motivation The identification of cancer subtypes plays a crucial role in cancer research and treatment. With the rapid development of high-throughput sequencing technologies, there has been an exponential accumulation of cancer multi-omics data. Integrating multi-omics data has emerged as a cost-effective and efficient strategy for cancer subtyping. While current methods primarily rely on genomics data, protein expression data offers a closer representation of phenotype. Therefore, integrating protein expression data holds promise for enhancing subtyping accuracy. However, the scarcity of protein expression data compared to genomics data presents a challenge in its direct incorporation into existing methods. Moreover, striking a balance between omics-specific learning and cross-omics learning remains a prevalent challenge in current multi-omics integration methods. Results We introduce Subtype-MGTP, a novel cancer subtyping framework based on the translation of Multiple Genomics To Proteomics. Subtype-MGTP comprises two modules: a translation module, which leverages available protein data to translate multi-type genomics data into predicted protein expression data, and an improved deep subspace clustering module, which integrates contrastive learning to cluster the predicted protein data, yielding refined subtyping results. Extensive experiments conducted on benchmark datasets demonstrate that Subtype-MGTP outperforms nine state-of-the-art cancer subtyping methods. The interpretability of clustering results is further supported by the clinical and survival analysis. Subtype-MGTP also exhibits strong robustness against varying rates of missing protein data and demonstrates distinct advantages in integrating imbalanced multi-omics data. Availability and implementation The code and results are available at https://github.com/kybinn/Subtype-MGTP.
The completion of the genome has paved the way for genome-wide association studies (GWAS), which explained certain proportions of heritability. GWAS are not optimally suited to detect non-linear effects in disease risk, possibly hidden in non-additive interactions (epistasis). Alternative methods for epistasis detection using, e.g., deep neural networks (DNNs) are currently under active development. However, DNNs are constrained by finite computational resources, which can be rapidly depleted as complexity increases with the sheer size of the genome. Besides, the curse of dimensionality complicates the task of capturing meaningful genetic patterns for DNNs and therefore necessitates dimensionality reduction.
We propose a method to compress single nucleotide polymorphism (SNP) data, while leveraging the linkage disequilibrium (LD) structure and preserving potential epistasis. This method involves clustering correlated SNPs into haplotype blocks and training per-block autoencoders to learn a compressed representation of the block's genetic content. We provide an adjustable autoencoder design to accommodate diverse blocks and bypass extensive hyperparameter tuning. We applied this method to genotyping data from Project MinE, and achieved 99% average test reconstruction accuracy (i.e. minimal information loss) while compressing the input to nearly 10% of the original size. We demonstrate that haplotype-block based autoencoders outperform linear Principal Component Analysis (PCA) by approximately 3% in chromosome-wide accuracy of reconstructed variants. To the best of our knowledge, our approach is the first to simultaneously leverage haplotype structure and DNNs for dimensionality reduction of genetic data.
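To make the linear baseline in this comparison concrete, the following is a minimal sketch of PCA compression and reconstruction of a single block of genotypes. It is not the authors' pipeline: the data are synthetic, the function name `pca_reconstruction_accuracy` is hypothetical, and the accuracy definition (fraction of 0/1/2 genotype calls recovered after rounding) is an assumption made for illustration.

```python
import numpy as np

def pca_reconstruction_accuracy(genotypes, n_components):
    """Compress a (samples x SNPs) genotype matrix with PCA and return the
    fraction of genotype calls recovered after rounding the reconstruction."""
    mean = genotypes.mean(axis=0)
    centered = genotypes - mean
    # Thin SVD: the rows of vt are the principal axes in SNP space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    codes = centered @ components.T            # compressed representation
    reconstructed = codes @ components + mean  # mapped back to SNP space
    calls = np.clip(np.rint(reconstructed), 0, 2)
    return float((calls == genotypes).mean())

# Synthetic block (assumption, not Project MinE data): 200 individuals,
# 50 SNPs, each genotype the sum of two of four made-up founder haplotypes.
rng = np.random.default_rng(0)
founders = rng.integers(0, 2, size=(4, 50))
hap_a = founders[rng.integers(0, 4, size=200)]
hap_b = founders[rng.integers(0, 4, size=200)]
block = (hap_a + hap_b).astype(float)

acc_10pct = pca_reconstruction_accuracy(block, n_components=5)   # ~10% of input size
acc_full = pca_reconstruction_accuracy(block, n_components=50)   # no compression
```

Keeping 5 of 50 dimensions mirrors the roughly 10% compression ratio quoted above; a per-block autoencoder would replace the two matrix products with learned non-linear encoder and decoder networks.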
Data are available for academic use through Project MinE at https://www.projectmine.com/research/data-sharing/, contingent upon terms and requirements specified by the source studies. Code is available at https://github.com/gizem-tas/haploblock-autoencoders.
Supplementary data are available at Bioinformatics online.
There is an urgent global need for accessible and cost-effective pro-mental health infrastructure. Public green spaces were officially designated in the 19th century, informed by a belief that they might provide health benefits. We outline modern research evidence that greenspace can play a pivotal role in population-level mental health.
Digital imaging and processing are shown to open new methods of paper research. Early Baltic printed books are the examples in Revealing Watermarks, which also describes how to enhance security by creating and archiving a digital 'fingerprint'. Thus thefts are deterred and stolen items can be uniquely identified for return.
In 1978, G. Plotkin noticed that $\mathbb{T}^{\omega}$, the cartesian product of ω copies of the three element flat domain of Booleans, is a universal domain, where ‘universal’ means that the retracts of $\mathbb{T}^{\omega}$ for Scott's continuous semantics are exactly all the ωCC-domains, which with Scott continuous functions form a cartesian closed category. As usual, ‘ω’ is for ‘countably based,’ and here ‘CC’ is for ‘conditionally complete,’ which essentially means that any subset which is pairwise bounded has a least upper bound. Since $\mathbb{T}^{\omega}$ is also an ωDI-domain (an important structure in stable domain theory), the following problem arises naturally: is there a cartesian closed category C of domains with stable functions such that $\mathbb{T}^{\omega}$, or a related structure, is universal in C for Berry’s stable semantics? The aim of this paper is to answer this question. We first investigate the properties of stable retracts. We introduce a new class of domains called conditionally complete DI-domains (CCDI-domains for short) and show that (1) $\mathbb{T}^{\omega}$ is an ωCCDI-domain and the category of CCDI-domains (resp. ωCCDI-domains) with stable functions is cartesian closed; (2) $\mathbb{T}^{\omega} \to_{st} \mathbb{T}^{\omega}$ is a stable universal domain in the sense that every ωCCDI-domain is a stable retract of $\mathbb{T}^{\omega} \to_{st} \mathbb{T}^{\omega}$, where $\mathbb{T}^{\omega} \to_{st} \mathbb{T}^{\omega}$ is the stable function space of $\mathbb{T}^{\omega}$; (3) in particular, $\mathbb{T}^{\omega} \to_{st} \mathbb{T}^{\omega}$ is not a stable retract of $\mathbb{T}^{\omega}$ and hence $\mathbb{T}^{\omega}$ is not universal for Berry’s stable semantics. We remark that this paper is a completion and correction of our earlier report in the Proceedings of the 6th International Symposium on Domain Theory and Its Applications (ISDT2013).
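For reference, the two notions that the abstract describes in words can be written out as follows. This is a sketch in conventional notation rather than the paper's own: the symbols $\bot$, $\sqsubseteq$ and $\bigsqcup$, the element names $\mathit{tt}$ and $\mathit{ff}$, and the domain name $D$ are assumptions, and the second formula only spells out the informal reading of ‘conditionally complete’ given above.

```latex
% Flat domain of Booleans: three elements, with bottom below the two truth values.
\mathbb{T} = \{\bot, \mathit{tt}, \mathit{ff}\},
\qquad
x \sqsubseteq y \iff x = \bot \ \text{or}\ x = y

% Conditional completeness of a domain D, as paraphrased in the abstract:
% every pairwise bounded subset has a least upper bound.
\forall X \subseteq D:\quad
\bigl(\forall x, y \in X\ \ \exists z \in D:\ x \sqsubseteq z \ \text{and}\ y \sqsubseteq z\bigr)
\implies \bigsqcup X \ \text{exists in } D
```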
Abstract Background The prevalence of depressive symptoms and cognitive decline increases with age. We investigated their temporal dynamics in individuals aged 85 and older across a 5-year follow-up period. Methods Participants were selected from the Leiden 85-plus study and were eligible if at least three follow-up measurements were available (325 of 599 participants). Depressive symptoms were assessed at baseline and at yearly assessments during a follow-up period of up to 5 years, using the 15-item Geriatric Depression Scale (GDS-15). Cognitive decline was measured through various tests, including the Mini Mental State Exam, Stroop test, Letter Digit Coding test and immediate and delayed recall. A novel method, dynamic time warping analysis, was employed to model their temporal dynamics within individuals, in undirected and directed time-lag analyses, to ascertain whether depressive symptoms precede cognitive decline in group-level aggregated results or vice versa. Results The 325 participants were all 85 years of age at baseline; 68% were female, and 45% received intermediate to higher education. Depressive symptoms and cognitive functioning significantly covaried in time, and directed analyses showed that depressive symptoms preceded most of the constituents of cognitive impairment in the oldest old. Of the GDS-15 symptoms, those with the strongest outstrength, indicating that changes in these symptoms preceded subsequent changes in other symptoms, were worthlessness, hopelessness, low happiness, dropping activities/interests, and low satisfaction with life (all P < 0.01). Conclusion Depressive symptoms preceded cognitive impairment in a population-based sample of the oldest old.
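As an illustration of the core idea behind dynamic time warping, the sketch below computes the textbook DTW distance between two short series. It is not the study's analysis pipeline: the absolute-difference cost, the absence of a time window, the function name `dtw_distance`, and the example scores are all assumptions chosen for brevity.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences,
    with absolute difference as the local cost and no window constraint."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances while b stays (b is stretched)
                cost[i][j - 1],      # b advances while a stays (a is stretched)
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]

# Hypothetical yearly scores for one participant (not study data).
mood = [1, 1, 3, 4, 4, 5]
cognition_deficit = [0, 1, 1, 3, 4, 4]
distance = dtw_distance(mood, cognition_deficit)
```

A small distance means the two trajectories have a similar shape once small shifts in time are allowed. Directed time-lag analyses such as the one described above additionally restrict which series is allowed to lead the other.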
Abstract Background The Osteoarthritis Initiative (OAI) evaluates the development and progression of osteoarthritis. Frailty captures the heterogeneity in aging. Use of this resource-intensive dataset to answer aging-related research questions could be enhanced by a frailty measure. Objective To: (i) develop a deficit accumulation frailty index (FI) for the OAI; (ii) examine its relationship with age and compare between sexes; (iii) validate the FI versus all-cause mortality; and (iv) compare this association with that of a modified frailty phenotype. Design OAI cohort study. Setting North America. Subjects An FI was determined for 4,755/4,796 participants; 4,149/4,796 had both a valid FI and frailty phenotype. Methods Fifty-nine variables were screened for inclusion. Multivariate Cox regression evaluated the impact of FI or phenotype on all-cause mortality at follow-up (up to 146 months), controlling for age and sex. Results Thirty-one items were included. FI scores (0.16 ± 0.09) were higher in older adults and among females (both, P < 0.001). By follow-up, 264 people had died (6.4%). Older age, being male, and greater FI were associated with a higher risk of all-cause mortality (all, P < 0.001). The model including FI was a better fit than the model including the phenotype (AIC: 4,167 vs. 4,178) and was a better predictor of all-cause mortality than the phenotype (area under the receiver operating characteristic curve: 0.652 vs. 0.581). Conclusion We developed an FI using the OAI and validated it in relation to all-cause mortality. The FI may be used to study aging on clinical, functional and structural aspects of osteoarthritis included in the OAI.
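A deficit accumulation index is conventionally the sum of a person's deficit scores divided by the number of items measured. The sketch below shows that arithmetic only; the function name `frailty_index`, the handling of missing items via `None`, and the example profile are assumptions, and the abstract does not list the 31 OAI items or their scoring rules.

```python
def frailty_index(deficits):
    """Deficit accumulation index: sum of deficit scores divided by the number
    of items actually measured. Each score lies in [0, 1]; None marks a
    missing item, which is left out of both numerator and denominator."""
    measured = [d for d in deficits if d is not None]
    if not measured:
        raise ValueError("at least one item must be measured")
    if any(not 0 <= d <= 1 for d in measured):
        raise ValueError("deficit scores must lie between 0 and 1")
    return sum(measured) / len(measured)

# Hypothetical participant: 5 deficits present among 31 fully measured items.
profile = [1] * 5 + [0] * 26
fi = frailty_index(profile)  # 5 / 31
```

Dividing by the number of measured items rather than a fixed 31 lets participants with a few missing items still receive a score. Any rule about how many items may be missing before the index is treated as invalid would be a study-specific choice not stated here.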