t-distributed stochastic neighbor embedding (t-SNE) is widely used for visualizing single-cell RNA-sequencing (scRNA-seq) data, but it scales poorly to large datasets. We dramatically accelerate t-SNE, obviating the need for data downsampling, and hence allowing visualization of rare cell populations. Furthermore, we implement a heatmap-style visualization for scRNA-seq based on one-dimensional t-SNE for simultaneously visualizing the expression patterns of thousands of genes. Software is available at https://github.com/KlugerLab/FIt-SNE and https://github.com/KlugerLab/t-SNE-Heatmaps.
A key challenge in analyzing single-cell RNA-sequencing data is the large number of false zeros, where genes actually expressed in a given cell are incorrectly measured as unexpressed. We present a method based on low-rank matrix approximation which imputes these values while preserving biologically non-expressed genes (true biological zeros) at zero expression levels. We provide theoretical justification for this denoising approach and demonstrate its advantages relative to other methods on simulated and biological datasets.
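As a rough illustration of the idea of low-rank imputation that keeps true zeros at zero, the sketch below takes a rank-k truncated SVD of a cells-by-genes matrix and then zeroes out, per gene, every reconstructed value that does not exceed the magnitude of that gene's most negative reconstructed value. The function name `low_rank_impute`, the matrix orientation, and the thresholding rule are assumptions made here for illustration; the abstract does not specify them, and this is not presented as the authors' implementation.

```python
import numpy as np

def low_rank_impute(counts, k):
    """Hypothetical sketch: rank-k approximation plus a per-gene threshold.

    counts: (cells, genes) non-negative expression matrix (assumed layout).
    k: rank of the approximation (chosen by the caller in this sketch).
    """
    x = np.asarray(counts, dtype=float)
    # Rank-k truncated SVD reconstruction.
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    approx = (u[:, :k] * s[:k]) @ vt[:k]
    # Assumed rule: treat the most negative value of each gene (column) as the
    # noise scale around zero, and set everything within that scale to zero.
    thresh = np.abs(np.minimum(approx.min(axis=0), 0.0))
    return np.where(approx > thresh, approx, 0.0)
```

In this sketch, entries that were zero because of dropout in an otherwise expressed gene are filled in by the low-rank reconstruction, while genes whose reconstructed values stay within the noise scale around zero are left at zero.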
Medical practitioners use survival models to explore and understand the relationships between patients' covariates (e.g. clinical and genetic features) and the effectiveness of various treatment options. Standard survival models like the linear Cox proportional hazards model require extensive feature engineering or prior medical knowledge to model treatment interaction at an individual level. While nonlinear survival methods, such as neural networks and survival forests, can inherently model these high-level interaction terms, they have yet to be shown as effective treatment recommender systems.
We introduce DeepSurv, a Cox proportional hazards deep neural network and state-of-the-art survival method for modeling interactions between a patient's covariates and treatment effectiveness in order to provide personalized treatment recommendations.
We perform a number of experiments training DeepSurv on simulated and real survival data. We demonstrate that DeepSurv performs as well as or better than other state-of-the-art survival models and validate that DeepSurv successfully models increasingly complex relationships between a patient's covariates and their risk of failure. We then show how DeepSurv models the relationship between a patient's features and the effectiveness of different treatment options, and how it can therefore be used to provide individual treatment recommendations. Finally, we train DeepSurv on real clinical studies to demonstrate how its personalized treatment recommendations would increase the survival time of a set of patients.
The predictive and modeling capabilities of DeepSurv will enable medical researchers to use deep neural networks as a tool in their exploration, understanding, and prediction of the effects of a patient's characteristics on their risk of failure.
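The abstract does not spell out a training objective. Cox proportional hazards models are conventionally fitted by minimizing the negative log partial likelihood of the predicted log-risk scores, and the sketch below computes that quantity with NumPy. The function name and argument layout are hypothetical, the scores would come from the network's output in a deep model, and ties are handled by including every subject with an equal or later time in the risk set.

```python
import numpy as np

def neg_log_partial_likelihood(risk, time, event):
    """Cox negative log partial likelihood (illustrative sketch).

    risk:  predicted log-risk score per subject (e.g. a network output).
    time:  observed follow-up time per subject.
    event: 1/True if the failure was observed, 0/False if censored.
    """
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    # at_risk[i, j] is True when subject j is still at risk at subject i's time.
    at_risk = time[None, :] >= time[:, None]
    # Log-sum-exp over each risk set, shifted by the maximum for stability.
    shift = risk.max()
    log_denom = np.log((np.exp(risk - shift)[None, :] * at_risk).sum(axis=1)) + shift
    # Only subjects with an observed event contribute terms.
    return -np.sum((risk - log_denom)[event])
```

Scores that rank earlier failures as higher risk give a lower loss than scores that rank them in the opposite order, which is the behavior a gradient-based optimizer would exploit.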
In a broad range of classification and decision-making problems, one is given the advice or predictions of several classifiers, of unknown reliability, over multiple questions or queries. This scenario is different from the standard supervised setting, where each classifier’s accuracy can be assessed using available labeled data, and raises two questions: Given only the predictions of several classifiers over a large set of unlabeled test data, is it possible to (i) reliably rank them and (ii) construct a metaclassifier more accurate than most classifiers in the ensemble? Here we present a spectral approach to address these questions. First, assuming conditional independence between classifiers, we show that the off-diagonal entries of their covariance matrix correspond to a rank-one matrix. Moreover, the classifiers can be ranked using the leading eigenvector of this covariance matrix, because its entries are proportional to their balanced accuracies. Second, via a linear approximation to the maximum likelihood estimator, we derive the Spectral Meta-Learner (SML), an unsupervised ensemble classifier whose weights are equal to these eigenvector entries. On both simulated and real data, SML typically achieves a higher accuracy than most classifiers in the ensemble and can provide a better starting point than majority voting for estimating the maximum likelihood solution. Furthermore, SML is robust to the presence of small malicious groups of classifiers designed to veer the ensemble prediction away from the (unknown) ground truth.
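The spectral ranking and weighting described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: predictions are coded as -1/+1, the leading eigenvector is taken from the plain sample covariance matrix (no attempt is made to estimate the diagonal of the underlying rank-one matrix), and the eigenvector's sign is fixed by assuming most classifiers are better than random. The names `spectral_meta_learner` and `simulate_votes` are hypothetical.

```python
import numpy as np

def spectral_meta_learner(preds):
    """preds: (m classifiers, n samples) array with entries in {-1, +1}."""
    preds = np.asarray(preds, dtype=float)
    cov = np.cov(preds)                      # rows are treated as variables
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    weights = eigvecs[:, -1]                 # leading eigenvector
    if weights.sum() < 0:                    # resolve the sign ambiguity
        weights = -weights
    # Weighted vote: the sign of the eigenvector-weighted sum of predictions.
    labels = np.where(weights @ preds >= 0, 1, -1)
    return weights, labels

def simulate_votes(n=4000, accuracies=(0.9, 0.8, 0.7, 0.6, 0.55), seed=0):
    """Conditionally independent classifiers with the given accuracies."""
    rng = np.random.default_rng(seed)
    y = rng.choice([-1, 1], size=n)
    acc = np.asarray(accuracies)
    correct = rng.random((len(acc), n)) < acc[:, None]
    return y, np.where(correct, y, -y)

y, preds = simulate_votes()
weights, labels = spectral_meta_learner(preds)
print("weights:", np.round(weights, 3))
print("ensemble accuracy:", np.mean(labels == y))
```

No labels are used when computing the weights; the true labels `y` appear only in the final line, to check the ensemble against the simulation's ground truth.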
Severe COVID-19 is characterized by persistent lung inflammation, inflammatory cytokine production, viral RNA and a sustained interferon (IFN) response, all of which are recapitulated and required for pathology in the SARS-CoV-2-infected MISTRG6-hACE2 humanized mouse model of COVID-19, which has a human immune system. Blocking either viral replication with remdesivir or the downstream IFN-stimulated cascade with anti-IFNAR2 antibodies in vivo in the chronic stages of disease attenuates the overactive immune inflammatory response, especially inflammatory macrophages. Here we show that SARS-CoV-2 infection and replication in lung-resident human macrophages is a critical driver of disease. In response to infection mediated by CD16 and ACE2 receptors, human macrophages activate inflammasomes, release interleukin 1 (IL-1) and IL-18, and undergo pyroptosis, thereby contributing to the hyperinflammatory state of the lungs. Inflammasome activation and the accompanying inflammatory response are necessary for lung inflammation, as inhibition of the NLRP3 inflammasome pathway reverses chronic lung pathology. Notably, this blockade of inflammasome activation leads to the release of infectious virus by the infected macrophages. Thus, inflammasomes oppose host infection by SARS-CoV-2 through the production of inflammatory cytokines and suicide by pyroptosis to prevent a productive viral cycle.
Cancers harbor many somatic mutations and germline variants. We hypothesized that the combined effect of germline variants that alter the structure, expression, or function of protein-coding regions of cancer-biology related genes (gHFI) determines which and how many somatic mutations (sM) must occur for malignant transformation. We show that gHFI and sM affect overlapping genes and that the average number of gHFI in cancer hallmark genes is higher in patients who develop cancer at a younger age (r = -0.77, P = 0.0051), while the average number of sM increases with increasing age group (r = 0.92, P = 0.000073). A strong negative correlation exists between average gHFI and average sM burden across increasing age groups (r = -0.70, P = 0.017). In early-onset cancers, the larger gHFI burden in cancer genes suggests a greater contribution of germline alterations to the transformation process, while late-onset cancers are more driven by somatic mutations.
Revealing the clonal composition of a single tumor is essential for identifying cell subpopulations with metastatic potential in primary tumors or with resistance to therapies in metastatic tumors. Sequencing technologies provide only an overview of the aggregate of numerous cells. Computational approaches to de-mix a collective signal composed of the aberrations of a mixed cell population of a tumor sample into its individual components are not available. We propose an evolutionary framework for deconvolving data from a single genome-wide experiment to infer the composition, abundance and evolutionary paths of the underlying cell subpopulations of a tumor. We have developed an algorithm (TrAp) for solving this mixture problem. In silico analyses show that TrAp correctly deconvolves mixed subpopulations when the number of subpopulations and the measurement errors are moderate. We demonstrate the applicability of the method using tumor karyotypes and somatic hypermutation data sets. We applied TrAp to an Exome-Seq experiment on a renal cell carcinoma tumor sample and compared the mutational profile of the inferred subpopulations to the mutational profiles of single cells of the same tumor. Finally, we deconvolve sequencing data from eight acute myeloid leukemia patients and three distinct metastases of one melanoma patient to exhibit the evolutionary relationships of their subpopulations.
The heterogeneous nature of mammalian PRC1 complexes has hindered our understanding of their biological functions. Here, we present a comprehensive proteomic and genomic analysis that uncovered six major groups of PRC1 complexes, each containing a distinct PCGF subunit, a RING1A/B ubiquitin ligase, and a unique set of associated polypeptides. These PRC1 complexes differ in their genomic localization, and only a small subset colocalize with H3K27me3. Further biochemical dissection revealed that the six PCGF-RING1A/B combinations form multiple complexes through association with RYBP or its homolog YAF2, which prevents the incorporation of other canonical PRC1 subunits, such as CBX, PHC, and SCM. Although both RYBP/YAF2- and CBX/PHC/SCM-containing complexes compact chromatin, only RYBP stimulates the activity of RING1B toward H2AK119ub1, suggesting a central role in PRC1 function. Knockdown of RYBP in embryonic stem cells compromised their ability to form embryoid bodies, likely because of defects in cell proliferation and maintenance of H2AK119ub1 levels.
► All PRC1 complexes are divided into six groups characterized by six PCGF subunits ► These six groups of PRC1 complexes target different genes through distinct mechanisms ► RYBP/YAF2 form mutually exclusive PRC1 complexes with CBX/PHC/SCM proteins ► RYBP stimulates PRC1-mediated H2AK119ub1 and is essential for ESC differentiation
Sources of variability in experimentally derived data include measurement error in addition to the physical phenomena of interest. This measurement error is a combination of systematic components, originating from the measuring instrument, and random measurement errors. Several novel biological technologies, such as mass cytometry and single-cell RNA-seq (scRNA-seq), are plagued with systematic errors that may severely affect statistical analysis if the data are not properly calibrated.
We propose a novel deep learning approach for removing systematic batch effects. Our method is based on a residual neural network, trained to minimize the Maximum Mean Discrepancy between the multivariate distributions of two replicates, measured in different batches. We apply our method to mass cytometry and scRNA-seq datasets, and demonstrate that it effectively attenuates batch effects.
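To make the training criterion concrete, the sketch below estimates the squared Maximum Mean Discrepancy between two samples with a Gaussian kernel. It covers only the discrepancy measure, not the residual network. The single fixed bandwidth `sigma`, the biased (V-statistic) estimator, and the function names are assumptions made for illustration rather than details taken from the abstract.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Pairwise Gaussian kernel between rows of a and rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))

def mmd2(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return (
        gaussian_kernel(x, x, sigma).mean()
        + gaussian_kernel(y, y, sigma).mean()
        - 2.0 * gaussian_kernel(x, y, sigma).mean()
    )

rng = np.random.default_rng(0)
batch_a = rng.normal(size=(200, 2))
batch_b = rng.normal(size=(200, 2)) + 1.0   # same distribution, shifted
print("MMD^2 before calibration:", mmd2(batch_a, batch_b))
print("MMD^2 after removing the shift:", mmd2(batch_a, batch_b - 1.0))
```

In a batch-correction setting, a map applied to one batch would be trained so that this quantity, computed between the mapped batch and the reference batch, becomes small.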
Our code and data are publicly available at https://github.com/ushaham/BatchEffectRemoval.git.
yuval.kluger@yale.edu.
Supplementary data are available at Bioinformatics online.
With recent approval of inhibitors of PD-1 in melanoma, non-small cell lung cancer (NSCLC) and renal cell carcinoma, extensive efforts are under way to develop biomarkers predictive of response. PD-L1 expression has been most widely studied, and is more predictive in NSCLC than renal cell carcinoma or melanoma. We therefore studied differences in expression patterns across tumor types.
We used tissue microarrays with tumors from NSCLC, renal cell carcinoma, or melanoma and a panel of cell lines to study differences between tumor types. Predictive studies were conducted on samples from 65 melanoma patients treated with PD-1 inhibitors alone or with CTLA-4 inhibitors, characterized for outcome. PD-L1 expression was studied by quantitative immunofluorescence using two well-validated antibodies.
PD-L1 expression was higher in NSCLC specimens than in renal cell carcinoma, and lowest in melanoma (P = 0.001), and this finding was confirmed in a panel of cell lines. In melanoma tumors, PD-L1 was expressed either on tumor cells or immune-infiltrating cells. The association between PD-L1 expression in immune-infiltrating cells and progression-free or overall survival in melanoma patients treated with ipilimumab and nivolumab was stronger than that for PD-L1 expression in tumor cells, and remained significant on multivariable analysis.
PD-L1 expression in melanoma tumor cells is lower than in NSCLC or renal cell carcinoma cells. The higher response rate in melanoma patients treated with PD-1 inhibitors is likely related to PD-L1 in tumor-associated inflammatory cells. Further studies are warranted to validate the predictive role of inflammatory cell PD-L1 expression in melanoma and determine its biological significance.