Abstract
Motivation
Cancer is characterized by intra-tumor heterogeneity, the presence of distinct cell populations with distinct complements of somatic mutations, which include single-nucleotide variants (SNVs) and copy-number aberrations (CNAs). Single-cell sequencing technology enables one to study these cell populations at single-cell resolution. Phylogeny estimation algorithms that employ appropriate evolutionary models are key to understanding the evolutionary mechanisms behind intra-tumor heterogeneity.
Results
We introduce Single-cell Phylogeny Reconstruction (SPhyR), a method for tumor phylogeny estimation from single-cell sequencing data. In light of frequent loss of SNVs due to CNAs in cancer, SPhyR employs the k-Dollo evolutionary model, where a mutation can only be gained once but lost k times. Underlying SPhyR is a novel combinatorial characterization of solutions as constrained integer matrix completions, based on a connection to the cladistic multi-state perfect phylogeny problem. SPhyR outperforms existing methods on simulated data and on a metastatic colorectal cancer.
Availability and implementation
SPhyR is available on https://github.com/elkebir-group/SPhyR.
Supplementary information
Supplementary data are available at Bioinformatics online.
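The k-Dollo model described in the SPhyR abstract above (a mutation gained at most once and lost at most k times) can be illustrated with a small check on a labeled tree. This is a minimal sketch, not SPhyR's algorithm: the function name `is_k_dollo`, the dictionary-based tree encoding, and the assumption that the root is a mutation-free normal cell are all introduced here for illustration.

```python
def is_k_dollo(parent, state, k):
    """Check one mutation's presence/absence labeling against the k-Dollo model.

    parent: dict mapping each non-root node to its parent node
    state:  dict mapping every node to 0 (mutation absent) or 1 (present)
    Assumes (for illustration) that the root has state 0.
    """
    # A gain is an edge where the mutation appears; a loss is an edge where it disappears.
    gains = sum(1 for v, p in parent.items() if state[p] == 0 and state[v] == 1)
    losses = sum(1 for v, p in parent.items() if state[p] == 1 and state[v] == 0)
    return gains <= 1 and losses <= k


# Hypothetical five-node tree: root -> a, a -> b, a -> c, b -> d
parent = {"a": "root", "b": "a", "c": "a", "d": "b"}
state = {"root": 0, "a": 1, "b": 0, "c": 1, "d": 0}

one_loss_allowed = is_k_dollo(parent, state, k=1)
no_loss_allowed = is_k_dollo(parent, state, k=0)
```

In this toy tree the mutation is gained on the edge into `a` and lost on the edge into `b`, so the labeling is consistent with k = 1 but not with k = 0.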
Metastasis is the migration of cancerous cells from a primary tumor to other anatomical sites. Although metastasis was long thought to result from monoclonal seeding, or single cellular migrations, recent phylogenetic analyses of metastatic cancers have reported complex patterns of cellular migrations between sites, including polyclonal migrations and reseeding. However, accurate determination of migration patterns from somatic mutation data is complicated by intratumor heterogeneity and discordance between clonal lineage and cellular migration. We introduce MACHINA, a multi-objective optimization algorithm that jointly infers clonal lineages and parsimonious migration histories of metastatic cancers from DNA sequencing data. MACHINA analysis of data from multiple cancers shows that migration patterns are often not uniquely determined from sequencing data alone and that complicated migration patterns among primary tumors and metastases may be less prevalent than previously reported. MACHINA's rigorous analysis of migration histories will aid in studies of the drivers of metastasis.
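To make the idea of a parsimonious migration history concrete, the sketch below scores a clone tree whose nodes are labeled with anatomical sites. It is an illustration under assumptions, not MACHINA's implementation: the two criteria shown (number of migrations and number of seeding sites) are simple parsimony measures chosen here, and the function names and the example tree are hypothetical.

```python
def migration_count(parent, site):
    """Number of tree edges whose endpoints are labeled with different sites."""
    return sum(1 for v, p in parent.items() if site[v] != site[p])


def seeding_sites(parent, site):
    """Set of sites that act as the source of at least one migration."""
    return {site[p] for v, p in parent.items() if site[v] != site[p]}


# Hypothetical clone tree: clone -> parent clone, and clone -> anatomical site
parent = {"c1": "c0", "c2": "c0", "c3": "c1", "c4": "c3"}
site = {"c0": "primary", "c1": "primary", "c2": "liver", "c3": "lung", "c4": "lung"}

n_migrations = migration_count(parent, site)
sources = seeding_sites(parent, site)
```

Here both migrations leave the primary tumor, so the history has two migrations and a single seeding site. A different site labeling of the same tree could require more migrations, which is the kind of comparison a parsimony criterion makes.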
Copy-number aberrations (CNAs) are genetic alterations that amplify or delete the number of copies of large genomic segments. Although they are ubiquitous in cancer and, thus, a critical area of current cancer research, CNA identification from DNA sequencing data is challenging because it requires partitioning of the genome into complex segments with the same copy-number states that may not be contiguous. Existing segmentation algorithms address these challenges either by leveraging the local information among neighboring genomic regions, or by globally grouping genomic regions that are affected by similar CNAs across the entire genome. However, both approaches have limitations: overclustering in the case of local segmentation, or the omission of clusters corresponding to focal CNAs in the case of global segmentation. Importantly, inaccurate segmentation will lead to inaccurate identification of CNAs. For this reason, most pan-cancer research studies rely on manual procedures of quality control and anomaly correction. To improve copy-number segmentation, we introduce CNAViz, a web-based tool that enables the user to simultaneously perform local and global segmentation, thus overcoming the limitations of each approach. Using simulated data, we demonstrate that by several metrics, CNAViz allows the user to obtain more accurate segmentation relative to existing local and global segmentation methods. Moreover, we analyze six bulk DNA sequencing samples from three breast cancer patients. By validating with parallel single-cell DNA sequencing data from the same samples, we show that by using CNAViz, our user was able to obtain more accurate segmentation and improved accuracy in downstream copy-number calling.
DNA sequencing of multiple samples from the same tumor provides data to analyze the process of clonal evolution in the population of cells that give rise to a tumor.
We formalize the problem of reconstructing the clonal evolution of a tumor using single-nucleotide mutations as the variant allele frequency (VAF) factorization problem. We derive a combinatorial characterization of the solutions to this problem and show that the problem is NP-complete. We derive an integer linear programming solution to the VAF factorization problem in the case of error-free data and extend this solution to real data with a probabilistic model for errors. The resulting AncesTree algorithm is better able to identify ancestral relationships between individual mutations than existing approaches, particularly in ultra-deep sequencing data when high read counts for mutations yield high confidence VAFs.
An implementation of AncesTree is available at: http://compbio.cs.brown.edu/software.
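A constraint commonly used when inferring ancestral relationships from multi-sample frequencies is the sum condition: in every sample, the frequency of a mutation must be at least the total frequency of its children in the tree. The sketch below checks that condition on error-free data. It is an illustration only; the abstract does not state the exact form of the AncesTree characterization, and the function name, data layout, and example values are assumptions made here.

```python
def satisfies_sum_condition(children, freq, tol=1e-9):
    """Check the sum condition for a mutation tree across all samples.

    children: dict mapping each node to a list of its child nodes
    freq:     dict mapping each node to a list of per-sample frequencies
    """
    for node, kids in children.items():
        for sample, parent_freq in enumerate(freq[node]):
            child_total = sum(freq[child][sample] for child in kids)
            if child_total > parent_freq + tol:
                return False
    return True


# Hypothetical two-sample example: mutation A is ancestral to B and C
children = {"A": ["B", "C"], "B": [], "C": []}
freq_ok = {"A": [0.50, 0.40], "B": [0.30, 0.10], "C": [0.20, 0.25]}
freq_bad = {"A": [0.50, 0.40], "B": [0.30, 0.10], "C": [0.25, 0.25]}

ok = satisfies_sum_condition(children, freq_ok)
bad = satisfies_sum_condition(children, freq_bad)
```

In `freq_bad`, the children of A sum to 0.55 in the first sample, which exceeds A's frequency of 0.50, so this tree cannot explain the data without errors.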
Intra-tumor heterogeneity renders the identification of somatic single-nucleotide variants (SNVs) a challenging problem. In particular, low-frequency SNVs are hard to distinguish from sequencing artifacts. While the increasing availability of multi-sample tumor DNA sequencing data holds the potential for more accurate variant calling, there is a lack of high-sensitivity multi-sample SNV callers that utilize these data. Here we report Moss, a method to identify low-frequency SNVs that recur in multiple sequencing samples from the same tumor. Moss provides any existing single-sample SNV caller the ability to support multiple samples with little additional time overhead. We demonstrate that Moss improves recall while maintaining high precision in a simulated dataset. On multi-sample hepatocellular carcinoma, acute myeloid leukemia and colorectal cancer datasets, Moss identifies new low-frequency variants that meet manual review criteria and are consistent with the tumor's mutational signature profile. In addition, Moss detects the presence of variants in more samples of the same tumor than reported by the single-sample caller. Moss' improved sensitivity in SNV calling will enable more detailed downstream analyses in cancer genomics.
Genes in SARS-CoV-2 and other viruses in the order Nidovirales are expressed by a process of discontinuous transcription which is distinct from alternative splicing in eukaryotes and is mediated by the viral RNA-dependent RNA polymerase. Here, we introduce the DISCONTINUOUS TRANSCRIPT ASSEMBLY problem of finding transcripts and their abundances given an alignment of paired-end short reads under a maximum likelihood model that accounts for varying transcript lengths. We show, using simulations, that our method, JUMPER, outperforms existing methods for classical transcript assembly. On short-read data of SARS-CoV-1, SARS-CoV-2 and MERS-CoV samples, we find that JUMPER not only identifies canonical transcripts that are part of the reference transcriptome, but also predicts expression of non-canonical transcripts that are supported by subsequent orthogonal analyses. Moreover, application of JUMPER on samples with and without treatment reveals viral drug response at the transcript level. As such, JUMPER enables detailed analyses of Nidovirales transcriptomes under varying conditions.
Emerging ultra-low coverage single-cell DNA sequencing (scDNA-seq) technologies have enabled high resolution evolutionary studies of copy number aberrations (CNAs) within tumors. While these sequencing technologies are well suited for identifying CNAs due to the uniformity of sequencing coverage, the sparsity of coverage poses challenges for the study of single-nucleotide variants (SNVs). In order to maximize the utility of increasingly available ultra-low coverage scDNA-seq data and obtain a comprehensive understanding of tumor evolution, it is important to also analyze the evolution of SNVs from the same set of tumor cells. We present Phertilizer, a method to infer a clonal tree from ultra-low coverage scDNA-seq data of a tumor. Based on a probabilistic model, our method recursively partitions the data by identifying key evolutionary events in the history of the tumor. We demonstrate the performance of Phertilizer on simulated data as well as on two real datasets, finding that Phertilizer effectively utilizes the copy-number signal inherent in the data to more accurately uncover clonal structure and genotypes compared to previous methods.
The occurrence of inorganic pollutants in surface waters is identified in the system assessment quality. The most harmful pollutants, including pesticides, persistent organic pollutants, pharmaceuticals, personal care products, and heavy metals, remain dangerous to the environment because of their widespread use. Chromate has the largest concentration compared to the other metals in industrial wastewater. This work evaluates the application of the spinel p-CoAl₂O₄ as a photocatalyst, prepared by the nitrate synthesis process, to reduce Cr(VI), a metal hazardous to the environment. The photocatalyst was characterized using thermal analysis (TG), X-ray diffraction, UV-diffuse reflectance spectroscopy, scanning electron microscopy, X-ray fluorescence, Fourier transform infrared spectroscopy, electrical conductivity, and photoelectrochemical measurements. The results showed that the photoreduction of Cr(VI) to Cr(III) is more effective at pH = 3.6 (77%) than at higher pH values up to 8 (7%). Moreover, the effect of the hetero-system CoAl₂O₄/ZnO on photocatalytic efficiency was investigated. The photocatalytic activity increases up to 99% with a total catalyst dosage of 1 g L⁻¹ over the hetero-system CoAl₂O₄/ZnO at a ratio of 75%/25%. This performance is better than that of CoAl₂O₄ or ZnO alone. The improvement in Cr(VI) photoreduction activity is attributed to better separation and photogeneration of electron-hole pairs on the CoAl₂O₄/ZnO surfaces. Finally, the Lagergren pseudo-first-order and the Langmuir–Hinshelwood models fit the experimental kinetics well.
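As an illustration of how a pseudo-first-order rate constant can be estimated from concentration measurements, the sketch below fits the integrated first-order form ln(C0/C) = k·t by least squares through the origin. This is the form to which the Langmuir–Hinshelwood rate law r = kKC/(1 + KC) reduces when KC is much smaller than 1. Whether the study fitted its data in exactly this way is an assumption, and the data below are synthetic values generated from the model, not measurements from the study.

```python
import math


def fit_pseudo_first_order(times, concentrations):
    """Estimate k in ln(C0/C) = k * t by least squares through the origin.

    Assumes times[0] == 0, so that concentrations[0] is the initial value C0.
    """
    c0 = concentrations[0]
    ys = [math.log(c0 / c) for c in concentrations]
    numerator = sum(t * y for t, y in zip(times, ys))
    denominator = sum(t * t for t in times)
    return numerator / denominator


# Synthetic, noise-free data generated from C(t) = C0 * exp(-k t)
# with C0 = 30 and k = 0.05 per minute (illustrative values only).
times = [0, 10, 20, 30, 60]
conc = [30.0 * math.exp(-0.05 * t) for t in times]

k_app = fit_pseudo_first_order(times, conc)
```

With real, noisy measurements the fitted constant would differ from the true value, and the quality of the linear fit of ln(C0/C) against t indicates how well the first-order approximation holds.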
Parsimonious Clone Tree Integration in cancer
Sashittal, Palash; Zaccaria, Simone; El-Kebir, Mohammed
Algorithms for molecular biology, 03/2022, Volume 17, Issue 1
Journal Article, Peer reviewed, Open access
Every tumor is composed of heterogeneous clones, each corresponding to a distinct subpopulation of cells that accumulated different types of somatic mutations, ranging from single-nucleotide variants (SNVs) to copy-number aberrations (CNAs). As the analysis of this intra-tumor heterogeneity has important clinical applications, several computational methods have been introduced to identify clones from DNA sequencing data. However, due to technological and methodological limitations, current analyses are restricted to identifying tumor clones only based on either SNVs or CNAs, preventing a comprehensive characterization of a tumor's clonal composition.
To overcome these challenges, we formulate the identification of clones in terms of both SNVs and CNAs as an integration problem while accounting for uncertainty in the input SNV and CNA proportions. We thus characterize the computational complexity of this problem and we introduce PACTION (PArsimonious Clone Tree integratION), an algorithm that solves the problem using a mixed integer linear programming formulation. On simulated data, we show that tumor clones can be identified reliably, especially when further taking into account the ancestral relationships that can be inferred from the input SNVs and CNAs. On 49 tumor samples from 10 prostate cancer patients, our integration approach provides a higher resolution view of tumor evolution than previous studies.
PACTION is an accurate and fast method that reconstructs clonal architecture of cancer tumors by integrating SNV and CNA clones inferred using existing methods.
Monitoring stations have been established to combat water pollution, improve the ecosystem, promote human health, and facilitate drinking water production. However, continuous and extensive monitoring of water is costly and time-consuming, resulting in limited datasets and hindering water management research. This study focuses on developing an optimized K-nearest neighbor (KNN) model using the improved grey wolf optimization (I-GWO) algorithm to predict dry residue quantities. The model incorporates 20 physical and chemical parameters derived from a dataset of 400 samples. Cross-validation is employed to assess model performance, optimize parameters, and mitigate the risk of overfitting. Four folds are created, and each fold is optimized using 11 distance metrics and their corresponding weighting functions to determine the best model configuration. Among the evaluated models, the Jaccard distance metric with inverse squared weighting function consistently demonstrates the best performance in terms of statistical errors and coefficients for each fold. By averaging predictions from the models in the four folds, an estimation of the overall model performance is obtained. The resulting model exhibits high efficiency, with remarkably low errors reflected in the values of R, R², adjusted R², RMSE, and EPM, which are reported as 0.9979, 0.9958, 0.9956, 41.2639, and 3.1061, respectively. This study reveals a compelling non-linear correlation between physico-chemical water attributes and the content of dry tailings, indicating the ability to accurately predict dry tailing quantities. By employing the proposed methodology to enhance water quality models, it becomes possible to overcome limitations in water quality management and significantly improve the precision of predictions regarding critical water parameters.
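The combination of KNN regression, inverse squared distance weighting, and four-fold cross-validation described above can be sketched in a few lines. This is a simplified illustration rather than the study's model: it uses Euclidean distance in place of the Jaccard metric reported as best, omits the I-GWO parameter search entirely, and all function names and the toy data are assumptions introduced here.

```python
import math


def knn_predict(train_X, train_y, x, k=3, eps=1e-12):
    """Predict a value as the inverse-squared-distance weighted mean of the k nearest targets."""
    # Euclidean distance is an assumption made for this sketch.
    nearest = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))[:k]
    weights = [1.0 / (d * d + eps) for d, _ in nearest]  # eps avoids division by zero
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def k_fold_indices(n, folds=4):
    """Split indices 0..n-1 into interleaved folds."""
    return [list(range(i, n, folds)) for i in range(folds)]


def cross_val_rmse(X, y, k=3, folds=4):
    """Root-mean-square error over all held-out predictions."""
    n, squared_error = len(X), 0.0
    for test_idx in k_fold_indices(n, folds):
        held_out = set(test_idx)
        tr_X = [X[i] for i in range(n) if i not in held_out]
        tr_y = [y[i] for i in range(n) if i not in held_out]
        for i in test_idx:
            squared_error += (knn_predict(tr_X, tr_y, X[i], k) - y[i]) ** 2
    return math.sqrt(squared_error / n)


# Toy one-feature data (illustrative only)
X = [[0.0], [1.0], [3.0]]
y = [0.0, 10.0, 30.0]
midpoint_prediction = knn_predict(X, y, [2.0], k=2)
```

The query point 2.0 is equidistant from the training points at 1.0 and 3.0, so both neighbors receive the same weight and the prediction is the mean of their targets.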