Copy-number aberrations (CNAs) and whole-genome duplications (WGDs) are frequent somatic mutations in cancer but their quantification from DNA sequencing of bulk tumor samples is challenging. Standard methods for CNA inference analyze tumor samples individually; however, DNA sequencing of multiple samples from a cancer patient has recently become more common. We introduce HATCHet (Holistic Allele-specific Tumor Copy-number Heterogeneity), an algorithm that infers allele- and clone-specific CNAs and WGDs jointly across multiple tumor samples from the same patient. We show that HATCHet outperforms current state-of-the-art methods on multi-sample DNA sequencing data that we simulate using MASCoTE (Multiple Allele-specific Simulation of Copy-number Tumor Evolution). Applying HATCHet to 84 tumor samples from 14 prostate and pancreas cancer patients, we identify subclonal CNAs and WGDs that are more plausible than previously published analyses and more consistent with somatic single-nucleotide variants (SNVs) and small indels in the same samples.
Parsimonious Clone Tree Integration in cancer
Sashittal, Palash; Zaccaria, Simone; El-Kebir, Mohammed
Algorithms for molecular biology, 03/2022, Volume: 17, Issue: 1
Journal Article, Peer reviewed, Open access
Every tumor is composed of heterogeneous clones, each corresponding to a distinct subpopulation of cells that accumulated different types of somatic mutations, ranging from single-nucleotide variants (SNVs) to copy-number aberrations (CNAs). As the analysis of this intra-tumor heterogeneity has important clinical applications, several computational methods have been introduced to identify clones from DNA sequencing data. However, due to technological and methodological limitations, current analyses are restricted to identifying tumor clones only based on either SNVs or CNAs, preventing a comprehensive characterization of a tumor's clonal composition.
To overcome these challenges, we formulate the identification of clones in terms of both SNVs and CNAs as an integration problem while accounting for uncertainty in the input SNV and CNA proportions. We thus characterize the computational complexity of this problem and we introduce PACTION (PArsimonious Clone Tree integratION), an algorithm that solves the problem using a mixed integer linear programming formulation. On simulated data, we show that tumor clones can be identified reliably, especially when further taking into account the ancestral relationships that can be inferred from the input SNVs and CNAs. On 49 tumor samples from 10 prostate cancer patients, our integration approach provides a higher resolution view of tumor evolution than previous studies.
PACTION is an accurate and fast method that reconstructs clonal architecture of cancer tumors by integrating SNV and CNA clones inferred using existing methods.
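To make the integration problem above more concrete, the following minimal sketch checks how well a candidate joint assignment of cells to (SNV clone, CNA clone) pairs agrees with the input SNV and CNA clone proportions of one sample. It is an illustration only and not PACTION's mixed integer linear program: the function names, the matrix layout, and the use of a sum of absolute differences as the error measure are assumptions made here, and the ancestral (tree) constraints mentioned in the abstract are ignored.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def marginals(joint: Matrix) -> Tuple[List[float], List[float]]:
    """Return (SNV-clone proportions, CNA-clone proportions) implied by `joint`.

    joint[i][j] is the hypothetical proportion of cells that belong to
    SNV clone i and CNA clone j in a single sample.
    """
    snv = [sum(row) for row in joint]
    cna = [sum(col) for col in zip(*joint)]
    return snv, cna


def integration_error(joint: Matrix, snv_props: List[float], cna_props: List[float]) -> float:
    """Sum of absolute differences between implied and input proportions (assumed error measure)."""
    snv, cna = marginals(joint)
    return (sum(abs(x - y) for x, y in zip(snv, snv_props))
            + sum(abs(x - y) for x, y in zip(cna, cna_props)))


def num_integrated_clones(joint: Matrix, tol: float = 1e-9) -> int:
    """Count (SNV clone, CNA clone) pairs with non-negligible proportion."""
    return sum(1 for row in joint for value in row if value > tol)


# Two SNV clones and two CNA clones explained by three integrated clones.
joint = [[0.5, 0.0],
         [0.1, 0.4]]
error = integration_error(joint, snv_props=[0.5, 0.5], cna_props=[0.6, 0.4])
clones = num_integrated_clones(joint)
```

In this toy example the three integrated clones reproduce both sets of input proportions, so the error is zero up to floating-point rounding; a parsimonious solution in this sense would prefer few nonzero entries with a small error.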
Bulk DNA sequencing of multiple samples from the same tumor is becoming common, yet most methods to infer copy-number aberrations (CNAs) from this data analyze individual samples independently. We introduce HATCHet2, an algorithm to identify haplotype- and clone-specific CNAs simultaneously from multiple bulk samples. HATCHet2 extends the earlier HATCHet method by improving identification of focal CNAs and introducing a novel statistic, the minor haplotype B-allele frequency (mhBAF), that enables identification of mirrored-subclonal CNAs. We demonstrate HATCHet2's improved accuracy using simulations and a single-cell sequencing dataset. HATCHet2 analysis of 10 prostate cancer patients reveals previously unreported mirrored-subclonal CNAs affecting cancer genes.
Copy-number aberrations (CNAs) are genetic alterations that amplify or delete the number of copies of large genomic segments. Although they are ubiquitous in cancer and, thus, a critical area of current cancer research, CNA identification from DNA sequencing data is challenging because it requires partitioning of the genome into complex segments with the same copy-number states that may not be contiguous. Existing segmentation algorithms address these challenges either by leveraging the local information among neighboring genomic regions, or by globally grouping genomic regions that are affected by similar CNAs across the entire genome. However, both approaches have limitations: overclustering in the case of local segmentation, or the omission of clusters corresponding to focal CNAs in the case of global segmentation. Importantly, inaccurate segmentation will lead to inaccurate identification of CNAs. For this reason, most pan-cancer research studies rely on manual procedures of quality control and anomaly correction. To improve copy-number segmentation, we introduce CNAViz, a web-based tool that enables the user to simultaneously perform local and global segmentation, thus overcoming the limitations of each approach. Using simulated data, we demonstrate that by several metrics, CNAViz allows the user to obtain more accurate segmentation relative to existing local and global segmentation methods. Moreover, we analyze six bulk DNA sequencing samples from three breast cancer patients. By validating with parallel single-cell DNA sequencing data from the same samples, we show that by using CNAViz, our user was able to obtain more accurate segmentation and improved accuracy in downstream copy-number calling.
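The sketch below illustrates why mirrored-subclonal CNAs are hard to see in a single bulk sample and why several samples help. It assumes that a mirrored-subclonal CNA means two clones altering opposite haplotypes of the same region, and it computes the ordinary haplotype B fraction of a clone mixture; it does not implement the mhBAF statistic, whose definition is not given in the abstract. The function name and the example proportions are hypothetical.

```python
from typing import Sequence, Tuple


def mixture_baf(clones: Sequence[Tuple[int, int]], proportions: Sequence[float]) -> float:
    """Expected fraction of haplotype-B copies in a mixture of clones.

    clones[i] = (copies of haplotype A, copies of haplotype B) in clone i;
    proportions[i] = fraction of cells in the sample that belong to clone i.
    """
    total_a = sum(u * a for (a, _), u in zip(clones, proportions))
    total_b = sum(u * b for (_, b), u in zip(clones, proportions))
    return total_b / (total_a + total_b)


# Two clones with mirrored gains: one gained haplotype A, the other haplotype B.
mirrored = [(2, 1), (1, 2)]

balanced_sample = mixture_baf(mirrored, [0.5, 0.5])  # the two gains cancel out
skewed_sample = mixture_baf(mirrored, [0.8, 0.2])    # the imbalance becomes visible
```

With equal clone proportions the region looks allelically balanced, whereas a second sample with different proportions shifts the fraction away from 0.5, which is the kind of signal a joint multi-sample analysis can exploit.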
Haplotype assembly is the process of assigning the different alleles of the variants covered by mapped sequencing reads to the two haplotypes of the genome of a human individual. Long reads, which are nowadays cheaper to produce and more widely available than ever before, have been used to reduce the fragmentation of the assembled haplotypes, owing to their ability to span several variants along the genome. These long reads are also characterized by a high error rate, an issue that may nevertheless be mitigated with larger sets of reads when this error rate is uniform across genome positions. Unfortunately, current state-of-the-art dynamic programming approaches designed for long reads deal only with limited coverages.
Here, we propose a new method for assembling haplotypes which combines and extends the features of previous approaches to deal with long reads and higher coverages. In particular, our algorithm is able to dynamically adapt the estimated number of errors at each variant site, while minimizing the total number of error corrections necessary for finding a feasible solution. This allows our method to significantly reduce the required computational resources, making it possible to consider datasets with higher coverages. The algorithm has been implemented in a freely available tool, HapCHAT: Haplotype Assembly Coverage Handling by Adapting Thresholds. An experimental analysis on sequencing reads with up to 60× coverage reveals improvements in accuracy and recall achieved by considering a higher coverage with lower runtimes.
Our method leverages the long-range information of sequencing reads, which makes it possible to obtain assembled haplotypes fragmented into fewer unphased haplotype blocks. At the same time, our method is also able to deal with higher coverages to better correct the errors in the original reads and to obtain more accurate haplotypes as a result.
HapCHAT is available at http://hapchat.algolab.eu under the GNU Public License (GPL).
Background: Checkpoint inhibitor (CPI) immunotherapies have provided durable clinical responses across a range of solid tumor types for some patients with cancer. Nonetheless, response rates to CPI vary greatly between cancer types. Resolving intratumor transcriptomic changes induced by CPI may improve our understanding of the mechanisms of sensitivity and resistance. Methods: We assembled a cohort of longitudinal pre-therapy and on-therapy samples from 174 patients treated with CPI across six cancer types by leveraging transcriptomic sequencing data from five studies. Results: Meta-analyses of published RNA markers revealed an on-therapy pattern of immune reinvigoration in patients with breast cancer, which was not discernible pre-therapy, providing biological insight into the impact of CPI on the breast cancer immune microenvironment. We identified 98 breast cancer-specific correlates of CPI response, including 13 genes which are known IO targets, such as toll-like receptors TLR1, TLR4, and TLR8, that could hold potential as combination targets for patients with breast cancer receiving CPI treatment. Furthermore, we demonstrate that a subset of response genes identified in breast cancer are already highly expressed pre-therapy in melanoma, and additionally we establish divergent RNA dynamics between breast cancer and melanoma following CPI treatment, which may suggest distinct immune microenvironments between the two cancer types. Conclusions: Overall, delineating longitudinal RNA dynamics following CPI therapy sheds light on the mechanisms underlying diverging response trajectories, and identifies putative targets for combination therapy.
Cancer is an evolutionary process characterized by the accumulation of somatic mutations in a population of cells that form a tumor. One frequent type of mutation is copy-number aberrations, which alter the number of copies of genomic regions. The number of copies of each position along a chromosome constitutes the chromosome's copy-number profile. Understanding how such profiles evolve in cancer can assist in both diagnosis and prognosis.
We model the evolution of a tumor by segmental deletions and amplifications, and gauge distance from profile $a$ to $b$ by the minimum number of events needed to transform $a$ into $b$. Given two profiles, our first problem aims to find a parental profile that minimizes the sum of distances to its children. Given $k$ profiles, the second, more general problem, seeks a phylogenetic tree, whose $k$ leaves are labeled by the $k$ given profiles and whose internal vertices are labeled by ancestral profiles such that the sum of edge distances is minimum.
For the former problem we give a pseudo-polynomial dynamic programming algorithm that is linear in the profile length, and an integer linear program formulation. For the latter problem we show it is NP-hard and give an integer linear program formulation that scales to practical problem instance sizes. We assess the efficiency and quality of our algorithms on simulated instances.
https://github.com/raphael-group/CNT-ILP.
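The event-based distance described in this abstract can be illustrated on very small profiles with a brute-force search. The sketch below is not the dynamic programming algorithm or the integer linear program from the paper; it is an exhaustive breadth-first search under assumptions made here: each event changes a contiguous segment by exactly one copy, positions that have reached zero copies stay at zero, and copy numbers are capped at the largest value in either profile so that the search terminates. All function names are hypothetical.

```python
from collections import deque
from typing import Optional, Tuple

Profile = Tuple[int, ...]


def apply_event(profile: Profile, start: int, end: int, delta: int) -> Profile:
    """Amplify (delta=+1) or delete (delta=-1) one copy on positions start..end.

    Positions with zero copies are left unchanged (assumption: lost
    material cannot be regained).
    """
    return tuple(
        max(c + delta, 0) if start <= i <= end and c > 0 else c
        for i, c in enumerate(profile)
    )


def event_distance(source: Profile, target: Profile) -> Optional[int]:
    """Minimum number of unit segmental events turning source into target.

    Returns None when target is unreachable within the copy-number cap.
    Exponential time; intended for toy profiles only.
    """
    if len(source) != len(target):
        raise ValueError("profiles must have the same length")
    cap = max(max(source, default=0), max(target, default=0))
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        profile, dist = queue.popleft()
        if profile == target:
            return dist
        n = len(profile)
        for start in range(n):
            for end in range(start, n):
                for delta in (1, -1):
                    nxt = apply_event(profile, start, end, delta)
                    if nxt in seen or max(nxt, default=0) > cap:
                        continue
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return None


one_gain = event_distance((1, 1, 1), (2, 2, 1))    # a single amplification of the first two positions
two_losses = event_distance((1, 1, 1), (0, 1, 0))  # two separate deletions are needed
no_regain = event_distance((0, 1), (1, 1))         # a position with zero copies cannot be amplified
```

Under these assumptions the distance is asymmetric: a deletion to zero cannot be undone, so a profile may be reachable in one direction but not in the other.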
(1) Background: In this study, the effects of different pH values (2.4, 3.2, 4.4 and 5.0), temperatures (30, 35, 40, 45 and 50°C) and agitation (100 rpm) on the enzymatic decolourisation of twenty-two dyes belonging to the chromophore groups anthraquinone, azo and triphenylmethane were assessed. (2) Methods: In all conditions, a crude enzyme broth containing 30 U mL⁻¹ of laccases produced by Pleurotus sajor-caju PS-2001 in submerged process was used. (3) Results: Regarding the effects of pH values, the best results were obtained at pH 3.2 and 30°C, in which bleaching was observed for all dyes evaluated. In assays conducted at different temperatures, the highest levels of decolourisation were observed at 35°C and pH 3.2 for nineteen of the dyes assessed. Thirteen dyes presented colour reduction exceeding 50% after the enzymatic treatment, including all acid and all disperse dyes evaluated. Reciprocal agitation at 100 rpm had a negative effect on decolourisation. (4) Conclusion: From the results achieved, one can conclude that the laccase-containing preparation of P. sajor-caju PS-2001 has potential for the decolourisation of some dyes widely used in different industrial sectors, especially in the textile industry, and therefore could be used in future strategies for the biotreatment of coloured wastes.
Single-cell barcoding technologies enable genome sequencing of thousands of individual cells in parallel, but with extremely low sequencing coverage (<0.05×) per cell. While the total copy number of large multi-megabase segments can be derived from such data, important allele-specific mutations, such as copy-neutral loss of heterozygosity (LOH) in cancer, are missed. We introduce copy-number haplotype inference in single cells using evolutionary links (CHISEL), a method to infer allele- and haplotype-specific copy numbers in single cells and subpopulations of cells by aggregating sparse signal across hundreds or thousands of individual cells. We applied CHISEL to ten single-cell sequencing datasets of ~2,000 cells from two patients with breast cancer. We identified extensive allele-specific copy-number aberrations (CNAs) in these samples, including copy-neutral LOHs, whole-genome duplications (WGDs) and mirrored-subclonal CNAs. These allele-specific CNAs affect genomic regions containing well-known breast-cancer genes. We also refined the reconstruction of tumor evolution, timing allele-specific CNAs before and after WGDs, identifying low-frequency subpopulations distinguished by unique CNAs and uncovering evidence of convergent evolution.
(1) Background: Oxygen supply is an important parameter to be considered in submerged cultures. This study evaluated the influence of different conditions for dissolved oxygen (DO) concentration on laccase activity and growth of Pleurotus sajor-caju PS-2001 in submerged process in a stirred-tank bioreactor. (2) Methods: Initially, three different conditions were tested: uncontrolled DO and minimum levels of 30% and 80% of saturation, with the pH controlled between 4.5 and 7.0. (3) Results: The best results were observed at 30% DO (26 U mL⁻¹ of laccases at 96 h), whereas higher mycelial biomass was observed at 30% and 80% DO (above 4.5 g L⁻¹). Four different conditions of DO (uncontrolled, 10%, 30% and 50% of saturation) were tested at pH 6.5, with higher laccase activity (80 U mL⁻¹ at 66 h) and lower mycelial growth (1.36 g L⁻¹ at 90 h) being achieved with DO of 30%. In this test, the highest values for volumetric productivity and specific yield factor were determined. Under the different pH conditions tested, the production of laccases is favoured at a DO concentration of 30% of saturation, while higher DO levels favour fungal growth. (4) Conclusion: The results indicate that dissolved oxygen concentration is a critical factor for the culture of P. sajor-caju PS-2001 and has important effects not only on laccase production but also on fungal growth.