We propose an extension to quantile normalization that removes unwanted technical variation using control probes. We adapt our algorithm, functional normalization, to the Illumina 450k methylation array and address the open problem of normalizing methylation data with global epigenetic changes, such as human cancers. Using data sets from The Cancer Genome Atlas and a large case-control study, we show that our algorithm outperforms all existing normalization methods with respect to replication of results between experiments, and yields robust results even in the presence of batch effects. Functional normalization can be applied to any microarray platform, provided suitable control probes are available.
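For orientation, the sketch below shows plain quantile normalization, the baseline that the abstract above says it extends. It is not functional normalization itself: it does not use control probes. The function name `quantile_normalize` and the toy intensities are illustrative assumptions, not taken from the paper.

```python
def quantile_normalize(samples):
    """Plain quantile normalization of equal-length samples.

    `samples` is a list of lists, one inner list of probe intensities per
    array. This is only the baseline technique; it does NOT implement the
    control-probe adjustment of functional normalization.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same number of probes")

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples) for k in range(n)
    ]

    normalized = []
    for s in samples:
        # Rank probes within the sample (ties broken by original position).
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Toy intensities (made up for illustration): two arrays, three probes each.
toy = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
result = quantile_normalize(toy)
print(result)  # [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After this step every array shares the same empirical distribution, which is why global biological shifts (for example in cancer samples) can be erased by the baseline method.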
Chemotherapy resistance is a critical barrier in cancer treatment. Metabolic adaptations have been shown to fuel therapy resistance; however, little is known regarding the generality of these changes and whether specific therapies elicit unique metabolic alterations. Using a combination of metabolomics, transcriptomics, and functional genomics, we show that two anthracyclines, doxorubicin and epirubicin, elicit distinct primary metabolic vulnerabilities in human breast cancer cells. Doxorubicin-resistant cells rely on glutamine to drive oxidative phosphorylation and de novo glutathione synthesis, while epirubicin-resistant cells display markedly increased bioenergetic capacity and mitochondrial ATP production. The dependence on these distinct metabolic adaptations is revealed by the increased sensitivity of doxorubicin-resistant cells and tumor xenografts to buthionine sulfoximine (BSO), a drug that interferes with glutathione synthesis, compared with epirubicin-resistant counterparts that are more sensitive to the biguanide phenformin. Overall, our work reveals that metabolic adaptations can vary with therapeutics and that these metabolic dependencies can be exploited as a targeted approach to treat chemotherapy-resistant breast cancer.
Chromosomal breakage followed by faulty DNA repair leads to gene amplifications and deletions in cancers. However, the mere assessment of the extent of genomic changes, amplifications and deletions may reduce the complexity of genomic data observed by array comparative genomic hybridization (array CGH). We present here a novel approach to array CGH data analysis, which focuses on putative breakpoints responsible for rearrangements within the genome.
We performed array comparative genomic hybridization in 29 primary tumors from high-risk patients with breast cancer. The specimens were flow sorted according to ploidy to increase tumor cell purity prior to array CGH. We describe the number of chromosomal breaks as well as the patterns of breaks on individual chromosomes in each tumor. There were differences in chromosomal breakage patterns between the three clinical subtypes of breast cancer, although the highest density of breaks occurred at chromosome 17 in all subtypes, suggesting a particular proclivity of this chromosome for breaks. We also observed chromothripsis affecting various chromosomes in 41% of high-risk breast cancers.
Our results provide new insight into the genomic complexity of breast cancer. Genomic instability dependent on chromosomal breakage events is not stochastic, targeting some chromosomes clearly more than others. We report a much higher percentage of chromothripsis than described previously in other cancers, which suggests that massive genomic rearrangements occurring in a single catastrophic event may shape many breast cancer genomes.
The genomics era has led to an increase in the dimensionality of data collected in the investigation of biological questions. In this context, dimension-reduction techniques can be used to summarise high-dimensional signals into low-dimensional ones, to further test for association with one or more covariates of interest. This paper revisits one such approach, previously known as principal component of heritability and renamed here as principal component of explained variance (PCEV). As its name suggests, the PCEV seeks a linear combination of outcomes in an optimal manner, by maximising the proportion of variance explained by one or several covariates of interest. By construction, this method optimises power; however, due to its computational complexity, it has unfortunately received little attention in the past. Here, we propose a general analytical PCEV framework that builds on the assets of the original method, i.e. conceptually simple and free of tuning parameters. Moreover, our framework extends the range of applications of the original procedure by providing a computationally simple strategy for high-dimensional outcomes, along with exact and asymptotic testing procedures that drastically reduce its computational cost. We investigate the merits of the PCEV using an extensive set of simulations. Furthermore, the use of the PCEV approach is illustrated using three examples taken from the fields of epigenetics and brain imaging.
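The optimisation described in the abstract above can be written out as follows. This is a sketch under assumed notation, not the paper's own statement: Y is the vector of outcomes, X the covariates of interest, V_M and V_R the model and residual parts of Var(Y), and w the weight vector defining the component.

```latex
% Assumed notation (not given in the abstract): Y outcomes, X covariates,
% V_M = variance explained by X, V_R = residual variance, w = component weights.
\begin{align*}
  \operatorname{Var}(Y) &= V_M + V_R, \\
  h^2(w) &= \frac{w^\top V_M\, w}{w^\top (V_M + V_R)\, w}, \\
  w_{\mathrm{PCEV}} &= \arg\max_{w} \; h^2(w).
\end{align*}
% Maximising h^2(w) is equivalent to maximising the ratio
% w^T V_M w / w^T V_R w, so the maximiser is the leading solution of the
% generalised eigenvalue problem
\begin{equation*}
  V_M\, w = \lambda\, V_R\, w .
\end{equation*}
```

Written this way, the proportion of explained variance for the component is a Rayleigh-type quotient, and no tuning parameter appears in the criterion, which matches the abstract's description of the method as free of tuning parameters.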
Osteoporosis is a common disease diagnosed primarily by measurement of bone mineral density (BMD). We undertook a genome-wide association study (GWAS) in 142,487 individuals from the UK Biobank to identify loci associated with BMD as estimated by quantitative ultrasound of the heel. We identified 307 conditionally independent single-nucleotide polymorphisms (SNPs) that attained genome-wide significance at 203 loci, explaining approximately 12% of the phenotypic variance. These included 153 previously unreported loci, and several rare variants with large effect sizes. To investigate the underlying mechanisms, we undertook (1) bioinformatic, functional genomic annotation and human osteoblast expression studies; (2) gene-function prediction; (3) skeletal phenotyping of 120 knockout mice with deletions of genes adjacent to lead independent SNPs; and (4) analysis of gene expression in mouse osteoblasts, osteocytes and osteoclasts. The results implicate GPC6 as a novel determinant of BMD, and also identify abnormal skeletal phenotypes in knockout mice associated with a further 100 prioritized genes.
Deleterious copy number variants (CNVs) are identified in up to 20% of individuals with autism. However, levels of autism risk conferred by most rare CNVs remain unknown. The authors recently developed statistical models to estimate the effect size on IQ of all CNVs, including undocumented ones. In this study, the authors extended this model to autism susceptibility.
The authors identified CNVs in two autism populations (Simons Simplex Collection and MSSNG) and two unselected populations (IMAGEN and Saguenay Youth Study). Statistical models were used to test nine quantitative variables associated with genes encompassed in CNVs to explain their effects on IQ, autism susceptibility, and behavioral domains.
The "probability of being loss-of-function intolerant" (pLI) best explains the effect of CNVs on IQ and autism risk. Deleting 1 point of pLI decreases IQ by 2.6 points in autism and unselected populations. The effect of duplications on IQ is threefold smaller. Autism susceptibility increases when deleting or duplicating any point of pLI. This is true for individuals with high or low IQ and after removing de novo and known recurrent neuropsychiatric CNVs. When CNV effects on IQ are accounted for, autism susceptibility remains mostly unchanged for duplications but decreases for deletions. Model estimates for autism risk overlap with previously published observations. Deletions and duplications differentially affect social communication, behavior, and phonological memory, whereas both equally affect motor skills.
Autism risk conferred by duplications is less influenced by IQ compared with deletions. The model applied in this study, trained on CNVs encompassing >4,500 genes, suggests highly polygenic properties of gene dosage with respect to autism risk and IQ loss. These models will help to interpret CNVs identified in the clinic.
In observational studies, type-2 diabetes (T2D) is associated with an increased risk of coronary heart disease (CHD), yet interventional trials have shown no clear effect of glucose-lowering on CHD. Confounding may therefore have influenced these observational estimates. Here we use Mendelian randomization to obtain unconfounded estimates of the influence of T2D and fasting glucose (FG) on CHD risk. Using multiple genetic variants associated with T2D and FG, we find that risk of T2D increases CHD risk (odds ratio (OR)=1.11 (1.05-1.17), per unit increase in odds of T2D, P=8.8 × 10⁻⁵; using data from 34,840/114,981 T2D cases/controls and 63,746/130,681 CHD cases/controls). FG in non-diabetic individuals tends to increase CHD risk (OR=1.15 (1.00-1.32), per mmol/l, P=0.05; 133,010 non-diabetic individuals and 63,746/130,681 CHD cases/controls). These findings provide evidence supporting a causal relationship between T2D and CHD and suggest that long-term trials may be required to discern the effects of T2D therapies on CHD risk.
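To make the idea of combining multiple genetic variants concrete, the sketch below computes a fixed-effect inverse-variance weighted (IVW) Mendelian randomization estimate from per-variant summary statistics. The abstract does not say which estimator the authors used, so IVW is an assumption chosen because it is a common one; the function name `ivw_estimate` and all numbers are made up for illustration.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    beta_exposure: per-variant effects on the exposure (e.g. log-odds of T2D)
    beta_outcome:  per-variant effects on the outcome (e.g. log-odds of CHD)
    se_outcome:    standard errors of the outcome effects
    Returns (causal estimate on the log-odds scale, its standard error).
    """
    # Each variant contributes a ratio estimate beta_outcome / beta_exposure,
    # weighted by beta_exposure^2 / se_outcome^2.
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    std_error = math.sqrt(1.0 / total)
    return estimate, std_error


# Hypothetical summary statistics for two variants (not from the study).
est, se = ivw_estimate([0.1, 0.2], [0.01, 0.02], [0.01, 0.01])
odds_ratio = math.exp(est)  # convert the log-odds estimate to an odds ratio
print(round(est, 3), round(se, 4), round(odds_ratio, 3))
```

Variants with stronger effects on the exposure and more precisely estimated effects on the outcome receive more weight, so weak instruments contribute little to the pooled estimate.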
The role of rare genetic variation in the etiology of complex disease remains unclear. However, the development of next-generation sequencing technologies offers the experimental opportunity to address this question. Several novel statistical methodologies have been recently proposed to assess the contribution of rare variation to complex disease etiology. Nevertheless, no empirical estimates comparing their relative power are available. We therefore assessed the parameters that influence their statistical power in 1,998 individuals Sanger-sequenced at seven genes by modeling different distributions of effect, proportions of causal variants, and direction of the associations (deleterious, protective, or both) in simulated continuous trait and case/control phenotypes. Our results demonstrate that the power of recently proposed statistical methods depends strongly on the underlying hypotheses concerning the relationship of phenotypes with each of these three factors. No method demonstrates consistently acceptable power despite this large sample size, and the performance of each method depends upon the underlying assumption of the relationship between rare variants and complex traits. Sensitivity analyses are therefore recommended to compare the stability of the results arising from different methods, and promising results should be replicated using the same method in an independent sample. These findings provide guidance in the analysis and interpretation of the role of rare base-pair variation in the etiology of complex traits and diseases.