Peer reviewed
  • ASCA+ and APCA+: Extensions...
    Thiel, Michel; Féraud, Baptiste; Govaerts, Bernadette

    Journal of chemometrics, June 2017, Volume: 31, Issue: 6
    Journal Article

    Many modern analytical methods are used to analyse samples from designed experiments, for example in medical, biological, or agronomic fields. Most of the time, these methods generate highly multivariate data such as spectra or images. This is the case for “omics” technologies, which are used to detect genes (genomics), mRNA (transcriptomics), proteins (proteomics), or metabolites (metabolomics) in a specific biological sample. These technologies produce high-dimensional multivariate data sets in which the number of variables (descriptors) tends to be much larger than the number of experimental units. Moreover, omics experiments often follow designs aimed at understanding the effect of several factors on biological systems. Multivariate statistical tools are therefore needed to highlight variables that are consistently modified by different biological states. In this context, 2 recent methods combine analysis of variance (ANOVA) and principal component analysis (PCA): ASCA (ANOVA–simultaneous component analysis) and APCA (ANOVA-PCA). They provide powerful tools to visualize multivariate structures in the space of each effect of the statistical model linked to the experimental design. Their main limitation is that they provide biased estimators of the factor effects when the experimental design is unbalanced. This paper introduces 2 new methods, ASCA+ and APCA+, which extend ASCA and APCA, respectively, to unbalanced designs using several principles from the theory of general linear models. Both methods are applied to real-life metabolomics data, demonstrating the capacity of ASCA+ and APCA+ to highlight the correct biomarkers corresponding to effects of interest in unbalanced designs.

    This paper presents 2 new methods, ASCA+ and APCA+, which extend ASCA and APCA, respectively, to unbalanced designs. They rely on the general linear model to estimate factor effects by least squares rather than by the simple differences of means used in classical ASCA and APCA. Their application to real-life metabolomics data shows their advantage in highlighting biomarkers corresponding to a factor of interest in unbalanced designs.
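
The contrast between least-squares estimation and differences of means in an unbalanced design can be illustrated with a small numerical sketch. The code below is not the authors' implementation: it assumes an additive two-factor model with sum (deviation) coding, noise-free toy data, and hypothetical helper names (`sum_code`, `glm_effect_matrices`, `effect_pca`). It estimates the effect matrix of one factor by least squares, compares it with the "level mean minus grand mean" estimate, and then applies PCA (via SVD) to the effect matrix, as is done per effect in ASCA-style analyses.

```python
import numpy as np

def sum_code(levels):
    """Sum (deviation) coding: one column per level except the last,
    which is coded -1 in every column (an assumption of this sketch)."""
    levels = np.asarray(levels)
    uniq = np.unique(levels)
    cols = np.zeros((levels.size, uniq.size - 1))
    for j, lev in enumerate(uniq[:-1]):
        cols[levels == lev, j] = 1.0
    cols[levels == uniq[-1], :] = -1.0
    return cols

def glm_effect_matrices(Y, factor_a, factor_b):
    """Fit Y = X @ theta by least squares (additive model, no interaction)
    and return the effect matrices of factors A and B plus the residuals."""
    A = sum_code(factor_a)
    B = sum_code(factor_b)
    X = np.hstack([np.ones((Y.shape[0], 1)), A, B])
    theta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    a_cols = slice(1, 1 + A.shape[1])
    b_cols = slice(1 + A.shape[1], X.shape[1])
    M_a = X[:, a_cols] @ theta[a_cols]
    M_b = X[:, b_cols] @ theta[b_cols]
    residuals = Y - X @ theta
    return M_a, M_b, residuals

def effect_pca(M, n_components=2):
    """PCA of an effect matrix through its singular value decomposition."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    return scores, loadings

# Unbalanced 2x2 design: cell counts are 4, 1, 1, 4.
factor_a = np.array([0] * 5 + [1] * 5)
factor_b = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])

# Noise-free toy responses for 3 variables (made-up values).
mu = np.array([10.0, 5.0, 0.0])
alpha_true = np.array([1.0, 0.5, 0.0])   # effect of level 0 of factor A
beta_true = np.array([2.0, -1.0, 0.5])   # effect of level 0 of factor B
Y = mu + sum_code(factor_a) * alpha_true + sum_code(factor_b) * beta_true

M_a, M_b, residuals = glm_effect_matrices(Y, factor_a, factor_b)

# Classical estimate: mean of level 0 of A minus the grand mean.
classical_a0 = Y[factor_a == 0].mean(axis=0) - Y.mean(axis=0)

scores, loadings = effect_pca(M_a)

print("least squares, level 0 of A:", M_a[0])
print("difference of means, level 0 of A:", classical_a0)
```

In this toy design, level 0 of factor A is observed mostly together with level 0 of factor B, so the difference of means absorbs part of the B effect, whereas the least-squares estimate recovers the A effect used to generate the data.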