Phylogenetic signal is the tendency for closely related species to display similar trait values due to their common ancestry. Several methods have been developed for quantifying phylogenetic signal in univariate traits and for sets of traits treated simultaneously, and the statistical properties of these approaches have been extensively studied. However, methods for assessing phylogenetic signal in high-dimensional multivariate traits like shape are less well developed, and their statistical performance is not well characterized. In this article, I describe a generalization of the K statistic of Blomberg et al. that is useful for quantifying and evaluating phylogenetic signal in highly dimensional multivariate data. The method (Kmult) is found from the equivalency between statistical methods based on covariance matrices and those based on distance matrices. Using computer simulations based on Brownian motion, I demonstrate that the expected value of Kmult remains at 1.0 as trait variation among species is increased or decreased, and as the number of trait dimensions is increased. By contrast, estimates of phylogenetic signal found with a squared-change parsimony procedure for multivariate data change with increasing trait variation among species and with increasing numbers of trait dimensions, confounding biological interpretations. I also evaluate the statistical performance of hypothesis testing procedures based on Kmult and find that the method displays appropriate Type I error and high statistical power for detecting phylogenetic signal in high-dimensional data. Statistical properties of Kmult were consistent for simulations using bifurcating and random phylogenies, for simulations using different numbers of species, for simulations that varied the number of trait dimensions, and for different underlying models of trait covariance structure.
Overall, these findings demonstrate that Kmult provides a useful means of evaluating phylogenetic signal in high-dimensional multivariate traits. Finally, I illustrate the utility of the new approach by evaluating the strength of phylogenetic signal for head shape in a lineage of Plethodon salamanders.
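As a rough illustration of the kind of calculation this abstract describes, the sketch below computes a multivariate K-type statistic as the ratio of two traces, tr(R'R) / tr(R'C^-1 R), divided by its Brownian motion expectation (tr(C) - N / (1'C^-1 1)) / (N - 1), where R holds the trait values centered on the phylogenetic mean and C is the phylogenetic covariance matrix. This trace form is assumed here to be equivalent to the distance-based definition in the article; the function names `solve` and `k_mult`, the toy tree, and the toy data are all hypothetical and are not taken from the article.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting.
    Assumes A is square and non-singular."""
    n = len(A)
    # Build the augmented matrix [A | B] as floats
    M = [list(map(float, A[i])) + list(map(float, B[i])) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        piv = M[col][col]
        M[col] = [v / piv for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def k_mult(Y, C):
    """Y: N x p trait matrix (list of rows); C: N x N phylogenetic covariance.
    Assumed trace form of the statistic; not the article's own code."""
    N, p = len(Y), len(Y[0])
    cinv_one = [row[0] for row in solve(C, [[1.0] for _ in range(N)])]  # C^-1 1
    one_cinv_one = sum(cinv_one)                                        # 1' C^-1 1
    cinv_y = solve(C, Y)                                                # C^-1 Y
    # Phylogenetic mean (root estimate) for each trait dimension
    a = [sum(cinv_y[i][j] for i in range(N)) / one_cinv_one for j in range(p)]
    R = [[Y[i][j] - a[j] for j in range(p)] for i in range(N)]
    cinv_r = solve(C, R)
    ss_raw = sum(R[i][j] ** 2 for i in range(N) for j in range(p))            # tr(R'R)
    ss_phy = sum(R[i][j] * cinv_r[i][j] for i in range(N) for j in range(p))  # tr(R'C^-1 R)
    observed = ss_raw / ss_phy
    expected = (sum(C[i][i] for i in range(N)) - N / one_cinv_one) / (N - 1)
    return observed / expected


# Hypothetical toy data: four species on a balanced tree of depth 2,
# sister species sharing one unit of branch length; three trait dimensions.
C = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]]
Y = [[1.0, 0.2, 3.1], [1.3, 0.1, 2.8], [4.0, 1.5, 0.2], [4.4, 1.7, 0.5]]
print(round(k_mult(Y, C), 3))
```

Two properties of this sketch are easy to check: rescaling all trait values by a constant leaves the value unchanged, and for a star phylogeny (C equal to the identity matrix) the value is 1 up to floating-point error. Significance testing by permuting species across the tips is not shown.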
Studies of evolutionary correlations commonly use phylogenetic regression (i.e., independent contrasts and phylogenetic generalized least squares) to assess trait covariation in a phylogenetic context. However, while this approach is appropriate for evaluating trends in one or a few traits, it is incapable of assessing patterns in highly multivariate data, as the large number of variables relative to sample size prohibits parametric test statistics from being computed. This poses serious limitations for comparative biologists, who must either simplify how they quantify phenotypic traits, or alter the biological hypotheses they wish to examine. In this article, I propose a new statistical procedure for performing ANOVA and regression models in a phylogenetic context that can accommodate high-dimensional datasets. The approach is derived from the statistical equivalency between parametric methods using covariance matrices and methods based on distance matrices. Using simulations under Brownian motion, I show that the method displays appropriate Type I error rates and statistical power, whereas standard parametric procedures have decreasing power as data dimensionality increases. As such, the new procedure provides a useful means of assessing trait covariation across a set of taxa related by a phylogeny, enabling macroevolutionary biologists to test hypotheses of adaptation and phenotypic change in high-dimensional datasets.
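The equivalency between covariance-based and distance-based calculations mentioned in this abstract can be illustrated in its simplest, non-phylogenetic form: the total sum of squares around the centroid (the trace of the sums-of-squares and cross-products matrix) equals the sum of squared pairwise Euclidean distances divided by the number of observations. The sketch below checks that identity on hypothetical data with more variables than observations. It is only the ordinary least-squares version; the phylogenetic transformation used in the article is not shown, and the function names are illustrative.

```python
from itertools import combinations


def ss_from_centroid(Y):
    """Trace of the SSCP matrix: squared deviations from the column means."""
    n, p = len(Y), len(Y[0])
    mean = [sum(row[j] for row in Y) / n for j in range(p)]
    return sum((row[j] - mean[j]) ** 2 for row in Y for j in range(p))


def ss_from_distances(Y):
    """Same quantity from squared pairwise Euclidean distances only."""
    n = len(Y)
    d2 = sum(sum((a - b) ** 2 for a, b in zip(r1, r2))
             for r1, r2 in combinations(Y, 2))
    return d2 / n


# Hypothetical data: 3 observations, 5 variables (more variables than rows)
Y = [[1.0, 2.0, 0.5, 3.0, 1.5],
     [0.0, 4.0, 1.5, 2.0, 0.5],
     [2.0, 1.0, 2.5, 0.0, 3.5]]
print(ss_from_centroid(Y), ss_from_distances(Y))
```

Because the distance-based form never requires inverting a p x p covariance matrix, a sum of squares of this kind remains computable when the number of variables exceeds the number of observations.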
Many questions in evolutionary biology require the quantification and comparison of rates of phenotypic evolution. Recently, phylogenetic comparative methods have been developed for comparing evolutionary rates on a phylogeny for single, univariate traits ($\sigma^2$), and evolutionary rate matrices (R) for sets of traits treated simultaneously. However, high-dimensional traits like shape remain under-examined with this framework, because methods suited for such data have not been fully developed. In this article, I describe a method to quantify phylogenetic evolutionary rates for high-dimensional multivariate data ($\sigma_{mult}^2$), found from the equivalency between statistical methods based on covariance matrices and those based on distance matrices (R-mode and Q-mode methods). I then use simulations to evaluate the statistical performance of hypothesis-testing procedures that compare $\sigma_{mult}^2$ for two or more groups of species on a phylogeny. Under both isotropic and non-isotropic conditions, and for differing numbers of trait dimensions, the proposed method displays appropriate Type I error and high statistical power for detecting known differences in $\sigma_{mult}^2$ among groups. In contrast, the Type I error rate of likelihood tests based on the evolutionary rate matrix (R) increases as the number of trait dimensions (p) increases, and becomes unacceptably large when only a few trait dimensions are considered. Further, likelihood tests based on R cannot be computed when the number of trait dimensions equals or exceeds the number of taxa in the phylogeny (i.e., when p ≥ N). These results demonstrate that tests based on $\sigma_{mult}^2$ provide a useful means of comparing evolutionary rates for high-dimensional data that are otherwise not analytically accessible to methods based on the evolutionary rate matrix. This advance thus expands the phylogenetic comparative toolkit for high-dimensional phenotypic traits like shape.
Finally, I illustrate the utility of the new approach by evaluating rates of head shape evolution in a lineage of Plethodon salamanders.
Residual randomization in permutation procedures (RRPP) is an appropriate means of generating empirical sampling distributions for ANOVA statistics and linear model coefficients, using ordinary or generalized least‐squares estimation. This is an especially useful approach for high‐dimensional (multivariate) data.
Here, we present an R package that provides a comprehensive suite of tools for applying RRPP to linear models. Important available features include choices for OLS or GLS coefficient estimation, data or dissimilarity matrix analysis capability, choice among types I, II, or III sums of squares and cross‐products, various effect size estimation methods, and an ability to perform mixed‐model ANOVA.
The lm.rrpp function is similar to the lm function in many regards, but provides coefficient and ANOVA statistics estimates over many random permutations. The S3 generic functions commonly used with lm also work with lm.rrpp. Additionally, a pairwise function provides statistical tests for comparisons of least‐squares means or slopes, among designated groups. Users have many options for varying random permutations. Compared to similar available packages and functions, RRPP is extremely fast and yields comprehensive results for downstream analyses and graphics, following model fits with lm.rrpp.
The RRPP package facilitates analysis of both univariate and multivariate response data, even when the number of variables exceeds the number of observations.
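The general idea of residual randomization can be sketched in a few lines. The example below is not the RRPP package (which is written in R) and does not reproduce `lm.rrpp`; it is a minimal OLS-only illustration in plain Python, assuming a reduced model (intercept plus covariate) and a full model that adds a two-level group term. Residuals from the reduced model are shuffled as whole rows, added back to the reduced-model fitted values, and the group sum of squares is recomputed each time. All function names and the data are hypothetical.

```python
import math
import random


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination (A square, non-singular)."""
    n = len(A)
    M = [list(map(float, A[i])) + list(map(float, B[i])) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        piv = M[col][col]
        M[col] = [v / piv for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def fitted(X, Y):
    """OLS fitted values X (X'X)^-1 X'Y for a multivariate response Y."""
    n, k, p = len(X), len(X[0]), len(Y[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    XtY = [[sum(X[i][a] * Y[i][j] for i in range(n)) for j in range(p)] for a in range(k)]
    B = solve(XtX, XtY)
    return [[sum(X[i][a] * B[a][j] for a in range(k)) for j in range(p)] for i in range(n)]


def sse(X, Y):
    """Residual sum of squares summed over all response variables."""
    F = fitted(X, Y)
    return sum((Y[i][j] - F[i][j]) ** 2 for i in range(len(Y)) for j in range(len(Y[0])))


def rrpp(X_reduced, X_full, Y, iterations=199, seed=1):
    """Permutation test of the term(s) in X_full that are absent from X_reduced."""
    rng = random.Random(seed)
    n, p = len(Y), len(Y[0])
    F0 = fitted(X_reduced, Y)
    E0 = [[Y[i][j] - F0[i][j] for j in range(p)] for i in range(n)]

    def effect_ss(Ystar):
        return sse(X_reduced, Ystar) - sse(X_full, Ystar)

    observed = effect_ss(Y)
    dist = [observed]  # the observed value counts as one permutation
    for _ in range(iterations):
        order = list(range(n))
        rng.shuffle(order)
        # Pseudo-data: reduced-model fitted values plus shuffled residual rows
        Ystar = [[F0[i][j] + E0[order[i]][j] for j in range(p)] for i in range(n)]
        dist.append(effect_ss(Ystar))
    p_value = sum(s >= observed for s in dist) / len(dist)
    return observed, p_value


# Hypothetical data: 20 observations, 2 response variables, strong group effect
n = 20
x = [i / 10 for i in range(n)]
g = [i % 2 for i in range(n)]
Y = [[1 + 2 * x[i] + 3 * g[i] + 0.1 * math.sin(i),
      0.5 * x[i] - 2 * g[i] + 0.1 * math.cos(i)] for i in range(n)]
X_reduced = [[1.0, x[i]] for i in range(n)]
X_full = [[1.0, x[i], float(g[i])] for i in range(n)]
ss_group, p_value = rrpp(X_reduced, X_full, Y)
print(ss_group, p_value)
```

Because the test statistic here is a sum of squares pooled across response variables, no p x p matrix is inverted, so the same loop runs unchanged when the number of response variables exceeds the number of observations. GLS estimation, type II and III sums of squares, effect sizes, and pairwise comparisons are deliberately left out of this sketch.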
Morphological integration describes the degree to which sets of organismal traits covary with one another. Morphological covariation may be evaluated at various levels of biological organization, but when characterizing such patterns across species at the macroevolutionary level, phylogeny must be taken into account. We outline an analytical procedure based on the evolutionary covariance matrix that allows species-level patterns of morphological integration among structures defined by sets of traits to be evaluated while accounting for the phylogenetic relationships among taxa, providing a flexible and robust complement to related phylogenetic independent contrasts based approaches. Using computer simulations under a Brownian motion model we show that statistical tests based on the approach display appropriate Type I error rates and high statistical power for detecting known levels of integration, and these trends remain consistent for simulations using different numbers of species, and for simulations that differ in the number of trait dimensions. Thus, our procedure provides a useful means of testing hypotheses of morphological integration in a phylogenetic context. We illustrate the utility of this approach by evaluating evolutionary patterns of morphological integration in head shape for a lineage of Plethodon salamanders, and find significant integration between cranial shape and mandible shape. Finally, computer code written in R for implementing the procedure is provided.
Recent years have seen increased interest in phylogenetic comparative analyses of multivariate data sets, but to date the varied proposed approaches have not been extensively examined. Here we review the mathematical properties required of any multivariate method, and specifically evaluate existing multivariate phylogenetic comparative methods in this context. Phylogenetic comparative methods based on the full multivariate likelihood are robust to levels of covariation among trait dimensions and are insensitive to the orientation of the data set, but display increasing model misspecification as the number of trait dimensions increases. This is because the expected evolutionary covariance matrix (V) used in the likelihood calculations becomes more ill-conditioned as trait dimensionality increases, and as evolutionary models become more complex. Thus, these approaches are only appropriate for data sets with few traits and many species. Methods that summarize patterns across trait dimensions treated separately (e.g., SURFACE) incorrectly assume independence among trait dimensions, resulting in a nearly 100% model misspecification rate. Methods using pairwise composite likelihood are highly sensitive to levels of trait covariation, the orientation of the data set, and the number of trait dimensions. The consequences of these debilitating deficiencies are that a user can arrive at differing statistical conclusions, and therefore biological inferences, simply from a dataspace rotation, like principal component analysis. By contrast, algebraic generalizations of the standard phylogenetic comparative toolkit that use the trace of covariance matrices are insensitive to levels of trait covariation, the number of trait dimensions, and the orientation of the data set. Further, when appropriate permutation tests are used, these approaches display acceptable Type I error and statistical power.
We conclude that methods summarizing information across trait dimensions, as well as pairwise composite likelihood methods, should be avoided, whereas algebraic generalizations of the phylogenetic comparative toolkit provide a useful means of assessing macroevolutionary patterns in multivariate data. Finally, we discuss areas in which multivariate phylogenetic comparative methods are still in need of future development, namely highly multivariate Ornstein–Uhlenbeck models and approaches for multivariate evolutionary model comparisons.
In recent years, likelihood-based approaches have been used with increasing frequency to evaluate macroevolutionary hypotheses of phenotypic evolution under distinct evolutionary processes in a phylogenetic context (e.g., Brownian motion, Ornstein-Uhlenbeck, etc.), and to compare one or more evolutionary rates for the same phenotypic trait along a phylogeny. It is also of interest to determine whether one trait evolves at a faster rate than another trait. However, to date no study has compared phylogenetic evolutionary rates between traits using likelihood, because a formal approach has not yet been proposed. In this article, I describe a new likelihood procedure for comparing evolutionary rates for two or more phenotypic traits on a phylogeny. This approach compares the likelihood of a model where each trait evolves at a distinct evolutionary rate to the likelihood of a model where all traits are constrained to evolve at a common evolutionary rate. The method can also account for within-species measurement error and within-species trait covariation if available. Simulations revealed that the method has appropriate Type I error rates and statistical power. Importantly, when compared with existing approaches based on phylogenetically independent contrasts and methods that compare confidence intervals for model parameters, the likelihood method displays preferable statistical properties for a wide range of simulated conditions. Thus, this likelihood-based method extends the phylogenetic comparative biology toolkit and provides evolutionary biologists with a more powerful means of determining when evolutionary rates differ between phenotypic traits. Finally, I provide an empirical example illustrating the approach by comparing rates of evolution for several phenotypic traits in Plethodon salamanders.
Evolutionary morphologists frequently wish to understand the extent to which organisms are integrated, and whether the strength of morphological integration among subsets of phenotypic variables differs among taxa or other groups. However, comparisons of the strength of integration across datasets are difficult, in part because the summary measures that characterize these patterns (RV coefficient and rPLS) are dependent both on sample size and on the number of variables. As a solution to this issue, we propose a standardized test statistic (a z-score) for measuring the degree of morphological integration between sets of variables. The approach is based on a partial least squares analysis of trait covariation, and its permutation-based sampling distribution. Under the null hypothesis of a random association of variables, the method displays a constant expected value and confidence intervals for datasets of differing sample sizes and variable number, thereby providing a consistent measure of integration suitable for comparisons across datasets. A two-sample test is also proposed to statistically determine whether levels of integration differ between datasets, and an empirical example examining cranial shape integration in Mediterranean wall lizards illustrates its use. Some extensions of the procedure are also discussed.
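A standardized effect size of the kind described here can be sketched as z = (r_obs - mean(r_perm)) / sd(r_perm), where r is the correlation between the first pair of partial least squares scores and the permutation distribution comes from shuffling the rows of one block. The code below is an illustrative sketch only: the leading singular vectors are found by simple power iteration, the observed value is counted as part of the permutation distribution, and no transformation is applied to the distribution, all of which are assumptions rather than details stated in the abstract. Function names and data are hypothetical.

```python
import math
import random


def center(M):
    """Subtract column means."""
    n = len(M)
    means = [sum(r[j] for r in M) / n for j in range(len(M[0]))]
    return [[r[j] - means[j] for j in range(len(r))] for r in M]


def r_pls(X, Y, sweeps=200):
    """Correlation between the first pair of PLS scores of blocks X and Y.
    Uses power iteration; assumes the cross-product matrix is not all zeros."""
    X, Y = center(X), center(Y)
    n, p, q = len(X), len(X[0]), len(Y[0])
    S = [[sum(X[i][a] * Y[i][b] for i in range(n)) for b in range(q)] for a in range(p)]
    v = [1.0] * q
    for _ in range(sweeps):
        u = [sum(S[a][b] * v[b] for b in range(q)) for a in range(p)]
        norm = math.sqrt(sum(t * t for t in u))
        u = [t / norm for t in u]
        v = [sum(S[a][b] * u[a] for a in range(p)) for b in range(q)]
        norm = math.sqrt(sum(t * t for t in v))
        v = [t / norm for t in v]
    sx = [sum(X[i][a] * u[a] for a in range(p)) for i in range(n)]
    sy = [sum(Y[i][b] * v[b] for b in range(q)) for i in range(n)]
    num = sum(a * b for a, b in zip(sx, sy))
    return num / math.sqrt(sum(a * a for a in sx) * sum(b * b for b in sy))


def integration_z(X, Y, iterations=199, seed=1):
    """Observed r_PLS and its z-score against a row-permutation distribution."""
    rng = random.Random(seed)
    observed = r_pls(X, Y)
    dist = [observed]
    rows = list(Y)
    for _ in range(iterations):
        rng.shuffle(rows)  # break the pairing between the two blocks
        dist.append(r_pls(X, rows))
    mean = sum(dist) / len(dist)
    sd = math.sqrt(sum((r - mean) ** 2 for r in dist) / (len(dist) - 1))
    return observed, (observed - mean) / sd


# Hypothetical data: two blocks of two variables each, strongly associated
n = 20
X = [[math.sin(i), 0.5 * math.cos(1.7 * i)] for i in range(n)]
Y = [[2 * X[i][0] + 0.05 * math.sin(3 * i),
      2 * X[i][1] + 0.05 * math.cos(5 * i)] for i in range(n)]
r_obs, z = integration_z(X, Y)
print(r_obs, z)
```

Unlike the raw correlation, a z-score of this form is expressed in standard deviations of its own null distribution, which is what makes values from datasets with different sample sizes and numbers of variables comparable. The two-sample test mentioned in the abstract is not shown.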
Evolutionary biology is multivariate, and advances in phylogenetic comparative methods for multivariate phenotypes have surged to accommodate this fact. Evolutionary trends in multivariate phenotypes are derived from distances and directions between species in a multivariate phenotype space. For these patterns to be interpretable, phenotypes should be characterized by traits in commensurate units and scale. Visualizing such trends, as is achieved with phylomorphospaces, should continue to play a prominent role in macroevolutionary analyses. Evaluating phylogenetic generalized least squares (PGLS) models (e.g., phylogenetic analysis of variance and regression) is valuable, but using parametric procedures is limited to only a few phenotypic variables. In contrast, nonparametric, permutation-based PGLS methods provide a flexible alternative and are thus preferred for high-dimensional multivariate phenotypes. Permutation-based methods for evaluating covariation within multivariate phenotypes are also well established and can test evolutionary trends in phenotypic integration. However, comparing evolutionary rates and modes in multivariate phenotypes remains an important area of future development.
Phylogenetic ANOVA. Adams, Dean C.; Collyer, Michael L. Evolution, June 2018, Volume 72, Issue 6. Journal article, peer reviewed.
Phylogenetic regression is frequently used in macroevolutionary studies, and its statistical properties have been thoroughly investigated. By contrast, phylogenetic ANOVA has received relatively less attention, and the conditions leading to incorrect statistical and biological inferences when comparing multivariate phenotypes among groups remain underexplored. Here, we propose a refined method of randomizing residuals in a permutation procedure (RRPP) for evaluating phenotypic differences among groups while conditioning the data on the phylogeny. We show that RRPP displays appropriate statistical properties for both phylogenetic ANOVA and regression models, and for univariate and multivariate datasets. For ANOVA, we find that RRPP exhibits higher statistical power than methods utilizing phylogenetic simulation. Additionally, we investigate how group dispersion across the phylogeny affects inferences, and reveal that highly aggregated groups generate strong and significant correlations with the phylogeny, which reduce statistical power and subsequently affect biological interpretations. We discuss the broader implications of this phylogenetic group aggregation, and its relation to challenges encountered with other comparative methods where one or a few transitions in discrete traits are observed on the phylogeny. Finally, we recommend that phylogenetic comparative studies of continuous trait data use RRPP for assessing the significance of indicator variables as sources of trait variation.