Linear mixed‐effects models are powerful tools for analysing complex datasets with repeated or clustered observations, a common data structure in ecology and evolution. Mixed‐effects models involve complex fitting procedures and make several assumptions, in particular about the distribution of residual and random effects. Violations of these assumptions are common in real datasets, yet it is not always clear how much these violations matter to accurate and unbiased estimation.
Here we address the consequences of violations in distributional assumptions and the impact of missing random effect components on model estimates. In particular, we evaluate the effects of skewed, bimodal and heteroscedastic random effect and residual variances, of missing random effect terms and of correlated fixed effect predictors. We focus on bias and prediction error on estimates of fixed and random effects.
Model estimates were usually robust to violations of assumptions, with the exception of slight upward biases in estimates of random effect variance if the generating distribution was bimodal but was modelled by Gaussian error distributions. Further, estimates for (random effect) components that violated distributional assumptions became less precise but remained unbiased. However, this particular problem did not affect other parameters of the model. The same pattern was found for strongly correlated fixed effects, which led to imprecise, but unbiased estimates, with uncertainty estimates reflecting imprecision.
Unmodelled sources of random effect variance had predictable effects on variance component estimates. The pattern is best viewed as a cascade of hierarchical grouping factors. Variances trickle down the hierarchy such that missing higher‐level random effect variances pool at lower levels and missing lower‐level and crossed random effect variances manifest as residual variance.
Overall, our results show remarkable robustness of mixed‐effects models that should allow researchers to use mixed‐effects models even if the distributional assumptions are objectively violated. However, this does not free researchers from careful evaluation of the model. Estimates that are based on data that show clear violations of key assumptions should be treated with caution because individual datasets might give highly imprecise estimates, even if they will be unbiased on average across datasets.
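The trickle‐down pattern described above can be illustrated with a small simulation. This is a minimal sketch, not the authors' procedure: it swaps in balanced one‐way ANOVA (method‐of‐moments) estimators for a fitted mixed‐effects model, and the design sizes, variance values and function names (`simulate_nested`, `oneway_components`) are assumptions chosen for illustration only.

```python
import numpy as np


def simulate_nested(n_groups=200, n_ind=5, n_obs=4,
                    v_group=1.0, v_ind=0.5, v_res=0.25, seed=1):
    """Balanced design: observations within individuals within groups."""
    rng = np.random.default_rng(seed)
    group = rng.normal(0.0, np.sqrt(v_group), size=(n_groups, 1, 1))
    indiv = rng.normal(0.0, np.sqrt(v_ind), size=(n_groups, n_ind, 1))
    resid = rng.normal(0.0, np.sqrt(v_res), size=(n_groups, n_ind, n_obs))
    return group + indiv + resid  # shape: (n_groups, n_ind, n_obs)


def oneway_components(y):
    """ANOVA (method-of-moments) estimates for a balanced one-way layout.

    y has shape (n_units, n_per_unit); returns the (among-unit, within-unit)
    variance estimates.
    """
    n_units, n_per = y.shape
    unit_means = y.mean(axis=1)
    ms_within = ((y - unit_means[:, None]) ** 2).sum() / (n_units * (n_per - 1))
    ms_among = n_per * ((unit_means - y.mean()) ** 2).sum() / (n_units - 1)
    return (ms_among - ms_within) / n_per, ms_within


y = simulate_nested()
n_groups, n_ind, n_obs = y.shape

# Higher-level factor (group) omitted: only individual identity is modelled.
# The group variance pools into the among-individual estimate.
v_ind_hat, v_res_hat = oneway_components(y.reshape(n_groups * n_ind, n_obs))

# Lower-level factor (individual) omitted: only group identity is modelled.
# Much of the individual variance now appears as residual variance.
v_group_hat, v_res_hat_no_ind = oneway_components(
    y.reshape(n_groups, n_ind * n_obs)
)

print(f"group omitted:      among-individual = {v_ind_hat:.2f}, residual = {v_res_hat:.2f}")
print(f"individual omitted: among-group = {v_group_hat:.2f}, residual = {v_res_hat_no_ind:.2f}")
```

Under these assumed values, the among‐individual estimate in the first fit should sit near the sum of the simulated group and individual variances, while the residual estimate in the second fit should be clearly inflated above the simulated residual variance.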
ABSTRACT
We present a novel perspective on life‐history evolution that combines recent theoretical advances in fluctuating density‐dependent selection with the notion of pace‐of‐life syndromes (POLSs) in behavioural ecology. These ideas posit phenotypic co‐variation in life‐history, physiological, morphological and behavioural traits as a continuum from the highly fecund, short‐lived, bold, aggressive and highly dispersive ‘fast’ types at one end of the POLS to the less fecund, long‐lived, cautious, shy, plastic and socially responsive ‘slow’ types at the other. We propose that such variation in life histories and the associated individual differences in behaviour can be explained through their eco‐evolutionary dynamics with population density – a single and ubiquitous selective factor that is present in all biological systems. Contrasting regimes of environmental stochasticity are expected to affect population density in time and space and create differing patterns of fluctuating density‐dependent selection, which generates variation in fast versus slow life histories within and among populations. We therefore predict that a major axis of phenotypic co‐variation in life‐history, physiological, morphological and behavioural traits (i.e. the POLS) should align with these stochastic fluctuations in the multivariate fitness landscape created by variation in density‐dependent selection. Phenotypic plasticity and/or genetic (co‐)variation oriented along this major POLS axis are thus expected to facilitate rapid and adaptively integrated changes in various aspects of life histories within and among populations and/or species. The fluctuating density‐dependent selection POLS framework presented here therefore provides a series of clear testable predictions, the investigation of which should further our fundamental understanding of life‐history evolution and thus our ability to predict natural population dynamics.
1. Labile characters allow individuals to flexibly adjust their phenotype to changes in environmental conditions. There is growing evidence that individuals can differ both in average expression and level of plasticity in this type of character. Both of these aspects are studied in conjunction within a reaction norm framework.
2. Theoreticians have investigated the factors promoting variation in reaction norm intercepts (average phenotype) and slopes (level of plasticity) of a key labile character: behaviour. A general prediction from their work is that selection will favour the evolution of repeatable individual variation in level of plasticity only under certain ecological conditions. While factors promoting individual repeatability of plasticity have thus been identified, empirical estimates of this phenomenon are largely lacking for wild populations.
3. We assayed aggressiveness of individual male great tits (Parus major) twice during their egg-laying stage and twice during their egg-incubation stage to quantify each male's level of seasonal plasticity. This procedure was applied during six consecutive years; all males breeding in our plots during those years were assayed, resulting in repeated measures of individual reaction norms for any individual breeding in multiple years. We quantified among- and within-individual variation in reaction norm components, allowing us to estimate repeatability of seasonal plasticity. Using social pedigree information, we further partitioned reaction norm components into their additive genetic and permanent environmental counterparts.
4. Cross-year individual repeatability for the intercepts (average aggressiveness) and slopes (level of seasonal plasticity) of the aggressiveness reaction norms were 0.574 and 0.516 respectively. The mean of the posterior distributions suggested modest heritabilities (h² = 0.260 for intercepts; h² = 0.266 for slopes), but these estimates were relatively uncertain. Males behaved more aggressively in areas with higher breeding densities, and became less aggressive and less plastic with increasing age; plasticity thus varied within individuals and was multidimensional in nature.
5. This empirical study quantified cross-year individual repeatability, heritability and age-related reversible plasticity in behaviour. Acknowledging such patterns of multi-level variation is important not only for testing behavioural ecology theory concerning the evolution of repeatable differences in behavioural plasticity but also for predicting how reversible plasticity may evolve in natural populations.
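For orientation, repeatability is conventionally defined as the proportion of total variance that lies among individuals. The expression below is the standard general definition, not a formula quoted from the study. Reading it for cross-year repeatability of a reaction norm component is an assumption about how the partitioning applies here: the among term would be the among-individual variance in that component (intercept or slope), and the within term its variance among years within individuals.

```latex
R = \frac{V_{\mathrm{among}}}{V_{\mathrm{among}} + V_{\mathrm{within}}}
```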
Ecologists and evolutionary biologists routinely estimate selection gradients. Most researchers seek to quantify selection on individual phenotypes, regardless of whether fixed or repeatedly expressed traits are studied. Selection gradients estimated to address such questions are attenuated unless analyses account for measurement error and biological sources of within-individual variation. Estimates of standardized selection gradients published in Evolution between 2010 and 2019 were primarily based on traits measured once (59% of 325 estimates). We show that those are attenuated: bias increases with decreasing repeatability but differently for linear versus nonlinear gradients. Others derived individual-mean trait values prior to analyses (41%), typically using few repeats per individual, which does not remove bias. We evaluated three solutions, all requiring repeated measures: (i) correcting gradients derived from classic models using estimates of trait correlations and repeatabilities, (ii) multivariate mixed-effects models, previously used for estimating linear gradients (seven estimates, 2%), which we expand to nonlinear analyses, and (iii) errors-in-variables models that account for within-individual variance, and are rarely used in selection studies. All approaches produced accurate estimates regardless of repeatability and type of gradient; however, errors-in-variables models produced more precise estimates and may thus be preferable.
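The attenuation described above can be sketched for the simplest case: a single trait, a linear unstandardized gradient, and one measurement per individual. Under classical measurement-error theory the regression slope of fitness on the single measurement is expected to equal the true slope multiplied by the trait's repeatability, so dividing by repeatability recovers it. The code is an illustration only: all parameter values and the function name `attenuated_gradient` are assumptions, repeatability is treated as known rather than estimated from repeated measures, and the nonlinear and multivariate cases evaluated in the study are not covered.

```python
import numpy as np


def attenuated_gradient(beta=0.5, repeatability=0.4, n=20000, seed=7):
    """Return (naive, corrected) linear gradients from single measurements.

    Fitness depends on each individual's true mean trait value z, but the
    analyst only observes one noisy measurement x per individual.
    """
    rng = np.random.default_rng(seed)
    v_among = 1.0
    # Choose within-individual variance so that
    # v_among / (v_among + v_within) equals the requested repeatability.
    v_within = v_among * (1.0 - repeatability) / repeatability
    z = rng.normal(0.0, np.sqrt(v_among), n)        # true individual means
    x = z + rng.normal(0.0, np.sqrt(v_within), n)   # single observed measurement
    w = 1.0 + beta * z + rng.normal(0.0, 0.5, n)    # fitness
    naive = np.cov(x, w)[0, 1] / np.var(x, ddof=1)  # OLS slope of w on x
    corrected = naive / repeatability               # disattenuated slope
    return naive, corrected


naive, corrected = attenuated_gradient()
print(f"naive = {naive:.3f}, corrected = {corrected:.3f}")
```

With the assumed values (true gradient 0.5, repeatability 0.4), the naive slope should fall near 0.2 and the corrected slope near 0.5.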
In the context of social evolution, the ecological drivers of selection are the phenotypes of other individuals. The social environment can thus evolve, potentially changing the adaptive value for different social strategies. Different branches of evolutionary biology have traditionally focused on different aspects of these feedbacks. Here, we synthesize behavioral ecology theory concerning evolutionarily stable strategies when fitness is frequency dependent with quantitative genetic models providing statistical descriptions of evolutionary responses to social selection. Using path analyses, we review how social interactions influence the strength of selection and how social responsiveness, social impact, and non-random social assortment affect responses to social selection. We then detail how the frequency-dependent nature of social interactions fits into this framework and how it imposes selection on traits mediating social responsiveness, social impact, and social assortment, further affecting evolutionary dynamics. Throughout, we discuss the parameters in quantitative genetics models of social evolution from a behavioral ecology perspective and identify their statistical counterparts in empirical studies. This integration of behavioral ecology and quantitative genetic perspectives should lead to greater clarity in the generation of hypotheses and more focused empirical research regarding evolutionary pathways and feedbacks inherent in specific social interactions.
Adaptive integration of life history and behaviour is expected to result in variation in the pace‐of‐life. Previous work focused on whether ‘risky’ phenotypes live fast but die young, but reported conflicting support. We posit that individuals exhibiting risky phenotypes may alternatively invest heavily in early‐life reproduction but consequently suffer greater reproductive senescence.
We used a 7‐year longitudinal dataset with >1,200 breeding records of >800 female great tits assayed annually for exploratory behaviour to test whether within‐individual age dependency of reproduction varied with exploratory behaviour. We controlled for biasing effects of selective (dis)appearance and within‐individual behavioural plasticity.
Slower and faster explorers produced moderate‐sized clutches when young; faster explorers subsequently showed an increase in clutch size that diminished with age (with moderate support for declines when old), whereas slower explorers produced moderate‐sized clutches throughout their lives. There was some evidence that the same pattern characterized annual fledgling success; if so, unpredictable environmental effects diluted personality‐related differences in this downstream reproductive trait.
Support for age‐related selective appearance was apparent, but only when failing to appreciate within‐individual plasticity in reproduction and behaviour.
Our study identifies within‐individual age‐dependent reproduction, and reproductive senescence, as key components of life‐history strategies that vary between individuals differing in risky behaviour. Future research should thus incorporate age‐dependent reproduction in pace‐of‐life studies.
Great tits differ in patterns of age‐dependent reproduction as a component of pace‐of‐life. Slow explorers produce stable clutch sizes throughout their reproductive lives. Fast explorers by contrast show age‐related increases followed by reproductive senescence. Age‐related reproduction therefore constitutes a key component of personality‐related life‐history variation.
Estimating the genetic variation underpinning a trait is crucial to understanding and predicting its evolution. A key statistical tool to estimate this variation is the animal model. Typically, the environment is modelled as an external variable independent of the organism, affecting the focal phenotypic trait via phenotypic plasticity. We studied what happens if the environment is not independent of the organism because it chooses or adjusts its environment, potentially creating non‐zero genotype–environment correlations.
We simulated a set of biological scenarios assuming the presence or absence of a genetic basis for a focal phenotypic trait and/or the focal environment (treated as an extended phenotype), as well as phenotypic plasticity (the effect of the environment on the phenotypic trait) and/or ‘environmental plasticity’ (the effect of the phenotypic trait on the local environment). We then estimated the additive genetic variance of the phenotypic trait and/or the environment by applying five animal models which differed in which variables were fitted as the dependent variable and which covariates were included.
We show that animal models can estimate the additive genetic variance of the local environment (i.e. the extended phenotype) and can detect environmental plasticity. We show that when the focal environment has a genetic basis, the additive genetic variance of a phenotypic trait increases if there is phenotypic plasticity. We also show that phenotypic plasticity can be mistakenly inferred to exist when it is actually absent and instead environmental plasticity is present. When the causal relationship between the phenotype and the environment is misunderstood, it can lead to severe misinterpretation of the genetic parameters, including finding ‘phantom’ genetic variation for traits that, in reality, have none. We also demonstrate how using bivariate models can partly alleviate these issues. Finally, we provide the mathematical equations describing the expected estimated values.
This study highlights that not taking gene–environment correlations into account can lead to erroneous interpretations of additive genetic variation and phenotypic plasticity estimates. If we aim to understand and predict how organisms adapt to environmental change, we need a better understanding of the mechanisms that may lead to gene–environment correlations.
Summary
Estimating the genetic variation of a trait is crucial to understanding and predicting its evolution. A key tool for estimating it is the animal model. Normally, the environment is modelled as a variable external to and independent of the organism that affects the phenotypic trait via phenotypic plasticity. Here we study what happens if the environment is not independent of the organism because the organism chooses or adjusts its environment, potentially generating genotype–environment correlations.
We simulated a set of biological scenarios assuming the presence or absence of a genetic basis for the phenotypic trait and/or the focal environment (treated as an extended phenotype), together with phenotypic plasticity (the effect of the environment on the phenotypic trait) and/or ‘environmental plasticity’ (the effect of the phenotypic trait on the local environment). We then estimated the additive genetic variance of the phenotypic trait and/or the local environment by applying five animal models that differed in which variables were fitted as the dependent variable and which covariates were included in each model.
We show that animal models can estimate the additive genetic variance of the local environment (i.e. the extended phenotype) and can estimate environmental plasticity. We also show that when the focal environment has a genetic basis, the additive genetic variance of the phenotypic trait can increase if phenotypic plasticity is present. Furthermore, phenotypic plasticity can be erroneously inferred when it is environmental plasticity that is present. Misunderstanding the causal relationship between phenotype and environment can lead to severe misinterpretation of the genetic parameters, including the identification of ‘phantom’ genetic variation for traits that in reality have none. In addition, we demonstrate how the use of bivariate models can partly alleviate these problems. Finally, we provide mathematical equations describing the expected estimates in each case.
This study highlights that not taking gene–environment correlations into account can lead to erroneous interpretation of estimates of genetic variation and phenotypic plasticity. If we aim to understand and predict how organisms adapt to environmental change, we need better knowledge of the mechanisms that could produce such gene–environment correlations.
Understanding how environmental variation affects phenotypic evolution requires models based on ecologically realistic assumptions that include variation in population size and specific mechanisms by which environmental fluctuations affect selection. Here we generalize quantitative genetic theory for environmentally induced stochastic selection to include general forms of frequency- and density-dependent selection. We show how the relevant fitness measure under stochastic selection relates to Fisher’s fundamental theorem of natural selection, and present a general class of models in which density regulation acts through total use of resources rather than just population size. In this model, there is a constant adaptive topography for expected evolution, and the function maximized in the long run is the expected factor restricting population growth. This allows us to generalize several previous results and to explain why apparently “K-selected” species with slow life histories often have low carrying capacities. Our joint analysis of density- and frequency-dependent selection reveals more clearly the relationship between population dynamics and phenotypic evolution, enabling a broader range of eco-evolutionary analyses of some of the most interesting problems in evolution in the face of environmental variation.
Body size plays a key role in the ecology and evolution of all organisms. Therefore, quantifying the sources of morphological (co)variation, dependent and independent of body size, is of key importance when trying to understand and predict responses to selection. We combine structural equation modeling with quantitative genetics analyses to study morphological (co)variation in a meta-population of house sparrows (Passer domesticus). As expected, we found evidence of a latent variable “body size,” causing genetic and environmental covariation between morphological traits. Estimates of conditional evolvability show that allometric relationships constrain the independent evolution of house sparrow morphology. We also found spatial differences in general body size and its allometric relationships. On islands where birds are more dispersive and mobile, individuals were smaller and had proportionally longer wings for their body size. By contrast, on islands where sparrows are more sedentary and nest in dense colonies, individuals were larger and had proportionally longer tarsi for their body size. We corroborated these results using simulations and show that our analyses produce unbiased allometric slope estimates. This study highlights that in the short term allometric relationships may constrain phenotypic evolution, but that in the long term selection pressures can also shape allometric relationships.
Animal ecologists often collect hierarchically structured data and analyse these with linear mixed‐effects models. Specific complications arise when the effect sizes of covariates vary on multiple levels (e.g. within vs. among subjects). Mean centring of covariates within subjects offers a useful approach in such situations, but is not without problems.
A statistical model represents a hypothesis about the underlying biological process. Mean centring within clusters assumes that the lower level responses (e.g. within subjects) depend on the deviation from the subject mean (relative) rather than on the absolute scale of the covariate. This may or may not be biologically realistic. We show that mismatch between the nature of the generating (i.e. biological) process and the form of the statistical analysis produces major conceptual and operational challenges for empiricists.
We explored the consequences of mismatches by simulating data with three response‐generating processes differing in the source of correlation between a covariate and the response. These data were then analysed by three different analysis equations. We asked how robustly different analysis equations estimate key parameters of interest and under which circumstances biases arise.
Mismatches between generating and analytical equations created several intractable problems for estimating key parameters. The most widely misestimated parameter was the among‐subject variance in response. We found that no single analysis equation was robust in estimating all parameters generated by all equations. Importantly, even when response‐generating and analysis equations matched mathematically, bias in some parameters arose when sampling across the range of the covariate was limited.
Our results have general implications for how we collect and analyse data. They also remind us more generally that conclusions from statistical analysis of data are conditional on a hypothesis, sometimes implicit, for the process(es) that generated the attributes we measure. We discuss strategies for real data analysis in face of uncertainty about the underlying biological process.
Many biological processes produce hierarchical ecological data, such as different ways temperature may affect activity within and among individual green iguanas. Simulations were used to investigate how well statistical models using mean centring perform depending on whether they matched the underlying process. A variety of problems were found, including some arising from sampling even when models match. Potential solutions involve better integration of statistics and biology. Photo by D. F. Westneat.
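Within‐subject mean centring, as discussed above, splits a covariate into a subject mean (among‐subject component) and a deviation from that mean (within‐subject component). The sketch below shows the decomposition and a case in which the analysis matches the generating process. It rests on stated assumptions: the response is simulated to depend on the observed subject means and deviations (a ‘relative’ process), ordinary least squares without a random intercept stands in for a mixed‐effects model, and the sample sizes, slopes and the function name `within_subject_centre` are hypothetical.

```python
import numpy as np


def within_subject_centre(x, subject):
    """Split covariate x into subject means and within-subject deviations."""
    _, idx = np.unique(subject, return_inverse=True)
    subject_means = np.bincount(idx, weights=x) / np.bincount(idx)
    x_among = subject_means[idx]   # each row gets its subject's mean
    x_within = x - x_among         # deviation from the subject mean
    return x_among, x_within


rng = np.random.default_rng(3)
n_subj, n_obs = 300, 6
subject = np.repeat(np.arange(n_subj), n_obs)

# Covariate varies both among subjects and within subjects.
x = rng.normal(0.0, 1.0, n_subj)[subject] + rng.normal(0.0, 1.0, n_subj * n_obs)
x_among, x_within = within_subject_centre(x, subject)

# Assumed generating process: different slopes at the two levels.
beta_among, beta_within = 1.0, -0.5
y = beta_among * x_among + beta_within * x_within + rng.normal(0.0, 0.5, x.size)

# Matching analysis: regress y on both components (OLS, no random intercept).
X = np.column_stack([np.ones_like(x), x_among, x_within])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"among-subject slope = {coef[1]:.2f}, within-subject slope = {coef[2]:.2f}")
```

Because the generating and analysis equations agree here, both slopes should be recovered closely. The mismatched cases examined in the study would require simulating the response from the absolute covariate or from a different source of covariate–response correlation.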