The past decade has seen significant progress in characterizing uncertainty in environmental systems models, through statistical treatment of incomplete knowledge regarding parameters, model structure, and observational data. Attention has now turned to the issue of model structural adequacy (MSA, a term we prefer over model structure “error”). In reviewing philosophical perspectives from the groundwater, unsaturated zone, terrestrial hydrometeorology, and surface water communities about how to model the terrestrial hydrosphere, we identify several areas where different subcommunities can learn from each other. In this paper, we (a) propose a consistent and systematic “unifying conceptual framework” consisting of five formal steps for comprehensive assessment of MSA; (b) discuss the need for a pluralistic definition of adequacy; (c) investigate how MSA has been addressed in the literature; and (d) identify four important issues that require detailed attention—structured model evaluation, diagnosis of epistemic cause, attention to appropriate model complexity, and a multihypothesis approach to inference. We believe that there exists tremendous scope to collectively improve the scientific fidelity of our models and that the proposed framework can help to overcome barriers to communication. By doing so, we can make better progress toward addressing the question “How can we use data to detect, characterize, and resolve model structural inadequacies?”
Key Points
Model building comprises five important formal steps
These remain poorly understood, and methods for dealing with them remain ad hoc
Progress requires a common perspective on epistemic problems of model adequacy
In this commentary we suggest that hydrologists and land‐surface modelers may be unnecessarily constraining the behavioral agility of very complex physics‐based models. We argue that the relatively poor performance of such models can occur due to restrictions on their ability to refine their portrayal of physical processes, in part because of strong a priori constraints in: (i) the representation of spatial variability and hydrologic connectivity, (ii) the choice of model parameterizations, and (iii) the choice of model parameter values. We provide a specific example of problems associated with strong a priori constraints on parameters in a land surface model; a toy illustration of this kind of sensitivity is sketched after the key points below. Moving forward, we assert that improving hydrological models requires integrating the strengths of the “physics‐based” modeling philosophy (which relies on prior knowledge of hydrologic processes) with the strengths of the “conceptual” modeling philosophy (which relies on data‐driven inference). Such integration will accelerate progress on methods to define and discriminate among competing modeling options, which should ideally be incorporated in agile modeling frameworks and tested through a diagnostic evaluation approach.
Key Points
Complex process‐based models have strong a priori constraints
We provide an example demonstrating strong sensitivity to fixed parameters
Relaxing strong a priori constraints can help improve hydrology simulations
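The fixed-parameter sensitivity described above can be illustrated with a deliberately simple toy model. The sketch below is not taken from any actual land surface model; the bucket formulation, the parameter names, and all values are hypothetical. It only shows how perturbing a nominally fixed constant (here a storage capacity, smax) changes simulated behavior.

```python
# Toy one-at-a-time sensitivity test on a "fixed" parameter (hypothetical model).
import numpy as np

rng = np.random.default_rng(0)
precip = rng.gamma(shape=0.3, scale=10.0, size=365)  # mm/day, synthetic forcing

def bucket_model(precip, smax=150.0, k_et=0.02):
    """Toy daily bucket model: storage gains precipitation, loses ET and overflow."""
    s = 0.5 * smax                    # initial storage (mm)
    runoff = np.zeros_like(precip)
    for t, p in enumerate(precip):
        s += p                        # add rainfall
        s -= k_et * s                 # ET proportional to storage
        if s > smax:                  # saturation-excess runoff
            runoff[t] = s - smax
            s = smax
    return runoff

# Perturb the nominally "fixed" storage capacity by +/-20% and compare outputs.
for smax in (120.0, 150.0, 180.0):
    q = bucket_model(precip, smax=smax)
    print(f"smax={smax:6.1f} mm -> annual runoff {q.sum():7.1f} mm")
```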
This work presents a new approach to defining drought by establishing an empirical relationship between historical droughts (and wet spells) documented in impact reports and a broad range of observed climate features, using Random Forest (RF) models. The new drought indicator quantifies the conditional probability of drought, considering multiple drought‐related climate features and their interactive effects, and can be used for forecasting with up to 3‐month lead time. The approach was tested out‐of‐sample across several random selections of training and testing datasets, and demonstrated better predictive capabilities than commonly used drought indicators (e.g., Standardised Precipitation Index and Evaporative Demand Drought Index) across a range of performance metrics. Furthermore, it showed comparable performance to the (expert elicitation‐based) US Drought Monitor (USDM), the current state‐of‐the‐art record of historical drought in the USA. As well as providing an alternative historical drought indicator to the USDM, the RF approach offers additional advantages by being automated, by providing drought information at the grid scale, and by having forecasting capacity. While traditional drought metrics define drought as extreme anomalies in drought‐related variables, the approach presented here reveals the full suite of circumstances that lead to impactful droughts. We highlight several combinations of climate features—such as precipitation, potential evapotranspiration, soil moisture and change in water storage—that led to drought events not detected by commonly used drought metrics. The new RF drought indicator combines meteorological, hydrological, agricultural, and socioeconomic drought, providing drought information for all impacted sectors. As a proof‐of‐concept, the RF drought indicator was trained on Texan climate data and droughts; a minimal sketch of this training workflow follows the key points below.
Key Points
Machine learning is used to establish a relationship between droughts documented in impact reports and a range of observed climate features
The new drought indicator quantifies the conditional probability of drought considering climate features, and can be used for forecasting
Random Forest trained on drought impact data allowed us to identify the full suite of circumstances that lead to impactful droughts
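A minimal sketch of the kind of workflow the abstract describes, assuming a table of drought-related climate features with binary labels derived from impact reports. The feature set and data below are synthetic illustrative placeholders, not the paper's dataset; the indicator is the conditional drought probability from predict_proba, and lagging the features by up to three months would give the forecasting variant.

```python
# Random Forest drought indicator sketch on synthetic grid-cell months.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000                                # grid-cell months (synthetic)
X = np.column_stack([
    rng.gamma(2.0, 30.0, n),            # precipitation (mm)
    rng.gamma(3.0, 40.0, n),            # potential evapotranspiration (mm)
    rng.uniform(0.0, 1.0, n),           # soil moisture fraction
    rng.normal(0.0, 20.0, n),           # change in water storage (mm)
])
# Synthetic stand-in for labels derived from drought impact reports.
y = (X[:, 0] < 40) & (X[:, 2] < 0.3)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)

# Conditional probability of drought given the climate features.
p_drought = rf.predict_proba(X_te)[:, 1]
print(p_drought[:5])
```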
Are Plant Functional Types Fit for Purpose? Cranko Page, Jon; Abramowitz, Gab; De Kauwe, Martin G. Geophysical Research Letters, 16 January 2024, Volume 51, Issue 1. Journal article, peer reviewed, open access.
For over 40 years, Plant Functional Types (PFTs) have been used to discretize the ∼400,000 species of terrestrial plants into “similar” classes. Within Earth System Models (ESMs), PFTs simplify terrestrial biosphere modeling in combination with soil information and other site characteristics. However, in flux analysis studies, PFT schemes are often implemented as the sole analytical lens to clarify complex behavior. This usage assumes that PFTs adequately enable a mapping between climate inputs and flux outputs. Here, we show that random forest models, trained using aggregated climate and flux measurements from 245 eddy‐covariance sites, cannot accurately predict PFT groupings, regardless of the nature of the PFT scheme. Similarly, PFTs provide negligible benefit when using site climate to predict site flux regimes and vice versa. While use of PFT classifications is convenient, our results suggest they do not aid analytical skill, which has important implications for future terrestrial flux studies. A minimal sketch of this predictability test follows the key points below.
Plain Language Summary
To understand how the land surface behaves, we often divide plants into a small number (20 or fewer) of “similar” groups, such as evergreen forests or grasslands, known as Plant Functional Types (PFTs). The idea is that landscapes with similar large‐scale characteristics will behave in the same way. In land surface models, these PFT groups determine how the simulated plants react to the climate in combination with soil information and other characteristics, yet analyses of observations often use PFT groups alone to try to explain variations in results between different experimental sites. We use machine learning to show that while PFTs might be visually compelling, they do not necessarily represent behavioral groupings and might actually hide real‐world behavior if used for analysis. As such, we suggest that future studies instead look at more specific site characteristics when trying to explain analysis results.
Key Points
Plant Functional Types (PFTs), as often used in land flux studies, are not easily empirically associated with site climate and/or flux regimes
A broad selection of alternative vegetation/land cover classifications do not offer greater predictability
The disconnect between PFTs and climate/flux regimes has implications for modeling and analysis of terrestrial systems
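A minimal sketch of the predictability test described above, under the assumption of one aggregated climate summary per flux site: can site climate predict PFT labels better than a majority-class baseline? The arrays here are synthetic stand-ins, not the 245-site eddy-covariance dataset used in the paper, and the climate features are illustrative.

```python
# Can site climate predict PFT labels better than chance? (synthetic sketch)
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_sites = 245
X = rng.normal(size=(n_sites, 4))        # e.g. mean temperature, precipitation,
                                         # shortwave down, aridity (illustrative)
pft = rng.integers(0, 8, size=n_sites)   # synthetic PFT labels, 8 classes

rf = RandomForestClassifier(n_estimators=300, random_state=0)
acc = cross_val_score(rf, X, pft, cv=5).mean()
chance = np.bincount(pft).max() / n_sites  # majority-class baseline
print(f"cross-validated accuracy {acc:.2f} vs. baseline {chance:.2f}")
```

Accuracy near the majority-class baseline, as in the paper's result, indicates the labels carry little empirically recoverable relationship to the predictors.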
Dynamical downscaling (DD) and machine learning (ML) based techniques have been widely applied to downscale global climate models and reanalyses to a finer spatiotemporal scale, but the relative performance of these two methods remains unclear. We implement an ML regression approach using a multi-layer perceptron (MLP) with a novel loss function to downscale coarse-resolution precipitation from the Bureau of Meteorology Atmospheric high-resolution Regional Reanalysis for Australia from grids of 12–48 km to 5 km, using the Australia Gridded Climate Data observations as the target. A separate MLP is developed for each coarse grid to predict the fine grid values within it, by combining coarse-scale time-varying meteorological variables with fine-scale static surface properties as predictors. The resulting predictions (on out-of-sample test periods) are more accurate than DD in capturing the rainfall climatology, as well as the frequency distribution and spatiotemporal variability of daily precipitation, reducing biases in daily extremes by 15%–85% with 12 km prediction fields. When prediction fields are coarsened, the skill of the MLP decreases—at 24 km relative bias increases by ∼10%, and at 48 km it increases by another ∼4%—but skill remains comparable to or, for some metrics, much better than DD. These results show that ML-based downscaling benefits from higher-resolution driving data but can still improve on DD (and at far less computational cost) when downscaling from a global climate model grid of ∼50 km.
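A minimal sketch of the per-coarse-cell regression setup described above, with a standard squared-error loss standing in for the paper's custom loss function. All shapes, predictors, and data are synthetic placeholders: coarse-scale time-varying meteorology is paired with fine-scale static surface properties to predict each 5 km cell inside one coarse cell.

```python
# Per-coarse-cell MLP downscaling sketch with synthetic data.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_days, n_fine = 2000, 16                  # days; fine cells in one coarse cell
coarse_met = rng.normal(size=(n_days, 6))  # time-varying coarse predictors
static = rng.normal(size=(n_fine, 3))      # per-fine-cell elevation, slope, ...

# One sample per (day, fine cell): coarse met repeated, static tiled to match.
X = np.hstack([np.repeat(coarse_met, n_fine, axis=0),
               np.tile(static, (n_days, 1))])
y = rng.gamma(0.3, 5.0, size=n_days * n_fine)  # synthetic fine-scale precip

split = (n_days - 200) * n_fine                # hold out the last 200 days
mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=300, random_state=0)
mlp.fit(X[:split], y[:split])
print("held-out R^2:", mlp.score(X[split:], y[split:]))
```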
This paper addresses the question of how well we should expect a land surface model to perform. A statistically‐based artificial neural network is used as a de facto land surface model, and its results are used to benchmark the performance of a traditional physically‐based land surface model. This provides us with a measure of land surface model performance relative to the information contained in the meteorological forcing about the surface fluxes. Further, it is a benchmark that is independent of the measure of model performance. The technique is used to benchmark three models at three observational sites, with results showing that, for the most part, the models under‐utilise the information available to them. This suggests that there are considerable opportunities for model improvement.
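A minimal sketch of the benchmarking idea, assuming tower-style forcing, observed fluxes, and a physical model's predictions are available as arrays (all synthetic here). An empirical regression fit to the forcing sets a performance bar; a physical model scoring worse than it is leaving forcing information unused.

```python
# Empirical benchmark for a land surface model, with synthetic stand-in data.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
forcing = rng.normal(size=(5000, 4))          # SWdown, Tair, Qair, wind (standardised)
qle_obs = forcing @ np.array([80.0, 10.0, 5.0, 2.0]) + rng.normal(0, 20, 5000)
qle_lsm = qle_obs + rng.normal(15, 30, 5000)  # stand-in for an LSM's predictions

split = 4000                                   # train/test boundary in time
ann = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
ann.fit(forcing[:split], qle_obs[:split])

def rmse(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

print("ANN benchmark RMSE:", rmse(ann.predict(forcing[split:]), qle_obs[split:]))
print("'LSM' RMSE:        ", rmse(qle_lsm[split:], qle_obs[split:]))
# An 'LSM' scoring worse than the ANN under-utilises its forcing information.
```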
This paper presents a set of analytical tools to evaluate the performance of three land surface models (LSMs) that are used in global climate models (GCMs). Predictions of the fluxes of sensible heat, latent heat, and net CO₂ exchange obtained using process-based LSMs are benchmarked against two statistical models that only use incoming solar radiation, air temperature, and specific humidity as inputs to predict the fluxes. Both are then compared to measured fluxes at several flux stations located on three continents. Parameter sets used for the LSMs include default values used in GCMs for the plant functional type and soil type surrounding each flux station, locally calibrated values, and ensemble sets encompassing combinations of parameters within their respective uncertainty ranges. Performance of the LSMs is found to be generally inferior to that of the statistical models across a wide variety of performance metrics, suggesting that the LSMs underutilize the meteorological information in their inputs and that model complexity may be hindering accurate prediction. The authors show that model evaluation is purpose specific; good performance in one metric does not guarantee good performance in others. Self-organizing maps are used to divide meteorological “forcing space” into distinct regions as a mechanism to identify the conditions under which model bias is greatest. These new techniques will help modelers identify the areas of model structure responsible for poor performance.
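A minimal sketch of the forcing-space technique mentioned above, using the third-party minisom package for the self-organizing map; the forcing and bias arrays are synthetic placeholders. Each time step is mapped to its best-matching SOM node (a forcing regime), and mean model bias is reported per regime to locate the conditions where the model fails.

```python
# Partition "forcing space" with a self-organizing map; inspect bias per regime.
import numpy as np
from minisom import MiniSom  # third-party package

rng = np.random.default_rng(2)
forcing = rng.normal(size=(5000, 3))               # SWdown, Tair, Qair (standardised)
bias = forcing[:, 0] * 5 + rng.normal(0, 2, 5000)  # synthetic model-minus-obs error

som = MiniSom(3, 3, input_len=3, sigma=1.0, learning_rate=0.5, random_seed=0)
som.train_random(forcing, num_iteration=5000)

# Assign each time step to its best-matching node, then summarise bias per node.
nodes = np.array([som.winner(x) for x in forcing])
for i in range(3):
    for j in range(3):
        sel = (nodes[:, 0] == i) & (nodes[:, 1] == j)
        print(f"node ({i},{j}): n={sel.sum():4d}, mean bias={bias[sel].mean():6.2f}")
```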
This study examines the subset climate model ensemble size required to reproduce certain statistical characteristics of a full ensemble. The ensemble characteristics examined are the root mean square error, the ensemble mean, and the standard deviation. Subset ensembles are created using measures that consider simulation performance alone or that also include a measure of simulation independence relative to other ensemble members. It is found that the independence measure identifies smaller subset ensembles that retain the desired full-ensemble characteristics than either of the performance-based measures does. It is suggested that model independence be considered when choosing ensemble subsets or creating new ensembles.
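A minimal sketch of the subset-selection comparison, with synthetic model-minus-observation errors and an illustrative score that is not the study's actual measure: performance is member RMSE, and independence is penalized through error correlation with members already chosen.

```python
# Performance-only vs. performance-plus-independence subset selection (synthetic).
import numpy as np

rng = np.random.default_rng(3)
n_models, n_time = 20, 500
shared = rng.normal(size=n_time)                  # common (dependent) error signal
errors = 0.7 * shared + rng.normal(size=(n_models, n_time))  # model-minus-obs

rmse = np.sqrt((errors ** 2).mean(axis=1))        # performance per member
corr = np.corrcoef(errors)                        # pairwise error correlation

def greedy_subset(k, alpha=1.0):
    """Pick k members: low RMSE, penalised by correlation with members chosen."""
    chosen = [int(np.argmin(rmse))]
    while len(chosen) < k:
        penalty = np.abs(corr[:, chosen]).max(axis=1)
        score = rmse + alpha * penalty
        score[chosen] = np.inf                    # never re-pick a member
        chosen.append(int(np.argmin(score)))
    return chosen

print("performance-only pick: ", np.argsort(rmse)[:5].tolist())
print("with independence pick:", greedy_subset(5))
```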
The rationale for using multi-model ensembles in climate change projections and impacts research is often based on the expectation that different models constitute independent estimates; therefore, a range of models allows a better characterisation of the uncertainties in the representation of the climate system than a single model. However, it is known that research groups share literature, ideas for representations of processes, parameterisations, evaluation data sets and even sections of model code. Thus, nominally different models might have similar biases because of similarities in the way they represent a subset of processes, or even be near-duplicates of others, weakening the assumption that they constitute independent estimates. If there are near-replicates of some models, then treating all models equally is likely to bias the inferences made using these ensembles. The challenge is to establish the degree to which this might be true for any given application. While this issue is recognised by many in the community, quantifying and accounting for model dependence in anything other than an ad hoc way is challenging. Here we present a synthesis of the range of disparate attempts to define, quantify and address model dependence in multi-model climate ensembles in a common conceptual framework, and provide guidance on how users can test the efficacy of approaches that move beyond the equally weighted ensemble. In the upcoming Coupled Model Intercomparison Project phase 6 (CMIP6), several new models that are closely related to existing models are anticipated, as well as large ensembles from some models. We argue that quantitatively accounting for dependence in addition to model performance, and thoroughly testing the effectiveness of the approach used, will be key to a sound interpretation of the CMIP ensembles in future scientific studies.
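One way to make the dependence problem concrete is a simple pairwise-error-correlation diagnostic. The sketch below uses synthetic errors and a heuristic "effective number of independent models"; this particular formula is an illustrative choice, not a definition from the paper.

```python
# Heuristic effective ensemble size from pairwise error correlations (synthetic).
import numpy as np

rng = np.random.default_rng(4)
n_models, n_time = 10, 400
shared = rng.normal(size=n_time)
# Members 0-2 are near-duplicates: strongly shared errors; the rest less so.
load = np.array([0.9] * 3 + [0.2] * 7)[:, None]
errors = load * shared + rng.normal(0, 0.5, size=(n_models, n_time))

R = np.corrcoef(errors)                    # pairwise error correlations
n_eff = n_models ** 2 / np.abs(R).sum()    # equals n_models when R is identity
print(f"nominal size {n_models}, effective size ~{n_eff:.1f}")
# Near-duplicates inflate the nominal count without adding independent information.
```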
Large-scale modes of climate variability influence rainfall and soil moisture in southeastern Australia, and extended dry conditions have been associated with a lack of climate mode phases conducive to wetter conditions. However, the role of large-scale climate variability in breaking ongoing soil moisture droughts has not been well quantified, and the utility of large-scale signals in drought recovery assessments has not been explored. Here, we study the influence of the El Niño Southern Oscillation (ENSO) and the Indian Ocean Dipole (IOD) on the probability of soil moisture drought breaking in eastern and southeastern Australia, using logistic regression modelling. A long-term historical dataset from the Australian Water Resources Assessment Landscape (AWRA-L) model is used for the assessment. The probability estimates from the logistic regression modelling validate well against the observed probability of a drought ending. We then use model estimates to understand the probability contributions from different climate modes. We show that there is a seasonal pattern in soil moisture drought breaking probabilities, with higher probabilities in austral summer in eastern Australia and summer/autumn in southeastern Australia. ENSO has the largest influence on probabilities in winter, with extreme opposite phases of the mode resulting in regional average probability differences of 15–26%. The IOD exhibits the largest influence during spring and winter, with opposite phases resulting in differences of about 18%. The method can be used to estimate soil moisture drought breaking probabilities in near real-time during drought events, and may assist decision making by managers engaged in drought risk and water resources planning.
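A minimal sketch of the logistic-regression setup described above, with synthetic stand-ins for the Niño3.4 and DMI index series and for AWRA-L-derived drought-break events; the coefficients and seasonal encoding are illustrative only, not the study's fitted model.

```python
# Logistic regression: climate-mode indices -> probability of drought breaking.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 600
nino34 = rng.normal(size=n)               # ENSO index (negative = La Niña)
dmi = rng.normal(size=n)                  # IOD index (negative = negative IOD)
month = rng.integers(1, 13, size=n)
season = np.sin(2 * np.pi * month / 12)   # crude seasonal-cycle encoding

# Synthetic truth: wet-phase modes and summer raise the break probability.
logit = -0.5 - 0.8 * nino34 - 0.6 * dmi + 0.5 * season
broke = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([nino34, dmi, season])
model = LogisticRegression().fit(X, broke)

# Probability of drought breaking in a strong La Niña / negative-IOD summer:
print(model.predict_proba([[-1.5, -1.0, 1.0]])[:, 1])
```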