In spite of the importance of land ecosystems in offsetting carbon dioxide
emissions released by anthropogenic activities into the atmosphere, the
spatiotemporal dynamics of terrestrial carbon fluxes remain largely
uncertain at regional to global scales. Over the past decade, data
assimilation (DA) techniques have grown in importance for improving these
fluxes simulated by terrestrial biosphere models (TBMs), by optimizing model
parameter values while also pinpointing possible parameterization
deficiencies. Although the joint assimilation of multiple data streams is
expected to constrain a wider range of model processes, its actual
benefits in terms of reduction in model uncertainty are still
under-researched, partly because of the technical challenges. In this study, we
investigated with a consistent DA framework and the ORCHIDEE-LMDz
TBM–atmosphere model how the assimilation of different combinations of data
streams may result in different regional to global carbon budgets. To do so,
we performed comprehensive DA experiments where three datasets (in situ measurements
of net carbon exchange and latent heat fluxes, spaceborne estimates of the
normalized difference vegetation index, and atmospheric CO2
concentration data measured at stations) were assimilated alone or
simultaneously. We thus evaluated their complementarity and usefulness to
constrain net and gross C land fluxes. We found that a major challenge in
improving the spatial distribution of the land C sinks and sources with
atmospheric CO2 data relates to the correction of the soil carbon
imbalance.
Recent successes in passive remote sensing of far-red solar-induced chlorophyll fluorescence (SIF) have spurred the development and integration of
canopy-level fluorescence models in global terrestrial biosphere models (TBMs) for climate and carbon cycle research. The interaction of fluorescence
with photochemistry at the leaf and canopy scales provides opportunities to diagnose and constrain model simulations of photosynthesis and related
processes, through direct comparison to and assimilation of tower, airborne, and satellite data. TBMs describe key processes related to the absorption of
sunlight, leaf-level fluorescence emission, scattering, and reabsorption throughout the canopy. Here, we analyze simulations from an ensemble of
process-based TBM–SIF models (SiB3 – Simple Biosphere Model, SiB4, CLM4.5 – Community Land Model, CLM5.0, BETHY – Biosphere Energy Transfer Hydrology, ORCHIDEE – Organizing Carbon and Hydrology In Dynamic Ecosystems, and BEPS – Boreal Ecosystems Productivity Simulator) and the SCOPE (Soil Canopy Observation Photosynthesis Energy) canopy radiation and vegetation model at a subalpine
evergreen needleleaf forest near Niwot Ridge, Colorado. These models are forced with local meteorology and analyzed against tower-based continuous
far-red SIF and gross primary productivity (GPP) partitioned from eddy covariance data at diurnal and synoptic scales during the growing season
(July–August 2017). Our primary objective is to summarize the site-level state of the art in TBM–SIF modeling over a relatively short time period
(summer) when light, canopy structure, and pigments are similar, setting the stage for regional- to global-scale analyses. We find that these models
are generally well constrained in simulating photosynthetic yield but show strongly divergent patterns in the simulation of absorbed photosynthetically
active radiation (PAR), absolute GPP and fluorescence, quantum yields, and light response at the leaf and canopy scales. This study highlights the need for
mechanistic modeling of nonphotochemical quenching in stressed and unstressed environments and for improved representation of light absorption (APAR),
distribution of light across sunlit and shaded leaves, and radiative transfer from the leaf to the canopy scale.
Data assimilation (DA) methods provide a rigorous statistical framework for constraining parametric uncertainty in land surface models (LSMs), which in turn helps to improve their predictive capability and to identify areas in which the representation of physical processes is inadequate. The increase in the number of available datasets in recent years allows us to address different aspects of the model at a variety of spatial and temporal scales. However, combining data streams in a DA system is not a trivial task. In this study we highlight some of the challenges surrounding multiple data stream assimilation for the carbon cycle component of LSMs. We give particular consideration to the assumptions associated with the types of inversion algorithm that are typically used when optimising global LSMs, namely Gaussian error distributions and linearity in the model dynamics. We explore the effect of biases and inconsistencies between the observations and the model (resulting in non-Gaussian error distributions), and we examine the difference between a simultaneous assimilation (in which all data streams are included in one optimisation) and a step-wise approach (in which each data stream is assimilated sequentially) in the presence of non-linear model dynamics. In addition, we perform a preliminary investigation into the impact of correlated errors between two data streams for two cases: when the correlated observation errors are included in the prior observation error covariance matrix, and when the correlated errors are ignored. We demonstrate these challenges by assimilating synthetic observations into two simple models: the first a simplified version of the carbon cycle processes represented in many LSMs and the second a non-linear toy model. Finally, we provide some perspectives and advice to other land surface modellers wishing to use multiple data streams to constrain their model parameters.
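In the special case where the model is linear and all errors are Gaussian and independent, the simultaneous and step-wise approaches give the same posterior, which is why departures from these assumptions matter. The following is a minimal Python sketch of that baseline case. It assumes a single scalar parameter, and the observation values, sensitivities, and error variances are hypothetical and do not come from the study.

```python
# Illustrative sketch: scalar parameter x, linear observation model y = h * x + noise,
# independent Gaussian errors. All numbers below are hypothetical.

def gaussian_update(prior_mean, prior_var, obs, h, obs_var):
    """Posterior mean and variance of x after assimilating one observation."""
    post_precision = 1.0 / prior_var + h * h / obs_var
    post_mean = (prior_mean / prior_var + h * obs / obs_var) / post_precision
    return post_mean, 1.0 / post_precision


def simultaneous_update(prior_mean, prior_var, streams):
    """Assimilate all data streams in a single optimisation."""
    precision = 1.0 / prior_var
    weighted_sum = prior_mean / prior_var
    for obs, h, obs_var in streams:
        precision += h * h / obs_var
        weighted_sum += h * obs / obs_var
    return weighted_sum / precision, 1.0 / precision


def stepwise_update(prior_mean, prior_var, streams):
    """Assimilate each stream in turn; the posterior becomes the next prior."""
    mean, var = prior_mean, prior_var
    for obs, h, obs_var in streams:
        mean, var = gaussian_update(mean, var, obs, h, obs_var)
    return mean, var


# Each stream is (observation, sensitivity h, observation error variance).
streams = [(2.1, 1.0, 0.25), (6.5, 3.0, 1.0)]
sim = simultaneous_update(1.0, 4.0, streams)
step = stepwise_update(1.0, 4.0, streams)
print("simultaneous:", sim)
print("step-wise:   ", step)
```

The two results agree to rounding error here because the posterior precision and the precision-weighted mean simply accumulate one stream at a time. With a non-linear model, biased observations, or correlated errors between streams, that equivalence is no longer guaranteed.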
Land surface models (LSMs),
which form the land component of earth system models, rely on numerous processes for
describing carbon, water and energy budgets, often associated with highly uncertain
parameters. Data assimilation (DA) is a useful approach for optimising the most critical
parameters in order to improve model accuracy and refine future climate predictions. In
this study, we compare two different DA methods for optimising the parameters of seven
plant functional types (PFTs) of the ORCHIDEE LSM using daily averaged eddy-covariance
observations of net ecosystem exchange and latent heat flux at 78 sites across the globe.
We perform a technical investigation of two classes of minimisation methods – local
gradient-based (the L-BFGS-B algorithm, limited memory
Broyden–Fletcher–Goldfarb–Shanno algorithm with bound constraints) and global random
search (the genetic algorithm) – by evaluating their relative performance in terms of
the model–data fit and the difference in retrieved parameter values. We examine the
performance of each method for two cases: when optimising parameters at each site
independently (“single-site” approach) and when simultaneously optimising the model at
all sites for a given PFT using a common set of parameters (“multi-site” approach). We
find that for the single-site case the random search algorithm results in lower values of
the cost function (i.e. lower model–data root mean square differences) than the
gradient-based method; the difference between the two methods is smaller for the
multi-site optimisation due to a smoothing of the cost function shape with a greater
number of observations. The spread of the cost function, when performing the same tests
with 16 random first-guess parameters, is much larger with the gradient-based method, due
to the higher likelihood of being trapped in local minima. When using pseudo-observation
tests, the genetic algorithm results in a closer approximation of the true posterior
parameter value than the L-BFGS-B algorithm. We demonstrate the advantages and challenges
of different DA techniques and provide some advice on using them for LSM parameter
optimisation.
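The contrast between a local gradient-based search and a global random search can be illustrated on a toy multimodal cost function. The sketch below is not the ORCHIDEE setup. It swaps in plain bounded gradient descent as a stand-in for L-BFGS-B, uses a very small hand-written genetic algorithm, and minimises a one-dimensional Rastrigin-type function chosen only because it has many local minima. All names and settings are assumptions for illustration.

```python
import math
import random


def cost(x):
    """Toy multimodal cost: global minimum at x = 0, local minima near integers."""
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))


def cost_gradient(x):
    """Analytical derivative of the toy cost."""
    return 2.0 * x + 20.0 * math.pi * math.sin(2.0 * math.pi * x)


def clip(x, lower, upper):
    """Keep a parameter value inside its bounds."""
    return max(lower, min(upper, x))


def gradient_descent(x0, lower, upper, step=1e-3, n_iter=2000):
    """Local search: follow the negative gradient from a first guess."""
    x = x0
    for _ in range(n_iter):
        x = clip(x - step * cost_gradient(x), lower, upper)
    return x


def genetic_search(lower, upper, rng, pop_size=40, n_gen=60, sigma=0.3):
    """Global search: selection, crossover and mutation on a population."""
    population = [rng.uniform(lower, upper) for _ in range(pop_size)]
    for _ in range(n_gen):
        # Keep the better half unchanged (truncation selection with elitism).
        population.sort(key=cost)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover by averaging two parents, then Gaussian mutation.
            child = 0.5 * (a + b) + rng.gauss(0.0, sigma)
            children.append(clip(child, lower, upper))
        population = parents + children
    return min(population, key=cost)


rng = random.Random(0)  # fixed seed so the run is repeatable
x_local = gradient_descent(3.3, -5.0, 5.0)
x_global = genetic_search(-5.0, 5.0, rng)
print("gradient descent:", x_local, cost(x_local))
print("genetic search:  ", x_global, cost(x_global))
```

Started from a first guess of 3.3, the gradient-based search settles in the nearest local minimum, whereas the population-based search samples the whole bounded interval and is far less dependent on where it starts. This mirrors the larger spread of final cost values reported for the gradient-based method.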
Plant activity in semi-arid ecosystems is largely controlled by pulses of precipitation, making them particularly vulnerable to increased aridity that is expected with climate change. Simple bucket-model hydrology schemes in land surface models (LSMs) have had limited ability to accurately capture semi-arid water stores and fluxes. Recent, more complex, LSM hydrology models have not been widely evaluated against semi-arid ecosystem in situ data. We hypothesize that the failure of older LSM versions to represent evapotranspiration, ET, in arid lands is because simple bucket models do not capture realistic fluctuations in upper-layer soil moisture. We therefore predict that including a discretized soil hydrology scheme based on a mechanistic description of moisture diffusion will result in an improvement in model ET when compared to data because the temporal variability of upper-layer soil moisture content better corresponds to that of precipitation inputs. To test this prediction, we compared ORCHIDEE LSM simulations from (1) a simple conceptual 2-layer bucket scheme with fixed hydraulic parameters and (2) an 11-layer discretized mechanistic scheme of moisture diffusion in unsaturated soil based on the Richards equation, against daily and monthly soil moisture and ET observations, together with data-derived estimates of transpiration / evapotranspiration, T∕ET, ratios, from six semi-arid grass, shrub, and forest sites in the south-western USA. The 11-layer scheme also has modified calculations of surface runoff, water limitation, and resistance to bare soil evaporation, E, to be compatible with the more complex hydrology configuration. To diagnose remaining discrepancies in the 11-layer model, we tested two further configurations: (i) the addition of a term that captures bare soil evaporation resistance to dry soil; and (ii) reduced bare soil fractional vegetation cover.
We found that the more mechanistic 11-layer model results in a better representation of the daily and monthly ET observations. We show that, as predicted, this is because of improved simulation of soil moisture in the upper layers of soil (top ∼ 10 cm). Some discrepancies between observed and modelled soil moisture and ET may allow us to prioritize future model development and the collection of additional data. Biases in winter and spring soil moisture at the forest sites could be explained by inaccurate soil moisture data during periods of soil freezing and/or underestimated snow forcing data. Although ET is generally well captured by the 11-layer model, modelled T∕ET ratios were generally lower than estimated values across all sites, particularly during the monsoon season. Adding a soil resistance term generally decreased simulated bare soil evaporation, E, and increased soil moisture content, thus increasing transpiration, T, and reducing the negative bias between modelled and estimated monsoon T∕ET ratios. This negative bias could also be accounted for at the low-elevation sites by decreasing the model bare soil fraction, thus increasing the amount of transpiring leaf area. However, adding the bare soil resistance term and decreasing the bare soil fraction both degraded the model fit to ET observations. Furthermore, remaining discrepancies in the timing of the transition from minimum T∕ET ratios during the hot, dry May–June period to high values at the start of the monsoon in July–August may also point towards incorrect modelling of leaf phenology and vegetation growth in response to monsoon rains. We conclude that a discretized soil hydrology scheme and associated developments improve estimates of ET by allowing the modelled upper-layer soil moisture to more closely match the pulse precipitation dynamics of these semi-arid ecosystems; however, the partitioning of T from E is not solved by this modification alone.
Large uncertainties in land surface model (LSM) simulations still arise from inaccurate forcing, poor description of land surface heterogeneity (soil and vegetation properties), incorrect model parameter values and incomplete representation of biogeochemical processes. The recent increase in the number and type of carbon cycle-related observations, including both in situ and remote sensing measurements, has opened a new road to optimize model parameters via robust statistical model–data integration techniques, in order to reduce the uncertainties of simulated carbon fluxes and stocks. In this study we present a carbon cycle data assimilation system that assimilates three major data streams, namely the Moderate Resolution Imaging Spectroradiometer (MODIS)-Normalized Difference Vegetation Index (NDVI) observations of vegetation activity, net ecosystem exchange (NEE) and latent heat (LE) flux measurements at more than 70 sites (FLUXNET), as well as atmospheric CO2 concentrations at 53 surface stations, in order to optimize the main parameters (around 180 parameters in total) of the Organizing Carbon and Hydrology in Dynamic Ecosystems (ORCHIDEE) LSM (version 1.9.5 used for the Coupled Model Intercomparison Project Phase 5 (CMIP5) simulations). The system relies on a stepwise approach that assimilates each data stream in turn, propagating the information gained on the parameters from one step to the next. Overall, the ORCHIDEE model is able to achieve a consistent fit to all three data streams, which suggests that current LSMs have reached the level of development to assimilate these observations. The assimilation of MODIS-NDVI (step 1) reduced the growing season length in ORCHIDEE for temperate and boreal ecosystems, thus decreasing the global mean annual gross primary production (GPP). Using FLUXNET data (step 2) led to large improvements in the seasonal cycle of the NEE and LE fluxes for all ecosystems (i.e., increased amplitude for temperate ecosystems).
The assimilation of atmospheric CO2, using the general circulation model (GCM) of the Laboratoire de Météorologie Dynamique (LMDz; step 3), provides an overall constraint (i.e., constraint on large-scale net CO2 fluxes), resulting in an improvement of the fit to the observed atmospheric CO2 growth rate. Thus, the optimized model predicts a land C (carbon) sink of around 2.2 PgC yr−1 (for the 2000–2009 period), which is more compatible with current estimates from the Global Carbon Project (GCP) than the prior value. The consistency of the stepwise approach is evaluated with back-compatibility checks. The final optimized model (after step 3) does not significantly degrade the fit to MODIS-NDVI and FLUXNET data that were assimilated in the first two steps, suggesting that a stepwise approach can be used instead of the more “challenging” implementation of a simultaneous optimization in which all data streams are assimilated together. Most parameters, including the scalar of the initial soil carbon pool size, changed during the optimization with a large error reduction. This work opens new perspectives for better predictions of the land carbon budgets.
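One common way to write a stepwise scheme of this kind, under Gaussian error assumptions, is as a sequence of cost functions in which the estimate from the previous step serves as the prior for the next. The notation below is an assumption introduced for illustration and is not taken from the paper: $\mathbf{x}$ is the parameter vector, $\mathbf{y}_k$ the observations of data stream $k$, $H_k$ the model operator that maps parameters to those observations, $\mathbf{R}_k$ the observation error covariance, and $\mathbf{x}_{k-1}$, $\mathbf{P}_{k-1}$ the parameter estimate and its error covariance after step $k-1$ (with $\mathbf{x}_0$, $\mathbf{P}_0$ the initial prior).

```latex
J_k(\mathbf{x}) = \frac{1}{2}\Big[
  \big(H_k(\mathbf{x}) - \mathbf{y}_k\big)^{\mathsf{T}} \mathbf{R}_k^{-1} \big(H_k(\mathbf{x}) - \mathbf{y}_k\big)
  + \big(\mathbf{x} - \mathbf{x}_{k-1}\big)^{\mathsf{T}} \mathbf{P}_{k-1}^{-1} \big(\mathbf{x} - \mathbf{x}_{k-1}\big)
\Big], \qquad k = 1, 2, 3
```

Minimizing $J_k$ gives $\mathbf{x}_k$. In this sketch, $k = 1, 2, 3$ would correspond to the MODIS-NDVI, FLUXNET, and atmospheric CO2 steps. The first term measures the misfit to the current data stream and the second penalizes departures from what the earlier steps already learned.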
•We used biomass with eddy covariance fluxes to optimize the process-based model ORCHIDEE.
•Use of aboveground biomass increment allows optimization of C allocation parameters.
•Using total aboveground biomass requires representation of management in the model.
Biomass as a resource, and as a vulnerable carbon pool, is a key variable to diagnose the impacts of global changes on the terrestrial biosphere, and therefore its proper description in models is crucial. Model-Data Fusion (MDF) or data assimilation methods are useful tools in improving ecosystem models that describe interactions between vegetation and atmosphere. We use an MDF method based on a Bayesian approach, in which data are combined with a process model in order to provide optimized estimates of model parameters and to better quantify model uncertainties, whilst taking into account prior information on the parameters. With this method we are able to use multiple data streams, which allows us to simultaneously constrain modeled variables at site level across different temporal scales. In this study both high frequency eddy covariance flux measurements of net CO2 and evapotranspiration (ET), and low frequency biometric measurements of total aboveground biomass and the annual increment (which includes all compartments), are assimilated with the ORCHIDEE model version “AR5” at a beech (Hesse) and a maritime pine (Le Bray) forest site using four to five years of flux data and nine years of biomass data. When assimilating the observed aboveground annual biomass increment (AGB_inc) together with net CO2 and ET flux, the RMSE of modelled AGB_inc was reduced from the a priori estimates by 37% at Hesse and 69% at Le Bray, without reducing the fit to the net CO2 and ET that can be achieved when assimilating flux data alone. Assimilating biomass increment data also provides insight into the performance of the allocation scheme of the model. Comparison with detailed site-based measurements at Hesse showed that the optimization reduced positive biases in the model, for example in fine root and leaf production. We also investigated how to use stand-scale total aboveground biomass in optimization (AGB_tot).
However, this study demonstrated that assimilating AGB_tot measurements in the ORCHIDEE-AR5 model led to some inconsistencies, particularly for the annual dynamics of the AGB_inc, partly because this version of the model lacked a realistic representation of forest stand processes including management and disturbances.
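The percentage RMSE reductions quoted above compare the model–data misfit before and after optimization. The short Python sketch below shows how such a figure can be computed. The biomass-increment values are invented purely for illustration and are not data from Hesse or Le Bray, and the function names are hypothetical.

```python
import math


def rmse(model, obs):
    """Root mean square error between modelled and observed values."""
    if len(model) != len(obs):
        raise ValueError("model and obs must have the same length")
    squared = [(m - o) ** 2 for m, o in zip(model, obs)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_reduction_percent(prior, posterior, obs):
    """Relative RMSE reduction (in %) from the prior to the optimized run."""
    return 100.0 * (1.0 - rmse(posterior, obs) / rmse(prior, obs))


# Hypothetical annual biomass increments (arbitrary units), not study data.
observed = [100.0, 120.0, 110.0]
prior_run = [140.0, 160.0, 150.0]      # prior model overestimates by 40
optimized_run = [120.0, 140.0, 130.0]  # optimized model overestimates by 20

reduction = rmse_reduction_percent(prior_run, optimized_run, observed)
print(f"RMSE reduction: {reduction:.1f}%")
```

With these invented numbers the prior RMSE is 40 and the optimized RMSE is 20, giving a 50% reduction. The same calculation applied per data stream shows whether improving the fit to one variable has degraded the fit to another.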
How carbon (C) is allocated to different plant tissues (leaves, stem, and roots) determines how long C remains in plant biomass and thus remains a central challenge for understanding the global C cycle. We used a diverse set of observations (AmeriFlux eddy covariance tower observations, biomass estimates from tree-ring data, and leaf area index (LAI) measurements) to compare C fluxes, pools, and LAI data with those predicted by a land surface model (LSM), the Community Land Model (CLM4.5). We ran CLM4.5 for nine temperate (including evergreen and deciduous) forests in North America between 1980 and 2013 using four different C allocation schemes: i. a dynamic C allocation scheme (named "D-CLM4.5") with one dynamic allometric parameter, which allows the allocation of C to the stem and leaves to vary in time as a function of annual net primary production (NPP); ii. an alternative dynamic C allocation scheme (named "D-Litton"), where, similar to (i), C allocation is a dynamic function of annual NPP, but unlike (i) includes two dynamic allometric parameters involving allocation to leaves, stem, and coarse roots; iii.–iv. a fixed C allocation scheme with two variants, one representative of observations in evergreen (named "F-Evergreen") and the other of observations in deciduous forests (named "F-Deciduous"). D-CLM4.5 generally overestimated gross primary production (GPP) and ecosystem respiration, and underestimated net ecosystem exchange (NEE). In D-CLM4.5, initial aboveground biomass in 1980 was largely overestimated (between 10 527 and 12 897 g C m−2) for deciduous forests, whereas aboveground biomass accumulation through time (between 1980 and 2011) was highly underestimated (between 1222 and 7557 g C m−2) for both evergreen and deciduous sites due to a lower stem turnover rate in the sites than the one used in the model.
D-CLM4.5 overestimated LAI in both evergreen and deciduous sites because the leaf C–LAI relationship in the model did not match the observed leaf C–LAI relationship at our sites. Although the four C allocation schemes gave similar results for aggregated C fluxes, they translated to important differences in long-term aboveground biomass accumulation and aboveground NPP. For deciduous forests, D-Litton gave more realistic Cstem ∕ Cleaf ratios and strongly reduced the overestimation of initial aboveground biomass and aboveground NPP by D-CLM4.5. We identified key structural and parameterization deficits that need refinement to improve the accuracy of LSMs in the near future. These include changing how C is allocated in fixed and dynamic schemes based on data from current forest syntheses and different parameterization of allocation schemes for different forest types. Our results highlight the utility of using measurements of aboveground biomass to evaluate and constrain the C allocation scheme in LSMs, and suggest that stem turnover is overestimated by CLM4.5 for these AmeriFlux sites. Understanding the controls of turnover will be critical to improving long-term C processes in LSMs.
Large uncertainties in land surface model (LSM) simulations still arise from inaccurate forcing, incorrect model parameter values and incomplete representation of biogeochemical processes. The recent increase in the number and type of carbon-cycle-related observations, including both in situ and remote sensing measurements, has opened a new road to optimize model parameters via robust statistical model-data integration techniques, in order to reduce the uncertainties of simulated carbon fluxes and stocks. In this study we present a Carbon Cycle Data Assimilation System (CCDAS) that assimilates three major data streams, namely MODIS-NDVI observations of vegetation activity, net ecosystem exchange (NEE) and latent heat (LE) flux measurements at more than 70 sites (FLUXNET), and atmospheric CO2 concentrations at 53 surface stations, in order to optimize the main parameters of the ORCHIDEE LSM (around 180 parameters in total). The system relies on a step-wise approach that assimilates each data stream in turn, propagating the information gained on the parameters from one step to the next. Overall, the ORCHIDEE model is able to achieve a consistent fit to all three data streams, which suggests that current LSMs have reached the level of development to assimilate these observations. The assimilation of MODIS-NDVI (step 1) reduced the growing season length in ORCHIDEE for temperate and boreal ecosystems, thus decreasing the global mean annual gross primary production (GPP). Using FLUXNET data (step 2) led to large improvements in the seasonal cycle of the NEE and LE fluxes for all ecosystems (i.e., increased amplitude for temperate ecosystems). The assimilation of atmospheric CO2, using the atmospheric transport model LMDz (step 3), provides an overall constraint (i.e., constraint on large-scale net CO2 fluxes), resulting in an improvement of the fit to the observed atmospheric CO2 growth rate.
Thus the optimized model predicts a land C sink of around 2.2 PgC yr−1 (for the 2000–2009 period), which is more compatible with current estimates from the Global Carbon Project (GCP) than the prior value. The consistency of the step-wise approach is evaluated with back-compatibility checks. The final optimized model (after step 3) does not significantly degrade the fit to MODIS-NDVI and FLUXNET data that were assimilated in the first two steps, suggesting that a step-wise approach can be used instead of the more “challenging” implementation of a simultaneous optimization in which all data streams are assimilated together. Most parameters, including the scalar of the initial soil carbon pool size, changed during the optimization with a large error reduction. This work opens new perspectives for better predictions of the land carbon budgets.