Soil available water capacity is an important soil property for land use planning, drought risk assessment, and modelling crop production or carbon cycling. Measurements of soil moisture at several soil water potentials are expensive and time-consuming, hence it is common to estimate available water capacity with pedotransfer functions (PTFs). Available PTFs for France rarely provide uncertainty estimates for the model coefficients and predictions, which are often required for error propagation analysis and modelling ecological processes. The objectives of this study were to develop 1) class-PTFs and 2) continuous-PTFs with associated uncertainties, and 3) to assess the domain of applicability of the PTFs across metropolitan France. We used the SOLHYDRO database for calibrating continuous and class PTFs. For continuous PTFs we used linear regression models with sand, clay, organic carbon (SOC), and bulk density (BD) as predictor variables, whereas class-PTFs were defined by texture class, bulk density, and horizon type (all horizons, topsoils, subsoils). The models were validated with an independent validation dataset. The domain of applicability of the PTFs was evaluated by calculating the Mahalanobis distance between the calibration dataset and horizon data from the French soil monitoring network (RMQS). At independent validation, texture class-PTFs had an RMSE between 0.047 cm3 cm−3 and 0.058 cm3 cm−3 and continuous-PTFs had an RMSE between 0.040 cm3 cm−3 and 0.053 cm3 cm−3. Texture class-PTFs had similar or better predictive performance than texture-structural class-PTFs. The prediction performance of continuous-PTFs improved slightly when including SOC and BD in the models. The variance of the predictions of continuous-PTFs associated with error in the model coefficients was rather small, and increased as the values of the predictor variables moved farther from the centroid of the calibration data.
We provide the PTF users with tools for classifying new samples as within or outside the applicability domain of the PTFs. When applied to the RMQS dataset, >50% of the horizons outside the domain of applicability were located in forests and natural areas or managed pastures. The spatial distribution of RMQS horizons outside the domain of applicability of the PTF can inform future sampling locations for increasing the diversity of the soil properties and site conditions represented by the SOLHYDRO dataset.
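The statement that coefficient-related prediction variance grows away from the centroid of the calibration data can be illustrated with a one-predictor regression. This is a minimal sketch, not the SOLHYDRO models: the data values, the single predictor (clay), and the function names are hypothetical, and the published PTFs use several predictors.

```python
from statistics import mean

def fit_simple_ols(x, y):
    """Fit y = b0 + b1*x by ordinary least squares and keep the terms
    needed to compute the variance of the fitted mean."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in residuals) / (n - 2)  # residual variance
    return {"b0": b0, "b1": b1, "n": n, "x_bar": x_bar, "sxx": sxx, "s2": s2}

def coefficient_variance(fit, x0):
    """Variance of the fitted mean at x0 caused by uncertainty in b0 and b1 only."""
    return fit["s2"] * (1.0 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"])

# Hypothetical calibration data: clay content (%) and water content at pF 2.0 (cm3 cm-3)
clay = [5.0, 12.0, 18.0, 25.0, 33.0, 41.0, 50.0]
theta = [0.14, 0.20, 0.23, 0.29, 0.33, 0.36, 0.42]

fit = fit_simple_ols(clay, theta)
var_centre = coefficient_variance(fit, fit["x_bar"])  # smallest possible value, s2 / n
var_edge = coefficient_variance(fit, 70.0)            # far outside the calibration range
print(var_centre, var_edge)
```

The quadratic term in `coefficient_variance` is what makes the variance increase with distance from the mean of the predictor; with several predictors the same role is played by a quadratic form in the coefficient covariance matrix.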
•We calibrated class and continuous-PTFs for estimating soil moisture at pF = 2.0 and pF = 4.2
•RMSE ranged between 0.040 cm3 cm−3 and 0.058 cm3 cm−3 at independent validation
•The PTFs are suitable for estimating available water capacity in most agricultural French soils
•Users can assess whether their samples are within the applicability domain of the PTFs
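An applicability-domain check based on the Mahalanobis distance can be sketched as follows. The example is restricted to two predictors (sand and clay) so that the covariance matrix can be inverted in closed form; the calibration values are hypothetical, and the chi-square cut-off at probability 0.975 is an assumption made for illustration, since the abstract does not state the threshold used.

```python
import math
from statistics import mean

def covariance_2d(points):
    """Sample mean vector and 2x2 covariance matrix of (x, y) pairs."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def mahalanobis_sq(point, centre, cov):
    """Squared Mahalanobis distance, using the closed-form inverse of a 2x2 matrix."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    return (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det

def chi2_quantile_2df(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - p)

# Hypothetical calibration horizons as (sand %, clay %)
calibration = [(10, 30), (20, 25), (35, 20), (50, 15), (60, 12), (25, 40), (40, 28), (15, 45)]
centre, cov = covariance_2d(calibration)
threshold = chi2_quantile_2df(0.975)  # assumed cut-off, not taken from the study

def in_domain(sample):
    return mahalanobis_sq(sample, centre, cov) <= threshold

print(in_domain((30, 27)), in_domain((5, 5)))
```

A new horizon is flagged as outside the domain when its squared distance to the calibration centroid exceeds the threshold; with more predictors the same logic applies with a general matrix inverse and the matching degrees of freedom.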
Perspectives on data‐driven soil research
Wadoux, Alexandre M. J.‐C.; Román‐Dobarco, Mercedes; McBratney, Alex B.
European Journal of Soil Science, July 2021, Volume 72, Issue 4
Journal Article
Peer reviewed
Soil is a complex system in which biological, chemical and physical interactions take place. The behaviour of these interactions changes in spatial scale from the atomic to the global, and in time. To understand how this system works, soil scientists usually rely on incremental improvements in knowledge, refining theories through hypothesis testing and development using carefully designed experiments. In the last two decades, the primacy of this knowledge construction process has been challenged by the development of large soil databases and algorithms such as machine learning. The data‐driven research approach to soil science, the inference of soil knowledge directly from data by using computational tools and modelling techniques, is becoming more popular. Despite the wide adoption of a data‐driven research approach to soil science, there has been little discussion on how research driven by data instead of hypotheses affects scientific progress. In this paper, we provide an introductory perspective on data‐driven soil research by discussing some of the issues and opportunities of knowledge discovery from soil data. We show that while data‐driven soil research may seem revolutionary for some, soil science has a long history of exploratory efforts to generate knowledge from data. Empirical and factual soil classifications, for example, were data driven. We further discuss, with examples, (i) data, databases and the logic of data storage for data‐driven soil research, (ii) the issues of extreme empiricist claims that arise as a corollary of the increase in the use of computational tools, and (iii) the challenge of formulating a scientific explanation based on patterns observed in the data and data analysis tools.
By considering the epistemic challenges of the data‐driven scientific research in the light of the historical literature, we found that there is a continuity of practices, some being certainly amplified by recent technological changes, but that the core methods of scientific enquiry from data remain essentially unchanged.
Highlights
Historical account of data‐driven soil science research.
Describe data to be used for data‐driven soil science.
Discuss conceptual issues and opportunities for data‐driven soil science.
Investigate the challenge of formulating an explanation from soil data.
•The pedogenon is a conceptual soil taxon.
•Used quantitative variables representing soil-forming factors at a reference time.
•Two-step modelling generated pedogenon classes with meaningful spatial patterns.
•1000 pedogenon classes for NSW suitable for local and regional management.
•Pedogenon maps give support for assessing soil change and soil monitoring surveys.
Soil entities are generally defined based on soil properties, using morphological, genetic, or utilitarian criteria. Alternatively, soil entities could be characterized by groupings of homogeneous soil-forming factors under the assumption that the dominant soil-forming processes occurring over a time period within each group are similar, and therefore develop unique soil entities with similar soil properties. We define the pedogenon as a conceptual soil taxon defined from a set of quantitative state variables that represent the soil-forming factors for a given reference time. The objective of this study was to develop a methodology for mapping pedogenon classes at the time of the European settlement in New South Wales (Australia). This period was chosen as reference because from 1788 onwards the intensification of land use has accelerated the rate of change of soil properties. We implemented a two-step modelling approach with a set of environmental covariates representing the soil-forming factors, including the estimated natural vegetation at 1750. The k-means algorithm was applied to generate pedogenon classes suitable for local management. Then, hierarchical clustering was applied to identify the organization of pedogenons into families or “branches” of higher level taxa. We tested the ability of the pedogenon classes for explaining the variance of stable soil properties (particle size fractions) in the subsoil (30–60 cm depth) with redundancy analysis (RDA). The results indicated that between 800 and 1000 pedogenon classes provide the desired level of detail for both local and regional management across New South Wales. The influence of the pre-1750 vegetation types (e.g. Acacia open woodlands and shrublands, Callitris forests and woodlands) was apparent in the distribution of some pedogenon branches. 
Pedogenon classes differed in their characteristics (median area ≈750 km2), but overall showed meaningful spatial patterns at local scale and formed regional assemblages. The RDA models indicated that pedogenon classes explained about 30% of the variance of silt and clay content. This flexible modelling framework allows the creation of pedogenon maps over large areas at high resolution (90 m) and is applicable at different scales. Potential applications of pedogenon maps include the quantitative assessment of soil change and designing soil monitoring surveys.
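The two-step approach described above (k-means on environmental covariates, then hierarchical clustering of the resulting class centroids into branches) can be sketched in a few lines. This is a toy illustration under stated assumptions: the two-dimensional "covariates", the starting centroids, the Euclidean distance, and the average-linkage rule are all hypothetical choices, and the study's actual distance measure and linkage are not given in the abstract.

```python
import math

def kmeans(points, init_centroids, n_iter=50):
    """Lloyd's k-means from given starting centroids; returns (centroids, labels)."""
    centroids = list(init_centroids)

    def nearest(p):
        return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))

    for _ in range(n_iter):
        labels = [nearest(p) for p in points]
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centroid.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members)) if members else old)
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p) for p in points]

def agglomerate(centroids, n_branches):
    """Merge groups of centroids (average linkage) until n_branches groups remain."""
    groups = [[i] for i in range(len(centroids))]

    def linkage(g1, g2):
        total = sum(math.dist(centroids[i], centroids[j]) for i in g1 for j in g2)
        return total / (len(g1) * len(g2))

    while len(groups) > n_branches:
        pairs = ((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups)))
        a, b = min(pairs, key=lambda ij: linkage(groups[ij[0]], groups[ij[1]]))
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return groups

# Hypothetical standardised covariate values for 12 grid cells
cells = [(0.0, 0.1), (0.1, 0.0), (-0.1, 0.0),
         (1.0, 0.1), (1.1, 0.0), (0.9, -0.1),
         (10.0, 10.1), (10.1, 10.0), (9.9, 10.0),
         (11.0, 10.1), (11.1, 10.0), (10.9, 9.9)]
start = [cells[0], cells[3], cells[6], cells[9]]  # fixed start for reproducibility

centroids, labels = kmeans(cells, start)          # step 1: classes
branches = agglomerate(centroids, n_branches=2)   # step 2: branches of classes
print(labels, branches)
```

In this toy case the four classes fall into two branches, mirroring how many fine classes can be organised into a smaller number of higher-level groups.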
The increasing demand for soil information has led to the rapid development of Digital Soil Mapping (DSM) products. As a consequence, multiple soil maps are sometimes available for a particular area. Rather than selecting the best map, model ensemble offers a way to capitalize on existing soil information, and to improve the map accuracy. In this study we combined four topsoil texture maps of France with different resolutions, made by different organizations at the national, European, and global scale. We investigated two methods of model ensemble: the Granger-Ramanathan (GR) and Variance-Weighted (VW) methods. Ensemble methods based on area stratification were also tested to take into account local soil information. We also assessed the impact of the number of calibration points on the evaluation indicators. Both ensemble methods improved the accuracy of the map compared to the best of the primary maps, while the GR method outperformed the VW method. We found that the different stratification strategies did not improve the accuracy significantly when compared to the global methods. Finally, we showed that a relatively low number of calibration points is required in the merging process if the sampling is well designed. This study demonstrates that digital soil mapping products at various scales from various data sources can be combined with the ensemble method, taking advantage of all existing efforts and taking care of harmonization issues.
•Two ensemble methods were tested to combine soil texture maps: Granger-Ramanathan and variance-weighted
•We successfully merged maps from different scales to improve maps at a national scale
•Stratification did not result in improved accuracy
•A reasonable number of calibration points was necessary to apply ensembles in other regions
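Granger-Ramanathan averaging is commonly implemented as a least-squares regression of the observations on the predictions of the primary maps. The sketch below assumes the variant with an intercept and unconstrained weights, which may differ from the variant used in the study; the clay values and function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def granger_ramanathan(map_predictions, observed):
    """Least-squares weights (intercept first) for observed ~ intercept + sum_k w_k * map_k."""
    design = [[1.0] + list(row) for row in map_predictions]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, observed)) for i in range(p)]
    return solve_linear(xtx, xty)

def ensemble_predict(weights, row):
    """Combine the primary-map values at one location with the fitted weights."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))

def rmse(pred, obs):
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5

# Hypothetical calibration points: clay (%) from two primary maps, and measured clay (%)
maps = [(20, 25), (30, 28), (15, 22), (40, 35), (25, 31), (35, 30)]
measured = [23, 30, 17, 39, 27, 33]

weights = granger_ramanathan(maps, measured)
merged = [ensemble_predict(weights, row) for row in maps]
print(weights, rmse(merged, measured))
```

Because each primary map is a special case of the fitted combination, the calibration RMSE of the merged prediction cannot exceed that of any single map; whether the gain holds on independent data is what validation has to show.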
•Three SOC maps were compared in mainland France.
•Five model averaging approaches were tested for improving the SOC map.
•The SOC map can be improved using more than 100 soil samples for model averaging.
•The Variance Weighted approach performed best.
•Merging maps using model averaging is also applicable to data-poor situations.
The soil organic carbon (SOC) pool is the largest terrestrial carbon (C) pool and is two to three times larger than the C stored in vegetation and the atmosphere. SOC is a crucial component within the C cycle, and an accurate baseline of SOC is required, especially for biogeochemical and earth system modelling. This baseline will allow better monitoring of SOC dynamics due to land use change and climate change. However, current estimates of SOC stock and its spatial distribution have large uncertainties. In this study, we test whether we can improve the accuracy of the three existing SOC maps of France obtained at national (IGCS), continental (LUCAS), and global (SoilGrids) scales using statistical model averaging approaches. Soil data from the French Soil Monitoring Network (RMQS) were used to calibrate and evaluate five model averaging approaches, i.e., Granger-Ramanathan, Bias-corrected Variance Weighted (BC-VW), Bayesian Modelling Averaging, Cubist and Residual-based Cubist. Cross-validation showed that with a calibration size larger than 100 observations, the five model averaging approaches performed better than individual SOC maps. The BC-VW approach performed best and is recommended for model averaging. Our results show that 200 calibration observations were an acceptable calibration strategy for model averaging in France, showing that a fairly small number of spatially stratified observations (sampling density of 1 sample per 2500 km2) provides sufficient calibration data. We also tested the use of model averaging in data-poor situations by reproducing national SOC maps using various sized subsets of the IGCS dataset for model calibration. The results show that model averaging always performs better than the national SOC map. However, the Modelling Efficiency dropped substantially when the national SOC map was excluded in model averaging. 
This indicates the necessity of including a national SOC map for model averaging, even if produced with a small dataset (i.e., 200 samples). This study provides a reference for data-poor countries to improve national SOC maps using existing continental and global SOC maps.
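One simple reading of a bias-corrected variance-weighted average is: remove each map's mean error estimated on calibration data, then weight the corrected maps by the inverse of their remaining error variance. The sketch below follows that reading as an assumption; it is not taken from the paper's equations, and the SOC values are hypothetical.

```python
from statistics import mean, pvariance

def bc_vw_fit(map_predictions, observed):
    """Return per-map bias and normalised inverse-variance weights."""
    n_maps = len(map_predictions[0])
    biases, inv_vars = [], []
    for k in range(n_maps):
        errors = [row[k] - y for row, y in zip(map_predictions, observed)]
        biases.append(mean(errors))                # systematic over- or under-prediction
        inv_vars.append(1.0 / pvariance(errors))   # variance of errors around that bias
    total = sum(inv_vars)
    return biases, [v / total for v in inv_vars]

def bc_vw_predict(biases, weights, row):
    """Weighted mean of the bias-corrected map values at one location."""
    return sum(w * (v - b) for w, v, b in zip(weights, row, biases))

# Hypothetical calibration data: SOC (g kg-1) at six sites from three maps, and measurements
observed = [12, 20, 15, 30, 25, 18]
maps = [(14.5, 9, 13), (21.5, 24, 19), (17.0, 13, 17),
        (32.5, 35, 28), (26.5, 21, 26), (20.0, 18, 17)]

biases, weights = bc_vw_fit(maps, observed)
print(biases, weights, bc_vw_predict(biases, weights, (22.0, 19.0, 21.0)))
```

In this toy data the first map is biased but precise, so it receives the largest weight once its bias is removed, which is the situation where bias correction matters most.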
Increasing soil organic carbon (SOC) stocks is a promising way to mitigate the increase in atmospheric CO2 concentration. Based on a simple ratio between CO2 anthropogenic emissions and SOC stocks worldwide, it has been suggested that a 0.4% (4 per 1000) yearly increase in SOC stocks could compensate for current anthropogenic CO2 emissions. Here, we used a reverse RothC modelling approach to estimate the amount of C inputs to soils required to sustain current SOC stocks and to increase them by 4‰ per year over a period of 30 years. We assessed the feasibility of this aspirational target first by comparing the required C input with net primary productivity (NPP) flowing to the soil, and second by considering the SOC saturation concept. Calculations were performed for mainland France, at a 1 km grid cell resolution. Results showed that a 30%–40% increase in C inputs to soil would be needed to obtain a 4‰ increase per year over a 30‐year period. 88.4% of cropland areas were considered unsaturated in terms of mineral‐associated SOC, but characterized by a below target C balance, that is, less NPP available than required to reach the 4‰ aspirational target. Conversely, 90.4% of unimproved grasslands were characterized by an above target C balance, that is, enough NPP to reach the 4‰ objective, but 59.1% were also saturated. The situation of improved grasslands and forests was more evenly distributed among the four categories (saturated vs. unsaturated and above vs. below target C balance). Future data from soil monitoring networks should make it possible to validate these results. Overall, our results suggest that, for mainland France, priorities should be (1) to increase NPP returns in cropland soils that are unsaturated and have a below target carbon balance and (2) to preserve SOC stocks in other land uses.
The 4 per 1000 aspirational target suggests that a 0.4% yearly increase in soil organic carbon (SOC) stocks could compensate for current anthropogenic CO2 emissions. Using a model of SOC dynamics, estimates of available net primary productivity (NPP), and applying the SOC saturation concept, we assessed its feasibility in the case of mainland France. Our results indicate that the 4 per 1000 target is reachable only for limited areas. Priorities should be to increase NPP returns in cropland soils that are unsaturated and have a below target carbon balance, but also to preserve SOC stocks in other land uses.
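The size of the cumulative target is easy to work out. The short calculation below shows what 4‰ per year amounts to after 30 years under two interpretations, compounding on the previous year's stock or a fixed fraction of the initial stock; which interpretation the study applies is not stated in the abstract, so both are shown as assumptions.

```python
rate = 0.004   # 4 per 1000 per year
years = 30

# Each year's increase is 4 per 1000 of the previous year's stock
compound_increase = (1 + rate) ** years - 1

# Each year's increase is 4 per 1000 of the initial stock
linear_increase = rate * years

print(f"compound: {compound_increase:.1%}, linear: {linear_increase:.1%}")
```

Either way the target corresponds to a cumulative SOC stock increase of roughly 12% to 13% over the 30-year horizon.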
•Hand-feel soil texture and particle-size distribution are compared using a large database.
•The overall accuracy of hand-feel soil texture class allocation was 73%.
•Most discrepancies were explained by very fine and coarse sand content.
•Predicting soil water retention at pF2 using hand-feel texture gave satisfactory results.
Due to cost constraints, field texture classes estimated by hand-feel by soil surveyors are more abundant than laboratory measurements of particle-size distribution. Thus, there is a considerable potential to use field-estimated soil textures for mapping on the condition that they are reliable and can be characterized by a probability distribution function similar to values obtained by laboratory measurements. This study aimed to investigate and elucidate the differences between the field texture classes estimated by hand-feel and soil texture determined from particle-size analysis under laboratory conditions in a region of Central France. We tested several hypotheses to explain the discrepancies between field estimates and laboratory measurements (organic C content, pH, more detailed particle-size analyses, and CEC). Finally, we simulated the consequences of using particle-size distribution estimated from field texture on a pedotransfer function (PTF) for water retention. Laboratory measurements of clay, silt, and sand content for each field texture class were available for about 17,400 samples. Considering laboratory measurements and the French texture triangle as the reference, the overall accuracy of field texture class allocation was 73%, which was better than most of the results previously reported in the literature. When looking at each field texture class, most predictions were consistent; however, there were noticeable differences between a few field texture classes and particle-size classes. The extreme texture classes located at the corners of the texture triangle were better predicted than those located at the centre of the triangle. We found the discrepancy of field texture classes can be explained by the very fine sand (50–100 µm) and very coarse sand (1000–2000 µm) contents. 
Based on the particle-size distribution within each field texture class, we calculated the joint probability distribution function of the corresponding laboratory measurements of clay, silt, and sand content. Results showed that PTF values predicted using hand-feel texture were consistent with those obtained with the measured particle-size distribution. Overall, we demonstrated the value of hand-feel texture for expanding the national soil texture database and informing soil water retention properties.
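The overall accuracy reported above is the share of samples whose hand-feel class matches the class derived from laboratory particle-size analysis. A minimal sketch of that calculation from a confusion table follows; the three classes and the counts are hypothetical and much coarser than the French texture triangle.

```python
def overall_accuracy(confusion):
    """Share of samples whose field class equals the laboratory class.

    confusion maps field class -> {laboratory class: number of samples}.
    """
    correct = sum(row.get(field_class, 0) for field_class, row in confusion.items())
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total

def class_agreement(confusion):
    """Per field class, the share of its samples confirmed by the laboratory."""
    return {c: row.get(c, 0) / sum(row.values()) for c, row in confusion.items()}

# Hypothetical counts: rows are hand-feel classes, columns are laboratory classes
confusion = {
    "clay": {"clay": 40, "loam": 10},
    "loam": {"clay": 5, "loam": 30, "sand": 5},
    "sand": {"loam": 2, "sand": 8},
}

print(overall_accuracy(confusion), class_agreement(confusion))
```

The per-class figures make it visible when a good overall accuracy hides weaker agreement for individual classes, such as those near the centre of the texture triangle.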
With the rapid development of digital soil mapping it is not unusual to find several maps for the same soil property in an area of interest. We applied two standard methods of model averaging for combining two regional maps and a European map of topsoil texture in agricultural land for the Region Centre (France). The two methods for model ensemble were the Granger-Ramanathan (G-R) and the Bates-Granger (B-G). A calibration dataset was used for fitting the coefficients of the G-R model, and for calculating a global variance: prediction error ratio which was then used to re-scale the weights of the B-G model. The prediction performance of the three primary maps and the two ensemble maps was compared with an independent validation dataset consisting of 100 observations from the French soil monitoring network. The prediction accuracy of the ensemble models improved only for clay in comparison to the primary maps (ΔR2 = 0.02 to 0.06, ΔRMSE = −1.56 to −4.97 g kg−1). Overall, the G-R models obtained smaller RMSE and greater bias than B-G, and G-R estimated the prediction uncertainty better. The dissimilarities between the methods for estimating the prediction variance and non-optimal estimated uncertainties were important limitations for the B-G models despite applying a global correction factor for the prediction variances. The results suggested that both the calibration and validation datasets should represent the patterns of spatial variation and range of values of the soil property for the prediction space. Nonetheless, model ensemble methods proved to be useful for merging maps with different types of datasets, spatial coverage, and methodological approaches.
•Three different source maps were combined into a single topsoil texture map.
•We applied the Bates-Granger (BG) and Granger-Ramanathan (GR) ensemble methods.
•Ensemble models improved accuracy only for clay in comparison to the primary maps.
•GR models obtained smaller RMSE and estimated better the uncertainty than BG.
•The availability and accuracy of uncertainty estimates limits the performance of BG.
The soil security concept has been put forward to maintain and improve soil resources inter alia to provide food, clean water, climate change mitigation and adaptation, and to protect ecosystems. A provisional framework suggested indicators for the soil security dimensions, and a methodology to achieve a quantification. In this study, we illustrate the framework for the function soil carbon storage and the two dimensions of soil capacity and soil condition. The methodology consists of (i) the selection and quantification of a small set of soil indicators for capacity and condition, (ii) the transformation of indicator values to unitless utility values via expert-generated utility graphs, and (iii) a two-level aggregation of the utility values by soil profile and by dimension. For capacity, we used a set of three indicators: total organic and inorganic carbon content and mineral associated organic carbon in the fine fraction (MAOC) estimated via their reference value using existing maps of pedogenons and current land use to identify areas of remnant genosoils (total organic and inorganic carbon) and the 90th percentile for MAOC. For condition we used the same set of indicators, but this time using the estimated current value and comparing with their reference-state values (calculated for capacity). The methodology was applied to the whole of Australia at a spatial resolution of 90 m × 90 m. The results show that the unitless indicator values supporting the function varied greatly in Australia. Aggregation of the indicators into the two dimensions of capacity and condition revealed that most of Australia has a relatively low capacity to support the function, but that most soils are in a generally good condition relative to that capacity, with some exceptions in agricultural areas, although more sampling of the remnant genosoils is required for corroboration and improvement.
The maps of capacity and condition may serve as a basis to estimate a spatially-explicit local index of Australia’s soil resilience to the threat of decarbonization.
•Methodology based on the soil security assessment framework.
•Quantification of the capacity and condition of the soil to store carbon.
•Spatial estimation of the utility to support carbon storage for three indicators.
•Australia has low capacity, but overall good condition to store carbon.
•Intensive up-to-date sampling is required to corroborate the results.
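The indicator-to-utility transformation and the two-level aggregation can be sketched as follows. Everything specific here is an assumption made for illustration: the utility graphs are invented piecewise-linear curves rather than the expert-generated ones, the first level is taken to be an average over the depth layers of a profile, the second level an average over indicators, and the indicator names and values are hypothetical.

```python
from statistics import mean

def utility(value, graph):
    """Piecewise-linear utility (0 to 1) from a graph of sorted (indicator value, utility) points."""
    if value <= graph[0][0]:
        return graph[0][1]
    if value >= graph[-1][0]:
        return graph[-1][1]
    for (x0, u0), (x1, u1) in zip(graph, graph[1:]):
        if x0 <= value <= x1:
            return u0 + (u1 - u0) * (value - x0) / (x1 - x0)
    raise ValueError("graph must be sorted by indicator value")

def dimension_score(profile, graphs):
    """Level 1: mean utility over depth layers per indicator. Level 2: mean over indicators."""
    per_indicator = {
        name: mean([utility(v, graphs[name]) for v in values])
        for name, values in profile.items()
    }
    return mean(list(per_indicator.values())), per_indicator

# Hypothetical utility graphs and one profile with two depth layers per indicator (% C)
graphs = {"toc": [(0, 0), (2, 1)], "maoc": [(0, 0), (4, 1)]}
profile = {"toc": [1.0, 2.0], "maoc": [2.0, 2.0]}

score, per_indicator = dimension_score(profile, graphs)
print(per_indicator, score)
```

Other aggregation rules, such as a minimum or a weighted mean, would slot into `dimension_score` without changing the utility step.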
Not only do soils provide 98.7% of the calories consumed by humans, they also provide numerous other functions upon which planetary survivability closely depends. However, our continuously increasing focus on soils for biomass provision (food, fiber, and energy) through intensive agriculture is rapidly degrading soils and diminishing their capacity to deliver other vital functions. These tradeoffs in soil functionality, the increased provision of one function at the expense of other critical planetary functions, are the focus of this review. We examine how land-use change for biomass provision has decreased the ability of soils to regulate the carbon pool and thereby contribute profoundly to climate change, to cycle the nutrients that sustain plant growth and ecosystem health, to protect the soil biodiversity upon which many other functions depend, and to cycle the Earth's freshwater supplies. We also examine how this decreasing ability of soil to provide these other functions can be halted and reversed. Despite the complexity and the interconnectedness of soil functions, we show that soil organic carbon plays a central role and is a master indicator for soil functioning and that we require a better understanding of the factors controlling the behavior and persistence of C in soils. Given the threats facing humanity and its economies, it is imperative that we recognize that Soil Security is itself an existential challenge and that we need to increase our focus on the multiple functions of soils for long-term human welfare and survivability of the planet.