•The ability of CNN to estimate rice grain yield using UAV images is investigated.
•The correlation between VI and rice grain yield is low at the ripening stage.
•The proposed CNN provides robust yield forecast throughout the ripening stage.
•RGB images dominate the network training at the ripening stage of paddy rice.
•A more robust network can be trained by RGB data from late stage.
Forecasting rice grain yield prior to harvest is essential for crop management, food security evaluation, food trade, and policy-making. Remotely sensed products, such as vegetation indices (VIs) derived from multispectral imagery, have been applied successfully to crop yield estimation. However, VI-based approaches are suitable for estimating rice grain yield only at the middle stage of growth and have limited capability at the ripening stage. In this study, an efficient convolutional neural network (CNN) architecture was proposed to learn the important features related to rice grain yield from low-altitude remotely sensed imagery. In a major rice-growing region of Southern China, a 160-hectare site with over 800 management units was chosen to investigate the ability of the CNN to estimate rice grain yield. The RGB and multispectral image datasets were obtained by a fixed-wing unmanned aerial vehicle (UAV) carrying a digital camera and multispectral sensors. The network was trained with different datasets and compared against the traditional VI-based method. In addition, the temporal and spatial generality of the trained network was investigated. The results showed that the CNNs trained on the RGB and multispectral datasets performed much better than the VI-based regression model for rice grain yield estimation at the ripening stage. RGB imagery of very high spatial resolution contains important spatial features related to the grain yield distribution, which can be learned by a deep CNN. The results highlight the promising potential of deep CNNs for rice grain yield estimation, with excellent spatial and temporal generality and a wider time window for yield forecasting.
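The VI-based baseline mentioned in this abstract can be illustrated with a minimal sketch: compute a vegetation index per plot and regress yield on it. NDVI is used here only as a common example of a VI; the abstract does not say which indices were used. The data are synthetic placeholders rather than the study's data, and the function names `ndvi` and `fit_linear` are hypothetical.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized difference vegetation index, computed per pixel."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)

def fit_linear(x, y):
    """Ordinary least squares fit y ~ a*x + b; returns (a, b, r2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a, b = np.polyfit(x, y, 1)
    resid = y - (a * x + b)
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return a, b, r2

# Synthetic, purely illustrative data: 50 plots of 8x8 pixels each.
rng = np.random.default_rng(0)
nir = rng.uniform(0.3, 0.6, size=(50, 8, 8))
red = rng.uniform(0.05, 0.2, size=(50, 8, 8))
plot_ndvi = ndvi(nir, red).mean(axis=(1, 2))        # one mean VI per plot
yield_t_ha = 4.0 + 5.0 * plot_ndvi + rng.normal(0.0, 0.2, size=50)

a, b, r2 = fit_linear(plot_ndvi, yield_t_ha)
print(f"yield ~ {a:.2f} * NDVI + {b:.2f}, R^2 = {r2:.2f}")
```

A regression of this kind sees only one number per plot, whereas a CNN can use the spatial pattern inside each plot image.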
•Using CNN to detect rice phenology from mono-temporal UAV imagery was investigated.
•The shape-model-fitting method underperformed with short time-series data.
•Integrating regional mean thermal time into the CNN improves the detection accuracy.
•The estimated harvest dates were in close agreement with the observations.
•The proposed approach provides near real-time estimation of the principal phenological stages of rice.
Near real-time crop phenology detection is essential for crop management, estimation of harvest time and yield estimation. Previous approaches to crop phenology detection have relied on time-series (multi-temporal) vegetation index (VI) data, and have included threshold-based, phenometrics-based and shape-model-fitting-based (SMF) methods. However, the performance of these methods depends on the duration and temporal resolution of the time-series data. In this study, we propose a new approach which identifies the principal growth stages of rice (Oryza sativa L.) directly from RGB images. Only mono-temporal unmanned aerial vehicle (UAV) imagery was required for large-area phenology detection with the trained network. An efficient convolutional neural network (CNN) architecture was designed to estimate rice phenology. The CNN incorporated spatial pyramid pooling (SPP), transfer learning and an auxiliary branch with external data. A total of 82 plots across a 160-hectare rice cultivation area of Southern China were selected to evaluate the proposed network. CNN predictions were validated against rice phenology measurements taken from each plot throughout the growing season. Aerial data were collected using a fixed-wing UAV equipped with multispectral and RGB cameras. The performance of traditional SMF methods deteriorated when time-series VI data were of short duration. In contrast, the phenological stage estimated by the proposed network showed good agreement with ground observations, with a top-1 accuracy rate of 83.9% and mean absolute error (MAE) of 0.18. The spatial distribution of harvest dates for 627 plots in the study area was computed from the phenological stage estimates. The estimates matched well with the observed harvest dates. The results demonstrated the excellent performance of the proposed deep learning approach in near real-time phenology detection and harvest time estimation.
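Spatial pyramid pooling is what lets a CNN accept image patches of varying size: the final feature map is max-pooled over grids of several resolutions and the results are concatenated into a fixed-length vector. The NumPy sketch below illustrates the idea only. The pyramid levels (1, 2, 4) and the function name are assumptions, not the configuration used in the study.

```python
import numpy as np

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a (channels, height, width) feature map into a fixed-length vector.

    For each level n, the map is covered by an n x n grid of bins and the
    per-channel maximum of every bin is kept. The output length is
    channels * sum(n * n for n in levels), whatever the input height and width.
    """
    fmap = np.asarray(fmap, dtype=float)
    c, h, w = fmap.shape
    pooled = []
    for n in levels:
        for i in range(n):
            r0 = (i * h) // n                 # floor of the bin's start row
            r1 = -(-((i + 1) * h) // n)       # ceiling of the bin's end row
            for j in range(n):
                c0 = (j * w) // n
                c1 = -(-((j + 1) * w) // n)
                pooled.append(fmap[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)

# Two feature maps of different spatial size give vectors of the same length.
rng = np.random.default_rng(0)
small_map = rng.random((3, 7, 5))
large_map = rng.random((3, 10, 13))
v_small = spatial_pyramid_pool(small_map)
v_large = spatial_pyramid_pool(large_map)
print(v_small.shape, v_large.shape)
```

The floor and ceiling bin edges guarantee that every bin is non-empty, at the cost of slight overlap between neighbouring bins when the size is not divisible by the grid.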
This study evaluated three iterative ensemble Kalman filter (EnKF) algorithms: Confirming EnKF, Restart EnKF, and modified Restart EnKF. These algorithms were developed to resolve the inconsistency problem (i.e., updated model parameters and state variables do not follow the Richards equation) that arises in vadose zone data assimilation due to model nonlinearity. While Confirming and Restart EnKF were adapted from the literature, modified Restart EnKF was developed in this study to reduce computational costs by calculating only the mean simulation, not all the ensemble realizations, from time t = 0. A total of 11 cases were designed to investigate the performance of EnKF, Confirming EnKF, Restart EnKF, and modified Restart EnKF with different types and spatial configurations of observations (pressure head and water content) and different values of observation error variance, initial guess of ensemble mean and variance, ensemble size, and damping factor. The numerical study showed that Confirming EnKF produced considerable inconsistency for the nonlinear unsaturated flow problem, which differs from the apparent consensus opinion that Confirming EnKF can resolve the inconsistency problem. In contrast, Restart EnKF and its modification can resolve the inconsistency problem. Restart EnKF and its modification outperformed EnKF and Confirming EnKF in the various cases considered in this study. It was also found that combining different types of observations can achieve better assimilation results, which is useful for monitoring network design.
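For orientation, a minimal sketch of a standard stochastic EnKF analysis step with a damping factor follows. It is a generic textbook formulation with perturbed observations, not the Confirming or Restart variants evaluated in the study, and the function name and toy setup are assumptions. In a Restart-type scheme, the updated parameters would additionally be used to rerun the model from t = 0 so that states and parameters remain consistent with the flow equation.

```python
import numpy as np

def enkf_update(X, Y, d, R, rng, damping=1.0):
    """One EnKF analysis step with perturbed observations.

    X : (n_state, n_ens) forecast ensemble (states and/or parameters)
    Y : (n_obs, n_ens) model-predicted observations for each member
    d : (n_obs,) observed values
    R : (n_obs, n_obs) observation error covariance
    damping : factor in [0, 1] that scales the update
    """
    n_ens = X.shape[1]
    Xa = X - X.mean(axis=1, keepdims=True)      # state anomalies
    Ya = Y - Y.mean(axis=1, keepdims=True)      # predicted-observation anomalies
    Cxy = Xa @ Ya.T / (n_ens - 1)
    Cyy = Ya @ Ya.T / (n_ens - 1)
    # Perturb the observations so the updated ensemble keeps a realistic spread.
    noise = rng.multivariate_normal(np.zeros(len(d)), R, size=n_ens).T
    D = d[:, None] + noise
    # Kalman gain applied to the innovations, via a linear solve.
    increment = Cxy @ np.linalg.solve(Cyy + R, D - Y)
    return X + damping * increment

# Toy example: a scalar state that is observed directly.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(1, 200))
Y = X.copy()
d = np.array([2.0])
R = np.array([[0.01]])
X_updated = enkf_update(X, Y, d, R, rng)
print(X.mean(), X_updated.mean())
```

Setting `damping` below 1 shrinks each update, which is one common way to limit overshoot when the model is strongly nonlinear.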
•Physical model and machine learning are compared for simulating soil moisture.
•The effects of model assumptions and observation errors are investigated.
•Their performances under extrapolation and different soil water dynamics are discussed.
Soil moisture is an essential component of global water resources and plays a critical role in regulating mass and energy exchange between the land surface and the atmosphere. Quantification of these exchange processes requires accurate characterization and simulation of soil water movement. Physically-based models (PBMs) and machine learning methods (MLMs) can both be used in soil moisture simulation. However, their performances in soil water simulation have only been compared in a limited number of cases. Moreover, almost all of these comparisons were conducted in field studies, each with a fixed soil, initial condition, and boundary condition. Here, we developed three artificial neural network (ANN) frameworks and made clearer and more systematic comparisons between them and a PBM under synthetic and real-world conditions. The PBM is the Ross numerical model for solving the Richards equation, with parameters estimated by a data assimilation approach, the iterative ensemble smoother (together, Ross-IES). Compared with the ANNs, Ross-IES is more strongly affected by physical model uncertainties such as soil heterogeneity and initial and boundary conditions, while both methods are affected by observation noise. For Ross-IES, the errors from boundary conditions and hydraulic parameter conceptualization are found to be more prominent than those from observation noise and are therefore suggested to be identified first. Meanwhile, the ANNs have difficulty in simulating the peaks and troughs of the soil water time series as well as in situations where the soil is constantly saturated. ANNs yield a superior simulation when the nonlinear relationship between the response variables and driving data is weak, while the performance of Ross-IES is governed by the prior soil hydraulic information. In addition, the Ross-IES approach requires a much higher computational cost than the ANNs. ANN-MS performs best among the three ANN-based machine learning models and demonstrates great data mining ability and robustness against overfitting.
Conventionally, soil moisture dynamics are mathematically modeled by the Richardson‐Richards equation, whose derivation is based on the conservation of mass and the Buckingham‐Darcy law. However, it is complicated and even impossible to complete such rigorous derivations based on physical principles due to the complexity and uncertainties in the vadose zone. In this work, we propose a data‐driven sparse regression framework. For the first time, we discover the time‐dependent nonlinear soil moisture flow equation from only volumetric water content observations. The framework leverages linear approximations and group sparsity techniques. Apart from a few assumptions, it requires no prior information such as a specific constitutive relationship model, boundary conditions, or initial conditions. Several numerical experiments are conducted to demonstrate that the framework successfully discovers the underlying soil moisture flow equation from data and tends to discover the parsimonious equation governed by dominant physical processes. In addition, the identified nonlinear coefficients, which represent soil hydraulic properties, fit well with the actual coefficients, although they deviate slightly near saturation. The results demonstrate the satisfactory performance of the proposed framework under various scenarios. Despite being based on a homogeneous soil assumption, this study provides a new perspective for deriving soil moisture flow governing equations.
Key Points
Sparse regression accurately discovers the soil moisture flow governing equation from only volumetric water content observations
Sparse regression tends to discover parsimonious equations governed by dominant physical processes
Soil hydraulic properties can be precisely derived from volumetric water content observations
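The general idea of discovering an equation by sparse regression can be sketched on a toy problem. The example below is not the framework of the study: it swaps in plain sequentially thresholded least squares (as used in SINDy-type methods) instead of group sparsity, and it recovers a linear diffusion equation u_t = D u_xx with constant D rather than the nonlinear soil moisture flow equation. The candidate library, threshold, and all names are assumptions chosen for illustration.

```python
import numpy as np

def stlsq(theta, target, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, target, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        if small.all():
            break
        xi[~small] = np.linalg.lstsq(theta[:, ~small], target, rcond=None)[0]
    return xi

# Synthetic data: two decaying modes that solve u_t = D * u_xx exactly.
D = 0.5
x = np.linspace(0.0, 2.0 * np.pi, 128)
t = np.linspace(0.0, 1.0, 101)
dx, dt = x[1] - x[0], t[1] - t[0]
T, X = np.meshgrid(t, x, indexing="ij")
u = np.exp(-D * T) * np.sin(X) + 0.5 * np.exp(-4.0 * D * T) * np.sin(2.0 * X)

# Finite-difference derivatives from the data alone.
u_t = np.gradient(u, dt, axis=0)
u_x = np.gradient(u, dx, axis=1)
u_xx = np.gradient(u_x, dx, axis=1)

# Drop edge points, where one-sided differences are less accurate.
inner = (slice(2, -2), slice(2, -2))
names = ["u", "u_x", "u_xx", "u*u_x"]
candidates = (u, u_x, u_xx, u * u_x)
theta = np.column_stack([f[inner].ravel() for f in candidates])

xi = stlsq(theta, u_t[inner].ravel())
for name, coef in zip(names, xi):
    print(f"{name:6s} {coef: .4f}")
```

Only the `u_xx` term should survive thresholding, with a coefficient close to the true D. With noisy observations the finite-difference derivatives degrade quickly, which is why derivative estimation matters so much in this line of work.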
•The reason for numerical divergence when simulating infiltration into dry soil is revealed.
•The factors influencing the maximum allowed time step size are analyzed.
•A more robust and cost-effective modified iteration algorithm is proposed.
Numerical models based on Richards’ equation are often employed to simulate soil water dynamics. Among them, Picard iteration models that use the pressure head as the primary variable are widely adopted due to their simplicity and their capability for handling partially saturated flow conditions. However, it is well known that these models are prone to convergence failure under some unfavorable flow conditions, especially when simulating infiltration into initially dry soils. Here we analyze the reasons that give rise to this numerical difficulty. Moreover, several modifications to the mass-conservative Picard iteration method are proposed so that the numerical difficulty is avoided under these unfavorable flow conditions. The proposed modifications do not degrade the simulated results, while they lead to more robust convergence and more cost-effective simulations.
Data‐driven scientific discovery methods have been developed and applied to discover governing equations from data, including attempts to discover the unsaturated flow equation in soils. However, an important but unresolved problem is how to reconstruct the unsaturated flow equation from highly noisy and scarce discrete data. In this study, we present a new deep‐learning framework, DeepGS (deep‐learning‐based group sparsity framework), that leverages the synergy of group sparsity and physics‐informed deep learning (PIDL) to reconstruct the latent governing equation for unsaturated flow. In particular, we design a strategy that decomposes the identification of the unsaturated flow equation into two tasks: the determination of the partial differential equation structure and the reconstruction of the nonlinear coefficients. The two tasks are handled by group sparse regression and the PIDL approach, respectively. Through training, the framework simultaneously reconstructs the soil moisture dynamics and the unsaturated flow governing equation. A series of comprehensive numerical experiments are conducted to determine the optimal architecture and test its performance. The results show the efficacy and robustness of DeepGS, which significantly outperforms previous methods. We also conclude that accurately reconstructing soil moisture dynamics and spatiotemporal derivatives from noisy and scarce data plays a critical role in governing equation discovery. This study further demonstrates the potential of discovering the governing equation for unsaturated flow from data in more complex scenarios, where rich and accurate soil moisture observations are generally difficult to obtain.
Plain Language Summary
Establishing an equation to describe soil water flow is important for scientists and engineers to understand its physical characteristics and apply it to scientific and engineering practice. Deriving the equation from physical principles step by step is difficult due to the complexity of soil water flow, and the result may even be inaccurate. Recently, a class of physics‐informed data‐driven methods has been proposed that enables learning physical equations directly from data, and it has been applied to establishing the soil water flow equation. However, these methods require rich and accurate soil water observations, which are generally difficult to obtain. Here, we propose a new deep‐learning approach to reduce the high dependence of previous methods on high‐quality data. Specifically, we designed a special deep‐learning architecture and a training method to realize this objective. We designed and conducted comprehensive numerical experiments to test the methods. This study provides insights into how to accurately discover the soil water flow equation from data. Broadly, it is a step forward in revealing as yet unclear physical laws of soil water flow from data.
Key Points
A deep‐learning framework is proposed for reconstructing the unsaturated flow equation from sparse and noisy data
Equation discovery tasks are decomposed into determining equation structure and reconstructing coefficients
Recovering and calculating accurate derivatives from data is key for the equation reconstruction
•A gradient-enhanced nonparametric data assimilation scheme was proposed.
•Spatio-temporal gradients can provide implicit physical constraints.
•Spatial gradients had a more robust performance than temporal gradients.
•An enhancement strategy that used only surface temporal gradients was recommended.
Soil water content (SWC) is a vital variable in the hydrological cycle, and its simulation often relies on solving the soil water flow equation. To cope with the unavailability or poor quality of physical models, various nonparametric data assimilation (DA) schemes have been established. However, such methods tend to share two significant challenges: (1) the difficulty in capturing locally changing behaviors of time series, especially the peaks and troughs, and (2) poor statistical interpolation capability in time and space. These two challenges can be attributed to the complete renunciation of physical constraints. Unlike previous physics-informed approaches that incorporated physical governing equations and engineering control into the loss functions, this study attempts to introduce additional physical constraints from data gradients into the model-free DA framework. As a follow-up study of Wang et al. (2021), a gradient-enhanced version of nonparametric DA schemes (i.e., GE-EnKFGP) is proposed. The temporal (daily) and spatial (vertical) gradients of the SWC are merged into the construction of the unsaturated flow dynamical models based on the Gaussian process (GP), while the Kalman update formulation is used to reconcile real-time observations. With the aid of a series of real-world cases, the performance of the GE-EnKFGP was compared with the original EnKFGP and its gradient-based version (GB-EnKFGP), in which the temporal gradients of the SWC replaced the SWC itself as the GP output. The results showed that the enhancement with gradient information in the GE-EnKFGP led to a better estimation than the original EnKFGP due to its more accurate identification of multiple local extrema. This should be attributed to the mass conservation constraint hidden within the temporal gradients and the implicit constraint of the driving force (or upper boundary) from the spatial gradients.
Spatial gradients of the SWC outperformed temporal ones under various application scenarios. The GB-EnKFGP and GE-EnKFGP performed better in retrieving the surface SWC than the SWC in the deeper layer. Hence, an enhancement scheme using only the temporal gradients of the surface layer was recommended. In the context of spatial extrapolation, the assistance of spatial gradients yielded a robustly improved estimate of the deeper SWC through GP training and assimilation of easy-to-access surface data. However, the implementation of the GB-EnKFGP and the temporal gradient-enhanced EnKFGP, i.e., GE-EnKFGP (t), runs the risk of triggering a performance collapse due to the delayed response of SWC profiles to rainfall events.
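The temporal (daily) and spatial (vertical) gradients referred to in this abstract can be approximated from a profile time series by finite differences. The sketch below shows one way to do this with NumPy. The array layout (days by depths), the depth values, and the function name are assumptions for illustration, and the abstract does not state how the gradients were actually computed.

```python
import numpy as np

def swc_gradients(swc, days, depths_cm):
    """Finite-difference gradients of soil water content.

    swc       : (n_days, n_depths) array of volumetric water content
    days      : (n_days,) observation times in days
    depths_cm : (n_depths,) sensor depths in cm (may be unevenly spaced)
    Returns (d_swc/dt, d_swc/dz) with the same shape as swc.
    """
    swc = np.asarray(swc, dtype=float)
    d_dt = np.gradient(swc, np.asarray(days, dtype=float), axis=0)
    d_dz = np.gradient(swc, np.asarray(depths_cm, dtype=float), axis=1)
    return d_dt, d_dz

# Synthetic profile: drying by 0.01 per day, wetter by 0.001 per cm of depth.
days = np.arange(5, dtype=float)
depths_cm = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
swc = 0.30 - 0.01 * days[:, None] + 0.001 * depths_cm[None, :]

d_dt, d_dz = swc_gradients(swc, days, depths_cm)
print(d_dt[0], d_dz[0])
```

Gradient arrays of this kind could then be supplied to a GP alongside the SWC values, either as extra information or in place of the SWC as the output.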
•Updating plant development stage can avoid “phenological shift”.
•The SSPE method improves DA performance when sufficient observations are provided.
•Soil stratification and excessive unknown crop parameters hinder effective DA.
•Deep soil water and grain yield measurements are needed to improve DA performance.
Improvements to agricultural water and crop management require detailed information on crop and soil states and their evolution. Data assimilation provides an attractive way of obtaining this information by integrating measurements with a model in a sequential manner. However, data assimilation for the soil-water-atmosphere-plant (SWAP) system has not yet been comprehensively explored because of the large number of variables and parameters in the system. In this study, simultaneous state-parameter estimation using the ensemble Kalman filter (EnKF) was employed to evaluate the data assimilation performance and provide advice on measurement design for the SWAP system. The results demonstrated that a proper selection of the state vector is critical to effective data assimilation. In particular, updating the development stage avoided the negative effect of “phenological shift”, which was caused by differing phenological stages among ensemble members. The simultaneous state-parameter estimation (SSPE) assimilation strategy outperformed the updating-state-only (USO) assimilation strategy because of its ability to alleviate the inconsistency between model variables and parameters. However, the performance of the SSPE assimilation strategy could deteriorate with an increasing number of uncertain parameters as a result of soil stratification and limited knowledge of crop parameters. In addition to the most easily available surface soil moisture (SSM) and leaf area index (LAI) measurements, deep soil moisture, grain yield or other auxiliary data were required to provide sufficient constraints on parameter estimation and to assure the data assimilation performance. This study provides insight into the response of soil moisture and grain yield to data assimilation in the SWAP system and is helpful for soil moisture movement and crop growth modeling and for measurement design in practice.
•We propose a dynamic data-driven approach based on Gaussian process regression to estimate model structural error in soil moisture data assimilation.
•Gaussian process error model can represent the underlying model structural error.
•The proposed hybrid method (EnKF-GP) outperforms the standard EnKF.
Owing to its flexibility in considering various types of observation error and model error, data assimilation has been increasingly applied to dynamically improve soil moisture modeling in many hydrological practices. However, accurate characterization of model error, especially the part caused by defective model structure, presents a significant challenge to the successful implementation of data assimilation. Model structural error has received limited attention relative to parameter and input errors, mainly due to our poor understanding of structural inadequacy and the difficulties in parameterizing structural error. In this paper, we present a dynamic data-driven approach to estimate the model structural error in soil moisture data assimilation without the need to identify the error generation mechanism or to specify a particular form for the error model. The error model is based on Gaussian process regression and is integrated into the ensemble Kalman filter (EnKF) to form a hybrid method for dealing with multi-source model errors. Two variants of the hybrid method, which differ in how the error correction is applied, are proposed. The effectiveness of the proposed method is tested through a suite of synthetic cases and a real-world case. Results demonstrate the potential of the proposed hybrid method for estimating model structural error and providing improved model predictions. Compared to the traditional EnKF, which does not explicitly consider the model structural error, the parameter compensation issue is clearly reduced and soil moisture retrieval is substantially improved.
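The core idea of a Gaussian process error model, learning the mismatch between observations and model output without prescribing its form, can be sketched in a few lines of NumPy. This is a generic GP regression on synthetic residuals, not the EnKF-GP method itself. The squared-exponential kernel, its hyperparameters, the single scalar input, and the function names are all assumptions for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length ** 2)

def gp_posterior_mean(x_train, y_train, x_test, noise=1e-2,
                      length=1.0, variance=1.0):
    """Posterior mean of a zero-mean GP with Gaussian observation noise."""
    K = rbf_kernel(x_train, x_train, length, variance)
    K += noise * np.eye(len(x_train))
    K_star = rbf_kernel(x_test, x_train, length, variance)
    return K_star @ np.linalg.solve(K, y_train)

# Synthetic "observation minus model" residuals standing in for structural error.
x_obs = np.linspace(0.0, 10.0, 40)
residual = 0.3 * np.sin(x_obs)

# Estimate the error at new inputs; a corrected prediction would be
# the model output plus this estimate.
x_new = np.array([2.5, 5.1, 7.7])
error_hat = gp_posterior_mean(x_obs, residual, x_new, noise=1e-4)
print(error_hat)
```

In a hybrid scheme, an estimate like `error_hat` would be added to the model forecast before or during the Kalman update, so that the filter does not have to absorb structural error by distorting the parameters.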