Data assimilation (DA) and uncertainty quantification (UQ) are extensively used in analysing and reducing error propagation in high-dimensional spatial-temporal dynamics. Typical applications range from computational fluid dynamics (CFD) to geoscience and climate systems. Recently, much effort has gone into combining DA, UQ and machine learning (ML) techniques. These research efforts seek to address some critical challenges in high-dimensional dynamical systems, including but not limited to dynamical system identification, reduced order surrogate modelling, error covariance specification and model error correction. Many of the techniques and methodologies developed are applicable across numerous domains, which creates the need for a comprehensive guide. This paper provides the first overview of state-of-the-art research in this interdisciplinary field, covering a wide range of applications. This review is aimed at ML scientists who attempt to apply DA and UQ techniques to improve the accuracy and the interpretability of their models, but also at DA and UQ experts who intend to integrate cutting-edge ML approaches into their systems. Therefore, this article has a special focus on how ML methods can overcome the existing limits of DA and UQ, and vice versa. Some exciting perspectives of this rapidly developing research field are also discussed.
Starting from three Eulerian second-order nonlinear advection schemes for semi-staggered Arakawa grids B/E, advection schemes of fourth order of formal accuracy were developed. All three second-order advection schemes control the nonlinear energy cascade in the case of nondivergent flow by conserving quadratic quantities. Linearization of all three schemes leads to the same second-order linear advection scheme. The second-order term of the truncation error of the linear advection scheme has a special form, so that it can be eliminated by modifying the advected quantity while still preserving consistency. Tests with linear advection of a cone confirm the advantage of the fourth-order scheme. However, if a localized, large-amplitude, high-wavenumber pattern is present in the initial conditions, the clear advantage of the fourth-order scheme disappears.
The new nonlinear fourth-order schemes are quadratic conservative and reduce to the Arakawa Jacobian for advected quantities in the case of nondivergent flow. In the case of general flow, the conservation properties of the new momentum advection schemes impose a stricter constraint on the nonlinear cascade than the original second-order schemes. However, for nondivergent flow, the conservation properties of the fourth-order schemes cannot be proven in the same way as those of the original second-order schemes. Therefore, demanding long-term and low-resolution nonlinear tests were carried out in order to investigate how well the fourth-order schemes control the nonlinear energy cascade. All schemes were able to maintain meaningful solutions throughout the test.
Finally, the impact of the fourth-order momentum advection on global medium-range forecasts was examined. The 500-hPa anomaly correlation coefficient obtained using the best performing fourth-order scheme did not show an improvement compared to the tests using its second-order counterpart.
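To make the notion of formal order of accuracy concrete, the sketch below compares generic second- and fourth-order centred differences on a periodic grid. It is not the Arakawa B/E grid scheme described above; the function names, the grid sizes and the sine test function are assumptions chosen only for illustration.

```python
import numpy as np

def centred_diff_2nd(f, h):
    """Second-order centred first derivative on a periodic grid."""
    return (np.roll(f, -1) - np.roll(f, 1)) / (2.0 * h)

def centred_diff_4th(f, h):
    """Fourth-order centred first derivative on a periodic grid."""
    return (-np.roll(f, -2) + 8.0 * np.roll(f, -1)
            - 8.0 * np.roll(f, 1) + np.roll(f, 2)) / (12.0 * h)

def max_errors(n):
    """Maximum derivative error of both stencils for f(x) = sin(x)."""
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    h = 2.0 * np.pi / n
    exact = np.cos(x)
    err2 = np.max(np.abs(centred_diff_2nd(np.sin(x), h) - exact))
    err4 = np.max(np.abs(centred_diff_4th(np.sin(x), h) - exact))
    return err2, err4

if __name__ == "__main__":
    for n in (32, 64):
        err2, err4 = max_errors(n)
        print(f"n={n}: 2nd-order error {err2:.3e}, 4th-order error {err4:.3e}")
```

Halving the grid spacing should reduce the second-order error by roughly a factor of 4 and the fourth-order error by roughly a factor of 16 for a smooth, well-resolved field. This is the regime in which the higher-order scheme has a clear advantage.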
Assimilation of weather radar measurements, including radar reflectivity and radial wind data, has been operational at the Deutscher Wetterdienst with a diagonal observation error (OE) covariance matrix. To implement a full OE covariance matrix, the statistics of the OE have to be estimated a priori, for which the Desroziers method has often been used. However, the resulting statistics consist of contributions from different error sources and are difficult to interpret. In this work, we use an approach based on samples of the truncation error in radar observation space to approximate the representation error due to unresolved scales and processes (RE), and compare its statistics with the OE statistics estimated by the Desroziers method. It is found that the statistics of the RE help explain several important features in the variances and correlation length scales of the OE for both reflectivity and radial wind data. Other error sources, from the microphysical scheme, the radar observation operator and the superobbing technique, may also contribute, for instance, to differences among elevations and observation types. The statistics presented here can serve as a guideline for selecting which observations are assimilated and for assigning the OE covariance matrix, which can be diagonal or full and correlated.
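The Desroziers method mentioned above estimates the OE covariance from the cross-product of analysis and background residuals in observation space, R ≈ E[d_a d_b^T], where d_b = y − H(x_b) and d_a = y − H(x_a). The following is a minimal sketch of that diagnostic on synthetic data. The function names, the three-observation toy system, the identity observation operator and the optimal gain are all assumptions made for illustration, not the operational setup.

```python
import numpy as np

def desroziers_obs_error_cov(d_b, d_a):
    """Estimate R from residual samples of shape (n_samples, n_obs).

    d_b: observation-minus-background residuals
    d_a: observation-minus-analysis residuals
    """
    d_b = np.asarray(d_b, dtype=float)
    d_a = np.asarray(d_a, dtype=float)
    r_est = d_a.T @ d_b / d_b.shape[0]   # sample mean of d_a d_b^T
    return 0.5 * (r_est + r_est.T)       # symmetrise the estimate

def synthetic_check(n_samples=50_000, seed=0):
    """Toy linear-Gaussian test with H = I and a known, correlated R."""
    rng = np.random.default_rng(seed)
    b_true = np.eye(3)
    r_true = np.array([[0.5, 0.2, 0.0],
                       [0.2, 0.5, 0.2],
                       [0.0, 0.2, 0.5]])
    eps_b = rng.multivariate_normal(np.zeros(3), b_true, size=n_samples)
    eps_o = rng.multivariate_normal(np.zeros(3), r_true, size=n_samples)
    d_b = eps_o - eps_b                   # truth is zero, so y - x_b
    gain = b_true @ np.linalg.inv(b_true + r_true)
    d_a = d_b - d_b @ gain.T              # (I - K) d_b for every sample
    return r_true, desroziers_obs_error_cov(d_b, d_a)

if __name__ == "__main__":
    r_true, r_est = synthetic_check()
    print(np.round(r_est, 2))
```

In this toy case the gain is optimal, so the estimate recovers the prescribed R up to sampling noise. With real data the diagnosed matrix mixes several error sources, which is why its interpretation is difficult.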
The ensemble Kalman filter algorithm can produce negative values for non‐negative variables. To mitigate this sign problem and simultaneously maintain mass conservation, a new concept of combining weak constraints on mass conservation and non‐negativity has been introduced in this work, with a focus on hydrometeor variables in convective‐scale data assimilation. We modify the local ensemble transform Kalman filter with weak constraints on mass conservation for each hydrometeor variable and adopt the assimilation of clear‐air reflectivity data as a weak constraint on non‐negativity. We examine the concept by a series of sensitivity experiments using an idealized setup. Results show that both weak constraints successfully improve the mass conservation property in analyses and both reduce the biased increase in integrated mass‐flux divergence and vorticity. Furthermore, the least biased increase is obtained by combining both constraints, and the best forecasts are also achieved by the combination.
Plain Language Summary
Often, the physical properties of a system that we are modeling dictate plausible values of the initial conditions of our numerical models. Unfortunately, when modern data assimilation techniques are used to obtain these initial conditions, the physical property of non‐negativity is frequently violated. On the other hand, algorithms that are able to preserve non‐negativity usually break mass conservation. Here, we propose a fast, easy-to-implement modification of an existing algorithm (the local ensemble transform Kalman filter) that is able to weakly preserve both mass conservation and non‐negativity. In idealized experiments that assimilate radar data in a non‐hydrostatic, convection‐permitting numerical model and update hydrometeor values, we show the benefit of the proposed approach for the prediction of atmospheric water variables.
Key Points
A weakly constrained LETKF for mass conservation and non‐negativity is introduced and examined in convective‐scale data assimilation
Combining both constraints yields the smallest biases in the total mass of hydrometeors and in mass‐flux divergence and vorticity in analyses
Best forecasts are also achieved by the combination
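The sign problem and its link to mass conservation can be seen in a one-variable toy example. The sketch below applies a scalar ensemble square-root update to a non-negative "hydrometeor" ensemble given a clear-air observation of zero, then clips negative members. It illustrates the problem only and is not the weakly constrained LETKF described above; the function name, the six-member ensemble and the observation error variance are hypothetical.

```python
import numpy as np

def scalar_sqrt_filter_update(prior, y_obs, obs_var):
    """Deterministic (square-root) ensemble Kalman update for one directly
    observed scalar variable."""
    prior = np.asarray(prior, dtype=float)
    mean_b = prior.mean()
    var_b = prior.var(ddof=1)
    gain = var_b / (var_b + obs_var)
    mean_a = mean_b + gain * (y_obs - mean_b)
    # Scale perturbations so the analysis variance is (1 - gain) * var_b.
    return mean_a + np.sqrt(1.0 - gain) * (prior - mean_b)

# Hypothetical non-negative prior ensemble (e.g. rain mixing ratio).
prior = np.array([0.0, 0.0, 0.1, 0.4, 1.5, 3.0])
analysis = scalar_sqrt_filter_update(prior, y_obs=0.0, obs_var=0.01)
clipped = np.maximum(analysis, 0.0)   # naive fix: set negatives to zero

print("analysis members:", np.round(analysis, 4))
print("analysis mean   :", analysis.mean())
print("clipped mean    :", clipped.mean())
```

The update leaves some members negative, and clipping them to zero raises the ensemble mean. This is the kind of spurious mass increase that motivates treating non-negativity and mass conservation together.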
Aircraft observations of wind and temperature collected by airport surveillance radars through Mode-S Enhanced Surveillance (Mode-S EHS) were assimilated in the Consortium for Small-Scale Modeling Kilometre-scale Ensemble Data Assimilation (COSMO-KENDA), which couples an ensemble Kalman filter to a 40-member ensemble of the convection-permitting COSMO-DE model. The number of observing aircraft in Mode-S EHS was about 15 times larger than in the AMDAR system. In the comparison of both aircraft observation systems, a similar observation error standard deviation was diagnosed for wind. For temperature, a larger error was diagnosed for Mode-S EHS. With the high density of Mode-S EHS observations, a reduction of temperature and wind error in forecasts of 1 and 3 hours was found, mainly at flight level and less near the surface. The amount of Mode-S EHS data was reduced by random thinning to test the effect of a varying observation density. With the current data assimilation setup, a saturation of the forecast error reduction was apparent when more than 50% of the Mode-S EHS data were assimilated. Forecast kinetic energy spectra indicated that the reduction in error is related to analysis updates on all scales resolved by COSMO-DE.
The Madden–Julian oscillation (MJO) is the dominant component of tropical intraseasonal variability, with wide‐reaching impacts even on extratropical weather and climate patterns. However, predicting the MJO is challenging. One reason is the suboptimal state estimates obtained with standard data assimilation (DA) approaches. These are typically based on filtering methods with Gaussian approximations and do not take into account physical properties that are important specifically for the MJO. In this article, a constrained ensemble DA method is applied to study the impact of different physical constraints on the state estimation and prediction of the MJO. The quadratic programming ensemble (QPEns) algorithm utilized extends the standard stochastic ensemble Kalman filter (EnKF) with specifiable constraints on the updates of all ensemble members. This allows us to recover physically more consistent states and to respect possible associated non‐Gaussian statistics. The study is based on identical twin experiments with an adopted nonlinear model for tropical intraseasonal variability. This so‐called skeleton model succeeds in reproducing the main large‐scale features of the MJO and closely related tropical waves, while keeping adequate simplicity for fast experiments on intraseasonal time‐scales. Conservation laws and other crucial physical properties from the model are examined as constraints in the QPEns. Our results demonstrate an overall improvement in the filtering and forecast skill when the model's total energy is conserved in the initial conditions. The degree of benefit is found to be dependent on the observational setup and the strength of the model's nonlinear dynamics. It is also shown that, even in cases where the statistical error in some waves remains comparable with the stochastic EnKF during the DA stage, their prediction is improved remarkably when using the initial state resulting from the QPEns.
Unsatisfactory predictions of the MJO are partly due to DA methods that do not respect non‐Gaussian PDFs and the physical properties of the tropical atmosphere. Therefore the QPEns, an algorithm extending a stochastic EnKF with state constraints, is tested here on a simplified model for the MJO and associated tropical waves. Our series of identical twin experiments shows, in particular, that a constraint on the truth's nonlinear total energy improves forecasts statistically and can, in certain situations, even prevent filter divergence.
Data assimilation algorithms require an accurate estimate of the uncertainty of the prior (background) field that cannot be adequately represented by the ensemble of numerical model simulations. Partially, this is due to the sampling error that arises from the use of a small number of ensemble members to represent the background‐error covariance. It is also partially a consequence of the fact that the geophysical model does not represent its own error. Several mechanisms have been introduced so far to alleviate the detrimental effects of misrepresented ensemble covariances, allowing for the successful implementation of ensemble data assimilation techniques for atmospheric dynamics. One of the established approaches is additive inflation, which consists of perturbing each ensemble member with a sample from a given distribution. This results in a fixed rank of the effective model‐error covariance matrix. In this article, a more flexible approach is introduced, where the model error samples are treated as additional synthetic ensemble members, which are used in the update step of data assimilation but are not forecast. This way, the rank of the model‐error covariance matrix can be chosen independently of the ensemble. The effect of this altered additive inflation method on the performance of the filter is analyzed here in an idealized experiment. It is shown that the additional synthetic ensemble members can make it feasible to achieve convergence in an otherwise divergent parameter setting of data assimilation. The use of this method also allows for a less stringent localization radius.
In this article, a flexible approach to additive noise is introduced where model error samples are treated as additional synthetic ensemble members. The effect of this altered additive inflation method on the performance of the filter is analyzed here in an idealized experiment. It is shown that the additional synthetic ensemble members can make it feasible to achieve convergence in an otherwise divergent parameter setting of data assimilation.
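One possible reading of the synthetic-member idea is sketched below. Model-error samples are added to the ensemble mean to form extra members, which enter the covariance and gain computation, while only the forecast members are updated and carried forward. This is an illustrative stochastic EnKF under stated assumptions, not the article's implementation; the function names, dimensions and noise amplitudes are hypothetical.

```python
import numpy as np

def perturbations(members):
    """Deviations of the members (columns) from their mean."""
    return members - members.mean(axis=1, keepdims=True)

def augment(ensemble, model_error_samples):
    """Append synthetic members: ensemble mean plus each model-error sample."""
    mean = ensemble.mean(axis=1, keepdims=True)
    return np.hstack([ensemble, mean + model_error_samples])

def update_forecast_members(ensemble, model_error_samples, h_op,
                            y_obs, obs_cov, rng):
    """Perturbed-observation update of the forecast members only, with the
    gain computed from the augmented ensemble."""
    augmented = augment(ensemble, model_error_samples)
    pert = perturbations(augmented)
    cov = pert @ pert.T / (augmented.shape[1] - 1)
    gain = cov @ h_op.T @ np.linalg.inv(h_op @ cov @ h_op.T + obs_cov)
    n_real = ensemble.shape[1]
    obs_noise = rng.multivariate_normal(
        np.zeros(len(y_obs)), obs_cov, size=n_real).T
    innovations = y_obs[:, None] + obs_noise - h_op @ ensemble
    return ensemble + gain @ innovations

rng = np.random.default_rng(1)
ensemble = rng.normal(size=(10, 4))            # 10 variables, 4 members
model_error = 0.5 * rng.normal(size=(10, 5))   # 5 synthetic samples
h_op = np.eye(3, 10)                           # observe first 3 variables
analysis = update_forecast_members(
    ensemble, model_error, h_op, np.zeros(3), 0.1 * np.eye(3), rng)

rank_before = np.linalg.matrix_rank(perturbations(ensemble))
rank_after = np.linalg.matrix_rank(
    perturbations(augment(ensemble, model_error)))
print("covariance rank without / with synthetic members:",
      rank_before, rank_after)
```

The number of synthetic samples is chosen independently of the number of forecast members, so the rank of the covariance used in the update is no longer tied to the forecast ensemble size.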
Numerical discretization schemes have a long history of incorporating the most important conservation properties of the continuous system in order to improve the prediction of the nonlinear flow. The question arises whether data assimilation algorithms should follow a similar approach. To address this issue, we explore the conservation properties during data assimilation using perfect model experiments with a 2D shallow‐water model preserving important properties of the true nonlinear flow. The data assimilation scheme used here is the Local Ensemble Transform Kalman Filter with varying observed variables, inflation, localization radius and thinning interval. It is found that, during the assimilation, the total energy of the analysis ensemble mean converges with time towards the nature run value. However, enstrophy, divergence and the energy spectra are strongly affected by the data assimilation settings. Having in mind that the conservation of both the kinetic energy and enstrophy by the momentum advection schemes in the case of non‐divergent flow prevents a systematic and unrealistic energy cascade towards the high wave numbers, we test the effects on the prediction depending on the type of error in the initial condition. During the assimilation, we assess the downward nonlinear energy cascade through a scalar, domain‐averaged noise measure. We show that the accumulated noise during assimilation and the error of analysis are good indicators of the quality of the prediction.
Data assimilation (DA) methods for convective‐scale numerical weather prediction at operational centres are surveyed. The operational methods include variational methods (3D‐Var and 4D‐Var), ensemble methods (LETKF) and hybrids between variational and ensemble methods (3DEnVar and 4DEnVar). At several operational centres, other assimilation algorithms, like latent heat nudging, are additionally applied to improve the model initial state, with emphasis on convective scales. It is demonstrated that the quality of forecasts based on initial data from convective‐scale DA is significantly better than the quality of forecasts from simple downscaling of larger‐scale initial data. However, the duration of positive impact depends on the weather situation, the size of the computational domain and the data that are assimilated. Furthermore it is shown that more advanced methods applied at convective scales provide improvements over simpler methods. This motivates continued research and development in convective‐scale DA.
Challenges in research and development for improvements of convective‐scale DA are also reviewed and discussed. The difficulty of handling the wide range of spatial and temporal scales makes development of multi‐scale assimilation methods and space–time covariance localization techniques important. Improved utilization of observations is also important. In order to extract more information from existing observing systems of convective‐scale phenomena (e.g. weather radar data and satellite image data), it is necessary to provide improved statistical descriptions of the observation errors associated with these observations.
Data assimilation methods for convective‐scale numerical weather prediction at operational centres are surveyed. It is demonstrated that the quality of forecasts based on initial data from convective‐scale data assimilation is significantly better than the quality of forecasts from simple downscaling. Furthermore it is shown that more advanced methods applied at convective scales provide improvements over simpler methods.