Deep learning (DL) models can accurately predict many hydrologic variables including streamflow and water temperature; however, these models have typically predicted hydrologic variables independently. This study explored the benefits of modeling two interdependent variables, daily average streamflow and daily average stream water temperature, together using multi‐task DL. A multi‐task scaling factor controlled the relative contribution of the auxiliary variable's error to the overall loss during training. Our experiments examined the improvement in prediction accuracy of the multi‐task approach using paired streamflow and water temperature data from sites across the conterminous United States. Our results showed that for 56 out of 101 sites, the best performing multi‐task, single‐site models predicted streamflow better overall than the single‐task models in terms of Nash‐Sutcliffe efficiency. For 43 sites, the best multi‐task, single‐site models made no significant difference in predicting streamflow. The multi‐task approach had a smaller effect when applied to a model trained with data from 101 sites together, significantly improving performance for only 17 sites. The multi‐task scaling factor was consequential in determining to what extent the multi‐task approach was beneficial. A naïve selection of this factor led to significantly worse‐performing models for 3 of 101 sites when predicting streamflow as the primary variable, and 47 of 53 sites when predicting stream temperature as the primary variable. We conclude that a multi‐task approach can make more accurate predictions by leveraging information from interdependent hydrologic variables, but only for some sites, variables, and model configurations.
Key Points
A single deep learning model was used to predict both water temperature and streamflow
The best configured single‐site multi‐task models improved streamflow predictions for most sites tested
A naïve implementation of multi‐task learning was detrimental to water temperature predictions
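The abstract above says only that a scaling factor sets how much the auxiliary variable's error contributes to the training loss. The sketch below is one plausible reading, not the study's implementation: the additive form, the use of mean squared error, and the function names `mse`, `multitask_loss`, and `nse` are all assumptions made for illustration. The Nash‐Sutcliffe efficiency is included because it is the metric the abstract reports.

```python
from typing import Sequence


def mse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean squared error between observations and predictions."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def multitask_loss(primary_obs, primary_pred, aux_obs, aux_pred, scale):
    """Primary-variable error plus the auxiliary error weighted by `scale`.

    Assumed additive form: with scale = 0 this reduces to single-task
    training on the primary variable alone.
    """
    return mse(primary_obs, primary_pred) + scale * mse(aux_obs, aux_pred)


def nse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better
    than predicting the mean of the observations."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance
```

Under this reading, a scaling factor that is too large lets the auxiliary error dominate the gradient, which is one way a naïve choice could degrade predictions of the primary variable.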
Abstract
Although groundwater discharge is a critical stream temperature control process, it is not explicitly represented in many stream temperature models, an omission that may reduce predictive accuracy, hinder management of aquatic habitat, and decrease user confidence. We assessed the performance of a previously‐described process‐guided deep learning model of stream temperature in the Delaware River Basin (USA). We found lower accuracy (root mean square error [RMSE] of 1.71 vs. 1.35°C) and stronger seasonal bias (absolute mean monthly bias of 1.06 vs. 0.68°C) for reaches primarily influenced by deep groundwater as compared to atmospheric conditions. We then tested four approaches for improving groundwater process representation: (a) a custom loss function leveraging the unique patterns of air and water temperature coupling characteristic of different temperature drivers, (b) inclusion of additional groundwater‐relevant catchment attributes, (c) incorporation of additional process model outputs, and (d) a composite model. The custom loss function and the additional attributes significantly improved the predictive accuracy in groundwater‐dominated reaches (RMSE of 1.37 and 1.26°C) and reduced the seasonal bias (absolute mean monthly bias of 0.44 and 0.48°C), but neither approach could identify holdout groundwater reaches. Variable importance analysis indicates the custom loss function nudges the model to use the existing inputs more efficiently, whereas with the added features the model relies on a broader suite of inputs. This analysis is a substantial step toward more accurately representing groundwater discharge processes in stream temperature models and will improve predictive accuracy and inform habitat management.
Plain Language Summary
Groundwater flowing into streams and rivers can cool the water during the summer and warm it during the winter. This creates important habitat for animals like trout and dwarf wedgemussels. Resource managers use computer models of stream temperature, but most models do not simulate groundwater flows with much detail. Insufficient accounting for groundwater could lead to predicted temperatures that are less accurate or less trusted. We tested four ways of including groundwater in a stream temperature model. The particular model we used is a type of machine learning model termed “process‐guided deep learning” because it takes advantage of both the computational advances in machine learning and our collective understanding of the science of stream temperature. We found that two approaches, one that focused on patterns between the air temperature and water temperature and one that incorporated additional descriptions of each stream reach, significantly improved the temperature predictions. Our findings have important implications for stream temperature predictions, habitat management, and methods for incorporating scientific expertise into machine learning models.
Key Points
Existing process‐guided deep learning stream temperature models perform poorly in reaches with groundwater‐controlled temperatures
A custom loss function or enhanced input data improved predictive performance in monitored but not hold‐out groundwater reaches
The loss function used existing input data more efficiently; the enhanced inputs model spread its reliance to a wider range of inputs
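The two metrics reported in the abstract above can be sketched as follows. The abstract does not define "absolute mean monthly bias", so the definition used here (the mean signed error within each calendar month, made absolute, then averaged across months) is an assumption, as are the function names.

```python
import math
from collections import defaultdict


def rmse(obs, pred):
    """Root mean square error over all paired values."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def abs_mean_monthly_bias(months, obs, pred):
    """Average over months of |mean(pred - obs)| within each month.

    Assumed definition: taking the absolute value per month keeps a warm
    bias in one season from cancelling a cold bias in another.
    """
    errors = defaultdict(list)
    for month, o, p in zip(months, obs, pred):
        errors[month].append(p - o)
    monthly = [abs(sum(e) / len(e)) for e in errors.values()]
    return sum(monthly) / len(monthly)
```

A model that runs 1°C warm in January and 1°C cold in July has zero overall bias but an absolute mean monthly bias of 1°C, which is why a seasonal metric is informative for groundwater‐influenced reaches.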
Agricultural runoff from the Mississippi‐Atchafalaya River Basin delivers nitrogen (N) and phosphorus (P) to the Gulf of Mexico, causing hypoxia, and climate drives interannual variation in nutrient loads. Climate phenomena such as El Niño–Southern Oscillation may influence nutrient export through effects on river flow, nutrient uptake, or biogeochemical transformation, but landscape variation at smaller spatial scales can mask climate signals in load or discharge time series within large river networks. We used multivariate autoregressive state‐space modeling to investigate climate signals in the long‐term record (1979–2014) of discharge, N, P, and SiO2 loads at three nested spatial scales within the Mississippi‐Atchafalaya River Basin. We detected significant signals of El Niño–Southern Oscillation and land‐surface temperature anomalies in N loads but not discharge, SiO2, or P, suggesting that large‐scale climate phenomena contribute to interannual variation in nutrient loads through biogeochemical mechanisms beyond simple discharge‐load relationships.
Plain Language Summary
Runoff of excess nutrients from crop fertilizers applied throughout the Mississippi‐Atchafalaya River Basin, particularly nitrogen (N) and phosphorus (P), pollutes freshwater and coastal ecosystems such as the Gulf of Mexico. Though agriculture is the main source, year‐to‐year variation in the size of nutrient loads is largely controlled by precipitation and river flow, which mobilize nutrients from the landscape. Additional climate variables, such as temperature, influence nutrient loads by controlling rates of nutrient uptake or transformation by plants, algae, and microbes, but these processes may be difficult to detect in a nearly continental‐scale river network with heterogeneous subbasins. We identified signals of multiple large‐scale climate phenomena in the long‐term record (1979–2014) of nutrient loads from the Mississippi River and its major tributaries. Climate effects on nutrient loads, particularly N, were different from and often stronger than those on river flow, indicating that long‐term patterns in nutrient loads were influenced by processes beyond simple precipitation‐driven runoff. Variable effects of climate on nutrient export present challenges for reducing nutrient loads to the Mississippi River and Gulf of Mexico. Adjustments to targeted reductions may be needed as global and regional climates change.
Key Points
Significant signals of large‐scale climate phenomena appear in N loads but not discharge, SiO2, or P loads in the Mississippi River Basin at multiple spatial scales
Effects of climate variables differ among nutrients (N, P, SiO2) and nutrient forms (nitrate, ammonium)
Climate‐driven processes independent of river flow contribute to temporal variation in nutrient loads within the Mississippi River Basin
Drought is common in rivers, yet how this disturbance regulates metabolic activity across network scales is largely unknown. Drought often lowers gross primary production (GPP) and ecosystem respiration (ER) in small headwaters but by contrast can enhance GPP and cause algal blooms in downstream estuaries. We estimated ecosystem metabolism across a nested network of 13 reaches from headwaters to the main stem of the Connecticut River from 2015 through 2017, which encompassed a pronounced drought. During drought, GPP and ER increased, but with greater enhancement in larger rivers. Responses of GPP and ER were partially due to warmer temperatures associated with drought, particularly in the larger rivers where temperatures during summer drought were >10°C higher than typical summer baseflow. The larger rivers also had low canopy cover, which allowed primary producers to take advantage of lower turbidity and fewer cloudy days during drought. We conclude that GPP is enhanced by higher temperature, lower turbidity, and longer water residence times that are all a function of low discharge, but ecosystem response in temperate watersheds to these drivers depends on light availability regulated by riparian canopy cover. In larger rivers, GPP increased more than ER during drought, even leading to temporary autotrophy, an otherwise rare event in the typically light-limited heterotrophic Connecticut River main stem. With climate change, rivers and streams may become warmer and drought frequency and severity may increase. Such changes may increase autotrophy in rivers with broad implications for carbon cycling and water quality in aquatic ecosystems.
The metabolic regimes of flowing waters
Bernhardt, E. S.; Heffernan, J. B.; Grimm, N. B. ...
Limnology and Oceanography, 03/2018, Volume 63, Issue S1
Journal Article, Peer Reviewed, Open Access
The processes and biomass that characterize any ecosystem are fundamentally constrained by the total amount of energy that is either fixed within or delivered across its boundaries. Ultimately, ecosystems may be understood and classified by their rates of total and net productivity and by the seasonal patterns of photosynthesis and respiration. Such understanding is well developed for terrestrial and lentic ecosystems but our understanding of ecosystem phenology has lagged well behind for rivers. The proliferation of reliable and inexpensive sensors for monitoring dissolved oxygen and carbon dioxide is underpinning a revolution in our understanding of the ecosystem energetics of rivers. Here, we synthesize our current understanding of the drivers and constraints on river metabolism, and set out a research agenda aimed at characterizing, classifying and modeling the current and future metabolic regimes of flowing waters.
The rapid growth of data in water resources has created new opportunities to accelerate knowledge discovery with the use of advanced deep learning tools. Hybrid models that integrate theory with state‐of‐the‐art empirical techniques have the potential to improve predictions while remaining true to physical laws. This paper evaluates the Process‐Guided Deep Learning (PGDL) hybrid modeling framework with a use‐case of predicting depth‐specific lake water temperatures. The PGDL model has three primary components: a deep learning model with temporal awareness (long short‐term memory recurrence), theory‐based feedback (model penalties for violating conservation of energy), and model pretraining to initialize the network with synthetic data (water temperature predictions from a process‐based model). In situ water temperatures were used to train the PGDL model, a deep learning (DL) model, and a process‐based (PB) model. Model performance was evaluated in various conditions, including when training data were sparse and when predictions were made outside of the range in the training data set. The PGDL model performance (as measured by root‐mean‐square error [RMSE]) was superior to DL and PB for two detailed study lakes, but only when pretraining data included greater variability than the training period. The PGDL model also performed well when extended to 68 lakes, with a median RMSE of 1.65°C during the test period (DL: 1.78°C, PB: 2.03°C; in a small number of lakes PB or DL models were more accurate). This case study demonstrates that integrating scientific knowledge into deep learning tools shows promise for improving predictions of many important environmental variables.
Key Points
Process‐Guided Deep Learning (PGDL) models integrate advanced empirical techniques with process knowledge
We used PGDL to accurately predict lake water temperatures for various conditions
PGDL performance improved significantly when pretraining data included diverse conditions generated by an existing process‐based model
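The "theory‐based feedback" component described above can be pictured as an extra term added to the data‐fitting loss. The sketch below is illustrative only: the penalty form (mean excess of the energy imbalance over a tolerance), the `weight` and `tolerance` parameters, and the function names are assumptions, and the abstract does not specify how the energy terms are computed.

```python
def energy_penalty(modeled_change, flux_change, tolerance=0.0):
    """Mean amount by which the modeled change in lake energy departs from
    the change implied by surface fluxes, beyond an allowed tolerance."""
    excess = [
        max(0.0, abs(m - f) - tolerance)
        for m, f in zip(modeled_change, flux_change)
    ]
    return sum(excess) / len(excess)


def pgdl_style_loss(data_loss, modeled_change, flux_change, weight, tolerance=0.0):
    """Data-fitting loss plus a weighted penalty for energy imbalance.

    `data_loss` is whatever supervised error is already computed
    (for example RMSE against in situ temperatures).
    """
    return data_loss + weight * energy_penalty(modeled_change, flux_change, tolerance)
```

Because the penalty needs no temperature observations, a term of this kind could in principle be evaluated on unlabeled time steps as well as labeled ones.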
A common approach to understanding neurodegenerative disease is comparing gene expression in diseased versus healthy tissues. We illustrate that expression profiles derived from whole tissue RNA highly reflect the degenerating tissues' altered cellular composition, not necessarily transcriptional regulation. To accurately understand transcriptional changes that accompany neuropathology, we acutely purify neurons, astrocytes and microglia from single adult mouse brains and analyse their transcriptomes by RNA sequencing. Using peripheral endotoxemia to establish the method, we reveal highly specific transcriptional responses and altered RNA processing in each cell type, with Tnfr1 required for the astrocytic response. Extending the method to an Alzheimer's disease model, we confirm that transcriptomic changes observed in whole tissue are driven primarily by cell type composition, not transcriptional regulation, and identify hundreds of cell type-specific changes undetected in whole tissue RNA. Applying similar methods to additional models and patient tissues will transform our understanding of aberrant gene expression in neurological disease.
Most environmental data come from a minority of well‐monitored sites. An ongoing challenge in the environmental sciences is transferring knowledge from monitored sites to unmonitored sites. Here, we demonstrate a novel transfer‐learning framework that accurately predicts depth‐specific temperature in unmonitored lakes (targets) by borrowing models from well‐monitored lakes (sources). This method, meta‐transfer learning (MTL), builds a meta‐learning model to predict transfer performance from candidate source models to targets using lake attributes and candidates' past performance. We constructed source models at 145 well‐monitored lakes using calibrated process‐based (PB) modeling and a recently developed approach called process‐guided deep learning (PGDL). We applied MTL to either PB or PGDL source models (PB‐MTL or PGDL‐MTL, respectively) to predict temperatures in 305 target lakes treated as unmonitored in the Upper Midwestern United States. We show significantly improved performance relative to the uncalibrated PB General Lake Model, where the median root mean squared error (RMSE) for the target lakes is 2.52°C. PB‐MTL yielded a median RMSE of 2.43°C; PGDL‐MTL yielded 2.16°C; and a PGDL‐MTL ensemble of nine sources per target yielded 1.88°C. For sparsely monitored target lakes, PGDL‐MTL often outperformed PGDL models trained on the target lakes themselves. Differences in maximum depth between the source and target were consistently the most important predictors. Our approach readily scales to thousands of lakes in the Midwestern United States, demonstrating that MTL with meaningful predictor variables and high‐quality source models is a promising approach for many kinds of unmonitored systems and environmental variables.
Key Points
Meta‐transfer learning (MTL) learns from models trained on data‐rich systems to inform predictions in systems where no observations exist
We use MTL with process‐based and process‐guided deep learning models to accurately predict lake temperatures in the Midwest United States
The most important predictor of transfer model success is the difference in maximum depth between the data‐rich and unmonitored lake
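The selection step of meta‐transfer learning can be sketched as follows, assuming the meta‐learning model has already produced a predicted transfer RMSE for each candidate source lake. The dictionary inputs, the function names, and the simple averaging used for the ensemble are hypothetical; the abstract states only that an ensemble of nine sources per target was used, not how their predictions were combined.

```python
def select_sources(predicted_rmse: dict, k: int) -> list:
    """Return the k source-lake ids with the lowest predicted transfer RMSE."""
    return sorted(predicted_rmse, key=predicted_rmse.get)[:k]


def ensemble_predict(source_predictions: dict, chosen: list) -> list:
    """Average the chosen sources' temperature predictions element-wise
    (an assumed combination rule)."""
    series = [source_predictions[source] for source in chosen]
    return [sum(values) / len(chosen) for values in zip(*series)]
```

For example, with predicted RMSEs of 2.5, 1.9, and 2.1 for hypothetical sources "A", "B", and "C", `select_sources(..., k=2)` keeps "B" and "C", and `ensemble_predict` averages only those two models' outputs for the target lake.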
Satellite estimates of inland water quality have the potential to vastly expand our ability to observe and monitor the dynamics of large water bodies. For almost 50 years, we have been able to remotely sense key water quality constituents like total suspended sediment, dissolved organic carbon, chlorophyll a, and Secchi disk depth. Nonetheless, remote sensing of water quality is poorly integrated into inland water sciences, in part due to a lack of publicly available training data and a perception that remote estimates are unreliable. Remote sensing models of water quality can be improved by training and validation on larger data sets of coincident field and satellite observations, here called matchups. To facilitate model development and deeper integration of remote sensing into inland water science, we have built AquaSat, the largest such matchup data set ever assembled. AquaSat contains more than 600,000 matchups, covering 1984–2019, of ground‐based total suspended sediment, dissolved organic carbon, chlorophyll a, and Secchi disk depth measurements paired with spectral reflectance from Landsat 5, 7, and 8 collected within ±1 day of each other. To build AquaSat, we developed open source tools in R and Python and applied them to existing public data sets covering the contiguous United States, including the Water Quality Portal, LAGOS‐NE, and the Landsat archive. In addition to publishing the data set, we are also publishing our full code architecture to facilitate expanding and improving AquaSat. We anticipate that this work will help make remote sensing of inland water accessible to more hydrologists, ecologists, and limnologists while facilitating novel data‐driven approaches to monitoring and understanding critical water resources at large spatiotemporal scales.
Key Points
AquaSat contains ∼600,000 paired observations of water quality and Landsat reflectance, the largest such matchup data set
Matchups capture diverse water bodies across the USA for 1984–2019; we see clear water quality/reflectance relationships
AquaSat and open source code developed here will enable better development of models for remote sensing of water quality
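The matchup idea (a field sample paired with a satellite observation of the same place within ±1 day) can be illustrated with a small sketch. This is not the AquaSat code: the tuple layouts, the site identifiers, and the function name `find_matchups` are assumptions, and a production pipeline would join on spatial overlap rather than a shared site id.

```python
from datetime import date


def find_matchups(field_obs, satellite_obs, max_days=1):
    """Pair field samples with satellite scenes at the same site whose
    dates differ by at most `max_days`.

    field_obs:      iterable of (site, sample_date, measured_value)
    satellite_obs:  iterable of (site, scene_date, reflectance)
    """
    matchups = []
    for site, sample_date, value in field_obs:
        for scene_site, scene_date, reflectance in satellite_obs:
            same_site = site == scene_site
            close_in_time = abs((sample_date - scene_date).days) <= max_days
            if same_site and close_in_time:
                matchups.append((site, sample_date, scene_date, value, reflectance))
    return matchups
```

The nested loop is quadratic and only suitable for illustration; at the scale of hundreds of thousands of matchups an indexed or sorted join would be needed.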
5,10-Methylenetetrahydrofolate dehydrogenase (MTD) catalyzes the reversible oxidation of 5,10-methylenetetrahydrofolate to 5,10-methenyltetrahydrofolate. This reaction is critical for the supply of one-carbon units at the required oxidation states for the synthesis of purines and dTMP. For most MTDs, dehydrogenase activity is co-located with a methenyl-THF cyclohydrolase activity as part of a bifunctional or trifunctional enzyme. The yeast Saccharomyces cerevisiae contains a monofunctional NAD+-dependent 5,10-methylenetetrahydrofolate dehydrogenase (yMTD). Kinetic, crystallographic, and mutagenesis studies were conducted to identify critical residues in order to gain further insight into the reaction mechanism of this enzyme and its apparent lack of cyclohydrolase activity. Kinetic isotope experiments showed that hydride transfer is rate-limiting for the oxidation of methylenetetrahydrofolate (V_H/V_D = 3.3), and the facial selectivity of the hydride transfer to NAD+ was determined to be Pro-R (A-specific). Model building based on the previously solved structure of yMTD with bound NAD cofactor suggested a possible role for three conserved amino acids in substrate binding or catalysis: Glu121, Cys150, and Thr151. Steady-state kinetic measurements of mutant enzymes demonstrated that Glu121 and Cys150 were essential for dehydrogenase activity, whereas Thr151 allowed some substitution. Our results are consistent with a key role for Glu121 in correctly binding the folate substrate; however, the exact role of Cys150 is unclear. Single mutants Thr57Lys and Tyr98Gln and double mutant T57K/Y98Q were prepared to test the hypothesis that the lack of cyclohydrolase activity in yMTD was due to the substitution of a conserved Lys/Gln pair found in bifunctional MTDs. Each mutant retained dehydrogenase activity, but no cyclohydrolase activity was detected.