• A novel Partial and Invariant approach outperforms the Whole and Variant approach.
• The Dual Invariant Partial filter has equivalent accuracy and is 500 times faster.
• Dual optimizations with multiple stopping criteria are essential in building PCE.
Data assimilation plays an essential role in real-time forecasting but demands repetitive model evaluations over ensembles. To address this computational challenge, a novel, robust, and efficient approach to surrogate data assimilation is presented. It replaces the internal processes of the ensemble Kalman filter (EnKF) with polynomial chaos expansion (PCE) theory. Eight types of surrogate filters, characterized by their different surrogate structures, building systems, and assimilation targets, are proposed and validated. To compensate for the potential shortcomings of existing sequential experimental design (SED), an advanced optimization scheme, named sequential experimental design-polynomial degree (SED-PD), is also proposed. Its dual optimization system resolves the SED issue whereby the polynomial degree had to be selected ad hoc or by trial and error; its multiple stopping criteria ensure convergence even when an accuracy metric does not decrease monotonically over iterations. A comprehensive investigation into how to configure a surrogate filter indicates that the new partial (replacing only part of the original filter) and invariant (valid for the entire time period) approaches are preferred in terms of accuracy and efficiency, which directly reduces the number of dimensions and bridges the gap between hindcasting and real-time forecasting. Of the eight filters, the Dual Invariant Partial filter performs best, with accuracy equivalent to the Dual EnKF and about 500 times greater computational efficiency. Ultimately, the proposed surrogate filter is a promising alternative for performing computationally intensive data assimilation in high-dimensional problems.
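To make the mechanism concrete, here is a minimal, illustrative sketch (not the authors' implementation) of the core idea: the expensive forward model inside an EnKF analysis step is replaced by a cheap PCE polynomial. The coefficient values, ensemble size, and observation settings are hypothetical placeholders; in the paper, the expansion itself is built by the SED-PD scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical PCE surrogate standing in for the forward model: in the
# paper the expansion is built by the SED-PD scheme; the coefficients
# here are illustrative placeholders.
pce_coeffs = np.array([1.0, 0.5, -0.1])  # degree-2 expansion

def pce_forecast(state):
    """Cheap surrogate forecast: evaluate the PCE instead of the model."""
    return np.polynomial.polynomial.polyval(state, pce_coeffs)

def enkf_analysis(ensemble, obs, obs_err_var):
    """Standard (scalar) EnKF analysis applied to surrogate forecasts."""
    forecast = pce_forecast(ensemble)            # surrogate replaces model
    p_f = np.var(forecast, ddof=1)               # forecast error variance
    gain = p_f / (p_f + obs_err_var)             # Kalman gain
    y = obs + rng.normal(0.0, np.sqrt(obs_err_var), ensemble.size)
    return forecast + gain * (y - forecast)

ensemble = rng.normal(2.0, 0.3, 100)             # 100-member prior ensemble
analysis = enkf_analysis(ensemble, obs=2.6, obs_err_var=0.05)
print(analysis.mean(), analysis.std())
```

Because each forecast is a polynomial evaluation rather than a full model run, the per-ensemble cost collapses, which is the source of the reported speedup.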
Abstract
Summary
Phylogenetic profiles form the basis for tracing proteins and their functions across species and through time. Novel genome sequences nowadays often represent species from the remotest corners of the tree of life. Phylogenetic profiling therefore becomes increasingly important for functionally annotating these data and for integrating them into a comprehensive view of organismal evolution. To strengthen the link between the sharing of a gene across species and the sharing of the corresponding function, it is now common to complement phylogenetic profiles with additional information, such as domain architecture similarities between orthologs or pairwise similarities of other protein features. However, few visualization tools facilitate an intuitive integration of these various information layers. Here, we present PhyloProfile, an R-based tool to visualize, explore, and analyze multi-layered phylogenetic profiles.
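For readers unfamiliar with the data structure, a multi-layered phylogenetic profile can be thought of as a long table of (gene, taxon, ortholog) records carrying extra per-pair score columns, from which the classical presence/absence matrix is a pivot. The sketch below is illustrative only; column names and values are hypothetical and do not reproduce PhyloProfile's exact input format.

```python
import pandas as pd

# Illustrative sketch of a multi-layered phylogenetic profile as a long
# table: one row per (seed gene, taxon, ortholog), with extra per-pair
# layers such as a domain-architecture similarity score.  Column names
# and values are hypothetical, not PhyloProfile's exact input format.
profile = pd.DataFrame([
    {"geneID": "geneA", "taxonID": "ncbi9606",  "orthoID": "geneA|human", "archScore": 0.95},
    {"geneID": "geneA", "taxonID": "ncbi10090", "orthoID": "geneA|mouse", "archScore": 0.81},
    {"geneID": "geneB", "taxonID": "ncbi9606",  "orthoID": "geneB|human", "archScore": 0.67},
])

# The classical presence/absence profile is a pivot of this table ...
presence = (profile.pivot_table(index="geneID", columns="taxonID",
                                values="orthoID", aggfunc="count")
                   .fillna(0).astype(int))

# ... while each additional layer is carried alongside, per cell.
layer = profile.pivot_table(index="geneID", columns="taxonID",
                            values="archScore")
print(presence, layer, sep="\n\n")
```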
Availability and implementation
PhyloProfile is available as open source code under the MIT license at https://github.com/BIONF/phyloprofile. An online version for testing PhyloProfile and for small to medium-scale analyses is available at http://applbio.biologie.uni-frankfurt.de/phyloprofile.
Hydrologic flood prediction is a complex and difficult task because of various sources of inherent uncertainty. Accurately quantifying these uncertainties plays a significant role in providing flood warnings and mitigating risk, but it is time-consuming. To offset the cost of quantifying the uncertainty, we adopted a highly efficient metamodel based on polynomial chaos expansion (PCE) theory and applied it to a lumped, deterministic rainfall–runoff model (Nedbør–Afstrømnings model, NAM) combined with generalized likelihood uncertainty estimation (GLUE). The central conclusions are: (1) the subjective aspects of GLUE (e.g., the cutoff threshold values of the likelihood function) were investigated for 8 flood events in the Thu Bon River watershed in Vietnam, and values of 0.82 for Nash–Sutcliffe efficiency, 4.05% for peak error, and 4.35% for volume error were determined as the acceptance thresholds. Moreover, the number of behavioral ensemble sets required to maintain a sufficient uncertainty range while avoiding unnecessary computation was set to 500. (2) The number of experimental designs (N) and the polynomial degree (p) are key factors in estimating PCE coefficients, and values of N = 50 and p = 4 are preferred. (3) The results computed using a PCE model consisting of polynomial bases are as good as those given by the NAM, while generating an ensemble with the PCE model is approximately seventeen times faster. (4) Two parameters ("CQOF" and "CK12") turned out to be most dominant, based on visual inspection of the posterior distributions and on Sobol' and Morris sensitivity analyses. Identifying the posterior parameter distributions during calibration helps to find the behavioral sets even faster. This unified framework, which presents the most efficient ways of predicting the flow regime and quantifying its uncertainty without deteriorating accuracy, will ultimately be helpful for providing warnings and mitigating flood risk in a timely manner.
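As an illustration of conclusion (1), a GLUE-style behavioral screening using the reported acceptance thresholds might look as follows. This is a minimal sketch: `run_nam` and `sample_params` are hypothetical stand-ins for the NAM model (or its PCE surrogate) and the parameter sampler.

```python
import numpy as np

# Acceptance thresholds reported in the study.
NSE_MIN, PEAK_ERR_MAX, VOL_ERR_MAX = 0.82, 4.05, 4.35

def nse(obs, sim):
    """Nash-Sutcliffe efficiency."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def is_behavioral(obs, sim):
    """GLUE-style acceptance combining the three reported criteria."""
    peak_err = abs(sim.max() - obs.max()) / obs.max() * 100.0
    vol_err = abs(sim.sum() - obs.sum()) / obs.sum() * 100.0
    return (nse(obs, sim) >= NSE_MIN
            and peak_err <= PEAK_ERR_MAX
            and vol_err <= VOL_ERR_MAX)

def sample_behavioral(run_nam, sample_params, obs, target=500):
    """Draw parameter sets until `target` behavioral sets are retained
    (500 in the study).  `run_nam` and `sample_params` are hypothetical
    stand-ins for the model/surrogate and the sampler."""
    behavioral = []
    while len(behavioral) < target:
        theta = sample_params()
        if is_behavioral(obs, run_nam(theta)):
            behavioral.append(theta)
    return behavioral
```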
Applications of process-based models (PBMs) for prediction are confounded by multiple uncertainties and computational burdens, resulting in appreciable errors. A novel modeling framework combining a high-fidelity PBM with surrogate and machine learning (ML) models is developed to tackle these challenges and is applied to streamflow prediction. A surrogate model permits high computational efficiency of a PBM solution at a minimal loss of accuracy. A novel probabilistic ML model partitions the PBM-surrogate prediction errors into reducible and irreducible types, quantifying the distributions that arise from both explicitly perceived uncertainties (such as parametric uncertainty) and those that are entirely hidden to the modeler (not included or unexpected). Using this approach, we demonstrate a substantial improvement of streamflow predictive accuracy for a case-study urbanized watershed. Such a framework provides an efficient solution combining the strengths of high-fidelity and physics-agnostic models for a wide range of prediction problems in geosciences.
Plain Language Summary
This study proposes a new framework that combines three different modeling techniques to make flood forecasting more accurate. The framework combines the strengths of (a) complex models (process-based models, PBMs), based on our understanding of relevant processes, that can reproduce measurable quantities; (b) simpler models designed to mimic a PBM's solutions, known as surrogate models, that make predictions within a few seconds; and (c) machine learning models that can detect relationships among variables using only data, improve prediction accuracy, and provide estimates of prediction uncertainty. The framework is tested in an urbanized watershed and shows a significant improvement in both computational efficiency and accuracy of streamflow prediction. Ultimately, the proposed framework is a novel, powerful solution that combines the latest advances in different types of modeling approaches to solve prediction problems in geosciences. Its adaptability and efficiency make it suitable for a wide range of situations.
Key Points
While PBMs are physics‐based, the complexity of uncertainties and the high computational burden have limited their utility for predictions
The developed framework integrates process-based, surrogate, and machine learning (ML) models to predict ensemble flood attributes with error quantification
A novel probabilistic ML model partitions the errors into reducible and irreducible types, also quantifying their distributions
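The error-partitioning idea in the last key point can be sketched with a generic quantile-regression model. This is one plausible realization under stated assumptions, not the authors' specific probabilistic ML model, and all data here are synthetic.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Synthetic stand-ins: forcing features and surrogate streamflow predictions.
X = rng.uniform(0, 10, (500, 1))
surrogate_pred = 2.0 * X.ravel()
observed = surrogate_pred + rng.normal(0.0, 0.5 + 0.1 * X.ravel())

# An ML model learns conditional quantiles of the PBM-surrogate residuals,
# yielding a predictive error distribution.
resid = observed - surrogate_pred
q_pred = {}
for q in (0.05, 0.5, 0.95):
    model = GradientBoostingRegressor(loss="quantile", alpha=q)
    q_pred[q] = model.fit(X, resid).predict(X)

# The median acts as a reducible-error correction; the 5-95% band around
# the corrected prediction approximates the remaining (irreducible) spread.
corrected = surrogate_pred + q_pred[0.5]
lower, upper = surrogate_pred + q_pred[0.05], surrogate_pred + q_pred[0.95]
```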
Oxaliplatin (OXA) was coupled to PEGylated polyamidoamine dendrimers of the fourth generation (G4-PEG@OXA) and compared with PEGylated half-generation dendrimers (G3.5-PEG@OXA). Proton nuclear magnetic resonance and Fourier-transform infrared spectroscopy confirmed the successful incorporation of OXA as well as the synthesis of the carrier systems. Both carrier systems formed spherical nanoparticles smaller than 100 nm, a size range capable of exerting toxicity on cancer cells. The average drug loading efficiency (DLE) of G4-PEG@OXA was 84.63%, higher than the 75.69% DLE of G3.5-PEG. The release kinetics of G4-PEG@OXA and G3.5-PEG@OXA showed no burst release phenomenon, whereas over 40% of free OXA was released within the first hour. Sustained release of OXA was achieved when it was encapsulated in these carriers, with the G4 carrier liberating OXA more slowly (3.4%–6.4%) than the G3.5 one (11.9%–22.8%). The in vitro cytotoxicity of G4-PEG@OXA was evaluated in HeLa cell lines using the resazurin assay and a live/dead staining test. Although free OXA showed rather moderate killing ability, G4-PEG@OXA still produced low HeLa viability, outperforming G3.5-PEG@OXA owing to the amount of OXA released. The benefit of this system is that it overcomes the burst release phenomenon, minimizing OXA toxicity without compromising its efficiency.
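For reference, the drug loading efficiency quoted above is conventionally defined as the percentage of drug retained in the carrier relative to the amount initially fed; the abstract does not state the exact formula used, so the following standard definition is an assumption:

```latex
% Standard DLE definition (an assumption; not stated in the abstract):
\mathrm{DLE}\ (\%) = \frac{m_{\text{drug, encapsulated}}}{m_{\text{drug, fed}}} \times 100
```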
A series of novel N-substituted hydrazide derivatives were synthesized by reacting atranorin, a compound with a natural depside structure (1), with a range of hydrazines. The natural product and 12 new analogues (2–13) were investigated for inhibition of α-glucosidase. The N-substituted hydrazide derivatives showed more potent inhibition than the parent compound. The experimental results were corroborated by docking analysis. This study suggests that these compounds are promising molecules for diabetes therapy. Molecular dynamics simulations of compound 2, which gave the best docking model, were carried out with GROMACS for up to 20 ns to explore the stability of the ligand–protein complex. Furthermore, all synthetic compounds 2–13 were evaluated against the normal HEK293 cell line to assess their cytotoxicity.
Introduction
Neoantigen-based immunotherapy has emerged as a promising strategy for improving the life expectancy of cancer patients. This therapeutic approach relies heavily on the accurate identification of cancer mutations using DNA sequencing (DNAseq) data. However, current workflows tend to provide a large number of neoantigen candidates, of which only a limited number elicit efficient and immunogenic T-cell responses suitable for downstream clinical evaluation. To overcome this limitation and increase the number of high-quality immunogenic neoantigens, we propose integrating RNA sequencing (RNAseq) data into the mutation identification step of the neoantigen prediction workflow.
Methods
In this study, we characterize the mutation profiles identified from DNAseq and/or RNAseq data in tumor tissues of 25 patients with colorectal cancer (CRC). Immunogenicity was then validated by ELISpot assay using synthetic long peptides (sLP).
Results
Only 22.4% of variants were shared between the two methods. In contrast, RNAseq-derived variants displayed unique features of affinity and immunogenicity. We further established that neoantigen candidates identified from RNAseq data significantly increased the number of highly immunogenic neoantigens (confirmed by ELISpot) that would otherwise be overlooked if relying solely on DNAseq data.
Discussion
This integrative approach holds great potential for improving the selection of neoantigens for personalized cancer immunotherapy, ultimately leading to enhanced treatment outcomes and improved survival rates for cancer patients.
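To illustrate how DNAseq- and RNAseq-derived call sets are compared and combined, here is a minimal sketch. Variants are modeled as (chromosome, position, ref, alt) tuples; the coordinates and the Jaccard-style overlap definition are placeholders, not data or definitions from the study.

```python
# Illustrative only: a variant as a (chromosome, position, ref, alt) tuple.
# Coordinates below are placeholders, not data from the study.
dna_variants = {("chr1", 1000, "C", "T"), ("chr2", 2000, "G", "A")}
rna_variants = {("chr2", 2000, "G", "A"), ("chr3", 3000, "A", "G"),
                ("chr4", 4000, "T", "C")}

shared = dna_variants & rna_variants          # called by both methods
union = dna_variants | rna_variants           # all distinct calls
print(f"shared fraction: {len(shared) / len(union):.1%}")  # Jaccard-style

# Integrating RNAseq-derived calls enlarges the candidate pool that feeds
# downstream neoantigen prediction (HLA binding, immunogenicity filters).
candidates = sorted(union)
```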
The identification and quantification of actionable mutations are critical for guiding targeted therapy and monitoring drug response in colorectal cancer. Liquid biopsy (LB) based on plasma cell-free DNA analysis has emerged as a noninvasive approach with many clinical advantages over conventional tissue sampling. Here, we developed an LB protocol using ultra-deep massively parallel sequencing and validated its clinical performance for the detection and quantification of actionable mutations in three major driver genes (KRAS, NRAS, and BRAF). The assay showed 92% concordance for mutation detection between plasma and paired tissues and great reliability in quantifying variant allele frequency.
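As a worked illustration of the quantification target, variant allele frequency (VAF) is simply the fraction of sequencing reads at a locus that support the variant allele; the read counts below are invented for illustration.

```python
def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """VAF: fraction of reads at a locus that support the variant allele."""
    return alt_reads / total_reads

# Invented example: 42 variant-supporting reads at 10,000x ultra-deep
# coverage correspond to a VAF of 0.42%.
print(f"{variant_allele_frequency(42, 10_000):.2%}")
```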
A novel modeling framework that simultaneously improves accuracy, predictability, and computational efficiency is presented. It embraces the benefits of three modeling techniques integrated together for the first time: surrogate modeling, parameter inference, and data assimilation. The use of polynomial chaos expansion (PCE) surrogates significantly decreases computational time. Parameter inference allows for faster model convergence, reduced uncertainty, and superior accuracy of simulated results. Ensemble Kalman filters assimilate errors that occur during forecasting. To examine the applicability and effectiveness of the integrated framework, we developed 18 approaches according to how surrogate models are constructed, what type of parameter distributions are used as model inputs, and whether model parameters are updated during the data assimilation procedure. We conclude that (1) PCE must be built over various forcing and flow conditions and, in contrast to previous studies, does not need to be rebuilt at each time step; (2) model parameter specification that relies on constrained, posterior information about parameters (so-called Selected specification) can significantly improve forecasting performance and reduce uncertainty bounds compared to Random specification using prior information; and (3) no substantial differences in results exist between single and dual ensemble Kalman filters, but the latter better simulates flood peaks. The use of PCE effectively compensates for the computational load added by parameter inference and data assimilation (up to ~80 times faster). The presented approach therefore contributes to a shift in the modeling paradigm, arguing that complex, high-fidelity hydrologic and hydraulic models should be increasingly adopted for real-time and ensemble flood forecasting.
Key Points
A surrogate model must be built over various forcing and flow conditions and it does not need to be rebuilt at each time step
Model parameter specification for data assimilation can significantly improve forecasting performance and reduce uncertainty bounds
No substantial differences in results exist between single and dual EnKFs, but the latter better simulates flood peaks
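To clarify the third key point, the sketch below shows one generic way a dual EnKF differs from a single filter: parameters and states are updated in two sequential filter passes per time step. The `model` propagator and all settings are hypothetical; this is not the study's implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def enkf_update(ens, pred, obs, obs_var):
    """Generic perturbed-observation EnKF update of ensemble `ens`,
    given the corresponding predicted observations `pred` (scalar case)."""
    gain = np.cov(ens, pred)[0, 1] / (np.var(pred, ddof=1) + obs_var)
    y = obs + rng.normal(0.0, np.sqrt(obs_var), ens.size)
    return ens + gain * (y - pred)

def dual_enkf_step(model, states, params, forcing, obs, obs_var):
    """One dual-EnKF time step: a parameter-filter pass, then a
    state-filter pass.  `model` is a hypothetical one-step propagator."""
    pred = model(states, params, forcing)       # forecast with prior params
    params = enkf_update(params, pred, obs, obs_var)  # 1) update parameters
    pred = model(states, params, forcing)       # re-forecast with new params
    states = enkf_update(pred, pred, obs, obs_var)    # 2) update states
    return states, params
```

A single EnKF would instead perform one joint (or state-only) update per step; the second forecast with refreshed parameters is what lets the dual variant track flood peaks more closely.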
The present study examined temporal variations in water and sediment discharges in the Red River basin from 1958 to 2021 resulting from climate change and anthropogenic factors, with projections extended to 2100. The 64-year observational period was divided into five distinct stages: 1958–1971 (Stage I: natural conditions); 1972–1988 (Stage II: onset of human activities); 1989–2010 (Stage III: post-Hoa Binh Dam construction); 2011–2016 (Stage IV: a series of new dam constructions); and 2017–2021 (Stage V: combined effects of human activities and climate change). Attribution analysis revealed that human activities accounted for 62% and 92% of the dramatic declines in sediment loads in Stages III and IV, respectively. Projections of fluvial sediment loads over an approximately 150-year timeframe (1958–2100) indicate an overriding impact from human activities. Climate change projections based on four scenarios (−5%, +5%, +10%, and +15% change per year) suggest corresponding decreases or increases in river flows. This study predicts that projected 21st-century increases in river flow attributable to climate change will offset up to eight percent of the human-induced sediment load deficit.
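The abstract does not detail the attribution method, but a common scheme for separating climatic and human contributions to a change in sediment load S fits a rainfall–sediment relationship on the natural baseline (Stage I) and predicts post-change loads from it; under that assumption:

```latex
% Common baseline-based attribution scheme (an assumption; the abstract
% does not specify the method used in the study):
\Delta S_{\mathrm{total}}   = \bar{S}^{\,\mathrm{post}}_{\mathrm{obs}}  - \bar{S}^{\,\mathrm{pre}}_{\mathrm{obs}},
\qquad
\Delta S_{\mathrm{human}}   = \bar{S}^{\,\mathrm{post}}_{\mathrm{obs}}  - \bar{S}^{\,\mathrm{post}}_{\mathrm{pred}},
\qquad
\Delta S_{\mathrm{climate}} = \bar{S}^{\,\mathrm{post}}_{\mathrm{pred}} - \bar{S}^{\,\mathrm{pre}}_{\mathrm{obs}},
```

with the percent human contribution given by the ratio ΔS_human / ΔS_total × 100%; the reported 62% and 92% shares for Stages III and IV would correspond to this kind of ratio.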