COVID-19, the disease caused by the novel coronavirus 2019, has caused grave woes across the globe since it was first reported in the epicentre of Wuhan, Hubei, China, in December 2019. The spread of COVID-19 in China has been successfully curtailed by massive travel restrictions that rendered more than 900 million people housebound for more than two months since the lockdown of Wuhan, and elsewhere, on 23 January 2020. Here, we assess the impact of China's massive lockdowns and travel restrictions reflected by the changes in mobility patterns across and within provinces, before and during the lockdown period. We calibrate movement flow between provinces with an epidemiological compartment model to quantify the effectiveness of lockdowns and reductions in disease transmission. Our analysis demonstrates that the onset and phase of local community transmission in other provinces depend on the cumulative population outflow received from the epicentre Hubei. Moreover, we show that synchronous lockdowns and the consequent reduced mobility take a certain time to have an actual impact on suppressing the spread. Such highly coordinated nationwide lockdowns, applied via a top-down approach along with high levels of compliance from the bottom up, are central to mitigating and controlling outbreaks and averting a massive health crisis.
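As a rough illustration of the kind of model this abstract describes, the sketch below couples a discrete-time SIR compartment model across provinces through a daily travel matrix, and scales both travel and transmission down after a lockdown day. It is not the authors' model: the function names, the two-province setup, and every rate (`beta`, `gamma`, `mobility_scale`, `beta_scale`) are illustrative assumptions.

```python
def move(X, flow, scale):
    """Redistribute one compartment between patches.

    flow[i][j] is the (assumed) fraction of patch i that travels to j per day.
    """
    n = len(X)
    out = list(X)
    for i in range(n):
        for j in range(n):
            if i != j:
                moved = scale * flow[i][j] * X[i]
                out[i] -= moved
                out[j] += moved
    return out


def simulate_sir(pop, infected0, flow, beta, gamma, days,
                 lockdown_day, mobility_scale, beta_scale):
    """Discrete-time SIR in several patches coupled by travel (toy sketch)."""
    n = len(pop)
    S = [float(pop[i] - infected0[i]) for i in range(n)]
    I = [float(x) for x in infected0]
    R = [0.0] * n
    for day in range(days):
        locked = day >= lockdown_day
        b = beta * (beta_scale if locked else 1.0)   # transmission rate today
        m = mobility_scale if locked else 1.0        # travel multiplier today
        # Step 1: local transmission and recovery inside each patch.
        for i in range(n):
            N = S[i] + I[i] + R[i]
            new_inf = min(S[i], b * S[i] * I[i] / N)
            new_rec = gamma * I[i]
            S[i] -= new_inf
            I[i] += new_inf - new_rec
            R[i] += new_rec
        # Step 2: travel between patches, applied to every compartment.
        S, I, R = (move(X, flow, m) for X in (S, I, R))
    return S, I, R
```

Running the function twice, once with a finite `lockdown_day` and once with a day beyond the simulation horizon, gives a simple way to compare how many people in a destination patch have been infected with and without the restriction.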
Anaerobic digestion processes create biogases that can be useful sources of energy. The development of data-driven models of anaerobic digestion processes via operating parameters can lead to increased biogas production rates, resulting in greater energy production, through process modification and optimization. This study assessed processed and unprocessed input operating parameter variables for the development of regression models with transparent structures (‘white-box’ models) to: (1) estimate biogas production rates from municipal wastewater treatment plant (MWTP) anaerobic digestors; (2) compare their performances to artificial neural network (ANN) and adaptive network-based fuzzy inference system (ANFIS) models with opaque structures (‘black-box’ models) using Monte Carlo Simulation for uncertainty analysis; and (3) integrate the models with a genetic algorithm (GA) to optimize operating parameters for maximization of MWTP biogas production rates. The input variables were anaerobic digestion operating parameters from an MWTP including volatile fatty acids, total/fixed/volatile solids, pH, and inflow rate, which were processed via correlation tests and principal component analysis. Overall, the results indicated that the processed data did not improve regression model performances. Additionally, the developed non-linear regression model with the unprocessed inputs had the best performance based on values including R = 0.81, RMSE = 0.95, and IA = 0.89. However, this model was less accurate than the ANN and ANFIS models but, interestingly, had less uncertainty, which indicates a trade-off between model accuracy and uncertainty. Thus, all three models were coupled with GA optimization, with maximum biogas production rate estimates of 22.0, 23.1, and 28.6 m³/min for the ANN, ANFIS, and non-linear regression models, respectively.
•Municipal wastewater treatment creates greenhouse gases.
•Data-driven models can be used to estimate gas emissions.
•Monte Carlo Simulation is useful for determination of model uncertainty.
•Energy creation from gases can be optimized based on model estimations.
•Genetic algorithm optimization resulted in marked increases in biogas rates.
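The coupling of a fitted model with a genetic algorithm, as described in the abstract above, can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: `surrogate_biogas_rate` is a hypothetical stand-in for a trained regression, ANN, or ANFIS model, the two inputs (pH and inflow rate) and their `BOUNDS` are assumed values, and the GA operators (tournament selection, blend crossover, Gaussian mutation, elitism) are one common choice among many.

```python
import random

# Hypothetical operating ranges: (pH, inflow rate). Not taken from the study.
BOUNDS = [(6.5, 7.8), (0.5, 3.0)]


def surrogate_biogas_rate(x):
    """Hypothetical stand-in for a fitted model; peaks at pH 7.2, inflow 2.0."""
    ph, inflow = x
    return 25.0 - 40.0 * (ph - 7.2) ** 2 - 3.0 * (inflow - 2.0) ** 2


def clip(value, low, high):
    return max(low, min(high, value))


def genetic_maximize(fitness, bounds, pop_size=30, generations=40,
                     mutation_sd=0.05, seed=0):
    """Maximize `fitness` over box `bounds` with a small elitist GA."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = [max(fitness(ind) for ind in pop)]  # best score per generation
    for _ in range(generations):
        elite = max(pop, key=fitness)
        children = [list(elite)]  # elitism: keep the best individual unchanged
        while len(children) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            child = []
            for (lo, hi), a, b in zip(bounds, p1, p2):
                mix = rng.random()
                gene = mix * a + (1 - mix) * b                   # blend crossover
                gene += rng.gauss(0.0, mutation_sd * (hi - lo))  # mutation
                child.append(clip(gene, lo, hi))
            children.append(child)
        pop = children
        history.append(max(fitness(ind) for ind in pop))
    best = max(pop, key=fitness)
    return best, fitness(best), history
```

Calling `genetic_maximize(surrogate_biogas_rate, BOUNDS)` returns the best operating point found, its estimated rate, and the per-generation best scores. Because the elite individual is always carried over, those scores never decrease.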
Forecasting thermal load is a key component of most optimization solutions for controlling district heating and cooling systems. Recent studies have analysed the results of a number of data-driven methods applied to thermal load forecasting; this paper presents the results of combining a collection of these individual methods in an expert system. The expert system combines multiple thermal load forecasts in such a way that it always tracks the best expert in the system. This solution is tested and validated using a 27-month thermal load dataset obtained from 10 residential buildings located in Rottne, Sweden, together with outdoor temperature information received from a weather forecast service. The expert system is composed of the following data-driven methods: linear regression, extremely randomized trees regression, feed-forward neural network and support vector machine. The results of the proposed solution are compared with the results of the individual methods.
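The abstract does not say which combination rule the expert system uses. As one plausible illustration of "tracking the best expert", the sketch below uses exponentially weighted averaging with a fixed-share step, a standard online-learning technique for this purpose. The function name, the learning rate `eta`, and the share rate `alpha` are assumptions chosen for the example.

```python
import math


def fixed_share_forecast(expert_preds, targets, eta=0.5, alpha=0.05):
    """Combine expert forecasts online with fixed-share weight updates.

    expert_preds[t][k] is expert k's forecast at step t; targets[t] is the
    value observed after the combined forecast for step t has been issued.
    """
    n = len(expert_preds[0])
    w = [1.0 / n] * n
    combined = []
    for preds, y in zip(expert_preds, targets):
        # Forecast first, using only the weights learned so far.
        combined.append(sum(wi * p for wi, p in zip(w, preds)))
        # Exponential update: penalise each expert by its squared error.
        v = [wi * math.exp(-eta * (p - y) ** 2) for wi, p in zip(w, preds)]
        total = sum(v)
        if total == 0.0:
            # All weights underflowed (very large errors): restart uniformly.
            v = [1.0 / n] * n
        else:
            v = [vi / total for vi in v]
        # Share step: mix in a little uniform weight so a previously poor
        # expert can recover quickly when it becomes the best one.
        w = [(1 - alpha) * vi + alpha / n for vi in v]
    return combined, w
```

The share step is what lets the combination follow a best expert that changes over time, for example between heating seasons. Without it, the weights of experts that performed badly early on shrink toward zero and recover only slowly.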
Machine learning for ecosystem services Willcock, Simon; Martínez-López, Javier; Hooftman, Danny A.P. ...
Ecosystem services, 10/2018, Volume 33
Journal Article
Peer reviewed
Open access
•Machine learning processes automatically provide estimates of uncertainty.
•Uncertainty information enables decision-makers to assign their own thresholds.
•Machine learning algorithms can help scientists make use of ‘big data’.
Recent developments in machine learning have expanded data-driven modelling (DDM) capabilities, allowing artificial intelligence to infer the behaviour of a system by computing and exploiting correlations between observed variables within it. Machine learning algorithms may enable the use of increasingly available ‘big data’ and assist in applying ecosystem service models across scales, analysing and predicting the flows of these services to disaggregated beneficiaries. We use the Weka and ARIES software to produce two examples of DDM: firewood use in South Africa and biodiversity value in Sicily, respectively. Our South African example demonstrates that DDM (64–91% accuracy) can identify the areas where firewood use is within the top quartile with accuracy comparable to that of conventional modelling techniques (54–77% accuracy). The Sicilian example highlights how DDM can be made more accessible to decision makers, who show both capacity and willingness to engage with uncertainty information. Uncertainty estimates, produced as part of the DDM process, allow decision makers to determine what level of uncertainty is acceptable to them and to use their own expertise for potentially contentious decisions. We conclude that DDM has a clear role to play when modelling ecosystem services, helping produce interdisciplinary models and holistic solutions to complex socio-ecological issues.
Lithium-ion batteries are a prominent technology for the electrification of the transport sector, which itself is a key measure towards the departure from fossil fuels. The “green shift” is taking place in the marine industry too, where the number of battery-powered vessels is growing fast. In this case, monitoring the battery State of Health is more essential than ever to optimise battery use, promote safety, and ensure the coverage of ship power and energy demands. Classification societies typically require annual capacity tests for this purpose; however, the tests are disruptive, costly and time-consuming. As a consequence, they are seldom performed, and they are not always fully reliable. We propose a novel alternative semi-supervised learning approach to estimate the State of Health of a lithium-ion battery system with no labelled data, starting from a minimal set of weakly labelled data from another similar system. The method is based on operational sensor data gathered from the battery, together with the battery State of Charge. Our results show that the procedure is valid, and that the obtained estimates can be used to make significant progress in failure prevention, operational optimisation, and the planning of batteries at the design stage.
•A novel semi-supervised method for on/offline battery health monitoring is presented.
•The approach is developed with real usage data from the maritime field.
•The method is versatile and provides sensible results, in line with expectations.
•A cumulative model is applied on labels generated with this approach with low errors.
Artificial intelligence is a rapidly expanding area of research, with the disruptive potential to transform traditional approaches in the pharmaceutical industry, from drug discovery and development to clinical practice. Machine learning, a subfield of artificial intelligence, has fundamentally transformed in silico modelling and has the capacity to streamline clinical translation. This paper reviews data-driven modelling methodologies with a focus on drug formulation development. Despite recent advances, there is limited modelling guidance specific to drug product development and a trend towards suboptimal modelling practices, resulting in models that may not give reliable predictions in practice. There is an overwhelming focus on benchtop experimental outcomes obtained for a specific modelling aim, leaving the capabilities of data scraping or the use of combined modelling approaches yet to be fully explored. Moreover, the preference for high accuracy can lead to a reliance on black box methods over interpretable models. This further limits the widespread adoption of machine learning as black boxes yield models that cannot be easily understood for the purposes of enhancing product performance. In this review, recommendations for conducting machine learning research for drug product development to ensure trustworthiness, transparency, and reliability of the models produced are presented. Finally, possible future directions on how research in this area might develop are discussed to aim for models that provide useful and robust guidance to formulators.
In the portfolio of technologies available for net zero-enabling solutions, such as carbon capture and low-carbon production of hydrogen, membrane-based gas separation is a sustainable alternative to energy-intensive processes, such as solvent-based absorption or cryogenic distillation. Detailed knowledge of membrane materials performance in wide operative ranges is a necessary prerequisite for the design of efficient membrane processes. With the increasing popularization of data-driven methods in natural sciences and engineering, the investigation of their potential to support materials and process design for gas separation with membranes has received increasing attention, as it can help compact the lab-to-market cycle. In this work we review several machine learning (ML) strategies for the estimation of the gas separation performance of polymer membranes. New hybrid modelling strategies, in which ML complements physics-based models and simulation methods, are also discussed. Such strategies can enable the fast screening of large databases of existing materials for a specific separation, as well as assist in materials design. We conclude by highlighting the challenges and future directions envisioned for the ML-assisted design and optimization of membrane materials and processes for traditional, as well as new, membrane separations.
We derive criteria for the selection of datapoints used for data-driven reduced-order modelling and other areas of supervised learning based on Gaussian process regression (GPR). While this is a well-studied area in the fields of active learning and optimal experimental design, most criteria in the literature are empirical. Here we introduce an optimality condition for the selection of a new input defined as the minimizer of the distance between the approximated output probability density function (pdf) of the reduced-order model and the exact one. Given that the exact pdf is unknown, we define the selection criterion as the supremum over the unit sphere of the native Hilbert space for the GPR. The resulting selection criterion, however, has a form that is difficult to compute. We combine results from GPR theory and asymptotic analysis to derive a computable form of the defined optimality criterion that is valid in the limit of small predictive variance. The derived asymptotic form of the selection criterion leads to convergence of the GPR model that guarantees a balanced distribution of data resources between probable and large-deviation outputs, resulting in an effective way of sampling towards data-driven reduced-order modelling.
This article is part of the theme issue ‘Data-driven prediction in dynamical systems’.
•The Canadian River Ice Database was used to study the severity of mid-winter breakups (MWBs).
•Potential MWB drivers were identified from river and climate data on a national scale.
•Identified drivers were used to successfully classify MWB severity across Canada.
•A new threshold for the initiation of MWBs from the identified drivers was developed.
Mid-winter breakups (MWBs), consisting of the early breakup of the winter river ice cover before the typical spring breakup season, are becoming increasingly common events in cold region rivers. These events can lead to potentially severe flooding, while also altering the expected spring flow regime, yet data on these events are limited. In this study, a newly released Canadian River Ice Database (CRID), containing river ice data from 196 rivers across Canada obtained from time series analysis, was used to analyse these MWBs on a previously impossible national scale. The CRID data were combined with the Natural Resources Canada (NRCan) gridded daily climate dataset to identify a list of potential hydrologic and climatic drivers for MWB events. Techniques such as correlation analysis, Least Absolute Shrinkage and Selection Operator (LASSO) regression, and input omission were combined to select 20 key drivers of the severity of MWB events. A random forest model trained with these drivers successfully classified the MWBs as low, medium, or high severity, achieving an overall accuracy of 80%. A new threshold for the prediction of MWB initiation based on climatic conditions was subsequently proposed through optimization via an exhaustive grid search, and its accuracy in identifying MWBs exceeded that of the thresholds proposed by previous studies. The new threshold, used in conjunction with the random forest model, provides a valuable tool for both the prediction of MWBs and the assessment of their potential severity.
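The exhaustive grid search for an initiation threshold mentioned above can be pictured with a toy example. Everything in the sketch is assumed for illustration: the two climate variables (melting degree-days and rainfall), the rule "predict an MWB when both exceed their thresholds", the candidate grids, and the seven made-up records. None of it is taken from the CRID or NRCan data.

```python
from itertools import product

# Made-up records: (melting degree-days, rainfall in mm, MWB observed?)
RECORDS = [
    (2.0, 0.0, False),
    (5.0, 1.0, False),
    (12.0, 0.5, False),
    (11.0, 6.0, True),
    (15.0, 8.0, True),
    (3.0, 9.0, False),
    (14.0, 5.0, True),
]


def accuracy(records, mdd_min, rain_min):
    """Share of records where the two-threshold rule matches the observation."""
    hits = sum(
        ((mdd >= mdd_min) and (rain >= rain_min)) == observed
        for mdd, rain, observed in records
    )
    return hits / len(records)


def grid_search(records, mdd_grid, rain_grid):
    """Try every threshold pair and keep the first one with the best accuracy."""
    best_pair, best_acc = None, -1.0
    for mdd_min, rain_min in product(mdd_grid, rain_grid):
        acc = accuracy(records, mdd_min, rain_min)
        if acc > best_acc:  # strict comparison keeps the earliest best pair
            best_pair, best_acc = (mdd_min, rain_min), acc
    return best_pair, best_acc


best_pair, best_acc = grid_search(RECORDS, [0, 5, 10, 15], [0, 2, 4, 6])
```

An exhaustive search like this is only practical for a handful of variables and coarse grids, since the number of evaluated combinations is the product of the grid sizes. On real data, accuracy would normally be estimated on held-out events rather than on the records used to choose the threshold.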