It is shown that Machine Learning (ML) algorithms can usefully capture the effect of crystallization composition and conditions (inputs) on key microstructural characteristics (outputs) of faujasite-type zeolites (structure types FAU, EMT, and their intergrowths), which are widely used zeolite catalysts and adsorbents. The utility of ML (in particular, Geometric Harmonics) toward learning input-output relationships of interest is demonstrated, and a comparison with Neural Networks and Gaussian Process Regression, as alternative approaches, is provided. Through ML, synthesis conditions were identified that enhance the Si/Al ratio of high-purity FAU zeolite to the hitherto highest level (i.e., Si/Al = 3.5) achieved via direct (not seeded) and organic-structure-directing-agent-free synthesis from sodium aluminosilicate sols. The analysis of the ML algorithms' results offers the insight that reduced Na₂O content is key to formulating FAU materials with high Si/Al ratio. An acid catalyst prepared by partial ion exchange of the high-Si/Al-ratio FAU (Si/Al = 3.5) exhibits improved proton reactivity (as well as specific activity, per unit mass of catalyst) in propane cracking and dehydrogenation compared with the catalyst prepared from the previously reported highest-Si/Al-ratio FAU (Si/Al = 2.8).
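As a concrete illustration of the kind of input-output surrogate compared above, the following is a minimal sketch, in Python with scikit-learn, of Gaussian Process Regression (one of the baselines mentioned) mapping synthesis-gel composition and conditions to the product Si/Al ratio. The feature set and all numbers are hypothetical placeholders, not the authors' data or code.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    # Hypothetical input features per synthesis: molar ratios and conditions,
    # e.g. [Na2O/Al2O3, SiO2/Al2O3, H2O/Al2O3, T (degrees C), time (h)].
    X = np.array([[3.2, 10.0, 180.0, 100.0, 24.0],
                  [2.8, 12.0, 160.0, 110.0, 18.0],
                  [2.5, 14.0, 150.0, 120.0, 12.0]])
    y = np.array([2.4, 2.9, 3.3])   # placeholder measured product Si/Al ratios

    # RBF kernel plus a noise term; normalize_y centers/scales the target.
    gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                                   normalize_y=True).fit(X, y)

    # Predictive mean and uncertainty for a candidate (hypothetical) recipe.
    mean, std = gpr.predict(np.array([[2.3, 15.0, 145.0, 120.0, 12.0]]),
                            return_std=True)
    print(f"predicted Si/Al = {mean[0]:.2f} +/- {std[0]:.2f}")

In the paper's setting, such a regressor would be trained on the experimental synthesis database and queried for candidate recipes, with the predictive uncertainty guiding which conditions to test next.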
We present a machine learning framework bridging manifold learning, neural networks, Gaussian processes, and the Equation-Free multiscale approach, for the construction of different types of effective reduced-order models from detailed agent-based simulators and the systematic multiscale numerical analysis of their emergent dynamics. The specific tasks of interest here include the detection of tipping points and the uncertainty quantification of rare events near them. Our illustrative examples are an event-driven, stochastic financial market model describing the mimetic behavior of traders, and a compartmental stochastic epidemic model on an Erdős–Rényi network. We contrast the pros and cons of the different types of surrogate models and the effort involved in learning them. Importantly, the proposed framework reveals that, around the tipping points, the emergent dynamics of both benchmark examples can be effectively described by a one-dimensional stochastic differential equation, thus revealing the intrinsic dimensionality of the normal form of the specific type of tipping point. This allows a significant reduction in the computational cost of the tasks of interest.
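To make the final point concrete: the sketch below, a hedged illustration rather than the paper's estimator, identifies the drift and diffusion of a one-dimensional effective SDE, dx = f(x) dt + g(x) dW, from a scalar time series via binned Kramers-Moyal conditional moments. A synthetic double-well trajectory stands in for the latent series that would be extracted from the agent-based simulators.

    import numpy as np

    def estimate_sde(x, dt, n_bins=40):
        """Binned Kramers-Moyal estimates of drift f(x) and diffusion g(x)."""
        dx = np.diff(x)
        bins = np.linspace(x.min(), x.max(), n_bins + 1)
        idx = np.digitize(x[:-1], bins) - 1
        centers, drift, diff = [], [], []
        for b in range(n_bins):
            m = idx == b
            if m.sum() < 10:                        # skip poorly sampled bins
                continue
            centers.append(0.5 * (bins[b] + bins[b + 1]))
            drift.append(dx[m].mean() / dt)         # E[dx | x] / dt  -> f(x)
            diff.append(np.sqrt(dx[m].var() / dt))  # sqrt(Var[dx|x]/dt) -> g(x)
        return np.array(centers), np.array(drift), np.array(diff)

    # Synthetic double-well SDE trajectory as a stand-in for the latent series.
    rng = np.random.default_rng(0)
    dt, n = 1e-3, 200_000
    x = np.empty(n)
    x[0] = 1.0
    for i in range(n - 1):
        x[i + 1] = x[i] + (x[i] - x[i]**3) * dt + 0.5 * np.sqrt(dt) * rng.normal()
    centers, drift, diffusion = estimate_sde(x, dt)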
Machine learning models with uncertainty quantification have recently emerged as attractive tools to accelerate the navigation of catalyst design spaces in a data-efficient manner. Here, we combine active learning with a dropout graph convolutional network (dGCN) as a surrogate model to explore the complex materials space of high-entropy alloys (HEAs). We train the dGCN on the formation energies of disordered binary alloy structures in the Pd-Pt-Sn ternary alloy system and improve predictions on ternary structures by performing reduced optimization of the formation free energy, the target property that determines HEA stability, over ensembles of ternary structures constructed based on two coordinate systems: (a) a physics-informed ternary composition space, and (b) data-driven coordinates discovered by the Diffusion Maps manifold learning scheme. Both reduced optimization techniques improve predictions of the formation free energy in the ternary alloy space with a significantly reduced number of DFT calculations compared to a high-fidelity model. The physics-based scheme converges to the target property in a manner akin to a depth-first strategy, whereas the data-driven scheme appears more akin to a breadth-first approach. Both sampling schemes, coupled with our acquisition function, successfully exploit a database of DFT-calculated binary alloy structures and energies, augmented with a relatively small number of ternary alloy calculations, to identify stable ternary HEA compositions and structures. This generalized framework can be extended to incorporate more complex bulk and surface structural motifs, and the results demonstrate that significant dimensionality reduction is possible in thermodynamic sampling problems when suitable active learning schemes are employed.
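A hedged sketch of the active-learning loop just described: Monte Carlo dropout yields an ensemble of predictions per candidate structure, and an acquisition function selects the next structures for DFT evaluation. The stand-in surrogate and candidate features below are toy placeholders (the paper's surrogate is a dropout graph convolutional network), and the lower-confidence-bound acquisition is one plausible choice rather than necessarily the paper's.

    import numpy as np

    rng = np.random.default_rng(1)

    def mc_dropout_predict(model, X, n_samples=50):
        """Stack stochastic forward passes; `model` must keep its dropout
        (or other noise) active at prediction time."""
        return np.stack([model(X) for _ in range(n_samples)])

    def acquire(preds, batch_size=8, kappa=1.0):
        """Lower-confidence-bound acquisition for a minimized target
        (formation free energy): low mean and high uncertainty score best."""
        mean, std = preds.mean(axis=0), preds.std(axis=0)
        return np.argsort(mean - kappa * std)[:batch_size]

    # Toy stand-in surrogate: noisy linear scores over candidate features.
    toy_model = lambda X: X @ rng.normal(size=X.shape[1]) * 0.01 - X.mean(axis=1)
    X_pool = rng.uniform(size=(100, 5))           # hypothetical candidates
    picked = acquire(mc_dropout_predict(toy_model, X_pool))
    # `picked` indexes the structures to evaluate with DFT next; the surrogate
    # is then retrained on the augmented data and the loop repeats.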
We introduce a data-driven approach to building reduced dynamical models through manifold learning; the reduced latent space is discovered using Diffusion Maps (a manifold learning technique) on time series data. A second round of Diffusion Maps on those latent coordinates allows the approximation of the reduced dynamical models. This second round enables mapping the latent space coordinates back to the full ambient space (what is called lifting); it also enables the approximation of full state functions of interest in terms of the reduced coordinates. In our work, we develop and test three different reduced numerical simulation methodologies, either through pre-tabulation in the latent space and integration on the fly, or by going back and forth between the ambient space and the latent space. The data-driven latent space simulation results, based on the three different approaches, are validated through (a) the latent space observation of the full simulation through the Nyström Extension formula, or through (b) lifting the reduced trajectory back to the full ambient space, via Latent Harmonics. Latent space modeling often involves additional regularization to favor certain properties of the space over others, and the mapping back to the ambient space is then constructed mostly independently from these properties; here, we use the same data-driven approach to construct the latent space and then map back to the ambient space.
• Construction of reduced data-driven dynamical models.
• Latent Space modeling through manifold learning.
• Latent Harmonics to approximate functions defined in Latent Space (here Diffusion Maps) coordinates.
• Latent-Space assisted Scientific Computing.
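A minimal NumPy sketch of the two ingredients in the abstract above, under simplifying assumptions (a single fixed Gaussian kernel scale, dense eigendecomposition, no density normalization): Diffusion Maps to discover latent coordinates, and the Nyström extension to evaluate those coordinates at a new, out-of-sample ambient point. This is illustrative, not the authors' implementation.

    import numpy as np
    from scipy.spatial.distance import cdist

    def diffusion_maps(X, eps, n_coords=2):
        """Leading nontrivial eigenpairs of the row-normalized Markov kernel."""
        K = np.exp(-cdist(X, X) ** 2 / eps)
        P = K / K.sum(axis=1, keepdims=True)
        vals, vecs = np.linalg.eig(P)              # spectrum is real for this P
        order = np.argsort(-vals.real)
        vals, vecs = vals.real[order], vecs.real[:, order]
        return vals[1:n_coords + 1], vecs[:, 1:n_coords + 1]  # skip trivial mode

    def nystrom(x_new, X, vals, vecs, eps):
        """Extend the latent coordinates to an out-of-sample point x_new."""
        k = np.exp(-cdist(np.atleast_2d(x_new), X) ** 2 / eps).ravel()
        p = k / k.sum()
        return (p @ vecs) / vals

    X = np.random.default_rng(3).normal(size=(200, 10))   # placeholder data
    vals, vecs = diffusion_maps(X, eps=10.0)
    coords_new = nystrom(X[0] + 0.1, X, vals, vecs, eps=10.0)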
A data-driven framework is presented that enables the prediction of quantities, either observations or parameters, given sufficient partial data. The framework is illustrated via a computational model of the deposition of Cu in a Chemical Vapor Deposition (CVD) reactor, where the reactor pressure, the deposition temperature, and the feed mass flow rate are important process parameters that determine the outcome of the process. The sampled observations are high-dimensional vectors containing the outputs of a detailed CFD steady-state model of the process, i.e. the values of velocity, pressure, temperature, and species mass fractions at each point in the discretization. A machine learning workflow is presented, able to predict out-of-sample (a) observations (e.g. mass fraction in the reactor), given process parameters (e.g. inlet temperature); (b) process parameters, given observation data; and (c) partial observations (e.g. temperature in the reactor), given other partial observations (e.g. mass fraction in the reactor). The proposed workflow relies on two manifold learning schemes: Diffusion Maps and the associated Geometric Harmonics. Diffusion Maps are used for discovering a reduced representation of the available data, and Geometric Harmonics for extending functions defined on the discovered manifold. In our work a special use case of Geometric Harmonics is formulated and implemented, which we call Double Diffusion Maps, to map from the reduced representation back to (partial) observations and process parameters. A comparison of our manifold learning scheme to the traditional Gappy-POD approach is provided: ours can be thought of as a “Gappy DMAPs” approach. The presented methodology is easily transferable to application domains beyond reactor engineering.
• Nonlinear manifold learning is implemented for out-of-sample predictions.
• Chemical Vapor Deposition reactor state variables predicted for new inputs.
• Double DMAPs enables predictions of full observables even from partial data.
• Efficient and accurate prediction of inputs from scarce observables.
• Gappy DMAPs outperforms linear Gappy POD.
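For concreteness, a hedged sketch of Geometric Harmonics used as the out-of-sample interpolator underlying tasks (a)-(c) above: a function f known on training inputs X (for instance, a reactor observable as a function of process parameters) is expanded in kernel eigenfunctions and evaluated at a new input via the Nyström formula. The kernel scale, truncation, and placeholder data are assumptions; this is not the paper's code.

    import numpy as np
    from scipy.spatial.distance import cdist

    def gh_fit(X, f, eps, n_modes=20, tol=1e-8):
        """Expand f (known on X) in the leading eigenfunctions of a
        symmetric Gaussian kernel (the Geometric Harmonics)."""
        K = np.exp(-cdist(X, X) ** 2 / eps)
        vals, vecs = np.linalg.eigh(K)
        order = np.argsort(-vals)[:n_modes]
        vals, vecs = vals[order], vecs[:, order]
        keep = vals > tol * vals[0]               # drop ill-conditioned modes
        vals, vecs = vals[keep], vecs[:, keep]
        return vals, vecs, vecs.T @ f             # eigenpairs + coefficients

    def gh_eval(x_new, X, vals, vecs, coeffs, eps):
        """Nystrom-extend the harmonics to x_new and resum the expansion."""
        k = np.exp(-cdist(np.atleast_2d(x_new), X) ** 2 / eps)
        return ((k @ vecs) / vals) @ coeffs

    X = np.random.default_rng(4).uniform(size=(300, 3))   # placeholder inputs
    f = np.sin(X[:, 0]) + X[:, 1] ** 2                    # placeholder observable
    vals, vecs, coeffs = gh_fit(X, f, eps=0.5)
    f_new = gh_eval(np.array([0.2, 0.4, 0.6]), X, vals, vecs, coeffs, eps=0.5)

The same construction underlies the Double Diffusion Maps mapping when f is taken to be a (partial) observation or a process parameter defined over the latent coordinates.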
This study presents a collection of purely data-driven workflows for constructing reduced-order models (ROMs) for distributed dynamical systems. The ROMs we focus on are data-assisted models inspired by, and templated upon, the theory of Approximate Inertial Manifolds (AIMs); the particular motivation is the so-called post-processing Galerkin method of Garcia-Archilla, Novo and Titi. Its applicability can be extended: the need for accurate truncated Galerkin projections and for deriving closed-form corrections can be circumvented using machine learning tools. When the right latent variables are not a priori known, we illustrate how autoencoders as well as Diffusion Maps (a manifold learning scheme) can be used to discover good sets of latent variables and test their explainability. The proposed methodology can express the ROMs in terms of (a) theoretical (Fourier coefficients), (b) linear data-driven (POD modes) and/or (c) nonlinear data-driven (Diffusion Maps) coordinates. Both Black-Box and (theoretically-informed and data-corrected) Gray-Box models are described; the necessity for the latter arises when truncated Galerkin projections are so inaccurate as to not be amenable to post-processing. We use the Chafee-Infante reaction-diffusion and the Kuramoto-Sivashinsky dissipative partial differential equations to illustrate and successfully test the overall framework.
• Machine learning motivates revisiting post-processing Galerkin ROMs.
• Diffusion Maps & autoencoders discover relations between sets of latent variables.
• Data-driven workflows produce ROM closures to enhance accuracy.
• “Gray box” models improve first principles ROMs when they are inaccurate.
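As a small worked illustration of coordinate set (b) in the abstract above, the sketch below extracts POD modes from snapshot data by SVD and forms the projection and reconstruction maps in which a ROM (and any learned closure) would be written. The snapshot matrix here is a random placeholder; in the paper, snapshots would come from the Chafee-Infante or Kuramoto-Sivashinsky simulations.

    import numpy as np

    # Placeholder snapshot matrix: n_grid spatial points by n_time snapshots.
    snapshots = np.random.default_rng(2).normal(size=(256, 500))
    mean = snapshots.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(snapshots - mean, full_matrices=False)

    # Keep enough modes to capture 99.9% of the snapshot energy.
    r = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), 0.999)) + 1
    modes = U[:, :r]                              # POD basis (linear coordinates)

    a = modes.T @ (snapshots - mean)              # reduced (latent) trajectories
    reconstruction = mean + modes @ a             # lift back to the full space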
Neurodegeneration is the pathological substrate that causes major disability in secondary progressive multiple sclerosis. A synthesis of preclinical and clinical research identified three neuroprotective drugs acting on different axonal pathobiologies. We aimed to test the efficacy of these drugs in an efficient manner with respect to time, cost, and patient resource.
We did a phase 2b, multiarm, parallel group, double-blind, randomised placebo-controlled trial at 13 clinical neuroscience centres in the UK. We recruited patients (aged 25–65 years) with secondary progressive multiple sclerosis who were not on disease-modifying treatment and who had an Expanded Disability Status Scale (EDSS) score of 4·0–6·5. Participants were randomly assigned (1:1:1:1) at baseline, by a research nurse using a centralised web-based service, to receive twice-daily oral treatment of either amiloride 5 mg, fluoxetine 20 mg, riluzole 50 mg, or placebo for 96 weeks. The randomisation procedure included minimisation based on sex, age, EDSS score at randomisation, and trial site. Capsules were identical in appearance to achieve masking. Patients, investigators, and MRI readers were unaware of treatment allocation. The primary outcome measure was volumetric MRI percentage brain volume change (PBVC) from baseline to 96 weeks, analysed using multiple regression, adjusting for baseline normalised brain volume and minimisation criteria. The primary analysis was a complete-case analysis based on the intention-to-treat population (all patients with data at week 96). This trial is registered with ClinicalTrials.gov, NCT01910259.
Between Jan 29, 2015, and June 22, 2016, 445 patients were randomly allocated amiloride (n=111), fluoxetine (n=111), riluzole (n=111), or placebo (n=112). The primary analysis included 393 patients who were allocated amiloride (n=99), fluoxetine (n=96), riluzole (n=99), and placebo (n=99). No difference was noted between any active treatment and placebo in PBVC (amiloride vs placebo, 0·0% [95% CI −0·4 to 0·5]; p=0·99; fluoxetine vs placebo, −0·1% [−0·5 to 0·3]; p=0·86; riluzole vs placebo, −0·1% [−0·6 to 0·3]; p=0·77). No emergent safety issues were reported. The incidence of serious adverse events was low and similar across study groups (ten [9%] patients in the amiloride group, seven [6%] in the fluoxetine group, 12 [11%] in the riluzole group, and 13 [12%] in the placebo group). The most common serious adverse events were infections and infestations. Three patients died during the study, from causes judged unrelated to active treatment; one patient assigned amiloride died from metastatic lung cancer, one patient assigned riluzole died from ischaemic heart disease and coronary artery thrombosis, and one patient assigned fluoxetine had a sudden death (primary cause) with multiple sclerosis and obesity listed as secondary causes.
The absence of evidence for neuroprotection in this adequately powered trial indicates that exclusively targeting these aspects of axonal pathobiology in patients with secondary progressive multiple sclerosis is insufficient to mitigate neuroaxonal loss. These findings argue for investigation of different mechanistic targets and future consideration of combination treatment trials. This trial provides a template for future simultaneous testing of multiple disease-modifying medicines in neurological medicine.
Efficacy and Mechanism Evaluation (EME) Programme, an MRC and NIHR partnership, UK Multiple Sclerosis Society, and US National Multiple Sclerosis Society.
Treatment with natalizumab once every 4 weeks is approved for patients with relapsing-remitting multiple sclerosis, but is associated with a risk of progressive multifocal leukoencephalopathy. Switching to extended-interval dosing is associated with lower progressive multifocal leukoencephalopathy risk, but the efficacy of this approach is unclear. We aimed to assess the safety and efficacy of natalizumab once every 6 weeks compared with once every 4 weeks in patients with relapsing-remitting multiple sclerosis.
We did a randomised, controlled, open-label, phase 3b trial (NOVA) at 89 multiple sclerosis centres across 11 countries in the Americas, Europe, and Western Pacific. Included participants were aged 18–60 years with relapsing-remitting multiple sclerosis and had been treated with intravenous natalizumab 300 mg once every 4 weeks with no relapses for at least 12 months before randomisation, with no missed doses in the previous 3 months. Participants were randomly assigned (1:1), using a randomisation sequence generated by the study funder and contract personnel with interactive response technology, to switch to natalizumab once every 6 weeks or continue with once every 4 weeks. The centralised MRI reader, independent neurology evaluation committee, site examining neurologists, site backup examining neurologists, and site examining technicians were masked to study group assignments. The primary endpoint was the number of new or newly enlarging T2 hyperintense lesions at week 72, assessed in all participants who received at least one dose of assigned treatment and had at least one postbaseline MRI, relapse, or neurological examination or efficacy assessment. Missing primary endpoint data were handled under prespecified primary and secondary estimands: the primary estimand included all data, regardless of whether participants remained on the assigned treatment; the secondary estimand classed all data obtained after treatment discontinuation or study withdrawal as missing. Safety was assessed in all participants who received at least one dose of study treatment. Study enrolment is closed and an open-label extension study is ongoing. This study is registered with EudraCT, 2018-002145-11, and ClinicalTrials.gov, NCT03689972.
Between Dec 26, 2018, and Aug 30, 2019, 605 patients were assessed for eligibility and 499 were enrolled and assigned to receive natalizumab once every 6 weeks (n=251) or once every 4 weeks (n=248). After prespecified adjustments for missing data, mean numbers of new or newly enlarging T2 hyperintense lesions at week 72 were 0·20 (95% CI 0·07–0·63) in the once every 6 weeks group and 0·05 (0·01–0·22) in the once every 4 weeks group (mean lesion ratio 4·24 [95% CI 0·86–20·85]; p=0·076) under the primary estimand, and 0·31 (95% CI 0·12–0·82) and 0·06 (0·01–0·31; mean lesion ratio 4·93 [95% CI 1·05–23·20]; p=0·044) under the secondary estimand. Two participants in the once every 6 weeks group with extreme new or newly enlarging T2 hyperintense lesion numbers (≥25) contributed most of the excess lesions. Adverse events occurred in 194 (78%) of 250 participants in the once every 6 weeks group and 190 (77%) of 247 in the once every 4 weeks group, and serious adverse events occurred in 17 (7%) and 17 (7%), respectively. No deaths were reported. There was one case of asymptomatic progressive multifocal leukoencephalopathy (without clinical signs) in the once every 6 weeks group, and no cases in the once every 4 weeks group; 6 months after diagnosis, the participant was without increased disability and remained classified as asymptomatic.
We found a numerical difference in the mean number of new or newly enlarging T2 hyperintense lesions at week 72 between the once every 6 weeks and once every 4 weeks groups, which reached significance under the secondary estimand, but interpretation of statistical differences (or absence thereof) is limited because disease activity in the once every 4 weeks group was lower than expected. The safety profiles of natalizumab once every 6 weeks and once every 4 weeks were similar. Although this trial was not powered to assess differences in risk of progressive multifocal leukoencephalopathy, the occurrence of the (asymptomatic) case underscores the importance of monitoring and risk factor consideration in all patients receiving natalizumab.
Biogen.
While the major phenotypes of multiple sclerosis (MS), namely relapsing–remitting, primary progressive, and secondary progressive MS, have been well characterized, a subgroup of patients with an active, aggressive disease course and rapid disability accumulation remains difficult to define, and there is no consensus about their management and treatment. The current lack of an accepted definition and treatment guidelines for aggressive MS triggered a 2018 focused workshop of the European Committee for Treatment and Research in Multiple Sclerosis (ECTRIMS) on aggressive MS. The aim of the workshop was to discuss approaches to describing and defining the disease phenotype and its treatments. Unfortunately, it was not possible to come to consensus on a definition because of the unavailability of data correlating severe disease with imaging and molecular biomarkers. However, the workshop highlighted the future research needed to define this disease subtype while also focusing on its treatment and management. Here, we review previous attempts to define aggressive MS and present characteristics that might, with additional research, eventually help characterize it. A companion paper summarizes data regarding treatment and management.