JuMP is an open-source modeling language that allows users to express a wide range of optimization problems (linear, mixed-integer, quadratic, conic-quadratic, semidefinite, and nonlinear) in a high-level, algebraic syntax. JuMP takes advantage of advanced features of the Julia programming language to offer unique functionality while achieving performance on par with commercial modeling tools for standard tasks. In this work we will provide benchmarks, present the novel aspects of the implementation, and discuss how JuMP can be extended to new problem classes and composed with state-of-the-art tools for visualization and interactivity.
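JuMP itself is a Julia package, so as a rough illustration of the high-level algebraic modeling style the abstract describes, the sketch below writes a small linear program in Python with CVXPY, a library with a comparable variables/constraints/objective syntax. All variable names and coefficients here are invented for the example:

```python
# Illustrative only: CVXPY stands in for JuMP's algebraic style; the
# variables, coefficients, and bounds below are made up for this sketch.
import cvxpy as cp

x = cp.Variable(name="x")
y = cp.Variable(name="y")

problem = cp.Problem(
    cp.Maximize(3 * x + 5 * y),   # objective stated in algebraic form
    [
        x + 2 * y <= 14,          # constraints read like the math
        3 * x - y >= 0,
        x - y <= 2,
    ],
)
problem.solve()
print(problem.value, x.value, y.value)
```

The point mirrored here is the one the abstract makes: the model is stated close to its mathematical form and handed to a solver, rather than assembled into matrices by hand.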
We formulate a general framework for hp-variational physics-informed neural networks (hp-VPINNs) based on the nonlinear approximation of shallow and deep neural networks and hp-refinement via domain decomposition and projection onto the space of high-order polynomials. The trial space is the space of neural networks, which is defined globally over the entire computational domain, while the test space contains piecewise polynomials. Specifically, in this study the hp-refinement corresponds to a global approximation with a local learning algorithm that can efficiently localize the network parameter optimization. We demonstrate the advantages of hp-VPINNs in both accuracy and training cost for several numerical examples of function approximation and in solving differential equations.
•Development of a general framework for hp-variational physics-informed neural networks.
•Nonlinear approximation with neural networks and projection onto the space of high-order polynomials.
•Domain decomposition.
•Comparison with other methods that use neural networks.
•Local and global approximations with locally/globally defined test functions.
•Different loss functions based on the variational form and integration by parts.
•Detailed derivation of the hp-VPINN formulation.
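As a schematic of the variational formulation described above (generic notation invented here, not necessarily the paper's): for a PDE Lu = f on a domain Ω partitioned into subdomains Ω_e, with a neural-network trial function u_θ and high-order polynomial test functions v_k supported on each subdomain,

```latex
% Schematic hp-VPINN loss: subdomain-local variational residuals plus a
% penalized boundary term (generic notation, invented for illustration).
\mathcal{R}_{e,k}(\theta) = \int_{\Omega_e} \big( \mathcal{L}u_\theta - f \big)\, v_k \, dx ,
\qquad
\mathrm{Loss}(\theta) = \sum_e \sum_k \big| \mathcal{R}_{e,k}(\theta) \big|^2
  \;+\; \lambda \sum_i \big| u_\theta(x_i^{b}) - g(x_i^{b}) \big|^2 .
```

Because each test function is supported on a single subdomain, a residual R_{e,k} constrains the network only locally, which is what enables the localized parameter optimization mentioned above.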
TMB is an open source R package that enables quick implementation of complex nonlinear random effects (latent variable) models in a manner similar to the established AD Model Builder package (ADMB, http://admb-project.org/; Fournier et al. 2011). In addition, it offers easy access to parallel computations. The user defines the joint likelihood for the data and the random effects as a C++ template function, while all the other operations are done in R; e.g., reading in the data. The package evaluates and maximizes the Laplace approximation of the marginal likelihood where the random effects are automatically integrated out. This approximation, and its derivatives, are obtained using automatic differentiation (up to order three) of the joint likelihood. The computations are designed to be fast for problems with many random effects (≈ 10^6) and parameters (≈ 10^3). Computation times using ADMB and TMB are compared on a suite of examples ranging from simple models to large spatial models where the random effects are a Gaussian random field. Speedups ranging from 1.5 to about 100 are obtained with increasing gains for large problems. The package and examples are available at http://tmb-project.org/.
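The Laplace approximation step has a standard closed form, sketched here in generic notation (f is the joint negative log-likelihood the user writes in C++, u the random effects, θ the parameters):

```latex
% Laplace approximation of the negative log marginal likelihood
% (standard form; notation generic, not TMB's internal naming).
\hat{u}(\theta) = \arg\min_u f(u, \theta), \qquad
H(\theta) = \left. \nabla^2_{uu} f(u, \theta) \right|_{u = \hat{u}(\theta)},
\qquad
-\log L^{*}(\theta) = f\big(\hat{u}(\theta), \theta\big)
  + \tfrac{1}{2} \log \det H(\theta) - \tfrac{n}{2} \log (2\pi).
```

Maximizing L*(θ) over θ requires differentiating log det H(θ), whose entries are already second derivatives of f; this is why automatic differentiation of the joint likelihood up to order three is needed.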
Here, we present a framework for calibration of parameters in elastoplastic constitutive models that is based on the use of automatic differentiation (AD). The model calibration problem is posed as a partial differential equation-constrained optimization problem where a finite element (FE) model of the coupled equilibrium equation and constitutive model evolution equations serves as the constraint. The objective function quantifies the mismatch between the displacement predicted by the FE model and full-field digital image correlation data, and the optimization problem is solved using gradient-based optimization algorithms. Forward and adjoint sensitivities are used to compute the gradient at considerably less cost than its calculation from finite difference approximations. Through the use of AD, we need only to write the constraints in terms of AD objects, where all of the derivatives required for the forward and inverse problems are obtained by appropriately seeding and evaluating these quantities. We present three numerical examples that verify the correctness of the gradient, demonstrate the AD approach's parallel computation capabilities via application to a large-scale FE model, and highlight the formulation's ease of extensibility to other classes of constitutive models.
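The adjoint gradient computation follows the standard PDE-constrained optimization identities; in generic notation (not the paper's), with discretized state u(p) solving the constraint c(u, p) = 0 and objective J(u, p):

```latex
% Adjoint sensitivity sketch (standard identities, generic symbols):
% one linear adjoint solve for lambda yields the gradient with respect
% to all parameters p at once.
\left( \frac{\partial c}{\partial u} \right)^{\!T} \lambda
  = - \left( \frac{\partial J}{\partial u} \right)^{\!T},
\qquad
\frac{dJ}{dp} = \frac{\partial J}{\partial p} + \lambda^{T} \frac{\partial c}{\partial p}.
```

This is why the gradient costs roughly one extra linear solve regardless of the number of calibration parameters, whereas finite differences would require one forward solve per parameter; AD supplies the partial derivatives of c and J by seeding the constraint code.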
Inverse problems associated with stochastic models constitute a significant portion of scientific and engineering applications. In such cases the unknown quantities are distributions. The applicability of traditional methods is limited because of their demanding assumptions or prohibitive computational consumption; for example, maximum likelihood methods require closed-form density functions, and Markov Chain Monte Carlo needs a large number of simulations. We propose a new method that estimates the unknown distribution by matching the statistical properties between observed and simulated random processes. We leverage the expressive power of neural networks to approximate the unknown distribution and use a discriminative neural network for computing the statistical discrepancies between the observed and simulated random processes. We demonstrate numerically that the proposed method can both estimate the model parameters and learn complicated unknown distributions.
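One plausible instantiation of this scheme (a hedged sketch; the paper's exact objective may differ) is a GAN-style min-max problem:

```latex
% Sketch of distribution matching with a discriminative network
% (GAN-style; generic notation invented here). G_theta maps noise z to
% draws from the estimated unknown distribution, S runs the stochastic
% simulator on those draws, and D_phi scores observed vs. simulated
% realizations of the random process.
\min_{\theta} \max_{\phi} \;
  \mathbb{E}_{x \sim p_{\mathrm{obs}}} \big[ \log D_{\phi}(x) \big]
  + \mathbb{E}_{z} \Big[ \log \Big( 1 - D_{\phi}\big( S( G_{\theta}(z) ) \big) \Big) \Big].
```

At the saddle point the discriminator can no longer distinguish observed from simulated processes, i.e., the statistical discrepancy it measures has been driven to zero.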
Gradient-based optimization has enabled dramatic advances in computational imaging through techniques like deep learning and nonlinear optimization. These methods require gradients not just of simple mathematical functions, but of general programs which encode complex transformations of images and graphical data. Unfortunately, practitioners have traditionally been limited to either hand-deriving gradients of complex computations, or composing programs from a limited set of coarse-grained operators in deep learning frameworks. At the same time, writing programs with the level of performance needed for imaging and deep learning is prohibitively difficult for most programmers.
We extend the image processing language Halide with general reverse-mode automatic differentiation (AD), and the ability to automatically optimize the implementation of gradient computations. This enables automatic computation of the gradients of arbitrary Halide programs, at high performance, with little programmer effort. A key challenge is to structure the gradient code to retain parallelism. We define a simple algorithm to automatically schedule these pipelines, and show how Halide's existing scheduling primitives can express and extend the key AD optimization of "checkpointing."
Using this new tool, we show how to easily define new neural network layers which automatically compile to high-performance GPU implementations, and how to solve nonlinear inverse problems from computational imaging. Finally, we show how differentiable programming enables dramatically improving the quality of even traditional, feed-forward image processing algorithms, blurring the distinction between classical and deep methods.
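To make the reverse-mode AD idea concrete, here is a minimal JAX stand-in (not Halide's API; the toy pipeline, shapes, and names are invented for illustration): a small image pipeline is differentiated with respect to its filter weights, as one would when fitting a layer or solving an inverse problem.

```python
# Minimal reverse-mode AD sketch in JAX (stand-in for the Halide gradient
# workflow described above; everything here is illustrative).
import jax
import jax.numpy as jnp

def pipeline(kernel, image):
    # 2-D "valid" convolution followed by a pointwise nonlinearity.
    out = jax.scipy.signal.convolve2d(image, kernel, mode="valid")
    return jnp.tanh(out)

def loss(kernel, image, target):
    return jnp.mean((pipeline(kernel, image) - target) ** 2)

key = jax.random.PRNGKey(0)
image = jax.random.normal(key, (32, 32))
kernel = jnp.ones((3, 3)) / 9.0
target = jnp.zeros((30, 30))

# Reverse-mode AD: one backward pass yields the gradient with respect to
# every kernel entry at once.
grad_kernel = jax.grad(loss)(kernel, image, target)
print(grad_kernel.shape)  # (3, 3)
```

What the Halide work adds beyond this kind of framework is the scheduling side: generating gradient code that is itself parallel and high-performance, rather than relying on a fixed library of coarse-grained operators.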
This paper presents the winning submission of the M4 forecasting competition. The submission utilizes a dynamic computational graph neural network system that enables a standard exponential smoothing model to be mixed with advanced long short-term memory (LSTM) networks into a common framework. The result is a hybrid and hierarchical forecasting method.
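A simplified schematic of the hybrid (the notation is invented here and omits details of the winning system): multiplicative exponential smoothing tracks a per-series level l_t and seasonality s_t,

```latex
% Holt-Winters-style multiplicative smoothing (simplified schematic;
% m is the seasonal period, alpha and gamma the smoothing coefficients).
l_t = \alpha \, \frac{y_t}{s_{t-m}} + (1 - \alpha)\, l_{t-1},
\qquad
s_t = \gamma \, \frac{y_t}{l_t} + (1 - \gamma)\, s_{t-m},
```

and an LSTM forecasts the deseasonalized, level-normalized series, with final forecasts re-scaled by the level and seasonality. Expressing the smoothing recursions inside the same dynamic computational graph as the LSTM is what lets the per-series smoothing coefficients and the shared network weights be trained jointly by gradient descent.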
Many algorithms for control, optimization and estimation in robotics depend on derivatives of the underlying system dynamics, e.g. to compute linearizations, sensitivities or gradient directions. However, we show that when dealing with rigid body dynamics, these derivatives are difficult to derive analytically and to implement efficiently. To overcome this issue, we extend the modelling tool 'RobCoGen' to be compatible with Automatic Differentiation. Additionally, we propose how to automatically obtain the derivatives and generate highly efficient source code. We highlight the flexibility and performance of the approach in two application examples. First, we show a trajectory optimization example for the quadrupedal robot HyQ, which employs auto-differentiation on the dynamics including a contact model. Second, we present a hardware experiment in which a six-DoF robotic arm avoids a randomly moving obstacle in a go-to task by fast, dynamic replanning.
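As a small stand-in for this workflow (using JAX rather than RobCoGen's generated C++; the single-pendulum model and all names are invented for illustration), AD delivers the linearizations such algorithms need without hand-derived derivatives:

```python
# Sketch: dynamics derivatives by AD instead of hand derivation.
import jax
import jax.numpy as jnp

def dynamics(x, u):
    # Continuous-time pendulum: x = (angle, angular velocity), torque input u.
    theta, omega = x
    return jnp.array([omega, -9.81 * jnp.sin(theta) + u[0]])

x0 = jnp.array([0.3, 0.0])
u0 = jnp.array([0.1])

# Linearization x_dot ≈ f(x0, u0) + A (x - x0) + B (u - u0), as consumed by
# trajectory-optimization and control algorithms.
A = jax.jacfwd(dynamics, argnums=0)(x0, u0)   # df/dx, shape (2, 2)
B = jax.jacfwd(dynamics, argnums=1)(x0, u0)   # df/du, shape (2, 1)
```

For a full rigid-body model with contacts the dynamics function is far larger, which is exactly where generating efficient derivative source code, as the paper proposes, pays off.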
Derivatives play a critical role in computational statistics, examples being Bayesian inference using Hamiltonian Monte Carlo sampling and the training of neural networks. Automatic differentiation (AD) is a powerful tool to automate the calculation of derivatives and is preferable to more traditional methods, especially when differentiating complex algorithms and mathematical functions. The implementation of AD, however, requires some care to ensure efficiency. Modern differentiation packages deploy a broad range of computational techniques to improve applicability, run time, and memory management. Among these techniques are operation overloading, region-based memory, and expression templates. There also exist several mathematical techniques which can yield high performance gains when applied to complex algorithms. For example, semi-analytical derivatives can reduce by orders of magnitude the runtime required to numerically solve and differentiate an algebraic equation. Open and practical problems include the extension of current packages to provide more specialized routines, and finding optimal methods to perform higher-order differentiation.
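The "semi-analytical derivative" example mentioned above is typically an application of the implicit function theorem, sketched here in generic notation: if x(θ) is defined implicitly by an algebraic equation g(x, θ) = 0 solved numerically, one differentiates the equation itself rather than the solver's iterations:

```latex
% Implicit function theorem sketch (standard identity, generic symbols):
0 = \frac{d}{d\theta}\, g\big(x(\theta), \theta\big)
  = \frac{\partial g}{\partial x} \frac{dx}{d\theta}
    + \frac{\partial g}{\partial \theta}
\;\;\Longrightarrow\;\;
\frac{dx}{d\theta}
  = - \left( \frac{\partial g}{\partial x} \right)^{-1}
      \frac{\partial g}{\partial \theta}.
```

This replaces differentiating through possibly hundreds of solver iterations with a single linear solve at the converged solution.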
Automatic differentiation is a powerful tool to differentiate mathematical functions and algorithms. It has, over the past years, been applied to many branches of computational statistics. This article reviews some important mathematical and computational considerations required for its efficient implementation.