The goal of generative models is to learn the intricate relations between the data to create new simulated data, but current approaches fail in very high dimensions. When the true data-generating process is based on physical processes, these impose symmetries and constraints, and the generative model can be created by learning an effective description of the underlying physics, which enables scaling to very high dimensions. In this work we propose Lagrangian Deep Learning (LDL) for this purpose, applying it to learn the outputs of cosmological hydrodynamical simulations. The model uses layers of Lagrangian displacements of the particles describing the observables to learn the effective physical laws. The displacements are modeled as the gradient of an effective potential, which explicitly satisfies translational and rotational invariance. The total number of learned parameters is only of order 10, and they can be viewed as effective theory parameters. We combine the N-body solver FastPM with LDL and apply the combination to a wide range of cosmological outputs, from dark matter to stellar maps, gas density, and temperature. The computational cost of LDL is nearly four orders of magnitude lower than that of the full hydrodynamical simulations, yet it outperforms them at the same resolution. We achieve this with only of order 10 layers from the initial conditions to the final output, in contrast to typical cosmological simulations with thousands of time steps. This opens up the possibility of analyzing cosmological observations entirely within this framework, without the need for large dark-matter simulations.
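As a rough illustration of the kind of displacement layer described above, the sketch below displaces particles along the gradient of an effective potential built from their own density field. This is not the authors' implementation: it assumes nearest-grid-point density assignment, a plain inverse-Laplacian kernel, and a single illustrative parameter `alpha` standing in for the learned layer parameters.

```python
import numpy as np

def ldl_layer(pos, alpha=0.01, n=32, box=1.0):
    """One LDL-style layer: displace particles along the gradient of an
    effective potential built from their own density field.
    (Sketch: NGP density, inverse-Laplacian kernel; `alpha` is an
    illustrative stand-in for the learned layer parameters.)"""
    idx = np.floor(pos / box * n).astype(int) % n
    delta = np.zeros((n, n, n))
    np.add.at(delta, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    delta = delta / delta.mean() - 1.0          # overdensity field
    k = 2 * np.pi * np.fft.fftfreq(n, d=box / n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0                           # avoid dividing the zero mode
    phi_k = -np.fft.fftn(delta) / k2            # effective potential
    disp = np.empty_like(pos)
    for axis, kk in enumerate((kx, ky, kz)):
        grad = np.real(np.fft.ifftn(1j * kk * phi_k))
        disp[:, axis] = grad[idx[:, 0], idx[:, 1], idx[:, 2]]
    return (pos + alpha * disp) % box
```

Because the displacement is the gradient of a potential that depends only on the density field, translational and rotational invariance hold by construction; stacking a handful of such layers, each with its own parameters, yields a model with the order-10 parameter count the abstract refers to.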
We develop a new and simple method to model baryonic effects at the field level relevant for weak lensing analyses. We analyze thousands of state-of-the-art hydrodynamic simulations from the CAMELS project, each with a different cosmology and strength of feedback, and we find that the cross-correlation coefficient between full hydrodynamic and N-body simulations is very close to 1 down to \(k\sim10~h{\rm Mpc}^{-1}\). This suggests that modeling baryonic effects at the field level down to these scales only requires N-body simulations plus a correction to each mode's amplitude given by \(\sqrt{P_{\rm hydro}(k)/P_{\rm nbody}(k)}\). In this paper, we build an emulator for this quantity, using Gaussian processes, that is flexible enough to reproduce results from thousands of hydrodynamic simulations with different cosmologies, astrophysics, subgrid physics, volumes, resolutions, and redshifts. Our emulator is accurate at the percent level and exhibits a range of validity superior to previous studies. This method and our emulator enable field-level simulation-based inference and the accounting of baryonic effects in weak lensing analyses.
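The proposed field-level correction itself is simple to state in code. The sketch below rescales the amplitude of each Fourier mode of an N-body field; the callable `ratio_func` is a hypothetical stand-in for the Gaussian-process emulator of \(\sqrt{P_{\rm hydro}(k)/P_{\rm nbody}(k)}\).

```python
import numpy as np

def apply_baryon_correction(field, ratio_func, box=1.0):
    """Rescale each Fourier mode of a cubic N-body density field by
    sqrt(P_hydro(k) / P_nbody(k)).  `ratio_func` is a hypothetical
    stand-in for the Gaussian-process emulator of that ratio."""
    n = field.shape[0]
    fk = np.fft.rfftn(field)
    kf = 2 * np.pi * np.fft.fftfreq(n, d=box / n)
    kr = 2 * np.pi * np.fft.rfftfreq(n, d=box / n)
    kx, ky, kz = np.meshgrid(kf, kf, kr, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2 + kz**2)
    return np.fft.irfftn(fk * ratio_func(kmag), s=field.shape)
```

With `ratio_func` returning 1 everywhere the field is unchanged, which makes the identity limit a convenient sanity check.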
Anomaly detection is a key application of machine learning, but is generally
focused on the detection of outlying samples in the low probability density
regions of data. Here we instead present and motivate a method for unsupervised
in-distribution anomaly detection using a conditional density estimator,
designed to find unique, yet completely unknown, sets of samples residing in
high probability density regions. We apply this method towards the detection of
new physics in simulated Large Hadron Collider (LHC) particle collisions as
part of the 2020 LHC Olympics blind challenge, and show how we detected a new
particle appearing in only 0.08% of 1 million collision events. The results we
present are our original blind submission to the 2020 LHC Olympics, where it
achieved state-of-the-art performance.
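A toy version of in-distribution anomaly detection by density comparison can be sketched as below. This is an illustration only, using kernel density estimates as a stand-in for the conditional density estimator of the blind submission; all names and the window/sideband split are assumptions for the sketch.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def overdensity_score(x_window, x_sideband, bandwidth=0.3):
    """Score events in a signal window by the log ratio of their
    estimated density to the sideband expectation.  A localized,
    in-distribution signal appears as a set of events with large
    positive scores, even though they sit in a high-density region."""
    kde_w = KernelDensity(bandwidth=bandwidth).fit(x_window)
    kde_s = KernelDensity(bandwidth=bandwidth).fit(x_sideband)
    return kde_w.score_samples(x_window) - kde_s.score_samples(x_window)
```

The point of the construction is that a rare signal (here 0.08%-level rarity in the challenge) is found not as an outlier, but as a localized excess over the smooth background density.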
We propose a general purpose Bayesian inference algorithm for expensive likelihoods, replacing the stochastic term in the Langevin equation with a deterministic density gradient term. The particle density is evaluated from the current particle positions using a Normalizing Flow (NF), which is differentiable and has good generalization properties in high dimensions. We take advantage of NF preconditioning and NF-based Metropolis-Hastings updates for faster convergence. We show on various examples that the method is competitive against state-of-the-art sampling methods.
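The core idea can be sketched in a few lines. Below, a diagonal Gaussian fit to the current particles stands in for the Normalizing Flow, so this is a toy illustration of the deterministic density-gradient term rather than the paper's algorithm.

```python
import numpy as np

def deterministic_langevin(grad_log_p, x, n_steps=200, eps=0.05):
    """Deterministic Langevin sketch: the stochastic noise term is
    replaced by minus the gradient of the log of the current particle
    density, so the update is eps * (grad log p - grad log q).
    A diagonal Gaussian fit to the particles stands in for the NF."""
    for _ in range(n_steps):
        mu, var = x.mean(0), x.var(0) + 1e-6
        grad_log_q = -(x - mu) / var   # gradient of the Gaussian density fit
        x = x + eps * (grad_log_p(x) - grad_log_q)
    return x
```

At equilibrium the two gradient terms cancel, i.e. the particle density matches the target, which is the same fixed point stochastic Langevin dynamics has in expectation.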
Recent analyses of the Pal 5 and GD-1 tidal streams suggest that the inner
dark matter halo of the Milky Way is close to spherical, in tension with
predictions from collisionless N-body simulations of cosmological structure
formation. We use the Eris simulation to test whether the combination of
dissipative physics and hierarchical structure formation can produce Milky
Way-like galaxies whose dark matter halos match the tidal stream constraints
from the GD-1 and Pal 5 clusters. We use a dynamical model of the simulated
Eris galaxy to generate many realizations of the GD-1 and Pal 5 tidal streams,
marginalize over observational uncertainties in the cluster galactocentric
positions and velocities, and compare with the observational constraints. We
find that the total density and potential of Eris contributed by baryons and
dark matter satisfies constraints from the existing Milky Way stellar stream
data, as the baryons both round and redistribute the dark matter during the
dissipative formation of the galaxy, and provide a centrally-concentrated mass
distribution that rounds the inner potential. The Eris dark matter halo or a
spherical Navarro-Frenk-White dark matter halo work comparably well in modeling the
stream data. In contrast, the equivalent dark matter-only ErisDark simulation
produces a prolate halo that cannot reproduce the observed stream data. The
on-going Gaia mission will provide decisive tests of the consistency between
$\Lambda$CDM and Milky Way streams, and should distinguish between models like
Eris and more spherical halos.
Generating mocks for future sky surveys requires large volumes and high resolutions, which is computationally expensive even for fast simulations. In this work we develop numerical schemes to calibrate various halo and matter statistics in fast low-resolution simulations against high-resolution N-body and hydrodynamic simulations. For the halos, we improve the initial condition accuracy and develop a halo finder, "relaxed-FoF", in which the linking length is allowed to vary with halo mass and velocity dispersion. We show that our relaxed-FoF halo finder improves common statistics such as the halo bias, halo mass function, halo auto power spectrum in real and redshift space, cross-correlation coefficient with the reference halo catalog, and halo-matter cross power spectrum. We also incorporate the potential gradient descent (PGD) method into fast simulations to improve the matter distribution on nonlinear scales. By building a lightcone output, we show that the PGD method significantly improves the weak lensing convergence tomographic power spectrum. With these improvements FastPM is comparable to a high-resolution full N-body simulation of the same mass resolution, with two orders of magnitude fewer time steps. These techniques can be used to improve the halo and matter statistics of FastPM simulations for mock catalogs of future surveys such as DESI and LSST.
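For orientation, a plain friends-of-friends grouping with a single fixed linking length can be sketched as below. This is illustrative only: the "relaxed-FoF" finder described above additionally varies the linking length with halo mass and velocity dispersion, which this sketch does not implement, and `b` is used here as an absolute distance rather than a fraction of the mean interparticle spacing.

```python
import numpy as np
from scipy.spatial import cKDTree

def fof_groups(pos, b=0.2):
    """Friends-of-friends grouping with a single linking length `b`
    (an absolute distance here, for simplicity).  Linked pairs are
    merged with a union-find; the return value labels each particle's
    group by its root index."""
    tree = cKDTree(pos)
    parent = np.arange(len(pos))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in tree.query_pairs(b):       # all pairs closer than b
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
    return np.array([find(i) for i in range(len(pos))])
```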
We present MADLens, a Python package for producing non-Gaussian lensing convergence maps at arbitrary source redshifts with unprecedented precision. MADLens is designed to achieve high accuracy while keeping computational costs as low as possible. A MADLens simulation with only \(256^3\) particles produces convergence maps whose power spectra agree with theoretical lensing power spectra up to \(L{=}10000\) within the accuracy limits of HaloFit. This is made possible by a combination of a highly parallelizable particle-mesh algorithm, a sub-evolution scheme in the lensing projection, and a machine-learning-inspired sharpening step. Further, MADLens is fully differentiable with respect to the initial conditions of the underlying particle-mesh simulations and a number of cosmological parameters. These properties allow MADLens to be used as a forward model in Bayesian inference algorithms that require optimization or derivative-aided sampling. Another use case for MADLens is the production of large, high-resolution simulation sets as required for training novel deep-learning-based lensing analysis tools. We make the MADLens package publicly available under a Creative Commons License (https://github.com/VMBoehm/MADLens).
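For context, the lensing projection at the heart of such a code reduces, in the Born approximation, to a weighted sum over density planes. The sketch below shows only that integral, with illustrative units and a flat cosmology; it has none of MADLens's sub-evolution, sharpening, or differentiability.

```python
import numpy as np

def born_convergence(delta_planes, chi, a, dchi, chi_s, omega_m=0.3):
    """Born-approximation convergence from a stack of overdensity planes.
    `chi` and `chi_s` are comoving distances in Mpc/h, `a` the planes'
    scale factors, `dchi` the plane thickness.  Flat cosmology assumed."""
    # (3/2) Omega_m (H0/c)^2, with c/H0 = 2997.9 Mpc/h
    prefac = 1.5 * omega_m / 2997.9**2
    kernel = prefac * chi * (chi_s - chi) / chi_s / a * dchi
    return np.tensordot(kernel, delta_planes, axes=(0, 0))
```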
Fast N-body PM simulations with a small number of time steps, such as FastPM or COLA, have been remarkably successful in modeling galaxy statistics, but their limited small-scale force resolution and long time steps cannot deliver accurate halo matter profiles or matter power spectra. High-resolution N-body simulations improve on this, but lack baryonic effects, which can only be properly included in hydro simulations. Here we present a scheme to calibrate the fast simulations to mimic the precision of hydrodynamic or high-resolution N-body simulations. The scheme is based on a gradient descent of either an effective gravitational potential, which mimics the short-range force, or an effective enthalpy, which mimics gas hydrodynamics and feedback. The scheme is fast and differentiable, and can be incorporated as a post-processing step into any simulation. It gives very good results for the matter power spectrum for several of the baryonic feedback and dark matter simulations, and also gives improved dark matter halo profiles. The scheme is even able to find the large subhalos, and increases the correlation coefficient between the fast simulations and the high-resolution N-body or hydro simulations. It can also be used to add baryonic effects to high-resolution N-body simulations. While the method has free parameters that can be calibrated on various simulations, they can also be treated as astrophysical nuisance parameters describing baryonic effects, to be marginalized over during data analysis. In this view they provide an efficient parametrization of baryonic effects.
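A minimal sketch of such a gradient-descent correction on a mesh, assuming a band-pass-filtered inverse-Laplacian kernel; the filter form and the values of `alpha`, `kl`, `ks` are illustrative, not the calibrated parameters of the paper.

```python
import numpy as np

def pgd_displacement(delta, alpha=0.01, kl=1.0, ks=10.0, box=1.0):
    """PGD-style mesh displacement field: alpha times the gradient of a
    band-pass-filtered gravitational potential.  The filter suppresses
    very large scales (via kl) and very small scales (via ks), so only
    the intermediate, poorly resolved scales are corrected."""
    n = delta.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=box / n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0                       # avoid dividing the zero mode
    filt = np.exp(-kl**2 / k2) * np.exp(-(k2 / ks**2) ** 2)
    phi_k = -np.fft.fftn(delta) / k2 * filt
    return alpha * np.stack(
        [np.real(np.fft.ifftn(1j * kk * phi_k)) for kk in (kx, ky, kz)])
```

Moving particles (or mesh mass) along this displacement sharpens halo interiors without disturbing the well-modeled large scales, and the whole operation is differentiable, which is what allows it to be trained against a reference simulation.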
A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.