The frontier of simulation-based inference
Cranmer, Kyle; Brehmer, Johann; Louppe, Gilles
Proceedings of the National Academy of Sciences (PNAS), 12/2020, Volume 117, Issue 48
Journal Article, Web Resource
Peer-reviewed
Open access
Many domains of science have developed complex simulations to describe phenomena of interest. While these simulations provide high-fidelity models, they are poorly suited for inference and lead to challenging inverse problems. We review the rapidly developing field of simulation-based inference and identify the forces giving additional momentum to the field. Finally, we describe how the frontier is expanding so that a broad audience can appreciate the profound influence these developments may have on science.
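A minimal sketch can make the "inference with only a simulator" setting concrete. The example below is not from the paper; it shows rejection-sampling Approximate Bayesian Computation, one of the classical baselines the simulation-based-inference literature builds on, with a deliberately trivial stand-in simulator:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "simulator": draws data given a parameter theta. In practice
# this would be a complex, non-differentiable scientific simulation
# whose likelihood is intractable.
def simulate(theta, n=100):
    return rng.normal(loc=theta, scale=1.0, size=n)

observed = simulate(2.0)  # pretend this is the experimental data

# ABC rejection sampling: keep prior draws whose simulated summary
# statistic lands close to that of the observed data.
accepted = []
for _ in range(5000):
    theta = rng.uniform(-5, 5)                     # draw from the prior
    x = simulate(theta)
    if abs(x.mean() - observed.mean()) < 0.1:      # distance on a summary statistic
        accepted.append(theta)

posterior_mean = float(np.mean(accepted))
print(f"ABC posterior mean: {posterior_mean:.2f}")  # close to the true theta = 2.0
```

The inefficiency of this scheme (most simulations are thrown away, and a summary statistic must be chosen by hand) is exactly what the neural methods surveyed in the review aim to overcome.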
Machine learning has played an important role in the analysis of high-energy physics data for decades. The emergence of deep learning in 2012 allowed for machine learning tools which could adeptly handle higher-dimensional and more complex problems than previously feasible. This review is aimed at the reader who is familiar with high-energy physics but not machine learning. The connections between machine learning and high-energy physics data analysis are explored, followed by an introduction to the core concepts of neural networks, examples of the key results demonstrating the power of deep learning for analysis of LHC data, and discussion of future prospects and concerns.
Abstract
Recent progress in applying machine learning for jet physics has been built upon an analogy between calorimeters and images. In this work, we present a novel class of recursive neural networks built instead upon an analogy between QCD and natural languages. In the analogy, four-momenta are like words and the clustering history of sequential recombination jet algorithms is like the parsing of a sentence. Our approach works directly with the four-momenta of a variable-length set of particles, and the jet-based tree structure varies on an event-by-event basis. Our experiments highlight the flexibility of our method for building task-specific jet embeddings and show that recursive architectures are significantly more accurate and data efficient than previous image-based networks. We extend the analogy from individual jets (sentences) to full events (paragraphs), and show for the first time an event-level classifier operating on all the stable particles produced in an LHC event.
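The core idea, a fixed-size embedding computed bottom-up over a per-event clustering tree, can be sketched in a few lines. The weights and the single-cell architecture below are hypothetical simplifications, not the gated recursive cells of the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 8  # embedding dimension (illustrative)

# Hypothetical weights of one recursive cell; the paper's actual
# architecture is more elaborate.
W_leaf = rng.normal(size=(D, 4)) * 0.1       # maps a four-momentum to an embedding
W_node = rng.normal(size=(D, 2 * D)) * 0.1   # combines two child embeddings

def embed(node):
    """Recursively embed a jet clustering tree.

    A leaf is a four-momentum (length-4 array); an internal node is a
    pair (left_subtree, right_subtree), mirroring the clustering history
    of a sequential recombination jet algorithm.
    """
    if isinstance(node, np.ndarray):          # leaf: a particle four-momentum
        return np.tanh(W_leaf @ node)
    left, right = node
    children = np.concatenate([embed(left), embed(right)])
    return np.tanh(W_node @ children)         # combine children, like parsing a sentence

# Toy jet with three particles: ((p1, p2), p3).
# The tree shape varies on an event-by-event basis.
p1, p2, p3 = (rng.normal(size=4) for _ in range(3))
jet_embedding = embed(((p1, p2), p3))
print(jet_embedding.shape)  # (8,)
```

Because the recursion follows the clustering history, jets with different particle multiplicities all map to the same fixed-size embedding, which a downstream classifier can then consume.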
We describe likelihood-based statistical tests for use in high energy physics for the discovery of new phenomena and for construction of confidence intervals on model parameters. We focus on the properties of the test procedures that allow one to account for systematic uncertainties. Explicit formulae for the asymptotic distributions of test statistics are derived using results of Wilks and Wald. We motivate and justify the use of a representative data set, called the “Asimov data set”, which provides a simple method to obtain the median experimental sensitivity of a search or measurement as well as fluctuations about this expectation.
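For the simplest counting experiment with known background (so none of the systematic-uncertainty machinery the paper develops), these asymptotic formulae reduce to a few lines; the yields below are illustrative:

```python
import math

def discovery_significance(n, b):
    """Asymptotic discovery significance Z = sqrt(q0) for a counting
    experiment with observed count n and known expected background b."""
    if n <= b:
        return 0.0
    # Profile likelihood ratio test statistic for the mu = 0 hypothesis.
    q0 = 2.0 * (n * math.log(n / b) - (n - b))
    return math.sqrt(q0)

# Median expected sensitivity: evaluate on the Asimov data set, i.e.
# replace the observed count by its expectation s + b.
s, b = 10.0, 100.0
z_asimov = discovery_significance(s + b, b)
print(f"median expected significance: {z_asimov:.3f} sigma")
# Equals Z_A = sqrt(2*((s + b)*ln(1 + s/b) - s)); for s << b this
# approaches the familiar approximation s / sqrt(b).
```

The point of the Asimov construction is that a single evaluation on this representative data set replaces an ensemble of toy experiments when estimating the median sensitivity.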
Based on the established task of identifying boosted, hadronically decaying top quarks, we compare a wide range of modern machine learning approaches. Unlike most established methods, they rely on low-level input, for instance calorimeter output. While their network architectures are vastly different, their performance is comparatively similar. In general, we find that these new approaches are extremely powerful and great fun.
Source Code Repositories
* Github (http://github.com): A web-based hosting service for software development projects that use the Git revision control system, including many open-source projects.
* Git (http://git-scm.com): A free and open-source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
* Mercurial (http://mercurial.selenic.com): A free, distributed source control management tool.
Systems to Package, Access, and Execute Data and Code
* IPython Notebook (http://ipython.org/notebook.html): A web-based interactive computational environment where you can combine code execution, text, mathematics, plots, and rich media into a single document.
* ROpenSci (http://ropensci.org): A suite of packages that allow access to data repositories through the R statistical programming environment.
* Authorea (https://authorea.com): A collaborative online word processor for scholarly papers that allows the writing of web-native, living, dynamic, "executable" articles that include text, mathematical notation, images, and data.
Jet classification is an important ingredient in measurements and searches for new physics at particle colliders, and secondary vertex reconstruction is a key intermediate step in building powerful jet classifiers. We use a neural network to perform vertex finding inside jets in order to improve the classification performance, with a focus on separation of bottom vs. charm flavor tagging. We implement a novel, universal set-to-graph model, which takes into account information from all tracks in a jet to determine if pairs of tracks originated from a common vertex. We explore different performance metrics and find our method to outperform traditional approaches in accurate secondary vertex reconstruction. We also find that improved vertex finding leads to a significant improvement in jet classification performance.
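The set-to-graph output format, one symmetric score per pair of set elements, can be illustrated with a toy in which the learned network is replaced by a simple distance-based score; the tracks, vertices, and threshold below are all made up for illustration and have nothing to do with the paper's trained model:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "tracks": 2D points scattered around two hypothetical vertices.
vertices = np.array([[0.0, 0.0], [5.0, 5.0]])
labels = np.array([0, 0, 0, 1, 1])                   # true vertex assignment
tracks = vertices[labels] + rng.normal(scale=0.2, size=(5, 2))

# Set-to-graph in miniature: map a set of elements to a symmetric score
# for every pair. Here the "model" is just negative distance; the paper
# instead learns this map from all tracks in the jet jointly.
def pair_scores(x):
    diff = x[:, None, :] - x[None, :, :]
    return -np.linalg.norm(diff, axis=-1)            # higher = more likely same vertex

edges = pair_scores(tracks) > -1.5                   # threshold into same-vertex edges

# Compare predicted and true same-vertex relations over all unordered pairs.
iu = np.triu_indices(5, k=1)
predicted_same = edges[iu]
true_same = (labels[:, None] == labels[None, :])[iu]
accuracy = float((predicted_same == true_same).mean())
print(f"pairwise accuracy: {accuracy:.2f}")
```

Turning such pairwise decisions into vertex candidates (and doing so robustly for real detector tracks) is the part the paper's learned model handles.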
The cabinetry library provides a Python-based solution for building and steering binned template fits. It tightly integrates with the pythonic High Energy Physics ecosystem, and in particular with pyhf for statistical inference. cabinetry uses a declarative approach for building statistical models, with a JSON schema describing possible configuration choices. Model building instructions can additionally be provided via custom code, which is automatically executed when applicable at key steps of the workflow. The library implements interfaces for performing maximum likelihood fitting, upper parameter limit determination, and discovery significance calculation. cabinetry also provides a range of utilities to study and disseminate fit results. These include visualizations of the fit model and data, visualizations of template histograms and fit results, ranking of nuisance parameters by their impact, a goodness-of-fit calculation, and likelihood scans. The library takes a modular approach, allowing users to include some or all of its functionality in their workflow.
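The statistical model behind such binned template fits can be written out by hand for the simplest case. The sketch below is not cabinetry's API (cabinetry delegates model building and inference to pyhf); it spells out a one-parameter binned Poisson likelihood with illustrative yields, maximized by a brute-force scan:

```python
import numpy as np

# Illustrative binned templates for a single channel.
signal = np.array([5.0, 10.0, 5.0])        # expected signal yield per bin
background = np.array([50.0, 50.0, 50.0])  # expected background yield per bin
data = np.array([58, 71, 54])              # observed counts per bin

# Negative log-likelihood (up to a constant) for the binned Poisson model
# with a single signal-strength parameter mu scaling the signal template.
def nll(mu):
    lam = mu * signal + background         # expected total yield per bin
    return np.sum(lam - data * np.log(lam))

# Maximum likelihood estimate via a simple grid scan (a real fit would
# use a proper minimizer and include nuisance parameters).
mus = np.linspace(0.0, 5.0, 2001)
nlls = np.array([nll(m) for m in mus])
mu_hat = mus[np.argmin(nlls)]

# Approximate 68% interval from the likelihood scan: Delta(-log L) <= 0.5.
interval = mus[nlls - nlls.min() <= 0.5]
print(f"mu_hat = {mu_hat:.2f}, 68% interval ~ [{interval.min():.2f}, {interval.max():.2f}]")
```

cabinetry's contribution is steering exactly this kind of fit declaratively, at realistic scale, with many samples, channels, and nuisance parameters described in a validated configuration rather than hand-written code.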