Abstract
Jet grooming has emerged as a necessary and vital tool for mitigating contamination radiation in jets. The additional restrictions on emissions imposed by the groomer can result in non-smooth behavior of resulting fixed-order distributions of observables measured on groomed jets. As a concrete example, we study the cusp in the hemisphere mass distribution of \(e^+e^-\to\) hadrons events groomed with soft drop. We identify the leading emissions that contribute in the region about the cusp and formulate an all-orders factorization theorem that describes how the cusp is resolved through arbitrary strongly-ordered soft and collinear emissions. The factorization theorem exhibits numerous novel features such as contributions from collinear modes that can cross hemisphere boundaries as well as requiring explicit subtraction of the limit in which resolved emissions become collinear to the hard core. We present resummation of the cusp region through next-to-leading logarithmic accuracy and describe how it can be matched with established factorization theorems that describe other groomed phase space regions.
A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). Methods made use of modern machine learning tools and were based on unsupervised learning (autoencoders, generative adversarial networks, normalizing flows), weakly supervised learning, and semi-supervised learning. This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.
A growing number of weak and unsupervised machine learning approaches to anomaly detection are being proposed to significantly extend the search program at the Large Hadron Collider (LHC) and elsewhere. One of the prototypical examples for these methods is the search for resonant new physics, where a bump hunt can be performed in an invariant mass spectrum after applying a classifier to enhance the presence of a potential signal. A significant challenge to methods that rely entirely on data is that they are susceptible to sculpting artificial bumps from the dependence of the machine learning classifier on the invariant mass. We explore two solutions to this challenge by minimally incorporating simulation into the learning. In particular, we study the robustness of simulation assisted likelihood-free anomaly detection to correlations between the classifier and the invariant mass. Next, we propose a new approach that only uses the simulation for decorrelation but uses the classification without labels approach for achieving signal sensitivity. Both methods are compared using a full background fit analysis on simulated data from the LHC Olympics and are robust to correlations in the data.
A growing number of weak- and unsupervised machine learning approaches to anomaly detection are being proposed to significantly extend the search program at the Large Hadron Collider and elsewhere. One of the prototypical examples for these methods is the search for resonant new physics, where a bump hunt can be performed in an invariant mass spectrum. A significant challenge to methods that rely entirely on data is that they are susceptible to sculpting artificial bumps from the dependence of the machine learning classifier on the invariant mass. We explore two solutions to this challenge by minimally incorporating simulation into the learning. In particular, we study the robustness of Simulation Assisted Likelihood-free Anomaly Detection (SALAD) to correlations between the classifier and the invariant mass. Next, we propose a new approach that only uses the simulation for decorrelation but the Classification without Labels (CWoLa) approach for achieving signal sensitivity. Both methods are compared using a full background fit analysis on simulated data from the LHC Olympics and are robust to correlations in the data.
A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.