Abstract
Discoveries of new phenomena often involve a dedicated search for a hypothetical physics signature. Recently, novel deep learning techniques have emerged for anomaly detection in the absence of a signal prior. However, by ignoring signal priors, the sensitivity of these approaches is significantly reduced. We present a new strategy dubbed Quasi Anomalous Knowledge (QUAK), whereby we introduce alternative signal priors that capture some of the salient features of new physics signatures, allowing for the recovery of sensitivity even when the alternative signal is incorrect. This approach can be applied to a broad range of physics models and neural network architectures. In this paper, we apply QUAK to anomaly detection of new physics events at the CERN Large Hadron Collider utilizing variational autoencoders with normalizing flow.
Bhargava, Hanke, and Shankar have recently shown that the asymptotic average
$2$-torsion subgroup size of the family of class groups of monogenized cubic
fields with positive and negative discriminants is $3/2$ and $2$, respectively.
In this paper, we provide strong computational evidence for these asymptotes.
We then develop a pair of novel conjectures that predicts, for $p$ prime, the
asymptotic average $p$-torsion subgroup size in class groups of monogenized
cubic fields.
Large Language Models (LLMs) are shifting how scientific research is done. It is imperative to understand how researchers interact with these models and how scientific sub-communities like astronomy might benefit from them. However, there is currently no standard for evaluating the use of LLMs in astronomy. Therefore, we present the experimental design for an evaluation study on how astronomy researchers interact with LLMs. We deploy a Slack chatbot that can answer queries from users via Retrieval-Augmented Generation (RAG); these responses are grounded in astronomy papers from arXiv. We record and anonymize user questions and chatbot answers, user upvotes and downvotes to LLM responses, user feedback to the LLM, and retrieved documents and similarity scores with the query. Our data collection method will enable future dynamic evaluations of LLM tools for astronomy.
The exponential growth of astronomical literature poses significant
challenges for researchers navigating and synthesizing general insights or even
domain-specific knowledge. We present Pathfinder, a machine learning framework
designed to enable literature review and knowledge discovery in astronomy,
focusing on semantic searching with natural language instead of syntactic
searches with keywords. Utilizing state-of-the-art large language models (LLMs)
and a corpus of 350,000 peer-reviewed papers from the Astrophysics Data System
(ADS), Pathfinder offers an innovative approach to scientific inquiry and
literature exploration. Our framework couples advanced retrieval techniques
with LLM-based synthesis to search astronomical literature by semantic context
as a complement to currently existing methods that use keywords or citation
graphs. It addresses complexities of jargon, named entities, and temporal
aspects through time-based and citation-based weighting schemes. We demonstrate
the tool's versatility through case studies, showcasing its application in
various research scenarios. The system's performance is evaluated using custom
benchmarks, including single-paper and multi-paper tasks. Beyond literature
review, Pathfinder offers unique capabilities for reformatting answers in ways
that are accessible to various audiences (e.g. in a different language or as
simplified text), visualizing research landscapes, and tracking the impact of
observatories and methodologies. This tool represents a significant advancement
in applying AI to astronomical research, aiding researchers at all career
stages in navigating modern astronomy literature.
Recent experimental searches for particles beyond the Standard Model (BSM) have yielded little in the realm of new physics discoveries. A number of research efforts have adopted new anomaly detection strategies which utilize density estimation algorithms based on unsupervised and semi-supervised machine learning. However, these efforts rely exclusively on QCD background priors, and thus drastically limit their own anomaly detection capabilities.
In this thesis, we integrate an unsupervised density estimation algorithm, neural spline normalizing flows, into an anomaly detection strategy called Quasi-Anomalous Knowledge (QUAK), which allows us to take advantage of signal priors in addition to QCD background priors. The introduction of a signal prior allows us to learn the features of a particular type of BSM dijet event, giving us insight into the underlying variable distributions of hidden signals in CMS data. Through several studies on both Monte Carlo samples and 13 TeV data from CMS, we demonstrate that QUAK with normalizing flows (QUAK-NF) can be a powerful tool for conducting searches for BSM physics.
M.Eng.
A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.