Data preservation in high energy physics
Basaglia, T.; Bellis, M.; Blomer, J. ...
The European physical journal. C, Particles and fields, 09/2023, Volume 83, Issue 9
Journal Article
Peer reviewed
Open access
Data preservation is a mandatory specification for any present and future experimental facility and it is a cost-effective way of doing fundamental research by exploiting unique data sets in the light of the continuously increasing theoretical understanding. This document summarizes the status of data preservation in high energy physics. The paradigms and the methodological advances are discussed from a perspective of more than ten years of experience with a structured effort at the international level. The status and the scientific return related to the preservation of data accumulated at large collider experiments are presented, together with an account of ongoing efforts to ensure long-term analysis capabilities for ongoing and future experiments. Transverse projects aimed at generic solutions, most of which are specifically inspired by open science and FAIR principles, are presented as well. An outlook and an action plan are also presented.
In this paper we describe the development of a streamlined framework for large-scale ATLAS pMSSM reinterpretations of LHC Run-2 analyses using containerised computational workflows. The project is looking to assess the global coverage of BSM physics and requires running O(5k) computational workflows representing pMSSM model points. Following ATLAS Analysis Preservation policies, many analyses have been preserved as containerised Yadage workflows, and after validation were added to a curated selection for the pMSSM study. To run the workflows at scale, we utilised the REANA reusable analysis platform. We describe how the REANA platform was enhanced to ensure the best concurrent throughput by internal service scheduling changes. We discuss the scalability of the approach on Kubernetes clusters from 500 to 5000 cores. Finally, we demonstrate the possibility of using additional ad-hoc public cloud infrastructure resources by running the same workflows on the Google Cloud Platform.
We describe a novel approach for experimental High-Energy Physics (HEP) data analyses that is centred around the declarative rather than imperative paradigm when describing analysis computational tasks. The analysis process can be structured in the form of a Directed Acyclic Graph (DAG), where each graph vertex represents a unit of computation with its inputs and outputs, and the graph edges describe the interconnection of various computational steps. We have developed REANA, a platform for reproducible data analyses, that supports several such DAG workflow specifications. The REANA platform parses the analysis workflow and dispatches its computational steps to various supported computing backends (Kubernetes, HTCondor, Slurm). The focus on declarative rather than imperative programming enables researchers to concentrate on the problem domain at hand without having to think about implementation details such as scalable job orchestration. The declarative programming approach is further exemplified by a multi-level job cascading paradigm that was implemented in the Yadage workflow specification language. We present two recent LHC particle physics analyses, ATLAS searches for dark matter and CMS jet energy correction pipelines, where the declarative approach was successfully applied. We argue that the declarative approach to data analyses, combined with recent advancements in container technology, facilitates the portability of computational data analyses to various compute backends, enhancing the reproducibility and the knowledge preservation behind particle physics data analyses.
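The DAG idea in the abstract above can be illustrated with a minimal Python sketch. It is not REANA or Yadage code: the step names, file names and the helper functions `build_dag` and `execution_order` are hypothetical. Each step declares only its inputs and outputs; the edges and a valid execution order are derived from those declarations rather than written out by hand.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical analysis steps. Each one declares only what it reads and writes;
# nothing here says in which order the steps must run.
STEPS = {
    "skim": {"inputs": ["raw.root"], "outputs": ["skim.root"]},
    "histograms": {"inputs": ["skim.root"], "outputs": ["hists.json"]},
    "fit": {"inputs": ["hists.json", "model.yaml"], "outputs": ["fit.json"]},
    "plot": {"inputs": ["hists.json", "fit.json"], "outputs": ["plot.pdf"]},
}


def build_dag(steps):
    """Map each step to the set of steps that produce its inputs (DAG edges)."""
    producer = {}
    for name, spec in steps.items():
        for output in spec["outputs"]:
            producer[output] = name
    # Inputs with no producer (e.g. raw.root) are external files, not edges.
    return {
        name: {producer[i] for i in spec["inputs"] if i in producer}
        for name, spec in steps.items()
    }


def execution_order(steps):
    """Return one valid order in which a scheduler could dispatch the steps."""
    return list(TopologicalSorter(build_dag(steps)).static_order())


if __name__ == "__main__":
    print(execution_order(STEPS))
```

In this toy graph the dependencies form a single chain, so only one order is valid. A real platform would additionally dispatch independent steps in parallel to a compute backend, which this sketch does not attempt.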
The data acquisition system (DAQ) of the CMS experiment at the CERN Large Hadron Collider (LHC) assembles events of 2 MB at a rate of 100 kHz. The event builder collects event fragments from about 750 sources and assembles them into complete events, which are then handed to the High-Level Trigger (HLT) processes running on O(1000) computers. The aging event-building hardware will be replaced during the long shutdown 2 of the LHC taking place in 2019/20. The future data networks will be based on 100 Gb/s interconnects using Ethernet and Infiniband technologies. More powerful computers may allow combining the currently separate functionality of the readout and builder units into a single I/O processor handling simultaneously 100 Gb/s of input and output traffic. It might be beneficial to preprocess data originating from specific detector parts or regions before handing it to generic HLT processors. Therefore, we will investigate how specialized coprocessors, e.g. GPUs, could be integrated into the event builder. We will present the envisioned changes to the event builder compared to today’s system. Initial measurements of the performance of the data networks under the event-building traffic pattern will be shown. Implications of a folded network architecture for the event building and corresponding changes to the software implementation will be discussed.
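The figures quoted in the abstract above imply an aggregate event-building throughput that can be checked with simple arithmetic. The Python sketch below uses only the quoted numbers and makes three assumptions that are not in the source: 1 MB is taken as 10^6 bytes, fragments are assumed to be of equal size across sources, and protocol overhead is ignored. The function names are illustrative.

```python
# Figures quoted in the abstract.
EVENT_SIZE_BYTES = 2_000_000       # 2 MB per event (assuming 1 MB = 10**6 bytes)
EVENT_RATE_HZ = 100_000            # 100 kHz event rate
N_SOURCES = 750                    # "about 750" fragment sources
LINK_BITS_PER_S = 100_000_000_000  # one 100 Gb/s interconnect


def aggregate_throughput_bytes_per_s():
    """Total data volume the event builder must move per second."""
    return EVENT_SIZE_BYTES * EVENT_RATE_HZ


def mean_fragment_bytes():
    """Average fragment size, assuming all sources contribute equally."""
    return EVENT_SIZE_BYTES / N_SOURCES


def min_links():
    """Lower bound on fully utilised 100 Gb/s links, ignoring overhead."""
    bits_per_s = aggregate_throughput_bytes_per_s() * 8
    return -(-bits_per_s // LINK_BITS_PER_S)  # integer ceiling division


if __name__ == "__main__":
    print(aggregate_throughput_bytes_per_s() / 1e9, "GB/s")  # 200.0 GB/s
    print(round(mean_fragment_bytes()), "bytes per fragment")  # 2667
    print(min_links(), "links at minimum")  # 16
```

The result of 16 links is only a floor under these idealised assumptions; a real system needs headroom for uneven fragment sizes and for the traffic pattern of event building.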
The Compact Muon Solenoid (CMS) is one of the experiments at the CERN Large Hadron Collider (LHC). The CMS Online Monitoring system (OMS) is an upgrade and successor to the CMS Web-Based Monitoring (WBM) system, which is an essential tool for shift crew members, detector subsystem experts, operations coordinators, and those performing physics analyses. The CMS OMS is divided into aggregation and presentation layers. Communication between layers uses RESTful JSON:API compliant requests. The aggregation layer is responsible for collecting data from heterogeneous sources, storage of transformed and pre-calculated (aggregated) values and exposure of data via the RESTful API. The presentation layer displays detector information via a modern, user-friendly and customizable web interface. The CMS OMS user interface is composed of a set of cutting-edge software frameworks and tools to display non-event data to any authenticated CMS user worldwide. The tree-like component structure of the web interface comprises (top-down): workspaces, folders, pages, controllers and portlets. A clear hierarchy gives the required flexibility and control for content organization. Each bottom element instantiates a portlet and is a reusable component that displays a single aspect of data, like a table, a plot, an article, etc. Pages consist of multiple different portlets and can be customized at runtime by using a drag-and-drop technique. This is how a single page can easily include information from multiple online sources. Different pages give access to a summary of the current status of the experiment, as well as convenient access to historical data. This paper describes the CMS OMS architecture, core concepts and technologies of the presentation layer.
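To make the JSON:API exchange mentioned in the abstract above more concrete, here is a minimal Python sketch of how a presentation layer might turn a JSON:API document into table rows for a portlet. The resource type `runs`, its attributes and the `flatten` helper are invented for illustration and are not taken from the CMS OMS API; only the top-level `data` member with `type`, `id` and `attributes` follows the JSON:API document structure.

```python
import json

# Hypothetical response body. The "runs" type and its attributes are made up;
# the data/type/id/attributes layout is the standard JSON:API resource shape.
SAMPLE_RESPONSE = """
{
  "data": [
    {"type": "runs", "id": "1001",
     "attributes": {"duration_s": 3600, "status": "ok"}},
    {"type": "runs", "id": "1002",
     "attributes": {"duration_s": 120, "status": "aborted"}}
  ]
}
"""


def flatten(document):
    """Turn JSON:API resource objects into flat rows suitable for a table."""
    data = document.get("data") or []
    if isinstance(data, dict):  # a single resource is also a valid document
        data = [data]
    return [
        {"type": res["type"], "id": res["id"], **res.get("attributes", {})}
        for res in data
    ]


if __name__ == "__main__":
    for row in flatten(json.loads(SAMPLE_RESPONSE)):
        print(row)
```

Because every resource carries the same envelope, a single helper of this kind could feed different portlets (a table, a plot) without knowing which source the aggregation layer collected the values from.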
The Compact Muon Solenoid (CMS) experiment makes extensive use of alignment and calibration measurements in several data processing workflows: in the High Level Trigger, in the processing of the recorded collisions and in the production of simulated events for data analysis and studies of detector upgrades. A complete alignment and calibration scenario is factored in approximately three hundred records, which are updated independently and can have time-dependent content, to reflect the evolution of the detector and data taking conditions. Given the complexity of the CMS condition scenarios and the large number (50) of experts who actively measure and release calibration data, in 2015 a novel web-based service was developed to structure and streamline their management. The cmsDbBrowser provides an intuitive and easily accessible entry point for the navigation of existing conditions by any CMS member, for the bookkeeping of record updates and for the actual composition of complete calibration scenarios. This paper describes the design, choice of technologies and the first year of usage in production of the cmsDbBrowser.
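The notion of independently updated records with time-dependent content can be sketched as an interval-of-validity lookup. The Python example below is an illustration only and does not reflect the actual CMS conditions database or the cmsDbBrowser: the `ConditionRecord` class, the record names, the payload strings and the choice of a run number as the time key are all assumptions.

```python
from bisect import bisect_right


class ConditionRecord:
    """Hypothetical time-dependent record.

    Each payload is valid from its 'since' run number until the next entry
    begins. Keying validity by run number is an assumption of this sketch.
    """

    def __init__(self, name):
        self.name = name
        self._since = []     # sorted start-of-validity run numbers
        self._payloads = []  # payloads aligned with self._since

    def update(self, since, payload):
        """Add a payload valid from run 'since', replacing one at the same start."""
        idx = bisect_right(self._since, since)
        if idx and self._since[idx - 1] == since:
            self._payloads[idx - 1] = payload
        else:
            self._since.insert(idx, since)
            self._payloads.insert(idx, payload)

    def lookup(self, run):
        """Return the payload whose validity interval contains 'run'."""
        idx = bisect_right(self._since, run)
        if idx == 0:
            raise KeyError(f"{self.name}: no payload valid for run {run}")
        return self._payloads[idx - 1]


def compose_scenario(records, run):
    """Resolve every record for one run, giving a complete (toy) scenario."""
    return {rec.name: rec.lookup(run) for rec in records}


if __name__ == "__main__":
    align = ConditionRecord("tracker_alignment")
    align.update(1, "align_v1")
    align.update(200, "align_v2")
    gains = ConditionRecord("ecal_gains")
    gains.update(1, "gains_v1")
    gains.update(100, "gains_v2")
    print(compose_scenario([align, gains], run=150))
```

Keeping each record's history separate is what lets experts update one calibration without touching the others, while a scenario for any given run is still resolved in a single pass.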
40 MHz Level-1 Trigger Scouting for CMS
Badaro, Gilbert; Behrens, Ulf; Branson, James ...
EPJ Web of Conferences, 01/2020, Volume 245
Journal Article, Conference Proceeding
Peer reviewed
Open access
The CMS experiment will be upgraded for operation at the High-Luminosity LHC to maintain and extend its physics performance under extreme pileup conditions. Upgrades will include an entirely new tracking system, supplemented by a track finder processor providing tracks at Level-1, as well as a high-granularity calorimeter in the endcap region. New front-end and back-end electronics will also provide the Level-1 trigger with high-resolution information from the barrel calorimeter and the muon systems. The upgraded Level-1 processors, based on powerful FPGAs, will be able to carry out sophisticated feature searches with resolutions often similar to the offline ones, while keeping pileup effects under control. In this paper, we discuss the feasibility of a system capturing Level-1 intermediate data at the beam-crossing rate of 40 MHz and carrying out online analyses based on these limited-resolution data. This 40 MHz scouting system would provide fast and virtually unlimited statistics for detector diagnostics, alternative luminosity measurements and, in some cases, calibrations. It has the potential to enable the study of otherwise inaccessible signatures, either too common to fit in the Level-1 accept budget, or with requirements which are orthogonal to “mainstream” physics, such as long-lived particles. We discuss the requirements and possible architecture of a 40 MHz scouting system, as well as some of the physics potential, and results from a demonstrator operated at the end of Run-2 using the Global Muon Trigger data from CMS. Plans for further demonstrators envisaged for Run-3 are also discussed.