The High Luminosity LHC (HL-LHC) will start operating in 2027 after the third Long Shutdown (LS3), and is designed to provide an ultimate instantaneous luminosity of 7.5 × 10³⁴ cm⁻² s⁻¹, at the price of extreme pileup of up to 200 interactions per crossing. The number of overlapping interactions in HL-LHC collisions, their density, and the resulting intense radiation environment warrant an almost complete upgrade of the CMS detector. The upgraded CMS detector will be read out by approximately fifty thousand high-speed front-end optical links at an unprecedented data rate of up to 80 Tb/s, for an average expected total event size of approximately 8–10 MB. Following the present established design, the CMS trigger and data acquisition system will continue to feature two trigger levels, with only one synchronous hardware-based Level-1 Trigger (L1), consisting of custom electronic boards and operating on dedicated data streams, and a second level, the High Level Trigger (HLT), using software algorithms running asynchronously on standard processors and making use of the full detector data to select events for offline storage and analysis. The upgraded CMS data acquisition system will collect data fragments for Level-1 accepted events from the detector back-end modules at a rate up to 750 kHz, aggregate fragments corresponding to individual Level-1 accepts into events, and distribute them to the HLT processors where they will be filtered further. Events accepted by the HLT will be stored permanently at a rate of up to 7.5 kHz. This paper describes the baseline design of the DAQ and HLT systems for Phase-2 of CMS.
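The rates and event sizes quoted in this abstract imply an event-building throughput that can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes decimal units (1 MB = 10⁶ bytes) and that the average event size holds at the maximum Level-1 rate, neither of which is stated in the abstract. The variable and function names are illustrative. Note that the 80 Tb/s figure in the abstract refers to the front-end links, not to the event-building traffic estimated here.

```python
# Back-of-the-envelope estimate of the Phase-2 event-building throughput.
# Assumptions (not from the abstract): decimal units, and the average
# event size applies at the maximum Level-1 accept rate.

L1_RATE_HZ = 750e3            # maximum Level-1 accept rate
HLT_OUTPUT_HZ = 7.5e3         # maximum rate of events stored permanently
EVENT_SIZE_MB = (8.0, 10.0)   # expected average event size range


def throughput_tbps(rate_hz: float, event_size_mb: float) -> float:
    """Return the data throughput in Tb/s for a given rate and event size."""
    bytes_per_second = rate_hz * event_size_mb * 1e6
    return bytes_per_second * 8 / 1e12


low, high = (throughput_tbps(L1_RATE_HZ, size) for size in EVENT_SIZE_MB)
hlt_rejection = L1_RATE_HZ / HLT_OUTPUT_HZ

print(f"Event-building throughput: {low:.0f} to {high:.0f} Tb/s")
print(f"HLT rate reduction factor: {hlt_rejection:.0f}")
```

Under these assumptions the estimate spans roughly 48 to 60 Tb/s, and the HLT has to reduce the event rate by a factor of about 100.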
40 MHz Level-1 Trigger Scouting for CMS
Badaro, Gilbert; Behrens, Ulf; Branson, James ...
EPJ Web of Conferences, 01/2020, Volume: 245
Journal Article, Conference Proceeding
Peer reviewed
Open access
The CMS experiment will be upgraded for operation at the High-Luminosity LHC to maintain and extend its physics performance under extreme pileup conditions. Upgrades will include an entirely new tracking system, supplemented by a track finder processor providing tracks at Level-1, as well as a high-granularity calorimeter in the endcap region. New front-end and back-end electronics will also provide the Level-1 trigger with high-resolution information from the barrel calorimeter and the muon systems. The upgraded Level-1 processors, based on powerful FPGAs, will be able to carry out sophisticated feature searches with resolutions often similar to the offline ones, while keeping pileup effects under control. In this paper, we discuss the feasibility of a system capturing Level-1 intermediate data at the beam-crossing rate of 40 MHz and carrying out online analyses based on these limited-resolution data. This 40 MHz scouting system would provide fast and virtually unlimited statistics for detector diagnostics, alternative luminosity measurements and, in some cases, calibrations. It has the potential to enable the study of otherwise inaccessible signatures, either too common to fit in the Level-1 accept budget, or with requirements which are orthogonal to “mainstream” physics, such as long-lived particles. We discuss the requirements and possible architecture of a 40 MHz scouting system, as well as some of the physics potential, and results from a demonstrator operated at the end of Run-2 using the Global Muon Trigger data from CMS. Plans for further demonstrators envisaged for Run-3 are also discussed.
The Data Acquisition (DAQ) system of the Compact Muon Solenoid (CMS) experiment at the LHC is a complex system responsible for the data readout, event building and recording of accepted events. Its proper functioning plays a critical role in the data-taking efficiency of the CMS experiment. In order to ensure high availability and recover promptly in the event of hardware or software failure of the subsystems, an expert system, the DAQ Expert, has been developed. It aims at improving the data-taking efficiency, reducing human error in operations and minimising the on-call expert demand. Introduced at the beginning of 2017, it assists the shift crew and the system experts in recovering from operational faults, streamlining the post-mortem analysis and, at the end of Run 2, triggering fully automatic recovery without human intervention. DAQ Expert analyses the real-time monitoring data originating from the DAQ components and the high-level trigger, updated every few seconds. It pinpoints data flow problems and recovers from them automatically or after operator approval. We analyse the CMS downtime in the 2018 run, focusing on what was improved with the introduction of automated recovery, and present the challenges and design of encoding the expert knowledge into automated recovery jobs. Furthermore, we demonstrate the web-based ReactJS interfaces that ensure effective cooperation between the human operators in the control room and the automated recovery system. We report on the operational experience with automated recovery.
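The idea of encoding expert knowledge as rules evaluated on periodic monitoring snapshots can be illustrated with a minimal sketch. This is not the DAQ Expert implementation: the rule names, snapshot fields, thresholds and recovery descriptions below are all hypothetical and chosen only to show the pattern of a condition, a suggested recovery, and a flag deciding whether operator approval is needed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """One piece of encoded expert knowledge (all fields hypothetical)."""
    name: str
    condition: Callable[[dict], bool]  # evaluated on a monitoring snapshot
    recovery: str                      # human-readable recovery action
    automatic: bool                    # True: run without operator approval


RULES = [
    Rule("rate_stuck_at_zero",
         lambda s: s["run_ongoing"] and s["l1_rate_hz"] == 0,
         "stop and restart the run",
         automatic=False),
    Rule("persistent_backpressure",
         lambda s: s["backpressure_fraction"] > 0.5,  # illustrative threshold
         "reconfigure the affected subsystem",
         automatic=True),
]


def diagnose(snapshot: dict,
             operator_approves: Callable[[Rule], bool] = lambda rule: False):
    """Return the recovery actions to execute for one monitoring snapshot."""
    actions = []
    for rule in RULES:
        if rule.condition(snapshot):
            if rule.automatic or operator_approves(rule):
                actions.append((rule.name, rule.recovery))
    return actions


snapshot = {"run_ongoing": True, "l1_rate_hz": 0, "backpressure_fraction": 0.9}
print(diagnose(snapshot))
```

Keeping the approval decision in a separate callable mirrors the distinction made in the abstract between fully automatic recovery and recovery after operator approval.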
The second phase of the LHC, the High-Luminosity LHC, is scheduled to start in 2029, after a shutdown during which the beam intensity and focusing will be significantly upgraded. For this HL-LHC era, the CMS detector will also receive an extensive upgrade, primarily to maintain its physics performance at increasing pileup. The Phase-2 CMS Level-1 trigger rate will increase to 750 kHz, for an estimated data rate in excess of 50 Tbit/s. The Phase-2 CMS off-detector electronics will be based on the ATCA standard, with back-end boards receiving the detector data from the on-detector front-ends via custom, radiation-tolerant optical links. The CMS Phase-2 data acquisition design tightens the integration between trigger control and data flow, extending the synchronous regime of the DAQ system. At the core of the design is the DAQ and Timing Hub, a custom ATCA hub card forming the bridge between the different, detector-specific control and readout electronics and the common timing, trigger, and control systems. The overall synchronisation and data flow of the experiment is handled by the Trigger and Timing Control and Distribution System. For increased flexibility during commissioning and calibration runs, the design of the Phase-2 trigger and timing distribution system breaks with the traditional distribution tree, in favour of a configurable network connecting multiple independent control units to all off-detector endpoints. In order to reduce the number of custom hardware designs required, the DAQ hardware is designed such that it can also be used to implement the Trigger and Timing Control and Distribution System.
The data acquisition (DAQ) of the Compact Muon Solenoid (CMS) experiment at CERN collects data for events accepted by the Level-1 Trigger from the different detector systems and assembles them in an event builder prior to making them available for further selection in the High Level Trigger, and finally storing the selected events for offline analysis. In addition to the central DAQ providing global acquisition functionality, several separate, so-called “MiniDAQ” setups allow operating independent data acquisition runs using an arbitrary subset of the CMS subdetectors.
During Run 2 of the LHC, MiniDAQ setups were running their event builder and High Level Trigger applications on dedicated resources, separate from those used for the central DAQ. This cleanly separated MiniDAQ setups from the central DAQ system, but also meant limited throughput and a fixed number of possible MiniDAQ setups. In Run 3, MiniDAQ-3 setups share production resources with the new central DAQ system, allowing each setup to operate at the maximum Level-1 rate thanks to the reuse of the resources and network bandwidth. Configuration management tools had to be significantly extended to support the synchronization of the DAQ configurations needed for the various setups.
We report on the new configuration management features and on the first year of operational experience with the new MiniDAQ-3 system.
The CMS Orbit Builder for the HL-LHC at CERN
Amoiridis, Vassileios; Behrens, Ulf; Bocci, Andrea ...
EPJ Web of Conferences, 2024, Volume: 295
Journal Article, Conference Proceeding
Peer reviewed
Open access
The Compact Muon Solenoid (CMS) experiment at CERN incorporates one of the highest-throughput data acquisition systems in the world and is expected to increase its throughput by more than a factor of ten for the High-Luminosity phase of the Large Hadron Collider (HL-LHC). To achieve this goal, the system will be upgraded in most of its components. Among them, the event builder software, in charge of assembling all the data read out from the different sub-detectors, is planned to be modified from a single event builder to an orbit builder that assembles multiple events at the same time. The throughput of the event builder will be increased from the current 1.6 Tb/s to 51 Tb/s for the HL-LHC orbit builder. This paper presents preliminary network transfer studies in preparation for the upgrade. The key conceptual characteristics are discussed, concerning differences between the CMS event builder in Run 3 and the CMS Orbit Builder for the HL-LHC. For the feasibility studies, a pipestream benchmark mimicking event-builder-like traffic has been developed. Preliminary performance tests and results are discussed.
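To give a feel for what building by orbit rather than by event means, the sketch below estimates how many Level-1 accepted events fall into one LHC orbit and groups data fragments by orbit number. It is an illustration under stated assumptions, not the CMS Orbit Builder design: the LHC revolution frequency of about 11.245 kHz and the 750 kHz Level-1 rate are not given in this abstract, and the fragment layout `(orbit, bunch_crossing, source_id, payload)` and the function `build_orbits` are hypothetical.

```python
from collections import defaultdict

# Assumptions, not taken from this abstract:
LHC_ORBIT_HZ = 11_245.0   # approximate LHC revolution frequency
L1_RATE_HZ = 750e3        # Phase-2 Level-1 accept rate

events_per_orbit = L1_RATE_HZ / LHC_ORBIT_HZ   # average accepted events per orbit
throughput_gain = 51.0 / 1.6                   # HL-LHC vs. current throughput (Tb/s)


def build_orbits(fragments):
    """Group fragments by orbit, then by bunch crossing, then by source.

    Each fragment is a hypothetical tuple:
    (orbit, bunch_crossing, source_id, payload).
    """
    orbits = defaultdict(lambda: defaultdict(dict))
    for orbit, bunch_crossing, source_id, payload in fragments:
        orbits[orbit][bunch_crossing][source_id] = payload
    # Convert to plain dicts so the result is easy to inspect and compare.
    return {orbit: {bx: dict(sources) for bx, sources in events.items()}
            for orbit, events in orbits.items()}


fragments = [
    (1, 10, "tracker", b"\x01"),
    (1, 10, "calo", b"\x02"),
    (1, 20, "tracker", b"\x03"),
    (2, 5, "tracker", b"\x04"),
]
print(f"Average events per orbit: {events_per_orbit:.1f}")
print(f"Throughput increase factor: {throughput_gain:.1f}")
print(build_orbits(fragments))
```

With these assumed rates, an orbit contains on the order of several tens of accepted events, which is why one orbit-level unit of work replaces many per-event transfers.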
The CMS data acquisition (DAQ) is implemented as a service-oriented architecture where DAQ applications, as well as general applications such as monitoring and error reporting, are run as self-contained services. The task of deployment and operation of services is achieved by using several heterogeneous facilities, custom configuration data and scripts in several languages. In this work, we restructure the existing system into a homogeneous, scalable cloud architecture adopting a uniform paradigm, where all applications are orchestrated in a uniform environment with standardized facilities. In this new paradigm DAQ applications are organized as groups of containers and the required software is packaged into container images. Automation of all aspects of coordinating and managing containers is provided by the Kubernetes environment, where a set of physical and virtual machines is unified in a single pool of compute resources. We demonstrate that a container-based cloud architecture provides an across-the-board solution that can be applied for DAQ in CMS. We show strengths and advantages of running DAQ applications in a container infrastructure as compared to a traditional application model.
The Online Monitoring System (OMS) at the Compact Muon Solenoid experiment (CMS) at CERN aggregates and integrates different sources of information into a central place and allows users to view, compare and correlate information. It displays real-time and historical information. The tool is heavily used by run coordinators, trigger experts and shift crews, to ensure the quality and efficiency of data taking. It provides aggregated information for many use cases including data certification. OMS is the successor of Web Based Monitoring (WBM), which was in use during Run 1 and Run 2 of the LHC. WBM started as a small tool and grew substantially over the years so that maintenance became challenging. OMS was developed from scratch following several design ideas: to strictly separate the presentation layer from the data aggregation layer, to use a well-defined standard for the communication between presentation layer and aggregation layer, and to employ widely used frameworks from outside the HEP community. A report on the experience from the operation of OMS for the first year of data taking of Run 3 in 2022 is presented.
The New CMS DAQ System for Run-2 of the LHC
Bawej, Tomasz; Behrens, Ulf; Branson, James ...
IEEE Transactions on Nuclear Science, 06/2015, Volume: 62, Issue: 3
Journal Article
Peer reviewed
Open access
The data acquisition (DAQ) system of the CMS experiment at the CERN Large Hadron Collider assembles events at a rate of 100 kHz, transporting event data at an aggregate throughput of 100 GB/s to the high level trigger (HLT) farm. The HLT farm selects interesting events for storage and offline analysis at a rate of around 1 kHz. The DAQ system has been redesigned during the accelerator shutdown in 2013/14. The motivation is twofold: Firstly, the current compute nodes, networking, and storage infrastructure will have reached the end of their lifetime by the time the LHC restarts. Secondly, in order to handle higher LHC luminosities and event pileup, a number of sub-detectors will be upgraded, increasing the number of readout channels and replacing the off-detector readout electronics with a μTCA implementation. The new DAQ architecture will take advantage of the latest developments in the computing industry. For data concentration, 10/40 Gb/s Ethernet technologies will be used, as well as an implementation of a reduced TCP/IP in FPGA for a reliable transport between custom electronics and commercial computing hardware. A Clos network based on 56 Gb/s FDR Infiniband has been chosen for the event builder with a throughput of ~4 Tb/s. The HLT processing is entirely file based. This allows the DAQ and HLT systems to be independent, and to use the HLT software in the same way as for the offline processing. The fully built events are sent to the HLT with 1/10/40 Gb/s Ethernet via network file systems. Hierarchical collection of HLT accepted events and monitoring meta-data are stored into a global file system. This paper presents the requirements, technical choices, and performance of the new system.
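The figures in this abstract fix a few derived quantities that are not stated explicitly, such as the average event size implied by the rate and the throughput. The short calculation below derives them, assuming decimal units (1 GB = 10⁹ bytes). Reading the ~4 Tb/s event-builder network figure as capacity available for the 100 GB/s of event traffic is an interpretation made here for illustration, not a statement from the abstract; all variable names are illustrative.

```python
# Quantities quoted in the abstract
EVENT_RATE_HZ = 100e3          # event-building rate
THROUGHPUT_GB_PER_S = 100.0    # aggregate throughput to the HLT farm
HLT_OUTPUT_HZ = 1e3            # approximate HLT accept rate
EVB_NETWORK_TBPS = 4.0         # approximate event-builder network throughput

# Derived quantities (decimal units assumed)
avg_event_size_mb = THROUGHPUT_GB_PER_S * 1e9 / EVENT_RATE_HZ / 1e6
required_tbps = THROUGHPUT_GB_PER_S * 8 / 1e3
network_headroom = EVB_NETWORK_TBPS / required_tbps   # interpretation, see lead-in
hlt_rejection = EVENT_RATE_HZ / HLT_OUTPUT_HZ

print(f"Average event size: {avg_event_size_mb:.1f} MB")
print(f"Event traffic: {required_tbps:.1f} Tb/s")
print(f"Network capacity / event traffic: {network_headroom:.1f}")
print(f"HLT rate reduction factor: {hlt_rejection:.0f}")
```

The implied average event size is about 1 MB, and the event traffic of about 0.8 Tb/s is a fraction of the quoted network throughput.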
TCP and the socket abstraction have barely changed over the last two decades, but at the network layer there has been a giant leap from a few megabits to 100 gigabits in bandwidth. At the same time, CPU architectures have evolved into the multi-core era and applications are expected to make full use of all available resources. Applications in the data acquisition domain based on the standard socket library running in a Non-Uniform Memory Access (NUMA) architecture are unable to reach full efficiency and scalability without the software being adequately aware of the IRQ (Interrupt Request), CPU and memory affinities. During the first long shutdown of the LHC, the CMS DAQ system is going to be upgraded for operation from 2015 onwards and a new software component has been designed and developed in the CMS online framework for transferring data with sockets. This software attempts to wrap the low-level socket library to ease higher-level programming with an API based on an asynchronous event-driven model similar to the DAT uDAPL API. It is an event-based application with NUMA optimizations that allows for a high throughput of data across a large distributed system. This paper describes the architecture, the technologies involved and the performance measurements of the software in the context of the CMS distributed event building.