Abstract
Deep Learning (DL) methods and Computer Vision are becoming important tools for event reconstruction in particle physics detectors. In this work, we report on the use of submanifold sparse convolutional neural networks (SparseNet) for the classification of track and shower hits from a DUNE prototype liquid-argon detector at CERN (ProtoDUNE-SP). By taking advantage of the three-dimensional nature of the problem, we use a set of nine input features to classify sparse and locally dense hits associated with track or shower particles. The SparseNet has been trained on a test sample and shows promising results: efficiencies and purities greater than 90%. These results have been achieved with a considerable speedup and substantially lower resource utilization compared with other DL networks such as graph neural networks. This method offers great scalability advantages for future large neutrino detectors such as the planned DUNE experiment.
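A minimal sketch of this kind of per-hit classifier is shown below, using the open-source MinkowskiEngine library as a stand-in for the submanifold sparse-convolution framework; the layer widths, voxel coordinates, and random inputs are illustrative assumptions, not the trained SparseNet configuration.

```python
# Illustrative per-hit track/shower classifier built on sparse 3D convolutions.
# MinkowskiEngine is used here as a stand-in sparse-convolution library; the
# actual SparseNet architecture and hyperparameters are not reproduced.
import torch
import MinkowskiEngine as ME


class SparseHitClassifier(torch.nn.Module):
    def __init__(self, in_features=9, n_classes=2, width=32):
        super().__init__()
        self.net = torch.nn.Sequential(
            ME.MinkowskiConvolution(in_features, width, kernel_size=3, dimension=3),
            ME.MinkowskiBatchNorm(width),
            ME.MinkowskiReLU(),
            ME.MinkowskiConvolution(width, width, kernel_size=3, dimension=3),
            ME.MinkowskiBatchNorm(width),
            ME.MinkowskiReLU(),
            ME.MinkowskiLinear(width, n_classes),  # per-hit track/shower logits
        )

    def forward(self, x):
        return self.net(x)


# Hits enter as integer voxel coordinates (batch, x, y, z) plus nine features each.
coords = torch.randint(0, 500, (1000, 4), dtype=torch.int32)
coords[:, 0] = 0                      # single event in the batch
feats = torch.randn(1000, 9)          # nine input features per hit (placeholder values)
hits = ME.SparseTensor(features=feats, coordinates=coords)
logits = SparseHitClassifier()(hits).F  # (n_hits, 2) track/shower scores
```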
Emerging high-performance storage technologies are opening up the possibility of designing new distributed data acquisition (DAQ) system architectures, in which the live acquisition of data and their processing are decoupled through a storage element. An example of these technologies is 3D XPoint, which promises to fill the gap between memory and traditional storage and offers unprecedented high throughput for nonvolatile data. In this article, we characterize the performance of persistent memory devices that use the 3D XPoint technology, in the context of the DAQ system for one large Particle Physics experiment, DUNE. This experiment must be capable of storing, upon a specific signal, incoming data for up to 100 s, with a throughput of 1.5 TB/s, for an aggregate size of 150 TB. The modular nature of the apparatus allows splitting the problem into 150 identical units operating in parallel, each at 10 GB/s. The target is to be able to dedicate a single CPU to each of those units for DAQ and storage.
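The quoted figures fix the per-unit buffering requirement; the short back-of-the-envelope check below uses only the numbers stated above.

```python
# Back-of-the-envelope check of the DUNE buffering requirement,
# using only the figures quoted in the abstract.
total_rate_tb_s = 1.5        # aggregate incoming data rate, TB/s
window_s = 100               # maximum buffering window, s
n_units = 150                # identical units operating in parallel

total_volume_tb = total_rate_tb_s * window_s            # 150 TB aggregate
per_unit_rate_gb_s = total_rate_tb_s * 1000 / n_units   # 10 GB/s per unit
per_unit_volume_tb = total_volume_tb / n_units          # 1 TB buffered per unit

print(total_volume_tb, per_unit_rate_gb_s, per_unit_volume_tb)
```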
Over the next few years, the LHC will prepare for the upcoming High-Luminosity upgrade in which it is expected to deliver ten times more pp collisions. This will create a harsher radiation environment and higher detector occupancy. In this context, the ATLAS experiment, one of the general-purpose experiments at the LHC, plans substantial upgrades to the detectors and to the trigger system in order to efficiently select events. Similarly, the Data Acquisition System (DAQ) will have to redesign the data-flow architecture to accommodate the large increase in event and data rates. The Phase-II DAQ design involves a large distributed storage system that buffers data read out from the detector, while a computing farm (Event Filter) analyzes and selects the most interesting events. This system will have to handle 5.2 TB/s of input data for an event rate of 1 MHz and provide access to 3 TB/s of these data to the filtering farm. A possible implementation for such a design is based on distributed file systems (DFS), which are becoming ubiquitous in the big data industry. Features of DFS such as replication strategies and smart placement policies match the distributed nature and the requirements of the new data-flow system. This paper presents an up-to-date performance evaluation of some of the DFS currently available: GlusterFS, HadoopFS and CephFS. After characterizing the future data-flow system's workload, we report on small-scale raw performance and scalability studies. Finally, we conclude on the suitability of such systems for the tight constraints expected for the ATLAS experiment in Phase-II and, more generally, on the benefits the HEP community can draw from these storage technologies.
The ATLAS experiment will undergo a major upgrade to take advantage of the new conditions provided by the upgraded High-Luminosity LHC. The Trigger and Data Acquisition system (TDAQ) will record data at unprecedented rates: the detectors will be read out at 1 MHz, generating around 5 TB/s of data. The Dataflow system (DF), a component of TDAQ, introduces a novel design: readout data are buffered on persistent storage while the event filtering system analyses them to select 10000 events per second for a total recorded throughput of around 60 GB/s. This approach allows for decoupling the detector activity from the event selection process. New challenges then arise for DF: designing and implementing a distributed, reliable, persistent storage system supporting several TB/s of aggregated throughput while providing tens of PB of capacity. In this paper we first describe some of the challenges that DF is facing: data safety with persistent storage limitations, indexing of data at high granularity in a highly distributed system, and high-performance management of storage capacity. Then the ongoing R&D to address each of them is presented and the performance achieved with a working prototype is shown.
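The quoted rates imply the event sizes and rejection factor the Dataflow system must provide; the short check below is derived from the numbers above (the per-event sizes are implied rather than quoted directly).

```python
# Rough consistency check of the Phase-II Dataflow figures quoted above.
readout_rate_hz = 1.0e6      # events read out per second
input_tput_tb_s = 5.0        # TB/s into the storage buffer
accepted_rate_hz = 1.0e4     # events selected per second
recorded_gb_s = 60.0         # GB/s written to permanent storage

mean_event_mb = input_tput_tb_s * 1e6 / readout_rate_hz      # ~5 MB per event
rate_reduction = readout_rate_hz / accepted_rate_hz          # factor ~100 in rate
recorded_event_mb = recorded_gb_s * 1e3 / accepted_rate_hz   # ~6 MB per recorded event

print(mean_event_mb, rate_reduction, recorded_event_mb)
```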
Data acquisition systems are a key component for successful data taking in any experiment. The DAQ is a complex distributed computing system that coordinates all operations, from the selection of interesting events to their transfer to storage elements. For the High Luminosity upgrade of the Large Hadron Collider, the experiments at CERN need to meet challenging requirements to record data with a much higher occupancy in the detectors. The DAQ system will receive and deliver data with a significantly increased trigger rate, one million events per second, and capacity, terabytes of data per second. An effective way to meet these requirements is to decouple real-time data acquisition from event selection. Data fragments can be temporarily stored in a large distributed key-value store. Fragments belonging to the same event can then be queried on demand by the data selection processes. Implementing such a model relies on a proper combination of emerging technologies, such as persistent memory, NVMe SSDs, scalable networking, and data structures, as well as high-performance, scalable software. In this paper, we present DAQDB (Data Acquisition Database), an open source implementation of this design that was presented earlier, with an extensive evaluation of this approach, from single-node to distributed performance. Furthermore, we complement our study with a description of the challenges faced and the lessons learned while integrating DAQDB with the existing software framework of the ATLAS experiment.
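A toy illustration of the access pattern described above: readout processes put fragments keyed by event and source, and data selection processes later gather all fragments of an event on demand. The class and method names below are a hypothetical in-memory stand-in for illustration, not the DAQDB API.

```python
# Toy stand-in for the fragment store: producers put fragments keyed by
# (event_id, source_id); selection processes gather all fragments of one
# event on demand. This mimics the access pattern only, not DAQDB itself.
from collections import defaultdict


class FragmentStore:
    def __init__(self):
        self._store = {}                      # (event_id, source_id) -> bytes
        self._by_event = defaultdict(set)     # event_id -> {source_id, ...}

    def put(self, event_id: int, source_id: int, payload: bytes) -> None:
        self._store[(event_id, source_id)] = payload
        self._by_event[event_id].add(source_id)

    def get_event(self, event_id: int) -> dict:
        """Gather every fragment of an event (the 'query on demand' step)."""
        return {src: self._store[(event_id, src)] for src in self._by_event[event_id]}

    def erase_event(self, event_id: int) -> None:
        """Free buffer space once the event has been accepted or rejected."""
        for src in self._by_event.pop(event_id, set()):
            del self._store[(event_id, src)]


store = FragmentStore()
for source in range(4):                       # e.g. four readout links
    store.put(event_id=42, source_id=source, payload=b"fragment bytes")
fragments = store.get_event(42)               # event-building / filtering side
store.erase_event(42)
```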
The DUNE detector is a neutrino physics experiment that is expected to take data starting from 2028. The data acquisition (DAQ) system of the experiment is designed to sustain several TB/s of incoming data, which will be temporarily buffered while being processed by a software-based data selection system. In DUNE, some rare physics processes (e.g. supernova burst events) require storing the full complement of data produced over a 1-2 minute window. These are recognised by the data selection system, which fires a specific trigger decision. Upon reception of this decision, data are moved from the temporary buffers to local, high-performance, persistent storage devices. In this paper we characterize the performance of novel 3D XPoint SSD devices under different workloads suitable for high-performance storage applications. We then illustrate how such devices may be applied to the DUNE use case: to store, upon a specific signal, 100 seconds of incoming data at 1.5 TB/s, distributed among 150 identical units each operating at approximately 10 GB/s.
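As a rough illustration of the per-unit workload, the sketch below writes large sequential blocks and reports the achieved rate. It is a toy probe only: the mount point, block size, and total volume are placeholders, the page cache is not bypassed, and a real characterization would use O_DIRECT or asynchronous I/O with purpose-built benchmarking tools.

```python
# Illustrative sequential-write probe for one storage unit (toy example only).
import os
import time

PATH = "/mnt/nvme0/probe.bin"   # placeholder mount point for the device under test
BLOCK = 8 * 1024 * 1024         # 8 MiB writes, roughly fragment-sized
N_BLOCKS = 1024                 # ~8 GiB written in total

buf = os.urandom(BLOCK)
start = time.monotonic()
with open(PATH, "wb", buffering=0) as f:
    for _ in range(N_BLOCKS):
        f.write(buf)
    os.fsync(f.fileno())        # make sure data actually reached the device
elapsed = time.monotonic() - start
print(f"{BLOCK * N_BLOCKS / elapsed / 1e9:.2f} GB/s")
```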
The data acquisition system of particle physics experiments is a mission-critical component responsible for the experiments’ success. The next generation of large-scale particle physics experiments will have millions of sensors producing a large amount of data at high rates. The DUNE and Phase-II ATLAS experiments are expected to start taking data in the late 2020s. Data rates from both detectors will reach orders of terabytes per second, posing a significant challenge to the data acquisition chain and, especially, to the storage and dataflow system, which will need to be designed accordingly. Therefore, it becomes essential to investigate methods to collect, store and transport the data efficiently. This thesis presents the work done on the performance characterization of different storage technologies and dataflow methods that are suitable for implementing the storage systems of both the ATLAS and DUNE experiments.

Persistent memory devices and novel solid-state devices have been investigated as a potential solution to implement the local storage system of the DUNE experiment. This was achieved using both synthetic benchmarks with an emulated workload and integration testing for each storage technology. Performance results obtained after carefully tuning the devices show that such technologies can sustain the target rates required by the experiment.

A distributed high-throughput key-value store (DAQDB) was designed for the data acquisition task and extensively tested as a solution for the large storage buffer of the ATLAS experiment. The results show that the current implementation of DAQDB cannot sustain the target bandwidth required by the ATLAS experiment. Therefore, the work was followed by an investigation of dataflow methods that combine local storage management solutions as a possible means to achieve the target goals of the experiment. An extensive study of the evolution of the storage system with discrete event simulations was also done to understand the advantages and limitations of different data acquisition architectures.

The experimental research was also completed by investigating a novel algorithm (SparseNet) designed to classify track and shower energy deposits across a liquid argon detector. The SparseNet algorithm was extensively tested with both Monte Carlo data and beam data from a test setup of the DUNE experiment available at CERN. Preliminary results show that the SparseNet outperforms the currently adopted track and shower classification algorithm.
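The dataflow studies mentioned above rely on discrete-event simulation; the sketch below shows the general shape of such a model using the SimPy library, with a producer filling a storage buffer and a slower consumer draining it. All rates, sizes, and the single-producer/single-consumer layout are illustrative assumptions, not the configuration used in the thesis.

```python
# Minimal discrete-event model of a storage buffer between readout (producer)
# and event filtering (consumer). All rates and sizes are placeholders.
import simpy

EVENT_MB = 5.0          # assumed mean event size
READOUT_HZ = 10_000     # toy readout rate
FILTER_MB_S = 45_000    # toy filter read bandwidth (slower than the input rate)


def readout(env, buffer):
    while True:
        yield env.timeout(1.0 / READOUT_HZ)   # one event per readout tick
        yield buffer.put(EVENT_MB)            # event lands in the buffer


def event_filter(env, buffer):
    while True:
        yield buffer.get(EVENT_MB)            # take one event out of the buffer
        yield env.timeout(EVENT_MB / FILTER_MB_S)


env = simpy.Environment()
buffer = simpy.Container(env, capacity=1_000_000, init=0)  # buffer space in MB
env.process(readout(env, buffer))
env.process(event_filter(env, buffer))
env.run(until=10.0)
print(f"buffer occupancy after 10 s: {buffer.level:.0f} MB")
```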
Several observables sensitive to the fragmentation of b quarks into b hadrons are measured using 36 fb⁻¹ of √s = 13 TeV proton-proton collision data collected with the ATLAS detector at the LHC. Jets containing b hadrons are obtained from a sample of dileptonic tt̄ events, and the associated set of charged-particle tracks is separated into those from the primary pp interaction vertex and those from the displaced b-decay secondary vertex. This division is used to construct observables that characterize the longitudinal and transverse momentum distributions of the b hadron within the jet. The measurements have been corrected for detector effects and provide a test of heavy-quark-fragmentation modeling at the LHC in a system where the top-quark decay products are color connected to the proton beam remnants. The unfolded distributions are compared with the predictions of several modern Monte Carlo parton-shower generators and generator tunes, and a wide range of agreement with the data is observed, with p values varying from 5×10⁻⁴ to 0.98. These measurements complement similar measurements from e⁺e⁻ collider experiments in which the b quarks originate from a color singlet Z/γ*.
A study of B_c^+ → J/ψ D_s^+ and B_c^+ → J/ψ D_s^{*+} decays using 139 fb⁻¹ of integrated luminosity collected with the ATLAS detector from √s = 13 TeV pp collisions at the LHC is presented. The ratios of the branching fractions of the two decays to the branching fraction of the B_c^+ → J/ψ π^+ decay are measured: B(B_c^+ → J/ψ D_s^+)/B(B_c^+ → J/ψ π^+) = 2.76 ± 0.47 and B(B_c^+ → J/ψ D_s^{*+})/B(B_c^+ → J/ψ π^+) = 5.33 ± 0.96. The ratio of the branching fractions of the two decays is found to be B(B_c^+ → J/ψ D_s^{*+})/B(B_c^+ → J/ψ D_s^+) = 1.93 ± 0.26. For the B_c^+ → J/ψ D_s^{*+} decay, the transverse polarization fraction, Γ_{±±}/Γ, is measured to be 0.70 ± 0.11. The reported uncertainties include both the statistical and systematic components added in quadrature. The precision of the measurements exceeds that of all previous studies of these decays. These results supersede those obtained in the earlier ATLAS study of the same decays with √s = 7 and 8 TeV pp collision data. A comparison with available theoretical predictions for the measured quantities is presented.
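As a quick check, the central value of the third quoted ratio is, to good approximation, the quotient of the first two central values; the quoted ±0.26 is the paper's full result and is not reproduced by this naive division.

```python
# Consistency check of the quoted central values (uncertainties are not
# propagated here; the published uncertainty comes from the full analysis).
r_ds = 2.76      # B(Bc+ -> J/psi Ds+)  / B(Bc+ -> J/psi pi+)
r_dss = 5.33     # B(Bc+ -> J/psi Ds*+) / B(Bc+ -> J/psi pi+)
print(r_dss / r_ds)   # ~1.93, the quoted ratio of the two branching fractions
```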
The yield of charged particles opposite to a Z boson with large transverse momentum (p_T) is measured in 260 pb⁻¹ of pp and 1.7 nb⁻¹ of Pb+Pb collision data at 5.02 TeV per nucleon pair recorded with the ATLAS detector at the Large Hadron Collider. The Z boson tag is used to select hard-scattered partons with specific kinematics, and to observe how their showers are modified as they propagate through the quark-gluon plasma created in Pb+Pb collisions. Compared with pp collisions, charged-particle yields in Pb+Pb collisions show significant modifications as a function of charged-particle p_T in a way that depends on event centrality and Z boson p_T. The data are compared with a variety of theoretical calculations and provide new information about the medium-induced energy loss of partons in a p_T regime difficult to measure through other channels.