Data preservation in high energy physics Basaglia, T.; Bellis, M.; Blomer, J. ...
The European physical journal. C, Particles and fields,
09/2023, Volume 83, Issue 9
Journal Article
Peer reviewed
Open access
Data preservation is a mandatory specification for any present and future experimental facility and it is a cost-effective way of doing fundamental research by exploiting unique data sets in the light of the continuously increasing theoretical understanding. This document summarizes the status of data preservation in high energy physics. The paradigms and the methodological advances are discussed from a perspective of more than ten years of experience with a structured effort at international level. The status and the scientific return related to the preservation of data accumulated at large collider experiments are presented, together with an account of ongoing efforts to ensure long-term analysis capabilities for ongoing and future experiments. Transverse projects aimed at generic solutions, most of which are specifically inspired by open science and FAIR principles, are presented as well. A prospective and an action plan are also indicated.
During the past two years large parts of the CERN batch farm have been moved to virtual machines running on the CERN internal cloud. During this process a large fraction of the resources, which had previously been used as physical batch worker nodes, were converted into hypervisors. Due to the large spread of the per-core performance in the farm, caused by its heterogeneous nature, it is necessary to have a good knowledge of the performance of the virtual machines. This information is used both for scheduling in the batch system and for accounting. While in the previous setup worker nodes were classified and benchmarked based on the purchase order number, for virtual batch worker nodes this is no longer possible; the information is now either hidden or hard to retrieve. Therefore we developed a new scheme to classify worker nodes according to their performance. The new scheme is flexible enough to be usable for both virtual and physical machines in the batch farm. With the new classification it is possible to estimate the performance of worker nodes even in a very dynamic farm with worker nodes coming and going at a high rate, without the need to benchmark each new node again. An extension to public cloud resources is possible if all conditions under which the benchmark numbers have been obtained are fulfilled.
The machine/job features mechanism Alef, M; Cass, T; Keijser, J J ...
Journal of physics. Conference series,
10/2017, Volume 898, Issue 9
Journal Article
Peer reviewed
Open access
Within the HEPiX virtualization group and the Worldwide LHC Computing Grid's Machine/Job Features Task Force, a mechanism has been developed which provides access to detailed information about the current host and the current job to the job itself. This allows user payloads to access meta information, independent of the current batch system or virtual machine model. The information can be accessed either locally via the filesystem on a worker node, or remotely via HTTP(S) from a webserver. This paper describes the final version of the specification from 2016 which was published as an HEP Software Foundation technical note, and the design of the implementations of this version for batch and virtual machine platforms. We discuss early experiences with these implementations and how they can be exploited by experiment frameworks.
Review of CERN Data Centre Infrastructure Andrade, P; Bell, T; van Eldik, J ...
Journal of physics. Conference series,
01/2012, Volume 396, Issue 4
Journal Article
Peer reviewed
Open access
The CERN Data Centre is reviewing strategies for optimizing the use of the existing infrastructure and expanding to a new data centre by studying how other large sites are being operated. Over the past six months, CERN has been investigating modern and widely used tools and procedures used for virtualisation, clouds and fabric management in order to reduce operational effort, increase agility and support unattended remote data centres. This paper gives the details on the project's motivations, current status and areas for future investigation.
The CMS CERN Analysis Facility (CAF) Buchmüller, O; Bonacorsi, D; Fanzago, F ...
Journal of physics. Conference series,
04/2010, Volume 219, Issue 5
Journal Article
Peer reviewed
Open access
The CMS CERN Analysis Facility (CAF) was primarily designed to host a large variety of latency-critical workflows. These break down into alignment and calibration, detector commissioning and diagnosis, and high-interest physics analysis requiring fast turnaround. In addition to the low latency requirement on the batch farm, another mandatory condition is efficient access to the RAW detector data stored at the CERN Tier-0 facility. The CMS CAF also foresees resources for interactive login by a large number of CMS collaborators located at CERN, as an entry point for their day-by-day analysis. These resources will run on a separate partition in order to protect the high-priority use cases described above. While the CMS CAF represents only a modest fraction of the overall CMS resources on the WLCG GRID, an appropriately sized user-support service needs to be provided. We will describe the building, commissioning and operation of the CMS CAF during the year 2008. The facility was heavily and routinely used by almost 250 users during multiple commissioning and data challenge periods. It reached a CPU capacity of 1.4 MSI2K and a disk capacity at the petabyte scale. In particular, we will focus on the performance in terms of networking, disk access and job efficiency and extrapolate prospects for the upcoming first year of LHC data taking. We will also present the experience gained and the limitations observed in operating such a large facility, in which well-controlled workflows are combined with more chaotic analysis by a large number of physicists.
Most LCG sites are currently running on SL(C)4. However, this operating system is already rather old, and it is becoming difficult to obtain the hardware drivers needed to get the best out of recent hardware. A possible way out is the migration to SL(C)5-based systems where possible, in combination with virtualization methods. The former is typically possible for nodes where the software to run the services is available and tested, while the latter offers a possibility to make use of the new hardware platforms whilst maintaining operating system compatibility. Since autumn 2008, CERN has offered public interactive and batch worker nodes for evaluation to the experiments. For the Grid environment, access is granted by dedicated CEs. The status of the evaluation, feedback received from the experiments and the status of the migration will be reviewed, and the status of virtualization of services at CERN will be reported. Beyond this, the migration to a new operating system also offers an excellent opportunity to upgrade the fabric infrastructure used to manage the servers.
Monitoring the efficiency of user jobs Casey, James; Rodrigues, Daniel; Schwickerath, Ulrich ...
Journal of physics. Conference series,
04/2010, Volume 219, Issue 7
Journal Article
Peer reviewed
Open access
Instrumentation of jobs throughout their life cycle is not obvious, as they are quite independent after being submitted, crossing multiple environments and locations until landing on a worker node. In order to measure correctly the resources used at each step, and to compare them with the view from a Fabric Infrastructure, a solution is proposed using Messaging System for the Grids (MSG) for integrating information coming from different sources.
Usage of LSF for batch farms at CERN Schwickerath, U; Lefebure, V
Journal of physics. Conference series,
07/2008, Volume 119, Issue 4
Journal Article
Peer reviewed
Open access
CERN has used Platform's Load Sharing Facility (LSF) since 1998 to manage the large batch system installations. Since that time, the farm has grown significantly, and commodity-based hardware running GNU/Linux has replaced other Unix flavors on specialized hardware. In this paper we will present how the system is set up nowadays. We will briefly report on issues seen in the past, and actions which have been taken to resolve them. In this context the status of the evaluation of the most recent version of this product, LSF 7.0, is presented, and the planned migration scenario is described.
InfiniBand—Experiences at the Forschungszentrum Karlsruhe Schwickerath, Ulrich; Heiss, Andreas
Nuclear instruments & methods in physics research. Section A, Accelerators, spectrometers, detectors and associated equipment,
04/2006, Volume 559, Issue 1
Journal Article
Peer reviewed
The Institute for Scientific Computing (IWR) at the Forschungszentrum Karlsruhe has been evaluating the InfiniBand technology [InfiniBand Trade Association, InfiniBand Architecture Specification, Release 1.0, October 24, 2000] since the end of 2002. The performance of the interconnect has been tested on different platforms and architectures using MPI. Sequential file transfer performance was measured with the RFIO protocol running on native InfiniBand [Ulrich Schwickerath, Andreas Heiss, Nucl. Instr. and Meth. A 534 (2004) 130; http://www.fzk.de/infiniband], and a newly developed InfiniBand-enabled version of XROOTD.