Abstract
Making general particle transport simulation for high-energy physics (HEP) single-instruction-multiple-thread (SIMT) friendly, to take advantage of accelerator hardware, is an important alternative for boosting the throughput of simulation applications. To date, this challenge remains unresolved, due to the difficulty of mapping the complexity of Geant4 components and workflows onto the massive parallelism exposed by graphics processing units (GPUs). The AdePT project is one of the R&D initiatives tackling this limitation, exploring GPUs as potential accelerators for offloading part of the CPU simulation workload. Our main target is to implement a complete electromagnetic shower demonstrator working on the GPU. The project is the first to create a full prototype of a realistic electron, positron, and gamma electromagnetic shower simulation on the GPU, implemented either as a standalone application or as an extension of the standard Geant4 CPU workflow. Our prototype currently provides a platform for exploring many optimisations and different approaches. We present the most recent results and initial conclusions of our work, based on both a standalone GPU performance analysis and a first implementation of a hybrid workflow combining Geant4 on the CPU with AdePT on the GPU.
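As an illustration of why this workload suits SIMT hardware, the toy Python loop below steps every in-flight track of a shower in bulk; on a GPU, each lane of such a bulk step would map to one thread. The physics and numbers are invented for the sketch and bear no relation to AdePT's actual kernels.

# Toy electromagnetic-shower stepping loop illustrating the SIMT-friendly
# pattern: every in-flight track is advanced in lock-step, which maps to
# "one GPU thread per track" on real hardware. Illustration only; AdePT's
# actual physics and kernel structure are far more involved.
import numpy as np

rng = np.random.default_rng(42)
CUTOFF = 1.0          # tracks below this energy (arbitrary units) are absorbed

def step_all(energies: np.ndarray) -> np.ndarray:
    """Advance every active track by one step, in bulk (the SIMT pattern)."""
    # Each track radiates a random fraction of its energy as a secondary.
    frac = rng.uniform(0.1, 0.5, size=energies.size)
    secondaries = energies * frac
    survivors = energies - secondaries
    # Merge survivors and secondaries into one pool, drop absorbed tracks.
    pool = np.concatenate([survivors, secondaries])
    return pool[pool > CUTOFF]

tracks = np.array([1000.0])   # one primary, 1000 energy units
n_steps = 0
while tracks.size > 0:
    tracks = step_all(tracks)
    n_steps += 1
print(f"shower died out after {n_steps} bulk steps")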
CernVM is a Virtual Software Appliance capable of running physics applications from the LHC experiments at CERN. It aims to provide a complete and portable environment for developing and running LHC data analysis on any end-user computer (laptop, desktop) as well as on the Grid, independently of the operating system (Linux, Windows, MacOS). The experiment application software and its specific dependencies are built independently from CernVM and delivered to the appliance just in time by means of the CernVM File System (CVMFS), specifically designed for efficient software distribution. The procedures for building, installing, and validating software releases remain under the control and responsibility of each user community. We provide a mechanism to publish pre-built and configured experiment software releases to a central distribution point, from where they find their way to the running CernVM instances via a hierarchy of proxy servers or content delivery networks. In this paper, we present the current state of the CernVM project, compare the performance of CVMFS to that of traditional network file systems like AFS, and discuss possible scenarios that could further improve its performance and scalability.
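The delivery model is simple to sketch: software files are requested over plain HTTP, so ordinary forward proxies can cache them close to the clients. In the hedged Python sketch below, the proxy address and object URL are placeholders, not real CernVM endpoints.

# Sketch of the delivery model CVMFS builds on: files are fetched over plain
# HTTP, so any standard forward proxy (e.g. a site-local Squid) can cache
# them for all nearby clients. The URL and proxy below are placeholders.
import urllib.request

PROXY = {"http": "http://squid.example.org:3128"}        # hypothetical site proxy
URL = "http://stratum.example.org/repo/data/obj"         # hypothetical object URL

opener = urllib.request.build_opener(urllib.request.ProxyHandler(PROXY))
with opener.open(URL) as response:
    blob = response.read()   # repeat requests are served from the proxy cache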
ALFA: The new ALICE-FAIR software framework. Al-Turany, M.; Buncic, P.; Hristov, P. ... Journal of Physics: Conference Series, 12/2015, Vol. 664, No. 7. Journal Article. Peer-reviewed. Open access.
The commonalities between the ALICE and FAIR experiments and their computing requirements led to the development of large parts of a common software framework in an experiment-independent way. The FairRoot project has already shown the feasibility of such an approach for the FAIR experiments and is extending it beyond FAIR to experiments at other facilities [1, 2]. The ALFA framework is a joint development between the ALICE Online-Offline (O2) and FairRoot teams. ALFA is designed as a flexible, elastic system which balances reliability and ease of development with performance, using multi-processing and multi-threading. A message-based approach has been adopted; such an approach supports the use of the software on different hardware platforms, including heterogeneous systems. Each process in ALFA assumes limited communication with and reliance on other processes. Such a design adds horizontal scaling (multiple processes) to the vertical scaling provided by multiple threads to meet computing and throughput demands. ALFA does not dictate any application protocols; potentially, any content-based processor or any source can change the application protocol. The framework supports different serialization standards for data exchange between different hardware and software languages.
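A minimal sketch of this message-based style, using ZeroMQ (one of the transports commonly used for such frameworks) from Python: two independent devices exchange serialized messages and share no state, so either side could be replaced by a process on different hardware. Socket names and message format are invented for illustration.

# Minimal sketch of the message-passing style ALFA adopts: independent
# devices exchanging serialized messages over sockets, with no shared state.
# ZeroMQ is used here for illustration; the topology and names are made up.
import threading
import zmq

ctx = zmq.Context()
push = ctx.socket(zmq.PUSH)
push.bind("inproc://data")           # bind before connect, required for inproc

def processor() -> None:
    """Consumer device: knows only the message format, not the producer."""
    pull = ctx.socket(zmq.PULL)
    pull.connect("inproc://data")
    while True:
        msg = pull.recv_json()
        if msg["event"] is None:     # end-of-stream marker
            break
        print("processed event", msg["event"])
    pull.close()

t = threading.Thread(target=processor)
t.start()
for i in range(3):
    push.send_json({"event": i, "payload": [1.0, 2.0, 3.0]})
push.send_json({"event": None})
t.join()
push.close()
ctx.term()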
Status and future perspectives of CernVM-FS. Blomer, J.; Buncic, P.; Charalampidis, I. ... Journal of Physics: Conference Series, 01/2012, Vol. 396, No. 5. Journal Article. Peer-reviewed. Open access.
The CernVM File System (CernVM-FS) provides a scalable, reliable, and close-to-zero-maintenance software distribution service. It was developed to assist HEP collaborations in deploying their software on the worldwide-distributed computing infrastructure used to run their data processing applications. CernVM-FS is deployed on a wide range of computing resources, ranging from powerful worker nodes at Tier 1 grid sites to simple virtual appliances running on volunteer computers. We present a new approach to staging updates and changes into the file system, which aims to reduce the delay in distributing a software release to less than an hour. In addition, it significantly reduces complexity with respect to both the required capabilities of the master storage and installation and maintenance. Furthermore, we discuss new requirements for additional client-side features that have arisen from HEP community feedback.
The CernVM File System (CernVM-FS) is a snapshotting read-only file system designed to deliver software to grid worker nodes over HTTP in a fast, scalable, and reliable way. In recent years it has become the de-facto standard method of distributing HEP experiment software in the WLCG, and it is starting to be adopted by grid computing communities outside HEP. This paper focuses on recent developments of the CernVM-FS Server, the central publishing point of new file system snapshots. Using a union file system, the CernVM-FS Server allows direct manipulation of a (normally read-only) CernVM-FS volume with copy-on-write semantics. Eventually the collected changeset is transformed into a new CernVM-FS snapshot, constituting a transactional feedback loop. The generated repository data is pushed into content-addressable storage requiring only a RESTful interface and is distributed through a hierarchy of caches to individual grid worker nodes. Additionally, we describe recent features, such as file chunking, repository garbage collection, and file system history, that enable CernVM-FS for a wider range of use cases.
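The content-addressable idea behind the server can be sketched in a few lines: every object is stored under the hash of its own content, so a snapshot is just a manifest of hashes and unchanged files deduplicate automatically. The layout below is illustrative only and is not CernVM-FS's actual on-disk format.

# Toy content-addressable store: objects live under the hash of their own
# content, so identical files deduplicate automatically and a "snapshot" is
# just an immutable manifest mapping paths to hashes.
import hashlib
from pathlib import Path

STORE = Path("cas-store")

def put(data: bytes) -> str:
    digest = hashlib.sha1(data).hexdigest()
    obj = STORE / digest[:2] / digest[2:]    # two-character fan-out directory
    obj.parent.mkdir(parents=True, exist_ok=True)
    obj.write_bytes(data)                    # idempotent by construction
    return digest

def publish(files: dict) -> dict:
    """Transform a changeset into a new immutable snapshot manifest."""
    return {path: put(content) for path, content in files.items()}

snap1 = publish({"/sw/app": b"v1 binary", "/sw/conf": b"defaults"})
snap2 = publish({"/sw/app": b"v2 binary", "/sw/conf": b"defaults"})
assert snap1["/sw/conf"] == snap2["/sw/conf"]    # unchanged file stored once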
Open access is one of the key levers for long-term data preservation for a HEP experiment. To guarantee the usability of data analysis tools beyond the experiment's lifetime, it is crucial that third-party users from the scientific community have access to the data and the associated software. The ALICE Collaboration has developed a layer of lightweight components built on top of virtualization technology to hide the complexity and details of the experiment-specific software. Users can perform basic analysis tasks within CernVM, a lightweight generic virtual machine, paired with an ALICE-specific contextualization. Once the virtual machine is launched, a graphical user interface is started automatically without any additional configuration. This interface allows downloading the base ALICE analysis software and running a set of ALICE analysis modules. The currently available tools include fully documented tutorials for ALICE analysis, such as the measurement of strange particle production or of the nuclear modification factor in Pb-Pb collisions. The interface can easily be extended to include an arbitrary number of additional analysis modules. We present the current status of the tools used by ALICE through the CERN open access portal, and the plans for future extensions of this system.
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big-Data-driven scientific exploration. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited with the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System (WMS) to manage the workflow for all data processing across hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10²) sites, O(10⁵) cores, O(10⁸) jobs per year, and O(10³) users, and the ATLAS data volume is O(10¹⁷) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project, titled 'Next Generation Workload Management and Analysis System for Big Data' (BigPanDA), is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCFs presents new challenges in managing heterogeneity and supporting workflows. The BigPanDA project is underway to set up and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute", together with ALICE distributed computing and ORNL computing professionals. Our approach to the integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. We present our current accomplishments in running the PanDA WMS at OLCF and other supercomputers, and demonstrate our ability to use PanDA as a portal independent of the computing facility infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.
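The single-facility illusion rests on a central queue from which heterogeneous resources claim work. The toy Python sketch below shows this pull pattern with invented resource names; PanDA's real brokerage and scheduling are, of course, far richer.

# Toy sketch of the central-queue pattern behind a workload management
# system: heterogeneous workers (grid site, cloud VM, HPC backfill slot)
# pull work from one broker, so users see a single facility.
import queue
import threading

jobs = queue.Queue()
for i in range(6):
    jobs.put(f"task-{i}")

def worker(resource: str) -> None:
    while True:
        try:
            job = jobs.get_nowait()   # late binding: work is claimed, not pushed
        except queue.Empty:
            return
        print(f"{resource} ran {job}")

threads = [threading.Thread(target=worker, args=(r,))
           for r in ("grid-site", "cloud-vm", "hpc-node")]
for t in threads: t.start()
for t in threads: t.join()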
In recent years, several Grid computing centres have chosen virtualization as a better way to manage diverse use cases with self-consistent environments on the same bare infrastructure. The maturity of control interfaces (such as OpenNebula and OpenStack) opened the possibility of easily changing the amount of resources assigned to each use case by simply turning virtual machines on and off. Some of those private clouds run, in production, copies of the Virtual Analysis Facility, a fully virtualized and self-contained batch analysis cluster capable of expanding and shrinking automatically upon need; however, resource starvation occurs frequently, as expansion has to compete with other virtual machines running long-living batch jobs. Such batch nodes cannot relinquish their resources in a timely fashion: the more jobs they run, the longer it takes to drain and shut them off, while making one-job virtual machines introduces a non-negligible virtualization overhead. By improving several components of the Virtual Analysis Facility we have realized an experimental "Docked" Analysis Facility for ALICE, which leverages containers instead of virtual machines to provide performance and security isolation. We present the techniques we have used to address practical problems, such as software provisioning through CVMFS, as well as our considerations on the maturity of containers for High Performance Computing. As the abstraction layer is thinner, our Docked Analysis Facility may feature more fine-grained sizing, down to single-job node containers: we show how this approach positively impacts automatic cluster resizing by deploying lightweight pilot containers in place of central queue polls.
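The fine-grained sizing argument can be made concrete: when each job runs in its own throwaway container, the "node" disappears the instant the job exits, so nothing ever needs draining. A hedged sketch, assuming a local Docker daemon and placeholder image and jobs:

# Sketch of single-job containers: each job gets a throwaway container that
# frees its resources the moment the job exits, so there is no draining.
# Image name and job commands are placeholders; assumes Docker is installed.
import subprocess

JOBS = [["echo", "analysis job 1"], ["echo", "analysis job 2"]]

for cmd in JOBS:
    # --rm removes the container on exit: resources are relinquished
    # instantly, unlike a batch-node VM that must drain all of its jobs.
    subprocess.run(["docker", "run", "--rm", "--cpus", "1", "alpine"] + cmd,
                   check=True)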
The traditional virtual machine (VM) building and deployment process is centered around the virtual machine hard disk image. The packages comprising the VM operating system are carefully selected, hard disk images are built for a variety of different hypervisors, and images have to be distributed and decompressed in order to instantiate a virtual machine. Within the HEP community, the CernVM File System (CernVM-FS) has been established to decouple the distribution of the experiment software from the building and distribution of the VM hard disk images. We show how to get rid of such pre-built hard disk images altogether. Due to the high requirements on POSIX compliance imposed by HEP application software, CernVM-FS can also be used to host and boot a Linux operating system. This allows the use of a tiny bootable CD image that comprises only a Linux kernel, while the rest of the operating system is provided on demand by CernVM-FS. This approach speeds up the initial instantiation time and reduces virtual machine image sizes by an order of magnitude. Furthermore, security updates can be distributed instantaneously through CernVM-FS. By leveraging the fact that CernVM-FS is a versioning file system, a historic analysis environment can easily be re-spawned by selecting the corresponding CernVM-FS file system snapshot.
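Because snapshots in a versioning file system are immutable, re-spawning a historic environment reduces to resolving a tag to an old snapshot root and mounting that instead of the latest one. In the toy sketch below, all tags and hashes are hypothetical:

# Toy illustration of a versioning file system: named, immutable snapshots
# let a historic analysis environment be selected long after newer releases
# have been published. Tags and hashes below are hypothetical.
SNAPSHOTS = {                      # tag -> root hash of an immutable snapshot
    "2013-reprocessing": "a94a8fe5cc...",
    "2015-production":   "de9f2c7fd2...",
    "latest":            "fa35e19212...",
}

def mount(tag: str = "latest") -> str:
    """Pick the file-system root to expose; old tags stay valid forever."""
    root = SNAPSHOTS[tag]
    print(f"mounting snapshot {tag} (root {root})")
    return root

mount("2013-reprocessing")   # re-spawn the historic environment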
Lately, there has been a trend in scientific projects to look for computing resources in the volunteer community. In addition, to reduce the development effort required to port the scientific software stack to all known platforms, the use of Virtual Machines (VMs) is becoming increasingly popular. Unfortunately, their use further complicates software installation and operation, restricting the volunteer audience to sufficiently expert people. CernVM WebAPI is a software solution that addresses this specific case in a way that opens up wide new application opportunities. It offers a very simple API for setting up, controlling, and interfacing with a VM instance on the user's computer, while at the same time relieving the user of the burden of downloading, installing, and configuring the hypervisor. WebAPI comes with a lightweight JavaScript library that guides the user through the application installation process. Malicious usage is prevented by a per-domain PKI validation mechanism. In this contribution we give an overview of this new technology, discuss its security features, and examine some test cases where it is already in use.