The LHCb experiment has been using the CMT build and configuration tool for its software since the first versions, mainly because of its multi-platform build support and its powerful configuration management functionality. Still, CMT has some limitations in terms of build performance, and the tool has grown more complex in order to cope with recently added use cases. Therefore, we have been looking for a viable alternative and have investigated the possibility of adopting the CMake tool, which does a very good job of building and is becoming very popular in the HEP community. The result of this study is a CMake-based framework which provides most of the special configuration features available natively only in CMT, with the advantages of better performance, flexibility and portability.
The LHCb Conditions Database project provides the necessary tools to handle non-event time-varying data. The main users of conditions are reconstruction and analysis processes, which run on the Grid. To allow efficient access to the data, we need to use a synchronized replica of the content of the database located at the same site as the event data file, i.e. the LHCb Tier1. The replica to be accessed is selected from information stored on LFC (LCG File Catalog) and managed with the interface provided by the LCG-developed library CORAL. The plan to limit the submission of jobs to those sites where the required conditions are available will also be presented. LHCb applications have been using the Conditions Database framework on a production basis since March 2007. We have been able to collect statistics on the performance and effectiveness of both the LCG library COOL (the library providing conditions handling functionalities) and the distribution framework itself. Stress tests on the CNAF-hosted replica of the Conditions Database have been performed and the results will be summarized here.
The purpose of this paper is to identify a set of steps leading to an improved interface for LHCb's Nightly Builds Dashboard. The goal is to have an efficient application that meets the needs of both the project developers, by providing them with a user-friendly interface, and the computing team supporting the system, by providing them with a dashboard that allows better monitoring of the build jobs themselves. In line with what is already used by LHCb, the web interface has been implemented with the Flask Python framework for future maintainability and code clarity. The database chosen to host the data is the schema-less CouchDB, which allows the document format to change flexibly. To improve the user experience, we use JavaScript libraries such as jQuery.
HEP experiments produce enormous data sets at an ever-growing rate. To cope with the challenge posed by these data sets, experiments' software needs to embrace all capabilities modern CPUs offer. With a decreasing memory-per-core ratio, the one-process-per-core approach of recent years becomes less feasible. Instead, multi-threading with fine-grained parallelism needs to be exploited to benefit from memory sharing among threads. Gaudi is an experiment-independent data processing framework, used for instance by the ATLAS and LHCb experiments at CERN's Large Hadron Collider. It was originally designed with only sequential processing in mind. In a recent effort, the framework has been extended to allow for multi-threaded processing. This includes components for concurrent scheduling of several algorithms (processing either the same or multiple events), thread-safe data store access and resource management. In the sequential case, the relationships between algorithms are encoded implicitly in their pre-determined execution order. For parallel processing, these relationships need to be expressed explicitly, so that the scheduler can exploit maximum parallelism while respecting dependencies between algorithms. Therefore, means to express and automatically track these dependencies need to be provided by the framework. In this paper, we present components introduced to express and track dependencies of algorithms to deduce a precedence-constrained directed acyclic graph, which serves as the basis for our algorithmically sophisticated scheduling approach for tasks with dynamic priorities. We introduce an incremental migration path for existing experiments towards parallel processing and highlight the benefits of explicit dependencies even in the sequential case, such as sanity checks and sequence optimization by graph analysis.
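The idea of deducing a precedence graph from explicitly declared dependencies can be sketched as follows. This is a minimal illustration in Python, not Gaudi's API: the function names, the algorithm names and the data item names are all hypothetical, and the static "waves" computed here are a much simpler stand-in for the dynamic-priority scheduling the abstract refers to.

```python
def build_precedence_graph(algorithms):
    """Map each algorithm to the set of algorithms it must wait for.

    `algorithms` maps a name to a pair (inputs, outputs) of data item names.
    This input format is an assumption made for the sketch.
    """
    producers = {}
    for name, (_inputs, outputs) in algorithms.items():
        for item in outputs:
            if item in producers:
                # Sanity check: two algorithms must not produce the same item.
                raise ValueError(f"{item!r} is produced by {producers[item]!r} and {name!r}")
            producers[item] = name

    graph = {}
    for name, (inputs, _outputs) in algorithms.items():
        missing = [item for item in inputs if item not in producers]
        if missing:
            # Sanity check: every declared input needs a producer.
            raise ValueError(f"{name!r} needs {missing}, which nothing produces")
        graph[name] = {producers[item] for item in inputs}
    return graph


def schedule_waves(graph):
    """Group algorithms into waves; members of one wave could run concurrently."""
    remaining = {name: set(preds) for name, preds in graph.items()}
    waves = []
    while remaining:
        ready = sorted(name for name, preds in remaining.items() if not preds)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        for name in ready:
            del remaining[name]
        for preds in remaining.values():
            preds.difference_update(ready)
    return waves


# Hypothetical algorithms and data items, for illustration only.
algorithms = {
    "Decode": ([], ["RawHits"]),
    "Tracking": (["RawHits"], ["Tracks"]),
    "Calo": (["RawHits"], ["Clusters"]),
    "PID": (["Tracks", "Clusters"], ["Particles"]),
}
print(schedule_waves(build_precedence_graph(algorithms)))
# [['Decode'], ['Calo', 'Tracking'], ['PID']]
```

The two `ValueError` checks in `build_precedence_graph` are examples of the kind of sanity check that explicit dependencies make possible even when execution stays sequential.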
Preparing HEP software for concurrency
Clemencic, M; Hegner, B; Mato, P ...
Journal of physics. Conference series, 01/2014, Volume 513, Issue 5
Journal Article, Peer reviewed, Open access
The necessity for thread-safe experiment software has recently become very evident, largely driven by the evolution of CPU architectures towards exploiting increasing levels of parallelism. For high-energy physics this represents a real paradigm shift, as concurrent programming was previously limited to special, well-defined domains like control software or software framework internals. This paradigm shift, however, falls in the middle of the successful LHC programme, and many million lines of code have already been written without parallel execution in mind. In this paper we take a closer look at the offline processing applications of the LHC experiments and their readiness for the many-core era. We review how previous design choices impact the move to concurrent programming. We present our findings on transforming parts of the LHC experiment reconstruction software to thread-safe code, and the main design patterns that have emerged during the process. A plethora of parallel-programming patterns are well known outside the HEP community, but only a few have turned out to be straightforward enough to be suited for non-expert physics programmers. Finally, we propose a potential strategy for the migration of existing HEP experiment software to the many-core era.
The data processing model of the LHCb experiment implies handling of an evolving set of heterogeneous metadata entities and relationships between them. The entities range from software and database states to architecture specifiers and software/data deployment locations. For instance, there is an important relationship between the LHCb Conditions Database (CondDB), which provides versioned, time-dependent geometry and conditions data, and the LHCb software, which comprises the data processing applications (used for simulation, high level triggering, reconstruction and analysis of physics data). The evolution of CondDB and of the LHCb applications is a weakly-homomorphic process. This means that relationships between a CondDB state and an LHCb application state may not be preserved across different database and application generations. These issues may lead to various kinds of problems in the LHCb production, ranging from unexpected application crashes to incorrect data processing results. In this paper we present Ariadne, a generic metadata relationships tracking system based on the novel NoSQL Neo4j graph database. Its aim is to track and analyze many thousands of evolving relationships for cases such as the one described above, and several others, which would otherwise remain unmanaged and potentially harmful. The highlights of the paper include the system's implementation and management details, the infrastructure needed for running it, security issues, first experience of usage in the LHCb production and the potential of the system to be applied to a wider set of LHCb tasks.
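The kind of relationship tracking described above can be illustrated with a small in-memory graph. This is only a sketch under stated assumptions: Ariadne itself is built on Neo4j, whereas the code below uses plain Python dictionaries, and the class name, the `COMPATIBLE_WITH` relation and all node names are hypothetical.

```python
from collections import defaultdict


class RelationshipGraph:
    """Toy store of typed, directed relationships between metadata entities."""

    def __init__(self):
        # (source node, relation name) -> set of target nodes
        self._edges = defaultdict(set)

    def relate(self, source, relation, target):
        self._edges[(source, relation)].add(target)

    def related(self, source, relation):
        # .get() avoids creating empty entries for unknown keys.
        return set(self._edges.get((source, relation), ()))

    def unrecorded_pairs(self, pairs, relation="COMPATIBLE_WITH"):
        """Return the (source, target) pairs with no recorded relationship."""
        return [(s, t) for s, t in pairs if t not in self.related(s, relation)]


# Hypothetical application versions and CondDB tags, for illustration only.
graph = RelationshipGraph()
graph.relate("app-v1", "COMPATIBLE_WITH", "conddb-tag-A")
graph.relate("app-v2", "COMPATIBLE_WITH", "conddb-tag-A")
graph.relate("app-v2", "COMPATIBLE_WITH", "conddb-tag-B")

requested = [("app-v1", "conddb-tag-A"), ("app-v1", "conddb-tag-B")]
print(graph.unrecorded_pairs(requested))
# [('app-v1', 'conddb-tag-B')]
```

A check of this form flags a pairing of application state and database state that was never recorded as compatible, which is the situation the abstract describes as leading to crashes or incorrect results.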
A common code repository is of primary importance in a distributed development environment such as large HEP experiments. CVS (Concurrent Versions System) has been used in the past years at CERN for the hosting of shared software repositories, among which were the repositories for the Gaudi Framework and the LHCb software projects. Many developers around the world produced alternative systems to share code and revisions among several developers, mainly to overcome the limitations in CVS, and CERN has recently started a new service for code hosting based on the version control system Subversion. The differences between CVS and Subversion and the way the code was organized in the Gaudi and LHCb CVS repositories required careful study and planning of the migration. Special care was taken in defining the organization of the new Subversion repository. To avoid disruption of the development cycle as much as possible, the migration has been gradual, with the help of tools developed explicitly to hide the differences between the two systems. The principles guiding the migration steps, the organization of the Subversion repository and the tools developed will be presented, as well as the problems encountered from both the librarian and the user points of view.
CORAL and COOL during the LHC long shutdown
Valassi, A; Clemencic, M; Dykstra, D ...
Journal of physics. Conference series, 01/2014, Volume 513, Issue 4
Journal Article, Peer reviewed, Open access
CORAL and COOL are two software packages used by the LHC experiments for managing detector conditions and other types of data using relational database technologies. They have been developed and maintained within the LCG Persistency Framework, a common project of the CERN IT department with ATLAS, CMS and LHCb. This presentation reports on the status of CORAL and COOL at the time of CHEP2013, covering the new features and enhancements in both packages, as well as the changes and improvements in the software process infrastructure. It also reviews the usage of the software in the experiments and the outlook for ongoing and future activities during the LHC long shutdown (LS1) and beyond.
The Conditions Database (CondDB) of the LHCb experiment provides versioned, time-dependent geometry and conditions data for all LHCb data processing applications (simulation, high level trigger (HLT), reconstruction, analysis) in a heterogeneous computing environment ranging from user laptops to the HLT farm and the Grid. These different use cases require front-end support for multiple database technologies (Oracle and SQLite are used). Sophisticated distribution tools are required to ensure timely and robust delivery of updates to all environments. The content of the database has to be managed to ensure that updates are internally consistent and externally compatible with multiple versions of the physics application software. In this paper we describe three systems that we have developed to address these issues. The first is a CondDB state tracking extension to the Oracle 3D Streams replication technology, which traps cases where the CondDB replication was corrupted. The second is an automated distribution system for the SQLite-based CondDB, which also provides smart backup and checkout mechanisms for the CondDB managers and LHCb users respectively. The third is a system to verify and monitor the internal (CondDB self-consistency) and external (LHCb physics software vs. CondDB) compatibility. The first two systems are used in production in the LHCb experiment and have achieved the desired goal of higher flexibility and robustness for the management and operation of the CondDB. The third has been fully designed and is now moving to the implementation stage.
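The notion of versioned, time-dependent conditions data can be illustrated with a small lookup structure. This is a hedged sketch only: it is not the COOL or CondDB API, the class and method names are hypothetical, and it assumes that each tag holds a list of payloads, each valid from its start time until the start of the next one.

```python
import bisect


class ConditionsFolder:
    """Toy store of payloads indexed by version tag and validity start time."""

    def __init__(self):
        # tag -> (sorted list of start times, payloads in the same order)
        self._tags = {}

    def store(self, tag, since, payload):
        starts, payloads = self._tags.setdefault(tag, ([], []))
        idx = bisect.bisect_left(starts, since)
        if idx < len(starts) and starts[idx] == since:
            payloads[idx] = payload  # replace the payload for an existing start time
        else:
            starts.insert(idx, since)
            payloads.insert(idx, payload)

    def lookup(self, tag, when):
        """Return the payload of `tag` that is valid at time `when`."""
        starts, payloads = self._tags[tag]
        idx = bisect.bisect_right(starts, when) - 1
        if idx < 0:
            raise LookupError(f"no payload in tag {tag!r} valid at {when}")
        return payloads[idx]


# Hypothetical tag names, times and payloads, for illustration only.
folder = ConditionsFolder()
folder.store("tag-A", 0, {"alignment_mm": 0.00})
folder.store("tag-A", 100, {"alignment_mm": 0.05})
folder.store("tag-B", 0, {"alignment_mm": 0.02})

print(folder.lookup("tag-A", 99))   # {'alignment_mm': 0.0}
print(folder.lookup("tag-A", 100))  # {'alignment_mm': 0.05}
print(folder.lookup("tag-B", 100))  # {'alignment_mm': 0.02}
```

The same event time can resolve to different payloads under different tags, which is why compatibility between a given tag and a given version of the application software has to be checked.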
In the LHCb experiment a wide variety of Monte Carlo simulated samples needs to be produced for the experiment's physics program. Monte Carlo productions are handled centrally, like all massive data processing in the experiment. To cope with the large set of different types of simulation samples, the necessary procedures, based on common infrastructures, have been set up, with a numerical event type identification code used throughout. The various elements of the procedure will be described, from writing a configuration for an event type to deploying it on the production environment, and from submitting and processing a request to retrieving the sample produced, as well as the conventions established to allow their interplay. The choices made have allowed a high level of automation of Monte Carlo productions, which are handled centrally in a transparent way, with experts concentrating on their specific tasks. As a result, the massive Monte Carlo production of the experiment is efficiently processed on a world-wide distributed system with minimal manpower.