An Oracle-based event index for ATLAS Gallas, E J; Dimitrov, G; Vasileva, P ...
Journal of physics. Conference series,
10/2017, Volume:
898, Issue:
4
Journal Article
Peer reviewed
Open access
The ATLAS Eventlndex System has amassed a set of key quantities for a large number of ATLAS events into a Hadoop based infrastructure for the purpose of providing the experiment with a number of ...event-wise services. Collecting this data in one place provides the opportunity to investigate various storage formats and technologies and assess which best serve the various use cases as well as consider what other benefits alternative storage systems provide. In this presentation we describe how the data are imported into an Oracle RDBMS (relational database management system), the services we have built based on this architecture, and our experience with it. We've indexed about 26 billion real data events thus far and have designed the system to accommodate future data which has expected rates of 5 and 20 billion events per year. We have found this system offers outstanding performance for some fundamental use cases. In addition, profiting from the co-location of this data with other complementary metadata in ATLAS, the system has been easily extended to perform essential assessments of data integrity and completeness and to identify event duplication, including at what step in processing the duplication occurred.
Usage of condition data in ATLAS is extensive for offline reconstruction and analysis (e.g. alignment, calibration, data quality). The system is based on the LCG Conditions Database infrastructure, ...with read and write access via an ad hoc C++ API (COOL), a system which was developed before Run 1 data taking began. The infrastructure dictates that the data is organized into separate schemas (assigned to subsystems groups storing distinct and independent sets of conditions), making it difficult to access information from several schemas at the same time. We have thus created PL SQL functions containing queries to provide content extraction at multi-schema level. The PL SQL API has been exposed to external clients by means of a Java application providing DB access via REST services, deployed inside an application server (JBoss WildFly). The services allow navigation over multiple schemas via simple URLs. The data can be retrieved either in XML or JSON formats, via simple clients (like curl or Web browsers).
The ATLAS Conditions Database, based on the LCG Conditions Database infrastructure, contains a wide variety of information needed in online data taking and offline analysis. The total volume of ATLAS ...conditions data is in the multi-Terabyte range. Internally, the active data is divided into 65 separate schemas (each with hundreds of underlying tables) according to overall data taking type, detector subsystem, and whether the data is used offline or strictly online. While each schema has a common infrastructure, each schema's data is entirely independent of other schemas, except at the highest level, where sets of conditions from each subsystem are tagged globally for ATLAS event data reconstruction and reprocessing. The partitioned nature of the conditions infrastructure works well for most purposes, but metadata about each schema is problematic to collect in global tools from such a system because it is only accessible via LCG tools schema by schema. This makes it difficult to get an overview of all schemas, collect interesting and useful descriptive and structural metadata for the overall system, and connect it with other ATLAS systems. This type of global information is needed for time critical data preparation tasks for data processing and has become more critical as the system has grown in size and diversity. Therefore, a new system has been developed to collect metadata for the management of the ATLAS Conditions Database. The structure and implementation of this metadata repository will be described. In addition, we will report its usage since its inception during LHC Run 1, how it has been exploited in the process of conditions data evolution during LSI (the current LHC long shutdown) in preparation for Run 2, and long term plans to incorporate more of its information into future ATLAS Conditions Database tools and the overall ATLAS information infrastructure.
The EventIndex is the complete catalogue of all ATLAS events, keeping the references to all files that contain a given event in any processing stage. It replaces the TAG database, which had been in ...use during LHC Run 1. For each event it contains its identifiers, the trigger pattern and the GUIDs of the files containing it. Major use cases are event picking, feeding the Event Service used on some production sites, and technical checks of the completion and consistency of processing campaigns. The system design is highly modular so that its components (data collection system, storage system based on Hadoop, query web service and interfaces to other ATLAS systems) could be developed separately and in parallel during LSI. The EventIndex is in operation for the start of LHC Run 2. This paper describes the high-level system architecture, the technical design choices and the deployment process and issues. The performance of the data collection and storage systems, as well as the query services, are also reported.
The ATLAS EventIndex and its evolution towards Run 3 Villaplana Perez, M; Alexandrov, E; Aleksandrov, I ...
Journal of physics. Conference series,
04/2020, Volume:
1525, Issue:
1
Journal Article, Conference Proceeding
Peer reviewed
Open access
The ATLAS experiment has produced hundreds of petabytes of data and expects to have one order of magnitude more in the future. This data are spread among hundreds of computing Grid sites around the ...world. The EventIndex is the complete catalogue of all ATLAS events, real and simulated, keeping the references to all permanent files that contain a given event in any processing stage. It provides the means to select and access event data in the ATLAS distributed storage system, and provides support for completeness and consistency checks and trigger and offline selection overlap studies. The EventIndex employs various data handling technologies like Hadoop and Oracle databases, and it is integrated with other parts of the ATLAS distributed computing infrastructure, including systems for data, metadata, and production management. The project has been in operation since the start of LHC Run 2 in 2015, and it is in permanent development in order to satisfy the production and analysis demands and follow technology evolution. The main data store in Hadoop, based on MapFiles and HBase, has worked well during Run 2 but new solutions are being explored for the future. This paper reports on the current system performance and on the studies of a new data storage prototype that can carry the EventIndex through Run 3.
Processing of the large amount of data produced by the ATLAS experiment requires fast and reliable access to what we call Auxiliary Data Files (ADF). These files, produced by Combined Performance, ...Trigger and Physics groups, contain conditions, calibrations, and other derived data used by the ATLAS software. In ATLAS this data has, thus far for historical reasons, been collected and accessed outside the ATLAS Conditions Database infrastructure and related software. For this reason, along with the fact that ADF are effectively read by the software as binary objects, this class of data appears ideal for testing the proposed Run 3 conditions data infrastructure now in development. This paper describes this implementation as well as the lessons learned in exploring and refining the new infrastructure with the potential for deployment during Run 2.
Conditions data (for example: alignment, calibration, data quality) are used extensively in the processing of real and simulated data in ATLAS. The volume and variety of the conditions data needed by ...different types of processing are quite diverse, so optimizing its access requires a careful understanding of conditions usage patterns. These patterns can be quantified by mining representative log files from each type of processing and gathering detailed information about conditions usage for that type of processing into a central repository.
The ATLAS EventIndex System, developed for use in LHC Run 2, is designed to index every processed event in ATLAS, replacing the TAG System used in Run 1. Its storage infrastructure, based on Hadoop ...open-source software framework, necessitates revamping how information in this system relates to other ATLAS systems. It will store more indexes since the fundamental mechanisms for retrieving these indexes will be better integrated into all stages of data processing, allowing more events from later stages of processing to be indexed than was possible with the previous system. Connections with other systems (conditions database, monitoring) are fundamentally critical to assess dataset completeness, identify data duplication, and check data integrity, and also enhance access to information in EventIndex by user and system interfaces. This paper gives an overview of the ATLAS systems involved, the relevant metadata, and describe the technologies we are deploying to complete these connections.
ATLAS maintains a rich corpus of event-by-event information that provides a global view of the billions of events the collaboration has measured or simulated, along with sufficient auxiliary ...information to navigate to and retrieve data for any event at any production processing stage. This unique resource has been employed for a range of purposes, from monitoring, statistics, anomaly detection, and integrity checking, to event picking, subset selection, and sample extraction. Recent years of data-taking provide a foundation for assessment of how this resource has and has not been used in practice, of the uses for which it should be optimized, of how it should be deployed and provisioned for scalability to future data volumes, and of the areas in which enhancements to functionality would be most valuable. This paper describes how ATLAS event-level information repositories and selection infrastructure are evolving in light of this experience, and in view of their expected roles both in wide-area event delivery services and in an evolving ATLAS analysis model in which the importance of efficient selective access to data can only grow.
The ATLAS EventIndex is a data catalogue system that stores event-related metadata for all (real and simulated) ATLAS events, on all processing stages. As it consists of different components that ...depend on other applications (such as distributed storage, and different sources of information) we need to monitor the conditions of many heterogeneous subsystems, to make sure everything is working correctly. This paper describes how we gather information about the EventIndex components and related subsystems: the Producer-Consumer architecture for data collection, health parameters from the servers that run EventIndex components, EventIndex web interface status, and the Hadoop infrastructure that stores EventIndex data. This information is collected, processed, and then displayed using CERN service monitoring software based on the Kibana analytic and visualization package, provided by CERN IT Department. EventIndex monitoring is used both by the EventIndex team and ATLAS Distributed Computing shifts crew.