ALICE Expert System. Ionita, C; Carena, F
Journal of Physics: Conference Series, 01/2014, Volume 523, Issue 1
Journal Article
Peer-reviewed
Open access
The ALICE experiment at CERN employs a number of human operators (shifters), who have to make sure that the experiment is always in a state compatible with taking Physics data. Given the complexity of the system and the myriad of errors that can arise, this is not always a trivial task. The aim of this paper is to describe an expert system that is capable of assisting human shifters in the ALICE control room. The system diagnoses potential issues and attempts to make smart recommendations for troubleshooting. At its core, a Prolog engine infers whether a Physics or a technical run can be started based on the current state of the underlying sub-systems. A separate C++ component queries certain SMI objects and stores their state as facts in a Prolog knowledge base. By mining the data stored in different system logs, the expert system can also diagnose errors arising during a run. Currently the system is used by the on-call experts for faster response times, but we expect it to be adopted as a standard tool by regular shifters during the next data-taking period.
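The core inference described above — facts mirroring SMI object states, plus a rule deciding whether a run can start — can be sketched as follows. This is an illustrative Python sketch, not the ALICE code: the real system expresses these rules in Prolog, and all names and states here are hypothetical.

```python
# Hypothetical facts as the C++ collector might store them:
# sub-system name -> reported state.
facts = {
    "TPC": "READY",
    "TRD": "READY",
    "DAQ": "READY",
    "TRIGGER": "ERROR",
}

def can_start_run(required, facts):
    """A run may start only if every required sub-system reports READY."""
    return all(facts.get(sub) == "READY" for sub in required)

def diagnose(required, facts):
    """List the sub-systems blocking the run, mimicking the troubleshooting
    hints the expert system gives to shifters."""
    return [(sub, facts.get(sub, "UNKNOWN"))
            for sub in required if facts.get(sub) != "READY"]

required_for_physics = ["TPC", "TRD", "DAQ", "TRIGGER"]
print(can_start_run(required_for_physics, facts))   # False
print(diagnose(required_for_physics, facts))        # [('TRIGGER', 'ERROR')]
```

In the actual system the same check is a Prolog rule evaluated against the knowledge base, which also lets the engine explain *why* a run cannot start.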
ALICE (A Large Ion Collider Experiment) is a heavy-ion detector studying the physics of strongly interacting matter and the quark-gluon plasma at the CERN LHC (Large Hadron Collider). The ALICE Data-AcQuisition (DAQ) system handles the data flow from the sub-detector electronics to the permanent data storage in the CERN computing center. The DAQ farm consists of about 1000 devices of many different types, ranging from directly accessible machines to storage arrays and custom optical links. The system performance monitoring tool used during the LHC Run 1 will be replaced by a new tool for Run 2. This paper shows the results of an evaluation that has been conducted on six publicly available monitoring tools. The evaluation has been carried out by taking into account selection criteria such as scalability, flexibility and reliability, as well as data collection methods and display. All the tools have been prototyped and evaluated according to those criteria. We will describe the considerations that have led to the selection of the Zabbix monitoring tool for the DAQ farm. The results of the tests conducted in the ALICE DAQ laboratory will be presented. In addition, the deployment of the software on the DAQ machines in terms of metrics collected and data collection methods will be described. We will illustrate how remote nodes are monitored with Zabbix by using SNMP-based agents and how DAQ-specific metrics are retrieved and displayed. We will also show how the monitoring information is accessed and made available via the graphical user interface and how Zabbix communicates with the other DAQ online systems for notification and reporting.
A Large Ion Collider Experiment (ALICE) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN Large Hadron Collider (LHC). The online Data Quality Monitoring (DQM) plays an essential role in the experiment operation by providing shifters with immediate feedback on the data being recorded in order to quickly identify and overcome problems. Immediate access to the DQM results is needed not only by shifters in the control room but also by detector experts worldwide. As a consequence, a new web application has been developed to dynamically display and manipulate the ROOT-based objects produced by the DQM system in a flexible and user-friendly interface. The architecture and design of the tool, its main features and the technologies that were used, both on the server and the client side, are described. In particular, we detail how we took advantage of the most recent ROOT JavaScript I/O and web server library to give interactive access to ROOT objects stored in a database. We describe as well the use of modern web techniques and packages such as AJAX, DHTMLX and jQuery, which have been instrumental in the successful implementation of a reactive and efficient application. We finally present the resulting application and how code quality was ensured. We conclude with a roadmap for future technical and functional developments.
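The server side of such an application — objects held in a database and returned to a JavaScript client on request — can be sketched as below. This is a hypothetical, simplified sketch: SQLite stands in for the real database, a plain dict stands in for a ROOT histogram, and all names are invented.

```python
import json
import sqlite3

# Store "monitoring objects" in a database, keyed by name
# (a stand-in for the real DQM object store).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE objects (name TEXT PRIMARY KEY, payload TEXT)")

histogram = {"name": "tpc_occupancy", "bins": [3, 7, 12, 7, 2]}
db.execute("INSERT INTO objects VALUES (?, ?)",
           (histogram["name"], json.dumps(histogram)))

def get_object(name):
    """What an AJAX endpoint would return for GET /object/<name>:
    the stored object serialized as JSON for the browser to render."""
    row = db.execute("SELECT payload FROM objects WHERE name = ?",
                     (name,)).fetchone()
    return json.loads(row[0]) if row else None

print(get_object("tpc_occupancy")["bins"])   # [3, 7, 12, 7, 2]
```

In the real application the serialization is handled by ROOT's JavaScript I/O support, so the client can reconstruct and draw genuine ROOT objects rather than plain JSON dictionaries.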
The ALICE DAQ infoLogger. Chapeland, S; Carena, F; Carena, W ...
Journal of Physics: Conference Series, 01/2014, Volume 513, Issue 1
Journal Article
Peer-reviewed
Open access
ALICE (A Large Ion Collider Experiment) is a heavy-ion experiment studying the physics of strongly interacting matter and the quark-gluon plasma at the CERN LHC (Large Hadron Collider). The ALICE DAQ (Data Acquisition System) is based on a large farm of commodity hardware consisting of more than 600 devices (Linux PCs, storage, network switches). The DAQ reads the data transferred from the detectors through 500 dedicated optical links at an aggregated and sustained rate of up to 10 gigabytes per second and stores them at up to 2.5 gigabytes per second. The infoLogger is the log system which centrally collects the messages issued by the thousands of processes running on the DAQ machines. It allows errors to be reported on the fly and keeps a trace of runtime execution for later investigation. More than 500,000 messages are stored every day in a MySQL database, in a structured table that keeps track of 16 indexing fields for each message (e.g. time, host, user, ...). The total amount of logs for 2012 exceeds 75 GB of data and 150 million rows. We present in this paper the architecture and implementation of this distributed logging system, consisting of a client programming API, local data collector processes, a central server, and interactive human interfaces. We review the operational experience during the 2012 run, in particular the actions taken to ensure shifters receive manageable and relevant content from the main log stream. Finally, we present the performance of this log system, and future evolutions.
The ALICE data quality monitoring system. Haller, B von; Telesca, A; Chapeland, S ...
Journal of Physics: Conference Series, 12/2011, Volume 331, Issue 2
Journal Article
Peer-reviewed
Open access
ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN Large Hadron Collider (LHC). The online Data Quality Monitoring (DQM) is a key element of the Data Acquisition's software chain. It provides shifters with precise and complete information to quickly identify and overcome problems, and as a consequence to ensure acquisition of high-quality data. DQM typically involves the online gathering of monitored data, its analysis by user-defined algorithms, and its visualization. This paper describes the final design of ALICE's DQM framework, called AMORE (Automatic MOnitoRing Environment), as well as its latest and upcoming features, such as the integration with the offline analysis and reconstruction framework, better use of multi-core processors through a parallelization effort, and its interface with the eLogbook. The concurrent collection and analysis of data in an online environment requires the framework to be highly efficient, robust and scalable. We will describe what has been implemented to achieve these goals and the procedures we follow to ensure appropriate robustness and performance. We finally review the wide range of uses people make of this framework, from the basic monitoring of a single sub-detector to the most complex ones within the High Level Trigger farm or using the Prompt Reconstruction, and we describe the various ways of accessing the monitoring results. We conclude with our experience, before and after the LHC startup, when monitoring the data quality in a challenging environment.
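The gather–analyse–visualise cycle with user-defined algorithms can be sketched as a minimal plugin interface. All names here are hypothetical: the real AMORE framework is a C++/ROOT system, and this Python toy only illustrates the shape of the cycle.

```python
# Toy sketch of a DQM-style agent: it runs a user-defined check on a
# batch of monitored data and publishes a quality flag.
class MonitorAgent:
    def __init__(self, name, check):
        self.name = name      # e.g. the sub-detector being monitored
        self.check = check    # user-defined analysis algorithm

    def process(self, samples):
        """Analyse one batch of monitored data and publish the result."""
        return {"agent": self.name, "quality": self.check(samples)}

# A user-defined algorithm: flag the batch if too many samples
# exceed a limit (thresholds invented for illustration).
def occupancy_check(samples, limit=0.9):
    bad = sum(1 for s in samples if s > limit)
    return "GOOD" if bad / len(samples) < 0.1 else "BAD"

agent = MonitorAgent("TPC", occupancy_check)
result = agent.process([0.2, 0.4, 0.95, 0.3, 0.5])
print(result)   # {'agent': 'TPC', 'quality': 'BAD'}
```

Separating the framework (gathering, publishing) from the user-defined check is what lets one infrastructure serve everything from a single sub-detector to the High Level Trigger farm.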
ALICE (A Large Ion Collider Experiment; experiment web site, May 10, 2009, 〈http://public.web.cern.ch/public/en/LHC/ALICEen.html〉) is an experiment at the LHC (Large Hadron Collider) optimized for the study of heavy-ion collisions. The main aim of the experiment is to study the behavior of strongly interacting matter and the quark-gluon plasma. The ALICE DAQ (Data Acquisition) system has been deployed and used intensively during the commissioning of the experiment. This paper will present the evolution of one particular area of the system: the detector readout.
The data produced by each detector are received by DATE (the ALICE Data Acquisition program; ALICE data acquisition web site, May 10, 2009, 〈http://phdepaid.web.cern.ch/phdepaid/〉) using a PCI (Peripheral Component Interconnect) based card called the D-RORC (DAQ Readout Receiver Card; web site, May 10, 2009, 〈http://alice-proj-ddl.web.cern.ch/alice-proj-ddl/rorc_intro.html〉). On the order of 400 of these cards are installed in the PCs of the DAQ farm, and they are connected by optical links called DDLs (Detector Data Link; web site, May 10, 2009, 〈http://alice-proj-ddl.web.cern.ch/alice-proj-ddl/ddl_intro.html〉) to the detector readout electronics. The D-RORC is controlled by the readout software, part of the DATE program that reads the events coming from these cards. We will present the results obtained during the performance tests of the new release of the D-RORC, PCI Express, in development at CERN. The paper will review the working principles of the D-RORC, its use by the readout software and the benefits of using PCI Express instead of PCI-X. It will also introduce the work in progress on the new release of the readout software towards the next hardware platform based on 64-bit computer architecture and DDLs based on 10G Ethernet.
ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the Quark-Gluon Plasma at the CERN Large Hadron Collider (LHC). A large-bandwidth and flexible Data-Acquisition System (DAQ) has been designed and deployed to collect sufficient statistics in the short running time available per year for heavy ions and to accommodate very different requirements originating from the 18 sub-detectors. After several months of data taking with beam, much experience has been accumulated and some important developments have been initiated in order to evolve towards a more automated and reliable experiment. We will present the experience accumulated so far and the new developments. Several upgrades of existing ALICE detectors or additions of new ones have also been proposed, with a significant impact on the DAQ. We will review these proposals, their implications for the DAQ and the way they will be addressed.
Preparing the ALICE DAQ upgrade. Carena, F; Carena, W; Chapeland, S ...
Journal of Physics: Conference Series, 01/2012, Volume 396, Issue 1
Journal Article
Peer-reviewed
Open access
In November 2009, after 15 years of design and installation, the ALICE experiment started to detect and record the first collisions produced by the LHC. It has been collecting hundreds of millions of events ever since, with both proton and heavy-ion collisions. The future scientific programme of ALICE has been refined following the first year of data taking. The physics targeted beyond 2018 will be the study of rare signals. Several detectors will be upgraded, modified, or replaced to prepare ALICE for future physics challenges. An upgrade of the triggering and readout systems is also required to accommodate the needs of the upgraded ALICE and to better select the data of the rare physics channels. The ALICE upgrade will have major implications for the detector electronics and controls, data acquisition, event triggering and offline computing and storage systems. Moreover, the experience accumulated during more than two years of operation has also led to new requirements for the control software. We will review all these new needs and the current R&D activities to address them. Several papers of the same conference present in more detail some elements of the ALICE online system.
ALICE moves into warp drive. Carena, F; Carena, W; Chapeland, S ...
Journal of Physics: Conference Series, 01/2012, Volume 396, Issue 1
Journal Article
Peer-reviewed
Open access
A Large Ion Collider Experiment (ALICE) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN Large Hadron Collider (LHC). Since its successful start-up in 2010, the LHC has been performing outstandingly, providing the experiments with long periods of stable collisions and an integrated luminosity that greatly exceeds the planned targets. To fully exploit these privileged conditions, we aim to maximize the experiment's data-taking productivity during stable collisions. We present in this paper the evolution of the online systems towards helping us understand the reasons for inefficiency and address new requirements. This paper describes the features added to the ALICE Electronic Logbook (eLogbook) to allow the Run Coordination team to identify, prioritize, fix and follow up causes of inefficiency in the experiment. Thorough monitoring of the data-taking efficiency provides reports for the collaboration to portray its evolution and evaluate the measures (fixes and new features) taken to increase it. In particular, the eLogbook helps decision making by providing quantitative input, which can be used to better balance the risks of changes in the production environment against potential gains in quantity and quality of physics data. The paper will also present the evolution of the Experiment Control System (ECS) to allow on-the-fly error recovery actions of the detector apparatus while limiting as much as possible the loss of integrated luminosity. The paper will conclude with a review of the ALICE efficiency so far and the future plans to improve its monitoring.
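The quantitative bookkeeping the eLogbook supports can be illustrated with a toy efficiency calculation: the fraction of stable-collision time actually spent recording physics data, with downtime attributed to causes so they can be prioritized. All numbers and cause names below are invented for illustration.

```python
# Toy data-taking-efficiency bookkeeping (invented numbers).
stable_beam_seconds = 36000                             # 10 h of stable collisions
downtime = [("TPC trip", 1200), ("DAQ restart", 600)]   # cause, seconds lost

lost = sum(seconds for _, seconds in downtime)
efficiency = (stable_beam_seconds - lost) / stable_beam_seconds

# Rank causes by lost time, as input for prioritizing fixes.
worst = max(downtime, key=lambda item: item[1])

print(round(efficiency, 3))   # 0.95
print(worst)                  # ('TPC trip', 1200)
```

Attributing every second of downtime to a cause is what turns the efficiency number into actionable input: the team can weigh a proposed change against the time its target cause actually costs.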