The CMS experiment has collected an enormous volume of metadata about its computing operations in its monitoring systems, describing its experience in operating all of the CMS workflows on all of the Worldwide LHC Computing Grid Tiers. Data mining of all this information has rarely been done, but is of crucial importance for a better understanding of how CMS achieved successful operations, and for reaching an adequate and adaptive modelling of the CMS operations, in order to allow detailed optimizations and eventually a prediction of system behaviours. These data are now streamed into the CERN Hadoop data cluster for further analysis. Specific sets of information (e.g. data on how many replicas of datasets CMS wrote on disks at WLCG Tiers, data on which datasets were primarily requested for analysis, etc.) were collected on Hadoop and processed with MapReduce applications that profit from the parallelization on the Hadoop cluster. We present the implementation of new monitoring applications on Hadoop, and discuss the new possibilities in CMS computing monitoring introduced with the ability to quickly process big data sets from multiple sources, looking forward to a predictive modeling of the system.
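As an illustration of the kind of MapReduce processing mentioned above, the following is a minimal sketch in plain Python that counts replicas per dataset. It does not use Hadoop and is not the CMS implementation; the record layout, dataset names, site names, and function names are all hypothetical.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical replica records: (dataset name, site name). Not real CMS data.
RECORDS = [
    ("/A/Run1/AOD", "site_a"),
    ("/A/Run1/AOD", "site_b"),
    ("/B/Run1/RECO", "site_b"),
]


def map_record(record):
    """Map step: emit (dataset, 1) for each replica record."""
    dataset, _site = record
    yield dataset, 1


def shuffle(pairs):
    """Group mapped values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the per-record counts for one dataset."""
    return key, sum(values)


def replica_counts(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    mapped = chain.from_iterable(map_record(r) for r in records)
    return dict(reduce_counts(k, v) for k, v in shuffle(mapped).items())
```

On a real cluster the map and reduce steps would run in parallel over partitions of the input; here they run sequentially only to show the data flow.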
The CMS Data Management System Giffels, M; Guo, Y; Kuznetsov, V ...
Journal of physics. Conference series, 01/2014, Volume 513, Issue 4
Journal Article
Peer reviewed
Open access
The data management elements in CMS are scalable, modular, and designed to work together. The main components are PhEDEx, the data transfer and location system; the Data Booking Service (DBS), a metadata catalog; and the Data Aggregation Service (DAS), designed to aggregate views and provide them to users and services. Tens of thousands of samples have been cataloged and petabytes of data have been moved since the run began. The modular system has allowed the optimal use of appropriate underlying technologies. In this contribution we will discuss the use of both Oracle and NoSQL databases to implement the data management elements as well as the individual architectures chosen. We will discuss how the data management system functioned during the first run, and what improvements are planned in preparation for 2015.
During the first LHC run, the CMS experiment collected tens of petabytes of collision and simulated data, which need to be distributed among dozens of computing centres with low latency in order to make efficient use of the resources. While the desired level of throughput has been successfully achieved, it is still common to observe transfer workflows that cannot reach full completion in a timely manner due to a small fraction of stuck files which require operator intervention. For this reason, in 2012 the CMS transfer management system, PhEDEx, was instrumented with a monitoring system to measure file transfer latencies, and to predict the completion time for the transfer of a data set. The operators can detect abnormal patterns in transfer latencies while the transfer is still in progress, and monitor the long-term performance of the transfer infrastructure to plan the data placement strategy. Based on the data collected for one year with the latency monitoring system, we present a study on the different factors that contribute to transfer completion time. As case studies, we analyze several typical CMS transfer workflows, such as distribution of collision event data from CERN or upload of simulated event data from the Tier-2 centres to the archival Tier-1 centres. For each workflow, we present the typical patterns of transfer latencies that have been identified with the latency monitor. We identify the areas in PhEDEx where a development effort can reduce the latency, and we show how we are able to detect stuck transfers which need operator intervention. We propose a set of metrics to alert about stuck subscriptions and prompt for manual intervention, with the aim of improving transfer completion times.
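The abstract does not spell out the proposed metrics, so the following is only a hypothetical sketch of one possible stuck-transfer heuristic: flag a transfer that is mostly complete but has been idle for much longer than its typical per-file latency. The function name, parameters, and thresholds are assumptions for illustration and are not taken from PhEDEx.

```python
from statistics import median


def looks_stuck(latencies_s, total_files, idle_s,
                min_done_fraction=0.9, idle_factor=10.0):
    """Return True if a transfer is mostly complete but idle for too long.

    latencies_s: latencies (seconds) of the files that have already arrived.
    total_files: number of files in the whole transfer.
    idle_s: seconds elapsed since the last file arrived.
    The two thresholds are illustrative defaults, not tuned values.
    """
    if total_files <= 0 or not latencies_s:
        return False
    done = len(latencies_s)
    if done >= total_files:
        return False  # every file has arrived, nothing can be stuck
    done_fraction = done / total_files
    typical = median(latencies_s)
    return done_fraction >= min_done_fraction and idle_s > idle_factor * typical
```

The median is used as the "typical" latency because a few very slow files would distort a mean; any robust statistic would serve the same purpose in this sketch.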
Tier-2 to Tier-2 data transfers have been identified as a necessary extension of the CMS computing model. The Debugging Data Transfers (DDT) Task Force in CMS was charged with commissioning Tier-2 to Tier-2 PhEDEx transfer links beginning in late 2009, originally to serve the needs of physics analysis groups for the transfer of their results between the storage elements of the Tier-2 sites associated with the groups. PhEDEx is the data transfer middleware of the CMS experiment. For analysis jobs using CRAB, the CMS Remote Analysis Builder, the challenges of remote stage out of job output at the end of the analysis jobs led to the introduction of a local fallback stage out, and will eventually require the asynchronous transfer of user data over essentially all of the Tier-2 to Tier-2 network using the same PhEDEx infrastructure. In addition, direct file sharing of physics and Monte Carlo simulated data between Tier-2 sites can relieve the operational load of the Tier-1 sites in the original CMS Computing Model, and already represents an important component of CMS PhEDEx data transfer volume. The experience, challenges and methods used to debug and commission the thousands of data transfer links between CMS Tier-2 sites world-wide are explained and summarized. The resulting operational experience with Tier-2 to Tier-2 transfers is also presented.
During the LHC Run-1 data taking, all experiments collected large data volumes from proton-proton and heavy-ion collisions. The collision data, together with massive volumes of simulated data, were replicated in multiple copies, transferred among various Tier levels, transformed/slimmed in format/content. These data were then accessed (both locally and remotely) by large groups of distributed analysis communities exploiting the WorldWide LHC Computing Grid infrastructure and services. While efficient data placement strategies - together with optimal data redistribution and deletions on demand - have become the core of static versus dynamic data management projects, little effort has so far been invested in understanding the detailed data-access patterns which surfaced in Run-1. These patterns, if understood, can be used as input to simulation of computing models at the LHC, to optimise existing systems by tuning their behaviour, and to explore next-generation CPU/storage/network co-scheduling solutions. This is of great importance, given that the scale of the computing problem will increase far faster than the resources available to the experiments, for Run-2 and beyond. Studying data-access patterns involves the validation of the quality of the monitoring data collected on the "popularity" of each dataset, the analysis of the frequency and pattern of accesses to different datasets by analysis end-users, the exploration of different views of the popularity data (by physics activity, by region, by data type), the study of the evolution of Run-1 data exploitation over time, and the evaluation of the impact of different data placement and distribution choices on the available network and storage resources and their impact on the computing operations. This work presents some insights from studies on the popularity data from the CMS experiment. We present the properties of a range of physics analysis activities as seen by the data popularity, and make recommendations for how to tune the initial distribution of data in anticipation of how it will be used in Run-2 and beyond.
CMS computing needs reliable, stable and fast connections among multi-tiered computing infrastructures. For data distribution, the CMS experiment relies on a data placement and transfer system, PhEDEx, managing replication operations at each site in the distribution network. PhEDEx uses the File Transfer Service (FTS), a low level data movement service responsible for moving sets of files from one site to another, while allowing participating sites to control the network resource usage. FTS servers are provided by Tier-0 and Tier-1 centres and are used by all computing sites in CMS, according to the established policy. FTS needs to be set up according to the Grid site's policies, and properly configured to satisfy the requirements of all Virtual Organizations making use of the Grid resources at the site. Managing the service efficiently requires good knowledge of the CMS needs for all kinds of transfer workflows. This contribution deals with a revision of FTS servers used by CMS, collecting statistics on their usage, customizing the topologies and improving their setup in order to keep CMS transferring data at the desired levels in a reliable and robust way.
FTS3: Quantitative Monitoring Riahi, H; Salichos, M; Keeble, O ...
Journal of physics. Conference series, 01/2015, Volume 664, Issue 6
Journal Article
Peer reviewed
Open access
The overall success of LHC data processing depends heavily on stable, reliable and fast data distribution. The Worldwide LHC Computing Grid (WLCG) relies on the File Transfer Service (FTS) as the data movement middleware for moving sets of files from one site to another. This paper describes the components of the FTS3 monitoring infrastructure and how they are built to satisfy the common and particular requirements of the LHC experiments. We show how the system provides a complete and detailed cross-virtual organization (VO) picture of transfers for sites, operators and VOs. This information has proven critical due to the shared nature of the infrastructure, allowing a complete view of all transfers on shared network links between various workflows and VOs using the same FTS transfer manager. We also report on the performance of the FTS service itself, using data generated by the aforementioned monitoring infrastructure both during the commissioning and the first phase of production. We also explain how this monitoring information and the network metrics produced can be used both as a starting point for troubleshooting data transfer issues and as a mechanism to collect information such as transfer efficiency between sites, achieved throughput and its evolution over time, most common errors, etc., and to take decisions based on them to further optimize transfer workflows. The service setup is subject to site policies to control the network resource usage, as well as to the requirements of all the VOs making use of the Grid resources at the site. FTS3 is the new version of FTS and was deployed in production in August 2014.
PhEDEx is the data-movement solution for CMS at the LHC. Created in 2004, it is now one of the longest-lived components of the CMS dataflow/workflow world. As such, it has undergone significant evolution over time, and continues to evolve today, despite being a fully mature system. Originally a toolkit of agents and utilities dedicated to specific tasks, it is becoming a more open framework that can be used in several ways, both within and beyond its original problem domain. In this talk we describe how a combination of refactoring and adoption of new technologies that have become available over the years has made PhEDEx more flexible, maintainable, and scalable.
PhEDEx has been serving the CMS community since 2004 as the data broker. Every PhEDEx operation is initiated by a request, e.g. a request to move or to delete data. A request has its own life cycle, including creation, approval, notification, and bookkeeping, and the details depend on its type. Currently, only two kinds of requests, transfer and deletion, are fully integrated in PhEDEx. They are tailored specifically to the operations' workflows. Serving a new type of request generally means a fair amount of development work. After many years of operation, we have gathered enough experience to rethink the request handling in PhEDEx. The Generalized Request Project is set to abstract such experience and design a request system which is not tied to the current workflow, yet is general enough to accommodate current and future requests. The challenges are dealing with different stages in a request's life cycle, the complexity of the approval process, and the complexity of the ability and authority associated with each role in the context of the request. We start with a high level abstraction driven by a deterministic finite automaton, followed by a formal description and handling of the approval process, followed by a set of tools that make such a system friendly to the users. Since we have a formal way to describe the life of a request and a mechanism to systematically handle it, serving a new kind of request is merely a configuration issue, adding the description of the new request, rather than a development effort. In this paper, we share the design and implementation of a generalized request framework and the experience of taking an existing serving system through a re-design and re-deployment.
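To make the idea of a configuration-driven request life cycle concrete, here is a minimal sketch of a deterministic finite automaton whose transition table is plain data. The state names, event names, and class are hypothetical and loosely follow the stages named in the abstract (creation, approval, notification, bookkeeping); this is not the PhEDEx design.

```python
# Hypothetical description of one request type. Under this sketch, adding a
# new request type means adding another dictionary like this one, not new code.
TRANSFER_REQUEST = {
    "start": "created",
    "accepting": {"recorded", "rejected"},
    "transitions": {
        ("created", "approve"): "approved",
        ("created", "reject"): "rejected",
        ("approved", "notify"): "notified",
        ("notified", "record"): "recorded",
    },
}


class RequestAutomaton:
    """Drive one request through the life cycle described by a spec."""

    def __init__(self, spec):
        self.spec = spec
        self.state = spec["start"]

    def fire(self, event):
        """Apply an event; raise ValueError if it is not allowed in this state."""
        key = (self.state, event)
        if key not in self.spec["transitions"]:
            raise ValueError(
                f"event {event!r} not allowed in state {self.state!r}"
            )
        self.state = self.spec["transitions"][key]
        return self.state

    @property
    def finished(self):
        """True once the request has reached an accepting (final) state."""
        return self.state in self.spec["accepting"]
```

Because each (state, event) pair maps to at most one next state, the automaton is deterministic, and an illegal step such as recording a request that was never approved is rejected rather than silently accepted.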
The CMS experiment at the LHC relies on 7 Tier-1 centres of the WLCG to perform the majority of its bulk processing activity, and to archive its data. During the first run of the LHC, these two functions were tightly coupled as each Tier-1 was constrained to process only the data archived on its hierarchical storage. This lack of flexibility in the assignment of processing workflows occasionally resulted in uneven resource utilisation and in an increased latency in the delivery of the results to the physics community. The long shutdown of the LHC in 2013-2014 was an opportunity to revisit this mode of operations, disentangling the processing and archive functionalities of the Tier-1 centres. The storage services at the Tier-1s were redeployed breaking the traditional hierarchical model: each site now provides a large disk storage to host input and output data for processing, and an independent tape storage used exclusively for archiving. Movement of data between the tape and disk endpoints is not automated, but triggered externally through the WLCG transfer management systems. With this new setup, CMS operations actively controls at any time which data is available on disk for processing and which data should be sent to archive. Thanks to the high-bandwidth connectivity guaranteed by the LHCOPN, input data can be freely transferred between disk endpoints as needed to take advantage of free CPU, turning the Tier-1s into a large pool of shared resources. The output data can be validated before archiving them permanently, and temporary data formats can be produced without wasting valuable tape resources. Finally, the data hosted on disk at Tier-1s can now be made available also for user analysis since there is no risk any longer of triggering chaotic staging from tape. In this contribution, we describe the technical solutions adopted for the new disk and tape endpoints at the sites, and we report on the commissioning and scale testing of the service. We detail the procedures implemented by CMS computing operations to actively manage data on disk at Tier-1 sites, and we give examples of the benefits brought to CMS workflows by the additional flexibility of the new system.