The Florida CMS Tier2 center has used the Lustre filesystem as its data storage backend since 2004. Recently, the data access pattern at our site has changed greatly due to various new access methods, including file transfers through GridFTP servers, read access from the worker nodes, and remote read access through xrootd servers. To optimize file access performance, we have to consider all possible access patterns, and each pattern needs to be studied separately. In this presentation, we report on our work to optimize file access with the Lustre filesystem at the Florida CMS T2 using an approach based on analyzing these access patterns.
With the explosion of big data in many fields, the efficient management of knowledge about all aspects of data analysis gains in importance. A key feature of collaboration in large-scale projects is keeping a log of what is being done and how - for private use and reuse, and for sharing selected parts with collaborators and peers, who are often distributed geographically on an increasingly global scale. It is even better if the log is created automatically, on the fly, while the scientist or software developer works in the habitual way, without any extra effort. This saves time and enables a team to do more with the same resources. The CODESH (COllaborative DEvelopment SHell) and CAVES (Collaborative Analysis Versioning Environment System) projects address this problem in a novel way. They build on the concepts of virtual states and transitions to enhance the collaborative experience by providing automatic persistent virtual logbooks. CAVES is designed for sessions of distributed data analysis using the popular ROOT framework, while CODESH generalizes the approach to any type of work on the command line in typical UNIX shells like bash or tcsh. Repositories of sessions can be configured dynamically to record and make available the knowledge accumulated in the course of a scientific or software endeavor. Access can be controlled to define logbooks of private sessions or of sessions shared within or between collaborating groups. A typical use case is building scalable working systems for analysis of Petascale volumes of data, as encountered in the LHC experiments. Our approach is general enough to find applications in many fields.
We have developed remote data access for large volumes of data over the Wide Area Network, based on the Lustre filesystem and Kerberos authentication for security. In this paper we explore a prototype for two-step data access from worker nodes at Florida Tier3 centers, located behind a firewall and on a private network, to data hosted on the Lustre filesystem at the University of Florida CMS Tier2 center. At the Tier3 center we use a client that securely mounts the Lustre filesystem and hosts an XrootD server. The worker nodes access the data from the Tier3 client with POSIX-compliant tools via the XrootD-fs filesystem. We perform scalability tests with up to 200 jobs running in parallel on the Tier3 worker nodes.
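As an illustrative sketch only (not code from the paper), the kind of scalability test described above - many concurrent jobs reading files through a POSIX mount such as an XrootD-fs mount point - can be emulated with plain POSIX reads in parallel. The file layout and job counts below are placeholders; a real test would point at files on the mounted filesystem.

```python
# Sketch: emulate N concurrent "jobs" each sequentially reading a file
# through a POSIX interface, and report the aggregate bytes read and the
# wall-clock time. Paths here are stand-ins for files on an XrootD-fs or
# Lustre mount point.
import os
import time
import tempfile
from concurrent.futures import ThreadPoolExecutor

def read_file(path, block_size=1 << 20):
    """Read a file sequentially in fixed-size blocks; return bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total

def run_scaling_test(paths, n_jobs):
    """Read all paths with n_jobs concurrent readers; return (bytes, seconds)."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        totals = list(pool.map(read_file, paths))
    return sum(totals), time.monotonic() - start

if __name__ == "__main__":
    # Self-contained stand-in for mounted data: small local test files.
    with tempfile.TemporaryDirectory() as d:
        paths = []
        for i in range(8):
            p = os.path.join(d, f"data_{i}.bin")
            with open(p, "wb") as f:
                f.write(os.urandom(256 * 1024))
            paths.append(p)
        nbytes, secs = run_scaling_test(paths, n_jobs=4)
        print(f"read {nbytes} bytes in {secs:.3f} s")
```

Repeating the run while sweeping `n_jobs` (up to hundreds of readers, as in the 200-job test) shows how aggregate throughput scales with concurrency.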
This paper reports the design and implementation of a secure, wide area network (WAN), distributed filesystem by the ExTENCI project (Extending Science Through Enhanced National CyberInfrastructure), based on the Lustre filesystem. The system is used for remote access to analysis data from the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) and from the Lattice Quantum ChromoDynamics (LQCD) project. Security is provided by Kerberos authentication and authorization, with additional fine-grained control based on Lustre ACLs (Access Control Lists) and quotas. We investigate the impact of various Kerberos security flavors on the I/O rates of CMS applications on client nodes reading and writing data to the Lustre filesystem, and on LQCD benchmarks. The clients can be real or virtual nodes. We are investigating additional options for user authentication based on user certificates.
The Drell–Yan process at the LHC, qq̄ → Z/γ* → ℓ⁺ℓ⁻, is one of the benchmarks for confirmation of the Standard Model at the TeV energy scale. Since the theoretical prediction for the rate is precise, and the final state is clean as well as relatively easy to measure, the process can be studied at the LHC even at relatively low luminosity. Importantly, the Drell–Yan process is an irreducible background to several searches for physics beyond the Standard Model, and hence its rate at LHC energies needs to be measured accurately. In the present study, methods for measuring the Drell–Yan mass spectrum and estimating the cross-section have been developed for LHC operation at a centre-of-mass energy of 10 TeV and an integrated luminosity of 100 pb⁻¹, in the context of the CMS experiment.
This paper presents our effort to integrate the Lustre filesystem with BeStMan, GridFTP, and Ganglia to make it a fully functional WLCG SE (Storage Element). We first describe the configuration of our Lustre filesystem at the University of Florida and our integration process. We then present benchmark performance figures, IO rates from CMS analysis jobs, and WAN data transfer performance, all measured on the Lustre SE.
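As a minimal sketch of the kind of IO-rate benchmark mentioned above (not the paper's actual benchmark suite), one can time a sequential write followed by a sequential read of a test file and report MB/s. The target directory below would be on the mounted Lustre filesystem; a temporary directory stands in here so the sketch is self-contained.

```python
# Sketch: sequential write-then-read throughput of a single file.
# `target_dir` is a placeholder for a directory on the storage under test.
import os
import time
import tempfile

def write_read_bench(target_dir, size_mb=16, block=1 << 20):
    """Write size_mb MiB, fsync, read it back; return (write, read) MB/s."""
    path = os.path.join(target_dir, "bench.dat")
    payload = b"\0" * block

    t0 = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # force the data to stable storage
    write_s = time.monotonic() - t0

    t0 = time.monotonic()
    nread = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block)
            if not chunk:
                break
            nread += len(chunk)
    read_s = time.monotonic() - t0

    mb = nread / (1 << 20)
    return mb / write_s, mb / read_s

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        w, r = write_read_bench(d, size_mb=4)
        print(f"write: {w:.1f} MB/s  read: {r:.1f} MB/s")
```

Note that the read pass may be served from the client page cache; a realistic benchmark drops or bypasses caches (e.g. with direct IO) and uses file sizes well beyond client memory.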
We describe the work on creating system images of Lustre virtual clients in the ExTENCI project (Extending Science Through Enhanced National CyberInfrastructure), using several virtualization technologies (Xen, VMware, VirtualBox, KVM). These virtual machines can be built at several levels: from a basic Linux installation (we use Scientific Linux 5 as an example), to a Lustre client with Kerberos authentication, up to complete clients including local or distributed (CernVM-FS based) installations of the full CERN and project-specific software stack for typical LHC experiments. The level, and hence the size, of the images is determined by the users on demand. Various sites and individual users can simply download and use them out of the box on Linux/UNIX, Windows, and Mac OS X based hosts. We compare the performance of virtual clients with that of real physical systems for typical high energy physics applications, like Monte Carlo simulations or analysis of data stored in ROOT trees.
This paper presents storage implementations that utilize the Lustre filesystem for CMS analysis with direct POSIX file access, while keeping dCache as the frontend for data distribution and management. We describe two implementations that integrate dCache with Lustre, and show how to enable user data access without going through the dCache file read protocol. Initial measurements of CMS analysis jobs and transfer performance are shown, and the advantages of the different implementations are briefly discussed.
Physics at the Large Hadron Collider (LHC) and the International e⁺e⁻ Linear Collider (ILC) will be complementary in many respects, as has been demonstrated at previous generations of hadron and lepton colliders. This report addresses the possible interplay between the LHC and the ILC in testing the Standard Model and in discovering and determining the origin of new physics. Mutual benefits for the physics programme at both machines can occur both at the level of a combined interpretation of Hadron Collider and Linear Collider data and at the level of combined analyses of the data, where results obtained at one machine can directly influence the way analyses are carried out at the other machine. Topics under study comprise the physics of weak and strong electroweak symmetry breaking, supersymmetric models, new gauge theories, models with extra dimensions, and electroweak and QCD precision physics. The status of the work that has been carried out within the LHC/ILC Study Group so far is summarized in this report. Possible topics for future studies are outlined.