Peer-reviewed, Open access
  • Design and Evaluation in a ...
    Andreetto, P; Bauce, M; Bertocco, S; Capannini, F; Cecchi, M; Compostella, G; Dorigo, A; Frizziero, E; Giacomini, F; Gianelle, A; Lucchesi, D; Mezzadri, M; Monforte, S; Prelz, F; Molinari, E; Rebatto, D; Sgaravatto, M; Zangrando, L

    Journal of physics. Conference series, 12/2011, Volume: 331, Issue: 6
    Journal Article

    The High Throughput Computing paradigm typically involves a scenario in which a given, estimated processing power is made available and sustained by the computing environment over a medium to long period of time. As a consequence, performance goals are generally targeted at maximizing resource utilization to obtain the expected throughput, rather than at minimizing the run time of individual jobs. This does not mean that optimal resource selection through adequate workload management is undesired or ineffective; nonetheless, relatively small and pre-assessed percentages of suboptimal choices or unexpected events can be tolerated. However, there are use cases within the high-energy physics (HEP) community that the described model does not immediately fit. This paper deals with the workload needs primarily driven by the Collider Detector at Fermilab (CDF) experimental collaboration. In particular, the CDF analysis facility (CAF) typically operates by splitting its computations into so-called sections, which can be seen as sets of uniform and independent jobs. Processing a section cannot be considered complete until all its jobs have been successfully executed, which requires a Minimum Completion Time (MCT) dynamic scheduling policy under which not even a single job should linger in non-terminal Grid states. A significant part of the CDF analysis is processed on the European Grid infrastructure through the gLite Workload Management System (WMS). This paper describes the design enhancements and ranking algorithms that the WMS has been provided with to implement an adaptive scheduling policy that minimises MCT. The case study, the outlined approach and first results are presented.
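
The section-completion constraint described in the abstract can be illustrated with a minimal sketch. This is not the WMS ranking algorithm, which the abstract does not specify; it is the generic textbook greedy MCT heuristic, and the function name `mct_schedule`, the job costs and the resource speeds are all hypothetical. Each job goes to the resource that would finish it earliest, and the section's completion time is the finish time of its last job.

```python
def mct_schedule(job_costs, resource_speeds):
    """Greedy MCT sketch: place each job on the resource that finishes it earliest.

    job_costs       -- hypothetical work units per job (jobs in a section are uniform)
    resource_speeds -- hypothetical work units processed per time unit, per resource
    """
    ready_at = [0.0] * len(resource_speeds)  # time at which each resource is free
    assignment = []
    for cost in job_costs:
        # Expected completion time of this job on every candidate resource.
        finish = [ready_at[r] + cost / resource_speeds[r]
                  for r in range(len(resource_speeds))]
        best = min(range(len(finish)), key=finish.__getitem__)
        ready_at[best] = finish[best]
        assignment.append(best)
    # A section is complete only when its slowest job is done.
    return assignment, max(ready_at)


# Four uniform jobs, one fast and one slow resource (assumed values).
assignment, section_done_at = mct_schedule([4, 4, 4, 4], [2.0, 1.0])
print(assignment, section_done_at)
```

Because the completion time is a maximum over all jobs, a single job stuck on a slow or failing resource delays the whole section, which is why a throughput-oriented policy alone is not enough for this use case.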