Peer reviewed Open access
  • Disaster recovery of the IN...
    Luca dell’Agnello

    EPJ Web of Conferences, 2019, Volume: 214
    Journal Article, Conference Proceeding

    The year 2017 was most likely a turning point for the INFN Tier-1. Early in the morning of November 9th, 2017, a large pipe of the city aqueduct, located under the road next to CNAF, broke. As a consequence, a river of water and mud flowed towards the Tier-1 data center. The water level did not exceed the safety threshold of the waterproof doors but, owing to the porosity of the external walls and the floor, the water found its way into the data center. The flooding compromised almost all activities and represented a serious threat to the future of the Tier-1 itself. The most affected part of the data center was the electrical room, with all the switchboards for both power lines and for the continuity systems, but the damage also extended to all the IT systems, including all the storage devices and the tape library. After a careful assessment of the damage, an intense recovery activity was launched, aimed not only at restoring the services but also at securing the data stored on disks and tapes. After nearly two months, in January, we were able to start gradually reopening all the services, including part of the farm and the storage systems. The long tail of the recovery (tape recovery, second power line) lasted until the end of May. As a short-term consequence, we have started a deep consolidation of the data center infrastructure so that it can also cope with this type of incident; for the medium and long term, we are working to move to a new, larger location that can also accommodate the foreseen increase of resources for HL-LHC.