Peer reviewed
  • Health Status Assessment an...
    Xu, Chang; Wang, Gang; Liu, Xiaoguang; Guo, Dongdong; Liu, Tie-Yan

    IEEE Transactions on Computers, 2016-11-01, Volume: 65, Issue: 11
    Journal Article

    Recently, in order to improve reactive fault tolerance techniques in large scale storage systems, researchers have proposed various statistical and machine learning methods based on SMART attributes. Most of these studies have focused on predicting failures of hard drives, i.e., labeling the status of a hard drive as "good" or not. However, in real-world storage systems, hard drives often deteriorate gradually rather than suddenly. Correspondingly, their SMART attributes change continuously towards failure. Inspired by this observation, we introduce a novel method based on Recurrent Neural Networks (RNN) to assess the health statuses of hard drives based on the gradually changing sequential SMART attributes. Compared to a simple failure prediction method, a health status assessment is more valuable in practice because it enables technicians to schedule the recovery of different hard drives according to the level of urgency. Experiments on real-world datasets for disks of different brands and scales demonstrate that our proposed method can not only achieve a reasonably accurate health status assessment, but also achieve better failure prediction performance than previous work.