  • On the Effectiveness of Virtual IMU Data for Eating Detection with Wrist Sensors
    Jain, Yash; Kwon, Hyeokhyen; Ploetz, Thomas

    Adjunct Proceedings of the 2022 ACM International Joint Conference on Pervasive and Ubiquitous Computing and the 2022 ACM International Symposium on Wearable Computers, 09/2022
    Conference Proceeding

    The successful training of human activity recognition (HAR) systems typically depends heavily on the availability of sufficient amounts of labeled sensor data. Unfortunately, obtaining large-scale labeled datasets is usually expensive and often limited by practical and/or privacy constraints. Recently, IMUTube was introduced to tackle this data scarcity problem by generating weakly-labeled, virtual IMU data from unconstrained video repositories such as YouTube. IMUTube was demonstrated to be very effective at classifying locomotion or gym exercises that involve large movements of body parts. Yet many important daily activities, such as eating, do not exhibit such substantial body (part) movements and are instead based on more subtle, fine-grained motions. This work explores the utility of IMUTube for such subtle-motion activities, with eating detection as a specific, exemplary application. We found that, surprisingly, IMUTube is also very effective in this challenging HAR domain. Our experiments demonstrate that eating recognition systems benefit significantly from virtual IMU data extracted from video datasets, with absolute F1-score improvements of 8.4% and 5.9% for curated and in-the-wild video datasets, respectively, over the baseline's 71.5% F1-score, which is encouraging for the broader use of systems like IMUTube.
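
    As a rough illustration of the reported numbers, here is a minimal Python sketch. Only the 71.5% baseline and the 8.4/5.9-point absolute gains come from the abstract; the toy predictions at the end are hypothetical and show only how a binary eating-detection F1-score would be computed.

        # Absolute F1-score gains reported over a 71.5% baseline
        # (real IMU data only); virtual IMU data is added on top.
        from sklearn.metrics import f1_score

        baseline_f1 = 0.715       # baseline F1-score from the abstract
        gain_curated = 0.084      # curated video dataset
        gain_in_the_wild = 0.059  # in-the-wild video dataset

        print(f"curated:     {baseline_f1 + gain_curated:.3f}")      # 0.799
        print(f"in-the-wild: {baseline_f1 + gain_in_the_wild:.3f}")  # 0.774

        # Hypothetical binary labels: eating (1) vs. non-eating (0).
        y_true = [1, 0, 1, 1, 0, 0, 1, 0]
        y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
        print(f"toy F1: {f1_score(y_true, y_pred):.3f}")  # 0.750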