E-viri
Recenzirano
Odprti dostop
-
Ashar, Neha; Raut, Gopal; Trivedi, Vasundhara; Vishvakarma, Santosh Kumar; Kumar, Akash
IEEE access, 01/2024, Letnik: 12Journal Article
In response to the escalating demand for hardware-efficient Deep Neural Network (DNN) architectures, we present a novel quantize-enabled multiply-accumulate (MAC) unit. Our methodology employs a right shift-and-add computation for MAC operation, enabling runtime truncation without additional hardware. This architecture optimally utilizes hardware resources, enhancing throughput performance while reducing computational complexity through bit-truncation techniques. Our key methodology involves designing a hardware-efficient MAC computational algorithm that supports both iterative and pipeline implementations, catering to diverse hardware efficiency or enhanced throughput requirements in accelerators. Additionally, we introduce a processing element (PE) with a pre-loading bias scheme, reducing one clock delay and eliminating the need for conventional extra resources in PE implementation. The PE facilitates quantization-based MAC calculations through an efficient bit-truncation method, removing the necessity for extra hardware logic. This versatile PE accommodates variable bit-precision with a dynamic fraction part within the sfxpt < N,f > representation, meeting specific model or layer demands. Through software emulation, our proposed approach demonstrates minimal accuracy loss, revealing under 1.6% loss for LeNet-5 using MNIST and around 4% for ResNet-18 and VGG-16 with CIFAR-10 in the sfxpt <8,5> format compared to conventional float32-based implementations. Hardware performance parameters on the Xilinx-Virtex-7 board unveil a 37% reduction in area utilization and a 45% reduction in power consumption compared to the best state-of-the-art MAC architecture. Extending the proposed MAC to a LeNet DNN model results in a 42% reduction in resource requirements and a significant 27% reduction in delay. This architecture provides notable advantages for resource-efficient, high-throughput edge-AI applications.
![loading ... loading ...](themes/default/img/ajax-loading.gif)
Vnos na polico
Trajna povezava
- URL:
Faktor vpliva
Dostop do baze podatkov JCR je dovoljen samo uporabnikom iz Slovenije. Vaš trenutni IP-naslov ni na seznamu dovoljenih za dostop, zato je potrebna avtentikacija z ustreznim računom AAI.
Leto | Faktor vpliva | Izdaja | Kategorija | Razvrstitev | ||||
---|---|---|---|---|---|---|---|---|
JCR | SNIP | JCR | SNIP | JCR | SNIP | JCR | SNIP |
Baze podatkov, v katerih je revija indeksirana
Ime baze podatkov | Področje | Leto |
---|
Povezave do osebnih bibliografij avtorjev | Povezave do podatkov o raziskovalcih v sistemu SICRIS |
---|
Vir: Osebne bibliografije
in: SICRIS
To gradivo vam je dostopno v celotnem besedilu. Če kljub temu želite naročiti gradivo, kliknite gumb Nadaljuj.