Abstract
This paper proposes a novel nested-loop optimization approach that uses the maximum tolerable model accuracy loss as a hyperparameter to improve DNN compression for hardware accelerators. The process combines local inner-loop optimization with global outer-loop optimization, linked by bottom-up feedback. Our multi-level approach covers optimization tasks distributed across different computational spaces, such as software and hardware (High-Level Synthesis, HLS). As an example of an optimization task, we introduce and detail the mixed-precision Quantization Heuristic Search (QHS), which adjusts numerical representations to reduce hardware complexity while keeping accuracy within user-defined tolerances. This approach offers a new perspective on model compression, leading to efficient and effective DNN hardware accelerators.
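To make the idea of a tolerance-bounded mixed-precision search concrete, the following is a minimal sketch of one possible greedy heuristic. It is not the paper's QHS algorithm, whose details the abstract does not give. The function names, the layer names, and the `toy_accuracy_loss` model (each layer contributing `sensitivity * 2**-bits`) are all hypothetical stand-ins; in practice the loss would come from evaluating the quantized network on validation data.

```python
from typing import Callable, Dict, List


def greedy_bitwidth_search(
    layers: List[str],
    accuracy_loss: Callable[[Dict[str, int]], float],
    max_loss: float,
    start_bits: int = 16,
    min_bits: int = 2,
) -> Dict[str, int]:
    """Lower per-layer bit widths while the loss stays within max_loss.

    Illustrative greedy heuristic only (an assumption, not the paper's QHS).
    """
    bits = {name: start_bits for name in layers}
    improved = True
    while improved:
        improved = False
        for name in layers:
            if bits[name] <= min_bits:
                continue
            # Try removing one bit from this layer only.
            trial = dict(bits)
            trial[name] -= 1
            # Accept the change only if the tolerance is still respected.
            if accuracy_loss(trial) <= max_loss:
                bits = trial
                improved = True
    return bits


def toy_accuracy_loss(bits: Dict[str, int]) -> float:
    """Hypothetical loss model: sensitivity * 2^-bits summed over layers."""
    sensitivity = {"conv1": 4.0, "conv2": 1.0, "fc": 0.25}
    return sum(sensitivity[name] * 2.0 ** (-b) for name, b in bits.items())


if __name__ == "__main__":
    chosen = greedy_bitwidth_search(
        ["conv1", "conv2", "fc"], toy_accuracy_loss, max_loss=0.05
    )
    print(chosen, toy_accuracy_loss(chosen))
```

In this sketch `max_loss` plays the role of the maximum tolerable accuracy loss: the search stops once no layer can lose another bit without exceeding it. An outer loop could, under the same assumption, pass the resulting bit widths to a hardware-level optimization step and feed its cost back into the next search.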
| Original language | English |
|---|---|
| Title of host publication | Proceedings - International Conference on Field Programmable Technology 2024, ICFPT 2024 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Number of pages | 2 |
| ISBN (Electronic) | 9798331523213 |
| ISBN (Print) | 9798331523220 |
| DOIs | |
| Publication status | Published - 18 Aug 2025 |
| Event | 23rd International Conference on Field Programmable Technology, ICFPT 2024 - Sydney, Australia. Duration: 10 Dec 2024 → 12 Dec 2024 |
Publication series
| Name | Proceedings - International Conference on Field-Programmable Technology, ICFPT |
|---|---|
| ISSN (Print) | 2837-0430 |
| ISSN (Electronic) | 2837-0449 |
Conference
| Conference | 23rd International Conference on Field Programmable Technology, ICFPT 2024 |
|---|---|
| Country/Territory | Australia |
| City | Sydney |
| Period | 10/12/24 → 12/12/24 |
Bibliographical note
Publisher Copyright: © 2024 IEEE.
Title
Optimizing DNN Accelerator Compression Using Tolerable Accuracy Loss