Optimizing DNN Accelerator Compression Using Tolerable Accuracy Loss

Zhiqiang Que, Anyan Zhao, Jose G.F. Coutinho, Ce Guo, Wayne Luk

Research output: Chapter in Book/Report/Conference proceeding > Conference Contribution (Conference Proceeding)

Abstract

This paper proposes a novel nested-loop optimization approach that uses the maximum tolerable loss in model accuracy as a hyperparameter to improve DNN compression for hardware accelerators. The process combines local inner-loop optimization with global outer-loop optimization driven by bottom-up feedback. The multi-level approach covers optimization tasks distributed across different computational spaces, such as software and hardware (High-Level Synthesis, HLS). As an example of an optimization task, we introduce and detail the mixed-precision Quantization Heuristic Search (QHS), which adjusts numerical representations to reduce hardware complexity while keeping accuracy within user-defined tolerances. This approach offers a new perspective on model compression, leading to efficient and effective DNN hardware accelerators.
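
The idea of searching for lower-precision representations under a user-defined accuracy tolerance can be illustrated with a small sketch. The code below is not the authors' QHS algorithm, which the abstract does not specify; it is a minimal greedy search written under stated assumptions. The function `greedy_bitwidth_search`, the layer names, the `sensitivity` values, and the `toy_accuracy` model are all hypothetical stand-ins for a real quantized-model evaluation.

```python
from typing import Callable, Dict, List

def greedy_bitwidth_search(
    layers: List[str],
    evaluate: Callable[[Dict[str, int]], float],
    tolerance: float,
    start_bits: int = 16,
    min_bits: int = 2,
) -> Dict[str, int]:
    """Lower each layer's bit-width while accuracy loss stays within tolerance.

    Hypothetical illustration only: a greedy, layer-by-layer heuristic,
    not the QHS procedure from the paper.
    """
    bits = {name: start_bits for name in layers}
    baseline = evaluate(bits)  # accuracy at the starting precision
    for name in layers:
        while bits[name] > min_bits:
            trial = dict(bits)
            trial[name] -= 1
            # Accept the narrower width only if the total loss is tolerable.
            if baseline - evaluate(trial) <= tolerance:
                bits = trial
            else:
                break
    return bits

# Assumed toy accuracy model: each layer loses accuracy in proportion to
# its (made-up) sensitivity times the quantization step 2**-bits.
sensitivity = {"conv1": 0.8, "conv2": 0.2, "fc": 0.05}

def toy_accuracy(bits: Dict[str, int]) -> float:
    return 0.95 - sum(s * 2.0 ** -bits[n] for n, s in sensitivity.items())

tolerance = 0.01
layers = list(sensitivity)
result = greedy_bitwidth_search(layers, toy_accuracy, tolerance)
baseline_acc = toy_accuracy({n: 16 for n in layers})
final_loss = baseline_acc - toy_accuracy(result)
print(result, round(final_loss, 5))
```

In this sketch the tolerance acts as the single knob that trades accuracy for narrower arithmetic; in a real flow, `evaluate` would run the quantized model on a validation set, and the resulting bit-widths would feed a hardware cost estimate such as an HLS resource report.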
Original language: English
Title of host publication: Proceedings - International Conference on Field Programmable Technology 2024, ICFPT 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 2
ISBN (Electronic): 9798331523213
ISBN (Print): 9798331523220
DOIs
Publication status: Published - 18 Aug 2025
Event: 23rd International Conference on Field Programmable Technology, ICFPT 2024 - Sydney, Australia
Duration: 10 Dec 2024 to 12 Dec 2024

Publication series

Name: Proceedings - International Conference on Field-Programmable Technology, ICFPT
ISSN (Print): 2837-0430
ISSN (Electronic): 2837-0449

Conference

Conference: 23rd International Conference on Field Programmable Technology, ICFPT 2024
Country/Territory: Australia
City: Sydney
Period: 10/12/24 to 12/12/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.