Abstract
The trend for heterogeneous embedded systems is the integration of accelerators and general-purpose CPU cores on the same die. In these integrated architectures, like the Zynq UltraScale+ board (CPU+FPGA) that we target in this work, hardware support for shared memory and low-overhead synchronization between the accelerator and the CPU cores makes the case for exploring strategies that exploit a tight collaboration between the CPUs and the accelerator. In this paper we propose a novel lightweight scheduling strategy, FastFit, targeted to FPGA accelerators, and a new scheduler based on it, named MultiFastFit, which asynchronously tackles heterogeneous systems composed of a variety of CPU cores and FPGA IPs. Our strategy significantly reduces the overhead of automatically computing near-optimal chunksizes compared to a previous state-of-the-art auto-tuned approach, which makes our approach more suitable for fine-grained applications. Additionally, our scheduler MultiFastFit has been designed to enable the efficient co-execution of work among compute devices in such a way that all the devices are kept busy while minimizing load imbalance. Our approaches have been evaluated using four benchmarks carefully tuned for the low-power UltraScale+ platform. Our experiments demonstrate that the FastFit strategy always finds the near-optimal FPGA chunksize for any device configuration at a reasonable cost, even for fine-grained and irregular applications, and that heterogeneous CPU+FPGA co-executions that exploit all the compute devices are usually faster and more energy efficient than the CPU-only and FPGA-only executions. We have also compared MultiFastFit with other state-of-the-art scheduling strategies, finding that it outperforms another auto-tuned approach by up to 2x and achieves similar results to manually-tuned schedulers without requiring an offline search for the ideal CPU-FPGA partition or FPGA chunk granularity.
Original language | English |
---|---|
Article number | 102398 |
Journal | Journal of Systems Architecture |
Volume | 124 |
Early online date | 15 Jan 2022 |
DOIs | |
Publication status | Published - Mar 2022 |
Bibliographical note
Funding Information: This work was partially supported by the Spanish projects PID2019-105396RB-I00, UMA18-FEDERJA-108, and UK EPSRC projects ENEAC (EP/N002539/1), HOPWARE (EP/V040863/1) and RS MINET (INF\R2\192044). Funding for open access charge: Universidad de Málaga / CBUA.
Publisher Copyright:
© 2022 The Authors
Keywords
- Energy efficiency
- FPGA
- Heterogeneous architecture
- Heterogeneous scheduling
- Throughput model
Fingerprint
Dive into the research topics of 'Lightweight asynchronous scheduling in heterogeneous reconfigurable systems'. Together they form a unique fingerprint.
Projects
- 2 Finished
- Heterogeneous computing platforms for resource-aware video and data analytics
  Nunez-Yanez, J. L.
  27/07/21 → 26/10/21
  Project: Research
-