Abstract
Large Language Models (LLMs) deliver state-of-the-art performance but demand substantial computation and memory, making deployment in resource-limited settings challenging. Field-Programmable Gate Arrays (FPGAs) offer parallelism and efficiency, yet most prior FPGA accelerators rely on low-level, platform-specific flows that hinder portability. This work presents oneLLM, to our knowledge the first FPGA-based LLM inference design using Intel's oneAPI, which enables a unified high-level programming model across CPUs, GPUs, and FPGAs. Our deeply pipelined, multi-kernel hardware architecture connects specialized kernels via oneAPI pipes for on-chip streaming, reducing host-device communication. Implemented on an Intel Agilex 7 FPGA, the design runs 3 times faster than a CPU implementation and 8.8 times faster than a non-pipelined baseline while meeting resource constraints, demonstrating the potential of portable FPGA development for LLM acceleration. Code is available at https://github.com/custom-computing-ic/llm-oneapi-fpga.
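The streaming idea in the abstract, specialized kernels that hand data to one another through bounded on-chip pipes rather than through the host, can be illustrated with a host-side analogy. The sketch below is not oneAPI code and is not taken from the oneLLM repository: it uses Python threads in place of kernels and `queue.Queue` in place of pipes, and the names `producer`, `stage`, `run_pipeline`, `SENTINEL`, and the `depth` parameter are all hypothetical, chosen only for illustration.

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker (an assumption of this sketch)


def producer(out_pipe, items):
    """First stage: push inputs into the first pipe, one item at a time."""
    for item in items:
        out_pipe.put(item)  # blocks while the pipe is full, like a bounded FIFO
    out_pipe.put(SENTINEL)


def stage(in_pipe, out_pipe, fn):
    """Middle stage: read from one pipe, transform, write to the next pipe."""
    while True:
        item = in_pipe.get()
        if item is SENTINEL:
            out_pipe.put(SENTINEL)  # forward the end-of-stream marker
            return
        out_pipe.put(fn(item))


def run_pipeline(items, fns, depth=4):
    """Chain the stages with bounded queues and collect the final outputs."""
    pipes = [queue.Queue(maxsize=depth) for _ in range(len(fns) + 1)]
    threads = [threading.Thread(target=producer, args=(pipes[0], items))]
    for i, fn in enumerate(fns):
        threads.append(
            threading.Thread(target=stage, args=(pipes[i], pipes[i + 1], fn))
        )
    for th in threads:
        th.start()

    results = []
    while True:
        item = pipes[-1].get()
        if item is SENTINEL:
            break
        results.append(item)

    for th in threads:
        th.join()
    return results
```

For example, `run_pipeline([1, 2, 3], [lambda x: x * 2, lambda x: x + 1])` passes each input through a doubling stage and then an increment stage. All stages are active at once, and intermediate values never return to the caller between stages, which is the property the pipe-connected kernel design relies on to reduce host-device communication.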
| Original language | English |
|---|---|
| Title of host publication | 2025 IEEE 16th International Conference on ASIC (ASICON) |
| Publisher | IEEE Computer Society |
| Number of pages | 4 |
| ISBN (Electronic) | 9798331539177 |
| ISBN (Print) | 9798331539184 |
| Publication status | Published - 19 Jan 2026 |
| Event | 2025 IEEE 16th International Conference on ASIC, ASICON 2025 - Kunming, China. Duration: 21 Oct 2025 → 24 Oct 2025 |
Publication series
| Name | Proceedings of International Conference on ASIC |
|---|---|
| ISSN (Print) | 2162-7541 |
| ISSN (Electronic) | 2162-755X |
Conference
| Conference | 2025 IEEE 16th International Conference on ASIC, ASICON 2025 |
|---|---|
| Country/Territory | China |
| City | Kunming |
| Period | 21/10/25 → 24/10/25 |
Bibliographical note
Publisher Copyright: © 2025 IEEE.
Fingerprint
Dive into the research topics of 'Optimizing LLM inference for FPGAs'. Together they form a unique fingerprint.