A Streaming Dataflow Engine for Sparse Matrix-Vector Multiplication using High-Level Synthesis

Research output: Contribution to journal, Article

Original language: English
Number of pages: 14
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Date: Accepted/In press - 21 Mar 2019
Date: Published (current) - 23 Apr 2019


Using high-level synthesis techniques, this paper proposes an adaptable, high-performance streaming dataflow engine for sparse matrix dense vector multiplication (SpMV) suitable for embedded FPGAs. Because SpMV is a memory-bound algorithm, the engine combines three concepts (loop pipelining, dataflow graphs, and data streaming) to utilize most of the memory bandwidth available to the FPGA. The main goal of this paper is to show that, for memory-bound applications, FPGAs can provide performance comparable to that of the corresponding CPUs and GPUs with significantly less energy consumption. Experimental results indicate that the FPGA outperforms embedded GPUs on small and medium-size matrices by an average factor of 3.25, whereas the embedded GPU is faster on larger matrices by an average factor of 1.58. In addition, across the range of matrices considered, the FPGA implementation is more energy efficient than the embedded CPU and GPU by an average factor of 8.9. A case study adapts the proposed SpMV optimization to accelerate the support vector machine (SVM) algorithm, one of the successful classification techniques in the machine learning literature, and demonstrates the benefits of the proposed FPGA-based SpMV over the embedded CPU and GPU. In this case study, the FPGA is faster than the GPU by an average factor of 1.7 and consumes less energy by an average factor of 6.8.
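To make the streaming dataflow idea concrete, the following is a minimal software sketch, not the paper's hardware design. It assumes the matrix is stored in compressed sparse row (CSR) format, which the abstract does not specify, and it uses Python generators as a stand-in for hardware streams. The stage names (`stream_nonzeros`, `multiply`, `accumulate`) and the function `spmv_csr` are hypothetical and chosen only for illustration. Each stage consumes items one at a time from the previous stage, which loosely mirrors how pipelined stages connected by streams can work on different elements concurrently.

```python
from typing import Iterator, List, Sequence, Tuple


def stream_nonzeros(
    values: Sequence[float], col_idx: Sequence[int], row_ptr: Sequence[int]
) -> Iterator[Tuple[float, int, bool]]:
    """Stage 1: read the CSR arrays sequentially and emit
    (value, column, end_of_row) tuples, one per nonzero."""
    for row in range(len(row_ptr) - 1):
        start, end = row_ptr[row], row_ptr[row + 1]
        if start == end:
            # Empty row: emit a zero-valued marker so the row still
            # produces an output (assumes x has at least one element).
            yield 0.0, 0, True
            continue
        for k in range(start, end):
            yield values[k], col_idx[k], k == end - 1


def multiply(
    stream: Iterator[Tuple[float, int, bool]], x: Sequence[float]
) -> Iterator[Tuple[float, bool]]:
    """Stage 2: multiply each nonzero by the matching dense-vector entry."""
    for value, col, last in stream:
        yield value * x[col], last


def accumulate(stream: Iterator[Tuple[float, bool]]) -> Iterator[float]:
    """Stage 3: sum products and emit one result at each end-of-row flag."""
    acc = 0.0
    for product, last in stream:
        acc += product
        if last:
            yield acc
            acc = 0.0


def spmv_csr(
    values: Sequence[float],
    col_idx: Sequence[int],
    row_ptr: Sequence[int],
    x: Sequence[float],
) -> List[float]:
    """Chain the three stages into one streaming pipeline computing y = A x."""
    return list(accumulate(multiply(stream_nonzeros(values, col_idx, row_ptr), x)))


if __name__ == "__main__":
    # A = [[4, 0, 1],
    #      [0, 0, 0],
    #      [0, 2, 3]],  x = [1, 2, 3]
    y = spmv_csr([4.0, 1.0, 2.0, 3.0], [0, 2, 1, 2], [0, 2, 2, 4], [1.0, 2.0, 3.0])
    print(y)  # [7.0, 0.0, 13.0]
```

In this sketch the matrix arrays are read strictly in order, which is the access pattern that lets a streaming design approach the available memory bandwidth; only the dense vector `x` is accessed irregularly. In an HLS setting, each stage would typically become a pipelined loop connected to its neighbours by FIFO streams, but the exact structure used in the paper is not described in the abstract.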

    Research areas

  • Computer architecture, Edge Computing, Energy, Engines, Field programmable gate arrays, FPGA, Hardware, High-Level Synthesis, Machine learning, Optimization, Sparse matrices, Sparse-Matrix-Vector, Support Vector Machine




  • Full-text PDF (accepted author manuscript)

    Rights statement: This is the author accepted manuscript (AAM). The final published version (version of record) is available online via IEEE. Please refer to any applicable terms of use of the publisher.

    Accepted author manuscript, 5.24 MB, PDF document

