Pipelined Streaming Computation of Histogram in FPGA OpenCL

Mohammad Hosseinabady*, Jose Luis Nunez-Yanez

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference Contribution (Conference Proceeding)

2 Citations (Scopus)
446 Downloads (Pure)


The emergence of High-Level Synthesis (HLS) techniques and tools, along with new features in high-end FPGAs such as multi-port memory interfaces, has enabled designers to use FPGAs not only for compute-bound but also for memory-bound tasks. This paper explains how to efficiently parallelise histogram computation, a memory-bound task, using the OpenCL framework running on an FPGA. We have run our implementation on three high-end FPGA platforms: Alpha Data 7v3, Alpha Data ADM-PCIE-KU3 and Xilinx KU115. A histogram with 256 fixed-width bins running on the 7v3, KU3 and KU115 platforms achieves 8.38, 15.29 and 38.57 giga bin updates per second (GUPS), respectively. The best result, 38.57 GUPS on the KU115 platform, outperforms the Nvidia GeForce 1060 GPU at 31.36 GUPS. It also exceeds the performance of the dual-socket 8-core Intel Xeon E5-2690 at 13 GUPS and the 60-core Intel Xeon Phi 5110P coprocessor at 18 GUPS. The proposed implementation is not sensitive to locally invariant (LI) data sets, whereas the performance of GPU and CPU implementations drops with LI data. When processing locally invariant data sets, our FPGA implementation can be up to 91.4% and 44.9% faster than the GeForce 1060 and 1080 GPUs, respectively. The source code of the designs is available at https://github.com/Hosseinabady/histogram-sdaccel.
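To make the terms in the abstract concrete, the following is a minimal plain C sketch of a histogram with 256 fixed-width bins and of the GUPS metric. It is not the authors' OpenCL design, which is in the linked repository. The names `histogram256`, `gups` and `NUM_LANES` are hypothetical, and splitting the updates across a few private sub-histograms is only one common way to keep back-to-back updates off the same counter when neighbouring input values repeat, which is how we read "locally invariant" here.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_BINS  256  /* 256 fixed-width bins: one bin per 8-bit input value */
#define NUM_LANES 4    /* hypothetical number of private sub-histograms */

/* Count how often each 8-bit value occurs in data[0..n-1].
 * Consecutive inputs go to different private sub-histograms, so a run of
 * equal values does not update the same counter in successive iterations.
 * The sub-histograms are summed into hist[] at the end. */
void histogram256(const uint8_t *data, size_t n, uint32_t hist[NUM_BINS])
{
    uint32_t local[NUM_LANES][NUM_BINS] = {{0}};

    for (size_t i = 0; i < n; i++) {
        local[i % NUM_LANES][data[i]]++;   /* one bin update per input */
    }

    for (size_t b = 0; b < NUM_BINS; b++) {
        uint32_t sum = 0;
        for (size_t l = 0; l < NUM_LANES; l++) {
            sum += local[l][b];
        }
        hist[b] = sum;
    }
}

/* Giga bin updates per second: bin updates divided by elapsed time,
 * scaled by 1e9. The caller supplies a measured time in seconds. */
double gups(size_t updates, double seconds)
{
    return (double)updates / seconds / 1e9;
}
```

Under this definition, a run that performs 2,000,000,000 bin updates in one second would score 2.0 GUPS; the figures quoted in the abstract are the authors' own measurements and are not reproduced by this sketch.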

Original language: English
Title of host publication: Parallel Computing is Everywhere
Publisher: IOS Press
Number of pages: 10
ISBN (Electronic): 9781614998433
ISBN (Print): 9781614998426
Publication status: Published - 7 Mar 2018

Publication series

Name: Advances in Parallel Computing
ISSN (Print): 0927-5452
ISSN (Electronic): 1879-808X


  • FPGA
  • High-Level Synthesis
  • Histogram
  • Stream Computing

