In this chapter we show the potential of OpenCL, a standard parallel programming language, to deliver portable performance on Intel Xeon Phi coprocessors, Xeon processors, and many-core devices such as GPUs from multiple vendors. A single program can deliver this performance without multiple versions of the code, an advantage of OpenCL over most other approaches available today. As evidence of OpenCL’s performance portability, we describe results from the BUDE molecular docking code, which sustains over 30% of peak floating-point performance on a wide variety of processors, including laptop CPUs, Xeon, Xeon Phi, and GPUs.
|Title of host publication|High Performance Parallelism Pearls|
|Subtitle of host publication|Multicore and Many-core Programming Approaches|
|Editors|James Reinders, James Jeffers|
|Number of pages|17|
|Publication status|Published - 4 Nov 2014|