Analyzing and improving performance portability of OpenCL applications via auto-tuning

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

1 Citation (Scopus)

Abstract

The increasing uptake of portable, parallel programming models such as OpenCL has fueled extensive research into performance portability. Automatic performance tuning techniques have shown promise for generating kernels which are highly optimized for specific architectures, but do not address the issue of performance portability directly. With the range of architectures and possible optimizations continuously growing, the concept of achieving performance portability from a single code base becomes ever more attractive.

In this talk, we present an approach for analyzing performance portability that exploits the black-box nature of automatic performance tuning techniques. We demonstrate this approach across a diverse range of GPU and CPU architectures for two simple OpenCL applications. We then discuss the potential for auto-tuning to aid the generation of performance-portable OpenCL kernels by incorporating multi-objective optimization techniques into the tuning process.
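
To make the idea of tuning for several devices at once more concrete, the following is a minimal sketch and not the authors' method. It assumes an auto-tuner has already measured the runtime of each candidate kernel configuration on each device, and it scalarizes the multi-objective problem by choosing the configuration whose worst-case efficiency (the best runtime on a device divided by this configuration's runtime on that device) is highest. The configuration names, device names, and timings are all hypothetical.

```python
def efficiencies(runtimes):
    """Map {config: {device: runtime}} to {config: {device: efficiency}}.

    Efficiency is the best runtime observed on a device divided by this
    configuration's runtime on that device, so it lies in (0, 1].
    """
    devices = list(next(iter(runtimes.values())).keys())
    best = {d: min(r[d] for r in runtimes.values()) for d in devices}
    return {c: {d: best[d] / r[d] for d in devices} for c, r in runtimes.items()}


def most_portable(runtimes):
    """Return the configuration with the highest worst-case efficiency."""
    eff = efficiencies(runtimes)
    return max(eff, key=lambda c: min(eff[c].values()))


# Hypothetical runtimes in milliseconds (illustrative values, not measured data).
runtimes = {
    "wg64_unroll1": {"gpu_a": 2.0, "gpu_b": 4.0, "cpu": 10.0},
    "wg256_unroll4": {"gpu_a": 1.0, "gpu_b": 5.0, "cpu": 40.0},
    "wg128_unroll2": {"gpu_a": 1.25, "gpu_b": 4.0, "cpu": 12.5},
}

print(most_portable(runtimes))
```

With these made-up numbers, the configuration that is fastest on one GPU is four times slower than the best on the CPU, while a compromise configuration stays within 80% of the best on every device. Maximizing the minimum efficiency is only one possible scalarization; a Pareto-front search over per-device runtimes would be another.
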
Original language: English
Title of host publication: Proceedings of the 5th International Workshop on OpenCL, IWOCL 2017
Publisher: Association for Computing Machinery (ACM)
Volume: Part F127755
ISBN (Electronic): 9781450352147
Publication status: Published - 16 May 2017
Event: 5th International Workshop on OpenCL, IWOCL 2017 - Toronto, Canada
Duration: 16 May 2017 to 18 May 2017

Conference

Conference: 5th International Workshop on OpenCL, IWOCL 2017
Country: Canada
City: Toronto
Period: 16/05/17 to 18/05/17

Keywords

  • Auto-Tuning
  • GPGPU
  • OpenCL
  • Performance portability
