Abstract
Recent revisions to the ISO C++ standard have added specifications for parallel algorithms. These additions cover common use cases, including sequence traversal, reduction, and sorting. Many of these are highly applicable in HPC and therefore offer potential gains in both performance and productivity.
This study evaluates the state of the art for implementing heterogeneous HPC applications using the built-in ISO C++17 parallel algorithms. We implement C++17 ports of representative HPC mini-apps that cover both compute-bound and memory bandwidth-bound applications. We then benchmark these ports on CPUs and GPUs, comparing them to other widely available parallel programming models, such as OpenMP, CUDA, and SYCL.
Finally, we show that C++17 parallel algorithms achieve competitive performance across multiple mini-apps on many platforms, with some notable exceptions. We also discuss several key topics, including portability, and describe workarounds for a number of remaining issues, including index-based traversal and accelerator device/memory management.
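To make the index-based traversal issue concrete, the following is a minimal sketch and not code from the study. It assumes one common workaround: materialising an index vector and passing it to `std::for_each`, since C++17 provides no standard counting iterator. The helper names `axpy_indexed` and `dot` are hypothetical, and the execution policy is left as a template parameter so the caller can choose it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <execution>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (not from the paper): y[i] = a * x[i] + y[i],
// traversed by index rather than by element.
template <class Policy>
void axpy_indexed(Policy&& policy, double a,
                  const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    // Assumed workaround: C++17 has no standard counting iterator,
    // so build an explicit list of indices 0..n-1 to iterate over.
    std::vector<std::size_t> idx(x.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    const double* xp = x.data();
    double* yp = y.data();
    // Each index is visited exactly once and touches only element i,
    // so the loop body has no cross-iteration dependencies.
    std::for_each(std::forward<Policy>(policy), idx.begin(), idx.end(),
                  [=](std::size_t i) { yp[i] = a * xp[i] + yp[i]; });
}

// Hypothetical helper: dot product expressed as a C++17 reduction.
template <class Policy>
double dot(Policy&& policy,
           const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    // Multiplies element pairs, then sums the products starting from 0.0.
    return std::transform_reduce(std::forward<Policy>(policy),
                                 x.begin(), x.end(), y.begin(), 0.0);
}
```

A call such as `axpy_indexed(std::execution::seq, 3.0, x, y)` runs sequentially. Passing `std::execution::par_unseq` instead requests parallel execution, which depends on the toolchain and may require a backend library or an offloading compiler. The index vector costs extra memory and bandwidth, which is one reason a counting iterator is often preferred in practice.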
| Original language | English |
|---|---|
| Title of host publication | 2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS) |
| Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
| Pages | 36-47 |
| Number of pages | 12 |
| ISBN (Electronic) | 9781665451857 |
| ISBN (Print) | 9781665451864 |
| Publication status | Published - 30 Jan 2023 |
| Event | 2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), Dallas, United States; Duration: 14 Dec 2022 → …; https://www.dcs.warwick.ac.uk/pmbs/pmbs/PMBS/Welcome.html |
Conference
| Conference | 2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS) |
|---|---|
| Abbreviated title | PMBS22 |
| Country/Territory | United States |
| City | Dallas |
| Period | 14/12/22 → … |
Bibliographical note
Funding Information: This work makes use of the following services: Isambard UK National Tier-2 HPC Service (https://gw4.ac.uk/isambard), operated by GW4 and the UK Met Office and funded by EPSRC (EP/P020224/1); HPC Zoo, a research cluster managed by the HPC Group at the University of Bristol (https://uob-hpc.github.io/zoo); Intel DevCloud, an online cluster for developers (https://devcloud.intel.com); and EC2 Graviton instances, with access supported by AWS.
Publisher Copyright:
© 2022 IEEE.
Keywords
- Performance Portability
- Programming Models
- GPUs
- C++17
- PSTL
- stdpar
Fingerprint
Dive into the research topics of 'Evaluating ISO C++ Parallel Algorithms on Heterogeneous HPC Systems'. Together they form a unique fingerprint.