Abstract
Fortran DO CONCURRENT has emerged as a new
way to achieve parallel execution of loops on CPUs and GPUs.
This paper studies the performance portability of this construct
on a range of processors and compares it with the incumbent
models: OpenMP, OpenACC and CUDA. To make this comparison
fair, we implemented the BabelStream memory bandwidth
benchmark from scratch, entirely in modern Fortran, for all of
the models considered, which include Fortran DO CONCURRENT,
as well as two variants of OpenACC, four variants of OpenMP
(2 CPU and 2 GPU), CUDA Fortran, and both loop- and
array-based references. BabelStream Fortran matches the C++
implementation as closely as possible, and can be used to make
language-based comparisons. This paper represents one of the
first detailed studies of the performance of Fortran support on
heterogeneous architectures; we include results for AArch64 and
x86_64 CPUs as well as AMD, Intel and NVIDIA GPU platforms.
Original language | English |
---|---|
Title of host publication | 2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS) |
Subtitle of host publication | PMBS22 |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Pages | 1-18 |
ISBN (Print) | 978-1-6654-5185-7 |
Publication status | Published - 18 Nov 2022 |
Event | SC 2022 Workshops International Conference for High Performance Computing, Networking, Storage and Analysis - Dallas, United States; Duration: 13 Nov 2022 → 18 Nov 2022 |
Conference
Conference | SC 2022 Workshops International Conference for High Performance Computing, Networking, Storage and Analysis |
---|---|
Country/Territory | United States |
City | Dallas |
Period | 13/11/22 → 18/11/22 |