Abstract
In this paper we evaluate the efficacy of the Arm Scalable Vector Extension (SVE) instruction set for HPC workloads using a set of established mini-apps. Exploiting the vector capabilities of SVE will be a key factor in achieving high performance on upcoming generations of Arm-based processors. SVE is a flexible instruction set, but its design is fundamentally different from other contemporary SIMD extensions, such as AVX or NEON, which could present a challenge to its adoption. We use a selection of mini-apps which covers a wide range of scientific application classes to investigate SVE, using a combination of static and dynamic analysis. We inspect how SVE capabilities are used in the mini-apps’ kernels, as generated by all SVE compilers available at the time of writing, for both arithmetic and memory operations. We compare our findings against similar data gathered on currently available processors. Although the extent to which vector code is generated varies by mini-app, all compilers tested successfully utilise SVE to vectorise more code than they are able to when targeting NEON, Arm’s previous-generation SIMD instruction set. For most mini-apps, we expect performance improvements as SVE width is increased.
Original language | English |
---|---|
Number of pages | 16 |
DOIs | |
Publication status | E-pub ahead of print - 18 Aug 2020 |
Event | Euro-Par: 26th International European Conference on Parallel and Distributed Computing - Online, Warsaw, Poland. Duration: 24 Aug 2020 → 28 Aug 2020. https://2020.euro-par.org/ |
Conference
Conference | Euro-Par: 26th International European Conference on Parallel and Distributed Computing |
---|---|
Abbreviated title | Euro-Par 2020 |
Country/Territory | Poland |
City | Warsaw |
Period | 24/08/20 → 28/08/20 |
Internet address | https://2020.euro-par.org/ |
Keywords
- Instruction sets
- SVE
- Vectorisation
- SIMD
- Data Parallelism