An Improved Parallelism Scheme for Deterministic Discrete Ordinates Transport

Tom Deakin*, Simon McIntosh-Smith, Matt Martineau, Wayne Gaudin

*Corresponding author for this work

Research output: Contribution to journal, Special issue (Academic Journal), peer-review

4 Citations (Scopus)
481 Downloads (Pure)

Abstract

In this paper we demonstrate techniques for increasing the node-level parallelism of a deterministic discrete ordinates neutral particle transport algorithm on a structured mesh to exploit many-core technologies. Transport calculations form a large part of the computational workload of physical simulations, so good performance is vital for the simulations to complete in reasonable time. We demonstrate our approach using the SNAP mini-app, a simplified implementation of the full transport algorithm that remains similar enough to the real algorithm to act as a useful proxy for research purposes.

We present an OpenCL implementation of our improved algorithm which achieves a speedup of up to 2.5× on a many-core GPGPU device compared to a state-of-the-art multi-core node for the transport sweep, and up to 4× compared to the multi-core CPUs in the largest GPU-enabled supercomputer. This is the first time this scale of speedup has been achieved for algorithms of this class. We then discuss ways to express our scheme in OpenMP 4.0 and demonstrate its performance on an Intel Knights Corner Xeon Phi compared to the original scheme.

Original language: English
Pages (from-to): 555-569
Number of pages: 15
Journal: International Journal of High Performance Computing Applications
Volume: 32
Issue number: 4
Early online date: 18 Sep 2016
Publication status: Published, 1 Jul 2018
