Leveraging Many-Core Technology for Deterministic Neutral Particle Transport at Extreme Scale

Student thesis: Doctoral Thesis, Doctor of Philosophy (PhD)


With disruptive changes to supercomputing architecture at the node level, algorithms are required to leverage the increased parallelism and high bandwidth memory technologies of many-core devices. The deterministic discrete ordinates transport equation is an important equation that models the movement and interaction of neutral particles, such as neutrons, through materials. This balance equation counts the loss and gain of particles through changes in direction and energy, and as a result of collisions with material nuclei. The equation can be solved analytically only for the simplest problems, so in practice the solution must be approximated using numerical methods.
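For orientation, one common textbook form of the steady-state balance described above is sketched below. The notation is an assumption and is not taken from the thesis: \psi is the angular flux, \sigma_t and \sigma_s are the total and scattering cross sections, and q is a fixed source.

```latex
% Streaming loss + collision loss = in-scattering gain + source gain
\hat{\Omega} \cdot \nabla \psi(\mathbf{r}, \hat{\Omega}, E)
  + \sigma_t(\mathbf{r}, E)\, \psi(\mathbf{r}, \hat{\Omega}, E)
  = \int_0^\infty \! dE' \int_{4\pi} \! d\hat{\Omega}'\,
      \sigma_s(\mathbf{r}, \hat{\Omega}' \cdot \hat{\Omega}, E' \to E)\,
      \psi(\mathbf{r}, \hat{\Omega}', E')
  + q(\mathbf{r}, \hat{\Omega}, E)
```

The left-hand side collects the losses (particles streaming out of a region and particles removed by collisions), and the right-hand side collects the gains (particles scattered in from other directions and energies, plus the source).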
The Discrete Ordinates (Sn) discretisation used in solving the equation numerically imposes a wavefront dependency across the spatial domain, which limits the concurrency of the algorithm. The sweep and finite difference discretisation of the spatial domain result in complex data reuse patterns. The problem is of high dimensionality, with angular, energy and spatial domains modelled over time. The solution therefore has a high memory footprint, and is often bound by the available memory capacity of supercomputer nodes. All these factors mean that exploiting many-core technology is a significant challenge.
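The wavefront dependency can be illustrated with a minimal sketch, which is not the thesis's implementation. It assumes a 2D structured grid, a single angle travelling in the positive x and y directions, vacuum inflow boundaries, and a simple upwind (step) difference in place of whatever spatial scheme the thesis uses. The names `wavefronts` and `sweep` and all parameters are hypothetical.

```python
def wavefronts(nx, ny):
    """Group the cells of an nx-by-ny grid into wavefronts (anti-diagonals).

    For an angle travelling in +x and +y, cell (i, j) depends on its
    upwind neighbours (i-1, j) and (i, j-1), so all cells with the same
    i + j can be computed concurrently.
    """
    planes = [[] for _ in range(nx + ny - 1)]
    for j in range(ny):
        for i in range(nx):
            planes[i + j].append((i, j))
    return planes


def sweep(nx, ny, source, sigma_t, mu, eta, dx, dy):
    """Sweep one angle (mu, eta > 0) across the grid, wavefront by wavefront.

    Assumes a uniform source and cross section, zero incoming flux, and
    upwind (step) differencing: an illustrative choice only.
    """
    psi = [[0.0] * nx for _ in range(ny)]
    denom = sigma_t + mu / dx + eta / dy
    for plane in wavefronts(nx, ny):
        # Cells within one plane are independent of each other; this inner
        # loop is the part that could run in parallel on a many-core device.
        for i, j in plane:
            left = psi[j][i - 1] if i > 0 else 0.0   # upwind neighbour in x
            below = psi[j - 1][i] if j > 0 else 0.0  # upwind neighbour in y
            psi[j][i] = (source + mu / dx * left + eta / dy * below) / denom
    return psi


if __name__ == "__main__":
    planes = wavefronts(4, 3)
    print("cells per wavefront:", [len(p) for p in planes])
    psi = sweep(4, 3, source=1.0, sigma_t=1.0, mu=0.5, eta=0.5, dx=1.0, dy=1.0)
    print("flux in last cell:", psi[2][3])
```

In this sketch the available concurrency is the number of cells in the current wavefront, which starts at one, grows to at most min(nx, ny), and shrinks back to one. That bound is the concurrency limitation the paragraph above refers to.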
This thesis will investigate solving the transport equation on many-core architectures. Performance models will be developed in order to capture the behaviour of the memory accesses and communication patterns. A GPU implementation of a transport mini-app will be developed using a concurrent scheme which demonstrates for the first time that such devices can be used to provide speedups. The reduction in runtime is in line with the memory bandwidth advantages GPUs have over CPU architectures. Mini-apps will be developed to capture the critical computation in order to examine the solver on cache-based architectures. Finally, a high order discontinuous Galerkin finite element method will be investigated in order to mitigate the memory capacity constraints of high bandwidth memories at an algorithmic level.
Date of Award: 8 May 2018
Original language: English
Awarding Institution
  • The University of Bristol
Supervisor: Simon N Mcintosh-Smith (Supervisor)
