In recent years, the processors underpinning the large, distributed, workhorse computers used to solve the Boltzmann transport equation have become ever more parallel and diverse. Traditional CPU architectures have increased in core count, reduced in clock speed and gained a deep memory hierarchy. Multiple processor vendors offer a collectively diverse range of both CPUs and GPUs, and the fastest machines in the world use an ever-growing variety of many-core architectures. Going forward, this landscape of processor technologies will require our codes to function well across multiple architectures. The ever-increasing range of architectures poses a particular challenge for solving the Boltzmann equation using deterministic methods, and so it is important to characterise the performance of the key algorithms across the processor spectrum. The solution of the transport equation is computationally expensive, and so we require well-optimised and highly parallel solver implementations in order to solve interesting problems quickly. In this work we explore the performance profiles of deterministic SN transport sweeps for both 3D structured (Cartesian) and unstructured (hexahedral) meshes. The study focuses on the computational characteristics that determine the performance a transport solver actually achieves.
|Journal||The Journal of Computational and Theoretical Transport|
|Early online date||7 Jun 2020|
|Publication status||E-pub ahead of print - 7 Jun 2020|