Scalable Applications on Heterogeneous System Architectures: A Systematic Performance Analysis Framework
The efficient parallel execution of scientific applications is a key challenge in high-performance computing (HPC). With growing parallelism and heterogeneity of compute resources as well as increasingly complex software, performance analysis has become an indispensable tool in the development and optimization of parallel programs. Because waiting or idle time can propagate across multiple levels of parallelism, for example from a delayed task on an accelerator via host threads to another compute node, the actual cause of an inefficiency can be difficult to find. This thesis proposes a framework for the systematic performance analysis of scalable, heterogeneous applications that covers process- and thread-level parallelism as well as computation offloading. It addresses two essential aspects that have so far been neglected: potential inefficiencies with computation offloading and generic analyses across programming models. Furthermore, established analyses are combined in such a way that inefficiencies and program regions can be prioritized, enabling a more focused optimization process. The analysis results are presented in the form of a program region profile, a summary of all inefficiencies, and timelines. The core analyses of the implementation are independent of specific application programming interfaces (APIs), and further inefficiencies and wait states can be detected by adding new analysis rules. The framework is applied to synthetic and real-world programs to validate its applicability, correctness, and scalability.
This project is open access and publicly available.
Collections in this community
CASITA is a tool for automatic analysis of OTF2 trace files that have been generated with Score-P. It determines program activities with high impact on the total program runtime and the load balancing. CASITA generates an ...
(Technische Universität Dresden, 2019)