Master Thesis MSTR-2024-29

Bibliography: Imhoff, Nils: Optimization of intra-node communication in HPC systems: development and implementation of a zero-copy API.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 29 (2024).
62 pages, English.
Abstract

The landscape of High Performance Computing (HPC) is dynamic, and intra-node communication efficiency has emerged as a critical factor in system performance. This thesis presents the Zero-Copy Application Programming Interface (ZCom), which uses cross-partition memory (XPMEM) technology to improve data transfer within shared memory environments. ZCom minimizes the data replication usually associated with Message Passing Interface (MPI) operations, which reduces communication overhead and improves computational efficiency. An extensive performance evaluation with microbenchmarks and the MiniGhost benchmark suite shows that ZCom significantly improves communication efficiency compared with other MPI-based approaches, especially in weak and strong scaling scenarios. By facilitating direct memory access among processes, ZCom represents a paradigm shift towards minimized data movement, making it an innovative solution in HPC communications. The potential of ZCom is clear from the observed performance improvements; however, this thesis also identifies a limitation of the current evaluation, which was performed with only a small set of applications and benchmarks. This limitation highlights the need for further studies on how well ZCom generalizes and how it affects various HPC systems and architectures. This work lays a solid foundation for future development aimed at optimizing the performance and scalability of intra-node communication in HPC environments.

Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Scientific Computing
Supervisor(s): Pflüger, Prof. Dirk; Bernreuther, Dr. Martin; Simmendinger, Dr. Christian
Entry date: September 19, 2024