Bibliography | Hou, Jie: Network Performance Improvement of Open MPI on Windows HPC Cluster. University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 3072 (2011). 81 pages, English.
Abstract | In the past decade, rapid and tremendous advances in computer and network design have made it possible to connect thousands of computers into high performance clusters. These clusters are typically used to solve computationally challenging scientific problems in fields such as astrophysics, weather prediction, nanoscience modeling, biological computation, and computational fluid dynamics.
Message Passing Interface (MPI) is currently the de facto standard for writing applications for these clusters. As applications operate on ever larger and more complex data, the clusters themselves keep growing, and the communication system therefore plays a pivotal role in achieving high performance. Relatively new interconnect technologies that deliver very low latency and very high bandwidth are being used in more and more large computing clusters. InfiniBand is a typical example of such an interconnect; it is based on open standards and is gaining rapid acceptance.
Historically, UNIX and its various flavors have been the most widely used operating systems for High Performance Computing (HPC) systems of all sizes, from ordinary clusters to the largest supercomputers. In the past few years, however, Windows has steadily increased its presence as an operating system for clusters, especially in the commercial area. Since the vast majority of HPC applications use MPI as their programming model, running HPC clusters on Windows also requires an efficient MPI implementation. MPI developers are currently working to provide libraries that can be used in Windows environments. Open MPI, for example, has already released a Windows version, but it does not support high speed network devices such as InfiniBand, which is already well supported by the openib Byte Transfer Layer (BTL) component of the UNIX version of Open MPI. Because InfiniBand can bring many advantages to the HPC field, this thesis focuses on developing a BTL component that enables Open MPI to use InfiniBand in Windows environments.
In the initial stage of this work, the openib BTL component was ported to Windows platforms. This component uses the standard interface originally defined by the InfiniBand specification and has been optimized for Linux environments. Its performance on a Windows HPC cluster is better than that of the tcp BTL component of Open MPI, which confirms that InfiniBand can bring performance gains to Open MPI in Windows environments. The WinVerbs driver package is developed and optimized by the OpenFabrics Alliance (OFA) to support Remote Direct Memory Access (RDMA) devices such as InfiniBand in Windows environments. Moreover, it is a lower-level and relatively new driver that promises further gains. This thesis therefore primarily focuses on verifying the use of the WinVerbs driver in Open MPI.
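As a rough illustration of the programming model and the BTL selection discussed above, a minimal MPI "hello world" program in C is sketched below. The file name hello_mpi.c and the specific process count are illustrative assumptions; the functions used are standard MPI calls, and the --mca btl parameter shown in the comments is the usual way to choose a transport component such as openib or tcp at run time in Open MPI.

/* Minimal sketch of the MPI programming model (illustrative, not taken from the thesis).
 * Compile with Open MPI's wrapper compiler:  mpicc hello_mpi.c -o hello_mpi
 * Select a BTL component at run time via MCA parameters, for example:
 *   mpirun --mca btl openib,self -np 4 ./hello_mpi    (InfiniBand via the openib BTL)
 *   mpirun --mca btl tcp,self    -np 4 ./hello_mpi    (Ethernet via the tcp BTL)
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI runtime            */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process             */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes        */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                         /* shut down the MPI runtime        */
    return 0;
}

Regardless of which BTL component carries the traffic, the application code stays the same; only the mpirun MCA parameters change, which is why an InfiniBand-capable BTL on Windows can benefit existing MPI applications without modification.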