An Improved Mechanism to Detect Wireless Network Failures from MPI-2 Programs

E.M. Macías and Á. Suárez (Spain)


MPI-2, Wireless Networks, Unreliable Communications


The current development of Wireless Local Area Networks (WLANs) makes them suitable for extending the Local Area Network (LAN) as a parallel and distributed computing architecture. MPI (Message Passing Interface) is widely used to write portable and efficient message-passing programs. Although MPI can be used to write parallel and distributed programs that run on a heterogeneous LAN and WLAN platform, extra programming effort is required to adapt them efficiently to the unreliable wireless communication channel. Failures of this channel must be taken into account because they can cause abrupt disconnections of some computers in the WLAN, and these disconnections can appear as communication failures in MPI programs. In this paper we present a simple mechanism to detect frequent wireless channel failures. We then improve it to obtain a low-overhead method that shows good experimental results.

