Hands-On Session Messaging Fundamentals #3:
MPI Message Performance

Objective: To better understand message passing performance.
The code for this exercise can be found here. Instructions on how to log into the remote machine and how to download the source code to your working directory (using wget) can be found here.

mpiPingPong.c is a basic ping-pong code. It times each of 100 repetitions of a ping-pong operation on a buffer of 64 ints and prints the minimum and maximum times.
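
The per-repetition timing pattern described above can be sketched without MPI as follows. This is not the contents of mpiPingPong.c; it is a minimal illustration under stated assumptions. The names clock_fn, op_fn, time_each_rep, demo_now and demo_op are hypothetical. In an MPI program, the clock hook would be a thin wrapper around MPI_Wtime() and the operation hook would perform one send/receive round trip.

```c
#include <stdio.h>

/* Hypothetical hooks: in an MPI program, `now` would wrap MPI_Wtime()
 * and `op` would perform one ping-pong round trip. */
typedef double (*clock_fn)(void);
typedef void (*op_fn)(void *ctx);

/* Time each of `reps` operations separately and record the fastest
 * and slowest single measurement. The clock is read twice per repetition. */
static void time_each_rep(clock_fn now, op_fn op, void *ctx, int reps,
                          double *tmin, double *tmax)
{
    *tmin = 0.0;
    *tmax = 0.0;
    for (int i = 0; i < reps; i++) {
        double t0 = now();
        op(ctx);
        double dt = now() - t0;
        if (i == 0 || dt < *tmin) *tmin = dt;
        if (i == 0 || dt > *tmax) *tmax = dt;
    }
}

/* Stand-in clock and operation so the sketch can be tried without MPI:
 * each call to demo_op advances the fake clock by *(double *)ctx seconds. */
static double demo_clock_value = 0.0;
static double demo_now(void) { return demo_clock_value; }
static void demo_op(void *ctx) { demo_clock_value += *(double *)ctx; }
```

Because the clock is read around every single operation, the timer's own overhead and resolution are included in each measurement.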

  1. Run the code and make sure it works. After running it several times, do you observe a potential problem with timing this operation? Hint: the overhead and resolution of MPI_Wtime(), the MPI timing function used by the program, are approximately 1e-06 s and 0.25e-06 s respectively on the current system. Rectify the timing problem (hint: measure the time over all repetitions instead).
  2. Currently the code only does pingpong between processes 0 and 1 for a message containing 64 integers and measures the time using MPI_Wtime(). Modify the code so that it runs with a message length len from 1 to 4*1024*1024 integers in powers of 4 (i.e. 1, 4, 16, 64, 256, 1024, ...). Have the code print out the average time and the corresponding bandwidth (on process 0). As always, test the code interactively first. Are the results what you expected?
  3. What latency did you measure and what peak bandwidth? How does the bandwidth change with message length? Does the latency seem to fit a straight line?
  4. Further modify the code so that it measures the pingpong time between process 0 and all other processes in MPI_COMM_WORLD for messages of 1, 1024 and 1048576 integers.
  5. Run your code on the batch system using 32 CPUs and complete the following table:
    Message size (ints) | Ping-pong time within a node | Ping-pong time between two nodes
    1                   |                              |
    1024                |                              |
    1048576             |                              |
  6. What results did you expect to see? Are the results in line with these expectations? If not, why not?
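
The arithmetic behind tasks 1 to 3 can be sketched in plain C as follows. This is one possible approach, not a reference solution. All function names are hypothetical, and the bandwidth convention (one round trip moves the message twice, so bytes transferred = 2 * len * sizeof(int)) is an assumption; other conventions, such as halving the round-trip time first, give the same figure. As before, the clock hook would be a thin wrapper around MPI_Wtime() and the operation hook one ping-pong round trip.

```c
#include <stddef.h>

/* Hypothetical hooks: `now` would wrap MPI_Wtime(), `op` would do one
 * ping-pong round trip in the real exercise code. */
typedef double (*clock_fn)(void);
typedef void (*op_fn)(void *ctx);

/* Average time per operation, reading the clock only twice in total so
 * that timer overhead and resolution are amortized over all repetitions. */
static double average_time(clock_fn now, op_fn op, void *ctx, int reps)
{
    double t0 = now();
    for (int i = 0; i < reps; i++)
        op(ctx);
    return (now() - t0) / reps;
}

/* Fill `sizes` with 1, 4, 16, ... up to and including `max_len`;
 * returns the number of entries written (at most `capacity`). */
static int message_lengths(size_t max_len, size_t *sizes, int capacity)
{
    int n = 0;
    for (size_t len = 1; len <= max_len && n < capacity; len *= 4)
        sizes[n++] = len;
    return n;
}

/* Bandwidth in bytes per second, assuming one round trip carries the
 * message of `len_ints` integers twice (there and back). */
static double bandwidth_bytes_per_s(size_t len_ints, double round_trip_s)
{
    return 2.0 * (double)len_ints * sizeof(int) / round_trip_s;
}

/* Stand-in clock and operation so the sketch can be tried without MPI:
 * each call to demo_op advances the fake clock by *(double *)ctx seconds. */
static double demo_clock_value = 0.0;
static double demo_now(void) { return demo_clock_value; }
static void demo_op(void *ctx) { demo_clock_value += *(double *)ctx; }
```

With a maximum length of 4*1024*1024, message_lengths produces twelve sizes, from 1 up to 4194304 integers.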