CS698 GPU Cluster Programming MPI+CUDA - Homework 2

Homework 2 on Matrix multiplication with MPI collective communication

This homework is the same as HW1, except that you use only collective communication.

Using the template template.c, write an MPI program to multiply two matrices A and B to produce C, where all matrices are of the same dimension. Use the following variables:

You may use different variables if you prefer, but you are strongly encouraged to use the ones above to keep things simple.

The master randomly generates both A and B. Matrix A is distributed equally among the num_procs processes, while matrix B is broadcast. In other words, each process receives my_work rows of A and all n rows of B.

On each process, multiply its my_work = n/num_procs rows of A by B.
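The per-process computation can be sketched as a plain triple loop over a block of rows. This is an illustrative kernel, not the required implementation; it assumes row-major double matrices in flat arrays, and the name multiply_rows is hypothetical.

```c
#include <stddef.h>

/* Hypothetical kernel: C_rows = A_rows * B, where A_rows and C_rows each
 * hold `rows` rows of n doubles and B is a full n x n matrix.
 * All arrays are assumed to be flat and row-major. */
void multiply_rows(const double *A_rows, const double *B, double *C_rows,
                   int rows, int n)
{
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            /* Dot product of row i of A_rows with column j of B. */
            for (int k = 0; k < n; k++)
                sum += A_rows[(size_t)i * n + k] * B[(size_t)k * n + j];
            C_rows[(size_t)i * n + j] = sum;
        }
    }
}
```

Each process would call this with rows = my_work on its own block. Calling the same routine with rows = n computes the full product, which is one way to obtain the serial reference on the master.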

The master collects my_work rows of C from each of the num_procs processes.

Now, on the master, perform the full n x n matrix multiply serially.

Compare the results of the parallel version with those of the serial version.
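One possible way to carry out the comparison is to count the elements that differ by more than a tolerance. This is a sketch only: the function name count_mismatches and the tolerance parameter are assumptions, and the element type is again assumed to be double. A tolerance of 0.0 demands an exact match.

```c
#include <stddef.h>

/* Hypothetical check: return how many of the n*n elements of the parallel
 * result differ from the serial result by more than tol. */
size_t count_mismatches(const double *C_par, const double *C_ser,
                        int n, double tol)
{
    size_t bad = 0;
    size_t total = (size_t)n * (size_t)n;

    for (size_t i = 0; i < total; i++) {
        double diff = C_par[i] - C_ser[i];
        if (diff < 0.0)
            diff = -diff;
        /* Written as !(diff <= tol) so that a NaN also counts as a mismatch. */
        if (!(diff <= tol))
            bad++;
    }
    return bad;
}
```

A return value of 0 means the two versions agree. If both versions accumulate each dot product in the same order, an exact match is a reasonable expectation; otherwise a small nonzero tolerance may be needed for floating-point data.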

They must match. Otherwise, it is all futile: you wasted a lot of electrons. Do it again until the two match.