Program #6 -- CS 491 



Problem 11.5 from the book, page 288

To simplify your work, I am making the following
reductions to the assignment.


Only the implementation of matrix multiply is required.  No benchmarking.

You can have a full A, B and C defined in each compute node.

Each compute node computes N/P rows of the product, filling in its own N/P rows of C.
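
One way to work out which rows belong to which node is sketched below in Python. This is only an illustration, not the required solution. It assumes N is the matrix dimension and P is the number of compute nodes, and the helper name row_range is made up for this example. It also assumes that when P does not divide N evenly, the leftover rows go to the lowest-numbered nodes; the assignment does not say how to handle that case.

```python
def row_range(rank, n, p):
    """Return the half-open range [start, stop) of rows owned by `rank`.

    Hypothetical helper: n is the matrix dimension, p the number of nodes.
    """
    base, extra = divmod(n, p)
    # Assumption: the first `extra` ranks each take one additional row
    # when p does not divide n evenly.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


n, p = 8, 4
blocks = [row_range(rank, n, p) for rank in range(p)]
print(blocks)  # [(0, 2), (2, 4), (4, 6), (6, 8)]
```

With N = 8 and P = 4 every node gets exactly N/P = 2 rows, which is the simple case the assignment describes.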

The head node reduces the results to a final C.
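
The whole scheme can be sketched serially in Python as a sanity check. This is a simulation under stated assumptions, not an actual parallel program: the P nodes are simulated by a loop, P is assumed to divide N evenly, and the function names matmul_rows and reduce_sum are hypothetical. Each simulated node holds a full C that is zero everywhere except its own rows, so an element-wise sum over all nodes yields the final C. If the assignment is done with MPI, that sum would most likely be a single MPI_Reduce call with the MPI_SUM operation, rooted at the head node.

```python
def matmul_rows(A, B, start, stop):
    """Compute rows [start, stop) of A*B into a full-size C; other rows stay 0."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(start, stop):
        for j in range(n):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(n))
    return C


def reduce_sum(partials):
    """Element-wise sum of the per-node C matrices (stand-in for the reduction)."""
    n = len(partials[0])
    return [[sum(part[i][j] for part in partials) for j in range(n)]
            for i in range(n)]


n, p = 4, 2                       # assumption: p divides n evenly
A = [[i + j for j in range(n)] for i in range(n)]
B = [[2 if i == j else 0 for j in range(n)] for i in range(n)]  # 2 * identity

rows = n // p                     # N/P rows per node
partials = [matmul_rows(A, B, r * rows, (r + 1) * rows) for r in range(p)]
C = reduce_sum(partials)          # what the head node ends up with
```

Because B is twice the identity matrix here, the final C should be every entry of A doubled, which makes the result easy to check by hand.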