Program #6 -- CS 491

Problem 11.5 from the book, page 288.

To simplify your work, I am making the following reductions to the assignment:

Implement only the matrix multiply. No benchmarking.
Each compute node may hold a full copy of A, B and C.
Each compute node computes N/P rows of the product, filling in its N/P rows of C.
The head node reduces the results into a final C.
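
The row split and the final reduction can be sketched without any message-passing library. The sketch below is only an illustration, not the required solution: it simulates the P compute nodes in one process, and it assumes square N x N matrices with P dividing N evenly. The function names (local_rows, node_multiply, head_reduce, simulate) are hypothetical. Each simulated node starts from a zeroed full-size C and fills in only its own rows, so an element-wise sum over all nodes yields the final C. In an MPI program this step would presumably be a sum reduction such as MPI_Reduce with MPI_SUM, but the assignment does not say so.

```python
def local_rows(rank, n, p):
    """Half-open row range [start, stop) owned by `rank` (assumes p divides n)."""
    rows = n // p
    return rank * rows, (rank + 1) * rows


def node_multiply(rank, p, a, b):
    """Work done by one compute node: fill its N/P rows of a zeroed full-size C."""
    n = len(a)
    c = [[0] * n for _ in range(n)]  # every node holds a full C
    start, stop = local_rows(rank, n, p)
    for i in range(start, stop):
        for j in range(n):
            c[i][j] = sum(a[i][k] * b[k][j] for k in range(n))
    return c


def head_reduce(partials):
    """Head node: element-wise sum of the per-node C matrices."""
    n = len(partials[0])
    return [[sum(c[i][j] for c in partials) for j in range(n)] for i in range(n)]


def simulate(a, b, p):
    """Run all p simulated nodes in turn, then reduce on the head node."""
    return head_reduce([node_multiply(rank, p, a, b) for rank in range(p)])


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(simulate(a, b, 2))  # [[19, 22], [43, 50]]
```

Summing works here only because the rows a node does not own stay zero. If the nodes sent just their N/P rows instead, a gather would replace the sum.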