Title: COMP309/509 - Lecture 17
class: middle, center, inverse
COMP309/509 - Parallel and Distributed Computing
Lecture 17 - PVM Matrix Multiplication
By Mitchell Welch
University of New England
Reading
Summary
- Matrix Multiplication on a Ring
- Matrix Example Code
Matrix Multiplication on a Ring
- Today I’ll discuss ring matrix multiplication:
- How it works
- How to test it
- What to be careful about, and
- Some general remarks about PVM hacking.
- Recall that Lecture 9 covered the fundamentals of matrix multiplication.
Matrix Multiplication on a Ring
- In the Row-Column algorithm, process i:
- reads in row i of A
- reads in column i of B
- multiplies the row with the column, computing $c_{i,i}$
- then, in a loop:
- writes its column to the ring
- reads a column from the ring, and
- computes the next entry in its product row, wrapping around at the appropriate point, until each entry $c_{i,j}$ of row i has been computed.
- After $M - 1$ iterations of this loop the computation is complete.
Matrix Multiplication on a Ring
- In the Examples subdirectory you will find prc.
This (executable) program implements the above algorithm.
- We’ll play with it in a moment.
- This row-column algorithm is very efficient once the processors have read in the columns of B.
- However, B is stored in row-major form:
- row one first, followed by row two, followed by row three, etc.
- Reading in a column is very inefficient because of all the seeks:
- each of the M members of the column needs its own seek.
- Whereas reading in a row only requires one seek.
Matrix Multiplication on a Ring
- In the Row-Row algorithm, process i:
- reads in row i of A
- reads in row i of B
- computes the increment this B row makes to each entry in the C row.
- then, in a loop:
- writes its B row to the ring
- reads a B row from the ring, and
- computes the increment this B row makes to each entry in the C row
- After $M - 1$ iterations of this loop the computation is complete.
Matrix Multiplication on a Ring
- Suppose we start with the following arrays:
int A[M+1], B[M+1], C[M+1];
- We’ll ignore the entries A[0], B[0], and C[0] to keep the indexing simple.
The rows, like the processes, will be numbered 1 through M.
- Each process initializes A[] and B[] from the appropriate files.
- The i th process initializes C[] thusly:
for(j = 1; j < M + 1; j++)
C[j] = A[i] * B[j];
- In other words $c_{i,j} = a_{i,i} \times b_{i,j}$.
In the $i$ th process this amounts to row i of B’s contribution to the i th row of the product.
Matrix Multiplication on a Ring
- In the Examples subdirectory you will find prr.
- This (executable) program implements the above algorithm.
- Both prc and prr take four arguments:
Examples>prc
Usage: prc matrixA matrixB matrixC M
Examples>prr
Usage: prr matrixA matrixB matrixC M
Examples>
- matrixA and matrixB will be (files containing binary representations of) square matrices of dimension M
- i.e. they are $M \times M$ matrices
- matrixC will be the name of a file where the answer will be stored.
- Again in binary.
Matrix Multiplication on a Ring
- To test these programs we need:
- To be able to make $M \times M$ matrices.
- To be able to compute or check their product.
- We can do this with just two simple programs:
- One that makes a matrix with random entries, and
- One that makes the identity matrix.
- If B is the identity, then:
- $A \times B = B \times A = A$
- So if we choose A to be a random matrix, the product file C should be identical to the file A.
- Thus we can use the Unix diff to check the answer!
- i.e. diff A C will be a good indication of the correctness of our product!
Matrix Example Code
- OK, so a simple test using these looks like:
turing.une.edu.au.Examples> mkIdentityMatrix I 4
Finished writing I
turing.une.edu.au.Examples> mkRandomMatrix R 4
Finished writing R
turing.une.edu.au.Examples> prc I R IRrc 4
turing.une.edu.au.Examples> diff R IRrc
turing.une.edu.au.Examples>
Matrix Example Code
turing.une.edu.au.Examples> getMatrix R 4
R[1][1] = 1351
R[1][2] = 1386
R[1][3] = 1369
R[1][4] = 969
R[2][1] = 1780
R[2][2] = 1381
R[2][3] = 437
R[2][4] = 165
R[3][1] = 1327
R[3][2] = 981
R[3][3] = 195
R[3][4] = 1376
R[4][1] = 313
R[4][2] = 1283
R[4][3] = 1965
R[4][4] = 2026
Finished reading R
Matrix Example Code
turing.une.edu.au.Examples> getMatrix IRrc 4
IRrc[1][1] = 1351
IRrc[1][2] = 1386
IRrc[1][3] = 1369
IRrc[1][4] = 969
IRrc[2][1] = 1780
IRrc[2][2] = 1381
IRrc[2][3] = 437
IRrc[2][4] = 165
IRrc[3][1] = 1327
IRrc[3][2] = 981
IRrc[3][3] = 195
IRrc[3][4] = 1376
IRrc[4][1] = 313
IRrc[4][2] = 1283
IRrc[4][3] = 1965
IRrc[4][4] = 2026
Finished reading IRrc
turing.une.edu.au.Examples>
Matrix Example Code
- Using a modicum of Perl, we write the script:
test_u_run:
#!/usr/bin/perl -w
print "Enter the size of the matrix ( = nprocs in ring ): ";
my $nprocs = <>;
chomp($nprocs);
`rm -f I R RIrr RIrc IRrr IRrc`;
`./mkRandomMatrix R $nprocs`;
`./mkIdentityMatrix I $nprocs`;
p_check('./prc', 'I', 'R', 'IRrc', 'R', $nprocs);
p_check('./prr', 'I', 'R', 'IRrr', 'R', $nprocs);
p_check('./prc', 'R', 'I', 'RIrc', 'R', $nprocs);
p_check('./prr', 'R', 'I', 'RIrr', 'R', $nprocs);
sub p_check {
my($exe, $a, $b, $c, $r, $nprocs) = @_;
`$exe $a $b $c $nprocs`;
my $diff = `diff $r $c`;
if($diff){
print "$r and $c differ\n";
} else {
print "$r and $c agree\n";
}
}
Debugging
- printf to stderr is your friend.
- The initial task prints to the console.
- All others print to /tmp/pvml.<uid> on the machine where you initiated the computation.
- Learn your uid using the Unix id command.
- Silence from spawned tasks usually means they died from a SIGSEGV (i.e. a segmentation fault).
- Always make sure that stuff in the spawned tasks gets initialized or allocated, just like it does in the parent task. This is a very common mistake.
Debugging
#define VERBOSE 1
...
if(VERBOSE)fprintf(stderr, "Yada Yada Yada\n");
- A VERBOSE macro like this lets you easily generate a version of the program with diagnostic messages switched on or off.
Summary
- Matrix Multiplication on a Ring
- Matrix Example Code
- Debugging
class: middle, center, inverse
Questions?
Reading