COMP309/509 - Parallel and Distributed Computing
Lecture 13 - The Beowulf Cluster and PVM
By Mitchell Welch
University of New England
Reading
Summary
- The Beowulf Cluster
- Parallel Virtual Machine (PVM)
- Firing Up a PVM Program
The Beowulf Cluster
- A variation on the PoPC (Pile of PCs) model:
- A collection of PCs applied in unison to a problem.
Similar to:
- COWs (Cluster of Workstations), and
- NOWs (Networks of Workstations)
- But emphasising:
- Off the shelf components,
- Dedicated processors (as opposed to scavenging cycles), and
- A private SAN (System Area Network)
- Our Beowulf Cluster:
- A virtual 8-node system running on the turing.une.edu.au architecture.
- This configuration is used to provide reliability, not necessarily processing power.
- This system is designed for teaching, not supercomputing.
- Nevertheless, it is still a powerful machine.
- Programming on a Beowulf Cluster is Distributed Computing.
- To be precise: Distributed Memory MIMD Computing
- MIMD = Multiple Instruction Multiple Data
- A program generally consists of a (large) number of autonomous processes or tasks that:
- Communicate with one another via asynchronous message passing
- Coordinate with one another via synchronization primitives
- Crudely put, one subdivides:
- A problem into subproblems and smaller data sets.
- Each subproblem and data set is farmed out as a task.
- The computation then progresses with these tasks:
- communicating their datasets and their results
- possibly spawning new tasks and coordinating their actions and results.
- A Beowulf cluster should be able to solve problems with:
- a speedup approaching 3 orders of magnitude in base 2 (i.e. up to 2^3 = 8 times faster)
- datasets approaching 3 orders of magnitude in base 2 larger (i.e. up to 8 times larger)
than a single CPU (comparable to an individual node) could handle.
- There are two main tools for programming the cluster: PVM (Parallel Virtual Machine) and MPI (Message Passing Interface).
- Both are three-letter acronyms (TLAs)!
- PVM is a product, MPI is a Standard
- Both provide distributed computing.
- PVM is designed for heterogeneous clusters.
- MPI is designed for homogeneous clusters.
- Both can be used, essentially, as any of:
- C libraries,
- Python bindings, or
- Fortran 77 libraries.
Parallel Virtual Machine (PVM)
- PVM offers simple forms of communication with some message ordering guaranteed.
- MPI offers simple and complex forms of communication with no message ordering guaranteed.
- PVM has simple forms of synchronization.
- MPI has simple and complex forms of synchronization.
- In MPI the distinction between synchronization and communication is blurred.
- MPI is a standard; we will meet two versions of it, MPI(1) and MPI(2).
- MPI(1) is strictly SPMD (Single Program, Multiple Data), while PVM can do spawns, i.e. fork-execs.
- MPI(2) can do spawns, i.e. fork-execs.
- We’ll start with PVM and then go on with MPI(1) and (2).
- So far we have played with two concurrency idioms:
- Unix processes that communicate and synchronize mainly via pipes, shared memory, semaphore sets and sometimes via signals.
- POSIX threads that communicate and synchronize via shared memory, mutexes, and condition variables.
- Neither actually provided us with a great deal of true parallelism because of the hardware limitations hereabouts: turing has 16 CPUs and bourbaki has 8.
- We will spend the rest of the unit playing with two more idioms which will provide us with a great deal of parallelism.
- The Parallel Virtual Machine is a system for programming on a distributed network of machines.
- In PVM the world consists of tasks that communicate via message passing.
- This is rather like (but not exactly the same as) sending signals, except that a message carries a much richer collection of data than a single integer.
- Tasks are basically Unix processes, created, possibly on remote machines, via a fork-exec sequence.
- Thus, unlike both processes and threads, they do not share anything of consequence.
- Anything they need must therefore either be obtained by calls to PVM library functions, or else be passed in as arguments to the fork-exec sequence (file names, for example).
- Just as with Unix processes and POSIX threads, we will use PVM through its C library interface.
- We’ll also need to do some PVM daemon initializations, but we’ll come to that soon enough.
Thus all our PVM code will begin with:
#include "pvm3.h"
and will be compiled and linked with the appropriate doohickeys:
gcc -Wall hello.c -L/usr/share/pvm3/lib/LINUXX86_64 -I/usr/share/pvm3/include -lpvm3 -o hello
- The library consists of about 60 functions, and a heap of error codes and symbolic flags.
- The functions can be classified into seven different categories:
- Task identification, creation and destruction.
- System information and configuration.
- Signalling.
- Message creation and initialization.
- Message sending.
- Message receiving.
- Group operations.
- We’ll look at these in detail soon enough.
- For now let’s see if we can get one simple example working.
- If you dare, try installing PVM on your home computer.
- To install, follow the checklist.
- Installing PVM using Norms' How-to-sheet
- PVM is already installed on turing. Do not try to install it there: you don’t have the permissions.
- To use PVM on your Linux machine at home, several things must happen:
Download, then install the rpm:
su # Become the root user
password:
rpm -hiv pvm-3.4.4-16.i386.rpm
exit
- Run this from the directory where your PVM RPM is located (cd there first if necessary).
- The only output from the rpm command should be some hashes:
#####
- There should be no error messages.
- The locations of the files supplied by this rpm are not convenient for software development.
- To fix this, here is a little shell script to create some symbolic links to allow easy compiling and running of PVM programs:
#!/bin/sh
cd /usr/include
ln -s ../share/pvm3/include/pvm3.h
ln -s ../share/pvm3/include/pvmproto.h
ln -s ../share/pvm3/include/pvmtev.h
cd /usr/lib
ln -s ../share/pvm3/lib/LINUX/libpvm3.a
ln -s ../share/pvm3/lib/LINUX/libpvmtrc.a
ln -s ../share/pvm3/lib/LINUX/libgpvm3.a
cd /usr/bin
ln -s ../share/pvm3/lib/LINUX/pvmgs
Copy this to your home system and run it as the root user.
- For PVM to work, the PVM_ROOT environment variable must be set.
- If it is not already set, the best way to set it is to edit your ~/.bashrc file and add the lines:
PVM_ROOT=/usr/share/pvm3/
export PVM_ROOT
- If you use a shell other than bash, set your PVM_ROOT in the appropriate way.
- PVM requires a network to work.
- If you don’t have one then you can make it use your localhost loopback network.
- For that to work, your hostname must appear in the /etc/hosts file on the localhost line.
- For example, if your hostname command gives the output “myplace”, then you should have a line in /etc/hosts that looks something like:
127.0.0.1 localhost myplace
- Compiled programs must be run from a particular directory for PVM to find them.
- On a 64-bit Linux machine, that directory is:
~/pvm3/bin/LINUXX86_64
- Make this directory (~ means your home directory).
- Make sure you have the environment variable PVM_ROOT and the ~/pvm3/bin/LINUXX86_64 directory in place.
- Now you need to know how to start your PVM and compile programs. This should be done on a node of the cluster (like u1).
- To start PVM, type pvm:
my_machine>pvm
pvm> conf
1 host, 1 data format
HOST DTID ARCH SPEED
myplace 40000 LINUXX86_64 1000
pvm> quit
pvmd still running.
my_machine>
- PVM is now running in the background.
Firing Up a PVM Program
- A simple PVM program might consist of a slave and a master program.
- You compile these programs on a node of the cluster (e.g. u1) like this:
gcc -Wall master.c -L/usr/share/pvm3/lib/LINUXX86_64 -I/usr/share/pvm3/include -lpvm3 -o master
gcc -Wall slave.c -L/usr/share/pvm3/lib/LINUXX86_64 -I/usr/share/pvm3/include -lpvm3 -o slave
- Note the libraries we can link with:
- pvm3 is the basic library (linked above with -lpvm3).
- gpvm3 is the group operations library (add -lgpvm3 only if you use group operations).
- The binaries must run from a known directory so:
cp slave ~/pvm3/bin/LINUXX86_64
cp master ~/pvm3/bin/LINUXX86_64
cd ~/pvm3/bin/LINUXX86_64
./master
$ pvm
pvmd already running.
pvm> halt
$
- Here is a quick start guide for using the parallel Beowulf machine, bourbaki:
- First rlogin to bourbaki, then rlogin to a node on the cluster, say rlogin b7.
- Try it now.
- From the machine bourbaki (or any of the nodes), you can rlogin, rsh, etc. to any one of the node machines: b1, b2, b3, …, b8.
- For example, to start your PVM machine (using ssh here):
ssh bourbaki # log on bourbaki.
ssh b1 # Login to one of the nodes.
pvm # Start your pvm machine on 4 systems
add b2
add b3
add b4
quit
- If you have a program that consists of two files, master.c and slave.c, you might build it like this:
# From any machine
cp slave.c master.c $HOME/pvm3/bin/LINUXX86_64
ssh bourbaki
ssh b1
cd pvm3/bin/LINUXX86_64
gcc -Wall master.c -L/usr/share/pvm3/lib/LINUXX86_64 -I/usr/share/pvm3/include -lpvm3 -o master
gcc -Wall slave.c -L/usr/share/pvm3/lib/LINUXX86_64 -I/usr/share/pvm3/include -lpvm3 -o slave
pvm
add b2
add b3
add b4
quit
./master
- Just as with the other idioms, some thread in some process has to start the ball rolling.
- Let’s call this the parent thread.
- The very first thing the parent must do, before it calls any other PVM routine, is enroll with the PVM daemon.
- This is done by calling the routine pvm_mytid.
- It also generates a unique tid (task identifier) for this process.
int pvm_mytid( void )
- Which brings us to the next point.
- Just like processes and threads, tasks have unique identifiers.
- In the case of tasks these are ints.
- We’ll use these for identification purposes, e.g. as addresses when sending messages.
- I know it’s a bit tacky, but let’s do helloworld.c in PVM.
- I’m not sure if the example programs come with the Linux RPM, so you may want to download them to your own machine.
- The hello world example in PVM consists of two programs:
- The parent program hello.c, and
- The child program hello_other.c.
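- The parent, hello.c:
#include <stdio.h>
#include <stdlib.h>
#include "pvm3.h"
int main(void){
    int cc, tid;
    char buf[100];
    /* The first PVM call enrolls this process with pvmd and yields our tid. */
    printf("i'm t%x\n", pvm_mytid());
    /* Ask PVM to start one copy of hello_other; its tid is stored in tid. */
    cc = pvm_spawn("hello_other", NULL, 0, "", 1, &tid);
    if (cc == 1) {
        cc = pvm_recv(-1, -1);              /* block for any message from any task */
        pvm_bufinfo(cc, NULL, NULL, &tid);  /* find out which task sent it */
        pvm_upkstr(buf);                    /* unpack the string from the receive buffer */
        printf("from t%x: %s\n", tid, buf);
    } else
        printf("can't start hello_other\n");
    pvm_exit();                             /* leave the virtual machine */
    exit(EXIT_SUCCESS);
}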
- Let’s look first at the parent.
- Apart from allocating store, the very first thing the parent does is register with the PVM daemon (pvmd).
- It then prints out its newly gained tid.
- Now comes the creation of the child task.
- This is done via the call to pvm_spawn.
int pvm_spawn(char *task,
char **argv,
int flag,
char *where,
int ntask,
int *tids )
- Which could be understood as ntask thinly disguised calls to fork-execv.
- This call tries to create ntask child tasks, all executing task with the given arguments.
- The task character string names the executable file, which must already reside on the host on which it is to be started, in the area
$HOME/pvm3/bin/$PVM_ARCH/
- argv is not quite the usual argument array: it doesn’t include the executable name.
- Thus it is analogous to &argv[1] in a fork-execv.
- Let’s not fret about the flag, nor the where, for now.
- Though there are no prizes for guessing what the where argument is.
- The pvm_spawn call returns the number of tasks successfully created.
- Assuming that the child was successfully spawned, pvm_spawn returns a value of 1.
- The parent then waits for a message from the child.
- So let’s turn our attention to the child:
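- The child, hello_other.c:
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include "pvm3.h"
int main(void){
    int ptid;
    char buf[100];
    ptid = pvm_parent();               /* tid of the task that spawned us */
    /* Step 1: make the data. The prefix is 18 characters long, so a
       hostname of up to 64 bytes still fits in buf. */
    strcpy(buf, "hello, world from ");
    gethostname(buf + strlen(buf), 64);
    pvm_initsend(PvmDataDefault);      /* Step 2: create a fresh send buffer */
    pvm_pkstr(buf);                    /* Step 3: pack the string into it */
    pvm_send(ptid, 1);                 /* Step 4: send to the parent with message tag 1 */
    pvm_exit();                        /* leave the virtual machine */
    exit(EXIT_SUCCESS);
}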
- The first thing the child does is figure out who its parent is.
- It need not enrol with the PVM daemon since the daemon created it, so presumably is aware of its existence.
- It then gets ready to send the parent a message.
- This happens in four stages.
- Step 1:
- We first make the data we wish to send.
- In this case a string of the form
- “hello, world from somemachine.here.edu.au”
- Obviously the child has yet to learn the rudiments of English grammar, but let’s ignore ridiculously placed commas for the time being.
- Step 2:
- It then creates a buffer with which to send the message, by calling
int pvm_initsend( int encoding )
- Step 3:
- It then packs the message into the buffer via:
int pvm_pkstr( char *sp )
- Step 4:
- Finally we send the message on its way:
int pvm_send( int tid, int msgtag )
- Back in the parent, receiving the message also happens in four stages.
- Step 1:
- The parent first waits (i.e. blocks) for a message using
int pvm_recv(int tid, int msgtag)
- If tid and msgtag are both -1, then pvm_recv will accept any message from any task.
- Just like the send, the buffer is not passed into the receive; pvm_recv returns the id of the buffer holding the message.
- Step 2:
- It then checks out the contents of the buffer to make sure it got what it was expecting.
int pvm_bufinfo(int bufid, int *bytes, int *msgtag, int *tid)
- Step 3:
- It then unpacks the buffer using
int pvm_upkstr( char *sp )
- Step 4:
- Finally it does whatever it wanted to do with the data it was sent.
- In this case print it.
- To run this example we do the following.
- We first start the PVM daemon.
- This is done like so:
[comp309@immortal Examples]$ pvm
pvm> conf
conf
1 host, 1 data format
HOST DTID ARCH SPEED DSIG
immortal 40000 LINUXX86_64 1000 0x00408841
pvm> quit
quit
Console: exit handler called
pvmd still running.
- The next step is to compile these.
- This can be done in any number of ways (use the makefile!), but we’ll do it like so:
[comp309@immortal Examples]$ make
gcc -Wall hello.c -L/usr/share/pvm3/lib/LINUXX86_64 -I/usr/share/pvm3/include -lpvm3 -o hello
gcc -Wall hello_other.c -L/usr/share/pvm3/lib/LINUXX86_64 -I/usr/share/pvm3/include -lpvm3 -o hello_other
- Next we copy the executables to the PVM path:
[comp309@turing Examples]$ cp hello hello_other /home/comp309/pvm3/bin/LINUXX86_64/
Now we can run them.
[comp309@turing Examples]$ ./hello
i'm t40002
from t40003: hello, world from immortal
- Now we can shut up shop, and finish for the day.
[comp309@turing Examples]$ pvm
pvmd already running.
pvm> halt
halt
Terminated
- Now we enter the world of the cluster, and start up PVM, and happily add some nodes:
[comp309@bourbaki comp309] rlogin b1
[comp309@b1 comp309] pvm
pvm> add b2
add b2
1 successful
HOST DTID
b2 80000
pvm> add b3
add b3
1 successful
HOST DTID
b3 c0000
pvm> add b4 b5 b6 b7 b8
5 successful
HOST DTID
b4 100000
b5 140000
b6 180000
b7 1c0000
b8 200000
- So now we can go to the appropriate directory and start the b’s a rolling:
[comp309@b1 comp309] cd Lectures/Lecture_13/Examples/
b1.Examples> ./hello
i'm t40002
from t80001: hello, world from b2
b1.Examples> pvm
pvmd already running.
pvm> halt
halt
Terminated
- Rather than typing all the machine names to the pvm console on b1, we can put a file called, say, setup, in the appropriate area (the $HOME/pvm3/bin/LINUXX86_64 area) like so:
b1
b2
b3
b4
b5
b6
b7
b8
Then do something like:
b1 % pvm setup
pvm> conf
8 hosts, 1 data format
HOST DTID ARCH SPEED
b1 40000 LINUXX86_64 1000
b2 80000 LINUXX86_64 1000
b3 c0000 LINUXX86_64 1000
b4 100000 LINUXX86_64 1000
b5 140000 LINUXX86_64 1000
b6 180000 LINUXX86_64 1000
b7 1c0000 LINUXX86_64 1000
b8 200000 LINUXX86_64 1000
pvm> quit
pvmd still running.
b1 %
- The makefile used to build and install the examples (it assumes PVM_ROOT and PVM_ARCH are set in your environment):
COMPILER = gcc
CFLAGS = -Wall
LIBS = -L$(PVM_ROOT)/lib/$(PVM_ARCH) -I$(PVM_ROOT)/include -lpvm3
GLIBS = ${LIBS} -lgpvm3
EXES = hello hello_other
all: ${EXES}
hello: hello.c
	${COMPILER} ${CFLAGS} hello.c $(LIBS) -o hello
hello_other: hello_other.c
	${COMPILER} ${CFLAGS} hello_other.c $(LIBS) -o hello_other
host:
	cp -f ${EXES} ${HOME}/pvm3/bin/$(PVM_ARCH)/
clean:
	rm -f *~ *.o ${EXES}
Summary
- The Beowulf Cluster
- Parallel Virtual Machine (PVM)
- Firing Up a PVM Program
Questions?