Version 12 (modified by steve, 12 years ago)

--

INSTALLING anuga_parallel

First, install the most up-to-date version of the code. Follow the instructions to install Anuga on Ubuntu. This should download the anuga code (along with the parallel code) to a directory of the form

/home/username/anuga_core

where username is of course your username on your machine.

Make sure you have set up your PYTHONPATH to point to the location of the source directory.

For instance, I have the following line in my .bashrc file:

export PYTHONPATH=/home/username/anuga_core/source
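
A quick way to confirm that the setting took effect is to check it from Python. The following is only a small sketch: the helper name on_pythonpath is made up for this illustration, and the path is the example one used above.

```python
import os


def on_pythonpath(directory, pythonpath=None):
    """Return True if `directory` is one of the entries in PYTHONPATH."""
    if pythonpath is None:
        # Fall back to the value in the current environment.
        pythonpath = os.environ.get("PYTHONPATH", "")
    wanted = os.path.normpath(directory)
    entries = [os.path.normpath(p) for p in pythonpath.split(os.pathsep) if p]
    return wanted in entries


if __name__ == "__main__":
    # Example path from the text; replace username with your own.
    print(on_pythonpath("/home/username/anuga_core/source"))
```

If this prints False in a new terminal, the export line has probably not been picked up by the shell yet.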

At this stage you should have a working version of the sequential anuga program, i.e. you should be able to run the command

python test_all.py

from the anuga_core directory and have your installation pass all the tests (well, nearly all, as this is the development version and there are sometimes a few tests that fail).

anuga_parallel

Now to get anuga_parallel to work, we need to install some other packages.

MPI

You need to install MPI on your system. OPENMPI and MPICH2 are both supported by pypar (see below), so either should be OK, but I tend to use mpich2.

Install mpich2 on your system via apt-get:

sudo apt-get install mpich2

Make sure MPI works. You should be able to run a program in parallel. Try something as simple as

mpirun -np 4 pwd

This should print the output of pwd 4 times.
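
The same check can be scripted. The sketch below assumes Python 2.7 or later and that mpirun runs the system pwd program, which prints the same path as os.getcwd(). The function names are made up for this illustration.

```python
import os
import subprocess


def count_matching_lines(output, expected):
    """Count the lines of `output` that equal `expected` after stripping."""
    return sum(1 for line in output.splitlines() if line.strip() == expected)


def check_mpirun(nprocs=4):
    """Run `mpirun -np nprocs pwd`; True if pwd was printed nprocs times."""
    try:
        out = subprocess.check_output(["mpirun", "-np", str(nprocs), "pwd"])
    except (OSError, subprocess.CalledProcessError):
        # mpirun is missing from the PATH, or it exited with an error.
        return False
    return count_matching_lines(out.decode(), os.getcwd()) == nprocs
```

Calling check_mpirun(4) from the directory where you would run mpirun by hand should return True on a working installation.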

PYPAR

We use pypar as the interface between mpi and python. The most recent version of PYPAR is available from http://code.google.com/p/pypar/

(There is an old version on sourceforge at http://sourceforge.net/projects/pypar/; don't use that.)

Install pypar following the instructions in the download. You should be able to use the standard command

python setup.py install

or maybe

sudo python setup.py install

from the source directory in the pypar distribution.

Fire up python and check that you can import pypar.

Make sure the pypar examples work.
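
If you want a very small test of your own, something like the following sketch can be saved to a file (the file name is up to you) and run under mpirun. It only uses pypar.rank(), pypar.size(), pypar.get_processor_name() and pypar.finalize(). The describe helper is made up for this illustration.

```python
def describe(rank, size, node):
    """Format a one-line report for a single processor."""
    return "I am processor %d of %d on node %s" % (rank, size, node)


if __name__ == "__main__":
    try:
        import pypar
    except (ImportError, OSError):
        # OSError covers the case where the MPI shared library fails to load.
        print("pypar could not be imported; check the installation")
    else:
        print(describe(pypar.rank(), pypar.size(),
                       pypar.get_processor_name()))
        pypar.finalize()
```

Run with mpirun -np 4, each processor should print one line with its own rank.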

Problem with Installation

We have been seeing the following error when trying to import pypar.

>>> import pypar
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pypar.py", line 863, in <module>
    mpi = CDLL('libmpi.so.0', RTLD_GLOBAL)
  File "/usr/lib/python2.6/ctypes/__init__.py", line 353, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libmpi.so.0: cannot open shared object file: No such file or directory
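
The traceback shows pypar failing to load libmpi.so.0 through ctypes. One way to narrow this down, offered as a diagnostic sketch rather than a fix, is to try the same load outside pypar and ask the system what it can find under the generic name "mpi". The can_load helper is made up for this illustration.

```python
import ctypes
import ctypes.util


def can_load(library_name):
    """Return True if the dynamic loader can open `library_name`."""
    try:
        ctypes.CDLL(library_name, ctypes.RTLD_GLOBAL)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # The exact name pypar asked for in the traceback.
    print("libmpi.so.0 loadable: %s" % can_load("libmpi.so.0"))
    # What the system can find for the generic name "mpi", or None.
    print("find_library('mpi'): %s" % ctypes.util.find_library("mpi"))
```

If the first line prints False, the loader cannot find that exact file name, for example because a different MPI implementation or library version is installed.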

PYMETIS

In the anuga_parallel directory there is a subdirectory pymetis.

Follow the instructions in the README to install; essentially, just run make.

If you have a 64-bit machine, run

make COPTIONS="-fPIC"

From the pymetis directory, test using test_all.py, i.e.

python test_all.py

ANUGA_PARALLEL

You should now be ready to run some parallel anuga code. Go back to the anuga_parallel directory and run test_all.py.

Hopefully that all works.

Example program

Try running run_parallel_sw_merimbula.py.

First, just run it as a sequential program, via

python run_parallel_sw_merimbula.py

Then try a parallel run using a command like

mpirun -np 4 python run_parallel_sw_merimbula.py

That should run on 4 processors.

You should look at the code in run_parallel_sw_merimbula.py.

It is essentially a fairly standard example, with the extra command

domain = distribute(domain)

which partitions the domain among the processors and sets up the parallel run.

Also, for efficiency reasons, we only set up the original full sequential mesh on processor 0, hence the statement

if myid == 0:
    domain = create_domain_from_file(mesh_filename)
    domain.set_quantity('stage', Set_Stage(x0, x1, 2.0))
else:
    domain = None

The output will be one sww file per processor.

There is a script anuga/utilities/sww_merge.py which provides a function to merge sww files into one sww file for viewing with the anuga viewer.

Suppose your parallel code produced 3 sww files: domain_P3_0.sww, domain_P3_1.sww and domain_P3_2.sww.

The base name would be "domain" and the number of processors would be 3. To stitch these 3 files together, either run sww_merge.py as a script with the command

python /dir/to/anuga/utilities/sww_merge.py -f domain -np 3

or add the following command at the end of your simulation script

domain.sww_merge()
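
Before merging, it can be handy to confirm that every per-processor file is actually present. The sketch below assumes the naming pattern inferred from the three file names above (base name, then _P, the number of processors, then the processor id). The function names are made up for this illustration and are not part of anuga.

```python
import os


def expected_sww_files(base, nprocs):
    """List the per-processor sww file names for a run on nprocs processors."""
    return ["%s_P%d_%d.sww" % (base, nprocs, i) for i in range(nprocs)]


def missing_sww_files(base, nprocs, directory="."):
    """Return the expected sww files that do not exist in `directory`."""
    return [name for name in expected_sww_files(base, nprocs)
            if not os.path.exists(os.path.join(directory, name))]
```

For the example above, expected_sww_files("domain", 3) gives the three file names listed, and an empty list from missing_sww_files("domain", 3) means all of them were written.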