Version 16 (modified by steve, 12 years ago)
INSTALLING anuga_parallel
First you should install the most up-to-date version of the code. Follow the instructions to install ANUGA on Ubuntu. This should download the anuga code (along with the parallel code) to a directory of the form
/home/username/anuga_core
where username is of course your username on your machine.
Make sure you have set up your PYTHONPATH to point to the location of the source directory.
For instance, I have the following line in my .bashrc file
export PYTHONPATH=/home/username/anuga_core/source
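A quick way to confirm that the variable is set as intended is to inspect it from Python. The following is a minimal sketch; the helper name on_pythonpath is hypothetical and not part of anuga.

```python
import os

def on_pythonpath(directory, pythonpath=None):
    """Return True if `directory` is one of the entries in PYTHONPATH.

    `pythonpath` defaults to the PYTHONPATH environment variable of the
    current shell. This helper is hypothetical, not part of anuga.
    """
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")
    target = os.path.normpath(directory)
    # Skip empty entries and normalise away trailing slashes.
    entries = [os.path.normpath(p) for p in pythonpath.split(os.pathsep) if p]
    return target in entries

if __name__ == "__main__":
    # Replace username with your own username, as above.
    print(on_pythonpath("/home/username/anuga_core/source"))
```

If this prints False in a new terminal, the export line has probably not been picked up from .bashrc yet.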
At this stage you should have a working version of the sequential anuga program, i.e. you should be able to run the command
python test_all.py
from the anuga_core directory and have your installation pass all the unit tests (well nearly all, as this is the development version and there are sometimes a few unit tests that fail).
anuga_parallel
Now to get anuga_parallel to work, we need to install some other packages.
MPI
Now you need to install MPI on your system. OpenMPI and MPICH2 are both supported by pypar (see below), so either should be OK, but I tend to use mpich2.
So install mpich2 on your system via apt-get
sudo apt-get install mpich2
Make sure MPI works. You should be able to run a program in parallel. Try something as simple as
mpirun -np 4 pwd
This should print the output of pwd 4 times.
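The same check can be scripted. The sketch below counts the lines printed by the mpirun command above; the function names are hypothetical, and it assumes mpirun is on your PATH.

```python
import subprocess

def count_outputs(output):
    """Count the non-empty lines in the text printed by an MPI run."""
    return len([line for line in output.splitlines() if line.strip()])

def mpi_runs_n_copies(nprocs=4):
    """Run `mpirun -np nprocs pwd` and check that pwd printed nprocs times.

    Raises an error if mpirun is not installed or exits with a failure.
    """
    output = subprocess.check_output(
        ["mpirun", "-np", str(nprocs), "pwd"], universal_newlines=True)
    return count_outputs(output) == nprocs
```

Calling mpi_runs_n_copies(4) from an interactive python session should return True once MPI is installed correctly.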
PYPAR
We use pypar as the interface between mpi and python. The most recent version of PYPAR is available from http://code.google.com/p/pypar/
(There is an old version on SourceForge; don't use that.)
Install pypar following the instructions in the download. You should be able to use the standard command
python setup.py install
or maybe
sudo python setup.py install
from the source directory in the pypar distribution.
Fire up python and see if you can import pypar
You should obtain
>>> import pypar
Pypar (version 2.1.4) initialised MPI OK with 1 processors
Make sure the pypar examples work
Problem with pypar Installation
We have been seeing the following error when trying to import pypar.
>>> import pypar
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pypar.py", line 863, in <module>
    mpi = CDLL('libmpi.so.0', RTLD_GLOBAL)
  File "/usr/lib/python2.6/ctypes/__init__.py", line 353, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libmpi.so.0: cannot open shared object file: No such file or directory
For us this was caused by the absence of a file libmpi.so.0 in the /usr/lib directory. This is really a pypar bug, but as a workaround we created a link named /usr/lib/libmpi.so.0 that points to the existing file /usr/lib/libmpi.so via the command
sudo ln /usr/lib/libmpi.so /usr/lib/libmpi.so.0
Then import pypar should produce the following
>>> import pypar
Pypar (version 2.1.4) initialised MPI OK with 1 processors
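If you are unsure whether this problem applies to your machine, you can check which of the two library files exist. This is a minimal sketch; the helper missing_libraries is hypothetical, and the default directory /usr/lib is taken from the error described above and may differ on your system.

```python
import os

def missing_libraries(names, libdir="/usr/lib"):
    """Return the file names from `names` that do not exist in `libdir`."""
    return [name for name in names
            if not os.path.exists(os.path.join(libdir, name))]

if __name__ == "__main__":
    # An empty list means both files are present.
    print(missing_libraries(["libmpi.so", "libmpi.so.0"]))
```

If only libmpi.so.0 is reported as missing, the link workaround above applies. If libmpi.so is missing as well, the MPI library is probably installed in a different directory.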
PYMETIS
In the anuga_parallel directory there is a subdirectory pymetis.
Follow the instructions in README to install. Essentially just run make.
If you have a 64-bit machine, run
make COPTIONS="-fPIC"
From the pymetis directory, test using test_all.py, i.e.
python test_all.py
ANUGA_PARALLEL
You should now be ready to run some parallel anuga code. Go back to the anuga_parallel directory and run test_all.py.
Hopefully that all works.
Example program
Run run_parallel_sw_merimbula.py
First just run it as a sequential program, via
python run_parallel_sw_merimbula.py
Then try a parallel run using a command like
mpirun -np 4 python run_parallel_sw_merimbula.py
That should run on 4 processors.
You should look at the code in run_parallel_sw_merimbula.py.
It is essentially a fairly standard example, with the extra command
domain = distribute(domain)
which sets up all the parallel stuff.
Also, for efficiency reasons we only set up the original full sequential mesh on processor 0, hence the statement
if myid == 0:
    domain = create_domain_from_file(mesh_filename)
    domain.set_quantity('stage', Set_Stage(x0, x1, 2.0))
else:
    domain = None
The output will be one sww file for each processor.
There is a script anuga/utilities/sww_merge.py which provides a function to merge sww files into one sww file for viewing with the anuga viewer.
Suppose your parallel code produced 3 sww files: domain_P3_0.sww, domain_P3_1.sww and domain_P3_2.sww.
The base name would be "domain" and the number of processors would be 3. To stitch these 3 files together, either run sww_merge.py as a script with the command
python /dir/to/anuga/utilities/sww_merge.py -f domain -np 3
or add the following command at the end of your simulation script
domain.sww_merge()
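Before merging, it can be useful to confirm that every per-processor file is present. The sketch below builds the expected file names from the base name and the number of processors, assuming the naming pattern of the example above (base_Pn_i.sww). Both helper functions are hypothetical and not part of anuga.

```python
import os

def parallel_sww_names(base_name, nprocs):
    """List the sww file names expected from a run on `nprocs` processors."""
    return ["%s_P%d_%d.sww" % (base_name, nprocs, rank)
            for rank in range(nprocs)]

def missing_sww_files(base_name, nprocs, directory="."):
    """Return the expected sww files that are not present in `directory`."""
    return [name for name in parallel_sww_names(base_name, nprocs)
            if not os.path.isfile(os.path.join(directory, name))]

if __name__ == "__main__":
    print(parallel_sww_names("domain", 3))
```

For the example above, parallel_sww_names("domain", 3) gives the three file names listed earlier, and missing_sww_files("domain", 3) returns an empty list when all of them are in the current directory.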