Opened 17 years ago
Closed 14 years ago
#256 closed defect (fixed)
Is parallel anuga working?
| Reported by: | nick | Owned by: | hudson |
|---|---|---|---|
| Priority: | normal | Milestone: | ANUGA enhancements |
| Component: | Functionality and features | Version: | |
| Severity: | minor | Keywords: | |
| Cc: | | | |
Description
I have run a couple of tests with parallel anuga lately and none have worked completely.
Most recently I ran the Broome scenario on 8 nodes and it stalled during the evolution: J:\inundation\data\western_australia\broome_tsunami_scenario_2006\anuga\outputs\20080308_002224_run_trial_4.9_dampier_nbartzis
Can someone please run this model and check whether it works, and that it is not just a simple mistake I have made...
Note: could it have anything to do with 'bypass = True' being commented out on line 180 in pypar_broadcast and pypar.reduce inside parallel_shallow_water?
Cheers Nick
Change History (7)
comment:1 Changed 17 years ago by ole
- Owner changed from ole to steve
comment:2 Changed 17 years ago by ole
Stephen got parallel_advection working and will try to run the ANU test for parallel shallow water to see if the problem is obvious.
comment:3 Changed 15 years ago by ole
- Milestone set to ANUGA enhancements
comment:4 Changed 15 years ago by hudson
- Owner changed from steve to hudson
comment:5 Changed 14 years ago by hudson
The "new" pymetis fails to install without hacking, although I managed to coax the old one in the ANUGA repository into installing properly. I got a couple of tests running with some errors. pymetis seems to be poorly maintained, and it should be easy to write our own code to subdivide a mesh and refactor the parallel code.
comment:6 Changed 14 years ago by steve
I just recompiled pymetis on my new Linux (64-bit) machine. I ran into a problem with metis defining log2; I gave it a new name, ilog2, and that seemed to fix it. Parallel anuga is working for me.
By the way, we should stick with metis as it provides a good partitioning of the mesh.
Cheers Steve
comment:7 Changed 14 years ago by hudson
- Resolution set to fixed
- Status changed from new to closed
OK, it seems to be running OK for two people - I'll close this task. Please create a new ticket for specific bugs.
I tried run_okushiri_parallel yesterday and it is clear that it doesn't work anymore. Some processes charge ahead without synchronising. It could have to do with the way we bypass computations for dry cells, or with the Runge-Kutta timesteps.
It would be good to run the simple examples the ANU used when developing the parallel capability and see if that shows the problem.
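For reference, a minimal sketch of the synchronisation pattern mentioned above, using mpi4py rather than pypar and not ANUGA's actual code: every process must take part in reducing its local timestep to a global minimum on every evolve step, otherwise some ranks step ahead with a different dt.

```python
# Illustrative only: global timestep agreement across all processes.
from mpi4py import MPI

comm = MPI.COMM_WORLD

def global_timestep(local_dt):
    """Reduce the locally computed timestep to the global minimum.

    Every rank must call this every step; skipping the call on some
    ranks (e.g. when all their cells are dry) would leave processes
    stepping with different dt and break synchronisation.
    """
    return comm.allreduce(local_dt, op=MPI.MIN)

if __name__ == '__main__':
    # Pretend each rank computed a different stable timestep locally.
    local_dt = 0.1 / (comm.Get_rank() + 1)
    dt = global_timestep(local_dt)
    print('rank %d: local dt=%.4f, global dt=%.4f'
          % (comm.Get_rank(), local_dt, dt))
```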