Opened 17 years ago
Closed 15 years ago
#256 closed defect (fixed)
Is parallel anuga working?
Reported by: | nick | Owned by: | James Hudson |
---|---|---|---|
Priority: | normal | Milestone: | ANUGA enhancements |
Component: | Functionality and features | Version: | |
Severity: | minor | Keywords: | |
Cc: |
Description
I have run a couple of tests with parallel ANUGA lately and none have worked completely.
Most recently I ran this Broome scenario on 8 nodes and it stalled in the evolution: J:\inundation\data\western_australia\broome_tsunami_scenario_2006\anuga\outputs\20080308_002224_run_trial_4.9_dampier_nbartzis
Can someone please run this model and see if it works, so I know it is not just a simple mistake I have made?
Note: could it have anything to do with 'bypass = True' being commented out on 180 in pypar_broadcast and pypar.reduce inside parallel_shallow_water?
Cheers Nick
Change History (7)
comment:1 Changed 17 years ago by
Owner: | changed from ole to steve |
---|
comment:2 Changed 17 years ago by
Stephen got parallel_advection to go and will try to run the ANU test for parallel shallow water and see if the problem is obvious.
comment:3 Changed 16 years ago by
Milestone: | → ANUGA enhancements |
---|
comment:4 Changed 15 years ago by
Owner: | changed from steve to James Hudson |
---|
comment:5 Changed 15 years ago by
The "new" pymetis fails to install without hacking, although I managed to coax the old one in the ANUGA repository into installing properly. I got a couple of tests running with some errors. pymetis seems to be poorly maintained, and it should be easy to write our own code to subdivide a mesh and refactor the parallel code.
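For reference, here is a minimal sketch of what a home-grown subdivision could look like. It is not ANUGA code and not what metis does: it simply sorts triangles by centroid x-coordinate and cuts them into strips of near-equal size. The function name and the vertex/triangle layout are assumptions made for illustration, and the sketch makes no attempt to minimise the number of edges shared between partitions.

```python
def partition_by_centroid(vertices, triangles, n_parts):
    """Split triangle indices into n_parts strips of near-equal size.

    Hypothetical helper for illustration only.
    vertices  : list of (x, y) tuples
    triangles : list of (i, j, k) vertex-index tuples
    """
    def centroid_x(tri):
        return sum(vertices[i][0] for i in tri) / 3.0

    # Order triangle indices from left to right by centroid.
    order = sorted(range(len(triangles)),
                   key=lambda t: centroid_x(triangles[t]))

    # Hand out triangles so partition sizes differ by at most one.
    base, extra = divmod(len(order), n_parts)
    parts, start = [], 0
    for p in range(n_parts):
        size = base + (1 if p < extra else 0)
        parts.append(order[start:start + size])
        start += size
    return parts


# Tiny 2 x 1 rectangle made of four triangles.
vertices = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
triangles = [(0, 1, 3), (1, 4, 3), (1, 2, 4), (2, 5, 4)]
parts = partition_by_centroid(vertices, triangles, 2)
```

A strip cut like this balances triangle counts but can leave long partition boundaries, which means more communication between processes during evolution.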
comment:6 Changed 15 years ago by
I just recompiled pymetis on my new Linux (64-bit) machine. I did run into a problem with metis defining log2. I gave it a new name, ilog2, and that seemed to fix it. Parallel ANUGA is working for me.
By the way, we should stick with metis as it provides a good partitioning of the mesh.
Cheers Steve
comment:7 Changed 15 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
OK, it seems to be running OK for 2 people - I'll close this task. Please create a new one for specific bugs.
Tried run_okushiri_parallel yesterday and it is clear that it doesn't work anymore. Some processes charge ahead without synchronising. It could have to do with the way we bypass computations for dry cells or the Runge-Kutta timesteps.
It would be good to run the simple examples the ANU used when developing the parallel capability and see if that shows the problem.
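To illustrate the kind of synchronisation that could be going wrong, here is a small sketch in plain Python with no MPI. It is not taken from parallel_shallow_water; the function names, the CFL formula and the dry-cell threshold are all assumptions. The idea is that each process computes its own stable timestep, and every process, including one whose cells are all dry, must take part in the same global minimum so that all clocks advance together.

```python
def local_timestep(depths, cell_size=1.0, g=9.8, cfl=1.0, dry_depth=1e-3):
    """CFL-limited timestep for one process (illustrative formula only).

    Dry cells are bypassed and impose no limit, so a fully dry
    subdomain returns infinity rather than skipping the reduction.
    """
    wet = [h for h in depths if h > dry_depth]
    if not wet:
        return float("inf")
    max_speed = max((g * h) ** 0.5 for h in wet)
    return cfl * cell_size / max_speed


def global_timestep(local_steps):
    """Stand-in for an all-reduce with MIN across processes."""
    return min(local_steps)


def evolve(subdomains, final_time):
    """Advance every simulated process with the same global timestep."""
    times = [0.0] * len(subdomains)
    while min(times) < final_time:
        # Every process contributes, even if all of its cells are dry.
        dt = global_timestep([local_timestep(d) for d in subdomains])
        remaining = final_time - times[0]
        if dt >= remaining:
            times = [final_time] * len(times)
        else:
            times = [t + dt for t in times]
    return times


# Three simulated processes; the second one is completely dry.
subdomains = [[2.0, 1.0], [0.0, 0.0], [0.5]]
times = evolve(subdomains, 1.0)
```

If a process with only dry cells skipped the reduction instead of contributing infinity, the remaining processes would either block waiting for it or agree on a timestep without it, which would look like some processes charging ahead.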