Opened 18 years ago

Last modified 18 years ago

#165 closed defect

problem with C code in fit interpolate — at Version 4

Reported by: nick
Owned by: duncan
Priority: high
Milestone:
Component: Functionality and features
Version:
Severity: normal
Keywords:
Cc:

Description (last modified by nick)

Can you provide any information about this error? This is the third time it has occurred with this code.

-----------------------
*** glibc detected *** double free or corruption (!prev): 0x000000000162fc10 ***


One of the processes started by mpirun has exited with a nonzero exit
code.  This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 8578 failed on node n10 (192.168.0.241) due to signal 6.
-----------------------------------------------------------------------------
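For anyone triaging the log above: on Linux, signal 6 is SIGABRT, which is what glibc raises via abort() when it detects heap corruption such as a double free. The following is a minimal sketch, not part of ANUGA, that pulls the PID, node and signal name out of an mpirun failure line. The function name `describe_failure` and the regular expression are assumptions made for illustration; they are based only on the single failure line quoted in this ticket, and other mpirun versions may word it differently.

```python
import re
import signal

# Pattern modelled on the one mpirun failure line quoted in this ticket.
# Assumption: other failure lines follow the same wording.
FAILURE_RE = re.compile(
    r"PID (?P<pid>\d+) failed on node (?P<node>\S+) \((?P<ip>[\d.]+)\) "
    r"due to signal (?P<signo>\d+)"
)


def describe_failure(line):
    """Return (pid, node, signal name) for an mpirun failure line, or None."""
    match = FAILURE_RE.search(line)
    if match is None:
        return None
    signo = int(match.group("signo"))
    try:
        # Map the number to a symbolic name known on this platform.
        name = signal.Signals(signo).name
    except ValueError:
        # Unknown signal number on this platform: fall back to the raw value.
        name = "signal %d" % signo
    return int(match.group("pid")), match.group("node"), name


if __name__ == "__main__":
    log_line = "PID 8578 failed on node n10 (192.168.0.241) due to signal 6."
    print(describe_failure(log_line))
```

Running this over the combined output of all processes would show whether every failing rank aborts with the same signal, which helps separate the glibc abort from unrelated nonzero exits.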

It occurs when running an Exmouth model in parallel (the sequential run works).

This is the directory of the parallel run: J:\inundation\data\western_australia\exmouth_tsunami_scenario\anuga\outputs\20070525_034301_run_basic_1.4_nbartzis

This is the directory of the last serial run: J:\inundation\data\western_australia\exmouth_tsunami_scenario\anuga\outputs\20070525_033245_run_basic_1.4_nbartzis

+-------------------------------------------------------------
| Fri May 25 15:02:22 2007. Evaluating function _file_function
+-------------------------------------------------------------
| Argument:     '/d/xrd/gem/5/nhi/inundation/data/western_australia/exmouth_tsunami_scenario/anuga/boundaries/urs/dampier/1_10000/exmouth_3854_17042007.sww'
| Keyword Args: {'quantities': ['stage', 'xmomentum', 'ymomentum'], 'interpolation_points': Array: (187, 2), 'time_thinning': 12, 'domain_starttime': 0, 'verbose': True}
| Reason:       No cached result
+-------------------------------------------------------------

Reading /d/xrd/gem/5/nhi/inundation/data/western_australia/exmouth_tsunami_scenario/anuga/boundaries/urs/dampier/1_10000/exmouth_3854_17042007.sww
File_function data obtained from: /d/xrd/gem/5/nhi/inundation/data/western_australia/exmouth_tsunami_scenario/anuga/boundaries/urs/dampier/1_10000/exmouth_3854_17042007.sww
  References:
    Lower left corner: [189945.759564, 7488724.335279]
    Start time:   4000.000000
Building interpolation matrix from source mesh (648 vertices, 1086 triangles)
FitInterpolate: Building mesh
FitInterpolate: Building quad tree
Interpolating (187 interpolation points, 1034 timesteps). Timesteps were thinned by a factor of 12
 time step 0 of 1034
 time step 104 of 1034
 time step 208 of 1034
 time step 312 of 1034

Here is an example of the same problem with code that runs more quickly:

\inundation\data\western_australia\exmouth_tsunami_scenario\anuga\outputs\20070530_062210_run_store_0_nbartzis

You should be able to execute "run_exmouth.py" from this directory, and it will create another directory with the output results.

There is one difference I can see: the code fails at a slightly different spot.

Find midpoint coordinates of entire boundary
Initialise file_function
Caching: looking for cached files /d/cit/1/cit/unixhome/nbartzis/.python_cache/_file_function[-1913934578119235209]_{Result,Args,Admin}.z
Caching: Dependencies are ['/d/xrd/gem/5/nhi/inundation/data/western_australia/exmouth_tsunami_scenario/anuga/boundaries/urs/dampier/1_10000/exmouth_3854_17042007.sww']
+-------------------------------------------------------------
| Tue Jun  5 08:49:22 2007. Evaluating function _file_function
+-------------------------------------------------------------
| Argument:     '/d/xrd/gem/5/nhi/inundation/data/western_australia/exmouth_tsunami_scenario/anuga/boundaries/urs/dampier/1_10000/exmouth_3854_17042007.sww'
| Keyword Args: {'quantities': ['stage', 'xmomentum', 'ymomentum'], 'interpolation_points': [[  192783.96757945  7566067.52060781]
 [  192597.7735207   7576851.25925281]
 [  192621.04777805  7575503.29192219]
 [  192807.2418368   7564719.55327719]
 [  192644.32203539  7574155.32459156]
 [  192714.14480742  7570111....
| Reason:       No cached result
+-------------------------------------------------------------

Reading /d/xrd/gem/5/nhi/inundation/data/western_australia/exmouth_tsunami_scenario/anuga/boundaries/urs/dampier/1_10000/exmouth_3854_17042007.sww
File_function data obtained from: /d/xrd/gem/5/nhi/inundation/data/western_australia/exmouth_tsunami_scenario/anuga/boundaries/urs/dampier/1_10000/exmouth_3854_17042007.sww
  References:
    Lower left corner: [189945.759564, 7488724.335279]
    Start time:   4000.000000

Three of the four processes fail at this point, and the other continues until it creates a .sww file. All the results are in the directory noted above.

It would be very useful if the error could be verified by someone else running this code in parallel using mpirun, e.g. "mpirun c0-3 python run_exmouth.py".

Change History (4)

comment:1 Changed 18 years ago by nick

Description: modified (diff)

comment:2 Changed 18 years ago by nick

Description: modified (diff)
Owner: changed from ole to duncan
Summary: changed from "problem with C code in file_function" to "problem with C code in fit interpolate"

comment:3 Changed 18 years ago by nick

Description: modified (diff)

comment:4 Changed 18 years ago by nick

Description: modified (diff)