GPU model failure late in simulation
Francis Lane posted a topic in MATH Errors & Simulation Failure
Hi,

We are running some TUFLOW models using GPU hardware and the HPC solution scheme (Build 2017-09-AC). Some of the runs fail without warning, well into the simulation. They appear to be 'exiting without prompt', so we are unable to view the DOS window to see what caused the models to fail.

We have tried disabling 'quick edit mode' in the DOS window in case the issue was related to the 'TUFLOW pause mid simulation? cause and solution' topic posted by Chris Huxley, but this made no difference.

Three runs (15, 25 and 540 minute) completed without any apparent issues. Two runs (90 and 120 minute) failed, but completed successfully when restarted. However, three longer runs (720, 2400 and 2880 minute) failed and will not complete on restart.

We have reviewed the .tlf files (both the standard .tlf and the hpc.tlf). We appreciate that the adaptive timestep will mask potential instabilities, but there is nothing in the .tlf files to indicate that TUFLOW is having instability issues (i.e. the timesteps are consistent at the time of failure).

We would appreciate some guidance on how to resolve this issue. We are managing the runs via TRIM.

Kind regards,
Francis Lane