TUFLOW Forum
Francis Lane

Multiple GPU runs - GPU cards assigned to first model


Hi,

Administrator - not sure which topic this should be in. Please move or create a new one as appropriate.

We have just started running multiple GPU models on our grunt computer. We noticed that the first model in the list ran fairly quickly, but the others were much slower. We have the following commands in our .tcf file:

 If Scenario == GPUEx
  Hardware == GPU
  Solution Scheme == HPC
  GPU Device IDs == 0,1,2
 End if

Do we need to assign a separate GPU device to each specific model to overcome this?

Ideally we would like the GPU cards to be automatically assigned. We would appreciate advice on how best to set this up.


Hi Francis,

By default, only the first graphics card (ID 0 / -pu0) is used for TUFLOW HPC-GPU simulations. If you want to use the second, third or fourth graphics card, either specify the "GPU Device IDs" command within the .tcf or use the graphics card switches in the batch file (-pu1, -pu2, -pu3).

When two or more simulations are running on the same graphics card or the same set of graphics cards, the computational resources are shared between the simulations and the simulations might slow down.

If you don't want your simulations to slow down due to shared resources, you can assign each simulation to a different card, using either the "GPU Device IDs" command or the batch file switches.
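
As an illustration of the one-simulation-per-card idea, here is a minimal Python sketch that builds one command line per model, each with its own -puN switch. The executable path, the model file names and the helper name pin_to_cards are hypothetical; only the -b and -puN switches come from this thread. It also assumes the .tcf is not forcing every model onto all cards through a "GPU Device IDs == 0,1,2" line; how that command and the switch interact is worth confirming in the TUFLOW manual.

```python
# Hypothetical install path; substitute your own.
TUFLOW_EXE = r"C:\tuflow\TUFLOW_iSP_w64.exe"


def pin_to_cards(tcf_files, n_cards):
    """Return one command line per model, each pinned to its own GPU card."""
    if len(tcf_files) > n_cards:
        # With more models than cards, at least two runs would share a GPU.
        raise ValueError("more simulations than cards: some would share a GPU")
    return [
        f'"{TUFLOW_EXE}" -b -pu{card} {tcf}'
        for card, tcf in enumerate(tcf_files)
    ]


# Print the command lines for three hypothetical models on three cards.
for line in pin_to_cards(["model_A.tcf", "model_B.tcf", "model_C.tcf"], n_cards=3):
    print(line)
```

Each printed line could be pasted into a batch file after a start command, so that the three models run side by side without sharing a card.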

 

Kind Regards,

Pavlina


Hi Guys

I came across this same issue recently when batching up some StormTide runs for a local LGA. I too had a grunt machine with multiple cards and wished to use loops in a batch file as I had many event scenarios.

In my case I had four TUFLOW licenses, two GPU add-on modules and two GPU cards in our grunt machine. Ordinarily, using TUFLOW Classic, I would use a loop in a batch file and limit the number of concurrent runs based on the instances of TUFLOW listed by 'tasklist', which is possible because each instance of TUFLOW corresponds to one license and one CPU core.

However, automating the process with loops is not so straightforward when you have multiple GPU cards, as each instance of TUFLOW can be split between multiple cards.

In my case, most of my runs took a similar time to run, so I customized a batch file to switch between GPU cards each time it starts a new simulation. This is not ideal, as there will be times when two simulations share the same graphics card (particularly problematic when running large models that use up all the GPU memory). However, it works for the most part. It's super basic and clunky, but may help?

See script below:

@echo off
setlocal enabledelayedexpansion
echo Model Design - HPC BRC ALL StormTide Simulations

rem ______________SET RUN VARIABLES_____________
set TUFLOWEXE=C:\tuflow\TUFLOW.2018-03-AC\2018-03-AC\TUFLOW_iSP_w64.exe
set RUN=start "TUFLOW" /low "%TUFLOWEXE%" -b
rem maximum number of TUFLOW instances allowed to run at once
set /a CPU_Cores=2
rem number of GPU cards in the machine
set /a GPU_Cards=2
rem index of the GPU card used for the next run (passed as -puN)
set /a GTX_Card=0

set tcf=BCR_~e1~_003a_~e2~_5m_03.tcf

set A=S T
rem set B in loop below


rem ______________SET LOOPS____________________
FOR %%a in (%A%) do (

 rem _____DEFINE RUN LOGIC______
 IF %%a==T (
  set B=E_ F2030_ F2050_ F2070_ F2100_ F2130_
  set C=HAT
 )
 IF %%a==S (
  set B=E_ F2050_ F2070_ F2100_ F2130_
  set C=00020Y 00100Y 01000Y 10000Y
 )

 rem ______________RUN LOOPS____________________
 FOR %%b in (!B!) do (
  FOR %%c in (!C!) do (
   rem wait until fewer than CPU_Cores instances are running
   call :do_while_loop_start
   %RUN% -e1 %%a -e2 %%b%%c -pu!GTX_Card! %tcf%
   rem switch to the next GPU card, wrapping back to card 0
   set /a "GTX_Card=(GTX_Card+1) %% GPU_Cards"
   timeout 5
  )
 )
)
endlocal
goto :eof


rem ___________COUNT RUN INSTANCES______________
:do_while_loop_start
    set /a count=0
    for /f %%x in ('tasklist ^| find /c "TUFLOW"') do set count=%%x
    if %count% geq %CPU_Cores% (
        rem wait 60 seconds before checking again
        timeout /t 60 /nobreak >NUL
        goto do_while_loop_start
    )
    exit /b

run_BRC_03_All.txt
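
One way to avoid two simulations landing on the same card is to give each card its own worker that starts its next run only once its previous one has finished. The Python sketch below illustrates that idea under several assumptions: the executable path is hypothetical, the -b, -e1, -e2 and -puN switches are the ones used in the batch script, the TUFLOW process is assumed to exit when its simulation finishes, and enough licenses are assumed to be available for one run per card. The runner parameter is an illustrative hook; it defaults to subprocess.run, which blocks until the process exits.

```python
import queue
import subprocess
import threading

# Hypothetical install path; substitute your own.
TUFLOW_EXE = r"C:\tuflow\TUFLOW_iSP_w64.exe"


def build_command(card, e1, e2, tcf):
    """Command line for one run, using the switches from the batch script."""
    return [TUFLOW_EXE, "-b", "-e1", e1, "-e2", e2, f"-pu{card}", tcf]


def run_on_cards(jobs, n_cards, runner=subprocess.run):
    """Run every (e1, e2, tcf) job with at most one run per card at a time.

    Returns a list of (card, command) pairs in completion order.
    """
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)
    finished = []
    lock = threading.Lock()

    def worker(card):
        # Each worker owns one card and runs its jobs one after another.
        while True:
            try:
                e1, e2, tcf = pending.get_nowait()
            except queue.Empty:
                return  # no jobs left for this card
            command = build_command(card, e1, e2, tcf)
            runner(command)  # blocks until this run has finished
            with lock:
                finished.append((card, command))

    workers = [
        threading.Thread(target=worker, args=(card,)) for card in range(n_cards)
    ]
    for thread in workers:
        thread.start()
    for thread in workers:
        thread.join()
    return finished
```

Calling run_on_cards with a list of (e1, e2, tcf) tuples and n_cards=2 would keep both cards busy while never starting a second run on a card that is still in use, so a long run on one card does not cause the next run to double up on it.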
