Expected behavior: When the esm-tools detect a model instability, they kill the current simulation and resubmit it with the ENSTDIF workaround for one year (see the message from the simulation logfile below). The simulation should then continue under a new job ID, with the ENSTDIF workaround active for one year.
ERROR: high wind speed was found during your run, applying wind speed fix and resubmitting...
Will kill the run now...
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: got SIGCONT
slurmstepd: error: *** JOB 6235540 ON prod-0040 CANCELLED AT 2020-06-18T19:53:29 ***
srun: forcing job termination
0: slurmstepd: error: *** STEP 6235540.0 ON prod-0040 CANCELLED AT 2020-06-18T19:53:29 ***
srun: error: prod-0045: tasks 180-215: Exited with exit code 1
srun: Terminating job step 6235540.0
Observed behavior: The simulation run during which high wind speed is detected in the atmout file is indeed killed, but no resubmission occurs (or, if the tools do attempt a resubmission, the attempt is not successful).
Suggested solution: Fix or implement the esm-tools code so that, after killing the current simulation, it actually resubmits the simulation (with the ENSTDIF workaround active for exactly one year).
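To make the intended behavior concrete, here is a minimal sketch of the decision logic, not the actual esm-tools implementation. The marker string is taken from the log message quoted above; the names `ResubmitPlan`, `plan_after_log_check`, and `workaround_active` are hypothetical and exist only for illustration. It assumes the failed year is the one that gets rerun with the workaround.

```python
from dataclasses import dataclass
from typing import Optional

# Marker copied from the log message quoted in this issue.
WIND_SPEED_MARKER = "high wind speed was found during your run"


@dataclass
class ResubmitPlan:
    """Hypothetical description of the job to submit after the kill."""
    restart_year: int         # model year that is rerun
    workaround_until: int     # last model year with the workaround active
    enstdif_workaround: bool  # whether the ENSTDIF workaround is switched on


def plan_after_log_check(log_text: str, current_year: int) -> Optional[ResubmitPlan]:
    """Return a resubmission plan if the instability marker is in the log."""
    if WIND_SPEED_MARKER not in log_text:
        return None  # no instability detected, nothing to resubmit
    # Assumption: rerun the failed year, workaround limited to that one year.
    return ResubmitPlan(
        restart_year=current_year,
        workaround_until=current_year,
        enstdif_workaround=True,
    )


def workaround_active(plan: ResubmitPlan, model_year: int) -> bool:
    """The workaround applies only inside the planned one-year window."""
    return plan.enstdif_workaround and plan.restart_year <= model_year <= plan.workaround_until
```

With such a plan, the year following the rerun would start without the workaround again, which is the "exactly one year only" part of the request.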
employed run script (on Ollie):
/home/ollie/stepanek/esm-tools_v4/esm_tools/myrunscripts/myinitialtest_yearly_old.yaml
An important question to clarify here: do we want to resubmit the job automatically? I'm happy to program that, perhaps with an option to turn the automatic resubmission off.
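As a rough illustration of such an opt-out, the sketch below builds the resubmission command only when a switch allows it. The configuration key `auto_resubmit` and the function `resubmit_command` are hypothetical, not existing esm-tools options; the only real interface assumed is that a Slurm batch script is submitted with `sbatch <script>`.

```python
from typing import List, Mapping, Optional


def resubmit_command(config: Mapping[str, object], batch_script: str) -> Optional[List[str]]:
    """Return the command that resubmits the job, or None if resubmission is disabled.

    `auto_resubmit` is a hypothetical key; it defaults to True so that
    the automatic resubmission stays on unless the user turns it off.
    """
    if not config.get("auto_resubmit", True):
        return None  # user opted out: the run is killed but not resubmitted
    return ["sbatch", batch_script]
```

Returning the command instead of executing it keeps the decision easy to test; the caller could hand the list to `subprocess.run` once the kill has completed.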
I'd ask one of the other developers (or @christian-stepanek) to have another look at the code to ensure it does the right thing and is understandable. Given that it requires a change to esm-runscripts, I'm moving the issue there.
employed esm_tools versions:
esm_archiving : unknown version!
esm_autotests : unknown version!
esm_calendar : 4.0.1
esm_database : 4.0.0
esm_environment : 4.0.1
esm_master : 4.0.2
esm_parser : 4.0.2
esm_profile : 4.0.0
esm_rcfile : 4.0.0
esm_runscripts : 4.0.3
esm_tools : 4.0.9
esm_plugin_manager : 4.0.1
esm_version_checker : 4.0.2