Restart issue #63
Comments
Dear Savio,
$ h5dump -a /patch-000000/species ../ClusterSim_2/checkpoints/dump-00000-0000000000.h5
According to your error, it should return:
... {
ATTRIBUTE "species" {
DATATYPE H5T_STD_U32LE
DATASPACE SCALAR
DATA {
(0): 0
}
}
}
If that is the case, do you have an error file from the third simulation?
Regards,
Julien
It seems the checkpoint file might be corrupted. I've attached the log file from the final run to show the error it reports. If ClusterSim2 does need to be rerun, can you recommend how to avoid this error?
looks like
Since the simulation wrote several checkpoints, a wise thing to try is to keep more than one checkpoint on disk. You can achieve this with the checkpoint setting that controls how many dumps are kept: set it to 2, and even if the latest checkpoint is corrupted you will still have the previous one.
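When more than one checkpoint is kept on disk, choosing which one to restart from can be sketched roughly as below. This is only an illustration, not part of Smilei: the function names and the `dump-*.h5` pattern (taken from the file name quoted in this thread) are assumptions. The check only verifies the 8-byte HDF5 signature at the start of the file, so it catches empty or badly damaged files but not a dump whose datasets were cut short; `h5dump` remains the more thorough test.

```python
from pathlib import Path

# First 8 bytes of an HDF5 file that has no user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Cheap sanity check: the file opens and starts with the HDF5 signature."""
    try:
        with open(path, "rb") as f:
            return f.read(8) == HDF5_SIGNATURE
    except OSError:
        return False


def newest_usable_checkpoint(directory, pattern="dump-*.h5", is_valid=looks_like_hdf5):
    """Return the most recently modified file accepted by is_valid, or None.

    `pattern` and `is_valid` are hypothetical knobs for this sketch.
    """
    candidates = sorted(
        Path(directory).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    for path in candidates:
        if is_valid(path):
            return path
    return None
```

If the newest file fails the check, the function falls back to the previous one, which is the situation described above.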
Ah, I forgot to load some of the MPI modules before. Here are the results from the h5dump file:
[svr11@cx2-login checkpoints]$ h5stat dump-00000-00000000*.h5
I'll try with two dump files. Maybe one will work.
One note about this issue. If you terminate your job too soon after the time of the checkpoint, the writing of data into the checkpoint files may be interrupted, leaving corrupt files. To avoid this, you should give the simulation at least 5 minutes to complete the checkpoint. In some cases, 5 minutes is not sufficient.
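One way to avoid guessing how long the write takes is to watch the checkpoint file and only stop the job once it has stopped changing. The sketch below is a heuristic under assumptions: the function name and all timings are made up for illustration, and a file that has been quiet for a while is a hint, not proof, that the write has finished.

```python
import os
import time


def wait_until_stable(path, quiet_seconds=60.0, poll_seconds=5.0, timeout=600.0):
    """Return True once size and mtime of `path` stay unchanged for quiet_seconds.

    Returns False if `timeout` seconds pass first (or the file never appears).
    """
    deadline = time.monotonic() + timeout
    last_state = None
    stable_since = time.monotonic()
    while time.monotonic() < deadline:
        try:
            st = os.stat(path)
            state = (st.st_size, st.st_mtime_ns)
        except FileNotFoundError:
            state = None  # file not there (yet)
        now = time.monotonic()
        if state != last_state:
            # Something changed: restart the quiet-period clock.
            last_state = state
            stable_since = now
        elif state is not None and now - stable_since >= quiet_seconds:
            return True
        time.sleep(poll_seconds)
    return False
```

A job script could call this on the latest dump file before terminating, with a `timeout` generous enough for large checkpoints.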
The issue was in fact that I had run out of disk space, so the checkpoint file couldn't finish saving! I think it is running fine now.
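Since the root cause was a full disk, a small pre-flight check before launching the job may help. This is a minimal sketch: the function name, the `expected_bytes` estimate (for example, the size of a previous checkpoint) and the safety factor are assumptions. Note that `shutil.disk_usage` reports filesystem-level free space and may not reflect per-user quotas on a cluster.

```python
import shutil


def has_room_for_checkpoint(directory, expected_bytes, safety_factor=1.5):
    """Return True if the filesystem holding `directory` has enough free space.

    `expected_bytes` is the caller's estimate of the checkpoint size;
    `safety_factor` adds headroom on top of that estimate.
    """
    free = shutil.disk_usage(directory).free
    return free >= expected_bytes * safety_factor
```

For instance, `has_room_for_checkpoint("checkpoints", expected_bytes)` could be evaluated in the submission script, aborting early instead of producing a truncated dump.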
Hello,
I'm trying to start a simulation from the output of a previous simulation. I have successfully restarted a simulation from two previous checkpoints. However, on the third restart I get a rather odd error.
```
READING fields and particles for restart
ERROR src/Tools/H5.h:324 (getVect) Reading vector Position-0 is not 1D but -1D
ERROR src/Tools/H5.h:324 (getVect) Reading vector Position-1 is not 1D but -1D
ERROR src/Tools/H5.h:324 (getVect) Reading vector Position-1 is not 1D but -1D
ERROR src/Tools/H5.h:324 (getVect) Reading vector Position-0 is not 1D but -1D
ERROR src/Tools/H5.h:324 (getVect) Reading vector Position-0 is not 1D but -1D
ERROR src/Tools/H5.h:324 (getVect) Reading vector Position-0 is not 1D but -1D
ERROR src/Checkpoint/Checkpoint.cpp:588 (restartPatch) Number of species differs between dump (0) and namelist (3)
ERROR src/Tools/H5.h:324 (getVect) Reading vector Position-0 is not 1D but -1D
ERROR src/Tools/H5.h:324 (getVect) Reading vector Momentum-0 is not 1D but -1D
```
I don't understand this error, as it's trying to read data that Smilei wrote itself. I've attached all the log and input deck files that I've used for this simulation.
log_1.txt
log_2.txt
log_3.txt
log_4.txt
SimulationFiles.zip
Thanks
Savio Rozario