openPMD plugin: do not use JSON configurations when restarting #3674 #3675


franzpoeschel
Contributor

It is currently impossible to specify different plugin parameters for checkpointing and restarting. ADIOS2 reading routines apparently read the InitialBufferSize parameter even when a file is opened in read-only mode. See the screenshot below: memory profile when reading 16 iterations, once without specifying InitialBufferSize and once with it.
[Image: screenshot from 2021-07-09 11-11-05]

Since there is currently no useful configuration to use for reading routines when restarting, just don't use JSON configurations at all when reading.
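
To illustrate the idea, here is a minimal sketch, not PIConGPU's actual code. The helper `backend_options` and the JSON key layout are assumptions made for illustration; only the InitialBufferSize parameter name comes from the description above. The point is that the user-supplied configuration is forwarded when writing a checkpoint and replaced by an empty configuration when reading one.

```python
import json


def backend_options(user_json: str, restarting: bool) -> str:
    """Hypothetical helper: pick the JSON options string for the I/O backend."""
    if restarting:
        # Reading a checkpoint: drop the user configuration entirely, so that
        # write-oriented parameters such as InitialBufferSize are not applied.
        return "{}"
    # Writing: validate early so a malformed configuration fails before any I/O.
    json.loads(user_json)
    return user_json


# Example user configuration (key layout and value are illustrative assumptions).
checkpoint_cfg = json.dumps(
    {"adios2": {"engine": {"parameters": {"InitialBufferSize": "2Gb"}}}}
)

write_opts = backend_options(checkpoint_cfg, restarting=False)
read_opts = backend_options(checkpoint_cfg, restarting=True)
```

With this split, `write_opts` carries the buffer size while `read_opts` is an empty JSON object, so the reading routines fall back to the backend defaults.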

@PrometheusPi
Member

@franzpoeschel Thank you for providing this fix that quickly 👍

@sbastrakov added the "bug" (a bug in the project's code) and "component: plugin" (in PIConGPU plugin) labels Jul 9, 2021
@sbastrakov
Member

I am not sure if the "bug" label applies, feel free to change if necessary.

@sbastrakov added this to the "0.6.0 / 1.0.0: Next Stable" milestone Jul 9, 2021
@PrometheusPi
Member

I compiled the pull request with a previously failing setup on taurus ml and will report back if it solves the issue.

@franzpoeschel
Contributor Author

> I am not sure if the "bug" label applies, feel free to change if necessary.

Maybe technically not PIConGPU's fault, but the current behavior leads to crashes where one would not expect any. Calling that a bug is justified in my opinion.

@PrometheusPi
Member

I tried running a restart on taurus ml, but now got the following error:

Unhandled exception of type 'St13runtime_error' with message 'Chunk does not reside inside dataset (Dimension on index 2 - DS: 481689600 - Chunk: 482651136)', terminating
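
For readers unfamiliar with this message, the following is a rough sketch of the kind of bounds check it suggests, assuming that "Chunk" is the end position (offset plus extent) of the requested chunk along one dimension and "DS" is the dataset extent along that dimension. The function `chunk_fits` and the toy shapes are hypothetical; only the two numbers in `overshoot` are taken from the error above.

```python
def chunk_fits(dataset_extent, offset, extent):
    """Return (ok, dim, end): whether the chunk lies inside the dataset and,
    if not, the first offending dimension and the chunk end along it."""
    for dim, (ds, off, ext) in enumerate(zip(dataset_extent, offset, extent)):
        end = off + ext
        if end > ds:
            return False, dim, end
    return True, None, None


# Toy example (made-up shapes): the chunk overruns the dataset on index 2.
result = chunk_fits((4, 4, 10), (0, 0, 8), (4, 4, 4))

# Numbers from the reported error: how far the chunk end exceeds the dataset.
overshoot = 482651136 - 481689600
```

Under that reading, the requested chunk ends 961536 elements past the end of the dataset along dimension index 2, which would fit a checkpoint written with a different domain size than the restarting run expects.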

@psychocoderHPC
Member

I will merge this PR; the issue reported by @PrometheusPi is most likely triggered by other problems.
@PrometheusPi is trying to restart a simulation run that was written with an older dev branch, shortly before we heavily refactored the absorbers.

@psychocoderHPC merged commit ed44d6b into ComputationalRadiationPhysics:dev Jul 15, 2021
@PrometheusPi
Member

Yes, @psychocoderHPC is right, the error message above originated from a changed absorber default.

@franzpoeschel
Contributor Author

Note: ADIOS2 now has read parameters, e.g. for setting the number of threads used in decompression. So we might want to add functionality to PIConGPU to make use of JSON configurations when reading again.
