[Bug] Checkpointing the buffer #188
Thanks. In the case where you have `buffer.checkpoint = False`, each run just creates its own new memory buffer and then does the pretraining to fill up the replay buffer (or whatever it is), and everything works fine.
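The "each run creates its own new buffer" behavior described above can be sketched with a plain `numpy.memmap`. This is a minimal illustration of the idea, not SheepRL's actual code; the function name, file name, and shapes are my own assumptions.

```python
import os
import tempfile

import numpy as np

# Hedged sketch (names are assumptions, not SheepRL's API): with
# buffer.checkpoint = False, each run allocates a brand-new
# memory-mapped file on disk and refills it during the pretraining steps.
def make_fresh_buffer(capacity: int, obs_dim: int) -> np.memmap:
    # A new .memmap file is created for this run only; mode="w+"
    # creates (or truncates) the backing file on disk.
    path = os.path.join(tempfile.mkdtemp(), "rb_observations.memmap")
    return np.memmap(path, dtype=np.float32, mode="w+",
                     shape=(capacity, obs_dim))

buf = make_fresh_buffer(capacity=128, obs_dim=4)
buf[0] = np.arange(4, dtype=np.float32)  # one pretraining fill step
buf.flush()                              # persist the writes to disk
```

Because the file is recreated from scratch, nothing from a previous run can leak into (or be deleted out from under) the new one, which is consistent with this mode "just working".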
Hi @Disastorm,
I also tried to test it and it works (on Linux): I can restart an experiment multiple times.
I understand that the logic of
Thanks, yeah, that's basically the same behavior I saw; it's just that somehow something was triggering the auto-delete of the memmap files from the first run. I don't really know what was triggering it, but it seemed potentially related to when I ctrl-c'd the run; I'm not sure if I messed up a setting somewhere, or if it's related to running on Windows, etc. Anyway, I'm just using `buffer.checkpoint = False` for now, so you can close this issue. I just wanted to mention that there may be some trigger somewhere that auto-deletes the memmap files from the first run when you ctrl-c one of the later runs.
Hey @Disastorm, we do indeed have a problem with memmapped arrays on Windows: can you try out this branch, please?
Oh I see, thanks. I'll try that out whenever I train a new model; for the one I'm currently running I already have `buffer.checkpoint = False`, so I can't try it on this one. Or are you saying that even when `checkpoint = False` the memmap is not working properly, and I should use that branch regardless? By the way, separate question: is there a way to set exploration in DreamerV3? Would I adjust `ent_coef`, or do I need to use one of those other things like the Plan2Explore configs (I don't know what Plan2Explore is)?
Nope, the memmap is working properly; the problem arises when you checkpoint the buffer and try to resume multiple times. In that particular case the memmap buffer on Windows will be deleted. If you can try that new branch so we're sure it fixes your problem, then we can close the issue.
I will open a new issue with the question to keep things in order.
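The resume-vs-recreate distinction discussed here can be shown with `numpy.memmap` open modes. This is a hedged sketch of the general mechanism, not the fix in the linked branch: a checkpointed buffer has to be reopened without truncation (`mode="r+"`) so that data from the first run survives a resume, whereas `mode="w+"` would wipe the file.

```python
import os
import tempfile

import numpy as np

# Illustration only (paths and shapes are my own, not SheepRL's):
path = os.path.join(tempfile.mkdtemp(), "buffer.memmap")

# First run: create and fill the buffer, then flush it to disk.
first = np.memmap(path, dtype=np.float32, mode="w+", shape=(8,))
first[:] = np.arange(8, dtype=np.float32)
first.flush()
del first  # drop the mapping, as if the first run ended

# Resumed run: reopen the existing file WITHOUT truncating it.
resumed = np.memmap(path, dtype=np.float32, mode="r+", shape=(8,))
# resumed still holds the values written by the first run.
```

If anything in the resume path recreates or unlinks the backing file (which is apparently what went wrong on Windows), the first run's data is lost even though the checkpoint logic is otherwise correct.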
Confirmed, this is fixed.
Hi @Disastorm, I've copied here your new question, so that we keep closed the other issue:
@belerico
Hey, just wondering how this buffer checkpointing works? I have
And so when resuming it doesn't do the pretraining buffer steps anymore; however, I noticed the buffer files don't ever get updated, and the last-modified date is just from when the first training started. Is this a problem? The files I'm referring to are the .memmap files. I see now it doesn't keep creating them for each run when `checkpoint = True`, so I assumed it would be using the ones from the previous run, but their modified date isn't changing at all. Is the buffer inside the checkpoint file itself? The filesize of the checkpoint still looks pretty similar to when running with `checkpoint: False`, I think.
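One hedged note on the stale-timestamp observation above (an assumption about OS behavior, not a statement about SheepRL internals): writes through a memory map go to the page cache and can reach the file without the OS promptly refreshing its modified date, so a stale mtime doesn't by itself mean the buffer isn't being updated. Re-reading the file contents is a more reliable check than the timestamp:

```python
import os
import tempfile

import numpy as np

# Sketch: write through one mapping, then verify the data actually
# reached the file by reading it back through an independent mapping.
path = os.path.join(tempfile.mkdtemp(), "check.memmap")

writer = np.memmap(path, dtype=np.float32, mode="w+", shape=(4,))
writer[:] = [10.0, 20.0, 30.0, 40.0]
writer.flush()  # force dirty pages out to the backing file

# A separate read-only mapping sees the flushed data, regardless of
# what the file's last-modified timestamp happens to show.
reader = np.memmap(path, dtype=np.float32, mode="r", shape=(4,))
```

So a quick sanity check for the question above would be to open one of the `.memmap` files read-only after some training steps and see whether its contents change, rather than watching the modified date.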