Misconfig? #118
Comments
Updated to the 2.0.x release of lightning.
Tried a lower version of pytorch lightning:
DEBUG:piper_train:Checkpoints will be saved every 1000 epoch(s)
Actually it looks like I can jump to TensorBoard from here and figure out how training is doing.
1.7.7 is the supported version.
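In other words, if the goal is just to unblock training, pinning pytorch-lightning to that release should be enough; a minimal sketch (the exact invocation depends on your environment/venv):

```sh
# Hedged example: pin the Lightning version this version of piper_train expects.
pip install "pytorch-lightning==1.7.7"
```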
Some things to mention, since the numbers below are from my testbed.
For me - my dataset, settings, and hardware - one epoch completes in ~28 seconds, and your checkpoints occur every 1000 epochs. If it's taking 1 minute per epoch (depending on your dataset, batch size, precision level, number of test samples, hardware capability, etc.), you won't see any updates in the console/terminal window for 1000 minutes (16.6 hours!), and even if you're completing one epoch every second, that's still going to take 16.6 minutes before you see anything. I'd suggest creating a very small test dataset - literally 2 or 3 lines in your dataset.jsonl - maybe recording some fresh, really short wavs of about 5 words each so it won't take long to process. Then change your command to something like the sketch below:
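(A minimal sketch of such a quick-test command, assuming the standard piper_train flags from the project's training docs; every path and value below is a placeholder to adapt to your own working invocation. The key change is a checkpoint interval of 1, so a checkpoint and a console update land after every epoch.)

```sh
# Hedged example only: tiny sanity-check run, flags spelled as in piper's training docs.
python3 -m piper_train \
    --dataset-dir /path/to/tiny_test_dir/ \
    --accelerator gpu \
    --devices 1 \
    --batch-size 2 \
    --validation-split 0.0 \
    --num-test-examples 0 \
    --max_epochs 10 \
    --checkpoint-epochs 1 \
    --precision 32
```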
Then assume it's going to take (at the extreme) 2 minutes per epoch. I just think you had too much data, with settings such that you wouldn't see an update for a VERY long time, leading you to believe something was up or that it had crashed. If the test works, you know it's nothing more than the combination of your dataset, settings, and hardware capability. After that, you can go back to setting --checkpoint_epochs to 1000 again if you wish.
I am seeing this error from lightning:
raise MisconfigurationException(
pytorch_lightning.utilities.exceptions.MisconfigurationException: The provided lr scheduler `ExponentialLR` doesn't follow PyTorch's LRScheduler API. You should override the `LightningModule.lr_scheduler_step` hook with your own logic if you are using a custom LR scheduler.
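For completeness, the override that error message points at would look roughly like the sketch below, assuming pytorch-lightning 2.x and that you are willing to patch the training module yourself; the class name is illustrative, not piper's actual module, and downgrading to 1.7.7 as noted above is the simpler fix.

```python
import pytorch_lightning as pl

class MyVoiceModel(pl.LightningModule):
    # ... configure_optimizers() is assumed to return an optimizer plus torch's ExponentialLR ...

    def lr_scheduler_step(self, scheduler, metric):
        # In Lightning 2.x, defining this hook tells the trainer that you will step
        # schedulers it doesn't recognize as following the LRScheduler API,
        # which avoids the MisconfigurationException above.
        if metric is None:
            scheduler.step()
        else:
            scheduler.step(metric)
```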