Add training guide and align text detection model training with recognition model #8
Merged
This is needed for ONNX export to work.
This is the current default behavior, but PyTorch warns that the default will change in a future release. Setting `antialias=True` might produce better results, but currently leads to out-of-range errors during loss computation, which need to be resolved first.
Align the text detection model training script with the recognition model training by:
- Adding a `--export` option to export a checkpoint to ONNX after loading it
- Adding a `--max-epochs` flag to trigger automatic termination of the training process after a fixed number of epochs
- Adding wandb integration to allow tracking training progress
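As a rough illustration of the two flags named above, here is a minimal `argparse` sketch. Only the `--export` and `--max-epochs` names come from this PR; the `--checkpoint` argument, the assumption that `--export` takes an output path, and the `build_parser` / `should_stop` helpers are hypothetical and may not match the actual training script.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Train a text detection model.")
    # Hypothetical argument: checkpoint to load before training or export.
    parser.add_argument("--checkpoint", type=str, help="Checkpoint to load")
    # Assumption: --export takes the path of the ONNX file to write.
    parser.add_argument(
        "--export", type=str, help="Export the loaded checkpoint to ONNX"
    )
    parser.add_argument(
        "--max-epochs", type=int, help="Stop training after this many epochs"
    )
    return parser


def should_stop(epoch: int, max_epochs: Optional[int]) -> bool:
    """Return True once the zero-based `epoch` has reached the limit."""
    return max_epochs is not None and epoch >= max_epochs


args = build_parser().parse_args(["--max-epochs", "3"])
# Epochs that would run before the loop terminates.
epochs_run = [e for e in range(10) if not should_stop(e, args.max_epochs)]
```

When `--max-epochs` is omitted, `should_stop` never returns True, so training would continue until interrupted.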
Model export is now implemented in `train_detection` instead.
When using `antialias=True` with the `Resize` transform on target masks, the resulting values could sometimes be slightly above 1.0. The same thing happened when training with CUDA even without this.
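The text above does not say how the out-of-range values were handled. One possible guard, shown here only as a sketch, is to clamp masks into [0, 1] before they reach a loss such as binary cross-entropy, which expects values in that range. The `clamp_mask` helper is hypothetical.

```python
import torch


def clamp_mask(mask: torch.Tensor) -> torch.Tensor:
    """Force mask values into [0, 1] (hypothetical helper)."""
    return mask.clamp(0.0, 1.0)


# A mask where interpolation has pushed one value slightly above 1.0.
mask = torch.tensor([[0.0, 0.5], [1.0, 1.0000002]])
clamped = clamp_mask(mask)
```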
- Use `pin_memory` for data loaders. This was already used in the recognition training script.
- Move prediction / target masks to CPU once per batch, instead of separately per item
These need to be installed separately as the dependencies will vary by platform and GPU.
robertknight force-pushed the training-guide branch 2 times, most recently from 5c67e64 to e478d39 (January 30, 2024 07:47)
robertknight force-pushed the training-guide branch from e478d39 to 293c774 (January 30, 2024 07:53)
Replace precise intersection test with a cheap bounding box intersection test.
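A cheap bounding box test of this kind can be sketched as follows. This is not the code from the commit. The `bounding_box` and `boxes_intersect` names, and the choice to treat touching edges as intersecting, are assumptions made for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def bounding_box(points: Sequence[Point]) -> Box:
    """Return the axis-aligned bounding box of a non-empty polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_intersect(a: Box, b: Box) -> bool:
    """Return True if two axis-aligned boxes overlap or touch."""
    a_left, a_top, a_right, a_bottom = a
    b_left, b_top, b_right, b_bottom = b
    return (
        a_left <= b_right
        and b_left <= a_right
        and a_top <= b_bottom
        and b_top <= a_bottom
    )


tri_a = [(0, 0), (4, 0), (0, 4)]
tri_b = [(3, 3), (5, 3), (5, 5)]  # bounding boxes overlap, triangles do not
tri_c = [(10, 10), (12, 10), (12, 12)]  # far away from tri_a

near = boxes_intersect(bounding_box(tri_a), bounding_box(tri_b))
far = boxes_intersect(bounding_box(tri_a), bounding_box(tri_c))
```

The trade-off is that the box test is conservative: `tri_a` and `tri_b` are reported as intersecting even though the shapes themselves do not overlap, in exchange for needing only four comparisons per pair.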
robertknight force-pushed the training-guide branch from 293c774 to 4688482 (January 30, 2024 07:57)
Add a guide with steps to train the text detection and recognition models from scratch. Along the way, various improvements to the text detection training tool and other changes were needed:
- Remove `torch`, `torchvision` from the Pipfile and install them separately. This is needed because the dependencies vary by platform and GPU
- Avoid the PyTorch warning about the `antialias` setting in the `Resize` transform by setting it explicitly

Fixes #6