warning when trying to export recognition to onnx #31

Closed
Phaired opened this issue Aug 21, 2024 · 7 comments

Comments

@Phaired
Contributor

Phaired commented Aug 21, 2024

Hey, I'm getting this warning when exporting the text recognition model, and I think it's causing the export to not work correctly. It seems like it's not exporting the last batch, or something similar, because the results are very poor when I use the exported model with your OCR library.

poetry run python -m ocrs_models.train_rec hiertext datasets/hiertext/ \
  --checkpoint text-rec-checkpoint.pt \
  --export text-recognition.onnx
Model param count 2412549
/workspace/ocrs-models/ocrs_models/train_detection.py:212: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(filename, map_location=device)
/root/.cache/pypoetry/virtualenvs/ocrs-models-PAoYRsOs-py3.10/lib/python3.10/site-packages/torch/onnx/symbolic_opset9.py:4545: UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with GRU can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model.
  warnings.warn(
Phaired changed the title from "warning when trying to export recognition to onyx" to "warning when trying to export recognition to onnx" Aug 21, 2024
@robertknight
Owner

This warning is quite normal in PyTorch codebases that haven't been updated very recently, as it was only added in PyTorch v2.4.0. Adding weights_only=True to the torch.load call should resolve it.

This warning shouldn't affect the accuracy of the output. I suspect something else is happening. Can you upload the model (in ONNX format) and a few examples of images it is trained to recognize?

@Phaired
Contributor Author

Phaired commented Aug 21, 2024

Sorry, but I'm talking about this warning:

/root/.cache/pypoetry/virtualenvs/ocrs-models-PAoYRsOs-py3.10/lib/python3.10/site-packages/torch/onnx/symbolic_opset9.py:4545: UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with GRU can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model.
  warnings.warn(

The training went well:
text-recognition.onnx.zip

Epoch 51 validation loss 0.006593015574736102 char error rate 0.0011795811587944627

Here are some images extracted from the training set (it's a synthetic dataset, so there's nothing more to see):
195_229_477_253
125_230_151_248

@robertknight
Owner

Did you modify the alphabet used for classification (DEFAULT_ALPHABET)? If so, can you post the alphabet you used? I notice that this model has a last output dimension of size 69 rather than 97. The ocrs library hard-codes the alphabet to match the models (see here). The library and ocrs CLI tool ought to have a setting so you can specify the alphabet, or there should be a way to embed it in the model, but that's currently missing.
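
To illustrate why a mismatched alphabet gives wrong characters even when training went well, here is a minimal Python sketch of greedy CTC-style decoding. It assumes class 0 is the blank and class i maps to alphabet[i - 1]; that convention, the helper names, and both alphabets are assumptions made up for this example, not the actual ocrs code or its alphabet.

```python
import string

BLANK = 0  # assumed: class 0 is the CTC blank, class i maps to alphabet[i - 1]


def encode(text, alphabet):
    """Turn text into per-step class ids, with a blank before each character."""
    ids = []
    for ch in text:
        ids += [BLANK, alphabet.index(ch) + 1]
    return ids


def greedy_decode(class_ids, alphabet):
    """Collapse repeated ids, drop blanks, and look up the remaining ids."""
    out = []
    prev = BLANK
    for cid in class_ids:
        if cid != BLANK and cid != prev:
            # Ids past the end of the alphabet have no character to map to.
            out.append(alphabet[cid - 1] if cid <= len(alphabet) else "\ufffd")
        prev = cid
    return "".join(out)


# Two illustrative alphabets with the same characters in a different order.
TRAINING_ALPHABET = " 0123456789" + string.ascii_uppercase + string.ascii_lowercase
DECODER_ALPHABET = " 0123456789" + string.ascii_lowercase + string.ascii_uppercase

ids = encode("Hello 42", TRAINING_ALPHABET)
print(greedy_decode(ids, TRAINING_ALPHABET))  # Hello 42
print(greedy_decode(ids, DECODER_ALPHABET))   # hELLO 42
```

The class ids are identical in both calls; only the lookup table changes. That is consistent with good training metrics alongside poor OCR output.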

@Phaired
Contributor Author

Phaired commented Aug 21, 2024

Oh yes, I did:

DEFAULT_ALPHABET = (
    " 0123456789"
    + "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzéèà-%?"
)

So how should I change it?

@Phaired
Contributor Author

Phaired commented Aug 23, 2024

I tested this pull request that I made, using the same alphabet I used for training, but I'm still getting incorrect characters during OCR. Am I missing something? Or could there be an issue with the exported model, even though the training seemed to go well?

let engine = OcrEngine::new(OcrEngineParams {
    detection_model: Some(detection_model),
    recognition_model: Some(recognition_model),
    alphabet: Some(" 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzéèà-%?".to_string()),
    ..Default::default()
})?;

@robertknight
Owner

> I tested robertknight/ocrs#100 that I made using the same alphabet I used for training, but I'm still getting incorrect characters during OCR.

The inputs the model receives from the ocrs library might be slightly different from what it saw in training. You can export these inputs using ocrs --text-line-images {image.png}. This will create a folder called lines that contains the input images to the recognition step. You can then try using these images with the Python code to find out whether the problem is with differences in the input or with the exported model. The default models are naturally robust to input variation because they were trained on a wide variety of images. I have seen issues when training on highly homogeneous synthetic data where the models can be overly sensitive to unimportant details (e.g. borders around the image).
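
The check described above boils down to a three-way comparison per line image. A rough Python sketch follows; `triage`, its dictionary inputs, and the file names are hypothetical, and producing the predictions (from the PyTorch checkpoint and from ocrs) is left out because it depends on your setup.

```python
def triage(expected, python_pred, ocrs_pred):
    """Label each line image by where recognition first goes wrong.

    All three arguments map an image name (e.g. a file in `lines/`) to text:
    the text you expect, what the Python model predicts for that image, and
    what ocrs outputs for it.
    """
    report = {}
    for name, truth in expected.items():
        if python_pred.get(name) != truth:
            # Even the training-side model fails on the ocrs-produced input,
            # so the input likely differs from what the model saw in training.
            report[name] = "input differs from training data"
        elif ocrs_pred.get(name) != truth:
            # Python handles this input but ocrs does not: suspect the
            # exported model or how its output is decoded.
            report[name] = "export or decoding problem"
        else:
            report[name] = "ok"
    return report


# Made-up example data, for illustration only.
expected = {"line-0.png": "251", "line-1.png": "21", "line-2.png": "41"}
python_pred = {"line-0.png": "251", "line-1.png": "21", "line-2.png": "4l"}
ocrs_pred = {"line-0.png": "251", "line-1.png": "0", "line-2.png": "4l"}

for name, verdict in triage(expected, python_pred, ocrs_pred).items():
    print(name, "->", verdict)
```

If most lines land in the first bucket, more varied training data is the likely fix; if they land in the second, the export and the alphabet used for decoding are the places to look.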

@Phaired
Contributor Author

Phaired commented Aug 26, 2024

I fine-tuned both of my models using additional images and made some modifications like adjusting font size, text offset, and tilt. While the text-detection model seems to work well, the recognition model isn't performing as expected. Below are the ONNX models and the image I tested them on.

models.zip
l

remybarranco@MacBook-Pro-de-Remy examples % cargo run -p ocrs-cli -r -- l.png --detect-model text-detection.rten                                  
    Finished `release` profile [optimized] target(s) in 0.03s
     Running `/Users/remybarranco/Developer/ocrs-fork/target/release/ocrs l.png --detect-model text-detection.rten`
Alphabet:  0123456789?%ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzéèà
251
21
41
7
7
7
7
401
7
7
remybarranco@MacBook-Pro-de-Remy examples % cargo run -p ocrs-cli -r -- l.png --detect-model text-detection.rten --rec-model text-recognition.rten
    Finished `release` profile [optimized] target(s) in 0.03s
     Running `/Users/remybarranco/Developer/ocrs-fork/target/release/ocrs l.png --detect-model text-detection.rten --rec-model text-recognition.rten`
Alphabet:  0123456789?%ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzéèà
0

Phaired closed this as completed Sep 15, 2024