Exporting edgetpu models #20
@lkaino My apologies for the delay in responding. We are maintaining another repo that we sync daily to support quantized tflite export. The repo is at: https://github.com/DeGirum/ultralytics_yolov8/tree/franklin_current In this fork, we introduced two new parameters for export.
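(The parameter list itself was not preserved in this thread; `hw_optimized` is referenced in later comments, so as a rough, unconfirmed sketch of what the export call might look like:)

```python
# A sketch only, not confirmed API: exporting for Edge TPU with the DeGirum fork.
# `hw_optimized` is the flag name that appears later in this thread; any other
# new export parameters were lost in extraction.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # standard COCO-pretrained checkpoint
model.export(format="edgetpu", imgsz=288, hw_optimized=True)
```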
@shashichilappagari thanks for your response! I was able to export an edgetpu version of yolov8s from your fork.
However, not all of the operations will run on the Edge TPU. Is this normal, or did I do something wrong?
I exported yolov8n and it compiled without issues. Unfortunately, the new postprocess function (…). It seems I have to abandon this project for now; thanks for the help.
Some operations may not map to the Edge TPU, but our tests show that this is fine.
@lkaino The Frigate repo looks very interesting. We will see if we can help you with the numpy version of the code.
That would be awesome! I got the postprocessing running with the numpy version you linked. Unfortunately, the model doesn't detect much of anything. Is there something wrong in the way I'm exporting the model?
Of course, there might be something wrong in the way I prepare the input tensor (or in the postprocessing).
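(The snippet itself was not preserved; for reference, a typical way to feed a full-integer-quantized tflite model, with placeholder model path and image, looks like this:)

```python
import numpy as np
import tensorflow as tf

# Load the quantized model; on a Coral device you would also pass the
# libedgetpu delegate via tf.lite.experimental.load_delegate().
interpreter = tf.lite.Interpreter(model_path="yolov8n_full_integer_quant.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# Resize the frame to the model's expected input and quantize it with the
# input tensor's scale/zero-point before invoking.
h, w = int(inp["shape"][1]), int(inp["shape"][2])
frame = np.zeros((h, w, 3), dtype=np.uint8)  # placeholder for a real image
scale, zero_point = inp["quantization"]
x = (frame.astype(np.float32) / 255.0 / scale + zero_point).astype(inp["dtype"])
interpreter.set_tensor(inp["index"], x[None, ...])
interpreter.invoke()
```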
@lkaino It looks like you exported with size (280, 280) but are sending an input of size (288, 288). However, this should have thrown an error at the `interpreter.invoke()` step itself.
@lkaino I found one potential place where things could have gone wrong. Can you try one of the following: `model = YOLO(model_yaml).load(model.ckpt_path)`, or export with `hw_optimized=False`.
Thanks! Exporting with 280 results in the input tensor being 288, for some reason. |
I'm intentionally using the tflite model and TensorFlow directly, without YOLO; this is how it is used in Frigate. I can try with `hw_optimized=False`.
@lkaino Sorry, what I meant was that during the export stage in the Ultralytics repo, you loaded the model using just the checkpoint. (…) By the way, the YOLOv8 model expects input sizes to be multiples of 32; hence, 280 becomes 288.
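(A quick illustration of that rounding; the round-up rule is an assumption about the exporter's behavior, but the stride of 32 matches what is stated above:)

```python
import math

# YOLOv8's largest feature-map stride is 32, so input sides are rounded up
# to the nearest multiple of 32: a requested 280 becomes 288.
stride = 32
requested = 280
print(math.ceil(requested / stride) * stride)  # -> 288
```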
Ah sorry, I misunderstood. The model works when I export it without the HW optimization. Could you explain how to load the model using `model_yaml`?
@lkaino You simply use `model = YOLO('relu6-yolov8.yaml').load('relu6-yolov8n.pt')`. I am assuming you trained the model with relu6. Otherwise, you should use whatever config yaml you used for training.
Sorry, I don't know YOLO so well. 'relu6-yolov8.yaml' loads the model architecture with relu6 activations, I assume, and 'relu6-yolov8n.pt' would be the training weights, which I don't have. I haven't done any training yet; I would like to use the model pretrained on the COCO dataset, with HW optimizations. Is that possible, or do I have to retrain the model?
@lkaino If you are using standard checkpoints, then you can do the following:
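(The snippet was not preserved; a plausible reconstruction, based on the loading pattern suggested earlier in the thread, would be:)

```python
# A plausible reconstruction, not the verbatim snippet: build the architecture
# from the standard config yaml, then load the COCO-pretrained weights into it.
from ultralytics import YOLO

model = YOLO("yolov8n.yaml").load("yolov8n.pt")
model.export(format="edgetpu", imgsz=288, hw_optimized=True)  # flag name from this thread
```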
This one works with the HW optimization on the Edge TPU! Thanks! It would be interesting to know the difference between the original way of loading (just the checkpoint) and the working one (the yaml config plus the checkpoint); I would need to check the documentation or the implementation of the YOLO class.
@lkaino Great to hear that you could get the model working on the Edge TPU. When we use (…). Do you see any speed difference between `hw_optimized=True` and `hw_optimized=False`?
@shashichilappagari I measured the model execution time with and without `hw_optimized`. Both take 33 ms on average with 288 input size on yolov8n. Unfortunately, the numpy version of the postprocessing is very slow.
@lkaino Thanks for sharing these numbers. In our measurements, we were getting 33 ms for an input size of 640x640, so inference for 288x288 should be much faster. May I ask which host CPU you are using? Is it a Raspberry Pi type of device? If so, that could explain why the post-processing is so slow.

We developed a package called DeGirum PySDK that can run on our hardware, Edge TPUs, CPUs, and GPUs. In this package, the postprocessors are written in C++ and are much faster. You can try our software and see if you find it useful. You can even run various ML models directly in your browser, without installing anything, to get a feel for it and decide whether it is worth spending any time on. You can sign up for our cloud platform at: https://cs.degirum.com
The CPU is an Intel i7-4700MQ @ 2.40 GHz; it's an old HP ZBook. Your package sounds interesting, but unfortunately I can't afford much more time on this project. Do you have Python bindings for the SDK, and how difficult would it be to integrate? My goal was to see whether YOLO would be a feasible alternative to MobileNetV2 in Frigate on the Edge TPU. The model itself is on par in execution time, but MobileNet needs very little postprocessing, which makes it the more energy-efficient choice.
@lkaino The DeGirum package is actually a Python package, but all post-processing functions are implemented in C++, to which we have Python bindings. We believe it is very easy to integrate, as the code to run any model on any hardware is only four lines. You can take a look at our docs at: https://docs.degirum.com/content/ We are planning to spend some time on integrating PySDK into Frigate, and I will keep you posted on our progress.
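(For a sense of what those four lines look like, here is a sketch along the lines of DeGirum's published examples; the connection arguments are assumptions and should be checked against the docs linked above. The model name is the one mentioned later in this thread.)

```python
# A sketch only; verify the exact connection arguments at docs.degirum.com.
import degirum as dg

zoo = dg.connect(dg.CLOUD, "degirum/public", token="<your token>")  # args assumed
model = zoo.load_model("yolov8n_relu6_coco--576x576_quant_tflite_edgetpu_1")
result = model("image.jpg")  # inference with C++ postprocessing built in
print(result.results)
```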
That's nice to hear! I spent some time profiling the execution time; most of it (50 ms) is spent in a double for loop (4x1701). The operation is so complex that I have no idea how to optimize it.
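(The snippet from this comment was not preserved. Given the shape, a guess is that the loop implements the DFL box decode of the YOLOv8 head: for each of the 4 box coordinates at each of the 1701 anchors (a 288x288 input yields 36² + 18² + 9² = 1701 anchor points), a softmax over 16 bins followed by an expected value. If that guess is right, it vectorizes in numpy roughly like this:)

```python
import numpy as np

def dfl_decode(x: np.ndarray) -> np.ndarray:
    """Vectorized DFL decode. x has shape (4, 16, 1701):
    4 box coords, 16 distribution bins, 1701 anchors."""
    e = np.exp(x - x.max(axis=1, keepdims=True))   # numerically stable softmax over bins
    p = e / e.sum(axis=1, keepdims=True)
    bins = np.arange(16, dtype=np.float32).reshape(1, 16, 1)
    return (p * bins).sum(axis=1)                  # expected distance, shape (4, 1701)

# Replaces a Python-level 4x1701 double loop with a handful of array ops.
out = dfl_decode(np.random.randn(4, 16, 1701).astype(np.float32))
assert out.shape == (4, 1701)
```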
@lkaino Thanks for this useful information on profiling. We will take a look and see if it can be improved.
I tried generating new models with a different input size from https://github.com/DeGirum/ultralytics_degirum, but the scripts are not working. There is a clear typo in https://github.com/DeGirum/ultralytics_degirum/blob/131c0b71c03bf3455d43c6ede5af2813c7dfa64f/dg_quantize.py#L11: `tflite.Interpreter` should be `tf.lite.Interpreter`. Even after fixing that, it does not work.
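(For clarity, the fix being described, with a placeholder model path:)

```python
import tensorflow as tf

# The script references tflite.Interpreter, but when only tensorflow is
# imported the interpreter lives under the tf.lite namespace:
interpreter = tf.lite.Interpreter(model_path="model.tflite")
```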
Is the above repository being maintained? What would be the easiest way to export, for example, "yolov8n_relu6_coco--576x576_quant_tflite_edgetpu_1" with a smaller input image size, e.g. 320x320?