Model size reduced to 43Mb! and new Webapp #295
Comments
Hi, Kikedao,
Thanks a lot for your fantastic work. The webapp is really impressive.
Your webapp and details are now included in our readme file:
https://github.com/xuebinqin/U-2-Net/blob/master/README.md.
Feel free to let us know, if you have any questions or concerns.
Regards,
Xuebin
…On Fri, Mar 18, 2022 at 10:08 PM Kikedao ***@***.***> wrote:
Hi @xuebinqin <https://github.com/xuebinqin> !
I made a webapp that uses U2net at its core, site is https://silueta.me,
it would be great if you showcase it in the readme :)
I struggled a lot to make it work in a free Heroku instance, limited to
512Mb of RAM, one of the things I achieved, and that I have seen asked for
a lot in the issues, is I managed to reduce the model size to 43Mb from the
170Mb of the original model.
So I want to share with you all the reduced size model, feel free to test
it and add it to the pretrained models in the sourcecode if you wish so:
https://drive.google.com/file/d/19KEnD0ndbyzpyh9SKdjPZe5KWDOG67el/view?usp=sharing
The limited size helps a lot not to reach the soft limit of 512Mb and the
hard limit of 1024Mb of RAM in Heroku (instance is destroyed and restarted
when you reach 1024Mb and swap memory is used when you surpass 512Mb).
I perceived some degradation in the result with the size reduction that I
will show in the following images but I think many people will find it
useful.
Thanks a lot for this awesome model and PLEASE showcase the webapp
https://silueta.me , I did it with a lot of love and respect (you will
see in the site that I mention U2Net and your research) to showcase it in
my portfolio (I'm a frontend/creative technologist), if I manage to get
some consistent visits in the site I will hopefully monetize it a little
and be able to purchase dedicated hardware resources to bring it to the
next level ;)
Obviously I will answer any questions from the community in this thread if
you find any of my know-how useful, don't be shy.
*These are the comparison images from the original model and the reduced
size model:*
Almost imperceptible difference with human portrait images:
*ORIGINAL U2Net Model*
[image: original1]
<https://user-images.githubusercontent.com/1637356/159105729-27a2b69b-7aa4-40c7-b457-183ff0d39abc.png>
*Reduced Size Model*
[image: reduced1]
<https://user-images.githubusercontent.com/1637356/159105750-72a5d5eb-76cd-4250-ae32-e3b9a2a21bda.png>
Here you can detect a little difference in the left shoulder:
*ORIGINAL U2Net Model*
[image: original1a]
<https://user-images.githubusercontent.com/1637356/159105782-66987644-2aec-4e7d-b2af-09c5eeb4fb80.png>
*Reduced Size Model*
[image: reduced1a]
<https://user-images.githubusercontent.com/1637356/159105790-c3d53afb-b93b-4a86-a711-46e94ba5db42.png>
The main difference is with non-human images. Here is one of my cats; in
this case there is little difference, you can notice a little less
'shadow' in the overall matting, but the result is very good in both:
*ORIGINAL U2Net Model*
[image: original2]
<https://user-images.githubusercontent.com/1637356/159105822-c6aa5b77-1d51-4088-ae8d-3ddb4b2e215e.png>
*Reduced Size Model*
[image: reduced2]
<https://user-images.githubusercontent.com/1637356/159105829-2624ef87-3823-474b-a64e-8ba4214160a7.png>
And last, one of the biggest examples of loss I managed to find, my other cat:
you can notice the bottom part of the body is way more transparent:
*ORIGINAL U2Net Model*
[image: original3]
<https://user-images.githubusercontent.com/1637356/159105853-a28a63b3-bedc-4d79-91f7-c4f3af08eefc.png>
*Reduced Size Model*
[image: reduced3]
<https://user-images.githubusercontent.com/1637356/159105862-b2c63dcc-5588-408e-b4c5-727c6305bd69.png>
--
Xuebin Qin
PhD
Department of Computing Science
University of Alberta, Edmonton, AB, Canada
Homepage: https://xuebinqin.github.io/
|
hi @Kikedao , could you please share your training code? model compression seems interesting. thank you |
Hi @anguoyang, for this version I didn't retrain it. I used the pretrained model and quantized it using the quantization utils from onnx-runtime, following the original docs. As you can see in the screenshots, there is a slight loss, but it was good enough for the end result I was after, and the gain in RAM consumption for inference is massive. |
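For readers who want to try the same step, here is a minimal sketch of post-training quantization with ONNX Runtime. The comment above does not say which utility was used, so the choice of `quantize_dynamic` with unsigned 8-bit weights is an assumption, and the file names are placeholders.

```python
from pathlib import Path


def quantize_model(src: str, dst: str) -> None:
    """Write an 8-bit weight-quantized copy of the ONNX model at `src` to `dst`."""
    # Imported here so the size helper below works without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    # Assumption: dynamic quantization with unsigned 8-bit weights.
    quantize_dynamic(src, dst, weight_type=QuantType.QUInt8)


def size_ratio(original_mb: float, quantized_mb: float) -> float:
    """How many times smaller the quantized file is."""
    return original_mb / quantized_mb


def file_mb(path: str) -> float:
    """File size in megabytes."""
    return Path(path).stat().st_size / 1e6


src, dst = "u2net.onnx", "u2net_quant.onnx"  # placeholder paths
if Path(src).exists():
    quantize_model(src, dst)
    print(f"{file_mb(src):.0f} MB -> {file_mb(dst):.0f} MB")

# Sizes reported in this thread: 170 MB -> 43 MB.
print(f"{size_ratio(170, 43):.2f}x smaller")
```

The ratio for the sizes reported here is close to 4, which is what one would expect when 32-bit weights are stored in 8 bits.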
@Kikedao thank you so much for the information. I want to put person segmentation on mobile/edge devices, so I retrained the lite person segmentation model from dennis (https://github.com/dennisbappert/u-2-net-portrait) with cleaned Supervisely images (2667 pairs). Unfortunately, the result is bad (after about 400 epochs). If I only do the quantization, it is difficult to get the model down to a very small size (say <5MB). I also trained the u2net-lite from this original repo, and the result was even worse. Strangely, when we train the lite model with SOD datasets, the result is not so bad. @xuebinqin could you please give us some advice? Thank you. |
BTW, we trained the u2net-lite of this repo for about 300 epochs. The train loss is around 0.2 and the tar is around 0.02. Although that is not small enough yet and it is still decreasing, the result is much worse than I expected, given that I trained u2net-lite before with SOD datasets. |
Hi @anguoyang, I see you are aiming at an extremely small model (<5Mb). The first foreground/background segmentation model I managed to train some years ago used my own dataset of almost 20k custom images, very good quality cutouts from KNN human-operated trimap segmentation, extended to 80k+ images through preprocessing (rotation, changes of brightness, contrast, etc.). I've worked with models around 5Mb for cute little things like pitch and tone detection, running in the browser client, but I have a strong intuition that 5Mb is WAY too low for any good CNN computer vision model. Maybe I'm very wrong; if that's the case please correct me. The model I quantized takes only 43Mb as a file, a good disk/RAM weight optimization if you ask me, but inference can bring the RAM usage to 500Mb+ at any time, so there is no way to use it on a mobile system in a mainstream app if you don't want to crash the average user's device (sadly). |
@anguoyang, as I finished writing the last comment I had an idea. If I'm not wrong, u2net is trained on a public dataset of 320x320px images. So, I would try this:
As you halve one side of a square, the area is reduced not by half but 4 times, so 'maybe' the weight of the model would be reduced in the same proportion, 4 times. The quantized model I proposed is 43Mb, down from the original 170Mb, so maybe you can reach at least <10Mb with the half-resolution input I'm proposing. I'm just a frontend developer, not a data scientist nor an expert in machine learning, so maybe this suggestion is silly, but I would give it a try. If it works... share your model and results with all of us please ;) |
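The arithmetic in this suggestion can be checked with a few lines. One caveat, added here and not something established in the thread: in a convolutional layer the number of weights depends on the kernel size and channel counts, not on the input resolution, so a smaller input would be expected to shrink the activation memory used during inference instead of the model file. The layer shape below is hypothetical.

```python
def conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights plus biases of one 2D convolution; image size does not appear."""
    return kernel * kernel * c_in * c_out + c_out


def activation_values(side: int, channels: int) -> int:
    """Number of values in one square feature map with `channels` channels."""
    return side * side * channels


# Hypothetical layer: 3x3 kernel, 64 input channels, 64 output channels.
params = conv_params(3, 64, 64)

full = activation_values(320, 64)  # feature map for a 320x320 input
half = activation_values(160, 64)  # same layer for a 160x160 input

print("parameters (same at both resolutions):", params)
print("activation reduction factor:", full // half)
```

So halving the input side would be expected to cut per-layer activation memory by about four, which may still help with the RAM limits discussed above even if the file size stays the same.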
Hi @Kikedao, if you don't mind, can you tell me which pre-trained model you selected? |
Hi, this looks very impressive!!! |
Hi @mutsuyuki and @its-jd , glad you liked the app! I used the official 'standard' u2net.onnx model from this repo and quantized it as I explained in the previous comments. You can download and use my quantized model with the code of this repo; it is as easy as replacing the original model file with mine, no code modification required. Also, if you are curious, in my web app I implemented some optimizations in the pre- and post-processing of the input image and output mask, to get slightly better results and mainly to reduce memory consumption, as I have it running in a free Heroku instance with very tight RAM limitations. I have a batch of optimizations I haven't been able to deploy in production yet because of the computing limitations of the hosting, but I would like to use them if I get a better server in the future. The idea is to use u2net to get a mask but not use it directly for the output; instead, use it to generate a trimap and perform KNN or closed-form alpha matting for even better results (inspired by the awesome PyMatting library: https://pymatting.github.io/). I have a little Ethereum mining rig at home with 6 GPUs, 8GB of VRAM each, so 48GB of sweet DDR5 memory (compare it to the 512Mb of 'normal' RAM limitation of Heroku ;) ), that I plan to use in the future to deploy the model and bring it to the next level with the optimizations I mentioned before, and also to retrain it with a larger dataset I already have. But at the moment I can't afford to invest the time in creating that service. I just did this to showcase that I can use and deploy ML models at a (humble) production scale with a clean professional UI, to add it to the rest of my portfolio as a frontend/creative developer.
Feel free to ask me anything, and sorry if my responses are too long. I try not to answer just the direct questions but also to give additional insights to anyone who reads this thread. It's my way of giving a little back to the community that has helped me so much with the amazing work of so many contributors. |
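The mask-to-trimap step mentioned above can be sketched as follows. This is not the code used in the web app, which is not shown in this thread; it is a minimal illustration assuming simple thresholding, with hypothetical threshold values and a tiny hand-written mask.

```python
def mask_to_trimap(mask, fg_threshold=0.9, bg_threshold=0.1):
    """Map a soft mask (rows of floats in [0, 1]) to a trimap.

    1.0 = sure foreground, 0.0 = sure background, 0.5 = unknown region
    that an alpha matting step is expected to resolve.
    """
    trimap = []
    for row in mask:
        out = []
        for value in row:
            if value >= fg_threshold:
                out.append(1.0)
            elif value <= bg_threshold:
                out.append(0.0)
            else:
                out.append(0.5)
        trimap.append(out)
    return trimap


# Hypothetical 2x3 soft mask, standing in for a u2net output.
soft_mask = [
    [0.02, 0.30, 0.95],
    [0.00, 0.60, 1.00],
]
trimap = mask_to_trimap(soft_mask)
print(trimap)
```

With real images the mask and trimap would be NumPy arrays, and the trimap could then be handed to PyMatting, for example `estimate_alpha_cf(image, trimap)` or `estimate_alpha_knn(image, trimap)`. A real pipeline would probably also erode and dilate the mask to widen the unknown band.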
@Kikedao awesome work! I tried to send an email to the Gmail address mentioned in your webapp, but it was returned to me with the error "email does not exist" :) Kind regards, Stefan Nielsen |
@Kikedao , are you able to make it work for AWS? |
Hi, thanks for the ONNX file.
Here is my ONNX code:
Update: I converted the model to ONNX my own way, and the results are faster than with the original weights: |
Hi @StefanHavbro , sorry, the email on the site was wrong. I corrected it and it should now show correctly in the footer of the web app.
Hi @ioskevinshah , it should be 'easy' to make it run on AWS. It depends on the EC2 instance you have, but I don't think it would be harder than on Heroku, where I have it now. In my experience the problem is that it would be VERY expensive. I used Flask and Gunicorn on Heroku. In the past I've used AWS SageMaker and AWS EC2 with Lambda and API Gateway to deploy similar models... but the cost is just too much, with little control over the expenses. I'm talking about EC2 instances with machine learning GPUs at that time (5-6 years ago). One of the amazing things about the u2net model is that it runs fast enough on a CPU and not only on CUDA/GPU, but AWS is a no-go for me for a personal free project.
Hi @hungtooc , wow, thanks a lot for your tests. It's striking how much the computation cost rises when the quantized model runs on the GPU with CUDA; it's very counter-intuitive. I don't have benchmarks, but I can assure you that on CPU the speed was barely affected. The app is running on a somewhat low-end CPU in Heroku (I think the free instances are a 'virtual' 2-core CPU at 2.5 GHz) and it takes at most 10 to 15 seconds depending on the input image (almost the same inference time as before quantization). On my personal computer, inference is way faster on CPU, 1 second more or less (Ryzen 3600 CPU, 12 threads at 4.2 GHz max, with fast RAM and overall decent hardware). But your tests are on GPU, and we are not talking about 10 seconds but 0.5 seconds at most even using my model. It's another world! With the original model you could almost do inference on realtime video, amazing metrics. I can't install CUDA on my machine right now to test it, so I can't help you, I'm very sorry. It's strange that you get such a big performance loss with the quantized model, around 10X worse on GPU, but I have no clue about it.
If I understood correctly, quantization is supposed to replace the big float numbers in the weights with smaller types (for example, 32-bit floats become 8-bit integers), reducing the model size a lot, but maybe this hurts the computation at inference time when using numpy and parallel tools like CUDA... I have no idea, in the end I'm just an amateur at ML :) |
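To make the idea of 8-bit quantization concrete, here is a small standard-library sketch of affine quantization of a few weights. It only illustrates the general technique (scale plus zero point); it is not ONNX Runtime's implementation, and the weight values are made up.

```python
from array import array


def quantize(weights):
    """Affine-quantize floats to uint8; returns (values, scale, zero_point)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant input
    zero_point = round(-lo / scale)
    values = array(
        "B",
        (min(255, max(0, round(w / scale) + zero_point)) for w in weights),
    )
    return values, scale, zero_point


def dequantize(values, scale, zero_point):
    """Recover approximate floats from the 8-bit values."""
    return [(v - zero_point) * scale for v in values]


weights = array("f", [-1.0, -0.5, 0.0, 0.25, 1.0])  # made-up float32 weights
q, scale, zp = quantize(weights)
restored = dequantize(q, scale, zp)

print("bytes per weight:", weights.itemsize, "->", q.itemsize)
print("largest rounding error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

Each weight drops from 4 bytes to 1, which matches the roughly fourfold size reduction reported in this thread, at the price of a small rounding error per weight.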
I would love to share the code of the whole app, but one of the main things I worked on was security, and I can't share it for the following reason, let me explain it a little: The inference code I used is almost the same as the inference code from this repo, my work and main difference was this:
I've worked over the years with websites with millions of users in a month, sometimes even millions in 3-4 days. Developing an awesome machine learning model and deploying it in production, even at a humble scale like mine, are two very different things, it takes a different skillset. I'm much better at the second one than the first so feel free to ask me anything, if I can help I will. |
Hi @Kikedao, here is my inference code for ONNX. I executed it on a free Colab GPU: https://colab.research.google.com/drive/1yfgGdoFDEOrGQCB2Cevl-GAAUChPQlzI?usp=sharing |
@hungtooc I just ran your Colab notebook. At first glance it's very nice, and the output speed metrics are awesome. But come on, I'm a frontend programmer, why don't you show me/us the input and output images? :) More seriously, it would be nice if you posted input and output image comparisons, side by side, so people reading this thread in the future can get good info. |
Can you help by sending the code to convert the original model to ONNX? [email protected]
Hi @Kikedao , this is an awesome app you have built. Much respect. I just have one question. Are you using any other pre- or post-processing method too? I compared the official implementation's results with your web app's results, and even though you've quantized the weights, your results are better on a couple of images that I tried with the original model. I'm just curious how that could be unless you're using some extra pre- or post-processing, given that you say you have not retrained the model. |
@Kikedao could you share what pre and post processing you are using in your app? results are way better than original |
@Kikedao could you share the pre and post processing you are using? I have tried some examples and the difference is astronomical. it is as if you are using another model. I've tried some morphological filters and tried to tune alpha matting but they did not help at all. |
Hi @Kikedao, thank you for sharing your great work!! It really works well, and even seems better than SOTA matting algorithms for some images. Here are some results on non-human samples. |
@Kikedao @anguoyang @juergengunz @xuebinqin @mutsuyuki I am new to the world of ML and AI, and I don't know how to convert this .pth file into an ONNX file that I can use in the rembg code. |
A simple |
@deshwalmahesh thanks for the reply. The problem was that I was not able to create the dummy inputs for the ONNX conversion, but I found code that converts u2net.pth to u2net.onnx. |
Keep on going further in that blog, you'll find a code where they show you the inference too. Start from basic of concepts instead of directly using it on advanced concepts. |
I converted .pth to .onnx but get wrong inference results with ONNX. You can see the details in this issue WRONG. I spent about 3 days trying to solve this problem, with no results. So sad... Can you take a look at what is wrong? Or should I modify the model structure? Thanks for your help. |
@Kikedao Thanks for sharing the smaller model. I also have a size limitation when deploying my app. When I use rembg, it automatically downloads the models. Could you share how to make rembg work with your model if I include your model file in my project? |
Thanks for your great work. |
torch to onnx model conversion script:
|
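Several comments in this thread ask how the conversion and its dummy input work. A minimal sketch is given below; it is not any poster's script. It assumes it is run from the root of this repository (so that `from model import U2NET` resolves), that the weights are the full-size u2net.pth, and that a 320x320 input and opset 11 are acceptable; the function name and paths are hypothetical.

```python
DUMMY_SHAPE = (1, 3, 320, 320)  # batch, RGB channels, height, width


def export_to_onnx(weights_path: str, onnx_path: str) -> None:
    """Load U2NET weights from a .pth file and export the network to ONNX."""
    # Imported lazily: requires PyTorch and this repository on the Python path.
    import torch
    from model import U2NET  # assumption: executed from the repo root

    net = U2NET(3, 1)  # 3 input channels, 1 output channel
    net.load_state_dict(torch.load(weights_path, map_location="cpu"))
    net.eval()

    # The dummy input only needs the right shape and dtype; the exporter
    # traces the operations applied to it, so random values are fine.
    dummy_input = torch.randn(*DUMMY_SHAPE)
    torch.onnx.export(
        net,
        dummy_input,
        onnx_path,
        opset_version=11,
        input_names=["input"],
    )
```

A call such as `export_to_onnx("saved_models/u2net/u2net.pth", "u2net.onnx")` would then write the ONNX file; the paths here are placeholders for wherever the weights actually live.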
This app is having CORS error and getting stuck at starting: "Access to XMLHttpRequest at 'https://superportraitsegmentation.herokuapp.com/status' from origin 'https://silueta.me' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource." shown in browser console. |
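For context on this error: the browser first sends an OPTIONS "preflight" request and refuses to continue unless the response carries an `Access-Control-Allow-Origin` header matching the page's origin. The backend code is not shown in this thread, so the following is only a generic standard-library WSGI sketch of a server that would pass that check; the response body and the list of allowed methods are assumptions.

```python
import json

ALLOWED_ORIGIN = "https://silueta.me"  # origin named in the error message


def app(environ, start_response):
    """Tiny WSGI app that answers CORS preflights and a /status request."""
    cors = [
        ("Access-Control-Allow-Origin", ALLOWED_ORIGIN),
        ("Access-Control-Allow-Methods", "GET, POST, OPTIONS"),
        ("Access-Control-Allow-Headers", "Content-Type"),
    ]
    if environ["REQUEST_METHOD"] == "OPTIONS":
        # Preflight: no body, only the headers the browser checks.
        start_response("204 No Content", cors)
        return [b""]
    if environ.get("PATH_INFO") == "/status":
        body = json.dumps({"status": "ok"}).encode()
        start_response("200 OK", cors + [("Content-Type", "application/json")])
        return [body]
    start_response("404 Not Found", cors + [("Content-Type", "text/plain")])
    return [b"not found"]


def call(method, path):
    """Invoke the app directly, returning (status, headers dict, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {"REQUEST_METHOD": method, "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, _ = call("OPTIONS", "/status")
print(status, headers["Access-Control-Allow-Origin"])
```

In a Flask application the same headers are usually added through an extension such as flask-cors instead of by hand.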
Hi @xuebinqin !
I made a webapp that uses U2net at its core, site is https://silueta.me, it would be great if you showcase it in the readme :)
I struggled a lot to make it work in a free Heroku instance, limited to 512Mb of RAM, one of the things I achieved, and that I have seen asked for a lot in the issues, is I managed to reduce the model size to 43Mb from the 170Mb of the original model.
So I want to share with you all the reduced size model, feel free to test it and add it to the pretrained models in the sourcecode if you wish so:
https://drive.google.com/file/d/14Uy2F2i59MZONzf4eyRtpnZWiJB60drL/view?usp=sharing
The limited size helps a lot not to reach the soft limit of 512Mb and the hard limit of 1024Mb of RAM in Heroku (instance is destroyed and restarted when you reach 1024Mb and swap memory is used when you surpass 512Mb).
I perceived some degradation in the result with the size reduction that I will show in the following images but I think many people will find it useful.
Thanks a lot for this awesome model and PLEASE showcase the webapp https://silueta.me , I did it with a lot of love and respect (you will see in the site that I mention U2Net and your research) to showcase it in my portfolio (I'm a frontend/creative technologist), if I manage to get some consistent visits in the site I will hopefully monetize it a little and be able to purchase dedicated hardware resources to bring it to the next level ;)
Obviously I will answer any questions from the community in this thread if you find any of my know-how useful, don't be shy.