
Does nunif/waifu2x support pair training (x to y mapping) like the previous version does? #250

Open · SAOMDVN opened this issue Nov 6, 2024 · 5 comments

Comments

SAOMDVN commented Nov 6, 2024

Hi. I am quite impressed with the performance of the models and methods used in waifu2x, and I want to train my own model based on them. It would be a 1x mapping from an image x (RGBA) to an image y (RGB or L), similar to a noise model. However, the changes can't be generated on the fly like noise can. It seems the old code supported this functionality: nagadomi/waifu2x#193. Is it possible in this new repo too? If so, how do I arrange my dataset files accordingly?

Thank you for reading

nagadomi (Owner) commented Nov 7, 2024

Currently not supported.
In the previous repo, I used that feature for text block segmentation for manga speech bubbles.
If there is some kind of image-to-image conversion model that would be useful and popular, I might be able to support it.

SAOMDVN (Author) commented Nov 16, 2024

I didn't have much time to understand your codebase, but from what I can see, it seems the Waifu2xDataset class generates noise from the clean images in the train set, feeds the noisy images to the model as input, and uses the clean images as the expected output. Based on that, if one modifies/overrides the noise-generation function to transform the images in a certain way, they can achieve custom x to y mapping functionality, can't they? Just wanted to make sure my understanding is correct.
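To make sure I'm describing the same pattern, here is a rough sketch of the idea (this is not the actual Waifu2xDataset code; the class name, transform argument, and loading logic are all made up), where the input x is derived from the clean target y by a custom transform instead of by generated noise:

```python
import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision.transforms import functional as TF


class CustomPairDataset(Dataset):
    """Hypothetical sketch: derive the model input x from the clean target y
    with a user-supplied transform, in place of on-the-fly noise generation."""

    def __init__(self, image_dir, x_transform):
        self.files = sorted(
            os.path.join(image_dir, name) for name in os.listdir(image_dir)
        )
        self.x_transform = x_transform  # plays the role of the noise generator

    def __len__(self):
        return len(self.files)

    def __getitem__(self, index):
        y = Image.open(self.files[index]).convert("RGB")  # clean target
        x = self.x_transform(y)                           # custom x derived from y
        return TF.to_tensor(x), TF.to_tensor(y)
```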

Regarding my use case, I want to train a model that can remove the background from anime illustrations. My targets are very specific illustrations with a solid white background, in a specific artist's art style. Of course, it is not meant to be a plug-and-play solution; it is more of an intermediate processing step before I come in and manually clean up the background to fit my personal standard.

This picture is the result of a model I trained based on TensorFlow's pix2pix tutorial. Left is the input and right is the output, which is meant to be an alpha mask. The main focus is the edges, as those empty black areas inside the character can be easily filled in by a human. I want the model to perform better and be more accurate. Also, the model is quite heavy, with a 600 MB checkpoint file and 15 GB of VRAM usage, which is why I am looking to use your model instead, as it was able to achieve brilliant quality for both upscaling and denoising with a small model.

Miku

nagadomi (Owner) commented

For character (person) segmentation, characters often cover a large area of the image, so it is important to use the global context of the entire image. Existing waifu2x models only use a small area, such as 64x64 blocks, so they may not be suitable for that application.

Also, background removal is a popular task; you can see many pre-trained models at
https://github.com/danielgatis/rembg?tab=readme-ov-file#models
anime-segmentation: https://github.com/SkyTNT/anime-segmentation/tree/main


For custom x, y input images: currently, x and y are generated from a single image (`im`) at

`x, y = self.transforms(im, im)`

To use two different images for x and y, it is possible to generate them from those two images like

`x, y = self.transforms(im_x, im_y)`
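As a rough illustration only (this is not the actual nunif dataset code; the class name, directory layout, and augmentation details below are hypothetical), a paired dataset that loads im_x and im_y from two folders matched by filename and applies the same random crop/flip to both could look like:

```python
import os
import random
from PIL import Image
from torch.utils.data import Dataset
from torchvision.transforms import functional as TF


class PairedImageDataset(Dataset):
    """Hypothetical sketch: x and y come from two different images that share a
    filename, and the same spatial augmentation is applied to both."""

    def __init__(self, x_dir, y_dir, size=64):
        self.x_dir, self.y_dir, self.size = x_dir, y_dir, size
        self.names = sorted(os.listdir(x_dir))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, index):
        name = self.names[index]
        im_x = Image.open(os.path.join(self.x_dir, name)).convert("RGBA")
        im_y = Image.open(os.path.join(self.y_dir, name)).convert("L")

        # identical random crop for input and target
        top = random.randint(0, im_x.height - self.size)
        left = random.randint(0, im_x.width - self.size)
        im_x = TF.crop(im_x, top, left, self.size, self.size)
        im_y = TF.crop(im_y, top, left, self.size, self.size)

        # identical random horizontal flip
        if random.random() < 0.5:
            im_x, im_y = TF.hflip(im_x), TF.hflip(im_y)

        return TF.to_tensor(im_x), TF.to_tensor(im_y)
```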

SAOMDVN (Author) commented Dec 15, 2024

Hi. Thanks for the help! I decided to continue with waifu2x before trying out the other solutions you suggested. Here is my model's current accuracy:
Miku

I trained it with the following command:

python train.py waifu2x --method noise --noise-level -1 --num-workers 4 --max-epoch 30 --arch waifu2x.swin_unet_1x --loss lbp --size 64 --disable-amp --data-dir ../data/ --model-dir ../model/

As you can see, the edges are a little jagged. I am unsure if this is because of the low number of epochs trained or a dataset that is not big enough (my dataset has 10 images in eval and 18 images in train, all in 4K or above, which when split with create_training_data.py becomes 3366 images in eval and 5128 images in train).

I am currently training 30 more epochs (with --resume) to see if it performs better; in the meantime, I am seeking any wisdom you have that can help improve my model. Thank you a lot!

For reference, here is my fork with my modifications: nunif fork


Edit: After 30 more epochs (60 epochs total) the model hasn't improved at all. (No `* best model updated` log message, and the model .pth MD5 hash is the same except for the checkpoint file.) 😭

nagadomi (Owner) commented Dec 15, 2024

It looks like overfitting to the white color, so you may want to use a random background color or composite with a random background image.
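For example (just a sketch; the function name and the 50/50 choice between a solid color and a background image are arbitrary), the compositing step could look like this, with the alpha channel reused as the segmentation target:

```python
import random
from PIL import Image


def composite_random_background(fg_rgba, bg_images=None):
    """Hypothetical sketch: paste an RGBA character cutout over a random solid
    color or a random background image, so the model does not overfit to a
    plain white background. The alpha channel is kept as the target mask."""
    if bg_images and random.random() < 0.5:
        bg = random.choice(bg_images).resize(fg_rgba.size).convert("RGBA")
    else:
        color = tuple(random.randint(0, 255) for _ in range(3)) + (255,)
        bg = Image.new("RGBA", fg_rgba.size, color)

    x = Image.alpha_composite(bg, fg_rgba).convert("RGB")  # model input
    y = fg_rgba.getchannel("A")                            # alpha mask target
    return x, y
```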

As for the training commands:
By default, a cyclic learning rate is used, and the number of cycles can be specified with --learning-rate-cycles (default 5).
When --max-epoch is low, it may not work well because the learning rate changes in very short cycles.
It may be worth trying --max-epoch 200 --num-samples 5000 (--num-samples 5000 will reduce the time per epoch to about 1/10 of the default).
When changing options, adding the --resume --reset-state options allows you to start from a previously trained model.

The fundamental problem, as I wrote above, is that it is difficult to segment characters from the background within small 64x64 areas.
