Denoising Diffusion Implicit Model for Generating Anime Faces

1. Introduction

  • Here we will train a diffusion model to generate anime faces
  • The dataset can be downloaded from the Kaggle anime face dataset. Download it to the dataset directory and put all the images under anime/raw/images; when you finish, the dataset looks like this (a quick sanity check follows the tree):
dataset
├── anime
│   └── raw
│       └── images
│           ├── 46651_2014.jpg
│           ├── 4665_2003.jpg
│           ├── ...
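A quick sanity check that the raw images landed where expected (the path follows the tree above):

```python
from pathlib import Path

raw = Path("dataset/anime/raw/images")
n = len(list(raw.glob("*.jpg")))
print(f"found {n} raw images")
assert n > 0, "no images found -- check the directory layout above"
```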
  • Then we need to process these raw images. This has already been done; you can follow the same procedure as in VAE_ANIME (a sketch of the processing follows the tree). Afterwards, your directory looks like this:
dataset
├── anime
│   ├── processed
│   │   └── images
│   │       ├── 46651_2014.jpg
│   │       ├── 4665_2003.jpg
│   │       ├── ...
│   └── raw
│       └── images
│           ├── 46651_2014.jpg
│           ├── 4665_2003.jpg
│           ├── ...
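If you want to redo the processing yourself, it amounts to resizing the raw images into processed/images. A minimal sketch, assuming a 64×64 target resolution (check VAE_ANIME for the exact transform actually used):

```python
import os
from PIL import Image

RAW_DIR = "dataset/anime/raw/images"
PROCESSED_DIR = "dataset/anime/processed/images"
os.makedirs(PROCESSED_DIR, exist_ok=True)

for name in os.listdir(RAW_DIR):
    img = Image.open(os.path.join(RAW_DIR, name)).convert("RGB")
    img = img.resize((64, 64), Image.BICUBIC)  # assumed resolution; match VAE_ANIME
    img.save(os.path.join(PROCESSED_DIR, name))
```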

2. Load dataset, Build model, Train model

  • For this task the code is nearly the same as DDPM_ANIME; we only modify model.py and run.py to support DDIM generation (a sketch of the key change appears after this list)
  • You don't have to train again; just use the checkpoints from your DDPM_ANIME
  • Here I run the program through a shell script, which makes sure the random noise is the same every time we run it:
sh sample.sh
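For reference, here is a minimal sketch of what the DDIM change in model.py boils down to. Names like model and alpha_bar are assumptions, not the repo's actual identifiers, and the fixed seed at the top stands in for what running via sample.sh guarantees:

```python
import torch

@torch.no_grad()
def ddim_sample(model, alpha_bar, shape, sample_steps=100, eta=0.0, T=1000, device="cpu"):
    """DDIM sampling over a strided subsequence of the T training steps.

    alpha_bar: (T,) tensor of cumulative products of (1 - beta_t).
    eta=0.0 is deterministic DDIM; eta=1.0 recovers DDPM-style sampling.
    """
    torch.manual_seed(0)  # fixed seed, so the initial noise is identical on every run
    ts = torch.linspace(T - 1, 0, sample_steps).long()  # e.g. 1000 training steps, 100 used
    x = torch.randn(shape, device=device)
    for i, t in enumerate(ts):
        a_t = alpha_bar[t]
        a_prev = alpha_bar[ts[i + 1]] if i + 1 < len(ts) else torch.tensor(1.0, device=device)
        eps = model(x, t.expand(shape[0]).to(device))        # predicted noise
        x0_pred = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()  # implied x_0
        # sigma controls stochasticity: 0 when eta=0 (deterministic DDIM)
        sigma = eta * ((1 - a_prev) / (1 - a_t)).sqrt() * (1 - a_t / a_prev).sqrt()
        x = a_prev.sqrt() * x0_pred + (1 - a_prev - sigma ** 2).sqrt() * eps
        if i + 1 < len(ts):
            x = x + sigma * torch.randn_like(x)
    return x
```

sample.sh presumably wraps a call like this, loading the DDPM_ANIME checkpoint first.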

3. Check the quality of generated images

  • First, let's look at the quality of the generated images after 10, 100, and 1000 sampling steps respectively (a usage sketch follows these figures):
DDIM sample_steps=10

[figure: sampled anime faces, 10 steps]

DDIM sample_steps=100

[figure: sampled anime faces, 100 steps]

DDIM sample_steps=1000

[figure: sampled anime faces, 1000 steps]
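All three grids come from the same checkpoint and the same initial noise; only sample_steps changes. With the hypothetical sketch above, the comparison would be produced like this (batch and image sizes are assumptions):

```python
for steps in (10, 100, 1000):
    imgs = ddim_sample(model, alpha_bar, shape=(64, 3, 64, 64), sample_steps=steps)
    # save each grid, e.g. with torchvision.utils.save_image(imgs, f"step_{steps}.png")
```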

  • When we set $\eta$ to other values, the results are as follows (the formula for the noise scale $\sigma_t$ is given after these figures):
sample_steps=100, eta=0.2

[figure: sampled anime faces, 100 steps, eta=0.2]

sample_steps=100, eta=0.5

[figure: sampled anime faces, 100 steps, eta=0.5]

DDPM sample_steps=100, eta=1.0

[figure: sampled anime faces, 100 steps, eta=1.0]
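Why does $\eta = 1.0$ turn DDIM into DDPM? In the DDIM parameterization (consistent with the sampler sketch above), the per-step noise scale is

$$\sigma_t = \eta \, \sqrt{\frac{1 - \bar\alpha_{t-1}}{1 - \bar\alpha_t}} \, \sqrt{1 - \frac{\bar\alpha_t}{\bar\alpha_{t-1}}}$$

so $\eta = 0$ gives $\sigma_t = 0$ (a fully deterministic process), while $\eta = 1$ makes $\sigma_t$ the standard deviation of the DDPM posterior, which is why the last grid is labeled DDPM.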

  • I also ran another experiment: add noise to the original image (the forward process), then generate from the noisy image to see whether the model can recover the original. I set t = 100 for both the forward and the backward process. Below are the results (first column: original image; second column: noisy image after t steps of added noise; third column: image generated from it with DDIM). A sketch of the procedure follows the figure.

[figure: recovery results, t=100 (original | noisy | generated)]
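The recovery experiment is easy to sketch: the forward process has a closed form, and the backward pass is deterministic DDIM started from the noisy image at step t instead of from pure noise (names again follow the hypothetical sampler sketch above):

```python
import torch

def recover(model, alpha_bar, x0, t=100):
    # forward process in closed form: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps
    eps = torch.randn_like(x0)
    x_t = alpha_bar[t].sqrt() * x0 + (1 - alpha_bar[t]).sqrt() * eps
    # deterministic DDIM (eta = 0) from step t back down to 0
    x = x_t
    for step in range(t, 0, -1):
        a_t, a_prev = alpha_bar[step], alpha_bar[step - 1]
        e = model(x, torch.full((x0.shape[0],), step, device=x0.device))
        x0_pred = (x - (1 - a_t).sqrt() * e) / a_t.sqrt()
        x = a_prev.sqrt() * x0_pred + (1 - a_prev).sqrt() * e
    return x0, x_t, x  # the three columns in the figure above
```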

  • We can see the generated images are of good quality with DDIM even at 10 steps, which is much faster than DDPM. DDIM's generative process is also deterministic, so the high-level features of the images stay the same across step counts, whereas DDPM's do not. As for recovering the original image, DDIM cannot do it exactly, but I think its result is still better than DDPM's
  • For more details, you can clone the project and run it yourself

4. Some references