Denoising Diffusion Implicit Model for Generating Anime Faces

1. Introduction

  • Here we will train a diffusion model to generate anime faces
  • The dataset can be downloaded from the Kaggle anime face dataset. Download it to the dataset directory and put all the images under anime/raw/images; when you finish, the dataset looks like this (a quick sanity check follows the tree):
dataset
├── anime
│   └── raw
│       └── images
│           ├── 46651_2014.jpg
│           ├── 4665_2003.jpg
│           ├── ...
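A quick sanity check that the raw images landed where expected (the path follows the tree above):

```python
from pathlib import Path

raw = Path("dataset/anime/raw/images")
n = len(list(raw.glob("*.jpg")))
print(f"found {n} raw images")
assert n > 0, "no images found -- check the directory layout above"
```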
  • Then we need to process these raw images. This has already been done; you can follow the same procedure as in VAE_ANIME (a sketch of the processing follows the tree). Afterwards, your directory looks like this:
dataset
├── anime
│   ├── processed
│   │   └── images
│   │       ├── 46651_2014.jpg
│   │       ├── 4665_2003.jpg
│   │       ├── ...
│   └── raw
│       └── images
│           ├── 46651_2014.jpg
│           ├── 4665_2003.jpg
│           ├── ...
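If you want to redo the processing yourself, it amounts to resizing the raw images into processed/images. A minimal sketch, assuming a 64×64 target resolution (check VAE_ANIME for the exact transform actually used):

```python
import os
from PIL import Image

RAW_DIR = "dataset/anime/raw/images"
PROCESSED_DIR = "dataset/anime/processed/images"
os.makedirs(PROCESSED_DIR, exist_ok=True)

for name in os.listdir(RAW_DIR):
    img = Image.open(os.path.join(RAW_DIR, name)).convert("RGB")
    img = img.resize((64, 64), Image.BICUBIC)  # assumed resolution; match VAE_ANIME
    img.save(os.path.join(PROCESSED_DIR, name))
```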

2. Load dataset, Build model, Train model

  • For this task the code is nearly the same as DDPM_ANIME; we only modify model.py and run.py to support DDIM generation (a sketch of the key change appears after this list)
  • You don't have to train again; just use the checkpoints from your DDPM_ANIME
  • Here I run the program through a shell script, which makes sure the random noise is the same every time we run it:
sh sample.sh
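For reference, here is a minimal sketch of what the DDIM change in model.py boils down to. Names like model and alpha_bar are assumptions, not the repo's actual identifiers, and the fixed seed at the top stands in for what running via sample.sh guarantees:

```python
import torch

@torch.no_grad()
def ddim_sample(model, alpha_bar, shape, sample_steps=100, eta=0.0, T=1000, device="cpu"):
    """DDIM sampling over a strided subsequence of the T training steps.

    alpha_bar: (T,) tensor of cumulative products of (1 - beta_t).
    eta=0.0 is deterministic DDIM; eta=1.0 recovers DDPM-style sampling.
    """
    torch.manual_seed(0)  # fixed seed, so the initial noise is identical on every run
    ts = torch.linspace(T - 1, 0, sample_steps).long()  # e.g. 1000 training steps, 100 used
    x = torch.randn(shape, device=device)
    for i, t in enumerate(ts):
        a_t = alpha_bar[t]
        a_prev = alpha_bar[ts[i + 1]] if i + 1 < len(ts) else torch.tensor(1.0, device=device)
        eps = model(x, t.expand(shape[0]).to(device))        # predicted noise
        x0_pred = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()  # implied x_0
        # sigma controls stochasticity: 0 when eta=0 (deterministic DDIM)
        sigma = eta * ((1 - a_prev) / (1 - a_t)).sqrt() * (1 - a_t / a_prev).sqrt()
        x = a_prev.sqrt() * x0_pred + (1 - a_prev - sigma ** 2).sqrt() * eps
        if i + 1 < len(ts):
            x = x + sigma * torch.randn_like(x)
    return x
```

sample.sh presumably wraps a call like this, loading the DDPM_ANIME checkpoint first.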

3. Check the quality of generated images

  • First, let's look at the quality of the generated images after 10, 100, and 1000 sampling steps respectively (a usage sketch follows these figures):
DDIM sample_steps=10

[figure: sampled anime faces, 10 steps]

DDIM sample_steps=100

[figure: sampled anime faces, 100 steps]

DDIM sample_steps=1000

[figure: sampled anime faces, 1000 steps]
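All three grids come from the same checkpoint and the same initial noise; only sample_steps changes. With the hypothetical sketch above, the comparison would be produced like this (batch and image sizes are assumptions):

```python
for steps in (10, 100, 1000):
    imgs = ddim_sample(model, alpha_bar, shape=(64, 3, 64, 64), sample_steps=steps)
    # save each grid, e.g. with torchvision.utils.save_image(imgs, f"step_{steps}.png")
```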

  • When we set $\eta$ to other values, the results are as follows (the formula for the noise scale $\sigma_t$ is given after these figures):
sample_steps=100, eta=0.2

[figure: sampled anime faces, 100 steps, eta=0.2]

sample_steps=100, eta=0.5

[figure: sampled anime faces, 100 steps, eta=0.5]

DDPM sample_steps=100, eta=1.0

[figure: sampled anime faces, 100 steps, eta=1.0]
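Why does $\eta = 1.0$ turn DDIM into DDPM? In the DDIM parameterization (consistent with the sampler sketch above), the per-step noise scale is

$$\sigma_t = \eta \, \sqrt{\frac{1 - \bar\alpha_{t-1}}{1 - \bar\alpha_t}} \, \sqrt{1 - \frac{\bar\alpha_t}{\bar\alpha_{t-1}}}$$

so $\eta = 0$ gives $\sigma_t = 0$ (a fully deterministic process), while $\eta = 1$ makes $\sigma_t$ the standard deviation of the DDPM posterior, which is why the last grid is labeled DDPM.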

  • I also ran another experiment: add noise to the original image (the forward process), then generate from the noisy image to see whether the model can recover the original. I set t = 100 for both the forward and the backward process. Below are the results (first column: original image; second column: noisy image after t steps of added noise; third column: image generated from it with DDIM). A sketch of the procedure follows the figure.

[figure: recovery results, t=100 (original | noisy | generated)]
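The recovery experiment is easy to sketch: the forward process has a closed form, and the backward pass is deterministic DDIM started from the noisy image at step t instead of from pure noise (names again follow the hypothetical sampler sketch above):

```python
import torch

def recover(model, alpha_bar, x0, t=100):
    # forward process in closed form: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps
    eps = torch.randn_like(x0)
    x_t = alpha_bar[t].sqrt() * x0 + (1 - alpha_bar[t]).sqrt() * eps
    # deterministic DDIM (eta = 0) from step t back down to 0
    x = x_t
    for step in range(t, 0, -1):
        a_t, a_prev = alpha_bar[step], alpha_bar[step - 1]
        e = model(x, torch.full((x0.shape[0],), step, device=x0.device))
        x0_pred = (x - (1 - a_t).sqrt() * e) / a_t.sqrt()
        x = a_prev.sqrt() * x0_pred + (1 - a_prev).sqrt() * e
    return x0, x_t, x  # the three columns in the figure above
```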

  • We can see the generated images are of good quality with DDIM even at 10 steps, which is much faster than DDPM. DDIM's generative process is also deterministic, so the high-level features of the images stay the same across step counts, whereas DDPM's do not. As for recovering the original image, DDIM cannot do it exactly, but I think its result is still better than DDPM's
  • For more details, you can clone the project and run it yourself

4. Some references