
IndexError on num_inference_steps values: 3, 9, 27, ...? #444

Closed
sausix opened this issue Sep 9, 2022 · 20 comments
Labels
bug Something isn't working

Comments

@sausix

sausix commented Sep 9, 2022

Describe the bug

Some num_inference_steps values seem to be "problematic" and throw an IndexError.

The iteration progress output also does not end on the requested value in any obvious relation. If this is intentional, it is still confusing:

num_inference_steps: 5
6it [00:49, 8.27s/it]

num_inference_steps: 6
8it [01:02, 7.87s/it]

num_inference_steps: 7
9it [01:10, 7.88s/it]

num_inference_steps: 8
9it [01:13, 8.19s/it]

Reproduction

from diffusers import StableDiffusionPipeline

TOKEN = "(censored)"
MODEL_ID = "CompVis/stable-diffusion-v1-4"

pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, use_auth_token=TOKEN)
res = pipe("a photo of an astronaut riding a horse on mars", num_inference_steps=3, guidance_scale=7.5)

Tested on the command line too; unrelated to the PyCharm environment.

Logs

/usr/bin/python3 /home/as/.config/JetBrains/PyCharm2022.2/scratches/scratch.py 

0it [00:07, ?it/s]
Traceback (most recent call last):
  File "/home/as/.config/JetBrains/PyCharm2022.2/scratches/scratch.py", line 7, in <module>
    res = pipe("a photo of an astronaut riding a horse on mars", num_inference_steps=3, guidance_scale=7.5)
  File "/home/as/.local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 148, in __call__
    latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs)["prev_sample"]
  File "/usr/lib/python3.10/site-packages/diffusers/schedulers/scheduling_pndm.py", line 136, in step
    return self.step_plms(model_output=model_output, timestep=timestep, sample=sample)
  File "/usr/lib/python3.10/site-packages/diffusers/schedulers/scheduling_pndm.py", line 212, in step_plms
    prev_sample = self._get_prev_sample(sample, timestep, prev_timestep, model_output)
  File "/usr/lib/python3.10/site-packages/diffusers/schedulers/scheduling_pndm.py", line 230, in _get_prev_sample
    alpha_prod_t = self.alphas_cumprod[timestep + 1 - self._offset]
IndexError: index 1000 is out of bounds for dimension 0 with size 1000

System Info

  • diffusers version: 0.4.0.dev0
  • Platform: Linux-5.19.7-arch1-1-x86_64-with-glibc2.36
  • Python version: 3.10.6
  • PyTorch version (GPU?): 1.12.1+cu102 (True)
  • Huggingface_hub version: 0.9.1
  • Transformers version: 4.21.3
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No
@sausix sausix added the bug Something isn't working label Sep 9, 2022
@jonatanklosko
Contributor

I believe this is the issue well summarized in this comment.

The original bug comes from the DDIMScheduler:

>>> from diffusers import DDIMScheduler
>>> scheduler = DDIMScheduler()
>>> scheduler.set_timesteps(3, offset=1)
>>> scheduler.timesteps
tensor([1000,  667,  334,    1])

Note how the number of timesteps is not the 3 we specified.

The same off-by-1 issue is present in PNDMScheduler:

>>> from diffusers import PNDMScheduler
>>> scheduler = PNDMScheduler(skip_prk_steps=True)
>>> scheduler.set_timesteps(3, offset=1)
>>> scheduler.timesteps
[1000, 667, 667, 334, 1]

The expected number of steps is 4, not 5. Side note: it is a bit confusing that set_timesteps(3) should result in 4 timesteps, but that's because it's actually set_timesteps(num_ddim_timesteps).
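The traceback in the original report follows directly from that extra leading timestep. As a minimal sketch (the arange-based schedule below is an assumed reconstruction of the scheduler arithmetic, not the actual diffusers source), 3 steps with offset=1 produce a schedule starting at 1000, one past the end of the 1000-entry alphas_cumprod table:

```python
import numpy as np

# Assumed reconstruction of the old evenly spaced timestep schedule
# (not the actual diffusers source).
num_train_timesteps, num_inference_steps, offset = 1000, 3, 1
step_ratio = num_train_timesteps // num_inference_steps  # 333
timesteps = np.arange(0, num_train_timesteps, step_ratio)[::-1] + offset
print(timesteps.tolist())  # [1000, 667, 334, 1] -- 4 entries, not 3

# alphas_cumprod has exactly num_train_timesteps entries, so indexing
# with the leading timestep of 1000 is out of bounds:
alphas_cumprod = np.linspace(0.9999, 0.0001, num_train_timesteps)
# alphas_cumprod[timesteps[0]]  # IndexError: index 1000 is out of bounds
```

This matches both the timesteps tensor shown above and the IndexError in the original report.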

@jonatanklosko
Contributor

A loosely related question: why do we need the offset option? As far as I can see, the original implementation doesn't have a configurable offset and it is assumed to be 1. Also, in the stable diffusion pipeline it is always set to 1:

accepts_offset = "offset" in set(inspect.signature(self.scheduler.set_timesteps).parameters.keys())
extra_set_kwargs = {}
if accepts_offset:
    extra_set_kwargs["offset"] = 1

And it's not clear from the docs :D
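For reference, the detection pattern in that snippet can be exercised on its own; the set_timesteps function below is a hypothetical stand-in for the scheduler method:

```python
import inspect

def set_timesteps(num_inference_steps, offset=0):
    """Hypothetical stand-in for a scheduler's set_timesteps method."""

# Same pattern as the pipeline: pass offset=1 only when the
# scheduler's signature accepts an `offset` parameter.
accepts_offset = "offset" in set(inspect.signature(set_timesteps).parameters.keys())
extra_set_kwargs = {}
if accepts_offset:
    extra_set_kwargs["offset"] = 1
print(extra_set_kwargs)  # {'offset': 1}
```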

@sausix
Author

sausix commented Sep 9, 2022

> I believe this is the issue well summarized in this comment.

I am sorry for opening a duplicate report; I did not see that one. I searched for "num_inference_steps", not for "ddim_steps". But searching now for "num_inference_steps" also does not surface this thread.

@jonatanklosko
Contributor

@sausix the issue I referenced is in another repository, so it's all good :)

@natolambert
Contributor

Thanks for all the details on this, I'm going to take a look.

@natolambert
Contributor

@jonatanklosko I wrote the scheduler docs and saw the offset parameter in multiple schedulers. When I worked on them earlier it wasn't there, so I need to do more reading before writing accurate docs. I'll look at this too (maybe we should open another issue for it, depending on how fast all of these can be solved).

@natolambert
Contributor

natolambert commented Sep 10, 2022

Okay, I've tracked down what offset is for: it seems to be used to make the final inference timestep in stable diffusion 1 rather than 0 (as you said). The option is just a more elegant approach that makes it future proof, but I see your point that a bool may be better (if you want that, open a new issue).

You can see where it is used in diffusers with this search (the stable diffusion pipelines). This is matching the stable diffusion implementation here (with a weird comment from the authors).
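To illustrate the effect (again using an assumed arange-based schedule rather than the actual diffusers code): with offset=0 the schedule ends at timestep 0, while offset=1 shifts every timestep up by one so inference ends at 1:

```python
import numpy as np

def make_timesteps(num_inference_steps, offset, num_train_timesteps=1000):
    # Assumed reconstruction of the evenly spaced schedule.
    step_ratio = num_train_timesteps // num_inference_steps
    return np.arange(0, num_train_timesteps, step_ratio)[::-1] + offset

print(make_timesteps(4, offset=0).tolist())  # [750, 500, 250, 0]
print(make_timesteps(4, offset=1).tolist())  # [751, 501, 251, 1]
```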

Now for the harder one, checking the index error. I remember encountering this in the past, I'll likely add a test to the schedulers to cover this.

@jonatanklosko
Contributor

@natolambert yeah, my confusion was mostly that in the original implementation it is set to 1, and the same in the stable diffusion pipeline, so I wondered if offset=0 is ever used. Now I looked for set_timesteps and it looks like in most cases offset is not specified.

Interestingly, in CompVis/latent-diffusion they also use an offset of 1 (ref), which is not the case in the latent diffusion pipeline (ref); not sure if that's expected or not.

Another possible inconsistency I found is this:

alpha_prod_t = self.alphas_cumprod[timestep + 1 - self._offset]
alpha_prod_t_prev = self.alphas_cumprod[timestep_prev + 1 - self._offset]

alpha_prod_t = self.alphas_cumprod[timestep]
alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.final_alpha_cumprod

It's equivalent for offset=1, but not for offset=0.
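A quick check of that equivalence claim, with the two indexing schemes written out as plain functions (hypothetical helpers for illustration): the PNDM-style index timestep + 1 - offset collapses to the DDIM-style index timestep exactly when offset is 1:

```python
def pndm_index(timestep, offset):
    # PNDMScheduler-style: alphas_cumprod[timestep + 1 - offset]
    return timestep + 1 - offset

def ddim_index(timestep):
    # DDIMScheduler-style: alphas_cumprod[timestep]
    return timestep

t = 667
print(pndm_index(t, offset=1) == ddim_index(t))  # True
print(pndm_index(t, offset=0) == ddim_index(t))  # False: off by one
```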

@natolambert
Contributor

natolambert commented Sep 10, 2022

Yeah, I just realized the 1 vs int difference. I'm not sure; if you want more on that, open another issue @jonatanklosko, it's better for the folks who added that to decide (I don't have a strong opinion).

@natolambert
Contributor

@patrickvonplaten @patil-suraj, I'm not sure about the discrepancy @jonatanklosko brings up above with the indexing and offset.

I vaguely remember some of the PNDM tests failing a while back and confusing people. Maybe that was because this change was introduced? I'll keep looking a bit.

@natolambert
Contributor

natolambert commented Sep 10, 2022

@jonatanklosko I can tell you that the PNDMScheduler and DDIMScheduler have indexed the alphas differently for a while. Before introducing offset, PNDMScheduler had

alpha_prod_t = self.alphas_cumprod[timestep + 1]
alpha_prod_t_prev = self.alphas_cumprod[timestep_prev + 1]

And DDIMScheduler had

alpha_prod_t = self.alphas_cumprod[timestep]
alpha_prod_t_prev = self.alphas_cumprod[timestep_prev]

Though I'm not sure why DDIMScheduler isn't shifted by the offset.

@jonatanklosko
Contributor

jonatanklosko commented Sep 10, 2022

@natolambert sure I will!

I'm also looking through openai/glide-text2im and I'm almost sure they don't use the 1 offset there, so maybe that's the reason for making it configurable.

@natolambert
Contributor

natolambert commented Sep 10, 2022

I opened #465 to keep this on track about the indexing error / confusion.

@jonatanklosko
Contributor

@natolambert perfect, thank you!

@natolambert
Contributor

Okay, the power-of-3 issue seems fixed in #466; I still need to check the number of iterations before it's merged.

@sausix

@natolambert
Contributor

Okay, looks good to me, will close this issue as we will merge #466 shortly.

>>> res = pipe("a photo of an astronaut riding a horse on mars", num_inference_steps=3, guidance_scale=7.5)
100%|██████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:08<00:00,  2.02s/it]
>>> res = pipe("a photo of an astronaut riding a horse on mars", num_inference_steps=5, guidance_scale=7.5)
100%|██████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:08<00:00,  1.36s/it]
>>> res = pipe("a photo of an astronaut riding a horse on mars", num_inference_steps=6, guidance_scale=7.5)
100%|██████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:09<00:00,  1.40s/it]
>>> res = pipe("a photo of an astronaut riding a horse on mars", num_inference_steps=7, guidance_scale=7.5)
100%|██████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:11<00:00,  1.44s/it]
>>> res = pipe("a photo of an astronaut riding a horse on mars", num_inference_steps=9, guidance_scale=7.5)
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:14<00:00,  1.48s/it]

@sausix
Author

sausix commented Sep 11, 2022

@natolambert Thanks for fixing this quickly.
I will test it tomorrow too. I found some other iteration-count oddities unrelated to powers of 3 yesterday.

@patil-suraj
Contributor

> @natolambert yeah, my confusion was mostly that in the original implementation it is set to 1, and the same in the stable diffusion pipeline, so I wondered if offset=0 is ever used. Now I looked for set_timesteps and it looks like in most cases offset is not specified.
>
> Interestingly, in CompVis/latent-diffusion they also use an offset of 1 (ref), which is not the case in the latent diffusion pipeline (ref); not sure if that's expected or not.
>
> Another possible inconsistency I found is this:
>
> alpha_prod_t = self.alphas_cumprod[timestep + 1 - self._offset]
> alpha_prod_t_prev = self.alphas_cumprod[timestep_prev + 1 - self._offset]
>
> alpha_prod_t = self.alphas_cumprod[timestep]
> alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.final_alpha_cumprod
>
> It's equivalent for offset=1, but not for offset=0.

The offsetting is very specific to the latent/stable diffusion models; it's not used by other models like the original DDPM models.
Also, we should indeed update the latent diffusion pipeline to add the offset there. Feel free to open a PR if you want :)

@jonatanklosko
Contributor

@patil-suraj I'm happy to send a PR, but it seems to me that offset would make more sense as a scheduler configuration field rather than an argument to set_timesteps, as mentioned in #465 (comment), wdyt?

@patrickvonplaten
Contributor

Let's close the issue only when the PR is in main :-)
