Chameleon model failed after receiving two times the same inputs #32022

Closed · 2 of 4 tasks
francescortu opened this issue Jul 17, 2024 · 2 comments · Fixed by #32037

francescortu (Contributor) commented Jul 17, 2024

System Info

  • transformers version: 4.43.0.dev0
  • Platform: Linux-5.15.0-1046-nvidia-x86_64-with-glibc2.35
  • Python version: 3.10.9
  • Huggingface_hub version: 0.23.5
  • Safetensors version: 0.4.3
  • Accelerate version: 0.31.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.3.1+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: no
  • Using GPU in script?: yes
  • GPU type: NVIDIA A100-SXM4-40GB

Who can help?

@zucchini-nlp

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

After importing the model and processor:

import torch
from transformers import ChameleonProcessor, ChameleonForCausalLM

processor = ChameleonProcessor.from_pretrained("facebook/chameleon-7b", torch_dtype=torch.float16)
model = ChameleonForCausalLM.from_pretrained("facebook/chameleon-7b", torch_dtype=torch.float16, device_map="cuda:0")

from PIL import Image
image1 = Image.open("<path-to-image>")
image2 = Image.open("<path-to-image>")
prompt = "<image>"

inputs = processor(text=[prompt, prompt], images=[image1, image2], return_tensors="pt").to("cuda:0")

Execute the forward pass twice:

output = model(**inputs)  # first call succeeds
output = model(**inputs)  # second call raises an error

The second time I run this code, it raises an error. After some investigation, I found that the problem is in the forward pass, which substitutes the "<image>" placeholder token in input_ids with the actual image tokens produced by the autoencoder. When run a second time, the forward pass attempts to perform the substitution again, since pixel_values are still present in the inputs, but the placeholder token is no longer there: the first call replaced it in place with the real image tokens.
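
To make the failure mode concrete, here is a minimal sketch of the two patterns; the token id, tensor shapes, and merge logic are illustrative stand-ins, not the actual transformers implementation:

import torch

IMAGE_TOKEN_ID = 8711  # hypothetical placeholder id, not the real vocabulary entry

def merge_in_place(input_ids, image_tokens):
    # Buggy pattern: writes the image codes into the caller's tensor,
    # so the placeholder is gone the next time this runs.
    mask = input_ids == IMAGE_TOKEN_ID
    input_ids[mask] = image_tokens
    return input_ids

def merge_cloned(input_ids, image_tokens):
    # Non-mutating pattern: substitute on a clone and leave the
    # caller's tensor intact between calls.
    input_ids = input_ids.clone()
    input_ids[input_ids == IMAGE_TOKEN_ID] = image_tokens
    return input_ids

input_ids = torch.tensor([[1, IMAGE_TOKEN_ID, IMAGE_TOKEN_ID, 42]])
image_tokens = torch.tensor([777, 888])  # stand-in for the autoencoder's codes

merge_cloned(input_ids, image_tokens)        # safe on every call
merge_cloned(input_ids, image_tokens)

merge_in_place(input_ids, image_tokens)      # first call succeeds, mutates input_ids
try:
    merge_in_place(input_ids, image_tokens)  # no placeholders left: shape mismatch
except RuntimeError as e:
    print("second in-place call failed:", e)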

Expected behavior

I expect the forward pass not to modify the input tensors.
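
In the meantime, one user-side workaround (a sketch, not an official API) is to hand the model a fresh copy of the tensors on every call, so any in-place substitution happens on the copies:

# inputs is the BatchFeature from the reproduction above; clone each
# tensor so the forward pass cannot mutate the originals.
output = model(**{k: v.clone() for k, v in inputs.items()})
output = model(**{k: v.clone() for k, v in inputs.items()})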

amyeroberts (Collaborator) commented

cc @zucchini-nlp

zucchini-nlp (Member) commented

@francescortu thanks for noticing, made a PR to fix!
