Interpolate clip #31900

Closed

wants to merge 10 commits into from
Conversation

manuelsh (Contributor)

This PR addresses the suggestions of @amyeroberts in the existing PR #30783.

manuelsh marked this pull request as ready for review July 10, 2024 20:49
manuelsh mentioned this pull request Jul 10, 2024
amyeroberts (Collaborator) left a comment

Thanks for adding this!

Just a small comment on the tests.


@slow
def test_inference_interpolate_pos_encoding(self):
    # ViT models have an `interpolate_pos_encoding` argument in their forward method,
Collaborator:

Suggested change:
- # ViT models have an `interpolate_pos_encoding` argument in their forward method,
+ # XCLIP models have an `interpolate_pos_encoding` argument in their forward method,

    # to visualize self-attention on higher resolution images.
    model = XCLIPModel.from_pretrained("microsoft/xclip-base-patch32").to(torch_device)

    image_processor = XCLIPProcessor.from_pretrained("microsoft/xclip-base-patch32", size=480)
Collaborator:

The returned object is a processor, not an image processor.

Suggested change:
- image_processor = XCLIPProcessor.from_pretrained("microsoft/xclip-base-patch32", size=480)
+ processor = XCLIPProcessor.from_pretrained("microsoft/xclip-base-patch32", size=480)

Comment on lines 749 to 748

    with torch.no_grad():
        outputs = model(**inputs, interpolate_pos_encoding=True)
Collaborator:

Can you add a check to the test that an error is raised if interpolate_pos_encoding=False?
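
A minimal sketch of what such a check could look like, assuming the model raises a ValueError when the input resolution differs from the pretrained size and interpolate_pos_encoding=False:

    # hypothetical sketch: interpolation disabled + mismatched resolution should error
    with self.assertRaises(ValueError):
        with torch.no_grad():
            model(**inputs, interpolate_pos_encoding=False)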


    # forward pass
    with torch.no_grad():
        outputs = model(**inputs, interpolate_pos_encoding=True)
Collaborator:

Same comment here about checking the failing case.

manuelsh (Contributor, Author) Aug 3, 2024

Hi @amyeroberts, I am getting the same results from the model with interpolate_pos_encoding=True or False, in both x_clip and kosmos2: no error, just the exact same tensor for outputs.vision_model_output.last_hidden_state. I've tried different sizes for the shortest_edge parameter.
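
For reference, a minimal sketch of the comparison described above, with illustrative variable names and assuming inputs comes from the processor as in the snippets below:

    # sketch: run the model with and without interpolation and compare the vision outputs
    with torch.no_grad():
        out_interp = model(**inputs, interpolate_pos_encoding=True)
        out_plain = model(**inputs, interpolate_pos_encoding=False)

    # per the observation above, this prints True: the tensors are identical
    print(
        torch.allclose(
            out_interp.vision_model_output.last_hidden_state,
            out_plain.vision_model_output.last_hidden_state,
        )
    )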

manuelsh (Contributor, Author) Aug 3, 2024

I think the reason is that logic like this:

if not interpolate_pos_encoding and (height != self.image_size[0] or width != self.image_size[1]):
    raise ValueError(
        f"Input image size ({height}*{width}) doesn't match model"
        f" ({self.image_size[0]}*{self.image_size[1]})."
    )

is not implemented in the modeling_kosmos2.py file, nor in the corresponding code in modeling_x_clip.py.

Happy to implement it.

manuelsh (Contributor, Author) Aug 4, 2024

Part of the issue, in the case of the kosmos2 model, originates because

        processor = AutoProcessor.from_pretrained(
            "microsoft/kosmos-2-patch14-224", padding_side="left", size={"shortest_edge": 480}
        )
        image = Image.open("./tests/fixtures/tests_samples/COCO/000000039769.png")
        inputs = processor(text="what's in the image", images=image, return_tensors="pt").to(torch_device)

is not changing the image size. It is still returning a 224x224 image, even when you reduce the shortest_edge to less than 224.

Need to investigate further while I learn more about the library.
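
A quick sanity check for this is to look at the shape of the returned pixel values (a sketch, assuming the processor output exposes a pixel_values tensor):

        # sketch: inspect the spatial size the processor actually produced
        print(inputs["pixel_values"].shape)
        # per the observation above, this still reports 224x224
        # regardless of the shortest_edge value passed to the processor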

manuelsh (Contributor, Author)

I've found out what was missing:

        # default image size of pretrained kosmos_2 is 224 x 224
        processor = AutoProcessor.from_pretrained(
            "microsoft/kosmos-2-patch14-224", size={"shortest_edge": 180}, crop_size={"height": 180, "width": 180}
        )

Now, I have implemented the ValueError in the modeling_kosmos2.py file.

manuelsh (Contributor, Author) commented Aug 4, 2024

@amyeroberts I've implemented changes in all CLIP-family models to handle the case where interpolate_pos_encoding=False and the resolution differs from the pretrained model's, and I've added the respective tests.

manuelsh closed this Aug 4, 2024
manuelsh reopened this Aug 4, 2024
manuelsh closed this Aug 4, 2024
manuelsh mentioned this pull request Aug 4, 2024
manuelsh (Contributor, Author) commented Aug 4, 2024

I've attempted to sync with the huggingface:main branch with

git fetch upstream
git rebase upstream/main

as there are some failing tests due to the branch being out of sync, but it is not working.

When running make repo-consistency I get:

python utils/check_copies.py
Traceback (most recent call last):
  File "/usr/src/app/transformers/utils/check_copies.py", line 1106, in <module>
    check_copies(args.fix_and_overwrite, args.file)
  File "/usr/src/app/transformers/utils/check_copies.py", line 852, in check_copies
    new_diffs = is_copy_consistent(filename, overwrite, buffer)
  File "/usr/src/app/transformers/utils/check_copies.py", line 675, in is_copy_consistent
    target_lines, theoretical_code, theoretical_code_splits = find_code_and_splits(
  File "/usr/src/app/transformers/utils/check_copies.py", line 521, in find_code_and_splits
    code, (lines, target_start_index, target_end_index) = find_code_in_transformers(
  File "/usr/src/app/transformers/utils/check_copies.py", line 456, in find_code_in_transformers
    raise ValueError(f" {object_name} does not match any function or class in {module}.")
ValueError:  models.clip.test_modeling_clip.CLIPModelTest.test_model_get_set_embeddings does not match any function or class in models/clip/test_modeling_clip.

and I see that this test and the get_set_embeddings method are still missing even after fetching.

@amyeroberts, help would be appreciated.

manuelsh reopened this Aug 4, 2024
manuelsh (Contributor, Author)
Closing this one in favour of #32600

manuelsh closed this Aug 13, 2024