[LLaVa] Some improvements #27895

Merged 3 commits on Dec 11, 2023

NielsRogge
Contributor

What does this PR do?

Some minor improvements when going over the LLaVa code.

@@ -270,22 +270,22 @@ def resize_token_embeddings(self, new_num_tokens: Optional[int] = None, pad_to_m
     def _merge_input_ids_with_image_features(
         self, image_features, inputs_embeds, input_ids, attention_mask, position_ids
     ):
-        nb_images, image_hidden_dim, embed_dim = image_features.shape
+        num_images, num_image_patches, embed_dim = image_features.shape
Contributor Author

This is not really a "hidden dim" but rather the number of image patches. Since "num_image_tokens" is already used below, it could be confused with "num_image_patches". The former actually refers to the number of special image tokens, so I've renamed it.
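
To make the distinction between the two counts concrete, here is a minimal pure-Python sketch. It assumes that each special image token in `input_ids` is replaced by one embedding per image patch during merging; the helper name `merged_sequence_length`, the placeholder id `32000`, and the patch count `576` are illustrative assumptions, not values taken from this PR.

```python
# Hypothetical placeholder id for the special image token (assumption for illustration).
IMAGE_TOKEN_ID = 32000


def merged_sequence_length(input_ids, num_image_patches, image_token_id=IMAGE_TOKEN_ID):
    """Length of one sequence after merging image features into the text embeddings.

    Assumes every special image token expands into `num_image_patches` embeddings,
    so each one adds (num_image_patches - 1) extra positions.
    """
    # Number of special image tokens in the prompt (not the number of patches).
    num_special_image_tokens = sum(1 for token in input_ids if token == image_token_id)
    return len(input_ids) + num_special_image_tokens * (num_image_patches - 1)


# One image placeholder in a five-token prompt, with an assumed 576 patches per image.
input_ids = [1, 3148, IMAGE_TOKEN_ID, 1724, 338]
length = merged_sequence_length(input_ids, num_image_patches=576)
```

In this sketch the number of special image tokens is a property of the prompt, while the number of image patches is a property of the vision features (the second dimension of `image_features.shape` in the diff above), which is why keeping the two names distinct matters.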

This model was contributed by [ArthurZ](https://huggingface.co/ArthurZ) and [ybelkada](https://huggingface.co/ybelkada).
The original code can be found [here](https://github.com/haotian-liu/LLaVA/tree/main/llava).

## Usage tips
Contributor Author

cc @younesbelkada just an FYI, for new docs we always have a ## Usage tips section and a ## Resources section

I'm not sure the CookieCutter creates those sections by default

Collaborator

ArthurZucker left a comment

Nice cleanup

Contributor

younesbelkada left a comment

Thanks for the cleanup. Can you confirm the slow tests pass with these changes?

@NielsRogge
Contributor Author

They are failing on my setup:

FAILED tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_small_model_integration_test_batch - AssertionError: Lists differ: ['USER:  \nWhat are the things I should be [267 chars]ock'] != ['\nUSER: What are the things I should be c[306 chars] R.']
FAILED tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_small_model_integration_test_llama - AssertionError: 'USER[116 chars]hich appears to be a dock or pier extending ov[562 chars]ier.' != 'USER[116 chars]hich is a pier or dock extending over a body o[572 chars]ies.'
FAILED tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_small_model_integration_test_llama_batched - AssertionError: Lists differ: ['USE[114 chars]ANT: When visiting this serene location, one s[177 chars]ed.'] != ['USE[114 chars]ANT: the water is calm and clear\n\nThe image [154 chars]ed.']

This could also be the case on main; I will check.

@NielsRogge
Contributor Author

Yeah, the tests still fail for me on both main and my branch, even after #27909. So @ArthurZucker, could you perhaps run the slow tests from my branch before merging? I'm pretty sure this PR doesn't affect the integration tests, though.

Contributor

younesbelkada left a comment

Thanks! I can confirm this PR does not affect the slow tests.

younesbelkada merged commit 7ea21f1 into huggingface:main on Dec 11, 2023
21 checks passed
iantbutler01 pushed a commit to BismuthCloud/transformers that referenced this pull request Dec 16, 2023
* More improvements

* Improve variable names

* Update READMEs, improve docs
staghado pushed a commit to staghado/transformers that referenced this pull request Jan 15, 2024
* More improvements

* Improve variable names

* Update READMEs, improve docs