Fix FA2 tests #29909
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
AH. That's a great catch. Thanks for it!
-    model = model_class.from_pretrained(
-        tmpdirname, torch_dtype=torch.bfloat16, attn_implementation="flash_attention_2"
-    )
+    model = model_class.from_pretrained(tmpdirname, torch_dtype=torch.bfloat16)
let's update the name to test_flash_attn_2_inference_equivalence or something like that!
Will do!
On a side note, how can I make sure that every model using FA2 still passes? The tests are slow, so I'm not actually sure the CI is totally green.
You'll need to run the tests manually. You can select just the flash attention tests by doing something like:
RUN_SLOW=1 pytest tests/models -k "flash_attn"
on a GPU setup
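For context, these tests only execute when RUN_SLOW=1 is set because they are marked as slow. Below is a rough sketch of how such a test is gated; the decorator names come from transformers.testing_utils, but the exact set of decorators on the real FA2 tests may differ.

# Sketch (assumed, not the actual test code) of how a slow FA2 test is gated.
import unittest

from transformers.testing_utils import require_flash_attn, require_torch_gpu, slow


class FlashAttn2EquivalenceTest(unittest.TestCase):
    @require_flash_attn
    @require_torch_gpu
    @slow  # skipped unless RUN_SLOW=1 is set, which is why regular CI stays green
    def test_flash_attn_2_inference_equivalence(self):
        # The real test loads the model with and without FA2 and compares outputs.
        ...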
Good spot - thanks for fixing!
I've run the tests. I'll open an issue to keep track of the different failures. Should I still merge the PR in the meantime?
@ylacombe Thanks for running and sharing the results! Merging depends on whether the same tests are failing on main: if they are, then merging is fine; if not, the tests will need to be fixed :)
Testing this right now then!
Well, the same tests fail except for qwen2 and stablelm, which are introduced by this PR, but this makes sense since the FA2 tests weren't actually testing FA2 before.
Feel free to merge!
😨😨😨😨😨
Thanks a lot ❤️ for the fix and great catch! One nit: it would be really nice 🙏 if you could mention, in the PR description, a bit about why the previous testing was done improperly. Something as simple as a one-line explanation would do.
This way, it's super clear what the PR is doing even before diving into the changes.
afaik many FA2 tests were already failing (they are not in the CI) due to diffs in logits
@fxmarty I think we, or you (?), ran those tests before merging. Do you know why we have many failing FA2 tests? Or are those failing tests only for the many newly added models?
Oh, they are not run on T4 GPUs.
@ydshieh When I used to run these tests locally (some months ago), it was because the diff tolerance between eager/FA2 was too low. Some models (such as Whisper) somehow require a large diff tolerance.
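To illustrate the tolerance point above, here is a minimal, hypothetical sketch of the kind of eager-vs-FA2 logits comparison these tests perform; the checkpoint name and the atol/rtol values are placeholders, not the ones used in the actual tests.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some-org/some-causal-lm"  # placeholder checkpoint, not from this PR

tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("Hello world", return_tensors="pt").to("cuda")

# Reference model: default attention implementation.
ref_model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
).to("cuda")

# Model under test: Flash Attention 2.
fa2_model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, attn_implementation="flash_attention_2"
).to("cuda")

with torch.no_grad():
    ref_logits = ref_model(**inputs).logits
    fa2_logits = fa2_model(**inputs).logits

# If atol/rtol are too tight, numerically close implementations can still fail;
# some models (e.g. Whisper) need a looser bound.
assert torch.allclose(ref_logits, fa2_logits, atol=4e-2, rtol=4e-2)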
* fix FA2 tests
* refactor inference test name
What does this PR do?
#26572 introduced an artifact that prevented properly testing inference with Flash Attention 2: the model that was supposed to be loaded without Flash Attention 2 (as a reference to compare against) was in fact using Flash Attention 2!
cc @fxmarty @ArthurZucker @amyeroberts
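As a rough sketch of the before/after pattern (reusing the model_class and tmpdirname names from the diff above; this is illustrative, not the exact test code):

import torch


def load_reference_and_fa2(model_class, tmpdirname):
    """Load the reference model and the FA2 model the way the fixed test intends."""
    # Before the fix, attn_implementation="flash_attention_2" was also passed here,
    # so the "reference" model silently used Flash Attention 2 too.
    reference = model_class.from_pretrained(tmpdirname, torch_dtype=torch.bfloat16)

    # Model under test: explicitly enables Flash Attention 2.
    fa2_model = model_class.from_pretrained(
        tmpdirname, torch_dtype=torch.bfloat16, attn_implementation="flash_attention_2"
    )
    return reference, fa2_model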