Ignore non-causal mask in more cases with SDPA #30138

fxmarty · 2024-04-09T08:26:45Z

Fixes #30095. In the non-causal case, the mask can be ignored even when key_value_length != tgt_len: no row is ever fully masked, so we never hit the issue with SDPA's memory-efficient attention backend.

Depends on #28802
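
To illustrate the non-causal argument above, here is a minimal framework-free sketch using nested lists instead of tensors. The names `prepare_non_causal_mask`, `expand_padding_mask` and `has_fully_masked_row` are hypothetical and are not the library's API. It assumes the mask is dropped only when it contains no padding at all, and that every sequence has at least one non-padding token.

```python
from typing import List, Optional

# [batch][key_value_length], 1 = attend, 0 = padding
Mask2D = List[List[int]]
Mask3D = List[List[List[bool]]]


def expand_padding_mask(mask: Mask2D, tgt_len: int) -> Mask3D:
    """Broadcast a 2D padding mask to [batch][tgt_len][key_value_length]."""
    return [[[bool(v) for v in row] for _ in range(tgt_len)] for row in mask]


def prepare_non_causal_mask(mask: Mask2D, tgt_len: Optional[int] = None) -> Optional[Mask3D]:
    """Hypothetical helper: return None when the mask carries no information."""
    key_value_length = len(mask[0])
    tgt_len = key_value_length if tgt_len is None else tgt_len
    # Assumption: with no padding anywhere, attention is unrestricted, so the
    # mask can be dropped whether or not tgt_len equals key_value_length.
    if all(v == 1 for row in mask for v in row):
        return None
    return expand_padding_mask(mask, tgt_len)


def has_fully_masked_row(expanded: Mask3D) -> bool:
    """True if some query row cannot attend to any key."""
    return any(not any(row) for batch in expanded for row in batch)


# Every query row shares the same key mask, so a row is fully masked only
# if the whole sequence is padding.
expanded = prepare_non_causal_mask([[0, 1, 1, 1]], tgt_len=2)
```

Because the padding mask is simply repeated for each query position, the query length plays no role in whether a row ends up fully masked.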

HuggingFaceDocBuilderDev · 2024-04-09T08:45:29Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

minostauros · 2024-04-09T14:28:14Z

src/transformers/modeling_attn_mask_utils.py

@@ -406,31 +406,18 @@ def _prepare_4d_attention_mask_for_sdpa(mask: torch.Tensor, dtype: torch.dtype,
        tgt_len (`int`):
            The target length or query length the created mask shall have.
    """
-    batch_size, key_value_length = mask.shape
+    _, key_value_length = mask.shape


Can we change the input argument mask: torch.Tensor to mask: Optional[torch.Tensor] and return None immediately if mask is None? The docstring does not match the actual signature: it documents mask as "torch.Tensor" or "None".
Would that break the is_tracing check?

@minostauros Yes, ideally we would want to do that. In practice, the calls to these functions in modeling files are always guarded by:

if attention_mask is not None:

but we should IMO indeed accept Optional[torch.Tensor]. I'll leave that to another PR.
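
A minimal sketch of the suggestion discussed here, assuming a stand-in function `prepare_mask` (hypothetical, not the library's signature) that works on nested lists rather than tensors. Moving the None check into the callee would let callers drop their own guard.

```python
from typing import List, Optional

Mask2D = List[List[int]]


def prepare_mask(mask: Optional[Mask2D], tgt_len: Optional[int] = None) -> Optional[List[Mask2D]]:
    """Hypothetical stand-in that accepts an optional mask."""
    # Early return: callers no longer need `if attention_mask is not None:`.
    if mask is None:
        return None
    tgt_len = len(mask[0]) if tgt_len is None else tgt_len
    # Repeat each batch row once per query position.
    return [[list(row) for _ in range(tgt_len)] for row in mask]


attention_mask = None
prepared = prepare_mask(attention_mask)  # safe without a caller-side guard
```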

ArthurZucker

Good cleanup IMO. Let's make sure the workflow is triggered for this

github-actions · 2024-05-10T08:03:18Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

ArthurZucker · 2024-05-10T11:54:38Z

This is probably fixed on main

fxmarty · 2024-06-03T09:14:49Z

src/transformers/models/bert/modeling_bert.py

+        is_causal = (
+            True if self.is_decoder and not is_cross_attention and attention_mask is None and tgt_len > 1 else False
+        )


FYI @hackyon
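
The condition in the hunk above can be read as a plain conjunction. The sketch below restates it as a standalone function. The name `use_is_causal` and the boolean `has_attention_mask` (standing in for `attention_mask is not None`) are hypothetical.

```python
def use_is_causal(
    is_decoder: bool,
    is_cross_attention: bool,
    has_attention_mask: bool,
    tgt_len: int,
) -> bool:
    """Restates the diff's condition; not the library's own helper."""
    return (
        is_decoder                  # only decoders attend causally
        and not is_cross_attention  # cross-attention sees the whole encoder output
        and not has_attention_mask  # an explicit mask takes precedence
        and tgt_len > 1             # a single query token is excluded
    )


flag = use_is_causal(is_decoder=True, is_cross_attention=False, has_attention_mask=False, tgt_len=4)
```

The `tgt_len > 1` clause presumably covers single-token decoding with cached keys, where a causal flag would not describe the intended attention pattern. That reading is an assumption and is not stated in the thread.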

fxmarty added 2 commits April 9, 2024 10:19

update non-causal mask for sdpa

fdf6c9b

add test

8389928

fxmarty requested a review from ArthurZucker April 9, 2024 08:28

minostauros reviewed Apr 9, 2024

update docstrings

f007b9b

fxmarty requested review from amyeroberts, LysandreJik and ArthurZucker and removed request for ArthurZucker April 15, 2024 08:10

ArthurZucker approved these changes Apr 15, 2024

minostauros mentioned this pull request May 7, 2024

_prepare_4d_attention_mask_for_sdpa is not for causal attention but claims... #30095

Closed

fxmarty added 3 commits June 3, 2024 10:13

Merge branch 'main' into sdpa-non-causal-mask-fix

ec1c47e

add one more test

7e69881

fix cross attention bug

d748667

fxmarty commented Jun 3, 2024

fxmarty mentioned this pull request Jun 3, 2024

Fix vision encoder decoder attention implementation pick #31203

Closed

gentler atol/rtol

1dd1879

fxmarty merged commit 221aaec into huggingface:main Jun 3, 2024
22 checks passed

fxmarty mentioned this pull request Jun 21, 2024

[RoBERTa-based] Add support for sdpa #30510

Merged
