
FlashAttention2 implementation for OpenELM model #34485

Closed
2 tasks done
GorkaUrbizu opened this issue Oct 29, 2024 · 1 comment

Comments

@GorkaUrbizu

Model description

OpenELM is already available on the Hugging Face Hub 🤗 and in transformers, but the model lacks support for FlashAttention/FlashAttention2.

I'd love to have FlashAttention available for OpenELM in the transformers environment.

Thanks in advance.

Open source status

  • The model implementation is available
  • The model weights are available

Provide useful links for the implementation

No response

@Rocketknight1
Member

Rocketknight1 commented Oct 29, 2024

Hi @GorkaUrbizu, OpenELM uses custom code, so we can't actually add FlashAttention2 support for it in Transformers. However, I believe it already supports SDPA, which can dispatch to FlashAttention on newer versions of PyTorch.
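
For reference, here is a minimal sketch of what that dispatch can look like. It assumes a CUDA GPU, a recent PyTorch (>= 2.3 for torch.nn.attention.sdpa_kernel; older releases expose torch.backends.cuda.sdp_kernel instead), that `apple/OpenELM-270M` is a valid checkpoint id, and that the model's custom attention code calls F.scaled_dot_product_attention internally; it is not an official recipe from this thread.

```python
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel
from transformers import AutoModelForCausalLM

# OpenELM ships its own modeling code, so trust_remote_code is required.
# The checkpoint id below is an assumed example.
model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-270M",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,  # flash kernels need fp16/bf16 inputs
).to("cuda")

input_ids = torch.tensor([[1, 2, 3, 4]], device="cuda")

# If the custom attention module uses F.scaled_dot_product_attention,
# this context restricts SDPA to the FlashAttention backend; if it does
# not, the context manager simply has no effect.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION), torch.no_grad():
    out = model(input_ids)

print(out.logits.shape)
```

Whether the flash backend actually kicks in still depends on the usual SDPA constraints (dtype, head dimension, attention-mask shape), so it is worth profiling rather than assuming the kernel was used.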
