From f963e381c7d32a01d1f5b7d889267c8e1f7991f7 Mon Sep 17 00:00:00 2001
From: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Date: Tue, 5 Mar 2024 10:06:22 +0100
Subject: [PATCH] Update docs/source/en/model_doc/mamba.md

Co-authored-by: Lysandre Debut
---
 docs/source/en/model_doc/mamba.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/en/model_doc/mamba.md b/docs/source/en/model_doc/mamba.md
index bd7cb278ce2820..3ce869d3204e51 100644
--- a/docs/source/en/model_doc/mamba.md
+++ b/docs/source/en/model_doc/mamba.md
@@ -30,7 +30,7 @@ Tips:
 - Mamba is a new `state space model` architecture that rivals the classic Transformers. It is based on the line of progress on structured state space models, with an efficient hardware-aware design and implementation in the spirit of [FlashAttention](https://github.com/Dao-AILab/flash-attention).
 - Mamba stacks `mixer` layers, which are the equivalent of `Attention` layers. The core logic of `mamba` is held in the `MambaMixer` class.
-- Two implementation cohabit: one is optimized and uses fast cuda kernels, while the other one is naive but can run on any device!
+- Two implementations cohabit: one is optimized and uses fast cuda kernels, while the other one is naive but can run on any device!
 - The current implementation leverages the original cuda kernels: the equivalent of flash attention for Mamba are hosted in the [`mamba-ssm`](https://github.com/state-spaces/mamba) and the [`causal_conv1d`](https://github.com/Dao-AILab/causal-conv1d) repositories. Make sure to install them if your hardware supports them!
 - Contributions to make the naive path faster are welcome 🤗
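
The tips touched by this patch mention installing `mamba-ssm` and `causal_conv1d` for the fast CUDA path. As a minimal usage sketch (not part of the patch itself, and assuming the `state-spaces/mamba-130m-hf` checkpoint), the same `transformers` code runs with or without those packages; when they are missing, the model falls back to the slower, device-agnostic path:

```python
# Minimal sketch: text generation with the Mamba integration in transformers.
# If `mamba-ssm` and `causal_conv1d` are installed and the hardware supports them,
# the fast CUDA kernels are used; otherwise the naive path runs on any device.
from transformers import AutoTokenizer, MambaForCausalLM

tokenizer = AutoTokenizer.from_pretrained("state-spaces/mamba-130m-hf")
model = MambaForCausalLM.from_pretrained("state-spaces/mamba-130m-hf")

inputs = tokenizer("Hey how are you doing?", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.batch_decode(output_ids)[0])
```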