Impl simple mamba model (huggingface#1480)
This draft PR is a work-in-progress implementation of the Mamba model.
It currently loads weights and produces correct logits after a
single pass.

The model still needs to be integrated so that it produces
tokens as expected, and optimized to avoid copies and other
unnecessary operations at runtime.

[Mamba: Linear-Time Sequence Modeling with Selective State Spaces
(Albert Gu and Tri Dao)](https://arxiv.org/abs/2312.00752)
https://github.com/johnma2006/mamba-minimal

https://github.com/huggingface/candle/blob/main/candle-examples/examples/mamba-minimal/model.rs
huggingface/transformers#28094

Notes: this dev work currently targets `state-spaces/mamba-130m`,
so please use that model if you want to test. Additionally, the prefill
must be limited when starting the router: `cargo run --
--max-batch-prefill-tokens 768 --max-input-length 768`

Integration tests have been added and basic functionality such as model
loading is supported.

```bash
cd integration-tests
pytest -vv models/test_fused_kernel_mamba.py
```
- [x] add tests
- [x] load model
- [x] make simple request
- [ ] resolve warmup issue
- [ ] resolve output issues

Fetch the models tested during development:
```bash
text-generation-server download-weights state-spaces/mamba-130m
text-generation-server download-weights state-spaces/mamba-1.4b
text-generation-server download-weights state-spaces/mamba-2.8b
```

Run the server:
```bash
cd server
MASTER_ADDR=127.0.0.1 MASTER_PORT=5555 python text_generation_server/cli.py serve state-spaces/mamba-2.8b
```

Run the router:
```bash
cargo run
```

Make a request:
```bash
curl -s localhost:3000/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
    -H 'Content-Type: application/json' | jq
```

Response:
```json
{
  "generated_text": "\n\nDeep learning is a machine learning technique that uses a deep neural network to learn from data."
}
```
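
The same request can also be issued from Python. The following is a minimal sketch using only the standard library, assuming the router listens on `localhost:3000` as in the curl example; the helper names `build_generate_request` and `generate` are hypothetical and not part of this PR.

```python
import json
import urllib.request


def build_generate_request(prompt, max_new_tokens=20, base_url="http://localhost:3000"):
    """Build (but do not send) a POST request mirroring the curl call above.

    `base_url` is an assumption taken from the curl example.
    """
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request to a running router and return the decoded JSON body."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server and router running, `generate("What is Deep Learning?")["generated_text"]` should return the generated string shown in the response above.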

---------

Co-authored-by: Nicolas Patry <[email protected]>
2 people authored and kdamaszk committed Apr 23, 2024
1 parent 99cb270 commit 51a4e62
Showing 9 changed files with 1,509 additions and 0 deletions.
73 changes: 73 additions & 0 deletions integration-tests/models/__snapshots__/test_mamba/test_mamba.json
@@ -0,0 +1,73 @@
{
"details": {
"best_of_sequences": null,
"finish_reason": "length",
"generated_tokens": 10,
"prefill": [],
"seed": null,
"tokens": [
{
"id": 187,
"logprob": -0.3552246,
"special": false,
"text": "\n"
},
{
"id": 187,
"logprob": -0.38378906,
"special": false,
"text": "\n"
},
{
"id": 30763,
"logprob": -1.140625,
"special": false,
"text": "Deep"
},
{
"id": 4715,
"logprob": -0.5551758,
"special": false,
"text": " learning"
},
{
"id": 310,
"logprob": -0.59033203,
"special": false,
"text": " is"
},
{
"id": 247,
"logprob": -0.70654297,
"special": false,
"text": " a"
},
{
"id": 747,
"logprob": -2.0410156,
"special": false,
"text": " new"
},
{
"id": 1511,
"logprob": -2.3789062,
"special": false,
"text": " type"
},
{
"id": 273,
"logprob": -0.0026435852,
"special": false,
"text": " of"
},
{
"id": 5145,
"logprob": -1.2841797,
"special": false,
"text": " machine"
}
],
"top_tokens": null
},
"generated_text": "\n\nDeep learning is a new type of machine"
}
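
One way to sanity-check a snapshot like this is to confirm that concatenating the `text` field of every generated token reproduces `generated_text`. The sketch below does that for the token texts listed above; the helper name `joined_token_text` is hypothetical, and the `id`, `logprob`, and `special` fields are omitted for brevity.

```python
def joined_token_text(response):
    """Concatenate the text of every generated token in a /generate response."""
    return "".join(token["text"] for token in response["details"]["tokens"])


# Token texts copied from the snapshot above, in order.
token_texts = ["\n", "\n", "Deep", " learning", " is", " a",
               " new", " type", " of", " machine"]

snapshot = {
    "details": {"tokens": [{"text": text} for text in token_texts]},
    "generated_text": "\n\nDeep learning is a new type of machine",
}

# The decoded tokens and the final string should agree.
matches = joined_token_text(snapshot) == snapshot["generated_text"]
```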
@@ -0,0 +1,99 @@
{
"details": {
"best_of_sequences": null,
"finish_reason": "length",
"generated_tokens": 10,
"prefill": [
{
"id": 2502,
"logprob": null,
"text": " red"
},
{
"id": 13,
"logprob": -2.5234375,
"text": ","
},
{
"id": 8862,
"logprob": -3.4433594,
"text": " yellow"
},
{
"id": 13,
"logprob": -0.43017578,
"text": ","
},
{
"id": 209,
"logprob": -8.21875,
"text": " "
}
],
"seed": 0,
"tokens": [
{
"id": 187,
"logprob": 0.0,
"special": false,
"text": "\n"
},
{
"id": 395,
"logprob": -0.46411133,
"special": false,
"text": "and"
},
{
"id": 13735,
"logprob": -2.1132812,
"special": false,
"text": " orange"
},
{
"id": 313,
"logprob": -1.2128906,
"special": false,
"text": " ("
},
{
"id": 249,
"logprob": -2.3671875,
"special": false,
"text": "in"
},
{
"id": 253,
"logprob": 0.0,
"special": false,
"text": " the"
},
{
"id": 1340,
"logprob": -1.640625,
"special": false,
"text": " order"
},
{
"id": 597,
"logprob": -0.5488281,
"special": false,
"text": " they"
},
{
"id": 3176,
"logprob": -0.48608398,
"special": false,
"text": " appear"
},
{
"id": 275,
"logprob": 0.0,
"special": false,
"text": " in"
}
],
"top_tokens": null
},
"generated_text": "blue, red, yellow, \nand orange (in the order they appear in"
}
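
In this second snapshot, `generated_text` is longer than the concatenation of the generated tokens: it begins with `blue, red, yellow, `, which looks like the prompt being returned together with the completion (that reading is an assumption, since the request parameters are not shown here). The sketch below separates the two parts and sums the token log-probabilities; the helper names `split_prompt` and `sequence_logprob` are hypothetical.

```python
import math


def split_prompt(generated_text, token_texts):
    """Split generated_text into (leading text, completion built from tokens)."""
    completion = "".join(token_texts)
    if not generated_text.endswith(completion):
        raise ValueError("generated_text does not end with the decoded tokens")
    return generated_text[: len(generated_text) - len(completion)], completion


def sequence_logprob(logprobs):
    """Sum per-token log-probabilities into one sequence log-probability."""
    return math.fsum(logprobs)


# Values copied from the snapshot above, in order.
token_texts = ["\n", "and", " orange", " (", "in", " the",
               " order", " they", " appear", " in"]
logprobs = [0.0, -0.46411133, -2.1132812, -1.2128906, -2.3671875,
            0.0, -1.640625, -0.5488281, -0.48608398, 0.0]
generated_text = "blue, red, yellow, \nand orange (in the order they appear in"

prefix, completion = split_prompt(generated_text, token_texts)
total = sequence_logprob(logprobs)
```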