Adds prompt lookup decoding (ngram speculation) #375

tgaddair · 2024-04-03T17:30:47Z

Adds prompt lookup decoding (ngram speculation) via initialization param --speculative-tokens.

Example:

lorax-launcher ... --speculative-tokens 3

Closes #259.
Related #205.
Related #329.
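
For readers new to the technique named in the description, here is a minimal Python sketch of prompt lookup decoding (ngram speculation). It is not the LoRAX implementation. The function names `propose_ngram_tokens` and `count_accepted`, the `ngram_size` parameter, and its default of 2 are assumptions made for illustration. Only the idea of a speculative token budget comes from this PR. The sketch matches the trailing n-gram against earlier tokens, proposes the tokens that followed the match, then keeps the longest prefix the model agrees with.

```python
from typing import List


def propose_ngram_tokens(
    token_ids: List[int],
    speculative_tokens: int,
    ngram_size: int = 2,  # assumed value, not taken from the PR
) -> List[int]:
    """Propose up to `speculative_tokens` candidate ids by finding an
    earlier occurrence of the trailing n-gram and copying what followed it."""
    if speculative_tokens <= 0 or len(token_ids) <= ngram_size:
        return []
    tail = token_ids[-ngram_size:]
    # Scan right to left so the most recent earlier match wins.
    # The starting index excludes the trailing n-gram itself.
    for start in range(len(token_ids) - ngram_size - 1, -1, -1):
        if token_ids[start:start + ngram_size] == tail:
            end = start + ngram_size
            return token_ids[end:end + speculative_tokens]
    return []


def count_accepted(candidates: List[int], verified: List[int]) -> int:
    """Count how many leading candidates match the tokens the model
    actually predicted at the same positions (greedy verification)."""
    accepted = 0
    for candidate, actual in zip(candidates, verified):
        if candidate != actual:
            break
        accepted += 1
    return accepted


if __name__ == "__main__":
    history = [5, 6, 7, 8, 9, 5, 6]
    candidates = propose_ngram_tokens(history, speculative_tokens=3)
    print(candidates)
    # Hypothetical model output used only to show the acceptance step.
    print(count_accepted(candidates, [7, 8, 4]))
```

In this sketch a larger speculative token budget lets more tokens be accepted per forward pass when the output repeats spans of the prompt, at the cost of wasted verification work when the guesses are rejected.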

lighteternal · 2024-04-08T07:01:58Z

Hi @tgaddair! Many thanks for the new merge 🙏
The new --speculative-tokens 3 argument does not seem to be available when running the docker run command with the latest image. Is this correct?

tgaddair added 30 commits March 15, 2024 09:40

Refactored mistral

af27cf4

WIP: medusa

d9dbffd

WIP: multi_zip

bd80773

Cleanup

3f31226

Fix server

5c07ad8

WIP download

81040f6

Fixed plumbing

95b83df

Merge branch 'main' into medusa

bdf0f69

Merge branch 'main' into medusa

4506c4c

Adapters

f42094b

Merge branch 'main' into medusa

a0c3ca9

Wired up config

c0d1509

Merge branch 'main' into medusa

805ae62

Cleanup

d0bf5d1

Use new config

5de6c8a

WIP abstract configs

d8e8f70

Merge branch 'main' into medusa

f311e48

Merge branch 'main' into medusa

84e577a

Module map

9f47ce4

Merge branch 'main' into medusa

4f74a14

WIP: refactor

a4178a7

TEST: disable medusa

e4d11d4

Rename

8ebd82e

Fixed imports

bfcfa27

Clean up

59401ec

Fixed adapter indices

fcf0db6

Merge

14ee9af

Config

4b710b1

Merge branch 'main' into medusa

f9400e5

Removed debug

c36f406

tgaddair added 21 commits March 29, 2024 10:59

Removed duplicate tokens

09348e2

Plumb through batching medusa

e3a53fd

Wire up medusa

41d7f80

WIP: speculative tokens

ac4b373

Fixed imports

9a14628

Refactor

e23e003

Fixed

b258588

Support additional models

39a6fc1

Plumb through speculative tokens

45d8c6c

Merge

bb42a52

Added default

afc06ee

Added default medusa

2423c87

Guard medusa dynamic adapter loading

a19f62a

Fix adapter speculation

fd95c3e

cargo fmt

4521d52

Fix format

8d5561c

Fixed Generation

8dc9325

Fix tests

53d8ea2

Ngram speculation

de80ada

Wire up speculative tokens in the launcher

775ee7a

Merge

f03bc57

tgaddair mentioned this pull request Apr 3, 2024

Add support for ngram speculation with --speculative-tokens param #259

Closed

4 tasks

tgaddair merged commit 3ca53b9 into main Apr 3, 2024
2 checks passed

tgaddair deleted the ngram branch April 3, 2024 18:06
