Draft: feat: Support Llama 3 model (#478) #479

Merged · 10 commits into carlrobertoh:master · Apr 20, 2024

Conversation

@reneleonhardt (Contributor) commented on Apr 19, 2024:

Wow, fresh off the press, and it already seems to be one of the best models out there!! 🚀

Waiting for llama.cpp support... 😅
ggml-org/llama.cpp#6747
ggml-org/llama.cpp#6751

@reneleonhardt changed the title from "feat: Support Llama 3 model (#478)" to "Draft: feat: Support Llama 3 model (#478)" on Apr 19, 2024
@carlrobertoh (Owner) commented:

Is there any more work to be expected here, or can we perhaps remove the "Draft" prefix?

@reneleonhardt (Contributor, Author) commented:

As far as I can see it's finished (can you double-check the InfillPrompt?); I'm just waiting for the llama.cpp server support 😅
Or can you check it locally? In LM Studio I didn't see any big problems; the model is amazing, I just had to ask it to tell the truth when it hallucinated non-existent Kotlin operators at me 😉
7B Q8 is super fast on an M1 Max 🚀

@carlrobertoh (Owner) commented:

Nice! It looks like the model doesn't support infilling, or at least I couldn't find anything. Maybe we can just remove the new infill template for now.

@reneleonhardt (Contributor, Author) commented on Apr 20, 2024:

> Nice! It looks like the model doesn't support infilling, or at least I couldn't find anything. Maybe we can just remove the new infill template for now.

Really? All the main calls I saw contained --in-prefix and --in-suffix (ggml-org/llama.cpp#6747 (comment)); that's why I added it 😅
But if you think it's a problem, I can remove it for now 🙂
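For context, llama.cpp's --in-prefix/--in-suffix flags wrap each interactive input in the main example; on their own they don't imply fill-in-the-middle infill support. A minimal Kotlin sketch of that wrapping, using the Llama 3 header tokens discussed in the linked thread (the wrapChatTurn helper is hypothetical):

```kotlin
// Sketch only: mimics how llama.cpp's main example wraps interactive input
// when started with --in-prefix / --in-suffix set to Llama 3 header tokens.
fun wrapChatTurn(userInput: String): String {
    val inPrefix = "<|start_header_id|>user<|end_header_id|>\n\n"
    val inSuffix = "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    return inPrefix + userInput + inSuffix
}
```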

@carlrobertoh (Owner) commented:

Hmm, their model card doesn't mention anything about infilling - https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3
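The model card does document the Instruct chat format, which wraps every turn in header tokens and ends with an open assistant header for generation. A rough Kotlin sketch of assembling such a prompt (buildLlama3Prompt is a hypothetical helper, not CodeGPT's actual template code):

```kotlin
// Sketch of the Llama 3 Instruct chat format from the model card above.
// A blank system prompt is skipped, matching the behaviour tested in this PR.
fun buildLlama3Prompt(systemPrompt: String?, userMessage: String): String = buildString {
    append("<|begin_of_text|>")
    if (!systemPrompt.isNullOrBlank()) {
        append("<|start_header_id|>system<|end_header_id|>\n\n")
        append(systemPrompt).append("<|eot_id|>")
    }
    append("<|start_header_id|>user<|end_header_id|>\n\n")
    append(userMessage).append("<|eot_id|>")
    // Leave the assistant header open so the model generates the reply.
    append("<|start_header_id|>assistant<|end_header_id|>\n\n")
}
```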

@reneleonhardt (Contributor, Author) commented on Apr 20, 2024:

All I can see is that there is a lot of confusion about the model: some people have many problems, some have none... I guess it depends on how you use it, and some of the HF models also seem to have problems.
phymbert mentioned a special infill endpoint (ggml-org/llama.cpp#6747 (comment)) - is CodeGPT using that?
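llama.cpp's server does expose an /infill endpoint that accepts input_prefix and input_suffix fields (it only works for models whose vocabulary defines infill/FIM tokens). A Kotlin sketch of calling it; the host, port, and n_predict value are assumptions, not CodeGPT configuration:

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Sketch of a call to llama.cpp server's /infill endpoint.
// localhost:8080 and n_predict=64 are assumptions for illustration.
fun requestInfill(prefix: String, suffix: String): String {
    val body = """{"input_prefix": ${json(prefix)}, "input_suffix": ${json(suffix)}, "n_predict": 64}"""
    val request = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:8080/infill"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()
    return HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
        .body() // JSON response; the completion is in its "content" field
}

// Minimal JSON string escaping, enough for this sketch.
private fun json(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n") + "\""
```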

@carlrobertoh (Owner) commented:

No, but we did use it for a while, I think. However, I can't remember why we switched back.

@carlrobertoh merged commit 6e6a499 into carlrobertoh:master on Apr 20, 2024
2 checks passed
carlrobertoh pushed a commit that referenced this pull request on Apr 21, 2024:
* feat: Support Llama 3 model (#478)

* Use new InfillPrompt

* Switch to lmstudio-community

* Use new Prompt

* llama.cpp removed the BOS token (ggml-org/llama.cpp@a55d8a9)

* Add tests

* I would prefer a stream based solution

* Add 70B models

* Add tests for skipping blank system prompt

* Remove InfillPrompt for now