feat(forge/llm): Add LlamafileProvider
#7091
Conversation
…der for llamafiles. Currently it just extends OpenAIProvider and only overrides methods that are necessary to get the system to work at a basic level. Update ModelProviderName schema and config/configurator so that app startup using this provider is handled correctly. Add 'mistral-7b-instruct-v0' to OpenAIModelName/OPEN_AI_CHAT_MODELS registries.
…-Instruct chat template, which supports the 'user' & 'assistant' roles but does not support the 'system' role.
…kens`, and `get_tokenizer` from classmethods so I can override them in `LlamafileProvider` (and so I can access instance attributes from inside them). Implement class `LlamafileTokenizer` that calls the llamafile server's `/tokenize` API endpoint (see the sketch after this commit list).
…tes on the integration; add helper scripts for downloading/running a llamafile + example env file.
…gs for reproducibility
…ange serve.sh to use the model's full context size (this does not seem to cause OOM errors, surprisingly).
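For context on the commits above that touch tokenization and the Mistral chat template, here is a minimal sketch of what the tokenizer wrapper and role handling could look like. The class name, helper name, base URL, and message-merging strategy are illustrative assumptions rather than the PR's actual code; only the `/tokenize` endpoint shape ({"content": ...} -> {"tokens": [...]}) comes from llama.cpp's server, which llamafile embeds.

```python
import requests

LLAMAFILE_BASE_URL = "http://localhost:8080"  # assumed default for serve.sh


class LlamafileTokenizer:
    """Sketch: tokenize via the running llamafile server's /tokenize endpoint."""

    def __init__(self, base_url: str = LLAMAFILE_BASE_URL):
        self.base_url = base_url

    def encode(self, text: str) -> list[int]:
        # The server tokenizes with the model's own vocabulary, so token
        # counts match what the model actually sees.
        resp = requests.post(f"{self.base_url}/tokenize", json={"content": text})
        resp.raise_for_status()
        return resp.json()["tokens"]

    def count_tokens(self, text: str) -> int:
        return len(self.encode(text))


def merge_system_into_first_user(messages: list[dict]) -> list[dict]:
    """Sketch: fold 'system' messages into the first 'user' turn, since the
    Mistral-7B-Instruct chat template only knows user/assistant roles."""
    system_text = "\n".join(m["content"] for m in messages if m["role"] == "system")
    merged: list[dict] = []
    for m in messages:
        if m["role"] == "system":
            continue
        if system_text and not merged and m["role"] == "user":
            m = {"role": "user", "content": f"{system_text}\n\n{m['content']}"}
        merged.append(m)
    return merged
```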
✅ Deploy Preview for auto-gpt-docs canceled.
This pull request has conflicts with the base branch; please resolve those so we can evaluate the pull request.
@CodiumAI-Agent /review
PR Review
Code feedback:
@k8si any chance you could enable maintainer write access on this PR?
@Pwuts it doesn't look like I have the ability to do that. I added you as a maintainer on the forked project; is that sufficient, or do others need write access? Alternatively, you could branch off my branch and I can just accept the changes via PR?
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
…vider`, `GroqProvider` and `LlamafileProvider` and rebase the latter three on `BaseOpenAIProvider`
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
Codecov Report
Attention: Patch coverage is …
Additional details and impacted files
@@ Coverage Diff @@
## master #7091 +/- ##
==========================================
- Coverage 54.21% 53.81% -0.41%
==========================================
Files 122 124 +2
Lines 6875 7021 +146
Branches 881 909 +28
==========================================
+ Hits 3727 3778 +51
- Misses 3015 3110 +95
Partials 133 133
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
# 5 tokens for [INST], [/INST], which actually get
# tokenized into "[, INST, ]" and "[, /, INST, ]"
# by the mistral tokenizer
prompt_added += 5
That's 7? 🤔
@k8si can you clarify?
this "works" but the model isn't great. Can we constrain the output schema to our models like is an option in the llamafile UI?
I think we could implement something like that by allowing a model to be passed as the …
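For reference, llamafile embeds llama.cpp's server, whose native `/completion` endpoint accepts a GBNF `grammar` field; this is the mechanism behind the constrained-output option in the llamafile UI mentioned above. A hedged sketch, assuming a llamafile running on the default local port; the prompt, grammar, and URL are illustrative:

```python
import requests

# GBNF grammar forcing output of the form {"command": "<string>"}
GRAMMAR = r'''
root   ::= "{" ws "\"command\"" ws ":" ws string ws "}"
string ::= "\"" [^"]* "\""
ws     ::= [ \t\n]*
'''

resp = requests.post(
    "http://localhost:8080/completion",  # assumed local llamafile server
    json={
        "prompt": "[INST] Reply with the next command as JSON. [/INST]",
        "grammar": GRAMMAR,
        "n_predict": 64,
    },
)
resp.raise_for_status()
print(resp.json()["content"])  # constrained by the grammar to match the shape
```

Passing a response model as suggested above would presumably boil down to translating that model into such a grammar before the request.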
This pull request has conflicts with the base branch; please resolve those so we can evaluate the pull request.
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
Background
This draft PR is a step toward enabling the use of local models in AutoGPT by adding llamafile as an LLM provider.
Implementation notes are included in `forge/forge/llm/providers/llamafile/README.md`.
Related issues:

Depends on: `BaseOpenAIProvider` -> deduplicate `GroqProvider` & `OpenAIProvider` #7178

Changes 🏗️
- Add minimal implementation of `LlamafileProvider`, a new `ChatModelProvider` for llamafiles. It extends `BaseOpenAIProvider` and only overrides methods that are necessary to get the system to work at a basic level.
- Add support for `mistral-7b-instruct-v0.2`. This is the only model currently supported by `LlamafileProvider`, because it is the only model I have tested with.
- Misc changes to app configuration to enable switching between openai/llamafile providers. In particular, added config field `LLM_PROVIDER` that, when set to 'llamafile', will use `LlamafileProvider` in agents rather than `OpenAIProvider` (see the sketch after this list).
- Add instructions for using AutoGPT with llamafile to the docs at `autogpt/setup/index.md`.
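A minimal sketch of the provider switch described in the third bullet. The module paths, function name, and fallback behavior here are assumptions; only the `LLM_PROVIDER` field name and the two provider classes come from this PR:

```python
import os

from forge.llm.providers.llamafile import LlamafileProvider  # added by this PR
from forge.llm.providers.openai import OpenAIProvider  # assumed module path


def get_llm_provider():
    # LLM_PROVIDER=llamafile selects the local llamafile backend;
    # any other value falls back to OpenAI (assumed default).
    if os.getenv("LLM_PROVIDER", "openai") == "llamafile":
        return LlamafileProvider()
    return OpenAIProvider()
```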
Limitations:
PR Quality Scorecard ✨
- Have you used the PR description template? +2 pts
- Is your pull request atomic, focusing on a single change? +5 pts
- Have you linked the GitHub issue(s) that this PR addresses? +5 pts
- Have you documented your changes clearly and comprehensively? +5 pts
- Have you changed or added a feature? -4 pts
  - Have you added/updated corresponding documentation? +4 pts
  - Have you added/updated corresponding integration tests? +5 pts
- Have you changed the behavior of AutoGPT? -5 pts
  - Have you also run `agbenchmark` to verify that these changes do not regress performance? +10 pts