New Experimental Training Features, Providers Refactor #1155
Merged
Josh-XT changed the title from "Refactor TTS, Audio to Text, and Image Generation to Providers" to "Refactor TTS, Audio to Text, Embeddings, and Image Generation to Providers" on Mar 29, 2024
Josh-XT changed the title from "Refactor TTS, Audio to Text, Embeddings, and Image Generation to Providers" to "Providers Refactor, Dataset Creation Functionality Created" on Mar 29, 2024
Providers Refactor
default provider that uses gpt4free for LLM, faster-whisper for audio transcription/translation, streamlabs for text-to-speech, ONNX all-MiniLM-L6-v2 embedder (256 chunk size), and stable diffusion on Hugging Face for image generation (requires HUGGINGFACE_API_KEY).
Refactor TTS, Audio to Text, Embeddings, and Image Generation to Providers
Providers now expose multiple services, replacing the separate extensions that each provider previously needed for things like TTS, audio to text, embeddings, and image generation.
Each provider now has a services property, which is a list of the services available from that provider. Providers with an embeddings service have an additional chunk_size property for the embedder.
For example, the OpenAI provider has:
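As a rough illustration of the shape described here (not the actual OpenAI provider code), a provider's services list and optional chunk_size could be modeled as below. The ProviderInfo class, the providers_for helper, the provider names, and the service names are all hypothetical; only the idea of a services list and a chunk_size for embedders comes from this PR.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProviderInfo:
    """Hypothetical stand-in for a provider; not the AGiXT class."""
    name: str
    # Services this provider offers (service names here are assumed).
    services: List[str] = field(default_factory=list)
    # Only meaningful when "embeddings" is one of the services.
    chunk_size: Optional[int] = None


def providers_for(service: str, providers: List[ProviderInfo]) -> List[str]:
    """Return the names of providers that list the requested service."""
    return [p.name for p in providers if service in p.services]


providers = [
    ProviderInfo("example_full", ["llm", "tts", "embeddings"], chunk_size=256),
    ProviderInfo("example_tts_only", ["tts"]),
]

embedders = providers_for("embeddings", providers)
speakers = providers_for("tts", providers)
```

Selecting by service rather than by provider name is what lets one provider cover several roles at once.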
New Experimental Training Features
These new training features require some testing and will improve as better training methods become available. The first implementation for training that I have built in is DPO. Open to feedback and improvements.
DPO, CPO, and ORPO style Dataset Creation Functionality Created
question/good answer/bad answer dataset in DPO / CPO / ORPO format to be used in Transformers (or pick your solution) to fine-tune models.
/api/agent/{agent_name}/memory/dataset
AGiXT/agixt/WORKSPACE/{dataset_name}.json.
Example with Python SDK
The example below will consume the AGiXT GitHub repository into the agent's memory, then create a synthetic dataset with the learned information.
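Separately from the SDK, the dataset creation step can be sketched as a raw HTTP request against the /api/agent/{agent_name}/memory/dataset endpoint. This is a minimal sketch under several assumptions that this PR text does not confirm: that the endpoint accepts a POST with a JSON body, that the body fields are named dataset_name and batch_size, that the API key goes in an Authorization header, and that the server listens on http://localhost:7437. The build_dataset_request helper is hypothetical. It only builds the request and does not send it.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_dataset_request(base_uri, agent_name, dataset_name,
                          api_key="", batch_size=5):
    """Build (but do not send) a request to the dataset endpoint.

    HTTP method, payload field names, and auth header are assumptions.
    """
    url = f"{base_uri.rstrip('/')}/api/agent/{quote(agent_name)}/memory/dataset"
    payload = {"dataset_name": dataset_name, "batch_size": batch_size}
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = api_key  # assumed auth scheme
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Assumed local server address and placeholder names.
req = build_dataset_request("http://localhost:7437/", "my_agent", "my_dataset")
# To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL and body before pointing it at a running server.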
Model Training Based on Agent Memories
Finally making training a full process instead of stopping at the memories. After your agent learns from a GitHub repo, files, arXiv articles, websites, or YouTube captions, you can use the new training endpoint to:
unsloth
private_repo on a bool once complete if your agent has a HUGGINGFACE_API_KEY in its config.
Chat Completions endpoint modifications
Several modifications have been made to the Chat Completions endpoint to bring it more in line with the OpenAI endpoints. These modifications are in addition to the changes in #1154.