
feat: add local model support and examples #147

Closed
wants to merge 4 commits

Conversation

vacekj

@vacekj vacekj commented Dec 10, 2024

This PR adds a client implementation for calling local models, such as Ollama and LM Studio. It also duplicates some of the existing examples (simple chat, embeddings, and tool calls) for the local provider, and adds a collaboration example in which local models work together to find a good prompt for themselves.

I have verified that all of the examples work for me on an M1 Max with the latest Ollama.
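For readers following along, here is a rough sketch of what the duplicated simple-chat example could look like against a local server. The providers::local module path and constructor below are assumptions based on the PR description, not the final API:

use rig::{completion::Prompt, providers};

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Hypothetical: the module path `providers::local` and this constructor
    // mirror the PR description; the merged API may differ.
    let client = providers::local::Client::from_url("ollama", "http://localhost:11434/v1");

    // Build a simple agent backed by a locally served model.
    let agent = client
        .agent("llama3.2:latest")
        .preamble("You are a helpful assistant running entirely on local hardware.")
        .build();

    let response = agent.prompt("Say hello from a local model!").await?;
    println!("{}", response);

    Ok(())
}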

@cvauclair
Contributor

Hey @vacekj thank you for your contribution and interest in Rig, this is seriously awesome to see! We'll review this ASAP!

In the meantime, I've enabled CI on this PR so if you could address the couple linting/styling issues that might pop up that would be awesome. Cheers!

@vacekj
Author

vacekj commented Dec 10, 2024

I believe my last commit should address the single lint issue. Please feel free to re-run the CI to see if everything's alright.

@cvauclair cvauclair linked an issue Dec 10, 2024 that may be closed by this pull request
@cvauclair cvauclair requested a review from 0xMochan December 10, 2024 19:33
@cvauclair
Contributor

FYI @vacekj, the currently failing tests are known to fail when a PR comes from an outside contributor, so there's no need to worry about those.

@akashicMarga

@vacekj @cvauclair instead of a single local.rs in providers, could we have ollama.rs, lm_studio.rs, etc.? There are other servers that run locally too, like llama.cpp and LLMEdge.

@vacekj
Author

vacekj commented Dec 11, 2024

@akashicMarga All of the projects you mentioned use the standard OpenAI API. I don't think it's necessary to duplicate the client, as it would be the same interface for LM Studio, Ollama, and LLMEdge.
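As a quick illustration of that point (a sketch, not tested against every server): with an OpenAI-compatible client, switching between these backends should mostly come down to the base URL. The ports below are the usual defaults, and the /v1 suffix is an assumption about where each server exposes its OpenAI-compatible routes:

use rig::providers::openai;

fn main() {
    // Same client type, different base URLs. Ollama's default port is 11434
    // and LM Studio's local server defaults to 1234; adjust for your setup.
    let _ollama = openai::Client::from_url("ollama", "http://localhost:11434/v1");
    let _lm_studio = openai::Client::from_url("lm-studio", "http://localhost:1234/v1");
}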

@akashicMarga

> @akashicMarga All of the projects you mentioned use the standard OpenAI API. I don't think it's necessary to duplicate the client, as it would be the same interface for LM Studio, Ollama, and LLMEdge.

Thought so. I was just thinking about the case where, some day, a local server follows some other format and needs to be integrated. So maybe local_openai_compatible.rs, or something along those lines?

@cvauclair
Contributor

@akashicMarga @vacekj I think it's fine to have a local.rs for local OpenAI-compatible servers. However, if it turns out that, say, using the Ollama base API (and not the OpenAI-compatible one) works better, then we could make a dedicated client for it in ollama.rs.

@cvauclair
Contributor

@vacekj are there differences between your local.rs provider client and the openai.rs one? The latter allows the user to set the URL with from_url, which you could use to point the client to a local server, so perhaps a lot of code can be reused 🤔

@0xMochan
Contributor

> @vacekj are there differences between your local.rs provider client and the openai.rs one? The latter allows the user to set the URL with from_url, which you could use to point the client to a local server, so perhaps a lot of code can be reused 🤔

Yeah, the OpenAI compatibility server from Ollama works fine using just the OpenAI client, which is also what the Ollama documentation recommends when using the official openai PyPI library:

use rig::{completion::Prompt, providers};

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Create an Ollama client using the OpenAI compatibility layer
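    // Note (added for clarity): depending on the Rig version and server, the
    // OpenAI-compatible routes may live under a `/v1` prefix, e.g.
    // "http://localhost:11434/v1"; adjust the base URL if requests 404.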
    let client = providers::openai::Client::from_url("ollama", "http://localhost:11434");

    // Create agent with a single context prompt
    let comedian_agent = client
        .agent("llama3.2:latest")
        .preamble("You are a comedian here to entertain the user using humour and jokes.")
        .build();

    // Prompt the agent and print the response
    let response = comedian_agent.prompt("Entertain me!").await?;
    println!("{}", response);

    Ok(())
}

There is also the native Ollama API, which would be worth creating an ollama.rs file for. For this PR to continue, it would make sense to go with a direct Ollama integration so those specific features can be used!

@vacekj
Author

vacekj commented Dec 13, 2024

Great, then I will implement the API Ollama documents here: https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-chat-completion

That could unlock some cool new capabilities, like an agent pulling a new model and spawning it on demand. Agents spawning other agents, like recruiting for an army hehe.
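For reference, the native endpoint is a plain JSON POST to /api/chat, so a dedicated ollama.rs client mostly needs request/response types around calls like the following. This is a rough sketch using reqwest and serde_json directly rather than Rig's abstractions; the field names follow the linked Ollama docs:

use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Call Ollama's native chat endpoint (not the OpenAI-compatible one).
    let body = json!({
        "model": "llama3.2:latest",
        "messages": [{ "role": "user", "content": "Entertain me!" }],
        "stream": false
    });

    let response: serde_json::Value = reqwest::Client::new()
        .post("http://localhost:11434/api/chat")
        .json(&body)
        .send()
        .await?
        .json()
        .await?;

    // Non-streaming responses put the reply under message.content.
    println!("{}", response["message"]["content"].as_str().unwrap_or(""));

    Ok(())
}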

@cvauclair
Contributor

@vacekj I'll close this PR for now to reduce clutter, and because we're talking about a whole new client implementation here. We can re-open this one later or create a new one once the new integration is ready for review.

As always, thanks for contributing :)

@cvauclair cvauclair closed this Dec 16, 2024
@0xMochan 0xMochan mentioned this pull request Dec 28, 2024
Development

Successfully merging this pull request may close these issues.

feat: Local Models Support
4 participants