Make sure 100% works with local models #69

Closed
lalalune opened this issue Oct 28, 2024 · 7 comments

@lalalune (Member)
We have a local llama setup but we haven't used it since all this hype started, so we need to go through and make sure that all local models are working correctly.

@sirkitree added this to Eliza on Oct 30, 2024
@o-on-x (Contributor) commented Nov 1, 2024

Created a fork that uses Ollama for llama.ts instead of node-llama-cpp. This lowers the technical debt of having to build llama-cpp and download a model. I haven't opened a PR yet in case the preferred approach is to add a separate ollama.ts and keep the llama-cpp local option rather than remove it.
https://github.com/o-on-x/eliza

@lalalune (Member, Author) commented Nov 2, 2024

> Created a fork that uses Ollama for llama.ts instead of node-llama-cpp. This lowers the technical debt of having to build llama-cpp and download a model. I haven't opened a PR yet in case the preferred approach is to add a separate ollama.ts and keep the llama-cpp local option rather than remove it. https://github.com/o-on-x/eliza

Can you please review the latest and add this as an additional provider option? If you search the code for 'ollama' you will see that there is already a comment.

@yodamaster726 (Contributor)

I'm trying the latest code and configuring it to use my local Ollama setup, but Eliza keeps wanting to download its own model. I don't want llama models getting downloaded into my src tree; that's bad practice. If it's going to do that, it should put them off the root under some /models directory or something like that. At the moment I'm trying to get this to work, and I'm willing to update the docs so others can benefit.

Here is what I've configured in my .env:
X_SERVER_URL=http://localhost:11434/
XAI_API_KEY=
XAI_MODEL=llama2

OLLAMA_HOST=http://localhost:11434/
OLLAMA_MODEL=llama2

Here is the output when I run pnpm run dev:

eliza>$ pnpm run dev

eliza@ dev /Users/davidjaramillo/Documents/Projects/eliza
pnpm --dir core dev

[email protected] dev /Users/davidjaramillo/Documents/Projects/eliza/core
tsc && nodemon

[nodemon] 3.1.7
[nodemon] to restart at any time, enter rs
[nodemon] watching path(s): src/**/*
[nodemon] watching extensions: ts
[nodemon] starting node --loader ts-node/esm src/index.ts
(node:93769) ExperimentalWarning: --experimental-loader may be removed in the future; instead use register():
--import 'data:text/javascript,import { register } from "node:module"; import { pathToFileURL } from "node:url"; register("ts-node/esm", pathToFileURL("./"));'
(Use node --trace-warnings ... to show where the warning was created)
(node:93769) [DEP0180] DeprecationWarning: fs.Stats constructor is deprecated.
(Use node --trace-deprecation ... to show where the warning was created)
(node:93769) [DEP0040] DeprecationWarning: The punycode module is deprecated. Please use a userland alternative instead.
No characters found, using default character
Starting agent for character Eliza
sqlite-vec extensions loaded successfully.
Importing action from: /Users/davidjaramillo/Documents/Projects/eliza/core/src/actions/askClaude.ts
Chat started. Type 'exit' to quit.
You: Importing action from: /Users/davidjaramillo/Documents/Projects/eliza/core/src/custom_actions/epicAction.ts
Server running at http://localhost:3000/
Failed to import action from /Users/davidjaramillo/Documents/Projects/eliza/core/src/custom_actions/epicAction.ts: Error: Cannot find module '/Users/davidjaramillo/Documents/Projects/eliza/core/src/custom_actions/epicAction.ts' imported from /Users/davidjaramillo/Documents/Projects/eliza/core/src/cli/config.ts
at finalizeResolution (/Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/ts-node@10.9.2_@types[email protected][email protected]/node_modules/ts-node/dist-raw/node-internal-modules-esm-resolve.js:366:11)
at moduleResolve (/Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/ts-node@10.9.2_@types[email protected][email protected]/node_modules/ts-node/dist-raw/node-internal-modules-esm-resolve.js:801:10)
at Object.defaultResolve (/Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/ts-node@10.9.2_@types[email protected][email protected]/node_modules/ts-node/dist-raw/node-internal-modules-esm-resolve.js:912:11)
at /Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/ts-node@10.9.2_@types[email protected][email protected]/node_modules/ts-node/src/esm.ts:218:35
at entrypointFallback (/Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/ts-node@10.9.2_@types[email protected][email protected]/node_modules/ts-node/src/esm.ts:168:34)
at /Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/ts-node@10.9.2_@types[email protected][email protected]/node_modules/ts-node/src/esm.ts:217:14
at addShortCircuitFlag (/Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/ts-node@10.9.2_@types[email protected][email protected]/node_modules/ts-node/src/esm.ts:409:21)
at resolve (/Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/ts-node@10.9.2_@types[email protected][email protected]/node_modules/ts-node/src/esm.ts:197:12)
at nextResolve (node:internal/modules/esm/hooks:748:28)
at Hooks.resolve (node:internal/modules/esm/hooks:240:30)
Creating runtime for character Eliza
Agent ID b850bc30-45f8-0041-a00a-83df46d8555d
Initializing LlamaLocal service for agent b850bc30-45f8-0041-a00a-83df46d8555d Eliza
Constructing
modelName model.gguf
Checking model
Model already exists.
Agent ID b850bc30-45f8-0041-a00a-83df46d8555d
Initializing LlamaLocal service for agent b850bc30-45f8-0041-a00a-83df46d8555d Eliza
**** No CUDA detected - local response will be slow
Creating grammar
Loading model
this.modelPath /Users/davidjaramillo/Documents/Projects/eliza/core/src/services/model.gguf
gguf_init_from_file: failed to read key-value pairs
[node-llama-cpp] llama_model_load: error loading model: llama_model_loader: failed to load model from /Users/davidjaramillo/Documents/Projects/eliza/core/src/services/model.gguf
[node-llama-cpp]
[node-llama-cpp] llama_load_model_from_file: failed to load model
Model initialization failed. Deleting model and retrying... Error: Failed to load model
at LlamaModel._create (file:///Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/[email protected][email protected]/node_modules/node-llama-cpp/dist/evaluator/LlamaModel/LlamaModel.js:479:23)
at async Object. (file:///Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/[email protected][email protected]/node_modules/node-llama-cpp/dist/bindings/Llama.js:194:24)
at async withLock (file:///Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/[email protected]/node_modules/lifecycle-utils/dist/withLock.js:36:16)
at async Llama.loadModel (file:///Users/davidjaramillo/Documents/Projects/eliza/node_modules/.pnpm/[email protected][email protected]/node_modules/node-llama-cpp/dist/bindings/Llama.js:190:16)
at async LlamaService.initializeModel (file:///Users/davidjaramillo/Documents/Projects/eliza/core/src/services/llama.ts:73:26)
Model deleted.
Checking model
this.modelPath /Users/davidjaramillo/Documents/Projects/eliza/core/src/services/model.gguf
Model not found. Downloading...
Following redirect to: https://cdn-lfs-us-1.hf.co/repos/97/cc/97ccd42703c9e659e537d359fa1863623f9371054d9ea5e4dd163214b7803ad1/c77c263f78b2f56fbaddd3ef2af750fda6ebb4344a546aaa0bfdd546b1ca8d84?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27Hermes-3-Llama-3.1-8B.Q8_0.gguf%3B+filename%3D%22Hermes-3-Llama-3.1-8B.Q8_0.gguf%22%3B&Expires=1730863045&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTczMDg2MzA0NX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zLzk3L2NjLzk3Y2NkNDI3MDNjOWU2NTllNTM3ZDM1OWZhMTg2MzYyM2Y5MzcxMDU0ZDllYTVlNGRkMTYzMjE0Yjc4MDNhZDEvYzc3YzI2M2Y3OGIyZjU2ZmJhZGRkM2VmMmFmNzUwZmRhNmViYjQzNDRhNTQ2YWFhMGJmZGQ1NDZiMWNhOGQ4ND9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSoifV19&Signature=ggX7ncHRkcidhIoJlVT2nDNU8pwhhrXGxg%7E0yEJbd65gVd2OGlJwlB7FZZOKwEWhATCrHSucHjuBFAor1bvwPH2X9IauYZugc-ZRXpuW8eFKAATOeuP2VqogGBAvz61WjExPzCy7yrHGi9ZeCGfq1WQ4zh2ST%7Ef9offz7s%7E9ZtYcoNB2HlwzgaxEHXXgssesQOuRbzTWyN41lYQafKpSpNlSvyfs7VIGFCfLaOanAuWxxXsjE2geHJIQUGwbCtJTYfjYemkSL77WkCu5vFZlHpk7CHtqlGMFvChV1myQj4ilwyh0urCkII3QNXj-rXT7-eQXO3mHt4xUVAyfOateiw__&Key-Pair-Id=K24J24Z295AEI9
^CELIFECYCLE Command failed.
ELIFECYCLE Command failed.
Terminated: 15
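
Separate from the Eliza run above, a minimal sketch for checking that the local Ollama server itself answers on its OpenAI-compatible endpoint; it assumes Ollama is listening on localhost:11434 and that the model named in OLLAMA_MODEL (llama2 here) has already been pulled:

```ts
// Minimal reachability check for a local Ollama server via its OpenAI-compatible API.
// Assumes Ollama is listening on localhost:11434 and the model has been pulled.
import { OpenAI } from 'openai';

const client = new OpenAI({
    baseURL: process.env.OLLAMA_HOST
        ? `${process.env.OLLAMA_HOST.replace(/\/$/, '')}/v1`
        : 'http://localhost:11434/v1',
    apiKey: 'ollama', // Ollama ignores the key, but the client requires one
});

async function ping() {
    const completion = await client.chat.completions.create({
        model: process.env.OLLAMA_MODEL || 'llama2',
        messages: [{ role: 'user', content: 'Reply with OK.' }],
        max_tokens: 8,
    });
    console.log(completion.choices[0].message.content);
}

ping().catch((err) => {
    console.error('Ollama endpoint not reachable:', err.message);
    process.exit(1);
});
```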

@o-on-x (Contributor) commented Nov 3, 2024

In a previous version I had changed llama.ts to use Ollama instead; this is updated for the latest version.
I will look into other ways to do this, but for now this sidesteps node-llama-cpp and GGUF model downloads.
Just set XAI_MODEL="name-of-model".

````ts
import { OpenAI } from 'openai';
import * as dotenv from 'dotenv';
import { debuglog } from 'util';

// Create debug logger
const debug = debuglog('LLAMA');

process.on('uncaughtException', (err) => {
    debug('Uncaught Exception:', err);
    process.exit(1);
});

process.on('unhandledRejection', (reason, promise) => {
    debug('Unhandled Rejection at:', promise, 'reason:', reason);
});

interface QueuedMessage {
    context: string;
    temperature: number;
    stop: string[];
    max_tokens: number;
    frequency_penalty: number;
    presence_penalty: number;
    useGrammar: boolean;
    resolve: (value: any | string | PromiseLike<any | string>) => void;
    reject: (reason?: any) => void;
}

class LlamaService {
    private static instance: LlamaService | null = null;
    private openai: OpenAI;
    private modelName: string;
    private embeddingModelName: string = 'nomic-embed-text';
    private messageQueue: QueuedMessage[] = [];
    private isProcessing: boolean = false;

    private constructor() {
        debug('Constructing LlamaService');
        dotenv.config();
        this.modelName = process.env.XAI_MODEL || 'llama3.2';
        // Point the OpenAI client at Ollama's OpenAI-compatible endpoint
        this.openai = new OpenAI({
            baseURL: 'http://localhost:11434/v1',
            apiKey: 'ollama',
            dangerouslyAllowBrowser: true,
        });
        debug(`Using model: ${this.modelName}`);
        debug('OpenAI client initialized');
    }

    public static getInstance(): LlamaService {
        debug('Getting LlamaService instance');
        if (!LlamaService.instance) {
            debug('Creating new instance');
            LlamaService.instance = new LlamaService();
        }
        return LlamaService.instance;
    }

    // initializeModel method to satisfy the ILlamaService interface
    public async initializeModel(): Promise<void> {
        debug('Initializing model...');
        try {
            // Placeholder for model setup if needed
            debug(`Model ${this.modelName} initialized successfully.`);
        } catch (error) {
            debug('Error during model initialization:', error);
            throw error;
        }
    }

    async queueMessageCompletion(
        context: string,
        temperature: number,
        stop: string[],
        frequency_penalty: number,
        presence_penalty: number,
        max_tokens: number
    ): Promise<any> {
        debug('Queueing message completion');
        return new Promise((resolve, reject) => {
            this.messageQueue.push({
                context,
                temperature,
                stop,
                frequency_penalty,
                presence_penalty,
                max_tokens,
                useGrammar: true,
                resolve,
                reject,
            });
            this.processQueue();
        });
    }

    async queueTextCompletion(
        context: string,
        temperature: number,
        stop: string[],
        frequency_penalty: number,
        presence_penalty: number,
        max_tokens: number
    ): Promise<string> {
        debug('Queueing text completion');
        return new Promise((resolve, reject) => {
            this.messageQueue.push({
                context,
                temperature,
                stop,
                frequency_penalty,
                presence_penalty,
                max_tokens,
                useGrammar: false,
                resolve,
                reject,
            });
            this.processQueue();
        });
    }

    private async processQueue() {
        debug(`Processing queue: ${this.messageQueue.length} items`);
        if (this.isProcessing || this.messageQueue.length === 0) {
            return;
        }

        this.isProcessing = true;

        while (this.messageQueue.length > 0) {
            const message = this.messageQueue.shift();
            if (message) {
                try {
                    const response = await this.getCompletionResponse(
                        message.context,
                        message.temperature,
                        message.stop,
                        message.frequency_penalty,
                        message.presence_penalty,
                        message.max_tokens,
                        message.useGrammar
                    );
                    message.resolve(response);
                } catch (error) {
                    debug('Queue processing error:', error);
                    message.reject(error);
                }
            }
        }

        this.isProcessing = false;
    }

    private async getCompletionResponse(
        context: string,
        temperature: number,
        stop: string[],
        frequency_penalty: number,
        presence_penalty: number,
        max_tokens: number,
        useGrammar: boolean
    ): Promise<any | string> {
        debug('Getting completion response');
        try {
            const completion = await this.openai.chat.completions.create({
                model: this.modelName,
                messages: [{ role: 'user', content: context }],
                temperature,
                max_tokens,
                stop,
                frequency_penalty,
                presence_penalty,
            });

            const response = completion.choices[0].message.content;

            if (useGrammar && response) {
                try {
                    const jsonResponse = JSON.parse(response);
                    return jsonResponse;
                } catch {
                    // Fall back to extracting a fenced JSON block from the response
                    const jsonMatch = response.match(/```json\s*([\s\S]*?)\s*```/);
                    if (jsonMatch) {
                        try {
                            return JSON.parse(jsonMatch[1]);
                        } catch {
                            throw new Error('Failed to parse JSON from response');
                        }
                    }
                    throw new Error('No valid JSON found in response');
                }
            }

            return response || '';
        } catch (error) {
            debug('Completion error:', error);
            throw error;
        }
    }

    async getEmbeddingResponse(input: string): Promise<number[] | undefined> {
        debug('Getting embedding response');
        try {
            const embeddingResponse = await this.openai.embeddings.create({
                model: this.embeddingModelName,
                input,
            });

            return embeddingResponse.data[0].embedding;
        } catch (error) {
            debug('Embedding error:', error);
            return undefined;
        }
    }
}

debug('LlamaService module loaded');
export default LlamaService;
````
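
A minimal usage sketch for the service above; the './llama' import path, prompt text, and parameter values are illustrative assumptions:

```ts
// Illustrative usage of the LlamaService above (import path and values are assumptions).
import LlamaService from './llama';

async function main() {
    const llama = LlamaService.getInstance();
    await llama.initializeModel();

    // Plain text completion against the local Ollama model
    const text = await llama.queueTextCompletion(
        'Say hello in one short sentence.', // context
        0.7,                                // temperature
        [],                                 // stop sequences
        0,                                  // frequency_penalty
        0,                                  // presence_penalty
        128                                 // max_tokens
    );
    console.log(text);

    // Embedding via the nomic-embed-text model
    const embedding = await llama.getEmbeddingResponse('hello world');
    console.log(embedding?.length);
}

main().catch(console.error);
```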

@o-on-x closed this as completed Nov 3, 2024
The github-project-automation bot moved this from Backlog to Done in Eliza on Nov 3, 2024
@o-on-x (Contributor) commented Nov 3, 2024

Need to add support through the new, updated model providers rather than just replacing llama-cpp.

@yodamaster726 (Contributor)

@o-on-x I believe it was you who shared that code with me, and I got it running locally.

But I need the .env settings that will connect to it. I've included what I used in the issue.

There must be more than just setting the XAI_MODEL.

I agree with @lalalune; are you looking to include this as an additional provider?

@o-on-x (Contributor) commented Nov 6, 2024

I added a new OLLAMA model provider. There is also a switch now in llama.ts: if you are using the local provider it uses Ollama, otherwise it defaults to llama-cpp. You can set the Ollama model provider to use a remote URL if hosting remotely, and select the models and embedding models. The environment variables to set are included in the .env.example. (The image-posting handling is also in this code; just don't merge that part in discord messages.ts.) https://github.com/o-on-x/eliza_ollama
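
For illustration only, a minimal, hypothetical sketch of the kind of env-driven switch described above; the variable names (LOCAL_LLAMA_PROVIDER, OLLAMA_HOST, OLLAMA_MODEL, OLLAMA_EMBEDDING_MODEL) and defaults are assumptions, not necessarily what the fork's .env.example uses:

```ts
// Hypothetical env-driven provider selection; variable names and defaults are assumptions.
interface OllamaProviderConfig {
    baseURL: string;        // Ollama's OpenAI-compatible endpoint
    model: string;          // chat/completion model
    embeddingModel: string; // embedding model
}

function loadOllamaConfig(): OllamaProviderConfig {
    const host = process.env.OLLAMA_HOST || 'http://localhost:11434';
    return {
        baseURL: `${host.replace(/\/$/, '')}/v1`,
        model: process.env.OLLAMA_MODEL || 'llama3.2',
        embeddingModel: process.env.OLLAMA_EMBEDDING_MODEL || 'nomic-embed-text',
    };
}

// Pick the local backend from an env switch; anything other than 'ollama' falls back to llama-cpp.
const useOllama = (process.env.LOCAL_LLAMA_PROVIDER || 'llama-cpp').toLowerCase() === 'ollama';
console.log(
    useOllama ? 'Using Ollama provider' : 'Using node-llama-cpp provider',
    useOllama ? loadOllamaConfig() : undefined
);
```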
