fix: add dll search path #982
Conversation
(force-pushed from b682f5c to 8bfffaa)
✅ Great work!
engine/commands/engine_init_cmd.cc (outdated)
} else if (e == "cortex.tensorrt-llm") {
  return cortex_utils::kTensorrtLlmPath;
}
return cortex_utils::kLlamaLibPath;
Lines 208 and 212 - is it extraneous to have 208?
We fall back to the default; I'll change the function to make that clearer.
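As a sketch of what that clarified fallback could look like (the function name GetEnginePath and the literal path values here are assumptions for illustration, not taken from the PR):

```cpp
#include <string>

// Assumed constants mirroring cortex_utils; the real values live in the repo.
namespace cortex_utils {
inline const std::string kTensorrtLlmPath = "engines/cortex.tensorrt-llm";
inline const std::string kLlamaLibPath = "engines/cortex.llamacpp";
}  // namespace cortex_utils

// Returns the library path for the requested engine `e`.
// Any engine without a dedicated path falls back to the llamacpp default,
// which is why the earlier `else if` on llamacpp looked redundant.
std::string GetEnginePath(const std::string& e) {
  if (e == "cortex.tensorrt-llm") {
    return cortex_utils::kTensorrtLlmPath;
  }
  // Default fallback: cortex.llamacpp and any unrecognized engine.
  return cortex_utils::kLlamaLibPath;
}
```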
// Unload the llamacpp dll directory, then load the tensorrt-llm one
// 2. If tensorrt-llm is loaded and the newly requested engine is llamacpp:
// Do nothing; llamacpp can reuse tensorrt-llm dependencies (needs to be tested carefully)
// 3. Add the dll directory if other conditions are met
Can I understand this in more detail:
- We are always loading the dll directory of the latest used engine
- Is this due to our decision to keep per-engine dependencies?
- Are we able to load dlls on an as-needed basis, without removing the "older" engine dlls?
I imagine this will come up often in demos - we will demo:
cortex run llama3.1
cortex run llama3.1:tensorrt-llm
cortex run llama3.1:onnx
We only load the engine's dll directory if the engine is not already loaded. The tensorrt-llm/llamacpp case is a little tricky because their CUDA dependencies may conflict, so we need to remove the old dll directory before adding the new one.
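Those switching rules can be modeled as a small decision function. This is a hypothetical sketch (the names DllAction and DecideDllAction are mine); on Windows the real implementation would pair these decisions with AddDllDirectory/RemoveDllDirectory calls:

```cpp
#include <string>

// Hypothetical model of the dll-directory switching rules described above.
enum class DllAction {
  kNone,           // engine already usable; leave dll directories alone
  kRemoveThenAdd,  // remove llamacpp's dll directory, then add tensorrt-llm's
  kAdd             // plain "add dll directory" for the new engine
};

DllAction DecideDllAction(const std::string& loaded,
                          const std::string& requested) {
  if (loaded == requested) {
    return DllAction::kNone;  // engine is already loaded
  }
  if (loaded == "cortex.llamacpp" && requested == "cortex.tensorrt-llm") {
    // CUDA dependencies may conflict, so remove before adding.
    return DllAction::kRemoveThenAdd;
  }
  if (loaded == "cortex.tensorrt-llm" && requested == "cortex.llamacpp") {
    // llamacpp can reuse tensorrt-llm's dependencies.
    return DllAction::kNone;
  }
  return DllAction::kAdd;  // all other conditions
}
```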
@vansangpfiev I do not have sufficient knowledge to assess this PR, so I've approved it first to ensure you aren't blocked. I've left a couple of basic clarifying questions, if you don't mind.