planning: Cortex handling Engine Variants #1453
Comments
dan-menlo changed the title from "epic: Cortex handling Engine versions and variants?" to "epic: Cortex handling Engine variants" on Oct 13, 2024
dan-menlo changed the title from "epic: Cortex handling Engine variants" to "epic: Cortex handling Engine Variants" on Oct 13, 2024
This was referenced Oct 13, 2024
dan-menlo assigned nguyenhoangthuan99 and namchuai and unassigned vansangpfiev and nguyenhoangthuan99 on Oct 13, 2024
dan-menlo changed the title from "epic: Cortex handling Engine Variants" to "planning: Cortex handling Engine Variants" on Oct 19, 2024
Closing into #1416
Goal
win-avx-512, win-hipblas, win-llvm
sycl, llvm
grow
LLVM for ARM-based CPUs #1251
sycl for Intel-based CPUs #1252
Tasklist
Questions
Scenario
llama-cuda-avx2 to llama-vulkan
Cortex needs an elegant way to handle different engine versions and variants without confusing the user. From my naive perspective, there are two key approaches:
Option 1: Every engine is versioned, and maintains a list of variants that it can use
use command
/engines API endpoint would have a use endpoint
Option 2: Every engine version/variant is a first-class Engine citizen
cortex engines list will show a massive long list of engines
> cortex engines list
llama.cpp-b3919-cuda
llama.cpp-b3821-vulkan
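To make the difference between the two options concrete, here is a minimal Python sketch of both data models. It is not Cortex's actual implementation or API: `VersionedEngine`, `split_engine_id`, and `group_by_version` are hypothetical names, and the parsing assumes flat identifiers follow a `name-version-variant` pattern in which the engine name itself contains no hyphen (as in `llama.cpp-b3919-cuda`).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class VersionedEngine:
    """Option 1 (hypothetical model): one entry per engine version,
    which keeps its own list of variants and one active variant."""
    name: str
    version: str
    variants: List[str] = field(default_factory=list)
    active: Optional[str] = None

    def use(self, variant: str) -> None:
        # Mirrors the idea of a `use` command: pick one known variant.
        if variant not in self.variants:
            raise ValueError(f"unknown variant {variant!r} for {self.name} {self.version}")
        self.active = variant


def split_engine_id(engine_id: str) -> Tuple[str, str, str]:
    """Option 2 (hypothetical model): every version/variant is a flat id.
    Assumes 'name-version-variant' with no hyphen in the engine name."""
    name, version, variant = engine_id.split("-", 2)
    return name, version, variant


def group_by_version(engine_ids: List[str]) -> Dict[Tuple[str, str], VersionedEngine]:
    """Fold an Option 2 style flat list into Option 1 style entries."""
    grouped: Dict[Tuple[str, str], VersionedEngine] = {}
    for engine_id in engine_ids:
        name, version, variant = split_engine_id(engine_id)
        entry = grouped.setdefault((name, version), VersionedEngine(name, version))
        entry.variants.append(variant)
    return grouped


flat_list = ["llama.cpp-b3919-cuda", "llama.cpp-b3821-vulkan"]
engines = group_by_version(flat_list)
engines[("llama.cpp", "b3919")].use("cuda")
```

Under Option 2 the user sees every entry of `flat_list` directly, so the list grows with each new version and variant. Under Option 1 the user sees one entry per engine version and selects a variant within it.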