
epic: llama.cpp is installed by default #1217

Closed · 2 tasks done
dan-menlo opened this issue Sep 15, 2024 · 5 comments

@dan-menlo (Contributor) commented Sep 15, 2024:

Goal

Cortex.cpp should have a super easy UX that is on par with market alternatives.

  • Users should have a 1-click installer that prioritizes simple UX over installer size
    • The installer packages (or downloads at install time) the llama.cpp binaries (e.g. up to 1 GB)
    • The installer optimizes for "universal" installs (i.e. it downloads all variants, then deletes the unnecessary files afterwards)
    • e.g. the Mac universal installer includes llama.cpp builds for both Intel and Apple Silicon
    • e.g. the Windows + Nvidia universal installer includes llama.cpp builds for both CUDA versions
  • For this epic, I am open to either:

Idea

I wonder whether the solution here is an optional local lookup as part of cortex engines install (see the sketch after this list):

  • The installer can look in its own folder to see whether dependencies are available, and only pull from the remote if needed
  • We do not need to make any changes to the installer (it still just runs cortex engines install)
  • This approach is elegant, and gives us flexibility in packaging
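
A minimal sketch of that lookup, under stated assumptions: the deps/ folder layout, the archive file name, and the main() wiring below are all hypothetical illustrations, not the actual cortex.cpp implementation.

```cpp
// Illustrative only: check the installer's own folder for a bundled engine
// archive before falling back to the existing remote download path.
#include <filesystem>
#include <iostream>
#include <string>

namespace fs = std::filesystem;

// Returns the path of a locally bundled engine archive, or an empty path.
fs::path FindLocalEngine(const fs::path& installer_dir,
                         const std::string& archive_name) {
  const fs::path candidate = installer_dir / "deps" / archive_name;
  return fs::exists(candidate) ? candidate : fs::path{};
}

int main() {
  // Hypothetical archive name; the real name would come from the engine
  // manifest that `cortex engines install` already resolves.
  const std::string archive = "cortex.llamacpp-windows-amd64-cuda-12-0.tar.gz";

  if (const fs::path local = FindLocalEngine(fs::current_path(), archive);
      !local.empty()) {
    std::cout << "Installing engine from bundled archive: " << local << "\n";
    // ... extract into the engines folder ...
  } else {
    std::cout << "No bundled archive; downloading " << archive << "\n";
    // ... existing remote download path ...
  }
  return 0;
}
```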

Out-of-scope (future)

  • We should offer a "cortex-alpine" installer with a minimal file size
    • Targeted at embedded use cases, and at people who want to use ONNX or TensorRT-LLM without llama.cpp
    • Users will have to download engines as a post-install step
  • We should offer "universal" installers that pre-package all potential dependencies
    • e.g. a large installer, but one that packages all dependencies for offline install

Outcomes

  • The Cortex.cpp installer should install llama.cpp by default
  • The Cortex.cpp installer should install the correct version of llama.cpp for the hardware (see the sketch below)
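
As a rough illustration of the second outcome, hardware-based variant selection might look like the sketch below. The probes and variant names are assumptions (they mirror the naming style of llama.cpp build variants), not the installer's actual logic:

```cpp
// Illustrative hardware probes for picking a llama.cpp build variant.
#include <cstdlib>
#include <iostream>
#include <string>

// Assumption: a usable NVIDIA driver can be approximated by `nvidia-smi`
// exiting successfully.
bool HasNvidiaGpu() {
  return std::system("nvidia-smi >/dev/null 2>&1") == 0;
}

// Placeholder: this reflects the flags this binary was compiled with, not
// the host CPU; a real probe would use cpuid (x86) or sysctl (macOS),
// and would default to the Apple Silicon build on arm64 Macs.
bool CpuSupportsAvx2() {
#if defined(__AVX2__)
  return true;
#else
  return false;
#endif
}

int main() {
  const std::string variant = HasNvidiaGpu()      ? "cuda"
                              : CpuSupportsAvx2() ? "avx2"
                                                  : "noavx";
  std::cout << "Selected llama.cpp variant: " << variant << "\n";
  return 0;
}
```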

Key Questions

  • Should we align with llama.cpp's versions? (e.g. Vulkan, SYCL)

Appendix

Why?

Our current cortex.cpp v0.1 onboarding UX is not user-friendly:

  • llama.cpp only seems to be downloaded on the first run of Cortex (at least on Windows)
    • The download UX is poor (no progress indicator)
    • The download is often slow, or drops entirely


  • Very often, the llama.cpp engine download fails outright, resulting in a bad UX
    • "Engine not loaded yet"


@dan-menlo dan-menlo added this to Menlo Sep 15, 2024
@dan-menlo dan-menlo converted this from a draft issue Sep 15, 2024
@dan-menlo dan-menlo changed the title epic: Cortex.cpp is installed with llama.cpp by default epic: llama.cpp is installed by default Sep 15, 2024
@dan-menlo dan-menlo assigned namchuai and unassigned vansangpfiev Sep 15, 2024
@dan-menlo (Contributor, Author) commented:

@hiento09 has worked on a PR that moves the llama.cpp engine install into the installer, but I am concerned that it is still not a great UX:

https://github.com/janhq/cortex.cpp/pull/1219/files

@namchuai (Collaborator) commented:

Just saw this today. "Engine not loaded yet" does not mean the engine has not been downloaded; there may be a problem with the engine-loading logic.

@freelerobot (Contributor) commented:

QA Updates (v75)

  • ✅ Works on Windows
  • ✅ Works on Mac

@vansangpfiev (Contributor) commented:

We download the CUDA dependencies that the NVIDIA driver supports.
Once #1085 has been resolved, we should update the CUDA-dependency logic for the installer (see the sketch after this list):

  • cortex searches a local folder for the CUDA 11.7/12.0 dependencies for the llamacpp engine
  • if found, it unzips the CUDA dependencies into the installation folder
  • if not, it downloads CUDA 11.7/12.0 from the jan host

cc: @hiento09 @dan-homebrew
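
A sketch of that flow, combining the driver check with the local-first lookup. The driver-to-CUDA cutoffs follow NVIDIA's published compatibility table for Linux drivers; the paths, archive names, and download step are assumptions, not the installer's actual code:

```cpp
// Illustrative CUDA-dependency step for the installer.
#include <filesystem>
#include <iostream>
#include <string>

namespace fs = std::filesystem;

// CUDA 12.0 needs driver >= 525.60.13; CUDA 11.7 needs >= 515.43.04 (Linux
// numbers from NVIDIA's release notes). Returns "" if neither is supported.
std::string CudaForDriver(int driver_major) {
  if (driver_major >= 525) return "12.0";
  if (driver_major >= 515) return "11.7";
  return "";
}

int main() {
  // Assumption: the driver version was parsed elsewhere, e.g. from
  // `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.
  const int driver_major = 535;
  const std::string cuda = CudaForDriver(driver_major);
  if (cuda.empty()) {
    std::cout << "Driver too old for CUDA 11.7/12.0; skipping CUDA deps\n";
    return 0;
  }

  // Hypothetical archive name and folder layout.
  const std::string archive = "cuda-" + cuda + "-linux-amd64.tar.gz";
  const fs::path local = fs::current_path() / archive;
  if (fs::exists(local)) {
    std::cout << "Unzipping bundled " << archive
              << " into the installation folder\n";
    // ... extract next to the llamacpp engine ...
  } else {
    std::cout << "Downloading " << archive << " from the jan host\n";
    // ... remote fetch ...
  }
  return 0;
}
```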

@dan-menlo dan-menlo moved this from In Review to Review + QA in Menlo Sep 29, 2024
@gabrielle-ong (Contributor) commented:

QA v123
✅ Mac
✅ Windows
✅ Linux

