This issue was moved to a discussion.

Discussion: Cortex.cpp Hardware Detection, Selection, and Memory Management #1089

Closed
dan-menlo opened this issue Sep 4, 2024 · 5 comments
@dan-menlo
Contributor

dan-menlo commented Sep 4, 2024

Overview

Note: We will probably need to break this discussion down into smaller topics:

  • Detection: How do we detect the user's hardware, including GPUs (Nvidia, AMD, etc.)?
  • Selection: Can the user select what hardware they want to run on (e.g. CPU-only, GPU 1 or 2)?
  • Prediction: Can we predict which models won't run, based on Hardware Selection?
  • Memory Management: Are we able to detect how much GPU VRAM the currently loaded models use (e.g. to prevent the user from hitting OOM errors when loading a new model)?
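
To make the Prediction and Memory Management questions concrete, here is a minimal sketch of a fit check. It is not cortex.cpp code: the function name `ModelLikelyFits` and the 20% default `overhead_ratio` (a rough allowance for KV cache and scratch buffers) are assumptions made purely for illustration.

```cpp
#include <cstdint>

constexpr std::uint64_t kGiB = 1024ull * 1024ull * 1024ull;

// Hypothetical helper, not part of cortex.cpp.
// Returns true if a model of `model_bytes` is expected to fit into
// `free_vram_bytes` once an assumed safety margin is added on top.
bool ModelLikelyFits(std::uint64_t model_bytes,
                     std::uint64_t free_vram_bytes,
                     double overhead_ratio = 0.2) {
  const double required =
      static_cast<double>(model_bytes) * (1.0 + overhead_ratio);
  return required <= static_cast<double>(free_vram_bytes);
}
```

With the assumed default margin, a 4 GiB model passes the check on a GPU with 8 GiB free, while a 7 GiB model does not. A real predictor would also need to account for context length and how many layers are offloaded.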

Related

@dan-menlo dan-menlo added this to Menlo Sep 4, 2024
@dan-menlo dan-menlo converted this from a draft issue Sep 4, 2024
@dan-menlo dan-menlo changed the title Discussion: Cortex Hardware Detection & Memory Management Discussion: Cortex.cpp Hardware Detection & Memory Management Sep 4, 2024
@namchuai
Collaborator

namchuai commented Sep 4, 2024

I have some additional information on this subject. Please add more if you have any ideas. @vansangpfiev @nguyenhoangthuan99

  1. How do we detect the user's hardware?
  2. How do we detect GPUs (Nvidia, AMD, etc.)?
  • For Nvidia GPUs, we have to dump data from nvidia-smi (an executable from Nvidia that ships with the Nvidia driver, IIRC).
  • For AMD GPUs, we will dump data from vulkaninfoSDK. This executable is available on the internet. We need to download it on demand or package it.
  3. Are we able to detect how much GPU VRAM current models use (e.g. to prevent the user from hitting OOM errors when loading a new model)?
    nvidia-smi does provide VRAM information. I'm not entirely sure about vulkaninfoSDK though. I will keep this updated.
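
A minimal sketch of the nvidia-smi approach, assuming the tool is invoked as `nvidia-smi --query-gpu=index,name,memory.total,memory.used --format=csv,noheader,nounits`, which prints one comma-separated line per GPU with memory in MiB. The `GpuInfo` struct and `ParseNvidiaSmiCsv` function are hypothetical names for illustration, not cortex.cpp code. Capturing the command's output (for example with `popen` on POSIX) is left out so the parser can be tested on its own.

```cpp
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one GPU; memory values are in MiB.
struct GpuInfo {
  int index;
  std::string name;
  std::uint64_t total_mib;
  std::uint64_t used_mib;
};

// Strips leading and trailing whitespace from a CSV cell.
static std::string Trim(const std::string& s) {
  const auto first = s.find_first_not_of(" \t\r\n");
  if (first == std::string::npos) return "";
  const auto last = s.find_last_not_of(" \t\r\n");
  return s.substr(first, last - first + 1);
}

// Parses lines of the form "0, NVIDIA GeForce RTX 3090, 24576, 1024".
// Assumes GPU names contain no commas.
std::vector<GpuInfo> ParseNvidiaSmiCsv(const std::string& output) {
  std::vector<GpuInfo> gpus;
  std::istringstream lines(output);
  std::string line;
  while (std::getline(lines, line)) {
    std::vector<std::string> fields;
    std::istringstream cells(line);
    std::string cell;
    while (std::getline(cells, cell, ',')) fields.push_back(Trim(cell));
    if (fields.size() != 4) continue;  // skip blank or malformed lines
    try {
      gpus.push_back({std::stoi(fields[0]), fields[1],
                      std::stoull(fields[2]), std::stoull(fields[3])});
    } catch (const std::exception&) {
      // Non-numeric value such as "[N/A]": skip this GPU.
    }
  }
  return gpus;
}
```

Free VRAM per GPU is then `total_mib - used_mib`. Whether vulkaninfoSDK exposes comparable numbers is still open, as noted above.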

@freelerobot
Contributor

freelerobot commented Sep 4, 2024

  1. When in runtime do we detect OS and architecture?
  2. Do we have graceful failures when users have incompatible setup?
  • What's the error message when we detect an incompatible OS? Recommendation? Fallback option (CPU)?
  • What's the error message when we detect incompatible hardware? Recommendation? Fallback option (if any)?
  • Any other incompatible checks we can make and alert on?
  • Is it good practice to link to a support page?
  • Are these error messages implemented in cortexcpp API, so that Jan application can bubble it up to users?

At the moment we fail silently. Users get a vague message and have to send us their logs, creating more work on both sides.
If they have a niche architecture that is not supported, we should make that very clear in the error. (More likely, they'll download the wrong distro, in which case a clear error message would be nice.)

  1. Do we currently have a compatibility chart anywhere on supported OS/hardware and versions?
  2. If not, let's make one? For all 3 engines.

@freelerobot
Contributor

@dan-homebrew let's handle the common model-loading graceful failures in a separate ticket. 🙏

@namchuai
Collaborator

namchuai commented Sep 5, 2024

  1. When in runtime do we detect OS and architecture?
    I don't think we need this, because our executable is built separately for each platform, so we can use macros to detect the OS and arch at compile time.

  2. Do we have graceful failures when users have incompatible setup?
    Currently we don't have a general message for users who have an incompatible setup. I think we can run the check in the main process when starting cortex and write to stderr (std::cerr) if the user has an incompatible setup.

  • What's the error message when we detect an incompatible OS? Recommendation? Fallback option (CPU)?
    • IMO: Incompatible OS! Cortex only supports Windows, Linux and macOS. Exiting...
  • What's the error message when we detect incompatible hardware? Recommendation? Fallback option (if any)?
    • I might need some examples here. We only have executables for amd64 and arm64, so on any other arch the executable won't run at all.
    • As for GPU incompatibility, I don't have any ideas yet. Please suggest!
  • Any other incompatible checks we can make and alert on?
  • Is it good practice to link to a support page?
    • Yes, I think so
  • Are these error messages implemented in cortexcpp API, so that Jan application can bubble it up to users?
    • I think we should be unopinionated and provide an error message along with an error code, so that Jan (and other cortex consumer apps) can choose to bubble it up to the user or alter it as they want.
  1. Do we currently have a compatibility chart anywhere on supported OS/hardware and versions?
  • We don't have any chart at the moment.
  2. If not, let's make one? For all 3 engines.
  • 👍
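
A minimal sketch of the compile-time detection mentioned in point 1, using the predefined macros that MSVC, GCC and Clang set for the target platform. The function names `DetectOs`, `DetectArch` and `PlatformString` are hypothetical and not taken from cortex.cpp.

```cpp
#include <string>

// Resolved at compile time from predefined compiler macros.
constexpr const char* DetectOs() {
#if defined(_WIN32)
  return "windows";
#elif defined(__APPLE__)
  return "macos";  // note: __APPLE__ is also defined on iOS targets
#elif defined(__linux__)
  return "linux";
#else
  return "unknown";
#endif
}

constexpr const char* DetectArch() {
#if defined(__x86_64__) || defined(_M_X64)
  return "amd64";
#elif defined(__aarch64__) || defined(_M_ARM64)
  return "arm64";
#else
  return "unknown";
#endif
}

// Convenience helper for logs and error messages, e.g. "<os>-<arch>".
std::string PlatformString() {
  return std::string(DetectOs()) + "-" + DetectArch();
}
```

This only reports what the binary was built for. It cannot tell that a user downloaded the wrong build, because a binary for the wrong OS or arch never gets far enough to run the check.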

Please correct me if I'm wrong @nguyenhoangthuan99 @vansangpfiev

@freelerobot
Contributor

  1. See bug https://github.com/janhq/jan/issues/2734 . We also need to think through whether this is an API endpoint used by Jan.

  2. I think we should have error codes like UnsupportedCPU or InsufficientMemory, similar to OpenAI, but covering a lower level of errors that we might not want to abstract away from users at the moment. The errors should get properly bubbled up to users in the Jan app (cc @louis-jan ) so we can stop asking people for their logs. 😢
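
A minimal sketch of what an error code plus message could look like. Only the names UnsupportedCPU and InsufficientMemory come from this thread. The `HardwareError` enum, the `HardwareStatus` struct, `CheckMemory`, the UnsupportedOS code and the message wording are all assumptions for illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical error codes; UnsupportedOS is an added assumption.
enum class HardwareError {
  kNone,
  kUnsupportedOS,
  kUnsupportedCPU,
  kInsufficientMemory,
};

// Code for machines, message for humans; consumers decide what to show.
struct HardwareStatus {
  HardwareError code;
  std::string message;
};

// Stable string form of the code, suitable for an API response field.
const char* ToString(HardwareError code) {
  switch (code) {
    case HardwareError::kNone: return "None";
    case HardwareError::kUnsupportedOS: return "UnsupportedOS";
    case HardwareError::kUnsupportedCPU: return "UnsupportedCPU";
    case HardwareError::kInsufficientMemory: return "InsufficientMemory";
  }
  return "Unknown";
}

// Example check: compares required and available memory, both in MiB.
HardwareStatus CheckMemory(std::uint64_t required_mib,
                           std::uint64_t available_mib) {
  if (required_mib > available_mib) {
    return {HardwareError::kInsufficientMemory,
            "Model needs " + std::to_string(required_mib) +
                " MiB but only " + std::to_string(available_mib) +
                " MiB is available."};
  }
  return {HardwareError::kNone, ""};
}
```

A consumer such as Jan could switch on the code and show its own wording, or pass the message through unchanged.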

Compatibility chart DRAFT. @Van-QA I'm wondering if you have a better version?

https://docs.google.com/spreadsheets/d/1skQLXm2iVjEsG_TJsTN7jH7nfMTj7XMx6QBKG2DRlfc/edit?gid=1694305799#gid=1694305799

@dan-menlo dan-menlo changed the title Discussion: Cortex.cpp Hardware Detection & Memory Management Discussion: Cortex.cpp Hardware Detection & Config and Memory Management Sep 5, 2024
@dan-menlo dan-menlo changed the title Discussion: Cortex.cpp Hardware Detection & Config and Memory Management Discussion: Cortex.cpp Hardware Detection, Selection, and Memory Management Sep 5, 2024
@dan-menlo dan-menlo changed the title Discussion: Cortex.cpp Hardware Detection, Selection, and Memory Management Discussion: Cortex.cpp Hardware Detection, Selection, Prediction and Memory Management Sep 5, 2024
@dan-menlo dan-menlo changed the title Discussion: Cortex.cpp Hardware Detection, Selection, Prediction and Memory Management Discussion: Cortex.cpp Hardware Detection, Selection, and Memory Management Sep 5, 2024
@janhq janhq locked and limited conversation to collaborators Sep 5, 2024
@dan-menlo dan-menlo converted this issue into discussion #1111 Sep 5, 2024
@github-project-automation github-project-automation bot moved this from Need Investigation to Completed in Menlo Sep 5, 2024
@dan-menlo dan-menlo moved this from Completed to Discontinued in Menlo Sep 6, 2024
