This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →


Discussion: Cortex.cpp Model Loading and Inference Errors #1091

Closed
freelerobot opened this issue Sep 4, 2024 · 4 comments

freelerobot commented Sep 4, 2024

At the moment we fail silently with a generic "model failed to load" message, and users have to send us their logs.

Can we enumerate the potential reasons a model fails to load, and discuss how to handle each one?

Goal:

  1. Graceful failures
  2. Predefined errors
  3. Though there are endless possible errors, let's apply the Pareto principle: 80% of our bugs come from 20% of the common model-loading failure modes

Examples

  1. Model won't fit in RAM/VRAM
  2. Another model is running... other edge cases & race conditions
  3. Wrong model format (i.e. unsupported runtime)
  4. Version conflicts (e.g. in the trt-llm engine scenario)
  5. Missing model.yaml, template, key input/configs
  6. Corrupted or missing model binaries
  7. Incompatible hardware. See

Questions:

  1. What are the other common issues?
  2. We support various engines, but should we standardize failure modes? Doing so would let us offer better DX/UX down the road.
  3. What are the various ways that llamacpp, trtllm, directml currently handle errors? Do they have a predefined, neat list we can adopt?

Related issues:

@freelerobot freelerobot added this to Menlo Sep 4, 2024
@freelerobot freelerobot converted this from a draft issue Sep 4, 2024
@freelerobot freelerobot changed the title Discussion: Cortex.cpp Model Loading Common Errors Discussion: Cortex.cpp Model Failed to Load Graceful Failure Sep 4, 2024
@dan-menlo
Contributor

@0xSage I recommend we expand this to

  • Error handling for Model Loading
  • Error handling for Model Running

@freelerobot freelerobot changed the title Discussion: Cortex.cpp Model Failed to Load Graceful Failure Discussion: Cortex.cpp Model Loading Graceful Failures Sep 4, 2024
@freelerobot freelerobot changed the title Discussion: Cortex.cpp Model Loading Graceful Failures Discussion: Cortex.cpp Model Orchestration Errors Sep 4, 2024

freelerobot commented Sep 4, 2024

Example:

Model Loading

| Error Code | Error Message | Failover (if any) |
| --- | --- | --- |
| InsufficientMemory | "The model is too big for your (V)RAM" | - |

Model Running

| Error Code | Error Message | Failover (if any) |
| --- | --- | --- |
| ContextExceeded | "Your input exceeded the model's context window" | - |

@dan-menlo dan-menlo changed the title Discussion: Cortex.cpp Model Orchestration Errors Discussion: Cortex.cpp Model Loading and Inference Errors Sep 5, 2024
@dan-menlo
Contributor

@0xSage @vansangpfiev I am renaming this discussion to "Model Loading and Inference Errors"

@dan-menlo
Contributor

This bug report could have been more informative with better logs from cortex.cpp:
janhq/jan#3552

@janhq janhq locked and limited conversation to collaborators Sep 5, 2024
@dan-menlo dan-menlo converted this issue into discussion #1110 Sep 5, 2024
@github-project-automation github-project-automation bot moved this from Need Investigation to Completed in Menlo Sep 5, 2024
@dan-menlo dan-menlo moved this from Completed to Discontinued in Menlo Sep 6, 2024

