
epic: Cortex Model Repo supports Default Model download #1418

Closed
dan-menlo opened this issue Oct 4, 2024 · 7 comments · Fixed by #1430
@dan-menlo
Contributor

dan-menlo commented Oct 4, 2024

Goal

  • Cortex's Built-in Libraries have a "Cortex Model repo" format
  • Cortex Model Repos are a critical data structure to support cortex pull <model> and cortex run <model>

High-level Structure

  • Cortex model repos are Git repos
  • Git repo has branches, which hold different "versions" of the model (i.e. quantization, engine etc)
    • cortex run <model>:<branch> pulls a specific version
    • cortex run <model> pulls a default version
  • Git repo's "main branch" holds README and metadata, to allow user to navigate other branches
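
As a sketch of this structure, resolving a `model[:branch]` argument into a repo and branch might look like the following. The helper is hypothetical; it assumes model repos live under the cortexso Hugging Face org, as listed later in this thread.

```python
# Sketch: resolve "model[:branch]" into a (repo URL, branch) pair.
# The "cortexso" org comes from this thread; the helper itself is hypothetical.

def resolve_model(arg: str, default_branch: str = "main") -> tuple:
    """Split 'mistral:7b-gguf' into a repo URL and a branch name."""
    model, _, branch = arg.partition(":")
    repo = f"https://huggingface.co/cortexso/{model}"
    # No explicit branch means "pull the default version"
    return repo, branch or default_branch

print(resolve_model("mistral:7b-gguf"))
print(resolve_model("mistral"))
```

With this shape, `cortex run <model>:<branch>` maps to a specific branch, and `cortex run <model>` falls back to whatever the repo designates as the default.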

Decisions

Decision 1: Default Model Download

We need to decide which model version cortex pull <model> and cortex run <model> will pull.

Option 1: main branch

  • We have so far used the main branch to hold our recommended version
  • We merge the recommended version, e.g. 3b-gguf, into the main branch

However, I do not think this is correct long term:

  • Longer-term, I think default model selection will be algorithmic
    • e.g. Hardware detection -> pull most "optimized" model
  • main branch is non-descriptive as a branch name
    • e.g. main could hold 3b-gguf, but user is unaware
    • It is better to be upfront that we default to 3b-gguf
  • I think the main branch requires more work to manage longer-term
    • Switching the main branch takes time - i.e. merges, Git issues, etc.
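
To make the "algorithmic default" idea concrete, here is a minimal sketch of hardware-aware selection: pick the largest variant that fits available memory. The branch names and sizes below are illustrative placeholders, not real measurements.

```python
# Sketch of algorithmic default selection (hypothetical; sizes are placeholders).
# Pick the biggest quantized variant that fits available RAM.

BRANCHES = {           # branch -> approximate download size in GB (illustrative)
    "3b-gguf": 2.0,
    "7b-gguf": 4.5,
    "70b-gguf": 40.0,
}

def pick_default(available_ram_gb: float) -> str:
    """Return the largest variant that fits, or the smallest as a fallback."""
    fitting = [b for b, size in BRANCHES.items() if size <= available_ram_gb]
    if not fitting:
        return min(BRANCHES, key=BRANCHES.get)
    return max(fitting, key=BRANCHES.get)

print(pick_default(8.0))   # -> 7b-gguf
```

A static `default` field (Option 2 below) can later be superseded by a function like this without changing the repo format.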

It is also incorrect to compare our approach to Ollama's. Ollama uses a tag-based system similar to Docker's, where latest is a pointer to 3b. It is difficult to replicate this in a straightforward UX in Git (i.e. tags are not very visible from the main page).

Option 2: main branch has metadata.yaml

  • I would like to instead propose a metadata.yaml approach to defining Default Model downloads
  • This can be superseded in the future with an algorithmic approach to selecting Default Model
  • metadata.yaml is also used to generate the CLI UX for cortex pull or cortex run

How it works

  • Cortex Model Repo's main branch will hold a few files (see below)
  • The v1 of metadata.yaml will be very simple
# File system
metadata.yaml
README.md
# metadata.yaml
version: 1
name: mistral
default: 3b-gguf
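
A client-side sketch of how cortex might read this file to pick the default branch. A real implementation would use a proper YAML library; the v1 file is simple enough that key/value splitting suffices for illustration.

```python
# Sketch: read the default branch from metadata.yaml on the main branch.
# The parser is a deliberately minimal stand-in for a real YAML library.

METADATA = """\
version: 1
name: mistral
default: 3b-gguf
"""

def parse_metadata(text: str) -> dict:
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a key: value pair
            fields[key.strip()] = value.strip()
    return fields

meta = parse_metadata(METADATA)
print(meta["default"])   # branch that `cortex pull mistral` would resolve to
```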

In the future, metadata.yaml can become more expressive, allowing fine-grained control of the CLI UX, e.g. sections for 3b, 7b, or by engine.

Furthermore, we can use metadata.yaml as a data structure to hold information about the different Model versions.

  • Can expand to include MMLU scores
  • Can expand to include file sizes
  • This can all be automated in the future through CI/CD on the Git repo
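
A possible future shape of the file, extending the v1 example above. Everything beyond the `default` field is speculative, and the numeric values are placeholders to be filled in by CI/CD.

```yaml
# metadata.yaml (hypothetical v2 -- fields beyond "default" are illustrative,
# and all numeric values are placeholders)
version: 2
name: mistral
default: 3b-gguf
branches:
  3b-gguf:
    engine: llama-cpp
    size_gb: 0.0    # populated by CI
    mmlu: 0.0       # populated by CI
  7b-gguf:
    engine: llama-cpp
    size_gb: 0.0    # populated by CI
    mmlu: 0.0       # populated by CI
```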
@dan-menlo dan-menlo added this to Menlo Oct 4, 2024
@dan-menlo dan-menlo converted this from a draft issue Oct 4, 2024
@dan-menlo dan-menlo moved this from Investigating to Scheduled in Menlo Oct 4, 2024
@dan-menlo dan-menlo changed the title epic: cortex run vs cortex pull UX epic: Cortex Model Repo format Oct 4, 2024
@dan-menlo dan-menlo changed the title epic: Cortex Model Repo format epic: Cortex Model Repo supports Default Model download Oct 4, 2024
@namchuai
Collaborator

namchuai commented Oct 4, 2024

I agree with Option 2. The current approach can't hold a default model for other engines, and it causes duplication between the main branch and a specific branch (3b, for example).

I think we should go with Option 2 before releasing because it's more future-proof.

@gabrielle-ong
Contributor

gabrielle-ong commented Oct 4, 2024

I'll update the 35 cortexso repos with metadata.yml on the main branch.

@nguyenhoangthuan99 Can I get help on the recommended branches for these 35 models? (listed from https://huggingface.co/cortexso)

Local models

cortexso/llama3.2 (3b-gguf-q4-km)
cortexso/mistral-nemo (12b-gguf-q4-ks)
cortexso/llama3 (8b-gguf-q4-ks)
cortexso/llama3.1 (8b-gguf-q4-ks)
cortexso/nomic-embed-text-v1 (main) - rarely used - 3 downloads last month

Future updates to default (when CI runs to update branches)

cortexso/tinyllama (1b-gguf) (future 1b-gguf-q4-ks)
cortexso/mistral (7b-gguf) (future 7b-gguf-q4-ks)
cortexso/phi3 (mini-gguf) (future mini-gguf-q4-ks)
cortexso/gemma2 (2b-gguf) (future 2b-gguf-q4-ks)
cortexso/openhermes-2.5 (7b-gguf) (future 7b-gguf-q4-ks)
cortexso/mixtral (7x8b-gguf) (future 7x8b-gguf-q4-ks)
cortexso/yi-1.5 (34b-gguf) (future 34b-gguf-q4-ks)
cortexso/aya (12.9b-gguf) (future 12.9b-gguf-q4-ks)
cortexso/codestral (22b-gguf) (future 22b-gguf-q4-ks)
cortexso/command-r (35b-gguf) (future 35b-gguf-q4-ks)
cortexso/gemma (7b-gguf) (future 7b-gguf-q4-ks)
cortexso/qwen2 (7b-gguf) (future 7b-gguf-q4-ks)

Remote models (no need for metadata.yml, as we don't have GGUF files for these models)

cortexso/NVIDIA-NIM
cortexso/gpt-4o-mini
cortexso/open-router-auto
cortexso/groq-mixtral-8x7b-32768
cortexso/groq-gemma-7b-it
cortexso/groq-llama3-8b-8192
cortexso/groq-llama3-70b-8192
cortexso/claude-3-5-sonnet-20240620
cortexso/gpt-3.5-turbo
cortexso/gpt-4o
cortexso/martian-model-router
cortexso/cohere-command-r-plus
cortexso/cohere-command-r
cortexso/mistral-large-latest
cortexso/mistral-small-latest
cortexso/claude-3-haiku-20240307
cortexso/claude-3-sonnet-20240229
cortexso/claude-3-opus-20240229

@gabrielle-ong
Contributor

Question: is it metadata.yaml or metadata.yml?
We are currently using model.yml, so I think it should be .yml for consistency.

@namchuai
Collaborator

namchuai commented Oct 4, 2024

@gabrielle-ong, I agree with metadata.yml

If possible, please start with cortexso/llama3.2. Thank you!

@gabrielle-ong
Contributor

Thanks @namchuai and @nguyenhoangthuan99! Added to llama3.2 and working down the list.

@gabrielle-ong
Contributor

Created all the metadata.yml files in the list.
Categorized the list above and noted Alex on future changes to the default branch for some models -
let me know when the CI is run to add the new branches for those models.

@github-project-automation github-project-automation bot moved this from Scheduled to Review + QA in Menlo Oct 4, 2024
@gabrielle-ong
Contributor

Thanks @james and @nguyenhoangthuan99!
