1. Manual QA
Installation
it should install 2 binaries (cortex and cortex-server)
it should install with correct folder permissions
it should install with folders: /engines /logs (no /models folder until model pull)
Data/Folder structures
cortex.so models are stored in cortex.so/model_name/variants/, with .gguf and model.yml file
huggingface models are stored in huggingface.co/author/model_name with .gguf and model.yml file
downloaded models are saved in cortex.db (view via SQL)
[to add] tests for copying models data folder & relative paths
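The folder checks above could be automated along these lines. This is a minimal sketch, not part of the Cortex test suite: the function names are hypothetical, and the data folder location is assumed to be passed in by the caller because the checklist does not state it.

```python
from pathlib import Path


def check_fresh_install_layout(data_dir: Path) -> list[str]:
    """Return the problems found in a data folder right after installation."""
    problems = []
    # The checklist expects /engines and /logs to exist after install.
    for name in ("engines", "logs"):
        if not (data_dir / name).is_dir():
            problems.append(f"missing folder: {name}")
    # The checklist expects no /models folder until the first model pull.
    if (data_dir / "models").exists():
        problems.append("unexpected folder before first pull: models")
    return problems


def check_pulled_model(model_dir: Path) -> list[str]:
    """Return the problems found in a single pulled model folder."""
    problems = []
    # A pulled model should come with a .gguf file and a model.yml file.
    if not any(model_dir.glob("*.gguf")):
        problems.append("no .gguf file")
    if not (model_dir / "model.yml").is_file():
        problems.append("no model.yml")
    return problems
```

An empty list from either function means the folder matches what the checklist describes.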
Cortex Update
cortex -v should output the current version and check for updates
cortex update replaces the app, installer, uninstaller and binary file (without installing cortex.llamacpp)
cortex update should update from ~3-5 versions ago to latest (+3 to 5 bump)
cortex update should update from the previous version to latest (+1 bump)
cortex update should update from previous stable version to latest (stable checking)
it should gracefully update when server is actively running
Overall / App Shell
cortex returns helpful text in a timely* way
cortex or cortex -h displays help commands
CLI commands should start the API server, if not running [WIP cortex pull, cortex engines install]
it should correctly log to cortex-cli.log and cortex.log
There should be no stdout from an inactive shell session
Engines
llama.cpp should be installed by default
it should run gguf models on llamacpp
it should install engines
it should list engines (Compatible, Ready, Not yet installed)
it should get engines
it should uninstall engines
it should gracefully continue engine installation if interrupted halfway (partial download)
it should gracefully handle when users try to CRUD incompatible engines (No variant found for xxx)
it should run trtllm models on trt-llm [WIP, not tested]
it should handle engine variants [WIP, not tested]
it should update engines versions [WIP, not tested]
Server
cortex start should start server and output API documentation page
users can see API documentation page
cortex stop should stop server
it should correctly log to cortex logs
cortex ps should return server status and running models (or no model loaded)
Model Pulling
Pulling a model should pull .gguf and model.yml file
Model download progress should appear (with accurate %, total time, download size, speed)
cortex.so
it should pull by built-in model_ID
pull by model_ID should recommend default variant at the top (set in HF model.yml)
it should pull by built-in model_id:variant
huggingface.co
it should pull by HF repo/model ID
it should pull by full HF url (ending in .gguf)
Interrupted Download
it should allow user to interrupt / stop download
pulling again after interruption should accurately calculate the remainder of the model file size that needs to be downloaded (Found unfinished download! Additional XGB needs to be downloaded)
it should allow the user to continue downloading the remainder after interruption
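When verifying the resume behaviour by hand, the expected remainder can be worked out from the total file size and the size of the partial file on disk. The sketch below is an illustration only: the helper names are hypothetical, the two-decimal formatting is an assumption, and so is the choice of 1 GB = 1024^3 bytes (the CLI may round or define GB differently). The `Range` header is the standard HTTP way to request the remaining bytes; whether Cortex resumes this way is not stated in the checklist.

```python
GIB = 1024 ** 3  # assumption: "GB" in the CLI message means 1024^3 bytes


def remaining_bytes(total_size: int, partial_size: int) -> int:
    """Bytes still to download, given the full size and the partial file size."""
    if total_size < 0 or partial_size < 0:
        raise ValueError("sizes must be non-negative")
    # Never report a negative remainder if the partial file is already complete.
    return max(total_size - partial_size, 0)


def resume_message(total_size: int, partial_size: int) -> str:
    """Build the expected message in the format quoted in the checklist."""
    gb = remaining_bytes(total_size, partial_size) / GIB
    return f"Found unfinished download! Additional {gb:.2f}GB needs to be downloaded"


def range_header(partial_size: int) -> dict[str, str]:
    """Standard HTTP Range header asking for everything after the partial file."""
    return {"Range": f"bytes={partial_size}-"}
```

Comparing `resume_message(...)` with the text printed by the second pull gives a quick check that the reported remainder is accurate.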
Model Management
it should list downloaded models
it should get info of a local model
it should update models
it should delete a model
it should import models with model_id and model_path
[To deprecate] it should alias models (deprecate once cortex run with regex is implemented)
Model Running
cortex run <cortexso model> - if no local models detected, shows pull model menu
cortex run - if local model detected, runs the local model
cortex run - if multiple local models detected, shows list of local models for users to select
cortex run <invalid model id> should gracefully return "Model not found!"
run should autostart server
cortex run <model> starts interactive chat (by default)
cortex run <model> -d runs in detached mode
cortex models start <model>
terminate StdIn or exit() should exit interactive chat
Hardware Detection / Acceleration [WIP]
it should auto offload max ngl
it should correctly detect available GPUs
it should gracefully detect missing dependencies/drivers
CPU Extension (e.g. AVX-2, noAVX, AVX-512)
GPU Acceleration (e.g. CUDA11, CUDA12, Vulkan, sycl, etc)
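To cross-check which CPU extension a Linux test machine supports, the flags line of /proc/cpuinfo can be parsed independently of Cortex. This is a rough sketch: the function names and the returned labels ("avx512", "avx2", "avx", "noavx") are hypothetical and are not the engine's actual variant names, and the selection logic Cortex uses is not described in the checklist.

```python
def read_linux_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Extract the CPU feature flags from the text of /proc/cpuinfo (x86 Linux)."""
    for line in cpuinfo_text.splitlines():
        # x86 kernels expose features on a line of the form "flags : fpu vme ...".
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def pick_cpu_variant(flags: set[str]) -> str:
    """Pick the most capable extension level present (labels are illustrative)."""
    if "avx512f" in flags:  # avx512f is the AVX-512 foundation flag
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    if "avx" in flags:
        return "avx"
    return "noavx"
```

On a test machine, `pick_cpu_variant(read_linux_cpu_flags(open("/proc/cpuinfo").read()))` gives a value to compare against the variant the installer selected.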
Uninstallation / Reinstallation
it should uninstall 2 binaries (cortex and cortex-server)
it should uninstall with 2 options to delete or not delete data folder
it should gracefully uninstall when server is still running
uninstalling should not leave any dangling files
uninstalling should not leave any dangling processes
it should reinstall without having conflict issues with existing cortex data folders
--
2. API QA
Overall API
API page is updated at localhost:port endpoint (upon cortex start)
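A quick way to confirm that the API page responds after cortex start is a small HTTP probe. The sketch below assumes the caller supplies the full URL, since the checklist only says localhost:port. The function name is hypothetical, and the opener parameter exists only so the check can be exercised without a running server.

```python
import urllib.request


def api_page_is_up(url: str, opener=urllib.request.urlopen) -> bool:
    """Return True if a GET request to the given URL answers with HTTP 200."""
    try:
        with opener(url, timeout=5) as response:
            return response.status == 200
    except OSError:
        # urllib's URLError and HTTPError both derive from OSError, so this
        # covers refused connections, timeouts, and non-2xx responses.
        return False
```

Running it once before cortex start (expecting False) and once after (expecting True) covers both sides of the check.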
Nightly v192
Release candidate for v1.0.1
OSes
Endpoints
Chat Completions
v1/chat/completions
Engines
/v1/engines
/v1/engines/install/{name}
/v1/engines/install/{name}
/v1/engines/{name}
Models
/v1/models
/v1/models lists models
/v1/models/pull starts download (websockets)
websockets /events emitted when model pull starts
/v1/models/pull stops download (websockets)
websockets /events stopped when model pull stops
/v1/models/start starts model
/v1/models/stop stops model
/v1/models/{id} deletes model
/v1/models/{id} gets model
/v1/models/{model} updates model.yaml params
Test list for reference: