diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 000000000..607c3db7f
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,42 @@
+### Repo Structure
+```
+# Entity Definitions
+domain/                    # Core directory where the domains are defined.
+  abstracts/               # Abstract base classes for common attributes and methods.
+  models/                  # Domain interface definitions, e.g. model, assistant.
+  repositories/            # Repository abstracts and interfaces for extensions.
+
+# Business Rules
+usecases/                  # Application logic
+  assistants/              # CRUD logic (invokes dtos, entities).
+  chat/                    # Logic for chat functionalities.
+  models/                  # Logic for model operations.
+
+# Adapters & Implementations
+infrastructure/            # Implementations for Cortex interactions
+  commanders/              # CLI handlers
+    models/
+    questions/             # CLI installation UX
+    shortcuts/             # CLI chained syntax
+    types/
+    usecases/              # Invokes UseCases
+
+  controllers/             # Nest controllers and HTTP routes
+    assistants/            # Invokes UseCases
+    chat/                  # Invokes UseCases
+    models/                # Invokes UseCases
+
+  database/                # Database providers (mysql, sqlite)
+
+# Framework-specific object definitions
+  dtos/                    # DTO definitions (data transfer & validation)
+  entities/                # TypeORM entity definitions (db schema)
+
+# Providers
+  providers/cortex         # Cortex [server] provider (a core extension)
+  repositories/extensions  # Extension provider (core & external extensions)
+
+extensions/                # External extensions
+command.module.ts          # CLI Commands List
+main.ts                    # Entrypoint
+```
diff --git a/README.md b/README.md
index e6b453e98..f81acab38 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,31 @@
-# Cortex Monorepo
+# Cortex - CLI
+
+[cortex-cpplogo banner image]
+
-# Installation
+
+Documentation - API Reference - Changelog - Bug reports - Discord
+
+
+> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs!
+
+## About
+Cortex is an OpenAI-compatible local AI server that developers can use to build LLM apps. It is packaged with a Docker-inspired command-line interface and a TypeScript client library. It can be used as a standalone server or imported as a library.
+
+Cortex currently supports two inference engines:
+
+- Llama.cpp
+- TensorRT-LLM
+
+> Read more about Cortex at https://jan.ai/cortex
+
+## Quicklinks
+**Cortex**:
+- [Website](https://jan.ai/)
+- [GitHub](https://github.com/janhq/cortex)
+- [User Guides](https://jan.ai/cortex)
+- [API reference](https://jan.ai/api-reference)
 
 ## Prerequisites
 
@@ -12,10 +37,9 @@ Before installation, ensure that you have installed the following:
 
 - **NPM**: Needed to manage packages.
 - **CPU Instruction Sets**: Available for download from the [Cortex GitHub Releases](https://github.com/janhq/cortex/releases) page.
-
+> 💡 The **CPU instruction sets** are not required for the initial installation of Cortex. They will be installed automatically during Cortex initialization if they are not already on your system.
+
 ### **Hardware**
 
@@ -37,88 +61,42 @@ Ensure that your system meets the following requirements to run Cortex:
 
 - **Disk**: At least 10GB for app and model download.
 
-## Cortex Installation
-
-To install Cortex, follow the steps below:
-
-### Step 1: Install Cortex
-
-Run the following command to install Cortex globally on your machine:
-
-```bash
-# Install using NPM globally
+## Quickstart
+To install the Cortex CLI, follow the steps below:
+1. Install the Cortex NPM package globally:
+```bash
 npm i -g @janhq/cortex
 ```
-### Step 2: Verify the Installation
-
-After installation, you can verify that Cortex is installed correctly by getting help information.
-
-```bash
-# Get the help information
-cortex -h
-```
-
-### Step 3: Initialize Cortex
-
-Once verified, you need to initialize the Cortex engine.
-
-1. Initialize the Cortex engine:
-
-```
+2. Initialize a compatible engine:
+```bash
 cortex init
 ```
-1. Select between `CPU` and `GPU` modes.
-
-```bash
-? Select run mode (Use arrow keys)
-> CPU
-  GPU
-```
-
-2. Select between GPU types.
-
-```bash
-? Select GPU types (Use arrow keys)
-> Nvidia
-  Others (Vulkan)
-```
+3. Download a GGUF model from Hugging Face:
+```bash
+# Pull a model most compatible with your hardware
+cortex pull llama3
 
-3. Select CPU instructions (will be deprecated soon).
+# Pull a specific variant with `repo_name:branch`
+cortex pull llama3:7b
 
-```bash
-? Select CPU instructions (Use arrow keys)
-> AVX2
-  AVX
-  AVX-512
+# Pull a model with the HuggingFace `model_id`
+cortex pull microsoft/Phi-3-mini-4k-instruct-gguf
 ```
-
-1. Cortex will download the required CPU instruction sets if you choose `CPU` mode. If you choose `GPU` mode, Cortex will download the necessary dependencies to use your GPU.
-2. Once downloaded, Cortex is ready to use!
-
-### Step 4: Pull a model
-
-From HuggingFace
-
-```bash
-cortex pull janhq/phi-3-medium-128k-instruct-GGUF
+4. Load the model:
+```bash
+cortex models start llama3:7b
 ```
-
-From Jan Hub (TBD)
-
-```bash
-cortex pull llama3
+5. Start chatting with the model:
+```bash
+cortex chat tell me a joke
 ```
-
-### Step 5: Chat
-
-```bash
-cortex run janhq/phi-3-medium-128k-instruct-GGUF
-```
 
 ## Run as an API server
-
+To run Cortex as an API server:
 ```bash
 cortex serve
 ```
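+
+Once the server is running, you can call it over HTTP. The request below is a minimal sketch rather than official documentation: it assumes Cortex exposes an OpenAI-style `/v1/chat/completions` route (per the OpenAI compatibility described above), that `<port>` is the port `cortex serve` reports on startup, and that the model has already been started.
+
+```bash
+# Hypothetical request to a locally running `cortex serve`
+# Replace <port> with the port printed by `cortex serve`
+curl http://localhost:<port>/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "llama3:7b",
+    "messages": [{"role": "user", "content": "Hello!"}]
+  }'
+```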
@@ -135,18 +113,40 @@ To install Cortex from the source, follow the steps below:
 npx nest build
 ```
 
-1. Make the `command.js` executable:
+4. Make the `command.js` executable:
 
 ```bash
 chmod +x '[path-to]/cortex/cortex-js/dist/src/command.js'
 ```
 
-1. Link the package globally:
+5. Link the package globally:
 
 ```bash
 npm link
 ```
 
+## Cortex CLI Commands
+The following CLI commands are currently available:
+> ⚠️ **Cortex is currently in Development**: More commands will be added soon!
+
+```bash
+
+  serve              Provide an API endpoint for the Cortex backend
+  chat               Send a chat request to a model
+  init|setup         Initialize settings and download Cortex dependencies
+  ps                 Show running models and their status
+  kill               Kill running Cortex processes
+  pull|download      Download a model. Works with HuggingFace model IDs.
+  run [options]      EXPERIMENTAL: Shortcut to start a model and chat
+  models             Subcommands for managing models
+  models list        List all available models.
+  models pull        Download a specified model.
+  models remove      Delete a specified model.
+  models get         Retrieve the configuration of a specified model.
+  models start       Start a specified model.
+  models stop        Stop a specified model.
+  models update      Update the configuration of a specified model.
+```
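+
+For example, a typical session with the `models` subcommands might look like the sketch below. The model name is illustrative; substitute one you have actually pulled:
+
+```bash
+# List local models, inspect one, then stop it when finished
+cortex models list
+cortex models get llama3:7b
+cortex models stop llama3:7b
+```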
 ## Uninstall Cortex
 
 Run the following command to uninstall Cortex globally on your machine:
 
@@ -155,3 +155,7 @@ Run the following command to uninstall Cortex globally on your machine:
 # Uninstall globally using NPM
 npm uninstall -g @janhq/cortex
 ```
+## Contact Support
+- For support, please file a GitHub ticket.
+- For questions, join our Discord [here](https://discord.gg/FTk2MvZwJH).
+- For long-form inquiries, please email [hello@jan.ai](mailto:hello@jan.ai).
\ No newline at end of file
diff --git a/cortex-cpp/CONTRIBUTING.md b/cortex-cpp/CONTRIBUTING.md
new file mode 100644
index 000000000..c7ea6a1f1
--- /dev/null
+++ b/cortex-cpp/CONTRIBUTING.md
@@ -0,0 +1,14 @@
+### Repo Structure
+
+```
+.
+├── common            # Common libraries or shared resources
+├── controllers       # Controller scripts or modules for managing interactions
+├── cortex-common     # Shared components across different cortex modules
+├── cortex-cpp-deps   # Dependencies specific to the cortex-cpp module
+├── engines           # Different processing or computational engines
+├── examples          # Example scripts or applications demonstrating usage
+├── test              # Test scripts and testing frameworks
+└── utils             # Utility scripts and helper functions
+
+```
\ No newline at end of file
diff --git a/cortex-cpp/README.md b/cortex-cpp/README.md
index 009c0254f..61e9c2807 100644
--- a/cortex-cpp/README.md
+++ b/cortex-cpp/README.md
@@ -1,71 +1,63 @@
 # cortex-cpp - Embeddable AI
-  [nitrologo banner image]
+  [cortex-cpplogo banner image]

-  [header links (old): Documentation - API Reference - Changelog - Bug reports - Discord]
+  [header links (updated): Documentation - API Reference - Changelog - Bug reports - Discord]

 > ⚠️ **cortex-cpp is currently in Development**: Expect breaking changes and bugs!
 
-## Features
-- Fast Inference: Built on top of the cutting-edge inference library llama.cpp, modified to be production ready.
-- Lightweight: Only 3MB, ideal for resource-sensitive environments.
-- Easily Embeddable: Simple integration into existing applications, offering flexibility.
-- Quick Setup: Approximately 10-second initialization for swift deployment.
-- Enhanced Web Framework: Incorporates drogon cpp to boost web service efficiency.
-
 ## About cortex-cpp
-cortex-cpp is a high-efficiency C++ inference engine for edge computing, powering [Jan](https://jan.ai/). It is lightweight and embeddable, ideal for product integration.
-
-The binary of cortex-cpp after zipped is only ~3mb in size with none to minimal dependencies (if you use a GPU need CUDA for example) make it desirable for any edge/server deployment 👍.
-
-> Read more about Nitro at https://nitro.jan.ai/
+Cortex-cpp is a streamlined, stateless C++ server engineered to be fully compatible with OpenAI's API, particularly its stateless endpoints. It integrates the Drogon server framework to handle requests and includes features such as model orchestration and hardware telemetry that are essential in production environments.
 
-### Repo Structure
+Remarkably compact, the cortex-cpp binary is only around 3 MB when compressed and has minimal dependencies. This lightweight, efficient design makes cortex-cpp an excellent choice for deployments in both edge and server contexts.
 
-```
-.
-├── controllers
-├── docs
-├── llama.cpp -> Upstream llama C++
-├── cortex-cpp-deps -> Dependencies of the cortex-cpp project as a sub-project
-└── utils
-```
+> Utilizing GPU capabilities requires CUDA.
 
-## Quickstart
+## Prerequisites
+### **Hardware**
 
-**Step 1: Install cortex-cpp**
+Ensure that your system meets the following requirements to run Cortex:
 
-- For Linux and MacOS
+- **OS**:
+  - macOS 13.6 or higher.
+  - Windows 10 or higher.
+  - Ubuntu 18.04 and later.
+- **RAM (CPU Mode):**
+  - 8GB for running up to 3B models.
+  - 16GB for running up to 7B models.
+  - 32GB for running up to 13B models.
+- **VRAM (GPU Mode):**
 
-  ```bash
-  curl -sfL https://raw.githubusercontent.com/janhq/nitro/main/install.sh | sudo /bin/bash -
-  ```
+  - 6GB can load a 3B model (int4) with `ngl` at 120, running at near-full speed on CPU/GPU.
+  - 8GB can load a 7B model (int4) with `ngl` at 120, running at near-full speed on CPU/GPU.
+  - 12GB can load a 13B model (int4) with `ngl` at 120, running at near-full speed on CPU/GPU.
 
-- For Windows
+- **Disk**: At least 10GB for app and model download.
 
-  ```bash
-  powershell -Command "& { Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/janhq/nitro/main/install.bat' -OutFile 'install.bat'; .\install.bat; Remove-Item -Path 'install.bat' }"
-  ```
+## Quickstart
+To install cortex-cpp, follow the steps below:
+1. Download cortex-cpp here: https://github.com/janhq/cortex/releases
+2. Install cortex-cpp by running the downloaded file.
 
-**Step 2: Downloading a Model**
+3. Download a model:
 
 ```bash
 mkdir model && cd model
 wget -O llama-2-7b-model.gguf https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf?download=true
 ```
 
-**Step 3: Run cortex-cpp server**
+4. Run the cortex-cpp server:
 
 ```bash title="Run cortex-cpp server"
 cortex-cpp
 ```
 
-**Step 4: Load model**
+5. Load a model:
 
 ```bash title="Load model"
 curl http://localhost:3928/inferences/server/loadmodel \
   -H 'Content-Type: application/json' \
   -d '{
     "llama_model_path": "model/llama-2-7b-model.gguf"
   }'
 ```
 
-**Step 5: Making an Inference**
+6. Make an inference:
 
 ```bash title="cortex-cpp Inference"
 curl http://localhost:3928/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
 ```
 
-Table of parameters
+## Table of parameters
+Below is the list of model parameters you can set when loading a model in cortex-cpp:
 
 | Parameter        | Type    | Description                                                   |
 |------------------|---------|---------------------------------------------------------------|
 |`grammar_file`    | String  | You can constrain the sampling using GBNF grammars by providing path to a grammar file |
 |`model_type`      | String  | Model type we want to use: llm or embedding, default value is llm |
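+
+As an illustration, the request below sketches how some of these parameters combine in a single `loadmodel` call. It is not taken from the official docs: the parameter names come from the table above, while the values and the grammar file path are assumptions to adapt to your setup.
+
+```bash
+# Hypothetical loadmodel call exercising documented parameters
+curl http://localhost:3928/inferences/server/loadmodel \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "llama_model_path": "model/llama-2-7b-model.gguf",
+    "model_type": "llm",
+    "grammar_file": "./json.gbnf"
+  }'
+```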
-***OPTIONAL***: You can run Nitro on a different port like 5000 instead of 3928 by running it manually in terminal
-```zsh
-./cortex-cpp 1 127.0.0.1 5000 ([thread_num] [host] [port] [uploads_folder_path])
-```
-- thread_num : the number of thread that cortex-cpp webserver needs to have
-- host : host value normally 127.0.0.1 or 0.0.0.0
-- port : the port that cortex-cpp got deployed onto
-- uploads_folder_path: custom path for file uploads in Drogon.
-
-cortex-cpp server is compatible with the OpenAI format, so you can expect the same output as the OpenAI ChatGPT API.
-
-## Compile from source
-To compile cortex-cpp please visit [Compile from source](docs/docs/new/build-source.md)
-
 ## Download
 
 [Download table: a "Stable (Recommended)" row with per-platform installer links (Windows CPU/CUDA, macOS Intel and M1/M2, Linux CPU/CUDA) whose URLs are updated in this diff, and an "Experimental (Nightly Build)" row linking to GitHub Action artifacts]
-Download the latest version of Nitro at https://nitro.jan.ai/ or visit the **[GitHub Releases](https://github.com/janhq/cortex/releases)** to download any previous release.
-
-## Nightly Build
+> Download the latest or older versions of Cortex-cpp at the **[GitHub Releases](https://github.com/janhq/cortex/releases)**.
-Nightly build is a process where the software is built automatically every night. This helps in detecting and fixing bugs early in the development cycle. The process for this project is defined in [`.github/workflows/build.yml`](.github/workflows/build.yml)
-
-You can join our Discord server [here](https://discord.gg/FTk2MvZwJH) and go to channel [github-nitro](https://discordapp.com/channels/1107178041848909847/1151022176019939328) to monitor the build process.
-
-The nightly build is triggered at 2:00 AM UTC every day.
-
-The nightly build can be downloaded from the url notified in the Discord channel. Please access the url from the browser and download the build artifacts from there.
 
 ## Manual Build
+Manual builds are performed on demand by developers, usually to validate a new feature or a bug fix. The workflow for this project is defined in [`.github/workflows/build.yml`](.github/workflows/build.yml).
 
-Manual build is a process where the software is built manually by the developers. This is usually done when a new feature is implemented or a bug is fixed. The process for this project is defined in [`.github/workflows/build.yml`](.github/workflows/build.yml)
-
-It is similar to the nightly build process, except that it is triggered manually by the developers.
-
-### Contact
+## Contact Support
 
 - For support, please file a GitHub ticket.
 - For questions, join our Discord [here](https://discord.gg/FTk2MvZwJH).
 
@@ -214,4 +173,4 @@ It is similar to the nightly build process, except that it is triggered manually
 
 ## Star History
 
-[![Star History Chart](https://api.star-history.com/svg?repos=janhq/nitro&type=Date)](https://star-history.com/#janhq/nitro&Date)
+[![Star History Chart](https://api.star-history.com/svg?repos=janhq/cortex-cpp&type=Date)](https://star-history.com/#janhq/cortex-cpp&Date)
diff --git a/cortex-js/CONTRIBUTING.md b/cortex-js/CONTRIBUTING.md
new file mode 100644
index 000000000..53909c391
--- /dev/null
+++ b/cortex-js/CONTRIBUTING.md
@@ -0,0 +1,42 @@
+### Repo Structure
+```
+# Entity Definitions
+domain/                    # Core directory where the domains are defined.
+  abstracts/               # Abstract base classes for common attributes and methods.
+  models/                  # Domain interface definitions, e.g. model, assistant.
+  repositories/            # Repository abstracts and interfaces for extensions.
+
+# Business Rules
+usecases/                  # Application logic
+  assistants/              # CRUD logic (invokes dtos, entities).
+  chat/                    # Logic for chat functionalities.
+  models/                  # Logic for model operations.
+
+# Adapters & Implementations
+infrastructure/            # Implementations for Cortex interactions
+  commanders/              # CLI handlers
+    models/
+    questions/             # CLI installation UX
+    shortcuts/             # CLI chained syntax
+    types/
+    usecases/              # Invokes UseCases
+
+  controllers/             # Nest controllers and HTTP routes
+    assistants/            # Invokes UseCases
+    chat/                  # Invokes UseCases
+    models/                # Invokes UseCases
+
+  database/                # Database providers (mysql, sqlite)
+
+  # Framework-specific object definitions
+  dtos/                    # DTO definitions (data transfer & validation)
+  entities/                # TypeORM entity definitions (db schema)
+
+  # Providers
+  providers/cortex         # Cortex [server] provider (a core extension)
+  repositories/extensions  # Extension provider (core & external extensions)
+
+extensions/                # External extensions
+command.module.ts          # CLI Commands List
+main.ts                    # Entrypoint
+```
\ No newline at end of file
diff --git a/cortex-js/README.md b/cortex-js/README.md
index 4ec7ed411..806c8c873 100644
--- a/cortex-js/README.md
+++ b/cortex-js/README.md
@@ -1,4 +1,31 @@
-# Installation
+# Cortex - CLI
+
+[cortex-cpplogo banner image]
+
+
+Documentation - API Reference - Changelog - Bug reports - Discord
+
+
+> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs!
+
+## About
+Cortex is an OpenAI-compatible local AI server that developers can use to build LLM apps. It is packaged with a Docker-inspired command-line interface and a TypeScript client library. It can be used as a standalone server or imported as a library.
+
+Cortex currently supports two inference engines:
+
+- Llama.cpp
+- TensorRT-LLM
+
+> Read more about Cortex at https://jan.ai/cortex
+
+## Quicklinks
+Cortex
+- [Website](https://jan.ai/)
+- [GitHub](https://github.com/janhq/cortex)
+- [User Guides](https://jan.ai/cortex)
+- [API reference](https://jan.ai/api-reference)
 
 ## Prerequisites
 
@@ -10,10 +37,9 @@ Before installation, ensure that you have installed the following:
 
 - **NPM**: Needed to manage packages.
 - **CPU Instruction Sets**: Available for download from the [Cortex GitHub Releases](https://github.com/janhq/cortex/releases) page.
-
+> 💡 The **CPU instruction sets** are not required for the initial installation of Cortex. They will be installed automatically during Cortex initialization if they are not already on your system.
+
 ### **Hardware**
 
@@ -35,88 +61,42 @@ Ensure that your system meets the following requirements to run Cortex:
 
 - **Disk**: At least 10GB for app and model download.
 
-## Cortex Installation
-
-To install Cortex, follow the steps below:
-
-### Step 1: Install Cortex
-
-Run the following command to install Cortex globally on your machine:
-
-```bash
-# Install using NPM globally
+## Quickstart
+To install the Cortex CLI, follow the steps below:
+1. Install the Cortex NPM package globally:
+```bash
 npm i -g @janhq/cortex
 ```
-### Step 2: Verify the Installation
-
-After installation, you can verify that Cortex is installed correctly by getting help information.
-
-```bash
-# Get the help information
-cortex -h
-```
-
-### Step 3: Initialize Cortex
-
-Once verified, you need to initialize the Cortex engine.
-
-1. Initialize the Cortex engine:
-
-```
+2. Initialize a compatible engine:
+```bash
 cortex init
 ```
-1. Select between `CPU` and `GPU` modes.
-
-```bash
-? Select run mode (Use arrow keys)
-> CPU
-  GPU
-```
+3. Download a GGUF model from Hugging Face:
+```bash
+# Pull a model most compatible with your hardware
+cortex pull llama3
 
-2. Select between GPU types.
+# Pull a specific variant with `repo_name:branch`
+cortex pull llama3:7b
 
-```bash
-? Select GPU types (Use arrow keys)
-> Nvidia
-  Others (Vulkan)
+# Pull a model with the HuggingFace `model_id`
+cortex pull microsoft/Phi-3-mini-4k-instruct-gguf
 ```
-
-3. Select CPU instructions (will be deprecated soon).
-
-```bash
-? Select CPU instructions (Use arrow keys)
-> AVX2
-  AVX
-  AVX-512
+4. Load the model:
+```bash
+cortex models start llama3:7b
 ```
-1. Cortex will download the required CPU instruction sets if you choose `CPU` mode. If you choose `GPU` mode, Cortex will download the necessary dependencies to use your GPU.
-2. Once downloaded, Cortex is ready to use!
-
-### Step 4: Pull a model
-
-From HuggingFace
-
-```bash
-cortex pull janhq/phi-3-medium-128k-instruct-GGUF
+5. Start chatting with the model:
+```bash
+cortex chat tell me a joke
 ```
-From Jan Hub (TBD)
-
-```bash
-cortex pull llama3
-```
-
-### Step 5: Chat
-
-```bash
-cortex run janhq/phi-3-medium-128k-instruct-GGUF
-```
 
 ## Run as an API server
-
+To run Cortex as an API server:
 ```bash
 cortex serve
 ```
 
@@ -133,18 +113,40 @@ To install Cortex from the source, follow the steps below:
 npx nest build
 ```
-1. Make the `command.js` executable:
+4. Make the `command.js` executable:
 
 ```bash
 chmod +x '[path-to]/cortex/cortex-js/dist/src/command.js'
 ```
 
-1. Link the package globally:
+5. Link the package globally:
 
 ```bash
 npm link
 ```
 
+## Cortex CLI Commands
+The following CLI commands are currently available:
+> ⚠️ **Cortex is currently in Development**: More commands will be added soon!
+
+```bash
+
+  serve              Provide an API endpoint for the Cortex backend
+  chat               Send a chat request to a model
+  init|setup         Initialize settings and download Cortex dependencies
+  ps                 Show running models and their status
+  kill               Kill running Cortex processes
+  pull|download      Download a model. Works with HuggingFace model IDs.
+  run [options]      EXPERIMENTAL: Shortcut to start a model and chat
+  models             Subcommands for managing models
+  models list        List all available models.
+  models pull        Download a specified model.
+  models remove      Delete a specified model.
+  models get         Retrieve the configuration of a specified model.
+  models start       Start a specified model.
+  models stop        Stop a specified model.
+  models update      Update the configuration of a specified model.
+  engines            Subcommands for managing engines.
+  engines list       List all available engines.
+```
 ## Uninstall Cortex
 
 Run the following command to uninstall Cortex globally on your machine:
 
@@ -153,3 +157,7 @@ Run the following command to uninstall Cortex globally on your machine:
 # Uninstall globally using NPM
 npm uninstall -g @janhq/cortex
 ```
+## Contact Support
+- For support, please file a GitHub ticket.
+- For questions, join our Discord [here](https://discord.gg/FTk2MvZwJH).
+- For long-form inquiries, please email [hello@jan.ai](mailto:hello@jan.ai).
\ No newline at end of file