
Add using Jina to deploy a local LLM in deploy_local_llm.mdx #1872

Merged · 4 commits · Aug 9, 2024
34 changes: 34 additions & 0 deletions docs/guides/deploy_local_llm.mdx
@@ -15,6 +15,40 @@ RAGFlow seamlessly integrates with Ollama and Xinference, without the need for further environment configurations.
This user guide does not intend to cover much of the installation or configuration details of Ollama or Xinference; its focus is on configurations inside RAGFlow. For the most current information, you may need to check out the official site of Ollama or Xinference.
:::

## Deploy a local model using Jina

[Jina](https://github.com/jina-ai/jina) lets you build AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production.
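
Under the hood, a Jina service is an Executor (your Python logic) wrapped in a Deployment (the serving layer). The following minimal sketch, which assumes a recent Jina 3.x with DocArray v2 and is not RAGFlow's own server, shows the general shape of such a service:

```python
# Minimal sketch of a Jina service, assuming jina>=3.16 with DocArray v2.
# Illustrative only; this is not RAGFlow's jina_server.py.
from docarray import DocList
from docarray.documents import TextDoc
from jina import Deployment, Executor, requests


class Echo(Executor):
    @requests(on='/echo')
    def echo(self, docs: DocList[TextDoc], **kwargs) -> DocList[TextDoc]:
        # Each request carries a batch of documents; mutate and return them.
        for doc in docs:
            doc.text = f'echo: {doc.text}'
        return docs


if __name__ == '__main__':
    # Serve over gRPC on port 12345; 'http' and 'websocket' are also supported.
    with Deployment(uses=Echo, port=12345, protocol='grpc') as dep:
        dep.block()
```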

To deploy a local model, e.g., **gpt2**, using Jina:

### 1. Check firewall settings

Ensure that your host machine's firewall allows inbound connections on port 12345.

```bash
sudo ufw allow 12345/tcp
```

### 2. Install the Jina package

```bash
pip install jina
```
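
To verify the installation, you can check that the package imports cleanly (a simple sanity check, run in the same environment):

```python
# Sanity check: confirm the jina package is importable and print its version.
import jina
print(jina.__version__)
```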

### 3. Deploy the local model

Step 1: Navigate to the **rag/svr** directory of the RAGFlow repository.

```bash
cd rag/svr
```

Step 2: Run the **jina_server.py** script with Python, passing in the model name or the local path to the model. Note that the script only supports loading models downloaded from Hugging Face.

```bash
python jina_server.py --model_name gpt2
```
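
Once the server is running, other processes can call it through a Jina Client. The sketch below is hypothetical: the `/stream` endpoint name and the `Prompt`/`Generation` schemas are assumptions for illustration, so match them against the actual definitions in **rag/svr/jina_server.py**:

```python
# Hypothetical client sketch. The '/stream' endpoint and the Prompt/Generation
# schemas are assumptions; align them with rag/svr/jina_server.py.
from docarray import BaseDoc
from jina import Client


class Prompt(BaseDoc):
    message: str


class Generation(BaseDoc):
    text: str


client = Client(host='grpc://localhost:12345')
# stream_doc yields partial results as the model generates tokens.
for doc in client.stream_doc(
    on='/stream',
    inputs=Prompt(message='What is RAGFlow?'),
    return_type=Generation,
):
    print(doc.text, end='', flush=True)
```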

## Deploy a local model using Ollama

[Ollama](https://github.com/ollama/ollama) enables you to run open-source large language models that you deployed locally. It bundles model weights, configurations, and data into a single package, defined by a Modelfile, and optimizes setup and configurations, including GPU usage.