
Add using Jina to deploy a local LLM in deploy_local_llm.mdx #1872

Merged · 4 commits · Aug 9, 2024
34 changes: 34 additions & 0 deletions docs/guides/deploy_local_llm.mdx
@@ -15,6 +15,40 @@ RAGFlow seamlessly integrates with Ollama and Xinference, without the need for further environment configurations.
This user guide does not intend to cover much of the installation or configuration details of Ollama or Xinference; its focus is on configurations inside RAGFlow. For the most current information, you may need to check out the official site of Ollama or Xinference.
:::

## Deploy a local model using Jina

[Jina](https://github.com/jina-ai/jina) lets you build AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production.
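
Under the hood, a Jina service is an Executor (your Python logic) wrapped in a Deployment (the serving layer). The following minimal sketch, which assumes a recent Jina 3.x with DocArray v2 and is not RAGFlow's own server, shows the general shape of such a service:

```python
# Minimal sketch of a Jina service, assuming jina>=3.16 with DocArray v2.
# Illustrative only; this is not RAGFlow's jina_server.py.
from docarray import DocList
from docarray.documents import TextDoc
from jina import Deployment, Executor, requests


class Echo(Executor):
    @requests(on='/echo')
    def echo(self, docs: DocList[TextDoc], **kwargs) -> DocList[TextDoc]:
        # Each request carries a batch of documents; mutate and return them.
        for doc in docs:
            doc.text = f'echo: {doc.text}'
        return docs


if __name__ == '__main__':
    # Serve over gRPC on port 12345; 'http' and 'websocket' are also supported.
    with Deployment(uses=Echo, port=12345, protocol='grpc') as dep:
        dep.block()
```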

To deploy a local model, e.g., **gpt2**, using Jina:

### 1. Check firewall settings

Ensure that your host machine's firewall allows inbound connections on port 12345.

```bash
sudo ufw allow 12345/tcp
```

### 2. Install the Jina package

```bash
pip install jina
```
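
To verify the installation, you can check that the package imports cleanly (a simple sanity check, run in the same environment):

```python
# Sanity check: confirm the jina package is importable and print its version.
import jina
print(jina.__version__)
```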

### 3. Deploy the local model

Step 1: Navigate to the **rag/svr** directory of the RAGFlow repository.

```bash
cd rag/svr
```

Step 2: Run the **jina_server.py** script with Python, passing in the model name or the local path to the model. Note that the script only supports loading models downloaded from Hugging Face.

```bash
python jina_server.py --model_name gpt2
```
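
Once the server is running, other processes can call it through a Jina Client. The sketch below is hypothetical: the `/stream` endpoint name and the `Prompt`/`Generation` schemas are assumptions for illustration, so match them against the actual definitions in **rag/svr/jina_server.py**:

```python
# Hypothetical client sketch. The '/stream' endpoint and the Prompt/Generation
# schemas are assumptions; align them with rag/svr/jina_server.py.
from docarray import BaseDoc
from jina import Client


class Prompt(BaseDoc):
    message: str


class Generation(BaseDoc):
    text: str


client = Client(host='grpc://localhost:12345')
# stream_doc yields partial results as the model generates tokens.
for doc in client.stream_doc(
    on='/stream',
    inputs=Prompt(message='What is RAGFlow?'),
    return_type=Generation,
):
    print(doc.text, end='', flush=True)
```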

## Deploy a local model using Ollama

[Ollama](https://github.com/ollama/ollama) enables you to run open-source large language models that you deployed locally. It bundles model weights, configurations, and data into a single package, defined by a Modelfile, and optimizes setup and configurations, including GPU usage.