Simple, elegant LLM Chat Inference
-
Open up your terminal (macOS/Linux) or PowerShell terminal (Windows) and navigate to the project directory
-
First of all, to be safe, let's create a virtual environment!
Create on Windows
python -m venv .venv
Create on Linux / macOS
python3 -m venv .venv
-
Now we need to activate it so we can use Le Potato without library errors!
Activate on Windows
.venv\Scripts\activate.ps1
Activate on Linux / macOS
source .venv/bin/activate
-
Now that we are inside our virtual environment, you will have to install all the Python libraries needed to run the web app:
Install on Windows
pip install .
Install on Linux / macOS
pip3 install .
-
While it is installing, you should check and modify the configuration file.
There you will find samplers and other settings used to host the model:
database/configuration.yaml
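If you want to see what is currently in the configuration before editing it, here is a minimal sketch that simply loads and prints the file with PyYAML. The use of PyYAML and the idea of printing every key are assumptions for illustration; the real settings are whatever database/configuration.yaml defines.

# Minimal sketch: load and print database/configuration.yaml.
# Assumes PyYAML is installed and the top level of the file is a mapping;
# no specific keys are assumed here.
import yaml

with open("database/configuration.yaml", "r", encoding="utf-8") as f:
    config = yaml.safe_load(f)

for key, value in config.items():
    print(f"{key}: {value}")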
-
Once you are done and the libraries have finished installing, you can launch the web server!
Run server on Windows
python main.py
Run server on Linux / macOS
python3 main.py
-
Once it is all set, you can open the website URL; by default it is
http://127.0.0.1:1234/
It will look like this:
Here is a demo of a normal chat
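If you are not sure whether the server came up correctly, you can check it from a second terminal before opening a browser. This is only a convenience sketch using the Python standard library, with the default address mentioned above:

# Quick reachability check for the default address used in this guide.
# Change the URL if you picked another host/port in the configuration.
from urllib.request import urlopen

with urlopen("http://127.0.0.1:1234/") as response:
    print("Server is up, HTTP status:", response.status)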
- You can change the background if you don't like it by going to:
src\frontend\static\images
Drop the background you want there and rename it to toji.jpg to keep things simple.
How to use the RAG from the UI?
-
You need to add
-rag
followed by your RAG query, which will be used for retrieval (it will not be shown to the model).
Then add a new line and write your query for the model.
Example:
-rag Who is the author?
Do you know the author?
The LLM will only see the second line; the RAG query simply adds retrieval results to your prompt!
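To make the convention above concrete, here is a small sketch of how such a message can be split into the retrieval part and the model part. It is only an illustration of the format described here, not the app's actual parsing code:

# Illustration of the -rag convention: the first line starting with "-rag"
# holds the retrieval query, everything after it is what the LLM sees.
# This is a sketch of the message format, not the project's real parser.
def split_rag_message(message: str):
    lines = message.splitlines()
    if lines and lines[0].startswith("-rag"):
        rag_query = lines[0][len("-rag"):].strip()    # used for retrieval only
        model_query = "\n".join(lines[1:]).strip()    # shown to the model
        return rag_query, model_query
    return None, message

rag_query, model_query = split_rag_message("-rag Who is the author?\nDo you know the author?")
print("Retrieval query:", rag_query)  # Who is the author?
print("Model query:", model_query)    # Do you know the author?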