Requires a llama.cpp server running on port 8080: https://github.com/ggerganov/llama.cpp/tree/master/examples/server
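
As a rough sketch of how a client might talk to that server, the Python snippet below builds and sends a request to its `/completion` endpoint. Several things here are assumptions rather than facts from this note: that the server is reachable at `127.0.0.1`, that the `/completion` endpoint accepts `prompt` and `n_predict` fields and returns the generated text under `content` (as described in the server example's README at the time of writing), and the helper names `build_completion_request` and `complete`, which are hypothetical. Check the README for the version you are running before relying on any of it.

```python
import json
import urllib.request

# Assumption: the server runs on the local machine; the note only fixes the port (8080).
BASE_URL = "http://127.0.0.1:8080"


def build_completion_request(prompt, n_predict=64, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /completion endpoint.

    Assumption: the endpoint takes a JSON body with "prompt" and "n_predict".
    """
    payload = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/completion",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, n_predict=64, base_url=BASE_URL, timeout=60):
    """Send the prompt to the server and return the generated text.

    Raises urllib.error.URLError if nothing is listening on the port.
    Assumption: the JSON response carries the text under the "content" key.
    """
    request = build_completion_request(prompt, n_predict, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["content"]
```

With the server running, calling `complete("Hello, my name is")` should return the model's continuation as a string. If the server is not up, the call fails with a connection error, which is a quick way to confirm the prerequisite above is met.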