
Add cpu-remote-inference Docker image #5225

Merged: vblagoje merged 17 commits into main from cpu_remote_inference on Jul 7, 2023

Conversation

vblagoje
Member

@vblagoje vblagoje commented Jun 29, 2023

What?

Introduces a new Docker image, haystack:cpu-remote-inference-<version>, alongside our existing images haystack:gpu-<version> and haystack:cpu-<version>. This image is a slimmed-down version of the cpu/gpu images, designed specifically for PromptNode inferencing with remotely hosted models such as Hugging Face Inference, OpenAI, Cohere, Anthropic, etc.

Why?

The motivation behind this new Docker image is to provide a more efficient and lightweight option for users who primarily perform remote Large Language Model (LLM) inference. The existing Haystack Docker images are comprehensive and can be larger than necessary for that use case; the new image offers a more streamlined alternative for Haystack users.

How can it be used?

The new haystack:cpu-remote-inference-<version> image can be used the same way as the existing gpu and cpu Docker images in Haystack. Users can pull this image and run their applications, benefiting from the reduced size and specific tuning for remote LLM inferencing.
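For a published release, pulling and running the image could look like the following sketch. Note this is illustrative only: the 1.18.0 version tag is a placeholder (not a tag confirmed by this PR), and the environment variables mirror the local test run described in the steps below.

```shell
# Pull the slim remote-inference image (version tag is a placeholder)
docker pull deepset/haystack:cpu-remote-inference-1.18.0

# Run it, pointing PIPELINE_YAML_PATH at a pipeline that uses remote models;
# replace <SERPERDEV_API_KEY> and <OPENAI_API_KEY> with your own keys
docker run -d \
  -e PIPELINE_YAML_PATH=/opt/venv/lib/python3.10/site-packages/rest_api/pipeline/pipelines_web_lfqa.haystack-pipeline.yaml \
  -e RETRIEVER_PARAMS_API_KEY=<SERPERDEV_API_KEY> \
  -e PROMPTNODE_PARAMS_API_KEY=<OPENAI_API_KEY> \
  -p 8080:8000 \
  deepset/haystack:cpu-remote-inference-1.18.0
```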

Follow these steps to test the new image:

  1. Check out the cpu_remote_inference branch on your Linux machine and go to the docker directory of the project.
  2. Bake the base image; in your terminal window, type:
HAYSTACK_VERSION=cpu_remote_inference docker buildx bake --no-cache base-cpu-remote-inference --set "*.platform=linux/amd64"
  3. Bake the image; in your terminal window, type:
docker buildx bake cpu-remote-inference --set "*.platform=linux/amd64"
  4. Run the image using the following command. Replace <SERPERDEV_API_KEY> and <OPENAI_API_KEY> with your respective API keys:
docker run -d -e PIPELINE_YAML_PATH=/opt/venv/lib/python3.10/site-packages/rest_api/pipeline/pipelines_web_lfqa.haystack-pipeline.yaml -e RETRIEVER_PARAMS_API_KEY=<SERPERDEV_API_KEY> -e PROMPTNODE_PARAMS_API_KEY=<OPENAI_API_KEY> -p 8080:8000 deepset/haystack:cpu-remote-inference-local
  5. Hit the server; type:
curl -s -X POST -H "Content-Type: application/json" -d "{\"query\": \"Where in Europe, should I live?\"}" http://localhost:8080/query | jq ".results"
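The curl call in the last step can also be issued from Python. The sketch below uses only the standard library and assumes the container from step 4 is listening on localhost:8080; the helper names are hypothetical, while the /query endpoint path and the payload shape come from the curl command above.

```python
import json
from urllib import request

def build_query(query: str, host: str = "http://localhost:8080"):
    """Build the URL and JSON body for the /query endpoint,
    mirroring the curl call above (hypothetical helper)."""
    url = f"{host}/query"
    body = json.dumps({"query": query}).encode("utf-8")
    return url, body

def run_query(query: str):
    """POST the query and return the 'results' field of the JSON response.
    Requires the container from step 4 to be up and running."""
    url, body = build_query(query)
    req = request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["results"]

# Example (needs the running server):
# print(run_query("Where in Europe should I live?"))
```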

How did you test it?

The new Docker image was tested by baking and running both the amd64 and arm64 versions.

Notes for the reviewer

Please build the images yourself and run the steps outlined in the "How can it be used?" section.

Thank you for reviewing, your feedback is greatly appreciated.

@vblagoje vblagoje requested a review from a team as a code owner June 29, 2023 07:59
@vblagoje vblagoje requested review from bogdankostic and removed request for a team June 29, 2023 07:59
@coveralls
Collaborator

coveralls commented Jul 2, 2023

Pull Request Test Coverage Report for Build 5484129697

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 23 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.01%) to 44.67%

  • Files with coverage reduction:
    nodes/retriever/web.py: 23 new missed lines (71.85% covered)
  • Totals:
    Change from base Build 5476747209: +0.01%
    Covered Lines: 10284
    Relevant Lines: 23022

💛 - Coveralls

@vblagoje vblagoje force-pushed the cpu_remote_inference branch from d8f0c60 to 08680de Compare July 2, 2023 17:18
@vblagoje vblagoje force-pushed the cpu_remote_inference branch 2 times, most recently from a7c33cd to 67574b9 Compare July 3, 2023 09:00
docker/docker-bake.hcl (review comment, outdated and resolved)
docker/docker-bake.hcl (review comment, outdated and resolved)
@vblagoje
Member Author

vblagoje commented Jul 5, 2023

@silvanocerza any idea what GITHUB_REF is used for in docker/docker-bake.hcl? It seems unused...

@silvanocerza
Contributor

> @silvanocerza any idea what GITHUB_REF is used for in docker/docker-bake.hcl? It seems unused...

Must be a leftover from old changes.

@vblagoje vblagoje force-pushed the cpu_remote_inference branch 2 times, most recently from e0b760a to bdafe95 Compare July 6, 2023 15:25
@vblagoje vblagoje force-pushed the cpu_remote_inference branch from 28461ea to e270064 Compare July 6, 2023 15:54
@vblagoje vblagoje force-pushed the cpu_remote_inference branch from e270064 to 5464a19 Compare July 6, 2023 16:29
@vblagoje
Copy link
Member Author

vblagoje commented Jul 6, 2023

Finally green, @silvanocerza @bogdankostic: all the images get built, and the tests pass! Let's review everything once again tomorrow.

@vblagoje vblagoje merged commit 395854d into main Jul 7, 2023
@vblagoje vblagoje deleted the cpu_remote_inference branch July 7, 2023 08:23
Development

Successfully merging this pull request may close these issues.

Shrink CPU Docker image
3 participants