
Added the microservice of vLLM #78

Merged: 9 commits into opea-project:main on May 30, 2024

Conversation

tianyil1 (Contributor) commented May 22, 2024

Description

This PR mainly updates the microservice wrapper for vLLM to align with TGI, renames the Ray backend service, and adds vLLM and Ray introductions to the README.

Issues

n/a.

Type of change

List the type of change like below. Please delete options that are not relevant.

  • New feature (non-breaking change which adds new functionality)
    • Wrapped vLLM with the LangChain microservice to align with the TGI microservice (see the sketch after this list).
    • Updated the LLMs README with the vLLM and Ray introductions.
    • Fixed the vLLM README and Docker script to use the Hugging Face API token.
  • Breaking change (fix or feature that would break the existing design and interface)
    • Renamed the Ray backend from "rayllm" to "ray_serve" to avoid confusion: the implementation is based on Ray Serve, not RayLLM.
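
As a companion to the first sub-bullet above, here is a minimal sketch of how a running vLLM OpenAI-compatible endpoint can be wrapped with LangChain to match a TGI-style microservice. This is not necessarily the PR's exact code; the endpoint address, port, and model name are illustrative assumptions.

```python
# Minimal sketch (assumed values, not the PR's actual configuration):
# wrap a vLLM OpenAI-compatible endpoint with LangChain.
from langchain_community.llms import VLLMOpenAI

llm = VLLMOpenAI(
    openai_api_key="EMPTY",                      # vLLM does not validate the key
    openai_api_base="http://localhost:8008/v1",  # assumed vLLM serving address
    model_name="meta-llama/Llama-2-7b-chat-hf",  # assumed model
    max_tokens=128,
    temperature=0.1,
)

print(llm.invoke("What is a microservice?"))
```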

Dependencies

No newly introduced third-party dependencies.

Tests

This PR was tested on a Gaudi2 server with:

  • 2 sockets of Intel(R) Xeon(R) Platinum 8368 CPU @ 2.40GHz
  • 8 Gaudi nodes, HL-SMI Version: hl-1.14.0-fw-48.0.1.0, Driver Version: 1.14.0-9e8ecf8

The following were verified in this environment:

  • vLLM backend serving (screenshot in the original PR)
  • LangChain vLLM serving (screenshot in the original PR)
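
The screenshots are not reproduced here. As an illustration of what such a check might look like, the sketch below sends a completion request to the vLLM backend's OpenAI-compatible endpoint; the host, port, and model name are assumptions and should be adjusted to the actual deployment.

```python
# Hedged smoke-test sketch for the vLLM backend serving check (assumed endpoint).
import requests

resp = requests.post(
    "http://localhost:8008/v1/completions",        # assumed vLLM serving address
    json={
        "model": "meta-llama/Llama-2-7b-chat-hf",  # assumed model
        "prompt": "What is Deep Learning?",
        "max_tokens": 32,
        "temperature": 0.0,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```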

tianyil1 (Contributor, Author) commented May 22, 2024

Please help review; your feedback would be much appreciated. Thanks. @Jian-Zhang @xuechendi

tianyil1 (Contributor, Author) commented:

This PR has addressed the above comments and is ready to merge. Please help review. Thanks. @lvliang-intel @hshen14 @Jian-Zhang

@tianyil1 tianyil1 force-pushed the vllm branch 3 times, most recently from 055ae40 to 23d3b63 on May 24, 2024 at 05:50
@hshen14 hshen14 requested review from lvliang-intel and ftian1 May 24, 2024 06:33
ftian1 (Collaborator) left a comment

From the PR title, only the LLM microservice based on vLLM was added, but the code in fact also includes a Ray Serve version, right? It would also be better to move 'text-generation/ray_serve/docker' up one level, as we did for vLLM and the other examples.

Another question: why is there a requirements.txt in the 'text-generation/ray_serve/docker' folder? Would moving it into the Dockerfile be feasible?

tianyil1 (Contributor, Author) replied:

@ftian1 Thanks for your comments. I only renamed the Ray Serve service rather than changing its version; this PR does not implement the Ray Serve microservice itself, which will come in a follow-up PR. The code in the ray_serve "docker" folder is a self-built Docker image used only to launch the Ray Serve LLM engine, unlike vLLM/TGI, whose images can be built from their official GitHub repositories.

The requirements.txt under ray_serve is needed to build the Ray Serve engine image, not the LangChain microservice, so I kept it under the docker folder as before.

@tianyil1 tianyil1 force-pushed the vllm branch 3 times, most recently from 21b51da to 122560a on May 29, 2024 at 08:29
tianyil1 (Contributor, Author) commented:

This PR is ready to merge. Could you please help check it? Thanks. @ftian1 @lvliang-intel

ftian1 (Collaborator) left a comment

looks good to me

@tianyil1 tianyil1 force-pushed the vllm branch 2 times, most recently from af2a15b to 1782801 on May 30, 2024 at 01:37
@lvliang-intel lvliang-intel merged commit f0b0690 into opea-project:main May 30, 2024
6 checks passed
@poussa poussa mentioned this pull request May 30, 2024
ganesanintel pushed a commit to ganesanintel/GenAIComps that referenced this pull request Jun 3, 2024
* refine the vllm microservice

Signed-off-by: tianyil1 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rename the rayllm to ray_serve

Signed-off-by: tianyil1 <[email protected]>

* refactor the ray service code structure

Signed-off-by: tianyil1 <[email protected]>

* refine the vllm and readme

Signed-off-by: tianyil1 <[email protected]>

* update the readme with correct ray service name

Signed-off-by: tianyil1 <[email protected]>

* update the readme

Signed-off-by: tianyil1 <[email protected]>

* refine the readme

Signed-off-by: tianyil1 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: tianyil1 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Signed-off-by: V, Ganesan <[email protected]>