compatible with openai/tgi/vllm request format #275

Closed · wants to merge 7 commits

Conversation

lkk12014402 (Collaborator) commented Jul 4, 2024

Description

  1. Make the completion request format/parameters compatible with OpenAI/TGI/vLLM.
  2. Prioritize default values as OpenAI > TGI > vLLM. With the same defaults, the OPEA service latency is almost the same as a native TGI/vLLM server.
  3. Replace `from langchain_community.llms import VLLMOpenAI` with `from openai import OpenAI`, which reduces initialization time (0.15s -> 0.01s) and initializes the client only once (see the sketch after this list).
  4. Simplify post-processing of OpenAI-format responses (in particular, remove the extra `encode` step).
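
A minimal sketch of items 3 and 4, assuming the TGI/vLLM backend is reachable via a hypothetical `LLM_ENDPOINT` environment variable; the endpoint and model names are placeholders, not the exact values used by the OPEA service:

```python
import os

from openai import OpenAI  # replaces langchain_community.llms.VLLMOpenAI

# Hypothetical env-var name for illustration; the real one lives in the
# OPEA service code. Creating the client once at module import time avoids
# paying the ~0.15s initialization cost on every request.
llm_endpoint = os.getenv("LLM_ENDPOINT", "http://localhost:8080")
client = OpenAI(api_key="EMPTY", base_url=f"{llm_endpoint}/v1")


def stream_completion(prompt: str, **params):
    """Yield text chunks from an OpenAI-format streaming response.

    The chunks arrive as plain strings, so no extra .encode() step is
    needed before forwarding them (item 4 above).
    """
    response = client.completions.create(
        model=params.pop("model", "tgi"),  # placeholder model name
        prompt=prompt,
        stream=True,
        **params,
    )
    for chunk in response:
        text = chunk.choices[0].text
        if text:
            yield text
```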

codecov bot commented Jul 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

| Files with missing lines | Coverage Δ |
| --- | --- |
| comps/cores/proto/docarray.py | 100.00% <100.00%> (ø) |

lkk12014402 (Collaborator, Author) commented Jul 4, 2024

OPEA/OpenAI/vLLM/TGI completion request format comparison

| OpenAI (API Reference) | vLLM (vllm/entrypoints/openai/protocol.py) | TGI (GenerateParameters) | OPEA (now) | OPEA (this PR) |
| --- | --- | --- | --- | --- |
| model: Union[str, …] | model: str | | | model: Optional[str] = None  # for openai, not used by tgi |
| prompt: Union[str, …] | prompt: Union[List[int], List[List[int]], str, List[str]] | | | query: str  # alias 'prompt' |
| best_of: Optional[int] (defaults to 1) | best_of: Optional[int] = None | default = "null" | | best_of: Optional[int] = 1 |
| echo: Optional[bool] (defaults to false) | echo: Optional[bool] = False | | | echo: Optional[bool] = False |
| frequency_penalty: Optional[float] (defaults to 0) | frequency_penalty: Optional[float] = 0.0 | default = "null" | frequency_penalty: Optional[float] = 0.0 | frequency_penalty: Optional[float] = 0.0 |
| logit_bias: Optional[Dict[str, int]] (defaults to null) | logit_bias: Optional[Dict[str, float]] = None | | | logit_bias: Optional[Dict[str, float]] = None |
| logprobs: Optional[int] (defaults to null) | logprobs: Optional[int] = None | | | logprobs: Optional[int] = None |
| max_tokens: Optional[int] (defaults to 16) | max_tokens: Optional[int] = 16 | default = "100" | max_tokens: Optional[int] = 1024 | max_new_tokens: Optional[int] = 16  # alias 'max_tokens' |
| n: Optional[int] (defaults to 1) | n: int = 1 | | | n: Optional[int] = 1 |
| presence_penalty: Optional[float] (defaults to 0) | presence_penalty: Optional[float] = 0.0 | | | presence_penalty: Optional[float] = 0.0 |
| seed: Optional[int] (defaults to null) | seed: Optional[int] = Field(None, …) | default = "null" | | seed: Optional[int] = None |
| stop: Union[Optional[str], …] (defaults to null) | stop: Optional[Union[str, List[str]]] = Field(default_factory=list) | stop: Vec, default = [] | | stop: Union[Optional[str], List[str], None] = None |
| stream: Optional[Literal[False]] (defaults to false) | stream: Optional[bool] = False | | stream: Optional[bool] = False | streaming: Optional[bool] = False  # alias 'stream' |
| stream_options: Optional[…] (defaults to null) | stream_options: Optional[StreamOptions] = None | | | stream_options: Optional[StreamOptions] = None |
| suffix: Optional[str] (defaults to null) | suffix: Optional[str] = None | | | suffix: Optional[str] = None |
| temperature: Optional[float] (defaults to 1) | temperature: Optional[float] = 1.0 | default = "null" | temperature: Optional[float] = 0.01 | temperature: Optional[float] = 1.0 |
| top_p: Optional[float] (defaults to 1) | top_p: Optional[float] = 1.0 | default = "null" | top_p: Optional[float] = 0.95 | top_p: Optional[float] = 1.0 |
| user: str (defaults to null) | user: Optional[str] = None | | | user: Optional[str] = None |
| | more parameters for vLLM… | | | |
| | | top_k, default = "null" | top_k: Optional[int] = 10 | top_k: Optional[int] = None |
| | | typical_p, default = "null" | typical_p: float = 0.95 | typical_p: Optional[float] = None |
| | | repetition_penalty, default = "null" | repetition_penalty: Optional[float] = 1.03 | repetition_penalty: Optional[float] = None |
| | | do_sample: bool, default = "false" | | |
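
For illustration only, here is one way the "OPEA (this PR)" column could be expressed with pydantic field aliases, so a request may use either the OPEA name (`query`, `max_new_tokens`, `streaming`) or the OpenAI name (`prompt`, `max_tokens`, `stream`); the class below is a hand-written sketch in pydantic v2 syntax, not the actual definition in comps/cores/proto/docarray.py:

```python
from typing import List, Optional, Union

from pydantic import BaseModel, Field


class CompletionRequest(BaseModel):
    # Illustrative subset of the table above; the real model lives in
    # comps/cores/proto/docarray.py. Aliases let clients send either the
    # OPEA field name or its OpenAI equivalent.
    model: Optional[str] = None  # used by OpenAI/vLLM, ignored by TGI
    query: str = Field(alias="prompt")
    max_new_tokens: Optional[int] = Field(default=16, alias="max_tokens")
    streaming: Optional[bool] = Field(default=False, alias="stream")
    temperature: Optional[float] = 1.0
    top_p: Optional[float] = 1.0
    stop: Optional[Union[str, List[str]]] = None

    # pydantic v2: accept both the field name and its alias on input.
    model_config = {"populate_by_name": True}


# Either spelling produces the same request object:
req = CompletionRequest(prompt="Hello", max_tokens=32, stream=True)
assert req.query == "Hello" and req.max_new_tokens == 32 and req.streaming
```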

lkk12014402 (Collaborator, Author) commented

Implement it in another PR.

lkk12014402 closed this on Sep 5, 2024
lkk12014402 pushed a commit that referenced this pull request Sep 19, 2024
* add comment

Signed-off-by: Sun, Xuehao <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Sun, Xuehao <[email protected]>

* remove test

Signed-off-by: Sun, Xuehao <[email protected]>

* Update message

Signed-off-by: Sun, Xuehao <[email protected]>

* update message

Signed-off-by: Sun, Xuehao <[email protected]>

* Add dependency review

Signed-off-by: Sun, Xuehao <[email protected]>

---------

Signed-off-by: Sun, Xuehao <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>