Openai triton server #104
Conversation
…ference-server/tutorials into nnshah1-meetup-04-2024
…on (#98)
* Fix the parameter to tensor conversion in TRTLLM FastAPI implementation
* Fix format
```python
if "text_output" in response.outputs:
    try:
        return response.outputs["text_output"].to_string_array()[0]
    except:
```
**Check notice** (Code scanning / CodeQL): Except block handles 'BaseException'
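CodeQL flags the bare `except:` because it also catches `BaseException` subclasses such as `KeyboardInterrupt` and `SystemExit`. A minimal sketch of the narrower pattern, with `FakeTensor` and `extract_text` as hypothetical stand-ins for the response object in the diff:

```python
from typing import Optional


class FakeTensor:
    """Stand-in for a response output tensor (hypothetical, for illustration)."""

    def to_string_array(self):
        return ["hello"]


def extract_text(outputs: dict) -> Optional[str]:
    """Return the first text_output string, or None on the expected failures."""
    if "text_output" in outputs:
        try:
            return outputs["text_output"].to_string_array()[0]
        # Catch only the errors this lookup can plausibly raise;
        # KeyboardInterrupt/SystemExit still propagate.
        except (IndexError, AttributeError, TypeError):
            return None
    return None


print(extract_text({"text_output": FakeTensor()}))  # -> hello
print(extract_text({}))  # -> None
```

Listing the expected exception types keeps genuine failures (out-of-memory, interrupts) visible instead of silently swallowed.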
```python
import argparse
import time
import uuid
from typing import Optional, Union
```
**Check notice** (Code scanning / CodeQL): Unused import
```python
    Model,
    ObjectType,
)
from transformers import AutoTokenizer, PreTrainedTokenizer, PreTrainedTokenizerFast
```
**Check notice** (Code scanning / CodeQL): Unused import
```python
class FunctionParameters(BaseModel):
    pass
```
**Check warning** (Code scanning / CodeQL): Unnecessary pass
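The `pass` is redundant whenever the class has any other body, such as a docstring. A sketch of the warning-free form; the local `BaseModel` here is a stand-in so the snippet runs without pydantic installed:

```python
# Stand-in for pydantic.BaseModel so this sketch is self-contained.
class BaseModel:
    pass


# A docstring gives the class a body, so the redundant `pass`
# flagged by CodeQL can simply be dropped.
class FunctionParameters(BaseModel):
    """Free-form tool/function parameters; fields may be added later."""


print(FunctionParameters.__doc__)
```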
```python
# if "BaichuanTokenizer" in str(e):
#     # This is for the error "'BaichuanTokenizer' object has no
#     # attribute 'sp_model'".
#     tokenizer = BaichuanTokenizer.from_pretrained(
#         tokenizer_name,
#         *args,
#         trust_remote_code=trust_remote_code,
#         tokenizer_revision=tokenizer_revision,
#         **kwargs)
# else:
```
**Check notice** (Code scanning / CodeQL): Commented-out code
...Inference_Server_Python_API/examples/fastapi/fastapi-codegen/transformers_utils/tokenizer.py
Hello, I have a question about running Triton server from Python code. I have 4 GPUs to serve with, so I need to set the world size to 4, but I can't find any option or tutorial for this. If you can help, I'd be grateful.
I've bumped into the same issue. Have you found a solution?
Please see this new location for details on an OpenAI-compatible Frontend for Triton: https://github.com/triton-inference-server/server/tree/main/python/openai |
No description provided.