Created streaming feature for AioHttp_OpenAI #7998
Conversation
looks good in general - can you add testing and make sure your tests pass
add the test here: test_aiohttp_openai.py
Add the following test and make sure the scenario passes - the function streaming_format_tests is defined in test_streaming.py
import pytest

from litellm import completion
# streaming_format_tests is defined in test_streaming.py


def test_completion_model_stream(model):
    try:
        messages = [
            {"role": "system", "content": "You are a helpful assistant."},
            {
                "role": "user",
                "content": "how does a court case get to the Supreme Court?",
            },
        ]
        response = completion(
            model=model, messages=messages, stream=True, max_tokens=50
        )
        complete_response = ""
        # Add any assertions here to check the response
        for idx, chunk in enumerate(response):
            chunk, finished = streaming_format_tests(idx, chunk)
            if finished:
                break
            complete_response += chunk
        if complete_response.strip() == "":
            raise Exception("Empty response received")
        print(f"completion_response: {complete_response}")
    except Exception as e:
        pytest.fail(f"Error occurred: {e}")
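For context, streaming_format_tests returns each chunk's text plus a finished flag, as used in the loop above. The sketch below is only a hypothetical stand-in for it (the name example_streaming_format_check and the exact checks are assumptions; the real helper in test_streaming.py validates more of the chunk format):

# Hypothetical stand-in for streaming_format_tests: validate the shape of a
# streamed chunk and pull out its text delta and completion state.
def example_streaming_format_check(idx: int, chunk) -> tuple[str, bool]:
    # Every streamed chunk should expose a non-empty list of choices.
    assert hasattr(chunk, "choices") and len(chunk.choices) > 0

    choice = chunk.choices[0]
    finished = choice.finish_reason is not None  # e.g. "stop" on the final chunk

    # The text delta may be None (notably on the final chunk); normalize to "".
    text = choice.delta.content or ""
    return text, finished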
def __iter__(self):
    return self

def __next__(self):
this is not needed - you can just inherit it from BaseModelResponseIterator
example iterator in cloudflare/chat/transformation.py:
class CloudflareChatResponseIterator(BaseModelResponseIterator):
    def chunk_parser(self, chunk: dict) -> GenericStreamingChunk:
        try:
            text = ""
            tool_use: Optional[ChatCompletionToolCallChunk] = None
            is_finished = False
            finish_reason = ""
            usage: Optional[ChatCompletionUsageBlock] = None
            provider_specific_fields = None

            index = int(chunk.get("index", 0))

            if "response" in chunk:
                text = chunk["response"]

            returned_chunk = GenericStreamingChunk(
                text=text,
                tool_use=tool_use,
                is_finished=is_finished,
                finish_reason=finish_reason,
                usage=usage,
                index=index,
                provider_specific_fields=provider_specific_fields,
            )

            return returned_chunk

        except json.JSONDecodeError:
            raise ValueError(f"Failed to decode JSON from chunk: {chunk}")
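Applied to this PR, a minimal sketch of what the suggestion could look like; the class name and the chunk fields here are assumptions based on the OpenAI streaming format, not the final implementation:

# Hypothetical sketch: the aiohttp_openai iterator inheriting the shared base
# class, so only chunk_parser needs to be implemented.
class AiohttpOpenAIResponseIterator(BaseModelResponseIterator):
    def chunk_parser(self, chunk: dict) -> GenericStreamingChunk:
        # OpenAI-format streaming chunks carry text deltas under choices[0].delta.
        choices = chunk.get("choices") or [{}]
        choice = choices[0]
        delta = choice.get("delta") or {}
        finish_reason = choice.get("finish_reason") or ""

        return GenericStreamingChunk(
            text=delta.get("content") or "",
            tool_use=None,
            is_finished=finish_reason != "",
            finish_reason=finish_reason,
            usage=None,
            index=int(choice.get("index", 0)),
            provider_specific_fields=None,
        )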
@Kashyab19 can you add testing for this PR? Here's an example - litellm/tests/local_testing/test_streaming.py, line 677 in a7b3c66
- Handled streaming via async calls in the handler (see the sketch below)
- Processed the chunks according to the OpenAI response format
- Wrote a test case for streaming
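For illustration, a minimal sketch of that flow, assuming an SSE-style OpenAI streaming endpoint (the function name stream_openai_chunks and the parsing details are assumptions, not the PR's actual handler code):

import json

import aiohttp

# Hypothetical sketch: stream the aiohttp response line by line and yield each
# OpenAI-format chunk as a parsed dict.
async def stream_openai_chunks(
    session: aiohttp.ClientSession, url: str, payload: dict, headers: dict
):
    async with session.post(url, json=payload, headers=headers) as resp:
        async for raw_line in resp.content:
            line = raw_line.decode("utf-8").strip()
            if not line.startswith("data:"):
                continue
            data = line[len("data:"):].strip()
            if data == "[DONE]":  # OpenAI's end-of-stream sentinel
                break
            yield json.loads(data)  # one OpenAI-format chunk per SSE line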
lgtm
Merged commit c52ab1f into BerriAI:litellm_contributor_prs_jan_27
Title
Adding Streaming Feature for AIOHTTP_OPENAI
Relevant issues
Type
🆕 New Feature
Changes
[REQUIRED] Testing - Attach a screenshot of any new tests passing locally
If UI changes, send a screenshot/GIF of working UI fixes