Azure OpenAI: Revise streaming behavior for better usability (completions + chat completions) (Azure#39347)

* DRAFT: unvetted proposal for flattened streaming

* add ported functions test

* remaining tests ported

* completions for consistency

* comments, tests, and other cleanup

* one orphaned test comment cleanup

* xml comment fix

* test assets, making it real

* test assets pt. 2 (re-record functions)

* revised pattern using new StreamingResponse<T>

* use delegate resolver for stronger response/enum connection

* add a snippet for streaming functions

* also add a snippet for streaming with multiple choices

* speculative CHANGELOG update

* basic, standalone test coverage for StreamingResponse<T>

* feedback: keep StreamingResponse<T> in Azure.AI.OpenAI

* address missing 'using' on JsonDocument

* tie up broken link from changelog

* Post-merge: export-api and update-snippets
trrwilson authored and DevArjun23 committed Nov 14, 2023
1 parent f08bdc7 commit 4f17bea
Showing 22 changed files with 1,195 additions and 954 deletions.
21 changes: 20 additions & 1 deletion sdk/openai/Azure.AI.OpenAI/CHANGELOG.md
@@ -8,6 +8,24 @@

This update includes a number of version-to-version breaking changes to the API.

#### Streaming for completions and chat completions

Streaming Completions and Streaming Chat Completions have been significantly updated to use simpler, shallower
patterns and data representations. The goal of these changes is to make streaming much easier to consume in common
cases while still retaining full functionality in more complex ones (e.g. with multiple choices requested).
- A new `StreamingResponse<T>` type is introduced that implicitly exposes an `IAsyncEnumerable<T>` derived from
  the underlying response.
- `OpenAIClient.GetCompletionsStreaming()` now returns a `StreamingResponse<Completions>` that may be directly
  enumerated over. `StreamingCompletions`, `StreamingChoice`, and the corresponding methods are removed.
- Because Chat Completions use a distinct structure for their streaming response messages, a new
  `StreamingChatCompletionsUpdate` type is introduced that encapsulates this update data.
- Correspondingly, `OpenAIClient.GetChatCompletionsStreaming()` now returns a
  `StreamingResponse<StreamingChatCompletionsUpdate>` that may be enumerated over directly.
  `StreamingChatCompletions`, `StreamingChatChoice`, and related methods are removed.
- For more information, please see
  [the related pull request description](https://github.com/Azure/azure-sdk-for-net/pull/39347) as well as the
  updated snippets in the project README.
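
To make the new shape concrete, here is a minimal, self-contained sketch of how a wrapper in the spirit of `StreamingResponse<T>` can own a disposable response while being directly enumerable. `SketchStreamingResponse<T>`, `FakeRawResponse`, and `StreamingResponseSketch` are hypothetical stand-ins written for illustration only; they mirror the member names in the exported API listing but are not the library's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a raw HTTP response that must be disposed.
public sealed class FakeRawResponse : IDisposable
{
    public bool IsDisposed { get; private set; }
    public IReadOnlyList<string> Chunks { get; }
    public FakeRawResponse(params string[] chunks) => Chunks = chunks;
    public void Dispose() => IsDisposed = true;
}

// Illustrative wrapper: it owns the response and is itself the async sequence.
public sealed class SketchStreamingResponse<T> : IAsyncEnumerable<T>, IDisposable
{
    private readonly FakeRawResponse _rawResponse;
    private readonly Func<FakeRawResponse, IAsyncEnumerable<T>> _processor;

    private SketchStreamingResponse(
        FakeRawResponse rawResponse,
        Func<FakeRawResponse, IAsyncEnumerable<T>> processor)
    {
        _rawResponse = rawResponse;
        _processor = processor;
    }

    public static SketchStreamingResponse<T> CreateFromResponse(
        FakeRawResponse rawResponse,
        Func<FakeRawResponse, IAsyncEnumerable<T>> processor)
        => new SketchStreamingResponse<T>(rawResponse, processor);

    public FakeRawResponse GetRawResponse() => _rawResponse;

    public IAsyncEnumerable<T> EnumerateValues() => _processor(_rawResponse);

    public IAsyncEnumerator<T> GetAsyncEnumerator(CancellationToken cancellationToken = default)
        => EnumerateValues().GetAsyncEnumerator(cancellationToken);

    public void Dispose() => _rawResponse.Dispose();
}

public static class StreamingResponseSketch
{
    private static async IAsyncEnumerable<string> ReadChunksAsync(FakeRawResponse response)
    {
        foreach (string chunk in response.Chunks)
        {
            await Task.Yield(); // simulate waiting for the next streamed chunk
            yield return chunk;
        }
    }

    public static async Task<string> CollectAsync(params string[] chunks)
    {
        using var streaming = SketchStreamingResponse<string>.CreateFromResponse(
            new FakeRawResponse(chunks), ReadChunksAsync);

        var collected = new List<string>();
        await foreach (string chunk in streaming) // one flat loop, no nested enumerations
        {
            collected.Add(chunk);
        }
        return string.Concat(collected);
    }
}
```

The `using` declaration in `CollectAsync` shows the intended ownership: disposing the wrapper releases the underlying response once enumeration is finished.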

#### `deploymentOrModelName` moved to `*Options.DeploymentName`

`deploymentOrModelName` and related method parameters on `OpenAIClient` have been moved to `DeploymentName`
@@ -64,7 +82,8 @@ And *added* as replacements are:

#### Embeddings

-- Changed the representation of embeddings from `IReadOnlyList<float>` to `ReadOnlyMemory<float>`.
To align representations of embeddings across Azure AI, the `Embeddings` type has been updated to use
`ReadOnlyMemory<float>` instead of `IReadOnlyList<float>`.
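
For callers, the practical effect is that element access goes through `ReadOnlyMemory<float>.Span`, and a copy is made only when explicitly requested (for example with `ToArray()`). The sketch below computes a cosine similarity over two vectors held as `ReadOnlyMemory<float>`; the `EmbeddingMath` helper is hypothetical and not part of the library.

```csharp
using System;

public static class EmbeddingMath
{
    // Cosine similarity over two equally sized vectors, read through spans without copying.
    public static double CosineSimilarity(ReadOnlyMemory<float> left, ReadOnlyMemory<float> right)
    {
        if (left.Length != right.Length)
        {
            throw new ArgumentException("Vectors must have the same length.");
        }

        ReadOnlySpan<float> a = left.Span;
        ReadOnlySpan<float> b = right.Span;
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

A `float[]` converts implicitly to `ReadOnlyMemory<float>`, so existing array-based test data can be passed to a helper like this unchanged.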

### Bugs Fixed

86 changes: 79 additions & 7 deletions sdk/openai/Azure.AI.OpenAI/README.md
@@ -209,17 +209,48 @@ var chatCompletionsOptions = new ChatCompletionsOptions()
}
};

-Response<StreamingChatCompletions> response
-    = await client.GetChatCompletionsStreamingAsync(chatCompletionsOptions);
-using StreamingChatCompletions streamingChatCompletions = response.Value;
await foreach (StreamingChatCompletionsUpdate chatUpdate in client.GetChatCompletionsStreaming(chatCompletionsOptions))
{
    if (chatUpdate.Role.HasValue)
    {
        Console.Write($"{chatUpdate.Role.Value.ToString().ToUpperInvariant()}: ");
    }
    if (!string.IsNullOrEmpty(chatUpdate.ContentUpdate))
    {
        Console.Write(chatUpdate.ContentUpdate);
    }
}
```

-await foreach (StreamingChatChoice choice in streamingChatCompletions.GetChoicesStreaming())
When explicitly requesting more than one `Choice` while streaming, use the `ChoiceIndex` property on
`StreamingChatCompletionsUpdate` to determine which `Choice` each update corresponds to.

```C# Snippet:StreamChatMessagesWithMultipleChoices
// A ChoiceCount > 1 will feature multiple, parallel, independent text generations arriving on the
// same response. This may be useful when choosing between multiple candidates for a single request.
var chatCompletionsOptions = new ChatCompletionsOptions()
{
-    await foreach (ChatMessage message in choice.GetMessageStreaming())
    Messages = { new ChatMessage(ChatRole.User, "Write a limerick about bananas.") },
    ChoiceCount = 4
};

await foreach (StreamingChatCompletionsUpdate chatUpdate
    in client.GetChatCompletionsStreaming(chatCompletionsOptions))
{
    // Choice-specific information like Role and ContentUpdate will also provide a ChoiceIndex that allows
    // StreamingChatCompletionsUpdate data for independent choices to be appropriately separated.
    if (chatUpdate.ChoiceIndex.HasValue)
    {
-        Console.Write(message.Content);
        int choiceIndex = chatUpdate.ChoiceIndex.Value;
        if (chatUpdate.Role.HasValue)
        {
            textBoxes[choiceIndex].Text += $"{chatUpdate.Role.Value.ToString().ToUpperInvariant()}: ";
        }
        if (!string.IsNullOrEmpty(chatUpdate.ContentUpdate))
        {
            textBoxes[choiceIndex].Text += chatUpdate.ContentUpdate;
        }
    }
-    Console.WriteLine();
}
```
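
Outside of a UI, the same separation can be done with one buffer per choice. The following self-contained sketch uses a hypothetical `UpdateSketch` class in place of `StreamingChatCompletionsUpdate` (only `ChoiceIndex`, `Role`, and `ContentUpdate` are modeled, with the role simplified to a string) and an in-memory sequence in place of a service call.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

// Hypothetical stand-in modeling only the members needed for this illustration.
public sealed class UpdateSketch
{
    public UpdateSketch(int? choiceIndex, string? role, string? contentUpdate)
    {
        ChoiceIndex = choiceIndex;
        Role = role;
        ContentUpdate = contentUpdate;
    }

    public int? ChoiceIndex { get; }
    public string? Role { get; }
    public string? ContentUpdate { get; }
}

public static class MultiChoiceAccumulator
{
    // Builds one text buffer per choice index from an interleaved update stream.
    public static async Task<SortedDictionary<int, string>> AccumulateAsync(
        IAsyncEnumerable<UpdateSketch> updates)
    {
        var buffers = new SortedDictionary<int, StringBuilder>();
        await foreach (UpdateSketch update in updates)
        {
            if (!update.ChoiceIndex.HasValue)
            {
                continue; // this update is not tied to a specific choice
            }
            int index = update.ChoiceIndex.Value;
            if (!buffers.TryGetValue(index, out StringBuilder? buffer))
            {
                buffer = new StringBuilder();
                buffers[index] = buffer;
            }
            if (update.Role is not null)
            {
                buffer.Append(update.Role.ToUpperInvariant()).Append(": ");
            }
            buffer.Append(update.ContentUpdate); // appending null is a no-op
        }

        var results = new SortedDictionary<int, string>();
        foreach (KeyValuePair<int, StringBuilder> pair in buffers)
        {
            results[pair.Key] = pair.Value.ToString();
        }
        return results;
    }

    // Sample data: two choices whose updates arrive interleaved.
    public static async IAsyncEnumerable<UpdateSketch> SampleUpdatesAsync()
    {
        yield return new UpdateSketch(0, "assistant", null);
        yield return new UpdateSketch(1, "assistant", null);
        yield return new UpdateSketch(1, null, "Yellow");
        yield return new UpdateSketch(0, null, "There once");
        await Task.Yield();
        yield return new UpdateSketch(0, null, " was a fruit");
        yield return new UpdateSketch(null, null, null);
    }
}
```

Keying the buffers by index rather than by arrival order is the point of the design: updates for different choices may interleave arbitrarily on a single response.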

@@ -338,6 +369,47 @@ if (responseChoice.FinishReason == CompletionsFinishReason.FunctionCall)
}
```

When using streaming, capture streaming response components as they arrive and accumulate streaming function arguments
in the same manner used for streaming content. Then, in place of the `ChatMessage` from the non-streaming
response, add a new `ChatMessage` instance to the history, created from the streamed information.

```C# Snippet::ChatFunctions::StreamingFunctions
string functionName = null;
StringBuilder contentBuilder = new();
StringBuilder functionArgumentsBuilder = new();
ChatRole streamedRole = default;
CompletionsFinishReason finishReason = default;

await foreach (StreamingChatCompletionsUpdate update
    in client.GetChatCompletionsStreaming(chatCompletionsOptions))
{
    contentBuilder.Append(update.ContentUpdate);
    functionName ??= update.FunctionName;
    functionArgumentsBuilder.Append(update.FunctionArgumentsUpdate);
    // Keep the last non-null value: most updates carry no role or finish reason,
    // so falling back to default here would discard what was already received.
    streamedRole = update.Role ?? streamedRole;
    finishReason = update.FinishReason ?? finishReason;
}

if (finishReason == CompletionsFinishReason.FunctionCall)
{
    string lastContent = contentBuilder.ToString();
    string unvalidatedArguments = functionArgumentsBuilder.ToString();
    ChatMessage chatMessageForHistory = new(streamedRole, lastContent)
    {
        FunctionCall = new(functionName, unvalidatedArguments),
    };
    conversationMessages.Add(chatMessageForHistory);

    // Handle from here just like the non-streaming case
}
```

Please note: while streamed function information (name, arguments) may be evaluated as it arrives, it should not be
considered complete or confirmed until the `FinishReason` of `FunctionCall` is received. It may be appropriate to make
best-effort attempts at "warm-up" or other speculative preparation based on a function name or particular key/value
appearing in the accumulated, partial JSON arguments, but no strong assumptions about validity, ordering, or other
details should be made until the arguments are fully available and confirmed via `FinishReason`.
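
One way to honor this is to defer JSON parsing until the finish reason has been observed. The sketch below is self-contained and makes several assumptions: `FunctionUpdateSketch` is a hypothetical stand-in for the streamed update, the finish reason is simplified to a string with `"function_call"` as an assumed value, and a synchronous sequence replaces the streaming call. `System.Text.Json` performs the final validation.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical stand-in modeling only the members needed for this illustration.
public sealed class FunctionUpdateSketch
{
    public FunctionUpdateSketch(string? functionName, string? argumentsUpdate, string? finishReason)
    {
        FunctionName = functionName;
        FunctionArgumentsUpdate = argumentsUpdate;
        FinishReason = finishReason;
    }

    public string? FunctionName { get; }
    public string? FunctionArgumentsUpdate { get; }
    public string? FinishReason { get; }
}

public static class StreamedFunctionCall
{
    // Returns parsed arguments only when the stream ended with a function call and the
    // accumulated text is valid JSON; returns null otherwise.
    public static JsonDocument? TryComplete(
        IEnumerable<FunctionUpdateSketch> updates,
        out string? functionName)
    {
        functionName = null;
        string? finishReason = null;
        var arguments = new StringBuilder();

        foreach (FunctionUpdateSketch update in updates)
        {
            functionName ??= update.FunctionName;
            arguments.Append(update.FunctionArgumentsUpdate);
            finishReason = update.FinishReason ?? finishReason;
        }

        if (finishReason != "function_call")
        {
            return null; // the arguments may still be partial, so they are not trusted
        }

        try
        {
            return JsonDocument.Parse(arguments.ToString());
        }
        catch (JsonException)
        {
            return null; // finished but malformed; the caller decides how to recover
        }
    }
}
```

Callers own the returned `JsonDocument` and should dispose it once the arguments have been read.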

### Use your own data with Azure OpenAI

The use your own data feature is unique to Azure OpenAI and won't work with a client configured to use the non-Azure service.
60 changes: 23 additions & 37 deletions sdk/openai/Azure.AI.OpenAI/api/Azure.AI.OpenAI.netstandard2.0.cs
@@ -116,6 +116,8 @@ public partial class AzureChatExtensionsMessageContext
{
public AzureChatExtensionsMessageContext() { }
public System.Collections.Generic.IList<Azure.AI.OpenAI.ChatMessage> Messages { get { throw null; } }
public Azure.AI.OpenAI.ContentFilterResults RequestContentFilterResults { get { throw null; } }
public Azure.AI.OpenAI.ContentFilterResults ResponseContentFilterResults { get { throw null; } }
}
public partial class AzureChatExtensionsOptions
{
@@ -206,10 +208,7 @@ public static partial class AzureOpenAIModelFactory
public static Azure.AI.OpenAI.ImageGenerations ImageGenerations(System.DateTimeOffset created = default(System.DateTimeOffset), System.Collections.Generic.IEnumerable<Azure.AI.OpenAI.ImageLocation> data = null) { throw null; }
public static Azure.AI.OpenAI.ImageLocation ImageLocation(System.Uri url = null) { throw null; }
public static Azure.AI.OpenAI.PromptFilterResult PromptFilterResult(int promptIndex = 0, Azure.AI.OpenAI.ContentFilterResults contentFilterResults = null) { throw null; }
-public static Azure.AI.OpenAI.StreamingChatChoice StreamingChatChoice(Azure.AI.OpenAI.ChatChoice originalBaseChoice = null) { throw null; }
-public static Azure.AI.OpenAI.StreamingChatCompletions StreamingChatCompletions(Azure.AI.OpenAI.ChatCompletions baseChatCompletions = null, System.Collections.Generic.List<Azure.AI.OpenAI.StreamingChatChoice> streamingChatChoices = null) { throw null; }
-public static Azure.AI.OpenAI.StreamingChoice StreamingChoice(Azure.AI.OpenAI.Choice originalBaseChoice = null) { throw null; }
-public static Azure.AI.OpenAI.StreamingCompletions StreamingCompletions(Azure.AI.OpenAI.Completions baseCompletions = null, System.Collections.Generic.List<Azure.AI.OpenAI.StreamingChoice> streamingChoices = null) { throw null; }
public static Azure.AI.OpenAI.StreamingChatCompletionsUpdate StreamingChatCompletionsUpdate(string id, System.DateTimeOffset created, int? choiceIndex = default(int?), Azure.AI.OpenAI.ChatRole? role = default(Azure.AI.OpenAI.ChatRole?), string authorName = null, string contentUpdate = null, Azure.AI.OpenAI.CompletionsFinishReason? finishReason = default(Azure.AI.OpenAI.CompletionsFinishReason?), string functionName = null, string functionArgumentsUpdate = null, Azure.AI.OpenAI.AzureChatExtensionsMessageContext azureExtensionsContext = null) { throw null; }
}
public partial class ChatChoice
{
@@ -482,12 +481,12 @@ public OpenAIClient(System.Uri endpoint, Azure.Core.TokenCredential tokenCredent
public virtual System.Threading.Tasks.Task<Azure.Response<Azure.AI.OpenAI.AudioTranslation>> GetAudioTranslationAsync(Azure.AI.OpenAI.AudioTranslationOptions audioTranslationOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public virtual Azure.Response<Azure.AI.OpenAI.ChatCompletions> GetChatCompletions(Azure.AI.OpenAI.ChatCompletionsOptions chatCompletionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public virtual System.Threading.Tasks.Task<Azure.Response<Azure.AI.OpenAI.ChatCompletions>> GetChatCompletionsAsync(Azure.AI.OpenAI.ChatCompletionsOptions chatCompletionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
-public virtual Azure.Response<Azure.AI.OpenAI.StreamingChatCompletions> GetChatCompletionsStreaming(Azure.AI.OpenAI.ChatCompletionsOptions chatCompletionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
-public virtual System.Threading.Tasks.Task<Azure.Response<Azure.AI.OpenAI.StreamingChatCompletions>> GetChatCompletionsStreamingAsync(Azure.AI.OpenAI.ChatCompletionsOptions chatCompletionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public virtual Azure.AI.OpenAI.StreamingResponse<Azure.AI.OpenAI.StreamingChatCompletionsUpdate> GetChatCompletionsStreaming(Azure.AI.OpenAI.ChatCompletionsOptions chatCompletionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public virtual System.Threading.Tasks.Task<Azure.AI.OpenAI.StreamingResponse<Azure.AI.OpenAI.StreamingChatCompletionsUpdate>> GetChatCompletionsStreamingAsync(Azure.AI.OpenAI.ChatCompletionsOptions chatCompletionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public virtual Azure.Response<Azure.AI.OpenAI.Completions> GetCompletions(Azure.AI.OpenAI.CompletionsOptions completionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public virtual System.Threading.Tasks.Task<Azure.Response<Azure.AI.OpenAI.Completions>> GetCompletionsAsync(Azure.AI.OpenAI.CompletionsOptions completionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
-public virtual Azure.Response<Azure.AI.OpenAI.StreamingCompletions> GetCompletionsStreaming(Azure.AI.OpenAI.CompletionsOptions completionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
-public virtual System.Threading.Tasks.Task<Azure.Response<Azure.AI.OpenAI.StreamingCompletions>> GetCompletionsStreamingAsync(Azure.AI.OpenAI.CompletionsOptions completionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public virtual Azure.AI.OpenAI.StreamingResponse<Azure.AI.OpenAI.Completions> GetCompletionsStreaming(Azure.AI.OpenAI.CompletionsOptions completionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public virtual System.Threading.Tasks.Task<Azure.AI.OpenAI.StreamingResponse<Azure.AI.OpenAI.Completions>> GetCompletionsStreamingAsync(Azure.AI.OpenAI.CompletionsOptions completionsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public virtual Azure.Response<Azure.AI.OpenAI.Embeddings> GetEmbeddings(Azure.AI.OpenAI.EmbeddingsOptions embeddingsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public virtual System.Threading.Tasks.Task<Azure.Response<Azure.AI.OpenAI.Embeddings>> GetEmbeddingsAsync(Azure.AI.OpenAI.EmbeddingsOptions embeddingsOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public virtual Azure.Response<Azure.AI.OpenAI.ImageGenerations> GetImageGenerations(Azure.AI.OpenAI.ImageGenerationOptions imageGenerationOptions, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
@@ -512,42 +511,29 @@ internal PromptFilterResult() { }
public Azure.AI.OpenAI.ContentFilterResults ContentFilterResults { get { throw null; } }
public int PromptIndex { get { throw null; } }
}
-public partial class StreamingChatChoice
public partial class StreamingChatCompletionsUpdate
{
-internal StreamingChatChoice() { }
-public Azure.AI.OpenAI.ContentFilterResults ContentFilterResults { get { throw null; } }
-public Azure.AI.OpenAI.CompletionsFinishReason? FinishReason { get { throw null; } }
-public int? Index { get { throw null; } }
-public System.Collections.Generic.IAsyncEnumerable<Azure.AI.OpenAI.ChatMessage> GetMessageStreaming([System.Runtime.CompilerServices.EnumeratorCancellationAttribute] System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
-}
-public partial class StreamingChatCompletions : System.IDisposable
-{
-internal StreamingChatCompletions() { }
internal StreamingChatCompletionsUpdate() { }
public string AuthorName { get { throw null; } }
public Azure.AI.OpenAI.AzureChatExtensionsMessageContext AzureExtensionsContext { get { throw null; } }
public int? ChoiceIndex { get { throw null; } }
public string ContentUpdate { get { throw null; } }
public System.DateTimeOffset Created { get { throw null; } }
-public string Id { get { throw null; } }
-public System.Collections.Generic.IReadOnlyList<Azure.AI.OpenAI.PromptFilterResult> PromptFilterResults { get { throw null; } }
-public void Dispose() { }
-protected virtual void Dispose(bool disposing) { }
-public System.Collections.Generic.IAsyncEnumerable<Azure.AI.OpenAI.StreamingChatChoice> GetChoicesStreaming([System.Runtime.CompilerServices.EnumeratorCancellationAttribute] System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
-}
-public partial class StreamingChoice
-{
-internal StreamingChoice() { }
-public Azure.AI.OpenAI.ContentFilterResults ContentFilterResults { get { throw null; } }
public Azure.AI.OpenAI.CompletionsFinishReason? FinishReason { get { throw null; } }
-public int? Index { get { throw null; } }
-public Azure.AI.OpenAI.CompletionsLogProbabilityModel LogProbabilityModel { get { throw null; } }
-public System.Collections.Generic.IAsyncEnumerable<string> GetTextStreaming([System.Runtime.CompilerServices.EnumeratorCancellationAttribute] System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public string FunctionArgumentsUpdate { get { throw null; } }
public string FunctionName { get { throw null; } }
public string Id { get { throw null; } }
public Azure.AI.OpenAI.ChatRole? Role { get { throw null; } }
}
-public partial class StreamingCompletions : System.IDisposable
public partial class StreamingResponse<T> : System.Collections.Generic.IAsyncEnumerable<T>, System.IDisposable
{
-internal StreamingCompletions() { }
-public System.DateTimeOffset Created { get { throw null; } }
-public string Id { get { throw null; } }
-public System.Collections.Generic.IReadOnlyList<Azure.AI.OpenAI.PromptFilterResult> PromptFilterResults { get { throw null; } }
internal StreamingResponse() { }
public static Azure.AI.OpenAI.StreamingResponse<T> CreateFromResponse(Azure.Response response, System.Func<Azure.Response, System.Collections.Generic.IAsyncEnumerable<T>> asyncEnumerableProcessor) { throw null; }
public void Dispose() { }
protected virtual void Dispose(bool disposing) { }
-public System.Collections.Generic.IAsyncEnumerable<Azure.AI.OpenAI.StreamingChoice> GetChoicesStreaming([System.Runtime.CompilerServices.EnumeratorCancellationAttribute] System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
public System.Collections.Generic.IAsyncEnumerable<T> EnumerateValues() { throw null; }
public Azure.Response GetRawResponse() { throw null; }
System.Collections.Generic.IAsyncEnumerator<T> System.Collections.Generic.IAsyncEnumerable<T>.GetAsyncEnumerator(System.Threading.CancellationToken cancellationToken) { throw null; }
}
}
namespace Microsoft.Extensions.Azure
@@ -0,0 +1,33 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

// <auto-generated/>

#nullable disable

using System.Collections.Generic;
using Azure.Core;

namespace Azure.AI.OpenAI
{
    /// <summary>
    /// A representation of the additional context information available when Azure OpenAI chat extensions are involved
    /// in the generation of a corresponding chat completions response. This context information is only populated when
    /// using an Azure OpenAI request configured to use a matching extension.
    /// </summary>
    public partial class AzureChatExtensionsMessageContext
    {
        public ContentFilterResults RequestContentFilterResults { get; internal set; }
        public ContentFilterResults ResponseContentFilterResults { get; internal set; }

        internal AzureChatExtensionsMessageContext(
            IList<ChatMessage> messages,
            ContentFilterResults requestContentFilterResults,
            ContentFilterResults responseContentFilterResults)
            : this(messages)
        {
            RequestContentFilterResults = requestContentFilterResults;
            ResponseContentFilterResults = responseContentFilterResults;
        }
    }
}