NOTE: This feature is in the rollout phase and is available only to specific tenants. Our team is actively working to fully enable it in Teams and across all languages in the SDK. Rest assured, we are diligently working to bring this feature to everyone. Updates will be posted on the Discussions page.
AI-powered bots tend to have slower response times, which can disengage users. Two factors contribute to a slow response: the multiple preprocessing steps, such as RAG or function calls, that are often required before the LLM can produce a response, and the time the LLM takes to generate the full response.

A common solution is to stream the bot's response to users while the LLM generates its full response. Through streaming, your bot can offer an experience that feels engaging, responsive, and on par with leading AI products.
There are two parts to streaming:

- Informative Updates: Provide users with insight into what the bot is doing before it starts generating its response.
- Response Streaming: Provide users with chunks of the response as the LLM generates them. This feels like the bot is actively typing out its message.
The `StreamingResponse` class is the helper class for streaming responses to the client. It sends a series of updates to the client in a single response. If you are using your own custom model, you can directly instantiate and manage this class to stream responses.

The expected sequence of calls is `queueInformativeUpdate()`, `queueTextChunk()`, ..., `endStream()`. Once `endStream()` is called, the stream is considered ended and no further updates can be sent.
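For example, a custom model could drive the stream directly. The following is a minimal sketch, not the SDK's prescribed pattern: it assumes the `StreamingResponse` constructor takes the current `TurnContext`, that the queue methods are synchronous, and it uses a hypothetical `generateChunks()` function standing in for your model's chunked output.

```typescript
import { TurnContext } from 'botbuilder';
import { StreamingResponse } from '@microsoft/teams-ai';

// Hypothetical custom model call that yields text chunks as they are generated.
declare function generateChunks(prompt: string): AsyncIterable<string>;

async function streamWithCustomModel(context: TurnContext): Promise<void> {
    const streamer = new StreamingResponse(context);

    // 1. Informative update: shown to the user before generation starts.
    streamer.queueInformativeUpdate('Finding relevant work items');

    // 2. Queue each chunk of the response as the model produces it.
    for await (const chunk of generateChunks(context.activity.text)) {
        streamer.queueTextChunk(chunk);
    }

    // 3. End the stream; no further updates can be sent after this.
    await streamer.endStream();
}
```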
Current limitations:

- Streaming is only available in 1:1 chats.
- Only rich text can be streamed.
- Only one informative message can be set. It is reused for each message. Examples include:
  - “Scanning through documents”
  - “Summarizing content”
  - “Finding relevant work items”
- The informative message is rendered only at the beginning of each message returned from the LLM.
- Attachments can only be sent in the final streamed chunk.
- Streaming is not yet available in conjunction with the AI SDK's function calls.
You can configure streaming with your bot by following these steps:

- Use the `DefaultAugmentation` class.
- Set `stream: true` in the `OpenAIModel` declaration.
- Set the informative message in the `ActionPlanner` declaration via the `StartStreamingMessage` config.
- Set attachments in the final chunk via the `EndStreamHandler` in the `ActionPlanner` declaration.
```cs
// Create OpenAI Model
builder.Services.AddSingleton<OpenAIModel>(sp => new(
new OpenAIModelOptions(config.OpenAI.ApiKey, "gpt-4o")
{
LogRequests = true,
Stream = true, // Set stream toggle
},
sp.GetService<ILoggerFactory>()
));
ResponseReceivedHandler endStreamHandler = new((object sender, ResponseReceivedEventArgs args) =>
{
StreamingResponse? streamer = args.Streamer;
if (streamer == null)
{
return;
}
AdaptiveCard adaptiveCard = new("1.6")
{
Body = [new AdaptiveTextBlock(streamer.Message) { Wrap = true }]
};
var adaptiveCardAttachment = new Attachment()
{
ContentType = "application/vnd.microsoft.card.adaptive",
Content = adaptiveCard,
};
streamer.Attachments = [adaptiveCardAttachment]; // Set attachments
});
// Create ActionPlanner
ActionPlanner<TurnState> planner = new(
options: new(
model: sp.GetService<OpenAIModel>()!,
prompts: prompts,
defaultPrompt: async (context, state, planner) =>
{
PromptTemplate template = prompts.GetPrompt("Chat");
return await Task.FromResult(template);
}
)
{
LogRepairs = true,
StartStreamingMessage = "Loading stream results...", // Set informative message
EndStreamHandler = endStreamHandler // Set final chunk handler
},
loggerFactory: loggerFactory
);
```
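Here the `EndStreamHandler` fires after the model finishes generating: the full streamed text is available on `streamer.Message`, which the handler above wraps in an Adaptive Card and attaches to the final chunk.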
```typescript
const model = new OpenAIModel({
// ...Setup OpenAI or AzureOpenAI
stream: true, // Set stream toggle
});
const endStreamHandler: PromptCompletionModelResponseReceivedEvent = (ctx, memory, response, streamer) => {
// ... Setup attachments
streamer.setAttachments([...cards]); // Set attachments
};
const planner = new ActionPlanner({
model,
prompts,
defaultPrompt: 'default',
startStreamingMessage: 'Loading stream results...', // Set informative message
endStreamHandler: endStreamHandler // Set final chunk handler
});
```
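Finally, the streaming-enabled planner plugs into the `Application` as usual. A minimal sketch, where the `MemoryStorage` choice is just a local-development placeholder:

```typescript
import { MemoryStorage } from 'botbuilder';
import { Application, TurnState } from '@microsoft/teams-ai';

const app = new Application<TurnState>({
    storage: new MemoryStorage(), // placeholder storage for local development
    ai: { planner } // the streaming-enabled planner from above
});
```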