feat(component,openai): add support for tools and predicted output (#953)

Because

- tool calling is beneficial for designing the agent framework.
- predicted output can improve response speed.

This commit

- adds support for tools and predicted output.
- removes the property count check in compogen, as we are using oneOf in
tasks, which cannot guarantee at least one property.
donch1989 authored Jan 15, 2025
1 parent 4d932da commit fc808a7
Showing 15 changed files with 435 additions and 55 deletions.
83 changes: 82 additions & 1 deletion pkg/component/ai/openai/v0/README.mdx
@@ -62,7 +62,7 @@ OpenAI's text generation models (often called generative pre-trained transformer
| Input | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_TEXT_GENERATION` |
| Model (required) | `model` | string | ID of the model to use. <br/><details><summary><strong>Enum values</strong></summary><ul><li>`o1-preview`</li><li>`o1-mini`</li><li>`gpt-4o-mini`</li><li>`gpt-4o`</li><li>`gpt-4o-2024-05-13`</li><li>`gpt-4o-2024-08-06`</li><li>`gpt-4-turbo`</li><li>`gpt-4-turbo-2024-04-09`</li><li>`gpt-4-0125-preview`</li><li>`gpt-4-turbo-preview`</li><li>`gpt-4-1106-preview`</li><li>`gpt-4-vision-preview`</li><li>`gpt-4`</li><li>`gpt-4-0314`</li><li>`gpt-4-0613`</li><li>`gpt-4-32k`</li><li>`gpt-4-32k-0314`</li><li>`gpt-4-32k-0613`</li><li>`gpt-3.5-turbo`</li><li>`gpt-3.5-turbo-16k`</li><li>`gpt-3.5-turbo-0301`</li><li>`gpt-3.5-turbo-0613`</li><li>`gpt-3.5-turbo-1106`</li><li>`gpt-3.5-turbo-0125`</li><li>`gpt-3.5-turbo-16k-0613`</li></ul></details> |
| Model (required) | `model` | string | ID of the model to use. <br/><details><summary><strong>Enum values</strong></summary><ul><li>`o1`</li><li>`o1-preview`</li><li>`o1-mini`</li><li>`gpt-4o-mini`</li><li>`gpt-4o`</li><li>`gpt-4o-2024-05-13`</li><li>`gpt-4o-2024-08-06`</li><li>`gpt-4-turbo`</li><li>`gpt-4-turbo-2024-04-09`</li><li>`gpt-4-0125-preview`</li><li>`gpt-4-turbo-preview`</li><li>`gpt-4-1106-preview`</li><li>`gpt-4-vision-preview`</li><li>`gpt-4`</li><li>`gpt-4-0314`</li><li>`gpt-4-0613`</li><li>`gpt-4-32k`</li><li>`gpt-4-32k-0314`</li><li>`gpt-4-32k-0613`</li><li>`gpt-3.5-turbo`</li><li>`gpt-3.5-turbo-16k`</li><li>`gpt-3.5-turbo-0301`</li><li>`gpt-3.5-turbo-0613`</li><li>`gpt-3.5-turbo-1106`</li><li>`gpt-3.5-turbo-0125`</li><li>`gpt-3.5-turbo-16k-0613`</li></ul></details> |
| Prompt (required) | `prompt` | string | The prompt text. |
| System Message | `system-message` | string | The system message helps set the behavior of the assistant. For example, you can modify the personality of the assistant or provide specific instructions about how it should behave throughout the conversation. By default, the model uses the generic system message "You are a helpful assistant.". |
| Image | `images` | array[string] | The images. |
@@ -74,6 +74,9 @@ OpenAI's text generation models (often called generative pre-trained transformer
| Top P | `top-p` | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both. |
| Presence Penalty | `presence-penalty` | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. |
| Frequency Penalty | `frequency-penalty` | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. |
| [Prediction](#text-generation-prediction) | `prediction` | object | Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content. |
| [Tools](#text-generation-tools) | `tools` | array[object] | A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported. |
| Tool Choice | `tool-choice` | any | Controls which (if any) tool is called by the model. 'none' means the model will not call any tool and instead generates a message. 'auto' means the model can pick between generating a message or calling one or more tools. 'required' means the model must call one or more tools. |
</div>
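Per the `tool-choice` row above, the field accepts either a mode string or an object that forces a specific function. A minimal sketch of the two shapes; the `get_weather` function name is a hypothetical example, not part of the component:

```python
# `tool-choice` as a mode string: "none", "auto", or "required".
choice_auto = "auto"

# `tool-choice` as an object forcing the model to call one specific function;
# `get_weather` is a hypothetical function name used only for illustration.
choice_forced = {"function": {"name": "get_weather"}}
```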


@@ -113,6 +116,39 @@ The image URL
| :--- | :--- | :--- | :--- |
| URL | `url` | string | Either a URL of the image or the base64 encoded image data. |
</div>
<h4 id="text-generation-prediction">Prediction</h4>

Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content.

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Content | `content` | string | The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly. |
</div>
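As a sketch of the intended use: when regenerating a file with one small edit, passing the current contents as `content` lets matching response tokens be returned quickly. The file contents and prompt below are invented for illustration:

```python
# Current file contents; most of the regenerated output should match this.
current_file = "class User:\n    id: int\n    name: str\n"

# Hypothetical task input: request a small edit and supply the existing text
# as the prediction so unchanged spans can be fast-pathed.
task_input = {
    "model": "gpt-4o",
    "prompt": "Rename the `name` field to `full_name` in this file:\n" + current_file,
    "prediction": {"content": current_file},
}
```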
<h4 id="text-generation-tools">Tools</h4>

A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| [Function](#text-generation-function) | `function` | object | The function to call. |
</div>
<h4 id="text-generation-function">Function</h4>

The function to call.

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Description | `description` | string | A description of what the function does, used by the model to choose when and how to call the function. |
| Name | `name` | string | The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. |
| Parameters | `parameters` | object | The parameters the function accepts, described as a JSON Schema object. Omitting parameters defines a function with an empty parameter list. |
| Strict | `strict` | boolean | Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. |
</div>
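A sketch of a single `tools` entry following the Function table above; the weather function and its JSON Schema are hypothetical examples:

```python
import re

# One entry in the `tools` array. Only `name` is required.
weather_tool = {
    "function": {
        "name": "get_weather",  # letters, digits, underscores, dashes; max length 64
        "description": "Get the current weather for a given city.",
        "parameters": {  # described as a JSON Schema object
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "strict": True,  # ask the model to follow the schema exactly
    }
}

# The name constraint from the table, checked explicitly.
assert re.fullmatch(r"[A-Za-z0-9_-]{1,64}", weather_tool["function"]["name"])
```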
</details>

<details>
@@ -156,22 +192,67 @@ The image URL
| Output | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Texts | `texts` | array[string] | Texts. |
| [Tool Calls](#text-generation-tool-calls) (optional) | `tool-calls` | array[object] | The tool calls generated by the model, such as function calls. |
| [Usage](#text-generation-usage) (optional) | `usage` | object | Usage statistics related to the query. |
</div>

<details>
<summary> Output Objects in Text Generation</summary>

<h4 id="text-generation-tool-calls">Tool Calls</h4>

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| [Function](#text-generation-function) | `function` | object | The function that the model called. |
| Type | `type` | string | The type of the tool. Currently, only function is supported. |
</div>

<h4 id="text-generation-function">Function</h4>

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Arguments | `arguments` | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. |
| Name | `name` | string | The name of the function to call. |
</div>
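Since the note above warns that `arguments` may not be valid JSON and may include hallucinated parameters, a defensive parse is advisable before dispatching to your function. A sketch, using a hypothetical tool call:

```python
import json

# A hypothetical tool call shaped like the output tables above.
tool_call = {
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "London"}'},
}

def parse_arguments(call):
    """Parse the model-generated arguments, returning None on invalid JSON."""
    try:
        args = json.loads(call["function"]["arguments"])
    except json.JSONDecodeError:
        return None
    # Reject non-object payloads before dispatching to the real function.
    return args if isinstance(args, dict) else None

args = parse_arguments(tool_call)  # → {"city": "London"}
```

Validating against the function's own JSON Schema before execution would tighten this further.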

<h4 id="text-generation-usage">Usage</h4>

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| [Completion token details](#text-generation-completion-token-details) | `completion-token-details` | object | Breakdown of tokens used in a completion. |
| Completion tokens | `completion-tokens` | integer | Total number of tokens used (completion). |
| [Prompt token details](#text-generation-prompt-token-details) | `prompt-token-details` | object | Breakdown of tokens used in the prompt. |
| Prompt tokens | `prompt-tokens` | integer | Total number of tokens used (prompt). |
| Total tokens | `total-tokens` | integer | Total number of tokens used (prompt + completion). |
</div>
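The relationship among the three counters above can be sketched as follows; the numbers are invented:

```python
# Invented usage numbers illustrating the accounting in the table above.
usage = {
    "prompt-tokens": 120,
    "completion-tokens": 80,
    "total-tokens": 200,  # prompt + completion
}

assert usage["total-tokens"] == usage["prompt-tokens"] + usage["completion-tokens"]
```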

<h4 id="text-generation-prompt-token-details">Prompt Token Details</h4>

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Audio tokens | `audio-tokens` | integer | Audio input tokens present in the prompt. |
| Cached tokens | `cached-tokens` | integer | Cached tokens present in the prompt. |
</div>

<h4 id="text-generation-completion-token-details">Completion Token Details</h4>

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Accepted prediction tokens | `accepted-prediction-tokens` | integer | When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion. |
| Audio tokens | `audio-tokens` | integer | Audio input tokens generated by the model. |
| Reasoning tokens | `reasoning-tokens` | integer | Tokens generated by the model for reasoning. |
| Rejected prediction tokens | `rejected-prediction-tokens` | integer | When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits. |
</div>
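A sketch of the billing note above: rejected prediction tokens never appear in the output, yet they still count toward completion tokens. All numbers here are invented:

```python
# Invented breakdown for a response that used Predicted Outputs.
details = {
    "accepted-prediction-tokens": 150,  # prediction tokens that appeared in the output
    "rejected-prediction-tokens": 30,   # did not appear, but still billed
}
freshly_generated = 20  # tokens generated outside the prediction (assumption)

# Tokens the caller actually receives in the completion text.
visible_tokens = details["accepted-prediction-tokens"] + freshly_generated

# Tokens counted for billing and context-window purposes also include
# the rejected prediction tokens.
billed_completion_tokens = visible_tokens + details["rejected-prediction-tokens"]
```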
</details>


164 changes: 164 additions & 0 deletions pkg/component/ai/openai/v0/config/tasks.yaml
@@ -190,6 +190,7 @@ TASK_TEXT_GENERATION:
model:
description: ID of the model to use.
enum:
- o1
- o1-preview
- o1-mini
- gpt-4o-mini
@@ -221,6 +222,7 @@ TASK_TEXT_GENERATION:
uiOrder: 0
instillCredentialMap:
values:
- o1
- o1-preview
- o1-mini
- gpt-4o
@@ -353,6 +355,94 @@ TASK_TEXT_GENERATION:
shortDescription: An alternative to sampling with temperature, called nucleus sampling
uiOrder: 9
title: Top P
prediction:
description: Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead
of time. This is most common when you are regenerating a file with only minor changes to most of the content.
type: object
uiOrder: 12
title: Prediction
properties:
content:
description: The content that should be matched when generating a model response. If generated tokens would match this content, the entire model
response can be returned much more quickly.
type: string
uiOrder: 0
title: Content
tools:
description: A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the
model may generate JSON inputs for. A max of 128 functions are supported.
type: array
uiOrder: 13
title: Tools
items:
type: object
required:
- function
properties:
function:
uiOrder: 0
title: Function
type: object
description: The function to call.
required:
- name
properties:
description:
type: string
uiOrder: 0
title: Description
description: A description of what the function does, used by the model to choose when and how to call the function.
name:
type: string
uiOrder: 1
title: Name
description: The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of
64.
parameters:
type: object
uiOrder: 2
title: Parameters
description: The parameters the function accepts, described as a JSON Schema object. Omitting parameters defines a function with an empty
parameter list.
strict:
type: boolean
default: false
uiOrder: 3
title: Strict
description: Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact
schema defined in the parameters field.
tool-choice:
description: Controls which (if any) tool is called by the model. 'none' means the model will not call any tool and instead generates a message.
'auto' means the model can pick between generating a message or calling one or more tools. 'required' means the model must call one or more tools.
uiOrder: 14
title: Tool Choice
oneOf:
- type: string
enum: [none, auto, required]
uiOrder: 0
title: Tool Choice
description: none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a
message or calling one or more tools. required means the model must call one or more tools.
- type: object
uiOrder: 0
title: Tool Choice
description: Specifies a tool the model should use. Use to force the model to call a specific function.
required:
- function
properties:
function:
uiOrder: 0
title: Function
description: The function to call.
type: object
required:
- name
properties:
name:
type: string
uiOrder: 0
title: Name
description: The name of the function to call.
required:
- model
- prompt
@@ -369,6 +459,37 @@ TASK_TEXT_GENERATION:
description: Texts.
title: Texts
type: array
tool-calls:
description: The tool calls generated by the model, such as function calls.
uiOrder: 1
items:
type: object
properties:
type:
type: string
uiOrder: 0
title: Type
description: The type of the tool. Currently, only function is supported.
function:
type: object
uiOrder: 1
title: Function
description: The function that the model called.
properties:
name:
type: string
uiOrder: 0
title: Name
description: The name of the function to call.
arguments:
type: string
uiOrder: 1
title: Arguments
description: The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate
valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your
function.
title: Tool Calls
type: array
usage:
description: Usage statistics related to the query.
uiOrder: 1
@@ -388,6 +509,49 @@ TASK_TEXT_GENERATION:
description: Total number of tokens used (prompt).
uiOrder: 2
type: integer
prompt-token-details:
title: Prompt token details
description: Breakdown of tokens used in the prompt.
uiOrder: 3
type: object
properties:
audio-tokens:
title: Audio tokens
description: Audio input tokens present in the prompt.
uiOrder: 0
type: integer
cached-tokens:
title: Cached tokens
description: Cached tokens present in the prompt.
uiOrder: 1
type: integer
completion-token-details:
title: Completion token details
description: Breakdown of tokens used in a completion.
uiOrder: 4
type: object
properties:
reasoning-tokens:
title: Reasoning tokens
description: Tokens generated by the model for reasoning.
uiOrder: 0
type: integer
audio-tokens:
title: Audio tokens
description: Audio input tokens generated by the model.
uiOrder: 1
type: integer
accepted-prediction-tokens:
title: Accepted prediction tokens
description: When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
uiOrder: 2
type: integer
rejected-prediction-tokens:
title: Rejected prediction tokens
description: When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning
tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
uiOrder: 3
type: integer
required:
- total-tokens
title: Usage
