Add custom pre and post processor functions #5300

Merged
merged 10 commits into from
Nov 8, 2023
57 changes: 47 additions & 10 deletions _ml-commons-plugin/extensibility/blueprints.md
@@ -10,7 +10,7 @@ parent: Connecting to remote models

All connectors consist of a JSON blueprint created by machine learning (ML) developers. The blueprint allows administrators and data scientists to make connections between OpenSearch and an AI service or model-serving technology.

The following example shows a blueprint that connects to Amazon SageMaker:
The following example shows a blueprint of an Amazon SageMaker connector:

```json
POST /_plugins/_ml/connectors/_create
@@ -20,12 +20,12 @@ POST /_plugins/_ml/connectors/_create
"version": "<YOUR CONNECTOR VERSION>",
"protocol": "aws_sigv4",
"credential": {
"access_key": "<ADD YOUR AWS ACCESS KEY HERE>",
"secret_key": "<ADD YOUR AWS SECRET KEY HERE>",
"session_token": "<ADD YOUR AWS SECURITY TOKEN HERE>"
"access_key": "<YOUR AWS ACCESS KEY>",
"secret_key": "<YOUR AWS SECRET KEY>",
"session_token": "<YOUR AWS SECURITY TOKEN>"
},
"parameters": {
"region": "<ADD YOUR AWS REGION HERE>",
"region": "<YOUR AWS REGION>",
"service_name": "sagemaker"
},
"actions": [
@@ -35,8 +35,8 @@ POST /_plugins/_ml/connectors/_create
"headers": {
"content-type": "application/json"
},
"url": "<ADD YOUR Sagemaker MODEL ENDPOINT URL>",
"request_body": "<ADD YOUR REQUEST BODY. Example: ${parameters.inputs}>"
"url": "<YOUR SAGEMAKER MODEL ENDPOINT URL>",
"request_body": "<YOUR REQUEST BODY. Example: ${parameters.inputs}>"
}
]
}
@@ -105,9 +105,9 @@ POST /_plugins/_ml/connectors/_create
"version": 1,
"protocol": "aws_sigv4",
"credential": {
"access_key": "<REPLACE WITH SAGEMAKER ACCESS KEY>",
"secret_key": "<REPLACE WITH SAGEMAKER SECRET KEY>",
"session_token": "<REPLACE WITH AWS SECURITY TOKEN>"
"access_key": "<YOUR SAGEMAKER ACCESS KEY>",
"secret_key": "<YOUR SAGEMAKER SECRET KEY>",
"session_token": "<YOUR AWS SECURITY TOKEN>"
},
"parameters": {
"region": "ap-northeast-1",
@@ -173,6 +173,43 @@ The remote text embedding model output must be a two-dimensional float array, ea
]
```

## Custom pre- and post-processing functions

You can write your own pre- and post-processing functions specifically for your model format. For example, the following Amazon Bedrock connector definition contains custom pre- and post-processing functions for the Amazon Bedrock Titan embedding model:

```json
POST /_plugins/_ml/connectors/_create
{
"name": "Amazon Bedrock Connector: embedding",
"description": "The connector to the Bedrock Titan embedding model",
"version": 1,
"protocol": "aws_sigv4",
"parameters": {
"region": "<YOUR AWS REGION>",
"service_name": "bedrock"
},
"credential": {
"access_key": "<YOUR AWS ACCESS KEY>",
"secret_key": "<YOUR AWS SECRET KEY>",
"session_token": "<YOUR AWS SECURITY TOKEN>"
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://bedrock-runtime.us-east-1.amazonaws.com/model/amazon.titan-embed-text-v1/invoke",
"headers": {
"content-type": "application/json",
"x-amz-content-sha256": "required"
},
"request_body": "{ \"inputText\": \"${parameters.inputText}\" }",
"pre_process_function": "\n StringBuilder builder = new StringBuilder();\n builder.append(\"\\\"\");\n String first = params.text_docs[0];\n builder.append(first);\n builder.append(\"\\\"\");\n def parameters = \"{\" +\"\\\"inputText\\\":\" + builder + \"}\";\n return \"{\" +\"\\\"parameters\\\":\" + parameters + \"}\";",
"post_process_function": "\n def name = \"sentence_embedding\";\n def dataType = \"FLOAT32\";\n if (params.embedding == null || params.embedding.length == 0) {\n return params.message;\n }\n def shape = [params.embedding.length];\n def json = \"{\" +\n \"\\\"name\\\":\\\"\" + name + \"\\\",\" +\n \"\\\"data_type\\\":\\\"\" + dataType + \"\\\",\" +\n \"\\\"shape\\\":\" + shape + \",\" +\n \"\\\"data\\\":\" + params.embedding +\n \"}\";\n return json;\n "
}
]
}
```
{% include copy-curl.html %}
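
To make the string handling in the two Painless scripts above easier to follow, here is a minimal Python sketch of the same logic. It is an illustration only, not code that OpenSearch runs: the function names `pre_process` and `post_process` and the plain-dictionary `params` argument are assumptions made for this example, and the post-processing step uses `json.dumps` where the script concatenates strings.

```python
import json


def pre_process(params):
    """Hypothetical mirror of the pre_process_function script."""
    # Only the first document in text_docs is used, as in the script.
    first = params["text_docs"][0]
    # Plain string concatenation, as in the script: `first` is not JSON-escaped.
    parameters = '{"inputText":"' + first + '"}'
    return '{"parameters":' + parameters + '}'


def post_process(params):
    """Hypothetical mirror of the post_process_function script."""
    embedding = params.get("embedding")
    # With no embedding in the response, the script returns params.message.
    if embedding is None or len(embedding) == 0:
        return params.get("message")
    # Wrap the embedding in the name/data_type/shape/data structure.
    return json.dumps({
        "name": "sentence_embedding",
        "data_type": "FLOAT32",
        "shape": [len(embedding)],
        "data": embedding,
    })
```

Two details of the scripts stand out in this form: the pre-processing step forwards only the first entry of `text_docs`, and because the input text is concatenated without escaping, a document containing a double quotation mark would produce a malformed request body.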

## Next step

51 changes: 45 additions & 6 deletions _ml-commons-plugin/extensibility/connectors.md
@@ -192,17 +192,17 @@ The `parameters` section requires the following options when using `aws_sigv4` a

## Cohere connector

The following example request creates a standalone Cohere connection:
The following example request creates a standalone Cohere connector:

```json
POST /_plugins/_ml/connectors/_create
{
"name": "YOUR CONNECTOR NAME",
"description": "YOUR CONNECTOR DESCRIPTION",
"version": "YOUR CONNECTOR VERSION",
"name": "<YOUR CONNECTOR NAME>",
"description": "<YOUR CONNECTOR DESCRIPTION>",
"version": "<YOUR CONNECTOR VERSION>",
"protocol": "http",
"credential": {
"cohere_key": "ADD YOUR Cohere API KEY HERE"
"cohere_key": "<YOUR Cohere API KEY HERE>"
},
"parameters": {
"model": "embed-english-v2.0",
@@ -216,13 +216,52 @@ POST /_plugins/_ml/connectors/_create
"headers": {
"Authorization": "Bearer ${credential.cohere_key}"
},
"request_body": "{ \"texts\": ${parameters.texts}, \"truncate\": \"${parameters.truncate}\", \"model\": \"${parameters.model}\" }"
"request_body": "{ \"texts\": ${parameters.texts}, \"truncate\": \"${parameters.truncate}\", \"model\": \"${parameters.model}\" }",
"pre_process_function": "connector.pre_process.cohere.embedding",
"post_process_function": "connector.post_process.cohere.embedding"
}
]
}
```
{% include copy-curl.html %}

## Amazon Bedrock connector

The following example request creates a standalone Amazon Bedrock connector:

```json
POST /_plugins/_ml/connectors/_create
{
"name": "Amazon Bedrock Connector: embedding",
"description": "The connector to the Bedrock Titan embedding model",
"version": 1,
"protocol": "aws_sigv4",
"parameters": {
"region": "<YOUR AWS REGION>",
"service_name": "bedrock"
},
"credential": {
"access_key": "<YOUR AWS ACCESS KEY>",
"secret_key": "<YOUR AWS SECRET KEY>",
"session_token": "<YOUR AWS SECURITY TOKEN>"
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://bedrock-runtime.us-east-1.amazonaws.com/model/amazon.titan-embed-text-v1/invoke",
"headers": {
"content-type": "application/json",
"x-amz-content-sha256": "required"
},
"request_body": "{ \"inputText\": \"${parameters.inputText}\" }",
"pre_process_function": "\n StringBuilder builder = new StringBuilder();\n builder.append(\"\\\"\");\n String first = params.text_docs[0];\n builder.append(first);\n builder.append(\"\\\"\");\n def parameters = \"{\" +\"\\\"inputText\\\":\" + builder + \"}\";\n return \"{\" +\"\\\"parameters\\\":\" + parameters + \"}\";",
"post_process_function": "\n def name = \"sentence_embedding\";\n def dataType = \"FLOAT32\";\n if (params.embedding == null || params.embedding.length == 0) {\n return params.message;\n }\n def shape = [params.embedding.length];\n def json = \"{\" +\n \"\\\"name\\\":\\\"\" + name + \"\\\",\" +\n \"\\\"data_type\\\":\\\"\" + dataType + \"\\\",\" +\n \"\\\"shape\\\":\" + shape + \",\" +\n \"\\\"data\\\":\" + params.embedding +\n \"}\";\n return json;\n "
}
]
}
```
{% include copy-curl.html %}

## Next steps

3 changes: 2 additions & 1 deletion _ml-commons-plugin/extensibility/index.md
@@ -32,7 +32,8 @@ PUT /_cluster/settings
"plugins.ml_commons.trusted_connector_endpoints_regex": [
"^https://runtime\\.sagemaker\\..*[a-z0-9-]\\.amazonaws\\.com/.*$",
"^https://api\\.openai\\.com/.*$",
"^https://api\\.cohere\\.ai/.*$"
"^https://api\\.cohere\\.ai/.*$",
"^https://bedrock-runtime\\..*[a-z0-9-]\\.amazonaws\\.com/.*$"
]
}
}
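
As a quick sanity check on the added pattern, the following Python sketch tests connector URLs against the trusted endpoint list. It is an approximation under stated assumptions: the patterns are copied from the setting above with the JSON string escaping removed, the helper name `is_trusted` is hypothetical, and Python's `re` module stands in for the Java regular expression engine that OpenSearch uses, which should behave the same for patterns this simple.

```python
import re

# Patterns from plugins.ml_commons.trusted_connector_endpoints_regex,
# with the JSON double backslashes reduced to single regex escapes.
TRUSTED_ENDPOINT_PATTERNS = [
    r"^https://runtime\.sagemaker\..*[a-z0-9-]\.amazonaws\.com/.*$",
    r"^https://api\.openai\.com/.*$",
    r"^https://api\.cohere\.ai/.*$",
    r"^https://bedrock-runtime\..*[a-z0-9-]\.amazonaws\.com/.*$",
]


def is_trusted(url):
    """Return True if the URL matches at least one trusted pattern (hypothetical helper)."""
    return any(re.match(pattern, url) for pattern in TRUSTED_ENDPOINT_PATTERNS)


# The Bedrock endpoint used in the connector examples in this change.
BEDROCK_URL = (
    "https://bedrock-runtime.us-east-1.amazonaws.com"
    "/model/amazon.titan-embed-text-v1/invoke"
)
```

Calling `is_trusted(BEDROCK_URL)` is expected to succeed only because of the new `bedrock-runtime` entry; none of the three earlier patterns cover that host.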