Add custom pre and post processor functions (#5300)
* Add custom pre and post processor functions

Signed-off-by: Fanit Kolchina <[email protected]>

* Uniform placeholder format

Signed-off-by: Fanit Kolchina <[email protected]>

* Add custom pre and post processor blueprint

Signed-off-by: Fanit Kolchina <[email protected]>

* Add a pretrained model

Signed-off-by: Fanit Kolchina <[email protected]>

* Update _ml-commons-plugin/extensibility/blueprints.md

Co-authored-by: Yaliang Wu <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>

* Implemented tech review comments

Signed-off-by: Fanit Kolchina <[email protected]>

* Add two more models

Signed-off-by: Fanit Kolchina <[email protected]>

* Specify json code highlighter

Signed-off-by: Fanit Kolchina <[email protected]>

---------

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Yaliang Wu <[email protected]>
kolchfa-aws and ylwu-amzn authored Nov 8, 2023
1 parent dadb6b1 commit 3dbcc34
Showing 4 changed files with 113 additions and 35 deletions.
57 changes: 47 additions & 10 deletions _ml-commons-plugin/extensibility/blueprints.md
@@ -10,7 +10,7 @@ parent: Connecting to remote models

All connectors consist of a JSON blueprint created by machine learning (ML) developers. The blueprint allows administrators and data scientists to make connections between OpenSearch and an AI service or model-serving technology.

The following example shows a blueprint that connects to Amazon SageMaker:
The following example shows a blueprint of an Amazon SageMaker connector:

```json
POST /_plugins/_ml/connectors/_create
@@ -20,12 +20,12 @@ POST /_plugins/_ml/connectors/_create
"version": "<YOUR CONNECTOR VERSION>",
"protocol": "aws_sigv4",
"credential": {
"access_key": "<ADD YOUR AWS ACCESS KEY HERE>",
"secret_key": "<ADD YOUR AWS SECRET KEY HERE>",
"session_token": "<ADD YOUR AWS SECURITY TOKEN HERE>"
"access_key": "<YOUR AWS ACCESS KEY>",
"secret_key": "<YOUR AWS SECRET KEY>",
"session_token": "<YOUR AWS SECURITY TOKEN>"
},
"parameters": {
"region": "<ADD YOUR AWS REGION HERE>",
"region": "<YOUR AWS REGION>",
"service_name": "sagemaker"
},
"actions": [
@@ -35,8 +35,8 @@ POST /_plugins/_ml/connectors/_create
"headers": {
"content-type": "application/json"
},
"url": "<ADD YOUR Sagemaker MODEL ENDPOINT URL>",
"request_body": "<ADD YOUR REQUEST BODY. Example: ${parameters.inputs}>"
"url": "<YOUR SAGEMAKER MODEL ENDPOINT URL>",
"request_body": "<YOUR REQUEST BODY. Example: ${parameters.inputs}>"
}
]
}
@@ -105,9 +105,9 @@ POST /_plugins/_ml/connectors/_create
"version": 1,
"protocol": "aws_sigv4",
"credential": {
"access_key": "<REPLACE WITH SAGEMAKER ACCESS KEY>",
"secret_key": "<REPLACE WITH SAGEMAKER SECRET KEY>",
"session_token": "<REPLACE WITH AWS SECURITY TOKEN>"
"access_key": "<YOUR SAGEMAKER ACCESS KEY>",
"secret_key": "<YOUR SAGEMAKER SECRET KEY>",
"session_token": "<YOUR AWS SECURITY TOKEN>"
},
"parameters": {
"region": "ap-northeast-1",
@@ -173,6 +173,43 @@ The remote text embedding model output must be a two-dimensional float array, ea
]
```

## Custom pre- and post-processing functions

You can write your own pre- and post-processing functions specifically for your model format. For example, the following Amazon Bedrock connector definition contains custom pre- and post-processing functions for the Amazon Bedrock Titan embedding model:

```json
POST /_plugins/_ml/connectors/_create
{
"name": "Amazon Bedrock Connector: embedding",
"description": "The connector to the Bedrock Titan embedding model",
"version": 1,
"protocol": "aws_sigv4",
"parameters": {
"region": "<YOUR AWS REGION>",
"service_name": "bedrock"
},
"credential": {
"access_key": "<YOUR AWS ACCESS KEY>",
"secret_key": "<YOUR AWS SECRET KEY>",
"session_token": "<YOUR AWS SECURITY TOKEN>"
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://bedrock-runtime.us-east-1.amazonaws.com/model/amazon.titan-embed-text-v1/invoke",
"headers": {
"content-type": "application/json",
"x-amz-content-sha256": "required"
},
"request_body": "{ \"inputText\": \"${parameters.inputText}\" }",
"pre_process_function": "\n StringBuilder builder = new StringBuilder();\n builder.append(\"\\\"\");\n String first = params.text_docs[0];\n builder.append(first);\n builder.append(\"\\\"\");\n def parameters = \"{\" +\"\\\"inputText\\\":\" + builder + \"}\";\n return \"{\" +\"\\\"parameters\\\":\" + parameters + \"}\";",
"post_process_function": "\n def name = \"sentence_embedding\";\n def dataType = \"FLOAT32\";\n if (params.embedding == null || params.embedding.length == 0) {\n return params.message;\n }\n def shape = [params.embedding.length];\n def json = \"{\" +\n \"\\\"name\\\":\\\"\" + name + \"\\\",\" +\n \"\\\"data_type\\\":\\\"\" + dataType + \"\\\",\" +\n \"\\\"shape\\\":\" + shape + \",\" +\n \"\\\"data\\\":\" + params.embedding +\n \"}\";\n return json;\n "
}
]
}
```
{% include copy-curl.html %}
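
The two Painless scripts in the connector above are hard to read in their JSON-escaped form. The following Python sketch mirrors their logic as it appears in the blueprint, assuming the scripts receive `text_docs` as input and an `embedding` (or `message`) field in the model response. The function names `pre_process` and `post_process` are hypothetical, and the sketch builds dictionaries where the Painless scripts concatenate JSON strings; it is an illustration, not the ML Commons implementation.

```python
import json


def pre_process(text_docs):
    """Wrap the first input document as the `inputText` parameter.

    Like the Painless script, this uses only text_docs[0], so any
    additional documents in the list are ignored.
    """
    first = text_docs[0]
    return {"parameters": {"inputText": first}}


def post_process(response):
    """Convert a model response into a single named tensor description."""
    embedding = response.get("embedding")
    # Same fallback as the script: with no embedding, return the message.
    if not embedding:
        return response.get("message")
    return {
        "name": "sentence_embedding",
        "data_type": "FLOAT32",
        "shape": [len(embedding)],
        "data": embedding,
    }


if __name__ == "__main__":
    print(json.dumps(pre_process(["hello world", "ignored"])))
    print(json.dumps(post_process({"embedding": [0.1, 0.2, 0.3]})))
```

The pre-processing step reshapes the input into the `inputText` field that the `request_body` template expects, and the post-processing step reshapes the response into a tensor description with `name`, `data_type`, `shape`, and `data` fields.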

## Next step

51 changes: 45 additions & 6 deletions _ml-commons-plugin/extensibility/connectors.md
@@ -192,17 +192,17 @@ The `parameters` section requires the following options when using `aws_sigv4` a

## Cohere connector

The following example request creates a standalone Cohere connection:
The following example request creates a standalone Cohere connector:

```json
POST /_plugins/_ml/connectors/_create
{
"name": "YOUR CONNECTOR NAME",
"description": "YOUR CONNECTOR DESCRIPTION",
"version": "YOUR CONNECTOR VERSION",
"name": "<YOUR CONNECTOR NAME>",
"description": "<YOUR CONNECTOR DESCRIPTION>",
"version": "<YOUR CONNECTOR VERSION>",
"protocol": "http",
"credential": {
"cohere_key": "ADD YOUR Cohere API KEY HERE"
"cohere_key": "<YOUR Cohere API KEY HERE>"
},
"parameters": {
"model": "embed-english-v2.0",
@@ -216,13 +216,52 @@ POST /_plugins/_ml/connectors/_create
"headers": {
"Authorization": "Bearer ${credential.cohere_key}"
},
"request_body": "{ \"texts\": ${parameters.texts}, \"truncate\": \"${parameters.truncate}\", \"model\": \"${parameters.model}\" }"
"request_body": "{ \"texts\": ${parameters.texts}, \"truncate\": \"${parameters.truncate}\", \"model\": \"${parameters.model}\" }",
"pre_process_function": "connector.pre_process.cohere.embedding",
"post_process_function": "connector.post_process.cohere.embedding"
}
]
}
```
{% include copy-curl.html %}
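
The `request_body` and `headers` values in the connector above contain `${parameters.*}` and `${credential.*}` placeholders. The following Python sketch shows one way such placeholders could be filled in, assuming plain textual substitution. The `fill_template` helper and the `END` value for `truncate` are hypothetical and are not taken from the ML Commons source.

```python
import json
import re

# Matches ${parameters.<name>} and ${credential.<name>} placeholders.
PLACEHOLDER = re.compile(r"\$\{(parameters|credential)\.(\w+)\}")


def fill_template(template, parameters, credential=None):
    """Substitute placeholders in a connector template string (illustrative only)."""
    scopes = {"parameters": parameters, "credential": credential or {}}

    def replace(match):
        scope, key = match.group(1), match.group(2)
        value = scopes[scope][key]
        # Strings are inserted as-is; lists and numbers are JSON-encoded.
        return value if isinstance(value, str) else json.dumps(value)

    return PLACEHOLDER.sub(replace, template)


request_body = (
    '{ "texts": ${parameters.texts}, '
    '"truncate": "${parameters.truncate}", '
    '"model": "${parameters.model}" }'
)

filled = fill_template(
    request_body,
    {"texts": ["hello", "world"], "truncate": "END", "model": "embed-english-v2.0"},
)
payload = json.loads(filled)  # the filled template parses as valid JSON
```

In the template, `${parameters.texts}` is unquoted because it stands for a JSON array, while the string-valued placeholders are wrapped in escaped quotation marks.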

## Amazon Bedrock connector

The following example request creates a standalone Amazon Bedrock connector:

```json
POST /_plugins/_ml/connectors/_create
{
"name": "Amazon Bedrock Connector: embedding",
"description": "The connector to the Bedrock Titan embedding model",
"version": 1,
"protocol": "aws_sigv4",
"parameters": {
"region": "<YOUR AWS REGION>",
"service_name": "bedrock"
},
"credential": {
"access_key": "<YOUR AWS ACCESS KEY>",
"secret_key": "<YOUR AWS SECRET KEY>",
"session_token": "<YOUR AWS SECURITY TOKEN>"
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://bedrock-runtime.us-east-1.amazonaws.com/model/amazon.titan-embed-text-v1/invoke",
"headers": {
"content-type": "application/json",
"x-amz-content-sha256": "required"
},
"request_body": "{ \"inputText\": \"${parameters.inputText}\" }",
"pre_process_function": "\n StringBuilder builder = new StringBuilder();\n builder.append(\"\\\"\");\n String first = params.text_docs[0];\n builder.append(first);\n builder.append(\"\\\"\");\n def parameters = \"{\" +\"\\\"inputText\\\":\" + builder + \"}\";\n return \"{\" +\"\\\"parameters\\\":\" + parameters + \"}\";",
"post_process_function": "\n def name = \"sentence_embedding\";\n def dataType = \"FLOAT32\";\n if (params.embedding == null || params.embedding.length == 0) {\n return params.message;\n }\n def shape = [params.embedding.length];\n def json = \"{\" +\n \"\\\"name\\\":\\\"\" + name + \"\\\",\" +\n \"\\\"data_type\\\":\\\"\" + dataType + \"\\\",\" +\n \"\\\"shape\\\":\" + shape + \",\" +\n \"\\\"data\\\":\" + params.embedding +\n \"}\";\n return json;\n "
}
]
}
```
{% include copy-curl.html %}

## Next steps

3 changes: 2 additions & 1 deletion _ml-commons-plugin/extensibility/index.md
@@ -32,7 +32,8 @@ PUT /_cluster/settings
"plugins.ml_commons.trusted_connector_endpoints_regex": [
"^https://runtime\\.sagemaker\\..*[a-z0-9-]\\.amazonaws\\.com/.*$",
"^https://api\\.openai\\.com/.*$",
"^https://api\\.cohere\\.ai/.*$"
"^https://api\\.cohere\\.ai/.*$",
"^https://bedrock-runtime\\..*[a-z0-9-]\\.amazonaws\\.com/.*$"
]
}
}
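
To check whether a connector URL would be accepted by the patterns above, the expressions can be tested locally. The following Python sketch is a rough check, assuming the URL must match at least one pattern. OpenSearch evaluates these as Java regular expressions, so Python's `re` module only approximates that behavior, although these patterns use syntax common to both. The `is_trusted` helper is hypothetical.

```python
import re

# The patterns from the cluster setting, with JSON string escaping removed.
TRUSTED_ENDPOINT_PATTERNS = [
    r"^https://runtime\.sagemaker\..*[a-z0-9-]\.amazonaws\.com/.*$",
    r"^https://api\.openai\.com/.*$",
    r"^https://api\.cohere\.ai/.*$",
    r"^https://bedrock-runtime\..*[a-z0-9-]\.amazonaws\.com/.*$",
]


def is_trusted(url):
    """Return True if the URL matches at least one trusted endpoint pattern."""
    return any(re.match(pattern, url) for pattern in TRUSTED_ENDPOINT_PATTERNS)


bedrock_url = (
    "https://bedrock-runtime.us-east-1.amazonaws.com"
    "/model/amazon.titan-embed-text-v1/invoke"
)
print(is_trusted(bedrock_url))
print(is_trusted("https://example.com/model"))
```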