
Commit

add offline batch inference connector blueprints (opensearch-project#2768) (opensearch-project#2771)

Signed-off-by: Xun Zhang <[email protected]>
(cherry picked from commit 62b33fd)

Co-authored-by: Xun Zhang <[email protected]>
opensearch-trigger-bot[bot] and Zhangxunmt authored Jul 27, 2024
1 parent 446a592 commit e371f4b
Showing 2 changed files with 303 additions and 0 deletions.
@@ -0,0 +1,148 @@
### OpenAI connector blueprint example for batch inference:

Read more details at https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/.

Integrate the OpenAI Batch API using the connector below, which introduces a new action type, "batch_predict".
For more details about the OpenAI Batch API, see https://platform.openai.com/docs/guides/batch/overview.

#### 1. Create your Model connector and Model group

##### 1a. Register Model group
```json
POST /_plugins/_ml/model_groups/_register
{
  "name": "openAI_model_group",
  "description": "Your openAI model group"
}
```
The response to this request contains the `model_group_id`; note it down.
Sample response:
```json
{
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "status": "CREATED"
}
```

##### 1b. Create Connector
```json
POST /_plugins/_ml/connectors/_create
{
  "name": "OpenAI Embedding model",
  "description": "OpenAI embedding model for testing offline batch",
  "version": "1",
  "protocol": "http",
  "parameters": {
    "model": "text-embedding-ada-002",
    "input_file_id": "file-YbowBByiyVJN89oSZo2Enu9W",
    "endpoint": "/v1/embeddings"
  },
  "credential": {
    "openAI_key": "<your openAI key>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/embeddings",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input\": ${parameters.input}, \"model\": \"${parameters.model}\" }",
      "pre_process_function": "connector.pre_process.openai.embedding",
      "post_process_function": "connector.post_process.openai.embedding"
    },
    {
      "action_type": "batch_predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/batches",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input_file_id\": \"${parameters.input_file_id}\", \"endpoint\": \"${parameters.endpoint}\", \"completion_window\": \"24h\" }"
    }
  ]
}
```
To obtain the `input_file_id` used in the connector parameters, prepare your batch file and upload it to the OpenAI service through the Files API. For details, see the [Files API documentation](https://platform.openai.com/docs/api-reference/files).
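
For illustration, the batch file for embeddings is a JSON Lines file in which each line is a self-contained request. A hypothetical input line (the `custom_id` and input text are made up for this example):

```json
{"custom_id": "request-1", "method": "POST", "url": "/v1/embeddings", "body": {"model": "text-embedding-ada-002", "input": "What is the meaning of life?"}}
```

A minimal upload sketch in Python using the `requests` library (the local file name is an assumption; the returned `id` is the value to use as `input_file_id`):

```python
import requests

OPENAI_API_KEY = "<your openAI key>"  # replace with your own key

# Upload the batch file with purpose "batch". The "id" field of the
# response is the value to use as "input_file_id" in the connector.
with open("batch_input.jsonl", "rb") as f:  # assumed local file name
    response = requests.post(
        "https://api.openai.com/v1/files",
        headers={"Authorization": f"Bearer {OPENAI_API_KEY}"},
        files={"file": f},
        data={"purpose": "batch"},
    )
response.raise_for_status()
print(response.json()["id"])  # e.g. "file-YbowBByiyVJN89oSZo2Enu9W"
```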

#### Sample response
```json
{
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```

### 2. Register the model to the model group and link the created connector

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "OpenAI model for realtime embedding and offline batch inference",
  "function_name": "remote",
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "description": "OpenAI text embedding model",
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```
Sample response:
```json
{
  "task_id": "rMormY8B8aiZvtEZIO_j",
  "status": "CREATED",
  "model_id": "lyjxwZABNrAVdFa9zrcZ"
}
```
### 3. Test offline batch inference using the connector

```json
POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
{
  "parameters": {
    "model": "text-embedding-ada-002"
  }
}
```
Sample response:
```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "id": "batch_khFSJIzT0eev9PuxVDsIGxv6",
            "object": "batch",
            "endpoint": "/v1/embeddings",
            "errors": null,
            "input_file_id": "file-YbowBByiyVJN89oSZo2Enu9W",
            "completion_window": "24h",
            "status": "validating",
            "output_file_id": null,
            "error_file_id": null,
            "created_at": 1722037257,
            "in_progress_at": null,
            "expires_at": 1722123657,
            "finalizing_at": null,
            "completed_at": null,
            "failed_at": null,
            "expired_at": null,
            "cancelling_at": null,
            "cancelled_at": null,
            "request_counts": {
              "total": 0,
              "completed": 0,
              "failed": 0
            },
            "metadata": null
          }
        }
      ],
      "status_code": 200
    }
  ]
}
```
For the definition of each field in the result, refer to https://platform.openai.com/docs/guides/batch.
You can use the batch `id` from the response to track the job. Once the batch is complete, download the output by requesting the OpenAI Files API with the `output_file_id` field from the batch object.
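
As a minimal sketch (reusing the batch `id` from the sample response above; the output file name is illustrative), you can poll the batch and download the results with Python:

```python
import requests

OPENAI_API_KEY = "<your openAI key>"  # replace with your own key
BATCH_ID = "batch_khFSJIzT0eev9PuxVDsIGxv6"  # "id" from the batch_predict response

headers = {"Authorization": f"Bearer {OPENAI_API_KEY}"}

# Retrieve the batch object; "status" moves from "validating" through
# "in_progress" to "completed".
batch = requests.get(
    f"https://api.openai.com/v1/batches/{BATCH_ID}", headers=headers
).json()

if batch["status"] == "completed":
    # Download the results file referenced by "output_file_id".
    result = requests.get(
        f"https://api.openai.com/v1/files/{batch['output_file_id']}/content",
        headers=headers,
    )
    with open("batch_output.jsonl", "wb") as f:  # assumed local file name
        f.write(result.content)
```
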
@@ -0,0 +1,155 @@
### SageMaker connector blueprint example for batch inference:

Read more details at https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/.

Integrate the SageMaker Batch Transform API using the connector below, which introduces a new action type, "batch_predict".
For more details on using batch transform to run inference with Amazon SageMaker, see https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html.

#### 1. Create your Model connector and Model group

##### 1a. Register Model group
```json
POST /_plugins/_ml/model_groups/_register
{
  "name": "sagemaker_model_group",
  "description": "Your sagemaker model group"
}
```
The response to this request contains the `model_group_id`; note it down.
Sample response:
```json
{
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "status": "CREATED"
}
```

##### 1b. Create Connector
```json
POST /_plugins/_ml/connectors/_create
{
  "name": "DJL Sagemaker Connector: all-MiniLM-L6-v2",
  "version": "1",
  "description": "The connector to sagemaker embedding model all-MiniLM-L6-v2",
  "protocol": "aws_sigv4",
  "credential": {
    "access_key": "<your access_key>",
    "secret_key": "<your secret_key>",
    "session_token": "<your session_token>"
  },
  "parameters": {
    "region": "us-east-1",
    "service_name": "sagemaker",
    "DataProcessing": {
      "InputFilter": "$.content",
      "JoinSource": "Input",
      "OutputFilter": "$"
    },
    "ModelName": "DJL-Text-Embedding-Model-imageforjsonlines",
    "TransformInput": {
      "ContentType": "application/json",
      "DataSource": {
        "S3DataSource": {
          "S3DataType": "S3Prefix",
          "S3Uri": "s3://offlinebatch/sagemaker_djl_batch_input.json"
        }
      },
      "SplitType": "Line"
    },
    "TransformJobName": "SM-offline-batch-transform-07-12-13-30",
    "TransformOutput": {
      "AssembleWith": "Line",
      "Accept": "application/json",
      "S3OutputPath": "s3://offlinebatch/output"
    },
    "TransformResources": {
      "InstanceCount": 1,
      "InstanceType": "ml.c5.xlarge"
    },
    "BatchStrategy": "SingleRecord"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "content-type": "application/json"
      },
      "url": "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/OpenSearch-sagemaker-060124023703/invocations",
      "request_body": "${parameters.input}",
      "pre_process_function": "connector.pre_process.default.embedding",
      "post_process_function": "connector.post_process.default.embedding"
    },
    {
      "action_type": "batch_predict",
      "method": "POST",
      "headers": {
        "content-type": "application/json"
      },
      "url": "https://api.sagemaker.us-east-1.amazonaws.com/CreateTransformJob",
      "request_body": "{ \"BatchStrategy\": \"${parameters.BatchStrategy}\", \"ModelName\": \"${parameters.ModelName}\", \"DataProcessing\" : ${parameters.DataProcessing}, \"TransformInput\": ${parameters.TransformInput}, \"TransformJobName\" : \"${parameters.TransformJobName}\", \"TransformOutput\" : ${parameters.TransformOutput}, \"TransformResources\" : ${parameters.TransformResources}}"
    }
  ]
}
```
SageMaker supports data processing through a subset of JSONPath operators and can associate inference results with input records.
For details, see the [AWS documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html).
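
To illustrate, with `SplitType` set to `Line` and `InputFilter` set to `$.content`, each line of the S3 input file is a standalone JSON object whose `content` field is sent to the model, and `JoinSource: Input` joins the model output back onto the original record. A hypothetical input line (field values made up for this example; the exact payload shape depends on the model's expected input):

```json
{"id": "doc-1", "content": ["This is a sample passage to embed."]}
```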

#### Sample response
```json
{
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```

### 2. Register the model to the model group and link the created connector

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "SageMaker model for realtime embedding and offline batch inference",
  "function_name": "remote",
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "description": "SageMaker hosted DJL model",
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```
Sample response:
```json
{
  "task_id": "rMormY8B8aiZvtEZIO_j",
  "status": "CREATED",
  "model_id": "lyjxwZABNrAVdFa9zrcZ"
}
```
### 3. Test offline batch inference using the connector

```json
POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
{
  "parameters": {
    "TransformJobName": "SM-offline-batch-transform-07-15-11-30"
  }
}
```
Sample response:
```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "job_arn": "arn:aws:sagemaker:us-east-1:802041417063:transform-job/SM-offline-batch-transform"
          }
        }
      ],
      "status_code": 200
    }
  ]
}
```
The "job_arn" is returned immediately from this request, and you can use this job_arn to check the job status
in the SageMaker service. Once the job is done, you can check your batch inference results in the S3 that is
specified in the "S3OutputPath" field in your connector.
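
As a sketch (assuming standard AWS credentials are configured, and reusing the job name, bucket, and prefix from the examples above), you can poll the job and list the results with boto3:

```python
import boto3

# Check the status of the transform job started by batch_predict.
sagemaker = boto3.client("sagemaker", region_name="us-east-1")
job = sagemaker.describe_transform_job(
    TransformJobName="SM-offline-batch-transform-07-15-11-30"
)
print(job["TransformJobStatus"])  # e.g. InProgress, Completed, Failed

# Once the job is "Completed", the results are under the S3OutputPath.
s3 = boto3.client("s3", region_name="us-east-1")
listing = s3.list_objects_v2(Bucket="offlinebatch", Prefix="output/")
for obj in listing.get("Contents", []):
    print(obj["Key"])
```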
