Forked from opensearch-project/ml-commons

Commit e371f4b (parent 446a592): add offline batch inference connector blueprints (opensearch-project#2768) (opensearch-project#2771)
Signed-off-by: Xun Zhang <[email protected]>
(cherry picked from commit 62b33fd)
Co-authored-by: Xun Zhang <[email protected]>

2 changed files with 303 additions and 0 deletions.
**docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md** (148 additions)
### OpenAI connector blueprint example for batch inference

Read more details at https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/

Integrate the OpenAI Batch API using the connector below, which introduces a new action type, `batch_predict`.
For more details about the OpenAI Batch API, see https://platform.openai.com/docs/guides/batch/overview.

#### 1. Create your Model connector and Model group

##### 1a. Register Model group
```json
POST /_plugins/_ml/model_groups/_register
{
  "name": "openAI_model_group",
  "description": "Your openAI model group"
}
```
The response returns a `model_group_id`; note it down.
Sample response:
```json
{
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "status": "CREATED"
}
```

##### 1b. Create Connector
```json
POST /_plugins/_ml/connectors/_create
{
  "name": "OpenAI Embedding model",
  "description": "OpenAI embedding model for testing offline batch",
  "version": "1",
  "protocol": "http",
  "parameters": {
    "model": "text-embedding-ada-002",
    "input_file_id": "file-YbowBByiyVJN89oSZo2Enu9W",
    "endpoint": "/v1/embeddings"
  },
  "credential": {
    "openAI_key": "<your openAI key>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/embeddings",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input\": ${parameters.input}, \"model\": \"${parameters.model}\" }",
      "pre_process_function": "connector.pre_process.openai.embedding",
      "post_process_function": "connector.post_process.openai.embedding"
    },
    {
      "action_type": "batch_predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/batches",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input_file_id\": \"${parameters.input_file_id}\", \"endpoint\": \"${parameters.endpoint}\", \"completion_window\": \"24h\" }"
    }
  ]
}
```
To obtain the `input_file_id` used in the connector parameters, prepare your batch file and upload it to the OpenAI service through the Files API. For details, see the [OpenAI Files API documentation](https://platform.openai.com/docs/api-reference/files).
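The batch file you upload must be JSONL: one request object per line, each with a `custom_id` so results can be matched back to inputs. As a minimal sketch of preparing such a file (the `custom_id` values and input texts below are illustrative placeholders, not from this blueprint):

```python
import json

# Each line of the batch file is one embedding request; "custom_id"
# lets you match results back to inputs. Values here are placeholders.
requests = [
    {
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/embeddings",
        "body": {"model": "text-embedding-ada-002", "input": text},
    }
    for i, text in enumerate(["hello world", "offline batch inference"])
]

# Serialize as JSONL: one JSON object per line, as the Batch API expects.
jsonl = "\n".join(json.dumps(r) for r in requests)
with open("batch_input.jsonl", "w") as f:
    f.write(jsonl)

print(jsonl.splitlines()[0])
```

The resulting `batch_input.jsonl` is what you upload with `purpose=batch`; the upload response contains the file id to place in the connector's `input_file_id` parameter.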
#### Sample response
```json
{
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```

### 2. Register the model to the model group and link the created connector

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "OpenAI model for realtime embedding and offline batch inference",
  "function_name": "remote",
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "description": "OpenAI text embedding model",
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```
Sample response:
```json
{
  "task_id": "rMormY8B8aiZvtEZIO_j",
  "status": "CREATED",
  "model_id": "lyjxwZABNrAVdFa9zrcZ"
}
```

### 3. Test offline batch inference using the connector

```json
POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
{
  "parameters": {
    "model": "text-embedding-ada-002"
  }
}
```
Sample response:
```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "id": "batch_khFSJIzT0eev9PuxVDsIGxv6",
            "object": "batch",
            "endpoint": "/v1/embeddings",
            "errors": null,
            "input_file_id": "file-YbowBByiyVJN89oSZo2Enu9W",
            "completion_window": "24h",
            "status": "validating",
            "output_file_id": null,
            "error_file_id": null,
            "created_at": 1722037257,
            "in_progress_at": null,
            "expires_at": 1722123657,
            "finalizing_at": null,
            "completed_at": null,
            "failed_at": null,
            "expired_at": null,
            "cancelling_at": null,
            "cancelled_at": null,
            "request_counts": {
              "total": 0,
              "completed": 0,
              "failed": 0
            },
            "metadata": null
          }
        }
      ],
      "status_code": 200
    }
  ]
}
```
For the definition of each field in the result, see https://platform.openai.com/docs/guides/batch.
Once the batch completes, you can retrieve the batch by the `id` in the output and download the results through the OpenAI Files API.
**docs/remote_inference_blueprints/batch_inference_sagemaker_connector_blueprint.md** (155 additions)
### SageMaker connector blueprint example for batch inference

Read more details at https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/

Integrate the SageMaker Batch Transform API using the connector below, which introduces a new action type, `batch_predict`.
For more details on using batch transform to run inference with Amazon SageMaker, see https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html.

#### 1. Create your Model connector and Model group

##### 1a. Register Model group
```json
POST /_plugins/_ml/model_groups/_register
{
  "name": "sagemaker_model_group",
  "description": "Your sagemaker model group"
}
```
The response returns a `model_group_id`; note it down.
Sample response:
```json
{
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "status": "CREATED"
}
```

##### 1b. Create Connector
```json
POST /_plugins/_ml/connectors/_create
{
  "name": "DJL Sagemaker Connector: all-MiniLM-L6-v2",
  "version": "1",
  "description": "The connector to sagemaker embedding model all-MiniLM-L6-v2",
  "protocol": "aws_sigv4",
  "credential": {
    "access_key": "<your access_key>",
    "secret_key": "<your secret_key>",
    "session_token": "<your session_token>"
  },
  "parameters": {
    "region": "us-east-1",
    "service_name": "sagemaker",
    "DataProcessing": {
      "InputFilter": "$.content",
      "JoinSource": "Input",
      "OutputFilter": "$"
    },
    "ModelName": "DJL-Text-Embedding-Model-imageforjsonlines",
    "TransformInput": {
      "ContentType": "application/json",
      "DataSource": {
        "S3DataSource": {
          "S3DataType": "S3Prefix",
          "S3Uri": "s3://offlinebatch/sagemaker_djl_batch_input.json"
        }
      },
      "SplitType": "Line"
    },
    "TransformJobName": "SM-offline-batch-transform-07-12-13-30",
    "TransformOutput": {
      "AssembleWith": "Line",
      "Accept": "application/json",
      "S3OutputPath": "s3://offlinebatch/output"
    },
    "TransformResources": {
      "InstanceCount": 1,
      "InstanceType": "ml.c5.xlarge"
    },
    "BatchStrategy": "SingleRecord"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "content-type": "application/json"
      },
      "url": "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/OpenSearch-sagemaker-060124023703/invocations",
      "request_body": "${parameters.input}",
      "pre_process_function": "connector.pre_process.default.embedding",
      "post_process_function": "connector.post_process.default.embedding"
    },
    {
      "action_type": "batch_predict",
      "method": "POST",
      "headers": {
        "content-type": "application/json"
      },
      "url": "https://api.sagemaker.us-east-1.amazonaws.com/CreateTransformJob",
      "request_body": "{ \"BatchStrategy\": \"${parameters.BatchStrategy}\", \"ModelName\": \"${parameters.ModelName}\", \"DataProcessing\" : ${parameters.DataProcessing}, \"TransformInput\": ${parameters.TransformInput}, \"TransformJobName\" : \"${parameters.TransformJobName}\", \"TransformOutput\" : ${parameters.TransformOutput}, \"TransformResources\" : ${parameters.TransformResources}}"
    }
  ]
}
```
SageMaker data processing supports a subset of the defined JSONPath operators and can associate inference results with their input records.
For details, see the [AWS documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html).
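To make the `DataProcessing` settings above concrete, here is a self-contained local simulation of what `InputFilter: "$.content"` with `JoinSource: "Input"` does to one record. This is an illustration only, not SageMaker's actual implementation; `fake_embed` and the record fields are made up:

```python
import json

def fake_embed(text: str) -> list[float]:
    # Stand-in for the real embedding model; returns a dummy vector.
    return [float(len(text)), 0.0]

def process_record(line: str) -> str:
    record = json.loads(line)
    # InputFilter "$.content": only this field is sent to the model.
    model_input = record["content"]
    prediction = fake_embed(model_input)
    # JoinSource "Input" with OutputFilter "$": the prediction is joined
    # back onto the original input record under "SageMakerOutput".
    record["SageMakerOutput"] = prediction
    return json.dumps(record)

line = '{"id": 1, "content": "hello world"}'
print(process_record(line))
```

Because `SplitType` is `Line` and `AssembleWith` is `Line`, SageMaker applies this kind of per-record processing to each line of the input file and writes one joined record per line of output.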
#### Sample response
```json
{
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```

### 2. Register the model to the model group and link the created connector

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "SageMaker model for realtime embedding and offline batch inference",
  "function_name": "remote",
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "description": "SageMaker hosted DJL model",
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```
Sample response:
```json
{
  "task_id": "rMormY8B8aiZvtEZIO_j",
  "status": "CREATED",
  "model_id": "lyjxwZABNrAVdFa9zrcZ"
}
```

### 3. Test offline batch inference using the connector

```json
POST /_plugins/_ml/models/dBK3t5ABrxVhHgFYhg7Q/_batch_predict
{
  "parameters": {
    "TransformJobName": "SM-offline-batch-transform-07-15-11-30"
  }
}
```
Sample response:
```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "job_arn": "arn:aws:sagemaker:us-east-1:802041417063:transform-job/SM-offline-batch-transform"
          }
        }
      ],
      "status_code": 200
    }
  ]
}
```
The `job_arn` is returned immediately from this request; you can use it to check the job status in the SageMaker service. Once the job is done, you can find your batch inference results in the S3 location specified in the `S3OutputPath` field of your connector.
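The transform-job name needed for status checks is the last path segment of the `job_arn`. A small sketch using the ARN from the sample response above:

```python
def job_name_from_arn(job_arn: str) -> str:
    """Extract the transform-job name from a SageMaker job ARN."""
    # ARN format: arn:aws:sagemaker:<region>:<account>:transform-job/<name>
    return job_arn.split("/")[-1]

arn = "arn:aws:sagemaker:us-east-1:802041417063:transform-job/SM-offline-batch-transform"
name = job_name_from_arn(arn)
print(name)  # SM-offline-batch-transform

# You can then poll the job status, e.g. with the AWS CLI:
#   aws sagemaker describe-transform-job --transform-job-name SM-offline-batch-transform
```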