Add Connectors and ML updates for 2.9 #4554

Naarcha-AWS · 2023-07-13T16:59:35Z

Fixes #3063

Checklist

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Naarcha-AWS <[email protected]>

kolchfa-aws

LGTM except for some comments.

natebower

@Naarcha-AWS A few more changes. Let me know if I reviewed what you needed or if there's other content that needs my review.

ylwu-amzn · 2023-07-22T18:38:36Z

_ml-commons-plugin/connectors.md

+
+### Adding trusted endpoints
+
+To configure connectors in OpenSearch, add the trusted endpoints to your cluster settings using the `plugins.ml_commons.trusted_connector_endpoints_regex` setting, which supports Java regex expressions, as shown in the following example:


Add this to the settings page.

ylwu-amzn · 2023-07-22T18:39:29Z

_ml-commons-plugin/connectors.md

+            "^https://runtime\\.sagemaker\\..*\\.amazonaws\\.com/.*$",
+            "^https://api\\.openai\\.com/.*$",
+            "^https://api\\.cohere\\.ai/.*$",
+            "^https://bedrock\\..*\\.amazonaws.com/.*$"


remove this?
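To make the effect of the quoted patterns concrete, here is a minimal sketch that checks a few URLs against them. It uses Python's `re` module as a stand-in for Java regex (the syntax of these particular patterns is the same in both), and it assumes whole-URL matching; the thread does not say how ML Commons applies the patterns. The helper name `is_trusted` and the test URLs are hypothetical.

```python
import re

# Patterns copied from the diff above, written as plain regex strings
# (the doubled backslashes in the JSON become single backslashes here).
TRUSTED_ENDPOINT_PATTERNS = [
    r"^https://runtime\.sagemaker\..*\.amazonaws\.com/.*$",
    r"^https://api\.openai\.com/.*$",
    r"^https://api\.cohere\.ai/.*$",
    r"^https://bedrock\..*\.amazonaws.com/.*$",  # the dot before "com" is unescaped
]


def is_trusted(url: str, patterns=TRUSTED_ENDPOINT_PATTERNS) -> bool:
    """Return True if the whole URL matches at least one pattern."""
    return any(re.fullmatch(p, url) for p in patterns)


# Hypothetical URLs, used only for illustration.
print(is_trusted("https://api.openai.com/v1/chat/completions"))
print(is_trusted("https://example.com/v1/chat/completions"))
# Because of the unescaped dot, the Bedrock pattern accepts any character there.
print(is_trusted("https://bedrock.us-east-1.amazonawsXcom/model"))
```

The last call shows that the Bedrock pattern, as quoted, is slightly more permissive than the other three, because `amazonaws.com` is written without escaping the dot.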

ylwu-amzn · 2023-07-22T18:44:11Z

_ml-commons-plugin/connectors.md

+
+### Enabling ML nodes
+
+Most connectors require the use of dedicated ML nodes. To make sure you have ML nodes enabled, update the following cluster settings:


"Most connectors require the use of dedicated ML nodes" -> "By default, connectors require the use of dedicated ML nodes." This is actually the default setting: `"plugins.ml_commons.only_run_on_ml_node": true`.

A remote connector consumes far fewer resources, so it should be fine if users prefer to run it on a data node. If they don't have a dedicated ML node and prefer to run on data nodes, they can set the following:

PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.only_run_on_ml_node": false
  }
}

ylwu-amzn · 2023-07-22T18:52:44Z

_ml-commons-plugin/connectors.md

+| `description` | String | A description of the connector. |
+| `version` | Integer | The version of the connector. |
+| `protocol` | String | The protocol for the connection. For AWS services such as Amazon SageMaker and Amazon Bedrock, use `aws_sigv4`. For all other services, use `http`. |
+| `parameter` | JSON array | The default connector parameters, including `endpoint` and `model`. 


`parameter` -> `parameters`

The type is `Map<String, ?>`.

"including `endpoint` and `model`." -> "for example, `endpoint` and `model`."

Per the security team, we should call out that all parameters in this block are overridable in a predict request. Users can provide a parameter with the same name in the predict request to override the default parameter value defined in the connector.
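The override behavior described in this comment can be illustrated with a small sketch. This is not ML Commons code, only a model of the stated rule that a predict request parameter with the same name replaces the connector default. The function name `effective_parameters` and all values are hypothetical.

```python
def effective_parameters(connector_defaults: dict, request_parameters: dict) -> dict:
    """Merge defaults with request values; request values win on name clashes."""
    return {**connector_defaults, **request_parameters}


# Hypothetical connector defaults, for illustration only.
connector_defaults = {"endpoint": "api.openai.com", "model": "gpt-3.5-turbo"}

# A predict request that supplies a parameter with the same name as a default.
request_parameters = {"model": "another-model", "messages": []}

merged = effective_parameters(connector_defaults, request_parameters)
print(merged["model"])     # value from the request
print(merged["endpoint"])  # default kept, because the request did not set it
```

Under this rule, anything placed in `parameters` should be treated as caller-controlled, which is why the security team asked for it to be documented.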

ylwu-amzn · 2023-07-22T18:54:15Z

_ml-commons-plugin/connectors.md

+| `version` | Integer | The version of the connector. |
+| `protocol` | String | The protocol for the connection. For AWS services such as Amazon SageMaker and Amazon Bedrock, use `aws_sigv4`. For all other services, use `http`. |
+| `parameter` | JSON array | The default connector parameters, including `endpoint` and `model`. 
+| `credential` | String | Defines any credential variables required to connect to your chosen endpoint. ML Commons uses **AES/GCM/NoPadding** symmetric encryption with a key length of 32 bytes. When a connection cluster first starts, the key persists in OpenSearch. Therefore, you do not need to manually encrypt the key.


Type: `Map<String, String>`

ML Commons uses **AES/GCM/NoPadding** symmetric encryption with a key length of 32 bytes. When a connection cluster first starts, the key persists in OpenSearch. Therefore, you do not need to manually encrypt the key.
->
ML Commons uses **AES/GCM/NoPadding** symmetric encryption to encrypt your credentials. When a connection cluster first starts, ML Commons creates a random 32-byte key and persists it in an OpenSearch system index. Therefore, you do not need to manually set the encryption key.

ylwu-amzn · 2023-07-22T18:59:40Z

_ml-commons-plugin/connectors.md

+| `protocol` | String | The protocol for the connection. For AWS services such as Amazon SageMaker and Amazon Bedrock, use `aws_sigv4`. For all other services, use `http`. |
+| `parameter` | JSON array | The default connector parameters, including `endpoint` and `model`. 
+| `credential` | String | Defines any credential variables required to connect to your chosen endpoint. ML Commons uses **AES/GCM/NoPadding** symmetric encryption with a key length of 32 bytes. When a connection cluster first starts, the key persists in OpenSearch. Therefore, you do not need to manually encrypt the key.
+| `action` | JSON array | Tells the connector what actions to run after a connection to ML Commons has been established.


Tells the connector what actions to run after a connection to ML Commons has been established.
->
Defines which actions can run within this connector.

ylwu-amzn · 2023-07-22T19:00:57Z

_ml-commons-plugin/connectors.md

+`action_type` | String | Required. Sets the ML Commons API operation to use upon connection. As of OpenSearch 2.9, only `predict` is supported. 
+`method` | String | Required. Defines the HTTP method for the API call. Supports `POST` and `GET`.
+`url` | String | Required. Sets the connection endpoint at which the action takes place. This must match the regex expression for the connection used when [adding trusted endpoints](#adding-trusted-endpoints).
+`headers` | String | Sets the headers used inside the request or response body. Default is `application/json`.


Type: Map<String, String>

Default is application/json
->
Default "Content-Type" is "application/json"

ylwu-amzn · 2023-07-22T19:04:12Z

_ml-commons-plugin/connectors.md

+
+### Standalone connector
+
+The connector creation API, `/_plugins/_ml/connectors/_create`, creates connections to third-party ML tools. Using the `endpoint` parameter, you can connect ML Commons to any supported ML tool using its specific API endpoint. For example, to connect to a ChatGPT completion model, you can connect using the `api.openai.com`, as shown in the following example:


For example, to connect to a ChatGPT completion model,

->

For example, to connect to a ChatGPT chat model,

ylwu-amzn · 2023-07-22T19:04:45Z

_ml-commons-plugin/connectors.md

+```
+{% include copy-curl.html %}
+
+If successful, the connector API responds with a `connector_id` and `status` for the connection:


Remove this: "and `status`"

ylwu-amzn · 2023-07-22T19:06:13Z

_ml-commons-plugin/connectors.md

+- You can use `model_group_id` to register a model version to an existing model group.
+- If you do not use `model_group_id`, ML Commons creates a model with a new model group.
+
+The following example registers a model named `openAI-GPT-3.5 completions`:


The following example registers a model named openAI-GPT-3.5 completions:

->

The following example registers a model named openAI-gpt-3.5-turbo:

ylwu-amzn · 2023-07-22T19:07:57Z

_ml-commons-plugin/connectors.md

+{
+    "name": "openAI-gpt-3.5-turbo",
+    "function_name": "remote",
+    "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",


I suggest adding an example of creating this model group:

POST /_plugins/_ml/model_groups/_register
{
  "name": "remote_model_group",
  "description": "This is an example description"
}

Sample response:

{
  "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
  "status": "CREATED"
}
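As a rough sketch of how the two requests fit together, the following takes the sample response from this comment and feeds its `model_group_id` into the model registration body quoted in the diff. Only the three fields visible in the diff are included; the hunk is cut off after `model_group_id`, so any further fields are left out rather than guessed. The helper name `build_register_model_body` is hypothetical.

```python
import json

# Sample response from the comment above.
model_group_response = json.loads(
    '{"model_group_id": "wlcnb4kBJ1eYAeTMHlV6", "status": "CREATED"}'
)


def build_register_model_body(name: str, model_group_id: str) -> dict:
    """Build the visible part of a POST /_plugins/_ml/models/_register body."""
    return {
        "name": name,
        "function_name": "remote",
        "model_group_id": model_group_id,
        # The quoted diff ends here; remaining fields are not shown in the thread.
    }


body = build_register_model_body(
    "openAI-gpt-3.5-turbo", model_group_response["model_group_id"]
)
print(json.dumps(body, indent=2))
```

Showing the group creation first makes it clear where the otherwise opaque `wlcnb4kBJ1eYAeTMHlV6` value in the registration example comes from.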

ylwu-amzn · 2023-07-22T19:10:00Z

_ml-commons-plugin/connectors.md

+POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_predict
+{
+  "parameters": {
+    "model": "gpt-3.5-turbo",


We can remove this line because the model is already defined in the connector.

ylwu-amzn · 2023-07-22T19:17:29Z

_ml-commons-plugin/connectors.md

+
+```json
+POST /_plugins/_ml/models/_register
+{


This is not a chat model example. Let's keep it consistent with the standalone connector example.

ylwu-amzn · 2023-07-22T19:21:09Z

_ml-commons-plugin/connectors.md

+}
+```
+
+After creating the connector, you can retrieve the `task_id`, deploy the model, and use the Predict API, similar to a standalone connector.


After creating the connector, you can retrieve the task_id, deploy the model, and use the Predict API, similar to a standalone connector.

->

After creating the connector, you can use the connector ID to register a model, deploy it, and run predictions.

ylwu-amzn · 2023-07-22T19:26:04Z

_ml-commons-plugin/connectors.md

+The `paramaters` section requires the following options when using `aws-sigv4` authentication:
+
+- `region`: The AWS Region in which the AWS instance is located.
+- `service_name`: The name of the AWS service for the connector.


All items defined in `parameters` are overridable and visible (users can see the parameters using the Get Connector API). If users don't want this, they can move these two reserved `aws_sigv4` parameters to `credential`.
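The move suggested in this comment can be sketched as a small transformation on a connector body. This is an illustration only: the helper name `move_sigv4_parameters_to_credential` is hypothetical, the connector body is a placeholder rather than a complete request, and the `access_key` field name is an assumption (the quoted diff shows only `secret_key` and `session_token`).

```python
import copy

# The two reserved aws_sigv4 parameters named in the diff above.
RESERVED_SIGV4_PARAMETERS = ("region", "service_name")


def move_sigv4_parameters_to_credential(connector: dict) -> dict:
    """Return a copy of the body with region/service_name under credential."""
    result = copy.deepcopy(connector)
    parameters = result.setdefault("parameters", {})
    credential = result.setdefault("credential", {})
    for key in RESERVED_SIGV4_PARAMETERS:
        if key in parameters:
            credential[key] = parameters.pop(key)
    return result


# Hypothetical connector body; all values are placeholders.
connector = {
    "protocol": "aws_sigv4",
    "parameters": {"region": "us-east-1", "service_name": "sagemaker"},
    "credential": {"access_key": "<access key>", "secret_key": "<secret key>"},
}

moved = move_sigv4_parameters_to_credential(connector)
print(sorted(moved["credential"]))
print(moved["parameters"])
```

After the move, `region` and `service_name` are no longer in `parameters`, so by the rule stated in the comment they would no longer be overridable or visible as parameters.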

ylwu-amzn · 2023-07-22T19:26:25Z

_ml-commons-plugin/connectors.md

+- `secret_key`: Required. Provides the secret key for the AWS instance.
+- `session_token`: Optional. Provides a temporary set of credentials for the AWS instance.
+
+The `paramaters` section requires the following options when using `aws-sigv4` authentication:


aws-sigv4
->
aws_sigv4

Please also check all other occurrences.

ylwu-amzn · 2023-07-24T19:12:34Z

_ml-commons-plugin/connectors.md

@@ -0,0 +1,461 @@
+---


We should add a security warning for remote connectors. Because users need to configure credentials in the connector (for example, AWS credentials or an OpenAI API key), they should always use a security-enabled cluster. Otherwise, their credentials will not be protected, which is risky.

dylan-tong-aws · 2023-07-24T22:20:02Z

@natebower Any mention of "SageMaker" should be "Amazon SageMaker". I saw some instances of "AWS SageMaker" that need to be corrected. The same goes for ChatGPT. I believe our standard is to use the full name, "OpenAI ChatGPT".

natebower · 2023-07-25T12:13:50Z

> @natebower Any mention of "SageMaker" should be "Amazon SageMaker". I saw some instances of "AWS SageMaker" that need to be corrected. The same goes for ChatGPT. I believe our standard is to use the full name, "OpenAI ChatGPT".

@dylan-tong-aws Re: "Amazon SageMaker", that's correct, thanks for the callout. @Naarcha-AWS Can you please global find and replace? Re: ChatGPT, we don't actually have a standard for this, but given that OpenAI simply uses "ChatGPT", I'd prefer that we follow suit. Thanks!

* Add Connectors and ML updates for 2.9
* Fix code block
* Add Connectors and ML updates for 2.9
* Fix code block
* Add connector settings and examples
* Add GA warning
* Add final experimental warning
* Address tech review. Fix typos
* Fix bad link. Add next steps section
* Fix typo
* Update cluster-settings.md
* Apply suggestions from code review (Co-authored-by: kolchfa-aws <[email protected]>)
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Update _ml-commons-plugin/connectors.md
* Change cluster values for boolean. Fix typo.
* Apply suggestions from code review (Co-authored-by: Nathan Bower <[email protected]>)
* Fix cluser settings
* Add missing config options. More technical feedback.
* Adjust cluster setting description.
* Add updated ChatGPT example
* Add info and example for internal connector.
* One last adjustment.
* Apply suggestions from code review (Co-authored-by: Nathan Bower <[email protected]>)
* Fix dead link
* Fix one last comment.
* change ordered list to numbered.

---------

Signed-off-by: Naarcha-AWS <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>

Naarcha-AWS added 2 commits July 13, 2023 11:55

Add Connectors and ML updates for 2.9

0296df9

Signed-off-by: Naarcha-AWS <[email protected]>

Fix code block

e6d2dcd

Signed-off-by: Naarcha-AWS <[email protected]>

Naarcha-AWS added the 3 - Tech review PR: Tech review in progress label Jul 13, 2023

Naarcha-AWS requested review from cwillum, hdhalter, kolchfa-aws, vagimeli, ananzh, seanneumann, AMoo-Miki and natebower as code owners July 13, 2023 16:59

Naarcha-AWS self-assigned this Jul 13, 2023

Naarcha-AWS added 2 commits July 13, 2023 12:18

Add Connectors and ML updates for 2.9

bf3ced8

Signed-off-by: Naarcha-AWS <[email protected]>

Fix code block

492ecd7

Signed-off-by: Naarcha-AWS <[email protected]>

hdhalter added v2.9.0 release-notes PR: Include this PR in the automated release notes labels Jul 13, 2023

Naarcha-AWS added 3 commits July 17, 2023 13:08

Add connector settings and examples

f219674

Signed-off-by: Naarcha-AWS <[email protected]>

Add GA warning

a5c70e3

Signed-off-by: Naarcha-AWS <[email protected]>

Add final experimental warning

4c39f11

Signed-off-by: Naarcha-AWS <[email protected]>

Zhangxunmt reviewed Jul 17, 2023

_ml-commons-plugin/connectors.md (outdated, resolved)

Zhangxunmt reviewed Jul 17, 2023

_ml-commons-plugin/connectors.md (outdated, resolved)

Address tech review. Fix typos

38b2cdd

Signed-off-by: Naarcha-AWS <[email protected]>

Naarcha-AWS requested a review from Zhangxunmt July 18, 2023 16:37

Naarcha-AWS added 2 commits July 18, 2023 09:56

Fix bad link. Add next steps section

f3343df

Signed-off-by: Naarcha-AWS <[email protected]>

Fix typo

52466cc

Signed-off-by: Naarcha-AWS <[email protected]>

Naarcha-AWS added 4 - Doc review PR: Doc review in progress and removed 3 - Tech review PR: Tech review in progress labels Jul 18, 2023

kolchfa-aws approved these changes Jul 18, 2023

Zhangxunmt reviewed Jul 18, 2023

_ml-commons-plugin/connectors.md (outdated, resolved)

_ml-commons-plugin/connectors.md (outdated, resolved)

STYLE_GUIDE.md (outdated, resolved)

Update cluster-settings.md

42c4a6c

Signed-off-by: Naarcha-AWS <[email protected]>

natebower reviewed Jul 19, 2023

_ml-commons-plugin/connectors.md (outdated, resolved)

_ml-commons-plugin/connectors.md (outdated, resolved)

_ml-commons-plugin/connectors.md (outdated, resolved)

_ml-commons-plugin/connectors.md (outdated, resolved)

Naarcha-AWS and others added 5 commits July 19, 2023 15:48

One last adjustment.

d60c4c0

Signed-off-by: Naarcha-AWS <[email protected]>

Apply suggestions from code review

e26a66c

Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Naarcha-AWS <[email protected]>

Fix dead link

e5f3f21

Signed-off-by: Naarcha-AWS <[email protected]>

Fix one last comment.

afa2fca

Signed-off-by: Naarcha-AWS <[email protected]>

change ordered list to numbered.

8eb225c

Signed-off-by: Naarcha-AWS <[email protected]>

Naarcha-AWS merged commit 95d117f into main Jul 19, 2023

Naarcha-AWS deleted the ml-connectors branch July 19, 2023 23:35

ylwu-amzn reviewed Jul 22, 2023

ylwu-amzn reviewed Jul 24, 2023

ylwu-amzn mentioned this pull request Jul 25, 2023

Add ML connector edits #4636

Merged

1 task
