Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update connector API #1227

Merged
merged 3 commits into from
Sep 12, 2023
Merged

update connector API #1227

merged 3 commits into from
Sep 12, 2023

Conversation

Zhangxunmt
Copy link
Collaborator

@Zhangxunmt Zhangxunmt commented Aug 21, 2023

Description

Update connector API to partially update a connector in the ml-connector index. Verified it works with updating existing fields, adding new fields and deleting fields.

Example:
POST /_plugins/_ml/connectors/_update/4s0vGooBTRq0xHburjWW
{
"credential": {
"openAI_key": "new key"
}
}

GET /_plugins/_ml/connectors/4s0vGooBTRq0xHburjWW
{
...
"credential": {
"openAI_key": "new key"
},
...
}

Issues Resolved

[List any issues this PR will resolve]

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ylwu-amzn
Copy link
Collaborator

Github CI failed

@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env August 22, 2023 22:17 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env August 22, 2023 22:17 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env August 22, 2023 22:17 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env August 22, 2023 22:17 — with GitHub Actions Inactive
@codecov
Copy link

codecov bot commented Aug 22, 2023

Codecov Report

Merging #1227 (e0c6276) into 2.x (8cdac91) will increase coverage by 0.17%.
The diff coverage is 98.71%.

@@             Coverage Diff              @@
##                2.x    #1227      +/-   ##
============================================
+ Coverage     79.23%   79.41%   +0.17%     
- Complexity     2246     2270      +24     
============================================
  Files           188      190       +2     
  Lines          9089     9167      +78     
  Branches        904      906       +2     
============================================
+ Hits           7202     7280      +78     
- Misses         1477     1478       +1     
+ Partials        410      409       -1     
Flag Coverage Δ
ml-commons 79.41% <98.71%> (+0.17%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
...tion/connector/UpdateConnectorTransportAction.java 98.30% <98.30%> (ø)
...rg/opensearch/ml/plugin/MachineLearningPlugin.java 96.21% <100.00%> (+0.02%) ⬆️
...pensearch/ml/rest/RestMLUpdateConnectorAction.java 100.00% <100.00%> (ø)

... and 4 files with indirect coverage changes

@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 18:28 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 18:49 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 18:49 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 19:10 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 19:10 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 20:00 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 20:00 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 20:00 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 20:23 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 20:23 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 20:23 — with GitHub Actions Inactive
@Zhangxunmt Zhangxunmt temporarily deployed to ml-commons-cicd-env September 12, 2023 20:23 — with GitHub Actions Inactive
@b4sjoo b4sjoo self-requested a review September 12, 2023 20:24
@Zhangxunmt Zhangxunmt merged commit 0855241 into opensearch-project:2.x Sep 12, 2023
SearchRequest searchRequest = new SearchRequest(ML_MODEL_INDEX);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.matchQuery(MLModel.CONNECTOR_ID_FIELD, connectorId));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also check model status ?

Copy link
Collaborator Author

@Zhangxunmt Zhangxunmt Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already included in this search. The line 100 makes sure that the search results matches the model_ids that are in the cache. The size of the model is a concern. As the cache size goes large, the performance may become an issue. But 1) this update API is not a frequent API like invoke, so should be fine for now, and 2) the auto deployment with TTL should resolve this concern in the future after we simplify the APIs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion, I didn't clearly understand why line 100 is relevant here?

Why can't we just search in the model index: where CONNECTOR_ID_FIELD = connectorId and model status = deployed or partially deployed?


public static MLUpdateConnectorRequest parse(XContentParser parser, String connectorId) throws IOException {
Map<String, Object> dataAsMap = null;
dataAsMap = parser.map();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add TODO in that case.

SearchRequest searchRequest = new SearchRequest(ML_MODEL_INDEX);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.matchQuery(MLModel.CONNECTOR_ID_FIELD, connectorId));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion, I didn't clearly understand why line 100 is relevant here?

Why can't we just search in the model index: where CONNECTOR_ID_FIELD = connectorId and model status = deployed or partially deployed?

if (searchHits.length == 0) {
client.update(updateRequest, getUpdateResponseListener(connectorId, listener, context));
} else {
log.error(searchHits.length + " models are still using this connector, please undeploy the models first!");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like best user experience will be, if we show the deployed model ids also to the customer. Otherwise with this message user again needs to query to find out the deployed models.

Customer might not have any idea which models are deployed now with this connector idea. If we show the model ids, that could be helpful for customer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 100 will use all the deployed and deploying models to search ML-Model index that match "connector_id=". The model id shouldn't be included in the message otherwise app sec reviewer will raise concerns about the customer info leakage in the logs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure how a random model id can concern appsec which is not quite a customer info.
Anyway, I was mostly referring to the new MLValidationException( part where we can send these model ids.

The goal is to make this easy for customers. That shouldn't concern appsec at least.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all the deployed and deploying models

does this get a inclusive list of all the deployed and deploying models in all the nodes?

@opensearch-trigger-bot
Copy link
Contributor

The backport to main failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-main main
# Navigate to the new working tree
cd .worktrees/backport-main
# Create a new branch
git switch --create backport/backport-1227-to-main
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 0855241ed55523f88fd11f188a331a013b1c7128
# Push it to GitHub
git push --set-upstream origin backport/backport-1227-to-main
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-main

Then, create a pull request where the base branch is main and the compare/head branch is backport/backport-1227-to-main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants