
allow input null for text docs input #1401

Merged
merged 1 commit into from
Sep 27, 2023

Conversation

ylwu-amzn
Collaborator

Description

We need to support null values for some use cases. For example, a multi-modal model needs both text and image input. Users of TextDocsInputDataSet can pass input in the format [text, image], where the first item is the text and the second is the image, but either item may be null. For example, a user who supplies only an image will send [null, image].
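
As a rough illustration of the [text, image] convention described above, here is a minimal Java sketch. The class name, the `buildDocs` helper, and the sample image string are hypothetical and not part of ml-commons; the sketch only shows that a nullable two-slot list keeps its positions when one slot is missing.

```java
import java.util.Arrays;
import java.util.List;

public class MultiModalDocsSketch {
    // Hypothetical helper (not an ml-commons API): builds the two-slot
    // [text, image] list. Either slot may be null.
    static List<String> buildDocs(String text, String image) {
        // Arrays.asList permits null elements. List.of would throw a
        // NullPointerException here, so it is not suitable for this input.
        return Arrays.asList(text, image);
    }

    public static void main(String[] args) {
        // Image-only input: the text slot stays null so the image keeps index 1.
        List<String> imageOnly = buildDocs(null, "base64-image-data");
        System.out.println(imageOnly.size());         // 2
        System.out.println(imageOnly.get(0) == null); // true
    }
}
```

Keeping the null in place, rather than dropping it, means a consumer can still tell by position which item is the text and which is the image.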

Issues Resolved

[List any issues this PR will resolve]

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ylwu-amzn ylwu-amzn temporarily deployed to ml-commons-cicd-env September 27, 2023 18:39 — with GitHub Actions Inactive
@@ -114,7 +114,11 @@ public TextDocsMLInput(XContentParser parser, FunctionName functionName) throws
                 case TEXT_DOCS_FIELD:
                     ensureExpectedToken(XContentParser.Token.START_ARRAY, parser.currentToken(), parser);
                     while (parser.nextToken() != XContentParser.Token.END_ARRAY) {
-                        docs.add(parser.text());
+                        if (parser.currentToken() == null || parser.currentToken() == XContentParser.Token.VALUE_NULL) {
Collaborator

What happens if the input is [null, null]?

Collaborator Author

@ylwu-amzn ylwu-amzn Sep 27, 2023

This depends on the model. If the model cannot accept such input, it will throw an exception.
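
To make the exchange above concrete, here is a hedged sketch of the two pieces involved: a null-preserving collection step and a model-side check for the [null, null] case. It does not use the OpenSearch `XContentParser`; the `Token` enum, the `Tok` class, and both methods are simplified stand-ins invented for illustration. The hunk shown above is truncated, so the body of the two branches (adding null for a null token, the token text otherwise) is an assumption, as is the specific validation rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class NullDocsSketch {
    // Simplified stand-in for parser token kinds; not the XContentParser API.
    enum Token { VALUE_STRING, VALUE_NULL }

    // Hypothetical token: a kind plus its text (text is unused for VALUE_NULL).
    static final class Tok {
        final Token kind;
        final String text;
        Tok(Token kind, String text) { this.kind = kind; this.text = text; }
    }

    // Assumed behavior: a null token becomes a null entry instead of an error,
    // so the position of every other entry is preserved.
    static List<String> collectDocs(List<Tok> tokens) {
        List<String> docs = new ArrayList<>();
        for (Tok t : tokens) {
            if (t == null || t.kind == Token.VALUE_NULL) {
                docs.add(null);
            } else {
                docs.add(t.text);
            }
        }
        return docs;
    }

    // Hypothetical model-side check: reject input in which every slot is null
    // (an empty list is rejected as well, since allMatch is true on no elements).
    static void requireAtLeastOneDoc(List<String> docs) {
        if (docs.stream().allMatch(Objects::isNull)) {
            throw new IllegalArgumentException("all text docs are null");
        }
    }

    public static void main(String[] args) {
        List<String> docs = collectDocs(Arrays.asList(
            new Tok(Token.VALUE_NULL, null),
            new Tok(Token.VALUE_STRING, "image-bytes")));
        requireAtLeastOneDoc(docs); // passes: one slot is non-null
        System.out.println(docs);   // prints [null, image-bytes]
    }
}
```

Under this split, the parsing layer stays permissive and each model decides whether an all-null input such as [null, null] is acceptable.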

@codecov

codecov bot commented Sep 27, 2023

Codecov Report

Merging #1401 (bc86c39) into main (358354c) will decrease coverage by 1.26%.
Report is 6 commits behind head on main.
The diff coverage is 57.98%.

@@             Coverage Diff              @@
##               main    #1401      +/-   ##
============================================
- Coverage     78.82%   77.56%   -1.26%     
- Complexity     2145     2178      +33     
============================================
  Files           168      173       +5     
  Lines          8755     8977     +222     
  Branches        878      889      +11     
============================================
+ Hits           6901     6963      +62     
- Misses         1455     1611     +156     
- Partials        399      403       +4     
Flag Coverage Δ
ml-commons 77.56% <57.98%> (-1.26%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...rg/opensearch/ml/client/MachineLearningClient.java 100.00% <100.00%> (ø)
...earch/ml/engine/algorithms/TextEmbeddingModel.java 100.00% <100.00%> (ø)
...thms/sparse_encoding/SparseEncodingTranslator.java 100.00% <100.00%> (ø)
...rse_encoding/TextEmbeddingSparseEncodingModel.java 100.00% <100.00%> (ø)
...ing/HuggingfaceTextEmbeddingServingTranslator.java 100.00% <ø> (ø)
...NNXSentenceTransformerTextEmbeddingTranslator.java 67.04% <ø> (ø)
...ng/SentenceTransformerTextEmbeddingTranslator.java 100.00% <100.00%> (+2.38%) ⬆️
...rithms/text_embedding/TextEmbeddingDenseModel.java 91.30% <100.00%> (ø)
...va/org/opensearch/ml/engine/utils/ScriptUtils.java 90.00% <100.00%> (+13.52%) ⬆️
...tion/prediction/TransportPredictionTaskAction.java 78.26% <100.00%> (+0.48%) ⬆️
... and 10 more

... and 2 files with indirect coverage changes

@ylwu-amzn ylwu-amzn temporarily deployed to ml-commons-cicd-env September 27, 2023 20:14 — with GitHub Actions Inactive
@ylwu-amzn ylwu-amzn merged commit f1dd56b into opensearch-project:main Sep 27, 2023
9 of 11 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-1401-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 f1dd56bb46554c6c0469c5c0222cf618ef1031b3
# Push it to GitHub
git push --set-upstream origin backport/backport-1401-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-1401-to-2.x.

3 participants