NIFI-12831: Add PutOpenSearchVector and QueryOpenSearchVector processors #8441

Closed
wants to merge 5 commits into from

Conversation

mark-bathori
Contributor

Summary

NIFI-12831

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, such as NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 21

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

Contributor

exceptionfactory left a comment

Thanks for working on these new components @mark-bathori. This highlights the need for moving Python Processors to a separate repository, but that doesn't need to prevent this from going forward.

On a cursory review, I noted one security concern related to certificate verification. We should not support disabling certificate verification as it provides a fundamental security check for TLS communication.

Comment on lines 71 to 78
VERIFY_CERTIFICATES = PropertyDescriptor(
    name="Verify Certificates",
    description="The password to use for authenticating to OpenSearch server",
    allowable_values=["true", "false"],
    default_value="false",
    required=False,
    validators=[StandardValidators.NON_EMPTY_VALIDATOR]
)
Contributor

In keeping with practices in other Processors, we should not support disabling certificate verification.

Contributor Author

Thanks for the comment @exceptionfactory, I'll remove this property.

Contributor

Thanks!
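
To illustrate the point raised in this thread, here is a minimal sketch of a client-side TLS setup that offers no switch for turning verification off. It uses only Python's standard ssl module and is not taken from the pull request; the function name build_tls_context and the ca_file parameter are hypothetical, introduced only for this example.

```python
import ssl
from typing import Optional


def build_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a client TLS context that always verifies the server.

    ca_file is a hypothetical optional path to a PEM bundle for a private CA.
    When it is omitted, the system trust store is used.
    """
    context = ssl.create_default_context(cafile=ca_file)
    # create_default_context() already enables both checks; setting them
    # explicitly documents the intent and leaves no property to disable them.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context
```

With this shape, a deployment that uses a private certificate authority supplies its CA bundle instead of disabling verification.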

ENGINE_VALUES = dict([NMSLIB, FAISS, LUCENE])

# Space types
L2 = ("L2 (Euclidean distance)", "l2")
Contributor

The space types (L2, L1, LINF, COSINESIMIL) appear to be the same in PutOpenSearchVector.py and QueryOpenSearchVector.py; they can be extracted to OpenSearchVectorUtils.py.
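
A rough sketch of what the extracted constants might look like, following the tuple-plus-dict pattern visible in the diff. Only the L2 tuple is copied from the pull request; the display names for L1, LINF, and COSINESIMIL and the name SPACE_TYPE_VALUES are assumptions made for illustration.

```python
# Hypothetical shared section of OpenSearchVectorUtils.py.
# Each tuple is (display name shown in the UI, value sent to OpenSearch).

L2 = ("L2 (Euclidean distance)", "l2")  # copied from the diff
L1 = ("L1 (Manhattan distance)", "l1")  # display name assumed
LINF = ("L-infinity (chessboard) distance", "linf")  # display name assumed
COSINESIMIL = ("Cosine similarity", "cosinesimil")  # display name assumed

# Same construction as ENGINE_VALUES in the diff: a list of pairs turned
# into a lookup from display name to OpenSearch value.
SPACE_TYPE_VALUES = dict([L2, L1, LINF, COSINESIMIL])
```

Both processors could then import the constants from the shared module rather than declaring their own copies.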

)
VECTOR_FIELD = PropertyDescriptor(
name="Vector Field Name",
description="The name of Document field where the embeddings are stored. This field need to be a 'knn_vector' typed field.",
Contributor

Please use lowercase "document" here as well, as is done in the other descriptions.
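
For context on the 'knn_vector' requirement mentioned in the property description, the sketch below builds an index body of the general shape used by the OpenSearch k-NN plugin. It is not code from the pull request: the helper name build_knn_index_body and its parameters are hypothetical, and the "hnsw" method name is an assumption chosen for the example.

```python
def build_knn_index_body(vector_field, dimension, engine, space_type):
    """Build an index body whose vector_field is typed as 'knn_vector'.

    All parameter names are illustrative. engine and space_type take the
    OpenSearch values (for example "lucene" and "l2"), not display names.
    """
    return {
        # k-NN search has to be enabled on the index itself.
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                vector_field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",  # assumed method for this sketch
                        "engine": engine,
                        "space_type": space_type,
                    },
                }
            }
        },
    }
```

For example, build_knn_index_body("embedding", 384, "lucene", "l2") yields a body in which the "embedding" field is declared as a 384-dimensional 'knn_vector'.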

lordgamez self-requested a review March 11, 2024 12:14
Contributor

lordgamez left a comment

LGTM!

asfgit closed this in b608e5a May 2, 2024
shubhluck pushed a commit to shubhluck/nifi that referenced this pull request Jun 1, 2024

6 participants