Add Lucene 9.5 codec and make it new default #700

Merged

martin-gaievski
Member

Signed-off-by: Martin Gaievski [email protected]

Description

This PR adds a new k-NN codec that wraps the new Lucene codec shipped with Lucene 9.5 and the latest core. The existing 9.4 codec no longer works because we use it for both reads and writes, but writes are allowed only for the latest codec version. This is already blocking PRs against main in our CI.

A backport is not required because core 2.x is still on Lucene 9.4.

Check List

  • New functionality includes testing.
    • All tests pass
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@martin-gaievski martin-gaievski added the Maintenance Add support for new versions of OpenSearch/Dashboards from upstream label Jan 4, 2023
@martin-gaievski martin-gaievski marked this pull request as ready for review January 4, 2023 00:35
@martin-gaievski martin-gaievski requested a review from a team January 4, 2023 00:35
Member

@jmazanec15 jmazanec15 left a comment

Can we mention in https://github.com/opensearch-project/k-NN/blob/main/DEVELOPER_GUIDE.md#codec-versioning that it should be in sync with OpenSearch's codec?

@martin-gaievski
Member Author

> Can we mention in https://github.com/opensearch-project/k-NN/blob/main/DEVELOPER_GUIDE.md#codec-versioning that it should be in sync with OpenSearch's codec?

Yes, let me update the doc. In addition to the version sync, we moved a lot of logic from the factory to the codec version class.

@jmazanec15 jmazanec15 added the v2.5.0 'Issues and PRs related to version v2.5.0' label Jan 4, 2023
Signed-off-by: Martin Gaievski <[email protected]>
@martin-gaievski martin-gaievski removed the v2.5.0 'Issues and PRs related to version v2.5.0' label Jan 4, 2023
testMultiFieldsKnnIndex(KNN950Codec.builder().delegate(V_9_5_0.getDefaultCodecDelegate()).build());
}

public void testBuildFromModelTemplate() throws InterruptedException, ExecutionException, IOException {
Collaborator

See if you want to use @SneakyThrows to remove these exceptions from the method signature.

Member Author

Oh, nice advice, let me replace the list of exceptions with @SneakyThrows.
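
For readers unfamiliar with the suggestion: Lombok's @SneakyThrows lets a method throw checked exceptions without listing them in its signature. The plain-Java sketch below shows roughly the generic-cast trick that makes this possible, without depending on Lombok. The class and method names (`SneakyThrowSketch`, `failingIo`, `runWithoutThrowsClause`) are hypothetical and are not taken from this PR or from Lombok's implementation.

```java
import java.io.IOException;

// Plain-Java sketch of the idea behind Lombok's @SneakyThrows.
// All names here are hypothetical, not taken from the PR.
final class SneakyThrowSketch {

    private SneakyThrowSketch() {}

    // The unchecked cast is erased at runtime, so the original exception
    // object is thrown unchanged. At the call site E is bound to
    // RuntimeException, so the caller needs no throws clause.
    @SuppressWarnings("unchecked")
    static <E extends Throwable> void sneakyThrow(Throwable t) throws E {
        throw (E) t;
    }

    // Stand-in for test logic that raises a checked exception.
    static void failingIo() throws IOException {
        throw new IOException("disk unavailable");
    }

    // No "throws IOException" in the signature, yet the IOException
    // still propagates to the caller as-is.
    static void runWithoutThrowsClause() {
        try {
            failingIo();
        } catch (IOException e) {
            SneakyThrowSketch.<RuntimeException>sneakyThrow(e);
        }
    }

    public static void main(String[] args) {
        try {
            runWithoutThrowsClause();
        } catch (Exception e) {
            System.out.println(e.getClass().getSimpleName() + ": " + e.getMessage());
        }
    }
}
```

The trade-off is that callers are no longer warned by the compiler about the checked exception, which is usually acceptable in test methods like the one under review.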

@@ -74,9 +77,24 @@ public enum KNNCodecVersion {
.knnVectorsFormat(new KNN940PerFieldKnnVectorsFormat(Optional.ofNullable(mapperService)))
.build(),
KNN940Codec::new
),

V_9_5_0(
Collaborator

[Question]: Are the other enum members, like V_9_4_0, used anywhere? If not, let's remove them.

Member Author

They are referenced from codec classes, like here. So if we keep the older codecs, we also need those other enum members.

Member Author

For clarification: the enum only has codec versions starting from 9.1 (9.1.0, 9.2.0, 9.4.0). There are no references to older codecs, so it's not a problem if we decide to drop them in the future to stay consistent with what core OpenSearch supports.
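
To illustrate why older enum members stay around while only the newest one is the default, here is a simplified, self-contained sketch. It assumes the latest version is the last declared constant and is the one used for writes, while older members remain so segments written under an earlier codec name can still be resolved for reads. `SketchCodec`, `SketchCodecVersion`, `current()`, and `forCodecName()` are hypothetical stand-ins; the real KNNCodecVersion enum holds Lucene codec delegates and suppliers instead of plain strings.

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical stand-in for a codec; the real classes extend Lucene's Codec.
interface SketchCodec {
    String name();
}

// Simplified sketch of a codec-version enum. Constant names mirror the PR,
// but the field and helper methods are assumptions made for illustration.
enum SketchCodecVersion {
    V_9_4_0("KNN940Codec"),
    V_9_5_0("KNN950Codec");

    private final String codecName;

    SketchCodecVersion(String codecName) {
        this.codecName = codecName;
    }

    String codecName() {
        return codecName;
    }

    // Build a codec instance for this version.
    SketchCodec newCodec() {
        return () -> codecName;
    }

    // Assumption: the last declared constant is the latest version,
    // and it is the only one that should be used for writing.
    static SketchCodecVersion current() {
        SketchCodecVersion[] all = values();
        return all[all.length - 1];
    }

    // Older members stay in the enum so that a codec name recorded in an
    // existing segment can still be mapped back to a version for reading.
    static Optional<SketchCodecVersion> forCodecName(String name) {
        return Arrays.stream(values())
            .filter(v -> v.codecName.equals(name))
            .findFirst();
    }
}

final class SketchCodecVersionDemo {
    public static void main(String[] args) {
        System.out.println("write with: " + SketchCodecVersion.current().newCodec().name());
        System.out.println("can read KNN940Codec: "
            + SketchCodecVersion.forCodecName("KNN940Codec").isPresent());
    }
}
```

Under this shape, dropping an enum member is only safe once no codec class (and no segment still on disk) refers to it, which matches the point made in the reply above.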

new KNN80DocValuesFormat(delegate.docValuesFormat()),
new KNN80CompoundFormat(delegate.compoundFormat())
),
(userCodec, mapperService) -> KNN940Codec.builder()
Member

Is this supposed to be 95? KNN940Codec

Member Author

Oh yes, my bad, correcting it now.
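
A copy-paste slip like this one (a 9.5 entry still wired to the 9.4 codec) can be caught mechanically. The sketch below is one possible check, assuming a naming convention in which version label V_X_Y_Z maps to codec name KNNXYZCodec. `CodecWiringCheck`, `expectedCodecName`, and `findMismatches` are hypothetical helpers and do not exist in the k-NN repository; the map of suppliers stands in for the enum's codec suppliers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical consistency check: every version label should be wired to a
// codec whose name matches the label. Not part of the k-NN code base.
final class CodecWiringCheck {

    private CodecWiringCheck() {}

    // Assumed convention: "V_9_5_0" -> "KNN950Codec".
    static String expectedCodecName(String versionLabel) {
        String digits = versionLabel.replace("V_", "").replace("_", "");
        return "KNN" + digits + "Codec";
    }

    // Returns one human-readable line per version whose supplier produces
    // a codec name different from the expected one.
    static List<String> findMismatches(Map<String, Supplier<String>> wiring) {
        List<String> mismatches = new ArrayList<>();
        for (Map.Entry<String, Supplier<String>> entry : wiring.entrySet()) {
            String expected = expectedCodecName(entry.getKey());
            String actual = entry.getValue().get();
            if (!expected.equals(actual)) {
                mismatches.add(entry.getKey() + " -> " + actual + " (expected " + expected + ")");
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        Map<String, Supplier<String>> wiring = new LinkedHashMap<>();
        wiring.put("V_9_4_0", () -> "KNN940Codec");
        // Reproduces the slip flagged in review: the 9.5 entry builds the 9.4 codec.
        wiring.put("V_9_5_0", () -> "KNN940Codec");
        findMismatches(wiring).forEach(System.out::println);
    }
}
```

In a real test the map would be replaced by iterating over the enum's members and inspecting the codec each supplier builds.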

@martin-gaievski martin-gaievski merged commit 4f9a8b2 into opensearch-project:main Jan 4, 2023
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-700-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 4f9a8b296241c316bbc2d7dbfe67f20a59893302
# Push it to GitHub
git push --set-upstream origin backport/backport-700-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-700-to-2.x.

naveentatikonda pushed a commit to naveentatikonda/k-NN that referenced this pull request Feb 2, 2023
* Add Lucene 9.5 codec and make it new default

Signed-off-by: Martin Gaievski <[email protected]>
@naveentatikonda naveentatikonda mentioned this pull request Feb 2, 2023
2 tasks
naveentatikonda added a commit that referenced this pull request Feb 3, 2023
* Update lucene94 package

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add Lucene 9.5 codec and make it new default (#700)

* Add Lucene 9.5 codec and make it new default

Signed-off-by: Martin Gaievski <[email protected]>

* Update tests for backwards codecs (#710)

Updates tests for backwards codecs to prevent backwards codecs from
writing with a read only codec.

Signed-off-by: John Mazanec <[email protected]>

---------

Signed-off-by: Naveen Tatikonda <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Co-authored-by: Martin Gaievski <[email protected]>
Co-authored-by: John Mazanec <[email protected]>
Labels
backport 2.x Maintenance Add support for new versions of OpenSearch/Dashboards from upstream
4 participants