Expanded backwards compatibility coverage proposal #360

jmazanec15 · 2022-04-11T17:57:53Z

Overview

We have had several bugs recently related to issues with backwards compatibility (#353 , #308 , #255). So, I think we need to improve our bwc coverage.

We already have the framework setup and some tests working.

As part of expanding our coverage, we should split tests up between

full restart tests
rolling upgrade tests
mixed cluster tests

This is what OpenSearch does and will allow us to better organize our coverage.

Below are the areas I believe we should add coverage around.

Full Restart Tests

Full restart tests are concerned with index compatibility between the old version of the cluster and new version.

Mapping Checks

There are 3 kinds of mappings we support for knn_vector:

legacy mapping
method mapping
model mapping

For details, refer here.

For each type, we need to ensure that the new cluster can load the mapping with:

default parameters
user provided parameters

Indexing Checks

Ensure that for an index containing knn_vector field created in the old version:

New documents can be added to this index from the new cluster
Segments from the old index and new index can be successfully merged together.

Search Checks

Ensure that an index created on old cluster can be searched from new cluster for:

Painless scripting
Score script
ANN search

Warmup

Ensure that an index created on the old cluster can be warmed up successfully with the new cluster

Training

Ensure that a model that was created in the old cluster can be used to create an index in the new cluster and documents can be added to the index.

Rolling Upgrade Tests

Rolling upgrades will ensure that the cluster can operate normally as each node is upgraded to the next version. Much of the upgrade functionality is covered by Full Restart Tests already, so this set of tests will cover a smaller subset of cases

Index Lifecycle

An index with a knn_vector field should be created on the old version of the cluster at the beginning of the test. Then, at each stage in the test, documents should be added to the index and the index should be searchable.

Mixed Cluster

Mixed cluster ensures that the cluster functions as expected when there are nodes from the old and new cluster.

Stats

Ensure stats API returns all stats when cluster is mixed

Training

Ensure a model can be trained when the cluster is mixed.

Warmup

Ensure an index with a knn_vector field can be warmed up successfully

Index Life cycle

Ensure indexes with a knn_vector field can be created with mixed nodes in the cluster. Ensure documents can be added to the index. Ensure the index can be searched.

Feedback

Please comment any cases I may have missed or questions/comments you may have.

The text was updated successfully, but these errors were encountered:

naveentatikonda · 2022-04-11T19:00:58Z

We can add some delete index operations in Rolling Upgrade Tests like:

Creating index in Old cluster
Deleting that index in Mixed cluster
Validating it in Upgraded Cluster

martin-gaievski · 2022-04-11T20:07:50Z

Is this split between full restart tests, rolling upgrade tests and mixed cluster tests going to be logical only or we will reflect it in the code structure/environment setup etc.? Currently everything is in one class, that scales pretty bad.

Do you have a vision of adding test process? I think now we add them after the bug has been discovered, is there something we can do here more proactively? Maybe scan issues and add missing tests as part of the release process - it can be one gate.

jmazanec15 · 2022-04-11T20:26:18Z

We can add some delete index operations in Rolling Upgrade Tests like:
1. Creating index in Old cluster

2. Deleting that index in Mixed cluster

3. Validating it in Upgraded Cluster

I think we can add a test for this, yes.

jmazanec15 · 2022-04-11T20:33:06Z

Is this split between full restart tests, rolling upgrade tests and mixed cluster tests going to be logical only or we will reflect it in the code structure/environment setup etc.? Currently everything is in one class, that scales pretty bad.

OpenSearch does it so that each test type is its own separate gradle subproject. I like this approach and think we should follow it. Additionally, I think we should create abstract test cases for each type and then separate the tests into their own classes to organize the tests.

Do you have a vision of adding test process? I think now we add them after the bug has been discovered, is there something we can do here more proactively? Maybe scan issues and add missing tests as part of the release process - it can be one gate.

Good question. I think this issue contains a lot of proactive components. I think we can keep thinking of cases as we go. In addition, I think if a PR makes a change that is not covered by BWC tests, we should require it to be added. Scanning issues makes sense - but I think this is more of a one time thing as opposed to every release.

vamshin · 2022-04-12T17:08:40Z

Mapping Checks
There are 3 kinds of mappings we support for knn_vector:
legacy mapping
method mapping
model mapping

We should I believe think about deprecating legacy mapping so we could minimize permutations on different mappings for testing.

For each of the category like full restart, rolling upgrade, mixed, we should have one test that completes the full life cycle of knn index after upgrades as mentioned below

Able to add new document to existing index coming from older version
Able to edit existing document in the upgraded index
Able to delete document in the upgraded index
Able to forcemerge after indexing X documents in upgraded index
Able to run knn search on upgraded index
Repeat above steps after creating a new index on upgraded cluster.

I would move above operations to some base class and make them run for any new category of upgrade.

We should also have some tests that create index with default values in older version and able to upgrade to new version and perform all CRUD operations similar to "Indexing Checks" section.

martin-gaievski · 2022-05-24T16:55:26Z

@jmazanec15 what the target plugin version for this issue? I set 2.1.0 as of now (need to close 2.0-ga release checklist), feel free to change it to other version if needed.

jmazanec15 · 2022-05-24T18:40:06Z

@martin-gaievski I think this is done and can be closed.

This was referenced Apr 14, 2022

Refactor BWC framework into sub-projects #362

Merged

Add Abstraction for BWC tests with a basic test #367

Merged

naveentatikonda mentioned this issue Apr 27, 2022

Add restart upgrade BWC Tests #377

Merged

2 tasks

This was referenced May 5, 2022

Add Rolling-Upgrade BWC Tests #386

Merged

Add support for multiple BWC Versions #389

Merged

martin-gaievski added the v2.1.0 label May 24, 2022

jmazanec15 closed this as completed May 24, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expanded backwards compatibility coverage proposal #360

Expanded backwards compatibility coverage proposal #360

jmazanec15 commented Apr 11, 2022

naveentatikonda commented Apr 11, 2022 •

edited

Loading

martin-gaievski commented Apr 11, 2022

jmazanec15 commented Apr 11, 2022

jmazanec15 commented Apr 11, 2022

vamshin commented Apr 12, 2022

martin-gaievski commented May 24, 2022

jmazanec15 commented May 24, 2022

Expanded backwards compatibility coverage proposal #360

Expanded backwards compatibility coverage proposal #360

Comments

jmazanec15 commented Apr 11, 2022

Overview

Full Restart Tests

Mapping Checks

Indexing Checks

Search Checks

Warmup

Training

Rolling Upgrade Tests

Index Lifecycle

Mixed Cluster

Stats

Training

Warmup

Index Life cycle

Feedback

naveentatikonda commented Apr 11, 2022 • edited Loading

martin-gaievski commented Apr 11, 2022

jmazanec15 commented Apr 11, 2022

jmazanec15 commented Apr 11, 2022

vamshin commented Apr 12, 2022

martin-gaievski commented May 24, 2022

jmazanec15 commented May 24, 2022

naveentatikonda commented Apr 11, 2022 •

edited

Loading