Update archival indices logic to support ES 7 indices #116565

Open
wants to merge 31 commits into
base: main
Changes from 4 commits
Commits
31 commits
95ba9c4
WIP
cbuescher Nov 6, 2024
d2817fa
Trying to add new qa test project for 7x archival indices
cbuescher Nov 21, 2024
1bd696d
iter
cbuescher Nov 22, 2024
85b61cd
Add basic search test
cbuescher Nov 22, 2024
93b95fb
Rework OldMappingIT
cbuescher Nov 25, 2024
6b8177c
iter
cbuescher Nov 25, 2024
b27ab02
Merge branch 'main' into add-bwcLucene87Codec
cbuescher Nov 25, 2024
33f3f36
Use system property for version
cbuescher Nov 25, 2024
dce64b2
Add BWCLucene86Codec
cbuescher Nov 25, 2024
8aae226
Merge branch 'main' into add-bwcLucene87Codec
cbuescher Nov 25, 2024
2f3b6d0
Fix codec name
cbuescher Nov 25, 2024
feaee28
Change version to 7.9.0
cbuescher Nov 26, 2024
becea6b
Merge branch 'main' into add-bwcLucene87Codec
cbuescher Nov 26, 2024
41d69c7
Add modified version of OldRepositoryAccessIT
cbuescher Nov 26, 2024
8b3acb8
Adding back test for source_only repo
cbuescher Nov 26, 2024
55ed6d0
Add looping over versions
cbuescher Nov 26, 2024
1351e2e
Change looping over version
cbuescher Nov 27, 2024
4d35161
Fix cluster version checks
cbuescher Nov 27, 2024
d4be0ac
No snapshot cache for old cluster
cbuescher Nov 27, 2024
9919159
Add DocValueOnlyFieldsIT yaml rest test
cbuescher Nov 27, 2024
2406751
Fix wiring of yaml specs to test tasks
breskeby Nov 27, 2024
85d01c2
Make DocValueOnlyFieldsIT work for V7x
cbuescher Nov 27, 2024
f17527f
Using 7.9.0 instead of 7.16
cbuescher Nov 27, 2024
6cabe33
Merge branch 'main' into add-bwcLucene87Codec
cbuescher Nov 27, 2024
da3db30
Cleanups
cbuescher Nov 27, 2024
4a46d80
Merge branch 'main' into add-bwcLucene87Codec
cbuescher Nov 28, 2024
01ba741
Add _field_names disabling to new tests
cbuescher Nov 28, 2024
3d257dd
Merge branch 'main' into add-bwcLucene87Codec
cbuescher Nov 29, 2024
c7693b3
Add restart cluster test
cbuescher Nov 29, 2024
3bf231d
Merge branch 'main' into add-bwcLucene87Codec
cbuescher Dec 2, 2024
050d195
Pulling in test changes from 117649
cbuescher Dec 2, 2024
@@ -7,6 +7,7 @@

package org.elasticsearch.xpack.lucene.bwc.codecs;

import org.apache.lucene.backward_codecs.lucene87.Lucene87Codec;
import org.apache.lucene.codecs.Codec;
import org.apache.lucene.codecs.FieldInfosFormat;
import org.apache.lucene.codecs.FieldsConsumer;
@@ -26,6 +27,7 @@
import org.apache.lucene.index.Terms;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.elasticsearch.xpack.lucene.bwc.codecs.lucene87.BWCLucene87Codec;

import java.io.IOException;
import java.util.ArrayList;
@@ -118,7 +120,9 @@ private static FieldInfos filterFields(FieldInfos fieldInfos) {
}

public static SegmentInfo wrap(SegmentInfo segmentInfo) {
- final Codec codec = segmentInfo.getCodec();
+ // special handling for Lucene87Codec (which is currently bundled with Lucene)
+ // Use BWCLucene87Codec instead as that one extends BWCCodec (similar to all other older codecs)
+ final Codec codec = segmentInfo.getCodec() instanceof Lucene87Codec ? new BWCLucene87Codec() : segmentInfo.getCodec();
final SegmentInfo segmentInfo1 = new SegmentInfo(
segmentInfo.dir,
// Use Version.LATEST instead of original version, otherwise SegmentCommitInfo will bark when processing (N-1 limitation)
@@ -0,0 +1,153 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

package org.elasticsearch.xpack.lucene.bwc.codecs.lucene87;

import org.apache.lucene.backward_codecs.lucene50.Lucene50CompoundFormat;
import org.apache.lucene.backward_codecs.lucene50.Lucene50LiveDocsFormat;
import org.apache.lucene.backward_codecs.lucene50.Lucene50TermVectorsFormat;
import org.apache.lucene.backward_codecs.lucene60.Lucene60FieldInfosFormat;
import org.apache.lucene.backward_codecs.lucene80.Lucene80DocValuesFormat;
import org.apache.lucene.backward_codecs.lucene80.Lucene80NormsFormat;
import org.apache.lucene.backward_codecs.lucene84.Lucene84PostingsFormat;
import org.apache.lucene.backward_codecs.lucene86.Lucene86PointsFormat;
import org.apache.lucene.backward_codecs.lucene86.Lucene86SegmentInfoFormat;
import org.apache.lucene.backward_codecs.lucene87.Lucene87StoredFieldsFormat;
import org.apache.lucene.codecs.CompoundFormat;
import org.apache.lucene.codecs.DocValuesFormat;
import org.apache.lucene.codecs.FieldInfosFormat;
import org.apache.lucene.codecs.KnnVectorsFormat;
import org.apache.lucene.codecs.LiveDocsFormat;
import org.apache.lucene.codecs.NormsFormat;
import org.apache.lucene.codecs.PointsFormat;
import org.apache.lucene.codecs.PostingsFormat;
import org.apache.lucene.codecs.SegmentInfoFormat;
import org.apache.lucene.codecs.StoredFieldsFormat;
import org.apache.lucene.codecs.TermVectorsFormat;
import org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat;
import org.apache.lucene.codecs.perfield.PerFieldPostingsFormat;
import org.elasticsearch.xpack.lucene.bwc.codecs.BWCCodec;

public class BWCLucene87Codec extends BWCCodec {

private final TermVectorsFormat vectorsFormat = new Lucene50TermVectorsFormat();
private final FieldInfosFormat fieldInfosFormat = wrap(new Lucene60FieldInfosFormat());
private final SegmentInfoFormat segmentInfosFormat = wrap(new Lucene86SegmentInfoFormat());
private final LiveDocsFormat liveDocsFormat = new Lucene50LiveDocsFormat();
private final CompoundFormat compoundFormat = new Lucene50CompoundFormat();
private final PointsFormat pointsFormat = new Lucene86PointsFormat();
private final PostingsFormat defaultFormat;

private final PostingsFormat postingsFormat = new PerFieldPostingsFormat() {
@Override
public PostingsFormat getPostingsFormatForField(String field) {
return BWCLucene87Codec.this.getPostingsFormatForField(field);
}
};

private final DocValuesFormat docValuesFormat = new PerFieldDocValuesFormat() {
@Override
public DocValuesFormat getDocValuesFormatForField(String field) {
return BWCLucene87Codec.this.getDocValuesFormatForField(field);
}
};

private final StoredFieldsFormat storedFieldsFormat;

/** Instantiates a new codec. */
public BWCLucene87Codec() {
super("BWCLucene87Codec");
this.storedFieldsFormat = new Lucene87StoredFieldsFormat(Lucene87StoredFieldsFormat.Mode.BEST_COMPRESSION);
this.defaultFormat = new Lucene84PostingsFormat();
this.defaultDVFormat = new Lucene80DocValuesFormat(Lucene80DocValuesFormat.Mode.BEST_COMPRESSION);
}

@Override
public StoredFieldsFormat storedFieldsFormat() {
return storedFieldsFormat;
}

@Override
public TermVectorsFormat termVectorsFormat() {
return vectorsFormat;
}

@Override
public PostingsFormat postingsFormat() {
return postingsFormat;
}

@Override
public final FieldInfosFormat fieldInfosFormat() {
return fieldInfosFormat;
}

@Override
public SegmentInfoFormat segmentInfoFormat() {
return segmentInfosFormat;
}

@Override
public final LiveDocsFormat liveDocsFormat() {
return liveDocsFormat;
}

@Override
public CompoundFormat compoundFormat() {
return compoundFormat;
}

@Override
public PointsFormat pointsFormat() {
return pointsFormat;
}

@Override
public final KnnVectorsFormat knnVectorsFormat() {
return KnnVectorsFormat.EMPTY;
}

/**
* Returns the postings format that should be used for writing new segments of <code>field</code>.
*
* <p>The default implementation always returns "Lucene84".
*
* <p><b>WARNING:</b> if you subclass, you are responsible for index backwards compatibility:
* future version of Lucene are only guaranteed to be able to read the default implementation.
*/
public PostingsFormat getPostingsFormatForField(String field) {
return defaultFormat;
}

/**
* Returns the docvalues format that should be used for writing new segments of <code>field</code>
* .
*
* <p>The default implementation always returns "Lucene80".
*
* <p><b>WARNING:</b> if you subclass, you are responsible for index backwards compatibility:
* future version of Lucene are only guaranteed to be able to read the default implementation.
*/
public DocValuesFormat getDocValuesFormatForField(String field) {
return defaultDVFormat;
}

@Override
public final DocValuesFormat docValuesFormat() {
return docValuesFormat;
}

private final DocValuesFormat defaultDVFormat;

private final NormsFormat normsFormat = new Lucene80NormsFormat();

@Override
public NormsFormat normsFormat() {
return normsFormat;
}

}
@@ -5,6 +5,7 @@
# 2.0.
#

org.elasticsearch.xpack.lucene.bwc.codecs.lucene87.BWCLucene87Codec
org.elasticsearch.xpack.lucene.bwc.codecs.lucene70.BWCLucene70Codec
org.elasticsearch.xpack.lucene.bwc.codecs.lucene70.Lucene70Codec
org.elasticsearch.xpack.lucene.bwc.codecs.lucene62.Lucene62Codec
9 changes: 9 additions & 0 deletions x-pack/qa/repository-old-versions-7x/build.gradle
@@ -0,0 +1,9 @@
import org.elasticsearch.gradle.internal.test.RestIntegTestTask
import org.elasticsearch.gradle.Version

apply plugin: 'elasticsearch.internal-java-rest-test'

tasks.named("javaRestTest").configure {
usesDefaultDistribution()
usesBwcDistribution(Version.fromString("7.17.25"))
Member Author

@breskeby that looks simple enough, but going forward I would need several "old" clusters (e.g. 7.0.0, and maybe some other 7.x minor versions where the Lucene codecs changed) for the snapshot data setup, and the tests would then need to run against a clean "current" cluster.

I'm wondering if I should use something like

buildParams.bwcVersions.withWireCompatible { bwcVersion, baseName ->
where the bwcVersion probably comes from the elasticsearch.bwc-test plugin, or whether we can supply a "fixed" list of the versions we want to run as "old" versions here. Would I need to register individual tasks for each "old" version?

Also, does this definition here already "start" the clusters or is that happening later with ElasticsearchCluster.local()[...].build() in the test?

Contributor

I'd go ahead with a fixed list as we did for other "old elasticsearch" tests.
I think the versions you wanna use are listed in that task. That probably needs some tweaking of the RestTestBasePlugin to be supported.

This definition basically only downloads or builds the distribution you want to use in your tests. The clusters are only started in ElasticsearchCluster.local()[...].build(). Given that, I think we should have only one test task here and do the parameterisation of the versions we test in the test code, e.g. with a parameterised JUnit test per version under test. The lifecycle of the "current" cluster for that would then also live in the test code.

One problem we might see (I haven't completely checked) is that those older clusters only support a limited range of Java versions. We probably need to ensure those clusters are wired to the correct Java version when starting up.
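
A minimal sketch of the "fixed list, parameterised in test code" idea, written in plain Java so it runs without any test framework. The class name, the version list, and the loop helper are hypothetical and only illustrate the shape. In a real suite the loop body would be where the old cluster is started and the snapshot data is set up, and a parameterised JUnit test would likely replace the explicit loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: run one test body against a fixed list of "old" versions. */
public class OldVersionLoop {

    // Assumed list for illustration; the versions actually needed are not settled in this thread.
    static final List<String> OLD_VERSIONS = List.of("7.0.0", "7.9.0", "7.17.25");

    /** Runs the body once per version and returns the versions that were exercised, in order. */
    static List<String> forEachOldVersion(List<String> versions, Consumer<String> body) {
        List<String> exercised = new ArrayList<>();
        for (String version : versions) {
            // Placeholder for: start old cluster, create snapshot,
            // then restore and search it on the current cluster.
            body.accept(version);
            exercised.add(version);
        }
        return exercised;
    }

    public static void main(String[] args) {
        forEachOldVersion(OLD_VERSIONS, v -> System.out.println("testing archive index from " + v));
    }
}
```

The trade-off is that a single task always walks every version unless the list can be narrowed from outside.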

Member Author

> Given that. I think we should only have one test task here and do the parameterisation of the versions we test in the test code. e.g. having a parameterised junit test per version under test.

I'm not sure I understand. Do you propose to start all "old version" clusters consecutively from a single test suite? That would make it hard to run, from Gradle, only the tests for e.g. 7.0.0 vs. CURRENT, or am I misunderstanding something? Or do you mean we pass the "old version" used for the data setup to the test, e.g. via a system property, and loop over it somewhere in the build.gradle definition? I would like to avoid having to run all old-version tests at once every time, while keeping the flexibility to use more than one test suite with basically the same fixed setup (ideally the old data setup is done just once).
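
A minimal sketch of the system-property option mentioned above, assuming a hypothetical property name `tests.old_cluster_version` (the name actually used in the PR is not shown in this excerpt). The test reads the old version from the property, falls back to a default, and parses it so version checks can branch on the major version.

```java
/** Hypothetical sketch: select the "old" cluster version via a system property. */
public class OldVersionProperty {

    // Assumed property name, for illustration only.
    static final String PROPERTY = "tests.old_cluster_version";

    /** Returns the configured old version, or the fallback if the property is unset. */
    static String oldVersion(String fallback) {
        return System.getProperty(PROPERTY, fallback);
    }

    /** Parses "major.minor.revision" into three ints; rejects anything else. */
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.revision but got: " + version);
        }
        int[] result = new int[3];
        for (int i = 0; i < 3; i++) {
            result[i] = Integer.parseInt(parts[i]);
        }
        return result;
    }

    /** True if the version string belongs to the 7.x series. */
    static boolean isSevenX(String version) {
        return parse(version)[0] == 7;
    }

    public static void main(String[] args) {
        String version = oldVersion("7.9.0");
        System.out.println("old cluster version: " + version + ", 7.x: " + isSevenX(version));
    }
}
```

On the Gradle side, the value could be forwarded to the test JVM with the test task's `systemProperty` setting, which would allow running a single old version on its own.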

}