Failure in WatcherRestartIT.testWatcherRestart #69918

Closed
danhermann opened this issue Mar 3, 2021 · 14 comments
Assignees
Labels
:Data Management/Watcher Team:Data Management Meta label for data/management team >test-failure Triaged test failures from CI

Comments

@danhermann
Contributor

danhermann commented Mar 3, 2021

Build scan: https://gradle-enterprise.elastic.co/s/pgbuawfavusl2

Repro line: ./gradlew ':x-pack:qa:rolling-upgrade:v7.9.1#oneThirdUpgradedTest' -Dtests.class="org.elasticsearch.upgrades.WatcherRestartIT" -Dtests.method="testWatcherRestart" -Dtests.seed=3DE603A8143C17BA -Dtests.security.manager=true -Dtests.bwc=true -Dtests.locale=fr-CA -Dtests.timezone=Africa/Tripoli -Druntime.java=8

Reproduces locally?: No

Applicable branches: 7.x, 7.11, 7.12

Failure history: Failing pretty frequently as of the morning of 3/3/21

Failure excerpt:


java.lang.AssertionError:
Expected: not a string containing "\"watcher_state\":\"stopped\""
     but: was "{\"_nodes\":{\"total\":3,\"successful\":3,\"failed\":0},\"cluster_name\":\"v7.9.1\",\"manually_stopped\":false,\"stats\":[{\"node_id\":\"qEEjrtXgSVqMj7QTR-ec6w\",\"watcher_state\":\"started\",\"watch_count\":0,\"execution_thread_pool\":{\"queue_size\":0,\"max_size\":0}},{\"node_id\":\"zxGWpG4yRSO_e2B8k2Xyqw\",\"watcher_state\":\"stopped\",\"watch_count\":0,\"execution_thread_pool\":{\"queue_size\":0,\"max_size\":0}},{\"node_id\":\"orv9jd8cTkOqJNipcPYfNg\",\"watcher_state\":\"started\",\"watch_count\":0,\"execution_thread_pool\":{\"queue_size\":0,\"max_size\":1}}]}"
    at __randomizedtesting.SeedInfo.seed([3DE603A8143C17BA:A8BE0D275F20CEE3]:0)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
    at org.junit.Assert.assertThat(Assert.java:956)
    at org.junit.Assert.assertThat(Assert.java:923)
    at org.elasticsearch.upgrades.WatcherRestartIT.lambda$ensureWatcherStarted$2(WatcherRestartIT.java:153)
    at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1014)
    at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:987)
    at org.elasticsearch.upgrades.WatcherRestartIT.ensureWatcherStarted(WatcherRestartIT.java:147)
    at org.elasticsearch.upgrades.WatcherRestartIT.testWatcherRestart(WatcherRestartIT.java:40)
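
For readers unfamiliar with the pattern in the trace above, the failing check runs inside a retry loop (`assertBusy`) and trips when any node still reports `"watcher_state":"stopped"` once the time budget runs out. The sketch below is a simplified stand-in, not the real `ESTestCase.assertBusy`; the class name `BusyAssert`, the method signatures, and the backoff values are assumptions made for illustration.

```java
// Hypothetical stand-in for a retrying assertion helper; names and
// backoff values are illustrative, not taken from the Elasticsearch code base.
final class BusyAssert {

    // Re-runs the assertion with exponential backoff until it passes
    // or maxWaitMillis has elapsed, then rethrows the last failure.
    static void assertBusy(Runnable assertion, long maxWaitMillis) {
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        long sleepMillis = 1;
        while (true) {
            try {
                assertion.run();
                return; // the assertion passed
            } catch (AssertionError e) {
                if (System.currentTimeMillis() >= deadline) {
                    throw e; // out of time: surface the last failure
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting", ie);
            }
            sleepMillis = Math.min(sleepMillis * 2, 500);
        }
    }

    // Mirrors the check in the failure excerpt: no node may report "stopped".
    static void assertNoNodeStopped(String statsJson) {
        if (statsJson.contains("\"watcher_state\":\"stopped\"")) {
            throw new AssertionError("a node still reports watcher_state=stopped: " + statsJson);
        }
    }
}
```

In the failure above, one of the three nodes never left the `stopped` state, so a loop of this kind would keep retrying until the deadline and then rethrow the last `AssertionError`.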



@danhermann danhermann added >test-failure Triaged test failures from CI :Data Management/Watcher labels Mar 3, 2021
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Mar 3, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@mark-vieira
Contributor

If I recall correctly, the suspicion here was that this failing Watcher test was causing the other upgrade tests to fail as well. It seems we are still having failures in 7.12 BWC tests against 7.9 clusters even with the Watcher test muted.

https://gradle-enterprise.elastic.co/s/ucgpf5dtdipna

@danhermann
Contributor Author

danhermann commented Mar 4, 2021

> If I recall correctly, the suspicion here was that this failing Watcher test was causing the other upgrade tests to fail as well. It seems we are still having failures in 7.12 BWC tests against 7.9 clusters even with the Watcher test muted.
>
> https://gradle-enterprise.elastic.co/s/ucgpf5dtdipna

Yes, those test failures now appear independent of the watcher test failure.

@mark-vieira
Contributor

There are a number of tests failing with this error, though, so it might still be something generally watcher-esque.

java.lang.AssertionError: Failure in test setup: Failed to initialize at least 3 watcher nodes

This seems to be busted in both 7.x and 7.12 branches and specific to the BWC tests against the 7.9.x series.

@martijnvg
Member

This is watcher related. I will mute these assertions in the waitForWatcher() method for now.

A fix around when watcher templates are updated was pushed yesterday.
At first I somehow wasn't able to reproduce the failure locally. Not sure why...
But now I can, and I found that the fix that was pushed needs correcting.
In the meantime we should mute these watcher assertions.

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Mar 4, 2021
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Mar 4, 2021
* The WatcherIndexTemplateRegistry moved from legacy templates to composable index templates in version 7.10.0, not 7.9.0.
* The WatcherIndexTemplateRegistry#validate(...) method should only care whether a template for watcher history indices exists, not whether that template is a composable index template or a legacy template. The template type shouldn't matter when determining whether watcher can be started on a node. The content of the templates hasn't changed in a breaking manner since version 6.8.0.

Closes elastic#69918
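
The second point of that commit message can be illustrated with a minimal sketch. The class `WatcherTemplateCheck`, the `validate` signature, and the `.watch-history` name prefix are assumptions made for illustration; this is not the actual `WatcherIndexTemplateRegistry` code.

```java
import java.util.Set;

// Hypothetical sketch: the check only asks whether a watcher history template
// exists, not which kind of template (legacy or composable) it is.
final class WatcherTemplateCheck {

    // Assumed name prefix, for illustration only.
    static final String HISTORY_TEMPLATE_PREFIX = ".watch-history";

    static boolean validate(Set<String> legacyTemplateNames, Set<String> composableTemplateNames) {
        return hasHistoryTemplate(legacyTemplateNames) || hasHistoryTemplate(composableTemplateNames);
    }

    private static boolean hasHistoryTemplate(Set<String> templateNames) {
        return templateNames.stream().anyMatch(name -> name.startsWith(HISTORY_TEMPLATE_PREFIX));
    }
}
```

With a check of this shape, a node in a mixed cluster would be allowed to start watcher whether the history template was installed by an older node as a legacy template or by a newer node as a composable one.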
martijnvg added a commit that referenced this issue Mar 5, 2021
* The WatcherIndexTemplateRegistry moved from legacy templates to composable index templates in version 7.10.0, not 7.9.0.
* The WatcherIndexTemplateRegistry#validate(...) method should only care whether a template for watcher history indices exists, not whether that template is a composable index template or a legacy template. The template type shouldn't matter when determining whether watcher can be started on a node. The content of the templates hasn't changed in a breaking manner since version 6.8.0.

Should resolve #69918
@martijnvg
Member

After merging #69998, it looks like WatcherRestartIT and the after-test method UpgradeClusterClientYamlTestSuiteIT#waitForWatcher() no longer fail. However, the latest bwc ci run for the 7.x branch did encounter other test suite timeouts and a watcher-related assertion error. That assertion needs to be relaxed because it isn't wrapped in an assertBusy (not possible in yaml tests), and watcher on a specific node may have stopped due to shard relocation.

@martijnvg
Member

Observations from: https://gradle-enterprise.elastic.co/s/ciwpar45g6ev6/

  • The same change we made to WatcherIndexTemplateRegistry should also be made to DeprecationIndexingTemplateRegistry:
[2021-03-05T08:57:59,016][ERROR][o.e.x.c.t.IndexTemplateRegistry] [v7.0.1-0] error adding index template [.deprecation-indexing-settings] from [/org/elasticsearch/xpack/deprecation/deprecation-indexing-settings.json] for [deprecation]
»  org.elasticsearch.transport.RemoteTransportException: [v7.0.1-1][127.0.0.1:32868][cluster:admin/component_template/put]
»  Caused by: org.elasticsearch.transport.ActionNotFoundTransportException: No handler for action [cluster:admin/component_template/put]
»      at org.elasticsearch.transport.TcpTransport.handleRequest(TcpTransport.java:1023) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
  • Serialization errors:

[2021-03-05T08:58:03,548][WARN ][o.e.c.c.PublicationTransportHandler] [v7.0.1-2] unexpected error while deserializing an incoming cluster state
»  java.lang.IllegalArgumentException: Unknown NamedWriteable [org.elasticsearch.cluster.NamedDiff][index_template]
»      at org.elasticsearch.common.io.stream.NamedWriteableRegistry.getReader(NamedWriteableRegistry.java:112) ~[elasticsearch-7.0.1.jar:7.0.1]
»      at org.elasticsearch.common.io.stream.NamedWriteableAwareStreamInput.readNamedWriteable(NamedWriteableAwareStreamInput.java:45) ~[elasticsearch-7.0.1.jar:7.0.1]
»      at org.elasticsearch.cluster.NamedDiffableValueSerializer.readDiff(NamedDiffableValueSerializer.java:56) ~[elasticsearch-7.0.1.jar:7.0.1]
»      at org.elasticsearch.cluster.NamedDiffableValueSerializer.readDiff(NamedDiffableValueSerializer.java:30) ~[elasticsearch-7.0.1.jar:7.0.1]
»      at org.elasticsearch.cluster.DiffableUtils$MapDiff.<init>(DiffableUtils.java:406) ~[elasticsearch-7.0.1.jar:7.0.1]
»      at org.elasticsearch.cluster.DiffableUtils$ImmutableOpenMapDiff.<init>(DiffableUtils.java:234) ~[elasticsearch-7.0.1.jar:7.0.1]
»      at org.elasticsearch.cluster.DiffableUtils.readImmutableOpenMapDiff(DiffableUtils.java:127) ~[elasticsearch-7.0.1.jar:7.0.1]
»      at org.elasticsearch.cluster.metadata.MetaData$MetaDataDiff.<init>(MetaData.java:855) ~[elasticsearch-7.0.1.jar:7.0.1]
»      at org.elasticsearch.cluster.metadata.MetaData.readDiffFrom(MetaData.java:801) ~[elasticsearch-7.0.1.jar:7.0.1]
»      at org.elasticsearch.cluster.ClusterState$ClusterStateDiff.<init>(ClusterState.java:878) ~[elasticsearch-7.0.1.jar:7.0.1]



[2021-03-05T08:59:18,561][ERROR][o.e.b.Bootstrap          ] [v7.0.1-2] Exception
»  org.elasticsearch.ElasticsearchException: java.io.IOException: failed to read /dev/shm/elastic+elasticsearch+7.x+bwc/BWC_VERSION/7.0.1/nodes/centos-7&&immutable/x-pack/qa/rolling-upgrade/build/testclusters/v7.0.1-2/data/nodes/0/_state/global-123.st
»      at org.elasticsearch.ExceptionsHelper.maybeThrowRuntimeAndSuppress(ExceptionsHelper.java:158) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.gateway.MetadataStateFormat.loadGeneration(MetadataStateFormat.java:403) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.gateway.MetaStateService.loadFullState(MetaStateService.java:71) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.gateway.GatewayMetaState.start(GatewayMetaState.java:136) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.node.Node.start(Node.java:836) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:324) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:419) [elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) [elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) [elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:75) [elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:116) [elasticsearch-cli-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.cli.Command.main(Command.java:79) [elasticsearch-cli-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115) [elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:81) [elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»  Caused by: java.io.IOException: failed to read /dev/shm/elastic+elasticsearch+7.x+bwc/BWC_VERSION/7.0.1/nodes/centos-7&&immutable/x-pack/qa/rolling-upgrade/build/testclusters/v7.0.1-2/data/nodes/0/_state/global-123.st
»      at org.elasticsearch.gateway.MetadataStateFormat.loadGeneration(MetadataStateFormat.java:397) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      ... 12 more
»  Caused by: org.elasticsearch.common.xcontent.XContentParseException: [-1:27411] [index_lifecycle] failed to parse field [policies]
»      at org.elasticsearch.common.xcontent.ObjectParser.parseValue(ObjectParser.java:520) ~[elasticsearch-x-content-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.common.xcontent.ObjectParser.parseSub(ObjectParser.java:530) ~[elasticsearch-x-content-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.common.xcontent.ObjectParser.parse(ObjectParser.java:313) ~[elasticsearch-x-content-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.common.xcontent.ConstructingObjectParser.parse(ConstructingObjectParser.java:160) ~[elasticsearch-x-content-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.xpack.ilm.IndexLifecycle.lambda$getNamedXContent$4(IndexLifecycle.java:236) ~[?:?]
»      at org.elasticsearch.common.xcontent.NamedXContentRegistry$Entry.lambda$new$0(NamedXContentRegistry.java:52) ~[elasticsearch-x-content-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.common.xcontent.NamedXContentRegistry.parseNamedObject(NamedXContentRegistry.java:129) ~[elasticsearch-x-content-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.common.xcontent.support.AbstractXContentParser.namedObject(AbstractXContentParser.java:398) ~[elasticsearch-x-content-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.cluster.metadata.Metadata$Builder.fromXContent(Metadata.java:1736) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.cluster.metadata.Metadata$1.fromXContent(Metadata.java:1781) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.cluster.metadata.Metadata$1.fromXContent(Metadata.java:1772) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.gateway.MetadataStateFormat.read(MetadataStateFormat.java:291) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      at org.elasticsearch.gateway.MetadataStateFormat.loadGeneration(MetadataStateFormat.java:393) ~[elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]
»      ... 12 more


@javanna
Member

javanna commented Mar 5, 2021

Here is the latest failure where the watcher nodes could not be started: https://gradle-enterprise.elastic.co/s/h6mmehbdy6yao/console-log?task=:x-pack:qa:rolling-upgrade:v7.9.2%23oneThirdUpgradedTest

I will open a separate issue for other more or less related bwc test failures.

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Mar 8, 2021
When running in a mixed cluster with nodes on both pre-7.8 and post-7.8 versions,
attempting to install composable index templates from any node can cause many
error logs indicating that the put composable index template and put component
template APIs don't exist. These APIs are always redirected to the elected master,
and as long as that node is on a pre-7.8 version, attempting to install these
templates will always fail.

Relates to elastic#69918
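
One way to avoid the failing requests described in that commit message is to skip installation until no pre-7.8 node remains in the cluster, since any such node could be the elected master. The sketch below illustrates that idea only; it is not necessarily what the referenced commit does. `ComposableTemplateGuard`, its methods, and the integer version encoding are hypothetical.

```java
import java.util.List;

// Hypothetical guard: only attempt to install composable index templates once
// every node in the cluster is on a version that has the required APIs.
final class ComposableTemplateGuard {

    // Encodes major.minor as a comparable integer, e.g. 7.8 becomes 708.
    static int version(int major, int minor) {
        return major * 100 + minor;
    }

    // Per the thread, the composable/component template APIs are missing before 7.8.
    static final int COMPOSABLE_TEMPLATES_SUPPORTED = version(7, 8);

    static boolean canInstallComposableTemplates(List<Integer> nodeVersions) {
        if (nodeVersions.isEmpty()) {
            return false; // no known nodes: nothing to install against
        }
        for (int nodeVersion : nodeVersions) {
            if (nodeVersion < COMPOSABLE_TEMPLATES_SUPPORTED) {
                return false; // an old node could be master and would reject the request
            }
        }
        return true;
    }
}
```

For a one-third-upgraded cluster such as the 7.0.1 case in the logs above, the guard would return false and the install attempt (and its error logging) would be deferred until the upgrade completes.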
@droberts195
Contributor

The java.lang.AssertionError: Failure in test setup: Failed to initialize at least 3 watcher nodes error is still occurring in BWC tests on the 7.12 branch, although it's been silenced on the newer branches. For example https://gradle-enterprise.elastic.co/s/djcc266w6xwvs is a BWC test of 7.12 against 7.9.

So please can the fixes or mutes be backported to the 7.12 branch too?

@martijnvg
Member

@droberts195 I'm on this test failure issue. I will backport #69998 to the 7.12 and 7.11 branches.

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Mar 8, 2021
* The WatcherIndexTemplateRegistry moved from legacy templates to composable index templates in version 7.10.0, not 7.9.0.
* The WatcherIndexTemplateRegistry#validate(...) method should only care whether a template for watcher history indices exists, not whether that template is a composable index template or a legacy template. The template type shouldn't matter when determining whether watcher can be started on a node. The content of the templates hasn't changed in a breaking manner since version 6.8.0.

Should resolve elastic#69918
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Mar 8, 2021
* The WatcherIndexTemplateRegistry moved from legacy templates to composable index templates in version 7.10.0, not 7.9.0.
* The WatcherIndexTemplateRegistry#validate(...) method should only care whether a template for watcher history indices exists, not whether that template is a composable index template or a legacy template. The template type shouldn't matter when determining whether watcher can be started on a node. The content of the templates hasn't changed in a breaking manner since version 6.8.0.

Should resolve elastic#69918
@martijnvg
Member

The real next issue that we have to fix is that when upgrading from some versions (7.0.1, 7.2.0, 7.4.0, 7.5.1), starting the upgraded node (on any of the 7.13, 7.12 or 7.11.3 versions) fails because, when loading the cluster state, there is somehow a rollover action in an ilm policy with no conditions set. We check for this in the constructor of the Rollover class. How this is possible is a mystery. This validation has existed since the beginning of ilm, so it shouldn't be possible to create an ilm policy with a rollover action that has no conditions set.

martijnvg added a commit that referenced this issue Mar 8, 2021
Backporting #69998 to 7.12 branch.

* The WatcherIndexTemplateRegistry moved from legacy templates to composable index templates in version 7.10.0, not 7.9.0.
* The WatcherIndexTemplateRegistry#validate(...) method should only care whether a template for watcher history indices exists, not whether that template is a composable index template or a legacy template. The template type shouldn't matter when determining whether watcher can be started on a node. The content of the templates hasn't changed in a breaking manner since version 6.8.0.

Should resolve #69918
martijnvg added a commit that referenced this issue Mar 8, 2021
Backporting #69998 to 7.11 branch.

* The WatcherIndexTemplateRegistry moved from legacy templates to composable index templates in version 7.10.0, not 7.9.0.
* The WatcherIndexTemplateRegistry#validate(...) method should only care whether a template for watcher history indices exists, not whether that template is a composable index template or a legacy template. The template type shouldn't matter when determining whether watcher can be started on a node. The content of the templates hasn't changed in a breaking manner since version 6.8.0.

Should resolve #69918
@martijnvg
Member

We think we have figured out why the rollover validation error occurs during startup. #69995 added the new maxPrimaryShardSize condition to builtin templates. In a mixed cluster, if the policy is updated while there are still nodes that don't support the maxPrimaryShardSize condition, those nodes receive a rollover instance with no conditions (node-to-node serialization omits this condition when writing to older nodes). When such a node boots after it has been upgraded, it is unable to read the cluster state it has on disk, because that state contains an ilm policy with a rollover action with no conditions set.

To address this issue, instead of serializing no conditions, the maxPrimaryShardSize value will be serialized as maxSize, so that at least the rollover action is valid. When the node joins the cluster, it will be able to parse the rollover action with the maxPrimaryShardSize condition from the new cluster state.
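
The fallback described above can be sketched as a version-gated write. This is a simplified illustration, not the actual rollover serialization code: `RolloverWireSketch`, the map-based "wire format", the integer version encoding, and the assumed version constant are all hypothetical, and other rollover conditions such as age or document count are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of choosing which rollover conditions to send to a
// receiving node, depending on whether it understands maxPrimaryShardSize.
final class RolloverWireSketch {

    // Assumed for illustration: first version that understands maxPrimaryShardSize.
    static final int MAX_PRIMARY_SHARD_SIZE_SUPPORTED = 71300;

    static Map<String, String> conditionsFor(int receiverVersion, String maxSize, String maxPrimaryShardSize) {
        Map<String, String> out = new LinkedHashMap<>();
        if (receiverVersion >= MAX_PRIMARY_SHARD_SIZE_SUPPORTED) {
            if (maxSize != null) {
                out.put("max_size", maxSize);
            }
            if (maxPrimaryShardSize != null) {
                out.put("max_primary_shard_size", maxPrimaryShardSize);
            }
        } else {
            // Older receiver: never send an empty condition set. If maxSize is
            // absent, send the maxPrimaryShardSize value under the maxSize key.
            String fallback = maxSize != null ? maxSize : maxPrimaryShardSize;
            if (fallback != null) {
                out.put("max_size", fallback);
            }
        }
        if (out.isEmpty()) {
            throw new IllegalArgumentException("at least one rollover condition must be set");
        }
        return out;
    }
}
```

The point of the fallback is that the older node persists a rollover action that still passes the "at least one condition" validation, so the node can read its on-disk cluster state after it is upgraded.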

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Mar 8, 2021
…upgrade.

If a node doesn't support maxPrimaryShardSize, serialize maxPrimaryShardSize as maxSize.
This should fix a problematic situation: if an older node doesn't support
maxPrimaryShardSize and this is the only condition specified, the older node ends
up with an instance without any conditions. This could lead to upgrade failures,
with new nodes unable to start because the local cluster state can't be read.

Relates to elastic#69918
martijnvg added a commit that referenced this issue Mar 8, 2021
…de. (#70057)

When running in a mixed cluster with nodes on both pre-7.8 and post-7.8 versions,
attempting to install composable index templates from any node can cause many
error logs indicating that the put composable index template and put component
template APIs don't exist. These APIs are always redirected to the elected master,
and as long as that node is on a pre-7.8 version, attempting to install these
templates will always fail.

Relates to #69918
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Mar 8, 2021
…de. (elastic#70057)

When running in a mixed cluster with nodes on both pre-7.8 and post-7.8 versions,
attempting to install composable index templates from any node can cause many
error logs indicating that the put composable index template and put component
template APIs don't exist. These APIs are always redirected to the elected master,
and as long as that node is on a pre-7.8 version, attempting to install these
templates will always fail.

Relates to elastic#69918
martijnvg added a commit that referenced this issue Mar 8, 2021
…de. (#70080)

Backport of #70057 to 7.x branch

When running in a mixed cluster with nodes on both pre-7.8 and post-7.8 versions,
attempting to install composable index templates from any node can cause many
error logs indicating that the put composable index template and put component
template APIs don't exist. These APIs are always redirected to the elected master,
and as long as that node is on a pre-7.8 version, attempting to install these
templates will always fail.

Relates to #69918
@martijnvg
Member

martijnvg commented Mar 8, 2021

The 7.12 bwc ci job finally completed successfully. The 7.11 bwc ci job would have too, but unfortunately it ran into two network-related errors while downloading artifacts.

The 7.x bwc should also complete successfully when #70076 is merged and backported to the 7.x branch.

martijnvg added a commit that referenced this issue Mar 9, 2021
…upgrade (#70076)

If a node doesn't support maxPrimaryShardSize, serialize maxPrimaryShardSize as maxSize.
This should fix a problematic situation: if an older node doesn't support
maxPrimaryShardSize and this is the only condition specified, the older node ends
up with an instance without any conditions. This could lead to upgrade failures,
with new nodes unable to start because the local cluster state can't be read.

Relates to #69918
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Mar 9, 2021
…upgrade (elastic#70076)

If a node doesn't support maxPrimaryShardSize, serialize maxPrimaryShardSize as maxSize.
This should fix a problematic situation: if an older node doesn't support
maxPrimaryShardSize and this is the only condition specified, the older node ends
up with an instance without any conditions. This could lead to upgrade failures,
with new nodes unable to start because the local cluster state can't be read.

Relates to elastic#69918
martijnvg added a commit that referenced this issue Mar 9, 2021
…upgrade (#70076) (#70128)

If a node doesn't support maxPrimaryShardSize, serialize maxPrimaryShardSize as maxSize.
This should fix a problematic situation: if an older node doesn't support
maxPrimaryShardSize and this is the only condition specified, the older node ends
up with an instance without any conditions. This could lead to upgrade failures,
with new nodes unable to start because the local cluster state can't be read.

Relates to #69918
@martijnvg
Member

The 7.11 bwc ci job is now happy as well. It looks like all fatal watcher and ilm upgrade errors have been solved. I will close this issue. Please open a new issue if any of the failures mentioned occur again.

martijnvg added a commit that referenced this issue Mar 11, 2021
…de. (#70057)

When running in a mixed cluster with nodes on both pre-7.8 and post-7.8 versions,
attempting to install composable index templates from any node can cause many
error logs indicating that the put composable index template and put component
template APIs don't exist. These APIs are always redirected to the elected master,
and as long as that node is on a pre-7.8 version, attempting to install these
templates will always fail.

Relates to #69918

6 participants