Watcher: Mysterious rolling upgrade failure #33185

nik9000 · 2018-08-27T18:29:29Z

This rolling upgrade build failed fairly mysteriously. This is what the failure looks like:

15:24:46   1> [2018-08-26T15:24:37,459][INFO ][o.e.u.UpgradeClusterClientYamlTestSuiteIT] [test] Stash dump on test failure [{
15:24:46   1>   "stash" : {
15:24:46   1>     "record_id" : "my_watch_0be15ae1-d0c1-4516-b2fc-7291a51c00a6-2018-08-26T15:24:37.281Z",
15:24:46   1>     "body" : {
15:24:46   1>       "_id" : "my_watch_0be15ae1-d0c1-4516-b2fc-7291a51c00a6-2018-08-26T15:24:37.281Z",
15:24:46   1>       "watch_record" : {
15:24:46   1>         "watch_id" : "my_watch",
15:24:46   1>         "node" : "j6O-Ii5MRcuZismjV2mJPA",
15:24:46   1>         "state" : "failed",
15:24:46   1>         "user" : "test_user",
15:24:46   1>         "status" : {
15:24:46   1>           "state" : {
15:24:46   1>             "active" : true,
15:24:46   1>             "timestamp" : "2018-08-26T15:24:36.992Z"
15:24:46   1>           },
15:24:46   1>           "actions" : {
15:24:46   1>             "logging" : {
15:24:46   1>               "ack" : {
15:24:46   1>                 "timestamp" : "2018-08-26T15:24:36.992Z",
15:24:46   1>                 "state" : "awaits_successful_execution"
15:24:46   1>               }
15:24:46   1>             }
15:24:46   1>           },
15:24:46   1>           "execution_state" : "failed",
15:24:46   1>           "version" : 1
15:24:46   1>         },
15:24:46   1>         "trigger_event" : {
15:24:46   1>           "type" : "manual",
15:24:46   1>           "triggered_time" : "2018-08-26T15:24:37.279Z",
15:24:46   1>           "manual" : {
15:24:46   1>             "schedule" : {
15:24:46   1>               "scheduled_time" : "2018-08-26T15:24:37.279Z"
15:24:46   1>             }
15:24:46   1>           }
15:24:46   1>         },
15:24:46   1>         "input" : {
15:24:46   1>           "simple" : { }
15:24:46   1>         },
15:24:46   1>         "condition" : {
15:24:46   1>           "always" : { }
15:24:46   1>         },
15:24:46   1>         "result" : {
15:24:46   1>           "execution_time" : "2018-08-26T15:24:37.281Z",
15:24:46   1>           "execution_duration" : 1535297077283,
15:24:46   1>           "actions" : [ ]
15:24:46   1>         },
15:24:46   1>         "exception" : {
15:24:46   1>           "type" : "illegal_state_exception",
15:24:46   1>           "reason" : "could not register execution [my_watch]. current executions are sealed and forbid registrations of additional executions."
15:24:46   1>         }
15:24:46   1>       }
15:24:46   1>     }
15:24:46   1>   }
15:24:46   1> }]
15:24:46   1> [2018-08-26T15:24:37,502][INFO ][o.e.u.UpgradeClusterClientYamlTestSuiteIT] [test] [p0=old_cluster/60_watcher/CRUD watch APIs] after test
15:24:46 FAILURE 0.96s | UpgradeClusterClientYamlTestSuiteIT.test {p0=old_cluster/60_watcher/CRUD watch APIs} <<< FAILURES!
15:24:46    > Throwable #1: java.lang.AssertionError: Failure at [old_cluster/60_watcher:43]: watch_record.state didn't match expected value:
15:24:46    >             watch_record.state: expected [executed] but was [failed]

This is in the "old" cluster so the cluster state is empty, but we still get an error as though the watch was running.

elasticmachine · 2018-08-27T18:29:36Z

Pinging @elastic/es-core-infra

polyfractal · 2018-08-30T15:14:20Z

Pasting from an old issue (in old xpack repo) with a similar error. I think it was a concurrency issue for the other one but was never resolved.

I tried a few approaches to fix this but wasn't satisfied with any of them. Jotting down notes for now:

The issue is that ExecutionService#clearExecutions() is called on stop or pause, which seals currentExecutions then overwrites it with a new instantiation. But it's possible for a manual execution of a watch to slip in between the sealing and the new object, which throws the exception seen here.

There are a few ways this could be fixed, unsure the best route:

Instead of throwing an exception, CurrentExecutions#put() could return an Optional to signal whether it was able to add to the map; if not, the calling code could just log a message and abort.

Allow manually executed watches to be added to the currently executing list despite it being sealed.

Modify CurrentExecutions so that it can be reset internally, rather than externally overwritten after being drained. This would change the sealing dynamics and allow waiting for the drain to finish, etc.
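
To make the window concrete, here is a minimal sketch of the first option, under stated assumptions: `SealableExecutions` and `ExecutionServiceSketch` are hypothetical, heavily simplified stand-ins and not the real Watcher classes, and a plain `boolean` return is used in place of the `Optional` mentioned above. The two steps of clearing (seal, then replace) are split into separate methods only so the gap between them can be shown deterministically.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for CurrentExecutions; not the real class.
final class SealableExecutions {
    private final Map<String, String> executions = new HashMap<>();
    private boolean sealed = false;

    // Returns false instead of throwing when the registry has been sealed.
    synchronized boolean tryPut(String watchId, String executionId) {
        if (sealed) {
            return false;
        }
        executions.put(watchId, executionId);
        return true;
    }

    synchronized void remove(String watchId) {
        executions.remove(watchId);
    }

    synchronized void seal() {
        sealed = true;
    }
}

// Hypothetical stand-in for the relevant part of ExecutionService.
final class ExecutionServiceSketch {
    private volatile SealableExecutions current = new SealableExecutions();

    // Step 1 of clearing: seal the current registry.
    void sealCurrent() {
        current.seal();
    }

    // Step 2 of clearing: replace it with a fresh registry.
    void swapInFresh() {
        current = new SealableExecutions();
    }

    // A manual execution: skipped (rather than failed) if it lands between the two steps.
    String execute(String watchId) {
        SealableExecutions registry = current;
        if (!registry.tryPut(watchId, watchId + "-execution")) {
            return "skipped";
        }
        try {
            return "executed";
        } finally {
            registry.remove(watchId);
        }
    }

    public static void main(String[] args) {
        ExecutionServiceSketch service = new ExecutionServiceSketch();
        service.sealCurrent();
        System.out.println(service.execute("my_watch")); // lands in the window
        service.swapInFresh();
        System.out.println(service.execute("my_watch")); // after the swap
    }
}
```

Whether a skipped manual execution is acceptable to callers is a separate question that this sketch does not answer.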

In that issue, @DaveCTurner followed up with:

Would it work to do something like the following?

diff --git a/plugin/watcher/src/main/java/org/elasticsearch/xpack/watcher/execution/ExecutionService.java b/plugin/watcher/src/main/java/org/elasticsearch/xpack/watcher/execution/ExecutionService.java
index 407ab9a7e5..c3f73740e8 100644
--- a/plugin/watcher/src/main/java/org/elasticsearch/xpack/watcher/execution/ExecutionService.java
+++ b/plugin/watcher/src/main/java/org/elasticsearch/xpack/watcher/execution/ExecutionService.java
@@ -554,8 +554,9 @@ public class ExecutionService extends AbstractComponent {
      * This is needed, because when this method is called, watcher keeps running, so sealing executions would be a bad idea
     */
    public void clearExecutions() {
-        currentExecutions.sealAndAwaitEmpty(maxStopTimeout);
+        final CurrentExecutions previousExecutions = currentExecutions;
        currentExecutions = new CurrentExecutions();
+        previousExecutions.sealAndAwaitEmpty(maxStopTimeout);
    }

    // the watch execution task takes another runnable as parameter

NB I'm not sure this actually works as-is without further synchronisation.

Not sure if it's the same issue (manual executions racing), but if there are concurrency issues with this code it might be manifesting in the above test too.
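
One possible shape for the "further synchronisation" mentioned above, as a sketch only: swap in the fresh registry atomically before sealing the old one, and have registration retry when it hits a sealed registry. `DrainableRegistry` and `SwappingService` are hypothetical names and none of this is taken from the Watcher code base. The retry loop relies on the swap always happening before the seal, so a caller that sees a sealed registry is guaranteed to read the new one on its next attempt.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical registry that can be sealed and drained; not the real CurrentExecutions.
final class DrainableRegistry {
    private final Set<String> ids = new HashSet<>();
    private boolean sealed = false;

    synchronized boolean tryPut(String id) {
        if (sealed) {
            return false;
        }
        ids.add(id);
        return true;
    }

    synchronized void remove(String id) {
        ids.remove(id);
        notifyAll(); // wake up a thread waiting for the registry to drain
    }

    // Seals the registry, then waits up to timeoutMillis for it to become empty.
    synchronized boolean sealAndAwaitEmpty(long timeoutMillis) {
        sealed = true;
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!ids.isEmpty()) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}

final class SwappingService {
    private final AtomicReference<DrainableRegistry> current =
        new AtomicReference<>(new DrainableRegistry());

    // Swap first so new executions land in the fresh registry, then drain the old one.
    boolean clearExecutions(long timeoutMillis) {
        DrainableRegistry previous = current.getAndSet(new DrainableRegistry());
        return previous.sealAndAwaitEmpty(timeoutMillis);
    }

    // A caller may have read the old reference just before the swap; retry on the new one.
    DrainableRegistry register(String id) {
        while (true) {
            DrainableRegistry registry = current.get();
            if (registry.tryPut(id)) {
                return registry;
            }
        }
    }
}
```

The returned registry is what the caller would later call `remove` on, so an execution is always removed from the same registry it was added to.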

alpar-t · 2018-09-19T13:53:56Z

Another instance: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+periodic/27/console

jkakavas · 2019-02-22T08:58:31Z

And another one today: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/2197/consoleText in master intake

This fails once in a while: https://build-stats.elastic.co/app/kibana#/discover?_g=(refreshInterval:(pause:!t,value:0),time:(from:now-6M,mode:quick,to:now))&_a=(columns:!(_source),index:b646ed00-7efc-11e8-bf69-63c8ef516157,interval:auto,query:(language:lucene,query:'%22old_cluster%2F60_watcher%2FCRUD%20watch%20APIs%22'),sort:!(process.time-start,desc))

This fails on old_cluster but mixed_cluster and upgraded_cluster depend on watches set in old_cluster so that can't be muted on its own Relates: elastic#33185

davidkyle · 2019-04-08T13:38:29Z

Another instance in 7.0 https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.0+artifactory/133/console

davidkyle · 2019-04-08T16:14:11Z

Another failure in 7.0 https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.0+artifactory/136/testReport/junit/org.elasticsearch.upgrades/UpgradeClusterClientYamlTestSuiteIT/test__p0_old_cluster_60_watcher_CRUD_watch_APIs_/ so I backported the muting to 7.0 in 1948702.

There are only 2 watcher tests in the yml suite and both are muted, leaving only WatcherRestartIT for Watcher upgrade test coverage, and that test appears pretty minimal.

jakelandis · 2019-05-29T13:38:16Z

Un-muted this test on PR #42377 to obtain additional logs.

If (when?) this test fails again please obtain the following information before muting the test:

Copy of the relevant failure
Copy of the reproduce line
The Jenkins build link
The Gradle scan link
The relevant cluster logs from "Google Cloud Storage Upload Report" (link found in Jenkins build)

jkakavas · 2019-05-30T05:58:36Z

This failed in a PR build:

Jenkins build: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+pull-request-bwc/6473/
Gradle scan: https://gradle.com/s/7smiqhxdoghp2
Repro:

./gradlew :x-pack:qa:rolling-upgrade:v7.3.0#oldClusterTestRunner --tests "org.elasticsearch.upgrades.UpgradeClusterClientYamlTestSuiteIT.test {p0=old_cluster/60_watcher/CRUD watch APIs}" -Dtests.seed=8DFA7D80E04774A7 -Dtests.security.manager=true -Dtests.locale=ta-MY -Dtests.timezone=Asia/Krasnoyarsk -Dcompiler.java=12 -Druntime.java=11 -Dtests.rest.suite=old_cluster

logs:
https://storage.cloud.google.com/elasticsearch-ci-artifacts/jobs/elastic+elasticsearch+pull-request-bwc/6473/hprof-files.tar.gz
https://storage.cloud.google.com/elasticsearch-ci-artifacts/jobs/elastic+elasticsearch+pull-request-bwc/6473/syserr-files.tar.gz
https://storage.cloud.google.com/elasticsearch-ci-artifacts/jobs/elastic+elasticsearch+pull-request-bwc/6473/log-files.tar.gz

DaveCTurner · 2019-06-18T16:37:19Z

In today's master I found a bunch of tests that are being skipped and which link to this issue:

$ find . -type f -name '*.yml' | xargs grep -e 33185
./x-pack/qa/rolling-upgrade/build/resources/test/rest-api-spec/test/old_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/build/resources/test/rest-api-spec/test/old_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/build/resources/test/rest-api-spec/test/mixed_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/build/resources/test/rest-api-spec/test/mixed_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/build/resources/test/rest-api-spec/test/upgraded_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/build/resources/test/rest-api-spec/test/upgraded_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/build-idea/classes/test/rest-api-spec/test/old_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/build-idea/classes/test/rest-api-spec/test/old_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/build-idea/classes/test/rest-api-spec/test/mixed_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/build-idea/classes/test/rest-api-spec/test/mixed_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/build-idea/classes/test/rest-api-spec/test/upgraded_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/build-idea/classes/test/rest-api-spec/test/upgraded_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/src/test/resources/rest-api-spec/test/old_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/src/test/resources/rest-api-spec/test/old_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/src/test/resources/rest-api-spec/test/mixed_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/src/test/resources/rest-api-spec/test/mixed_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/src/test/resources/rest-api-spec/test/upgraded_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185
./x-pack/qa/rolling-upgrade/src/test/resources/rest-api-spec/test/upgraded_cluster/60_watcher.yml:      reason: https://github.com/elastic/elasticsearch/issues/33185

It looks like GitHub interpreted the phrase "This initial PR does not attempt to fix #33185." wrongly 🤦‍♀. Re-opening.

This test is believed to be fixed by elastic#43939 closes elastic#33185

set watcher logger to debug level. These tests haven't run in such a long time, we first need to get a better picture how/if these tests fail today. See elastic#33185

set watcher logger to debug level. These tests haven't run in such a long time, we first need to get a better picture how/if these tests fail today. Backport of elastic#51478 See elastic#33185

martijnvg · 2020-01-30T13:00:49Z

After merging the PR that enables the watcher rolling upgrade tests, these tests haven't failed yet. I'm going to also enable these tests in the 7.x branch.

martijnvg · 2020-02-03T13:25:34Z

Today the first real failure occurred:

org.elasticsearch.upgrades.UpgradeClusterClientYamlTestSuiteIT > test {p0=old_cluster/60_watcher/CRUD watch APIs} FAILED
    java.lang.AssertionError: Failure at [old_cluster/60_watcher:41]: watch_record.state didn't match expected value:
    watch_record.state: expected String [executed] but was String [failed]
        at __randomizedtesting.SeedInfo.seed([885817E4E7A82A5D:C283E495447A5]:0)
        at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.executeSection(ESClientYamlSuiteTestCase.java:405)
        at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:382)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Last response:

[2020-02-03T03:01:13,678][INFO ][o.e.u.UpgradeClusterClientYamlTestSuiteIT] [test] Stash dump on test failure [{
1>   "stash" : {
1>     "record_id" : "my_watch_26c48e21-ab7e-4953-836a-f90d7ec7baea-2020-02-03T01:01:13.478Z",
1>     "body" : {
1>       "_id" : "my_watch_26c48e21-ab7e-4953-836a-f90d7ec7baea-2020-02-03T01:01:13.478Z",
1>       "watch_record" : {
1>         "watch_id" : "my_watch",
1>         "node" : "VhrcuRKYSkmoP9P9ilKjTw",
1>         "state" : "failed",
1>         "user" : "test_user",
1>         "status" : {
1>           "state" : {
1>             "active" : true,
1>             "timestamp" : "2020-02-03T01:01:13.171Z"
1>           },
1>           "actions" : {
1>             "logging" : {
1>               "ack" : {
1>                 "timestamp" : "2020-02-03T01:01:13.171Z",
1>                 "state" : "awaits_successful_execution"
1>               }
1>             }
1>           },
1>           "execution_state" : "failed",
1>           "version" : 1
1>         },
1>         "trigger_event" : {
1>           "type" : "manual",
1>           "triggered_time" : "2020-02-03T01:01:13.475Z",
1>           "manual" : {
1>             "schedule" : {
1>               "scheduled_time" : "2020-02-03T01:01:13.475Z"
1>             }
1>           }
1>         },
1>         "input" : {
1>           "simple" : { }
1>         },
1>         "condition" : {
1>           "always" : { }
1>         },
1>         "result" : {
1>           "execution_time" : "2020-02-03T01:01:13.478Z",
1>           "execution_duration" : 1580691673483,
1>           "actions" : [ ]
1>         },
1>         "exception" : {
1>           "type" : "illegal_state_exception",
1>           "reason" : "could not register execution [my_watch]. current executions are sealed and forbid registrations of additional executions."
1>         }
1>       }
1>     }
1>   }
1> }]

Relevant build logs on node executing watch:

[2020-02-03T01:00:58,389][INFO ][o.e.x.w.WatcherService   ] [v6.8.7-1] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
[2020-02-03T01:01:13,312][INFO ][o.e.x.w.WatcherService   ] [v6.8.7-1] reloading watcher, reason [new local watcher shard allocation ids], cancelled [0] queued tasks
[2020-02-03T01:01:13,342][DEBUG][o.e.x.w.WatcherService   ] [v6.8.7-1] watch service has been reloaded, reason [new local watcher shard allocation ids]
[2020-02-03T01:01:13,371][DEBUG][o.e.x.w.WatcherIndexingListener] [v6.8.7-1] adding watch [my_watch] to trigger service
[2020-02-03T01:01:13,482][INFO ][o.e.x.w.WatcherService   ] [v6.8.7-1] reloading watcher, reason [new local watcher shard allocation ids], cancelled [0] queued tasks
[2020-02-03T01:01:13,487][DEBUG][o.e.x.w.e.ExecutionService] [v6.8.7-1] failed to execute watch [my_watch]
java.lang.IllegalStateException: could not register execution [my_watch]. current executions are sealed and forbid registrations of additional executions.
        at org.elasticsearch.xpack.core.watcher.support.Exceptions.illegalState(Exceptions.java:26) ~[x-pack-core-6.8.7-SNAPSHOT.jar:6.8.7-SNAPSHOT]
        at org.elasticsearch.xpack.watcher.execution.CurrentExecutions.put(CurrentExecutions.java:41) ~[x-pack-watcher-6.8.7-SNAPSHOT.jar:6.8.7-SNAPSHOT]
        at org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:282) [x-pack-watcher-6.8.7-SNAPSHOT.jar:6.8.7-SNAPSHOT]
        at org.elasticsearch.xpack.watcher.transport.actions.execute.TransportExecuteWatchAction$1.doRun(TransportExecuteWatchAction.java:164) [x-pack-watcher-6.8.7-SNAPSHOT.jar:6.8.7-SNAPSHOT]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.8.7-SNAPSHOT.jar:6.8.7-SNAPSHOT]
        at org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:617) [x-pack-watcher-6.8.7-SNAPSHOT.jar:6.8.7-SNAPSHOT]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) [elasticsearch-6.8.7-SNAPSHOT.jar:6.8.7-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
[2020-02-03T01:01:13,506][DEBUG][o.e.x.w.WatcherService   ] [v6.8.7-1] Loaded [0] watches for execution
[2020-02-03T01:01:13,507][DEBUG][o.e.x.w.WatcherService   ] [v6.8.7-1] watch service has been reloaded, reason [new local watcher shard allocation ids]
[2020-02-03T01:01:13,660][ERROR][o.e.x.w.Watcher          ] [v6.8.7-1] triggered watches could not be deleted [my_watch_26c48e21-ab7e-4953-836a-f90d7ec7baea-2020-02-03T01:01:13.478Z], failure [[.triggered_watches] IndexNotFoundException[no such index]]
[2020-02-03T01:01:13,661][DEBUG][o.e.x.w.e.ExecutionService] [v6.8.7-1] finished [my_watch]/[my_watch_26c48e21-ab7e-4953-836a-f90d7ec7baea-2020-02-03T01:01:13.478Z]

Build url: https://gradle-enterprise.elastic.co/s/hbiysvknha5wi/

martijnvg · 2020-02-03T13:27:13Z

I think that, when running against the old cluster, we should wait after the watch is created until the .watches index is at least yellow and watcher is started. This way we avoid executing a watch before watcher is ready to execute it.
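
As a rough illustration of the "wait until ready" idea, here is a generic polling helper. It is a sketch under assumptions: `AwaitSketch` and `awaitTrue` are hypothetical names, and the condition is a placeholder for whatever check the test uses to decide that the .watches index is at least yellow and watcher is started. It is not the mechanism used by the rolling upgrade tests themselves.

```java
import java.util.function.BooleanSupplier;

final class AwaitSketch {
    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition held, false on timeout or interruption.
    static boolean awaitTrue(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Placeholder readiness check: pretend watcher reports "started" on the third poll.
        int[] polls = {0};
        boolean ready = awaitTrue(() -> ++polls[0] >= 3, 1000, 10);
        System.out.println("ready=" + ready + " after " + polls[0] + " polls");
    }
}
```

Only once such a check succeeds would the test go on to execute the watch manually.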

In the rolling upgrade tests, watcher is manually executed. In rare scenarios this happens before watcher is started, causing the manual execution to fail. Relates to elastic#33185

martijnvg · 2020-02-10T11:40:15Z

The latest failure happened a couple of times in the last week. I've opened #52139 to address it.

martijnvg · 2020-02-21T06:31:00Z

I'm closing this issue; these commits (^) seem to have stabilised this test.

nik9000 added the :Data Management/Watcher label Aug 27, 2018

nik9000 assigned hub-cap Aug 27, 2018

nik9000 added the >test-failure Triaged test failures from CI label Aug 27, 2018

jkakavas mentioned this issue Feb 22, 2019

Mute rolling upgrade watcher CRUD tests #39293

Merged

jkakavas added a commit that referenced this issue Feb 22, 2019

Mute rolling upgrade watcher CRUD tests (#39293)

d3bced2

This fails on old_cluster but mixed_cluster and upgraded_cluster depend on watches set in old_cluster so that can't be muted on its own Relates: #33185

jkakavas added a commit that referenced this issue Feb 22, 2019

Mute rolling upgrade watcher CRUD tests (#39293)

401226f

This fails on old_cluster but mixed_cluster and upgraded_cluster depend on watches set in old_cluster so that can't be muted on its own Relates: #33185

davidkyle pushed a commit that referenced this issue Apr 8, 2019

Mute rolling upgrade watcher CRUD tests (#39293)

1948702

This fails on old_cluster but mixed_cluster and upgraded_cluster depend on watches set in old_cluster so that can't be muted on its own Relates: #33185

This was referenced May 22, 2019

un-mute Watcher rolling upgrade tests and bump up logging #42377

Merged

Meta: Fix Watcher Test Failures #42409

Closed

jakelandis closed this as completed in #42377 May 29, 2019

jkakavas mentioned this issue May 30, 2019

Fix refresh remote JWKS logic #42662

Merged

DaveCTurner reopened this Jun 18, 2019

jakelandis added a commit to jakelandis/elasticsearch that referenced this issue Oct 18, 2019

Re-enable Watcher full rolling restart tests

bd56b7f

This test is believed to be fixed by elastic#43939 closes elastic#33185

martijnvg assigned martijnvg and unassigned hub-cap Jan 27, 2020

martijnvg mentioned this issue Jan 27, 2020

Unmute rolling upgrade watcher tests #51478

Merged

martijnvg added a commit that referenced this issue Jan 29, 2020

Unmute rolling upgrade watcher tests and (#51478)

69ec669

set watcher logger to debug level. These tests haven't run in such a long time, we first need to get a better picture how/if these tests fail today. See #33185

martijnvg mentioned this issue Jan 30, 2020

Backport: unmute rolling upgrade watcher tests and #51664

Merged

martijnvg mentioned this issue Feb 10, 2020

Wait for watcher to be started prior to rolling upgrade tests. #52139

Merged

martijnvg mentioned this issue Feb 11, 2020

backport: wait for watcher to be started prior to rolling upgrade tests. #52186

Merged

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Feb 17, 2020

Improve watcher rolling upgrade tests

267492c

Relates to elastic#33185

martijnvg mentioned this issue Feb 17, 2020

Improve watcher rolling upgrade tests #52404

Merged

martijnvg added a commit that referenced this issue Feb 17, 2020

Improve watcher rolling upgrade tests (#52404)

81e47e9

Relates to #33185

martijnvg added a commit that referenced this issue Feb 17, 2020

Improve watcher rolling upgrade tests (#52404)

82ede04

Relates to #33185

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Feb 18, 2020

Adjusted assertion for watcher rolling upgrade test.

b0ca34f

Relates to elastic#33185

martijnvg mentioned this issue Feb 18, 2020

Adjust assertion for watcher rolling upgrade test. #52463

Merged

martijnvg added a commit that referenced this issue Feb 18, 2020

Adjusted assertion for watcher rolling upgrade test. (#52463)

606bc80

Relates to #33185

martijnvg added a commit that referenced this issue Feb 18, 2020

Adjusted assertion for watcher rolling upgrade test. (#52463)

306d7a0

Relates to #33185

sbourke pushed a commit to sbourke/elasticsearch that referenced this issue Feb 19, 2020

Adjusted assertion for watcher rolling upgrade test. (elastic#52463)

116d8cc

Relates to elastic#33185

martijnvg closed this as completed Feb 21, 2020
