Failing ES Forward Compatibility: Default CI Group #6 / Uptime app with real-world data uptime ml anomaly can create job successfully #179166

Closed
Tracked by #190068
mistic opened this issue Mar 21, 2024 · 14 comments
Labels
failed-test A test failure on a tracked branch, potentially flaky-test Team:obs-ux-management Observability Management User Experience Team

Comments

@mistic
Member

mistic commented Mar 21, 2024

Chrome X-Pack UI Functional Tests
x-pack/test/functional/apps/uptime/ml_anomaly.ts

Uptime app with real-world data uptime ml anomaly can create job successfully

This failure is preventing the ES 8.14 forward compatibility pipeline from proceeding.

Error: retry.tryForTime timeout: Error: expected testSubject(uptimeMLJobSuccessfullyCreated) to exist
    at TestSubjects.existOrFail (/opt/local-ssd/buildkite/builds/kb-n2-4-14cf8847403d6f06/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility/kibana/test/functional/services/common/test_subjects.ts:45:13)
    at /opt/local-ssd/buildkite/builds/kb-n2-4-14cf8847403d6f06/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility/kibana/x-pack/test/functional/services/uptime/ml_anomaly.ts:43:9
    at runAttempt (/opt/local-ssd/buildkite/builds/kb-n2-4-14cf8847403d6f06/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility/kibana/test/common/services/retry/retry_for_success.ts:29:15)
    at retryForSuccess (/opt/local-ssd/buildkite/builds/kb-n2-4-14cf8847403d6f06/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility/kibana/test/common/services/retry/retry_for_success.ts:68:21)
    at RetryService.tryForTime (/opt/local-ssd/buildkite/builds/kb-n2-4-14cf8847403d6f06/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility/kibana/test/common/services/retry/retry.ts:22:12)
    at Context.<anonymous> (test/functional/apps/uptime/ml_anomaly.ts:44:7)
    at Object.apply (/opt/local-ssd/buildkite/builds/kb-n2-4-14cf8847403d6f06/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility/kibana/node_modules/@kbn/test/target_node/functional_test_runner/lib/mocha/wrap_function.js:78:16)
    at onFailure (/opt/local-ssd/buildkite/builds/kb-n2-4-14cf8847403d6f06/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility/kibana/test/common/services/retry/retry_for_success.ts:17:9)
    at retryForSuccess (/opt/local-ssd/buildkite/builds/kb-n2-4-14cf8847403d6f06/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility/kibana/test/common/services/retry/retry_for_success.ts:59:13)
    at RetryService.tryForTime (/opt/local-ssd/buildkite/builds/kb-n2-4-14cf8847403d6f06/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility/kibana/test/common/services/retry/retry.ts:22:12)
    at Context.<anonymous> (test/functional/apps/uptime/ml_anomaly.ts:44:7)
    at Object.apply (/opt/local-ssd/buildkite/builds/kb-n2-4-14cf8847403d6f06/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility/kibana/node_modules/@kbn/test/target_node/functional_test_runner/lib/mocha/wrap_function.js:78:16)
@mistic mistic added blocker failed-test A test failure on a tracked branch, potentially flaky-test Team:Uptime - DEPRECATED Synthetics & RUM sub-team of Application Observability skipped-test failed-es-promotion v7.17.19 labels Mar 21, 2024
@mistic
Member Author

mistic commented Mar 21, 2024

Skipped.

7.17: dd53060

@smith smith added the Team:obs-ux-management Observability Management User Experience Team label Jul 18, 2024
@elasticmachine
Contributor

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

@smith smith removed the Team:Uptime - DEPRECATED Synthetics & RUM sub-team of Application Observability label Jul 18, 2024
@rayafratkina
Contributor

@jasonrhodes any updates on this?

@jasonrhodes
Member

@rayafratkina we are looking into it. If the test has been skipped, can someone help us understand how it's still blocking something?

We may end up removing this test due to changes in Uptime/Heartbeat, but we'll get back to you ASAP.

@rayafratkina
Contributor

Caught up with @jasonrhodes offline, but for the record: we skip failed integration tests (failed-es-promotion label) to avoid blocking everyone, but we expect the responsible team to investigate ASAP and determine whether the failure does, in fact, indicate that something in Kibana is impacted by an ES change.

The failed-es-promotion issue should be assigned and the owner should report the outcome of the investigation and any planned actions in the issue.

If the failure cannot be addressed and the test re-enabled in a timely manner, please consider removing the test until a replacement can be added, and track that work on your team's backlog rather than as a skipped test.

@dominiqueclarke dominiqueclarke self-assigned this Sep 18, 2024
@dominiqueclarke
Contributor

I investigated this failure by increasing the test's timeout and running it in the flaky test runner for 200 iterations, all of which passed.

I'm not sure, though, whether I need to be testing in a different way. Should I be testing against a specific version of ES, @rayafratkina?
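
For readers unfamiliar with the idea, repeating a check many times and counting failures can be sketched roughly as below. This is not Kibana's flaky test runner; `runRepeatedly` and the stand-in check are hypothetical names used only for illustration.

```typescript
// Hypothetical helper for illustration only; not Kibana's flaky test runner.
interface RepeatResult {
  iterations: number;
  failures: { iteration: number; message: string }[];
}

// Runs `attempt` the requested number of times and records every thrown error
// instead of stopping at the first one.
function runRepeatedly(
  attempt: (iteration: number) => void,
  iterations: number
): RepeatResult {
  const failures: { iteration: number; message: string }[] = [];
  for (let i = 1; i <= iterations; i++) {
    try {
      attempt(i);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      failures.push({ iteration: i, message });
    }
  }
  return { iterations, failures };
}

// Stand-in check that fails on every 50th run, purely to show the reporting.
const result = runRepeatedly((i) => {
  if (i % 50 === 0) {
    throw new Error(`timed out on run ${i}`);
  }
}, 200);

console.log(`${result.failures.length} of ${result.iterations} runs failed`);
```

A clean run (zero failures) only shows the test is stable in the environment it ran in, which is why the Elasticsearch version used for those runs matters here.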

@jasonrhodes
Member

@mistic @rayafratkina do you have any guidance on how we can make sure we've tested this correctly? Thanks!

@jasonrhodes
Member

@dominiqueclarke do you feel like we have what we need to confidently close this, or do we need more input from @mistic / @rayafratkina / @elastic/kibana-core re: testing this with the right ES?

@afharo
Member

afharo commented Sep 25, 2024

I think @elastic/kibana-operations can help better with the steps. I think there are some env vars to be set to ensure that we run 8.x ES when Kibana is still in 7.17.

It looks like the promotion job was removed, so I'm not sure if this is still relevant?

@jbudz
Member

jbudz commented Sep 25, 2024

It's resolved, it will automatically be picked up after the test is unskipped. I don't see any issues with closing this out.

https://buildkite.com/elastic/kibana-es-forward-compatibility-testing/builds/191#01922791-3769-47bc-ad4c-dc9fd04a019a/586-1168

@jasonrhodes
Member

Thanks! Forgive me but I don't really understand the process for this kind of issue.

> I think there are some env vars to be set to ensure that we run 8.x ES when Kibana is still in 7.17.

Is this something we need to know how to do to test something like this? Is this for local testing, or for CI?

> It looks like the promotion job was removed, so I'm not sure if this is still relevant?

What does this mean? What's the "promotion job" and what does it mean that it's been removed?

> It's resolved, it will automatically be picked up after the test is unskipped.

What's been resolved? And how does this test get unskipped?

> I don't see any issues with closing this out.

I'm happy to close this once we are sure we understand the above, thanks.

@jbudz
Member

jbudz commented Sep 25, 2024

> Is this something we need to know how to do to test something like this? Is this for local testing, or for CI?

Yes, for local testing. There will be a comment in the build (example) with instructions. This pipeline tests Kibana 7.17 against supported versions of Elasticsearch 8; the environment variable overrides the Elasticsearch snapshot version to emulate what CI is doing.
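
As a rough sketch of what such an override amounts to: the snippet below builds an environment for a local test run with one extra variable pointing at a different Elasticsearch snapshot manifest. The variable name `ES_SNAPSHOT_MANIFEST` is an assumption here, since this thread does not name it, and the URL is a placeholder; the build comment is the authoritative source for both. `withSnapshotOverride` is a hypothetical helper.

```typescript
// Assumption: the override variable is ES_SNAPSHOT_MANIFEST. The thread does not
// name it, so take the real name and value from the build comment.
type Env = Record<string, string | undefined>;

// Returns a copy of `baseEnv` with the snapshot override set; the input is not mutated.
function withSnapshotOverride(
  baseEnv: Env,
  manifestUrl: string,
  varName: string = 'ES_SNAPSHOT_MANIFEST'
): Env {
  if (!/^https?:\/\//.test(manifestUrl)) {
    throw new Error(`manifest URL must be http(s): ${manifestUrl}`);
  }
  return { ...baseEnv, [varName]: manifestUrl };
}

// Placeholder URL for illustration; not a real manifest location.
const env = withSnapshotOverride(
  { PATH: '/usr/bin' },
  'https://example.invalid/8.14.0/manifest.json'
);
console.log(env['ES_SNAPSHOT_MANIFEST']);
```

The resulting object would be passed as the environment of whatever process starts the functional test server, so that the local run uses the same Elasticsearch build as CI.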

> What does this mean? What's the "promotion job" and what does it mean that it's been removed?

We run all Kibana tests against new Elasticsearch builds of the same version before they're 'promoted' to Kibana development. We do this to prevent Elasticsearch bugs from blocking Kibana development and CI.

This isn't a promotion job; it's a daily run of Kibana 7.17 against the most recently promoted version of Elasticsearch 8. I'll take it back to the team to make sure we're aligned and remove the promotion label.

Regarding removal, afharo is referring to logs missing for https://buildkite.com/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility/builds/42. There was a refactor and the pipelines were removed in the process. It's still relevant, but the pipeline is at https://buildkite.com/elastic/kibana-es-forward-compatibility-testing now.

> What's been resolved? And how does this test get unskipped?

Generally, for releases, we need the blocker label triaged. To close the issue, we need the test unskipped or removed.
The test was unskipped in #193310.

Thanks for the thorough questions! We're reviewing this in the near future and the feedback is helpful.

@dominiqueclarke
Contributor

Closed by #193310.

@jasonrhodes
Member

Thanks for the explanations, @jbudz !!

10 participants