Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[8.x] Improve task manager functional tests in preperation for mget task claimer being the default (#196399) #197062

Merged
merged 1 commit into from
Oct 21, 2024

Conversation

mikecote
Copy link
Contributor

Backport

This will backport the following commits from main to 8.x:

Questions ?

Please refer to the Backport tool documentation

…aimer being the default (elastic#196399)

Resolves elastic#184942
Resolves elastic#192023
Resolves elastic#195573

In this PR, I'm improving the flakiness found in our functional tests in
preperation for mget being the default task claimer that all these tests
run with (elastic#194625). Because the
mget task claimer works differently and also polls more frequently, we
end-up in situations where tasks run faster than they were with
update_by_query, creating more race conditions that are now fixed in
this PR.

Issues were surfaced via elastic#190148
where I set `mget` as the default task claiming strategy.

Flaky test runs (some of these failed on other tests that are flaky):
-
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7151
-
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7169
-
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7172
-
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7175
-
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7176
-
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7185
(for
elastic@0fcf1ae)

(cherry picked from commit 3b8cf12)

# Conflicts:
#	x-pack/test/alerting_api_integration/spaces_only/tests/alerting/group4/alerts_as_data/alerts_as_data_flapping.ts
@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

@mikecote mikecote merged commit 9266571 into elastic:8.x Oct 21, 2024
32 checks passed
OneOfTaskTypes('task.taskType', claimPartitions.unlimitedTypes),
OneOfTaskTypes(
'task.taskType',
claimPartitions.unlimitedTypes.concat(Array.from(removedTypes))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to cause a starvation problem? I don't think so, as it looks like we are removing them (marking them to unrecognized) at the ed of the claim, but not clear to me why this wasn't in here before. Perhaps there was some confusion since we also have an issue to handle these via a separate stand-alone task?

This code does seem right, so I'm guessing some test was expecting these to be marked as unrecognized, and perhaps that wasn't happening before this PR ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing some test was expecting these to be marked as unrecognized, and perhaps that wasn't happening before this PR ...

Yeah there was a test in the update_by_query suite that ensured removed task types get marked as unrecognized:

it('should successfully schedule registered tasks, not claim unregistered tasks and mark removed task types as unrecognized', async () => {

I was able to make the test pass by adding this bit to the mget filter but I agree we should move this to its own task next via #192686

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants