-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Redshift fault tolerant execution #16860
Conversation
1e87d8d
to
b7f9275
Compare
/test-with-secrets sha=b7f9275330dca8a702d7434a1389822440d64b71 |
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/4672620920 |
...ant-tests/src/test/java/io/trino/faulttolerant/redshift/BaseRedshiftFailureRecoveryTest.java
Outdated
Show resolved
Hide resolved
b7f9275
to
c0adbdd
Compare
The only thing I find that should be watched for is that the tests take currently take 20 minutes. We should watch every now and then that the duration of the tests doesn't become a problem. https://github.com/trinodb/trino/actions/runs/4683487128/jobs/8300162603?pr=16968 |
.github/workflows/ci.yml
Outdated
@@ -707,6 +709,30 @@ jobs: | |||
-Dtest.redshift.aws.region="${AWS_REGION}" \ | |||
-Dtest.redshift.aws.access-key="${AWS_ACCESS_KEY_ID}" \ | |||
-Dtest.redshift.aws.secret-key="${AWS_SECRET_ACCESS_KEY}" | |||
- name: Fault Tolerant Redshift Tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional: The non FTE tests for Redshift are called "Cloud Redsfhift tests" - maybe call these "Cloud Fault Tolerant Redshift Tests"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need separate step definition for fault-tolerant tests.
How does Fault Tolerant Redshift Tests
differ from Cloud Redshift Tests
above.
Can you just modify Cloud Redshift Tests
to pass proper -pl :...
based on the test matrix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to use different redshift clusters for each set of tests.
The redshift tests are prone to timing out as @findinpath mentioned and adding more tests against the same cluster will make that more likely to happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
each single CI job uses a single cluster. So having multiple tests run in single profile only increases the wall time of the tests. It doesn't affect flakiness at all.
FTE tests are peculiar though because they use a semaphore to bypass the global concurrency limits. If they cause issues the concurrency can be limited (see the ctor of BaseFailureRecoveryTest iirc). Multiple clusters per PR adds non-negligible cost increase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding more tests against the same cluster will make that more likely to happen.
By Can you just modify Cloud Redshift Tests to pass proper -pl :... based on the test matrix?
I did not mean to share the CI job to run fault tolerant and not fault tolerant tests.
But to instantiate two versions of "Cloud Redshift Tests"; one for normal tests; one for FTE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, yes that is possible.
Will set the step name dynamically based on the profile to disambiguate between the fte tests/cloud tests.
...ests/src/test/java/io/trino/faulttolerant/redshift/TestRedshiftQueryFailureRecoveryTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % @losipiuk 's comment to move to redshift module
c0adbdd
to
14ea9c5
Compare
Updated the mirror PR #16968. |
The CI status is fake green due to invalid
|
doh, typo in quoting the string. fixed. |
14ea9c5
to
c0a5ced
Compare
c0a5ced
to
9a6780a
Compare
Updated the mirror PR #16968. |
Awesome. Thanks. |
Description
Adds support for fault tolerant execution to Redshift connector
Additional context and related issues
These tests can be prone to timing out based on the region that the GHA runner is spun up in.
May reduce the FTE test suite size just for Redshift if this is a problem. Will need to run the tests multiple times in order to verify.
Release notes
(x) Release notes are required, with the following suggested text: