[CI] RemoteClusterSecurityReloadCredentialsRestIT testFirstTimeSetupWithElasticsearchSettings failing #116883
Comments
Pinging @elastic/es-security (Team:Security)
Marking this (and other tests) a
@n1v0lg Assigning this one to you as you were the original author of
Interestingly, these are all failing in FIPS mode. The reload call is also used in other suites that were failing in FIPS mode (e.g., we recently disabled FIPS mode for snapshot ITs via #116811).
Hm, actually that message is benign -- the QC is not yet configured with a remote cluster credential, so it attempts to connect to the FC's port as though it were regular transport rather than remote server, and therefore does not trust the cert (symptom 1 in our docs). Eventually the keystore is configured with the cross cluster API key credential and those failure messages stop. I get those exact same error messages on runs that are successful. I think the suite simply doesn't have enough time to wrap up, reaches the timeout, and fails.
Yikes, yeah, just getting the FC and QC up and running takes ~4m, so about 2m per cluster. Then each test takes between 1.5m and 2m -- with 4 tests total, we time out while completing the last one. The main hog is the keystore setup -- around 30s or more to create and provide all the entries:
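To make the arithmetic above concrete, here is a rough sketch of the wall-time budget using only the figures quoted in this comment (~2m per cluster, 1.5m to 2m per test, 4 tests). The class and method names are hypothetical, and the estimate ignores any other overhead; the actual suite timeout value is not stated in this thread.

```java
import java.time.Duration;

// Hypothetical helper, not part of the Elasticsearch test framework.
final class SuiteBudgetSketch {

    // Total wall time: startup for every cluster plus a fixed duration per test.
    static Duration total(Duration perCluster, int clusters, Duration perTest, int tests) {
        return perCluster.multipliedBy(clusters).plus(perTest.multipliedBy(tests));
    }

    public static void main(String[] args) {
        Duration perCluster = Duration.ofMinutes(2);                      // ~2m per cluster (FC and QC)
        Duration best = total(perCluster, 2, Duration.ofSeconds(90), 4);  // 1.5m per test
        Duration worst = total(perCluster, 2, Duration.ofMinutes(2), 4);  // 2m per test
        System.out.println("best case:  " + best.toMinutes() + "m");      // prints "best case:  10m"
        System.out.println("worst case: " + worst.toMinutes() + "m");     // prints "worst case: 12m"
    }
}
```

So startup plus the four tests alone lands somewhere between 10m and 12m, before counting anything else the suite does.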
There is room for improvement here...
I need to check the timestamps for a non-FIPS run, but I wonder if this is so heinously slow because we are using slower default algos in FIPS mode, e.g., for keystore encryption.
Nice catch! Yeah, that sounds way too slow. I'm wondering if we could avoid running the same commands on every node, and instead construct the keystore once and copy it to all nodes (potentially using
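A minimal sketch of the "build once, copy everywhere" idea, assuming the keystore has already been created in some staging directory. The class name, directory layout, and placeholder file contents are all hypothetical; this is not how the test cluster framework currently provisions keystores.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical helper, not part of the Elasticsearch test framework.
final class KeystoreCopySketch {

    // Copy one pre-built keystore file into each node's config directory,
    // so the expensive keystore creation happens once instead of once per node.
    static void distribute(Path builtKeystore, List<Path> nodeConfigDirs) throws IOException {
        for (Path dir : nodeConfigDirs) {
            Files.createDirectories(dir);
            Files.copy(builtKeystore, dir.resolve(builtKeystore.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("keystore-sketch");
        // Placeholder bytes standing in for a real keystore (assumption for the demo).
        Path built = Files.write(root.resolve("elasticsearch.keystore"), new byte[] {1, 2, 3});
        List<Path> nodes = List.of(root.resolve("node-0/config"), root.resolve("node-1/config"));
        distribute(built, nodes);
        for (Path dir : nodes) {
            System.out.println(dir.resolve("elasticsearch.keystore") + " exists: "
                    + Files.exists(dir.resolve("elasticsearch.keystore")));
        }
    }
}
```

The saving scales with node count, since the key derivation cost is paid once rather than per node.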
@slobodanadamovic that'd be a nice improvement for larger clusters for sure -- the slow test run here, though, was for single-node clusters, so it wouldn't benefit much. Delightfully, I've actually made this whole thing even slower in 8.14, by increasing the KDF iteration count from 10_000 to 210_000 in #107107 🤦
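For a feel of what the iteration bump costs, here is a small timing sketch that derives a key at both counts using the JDK's built-in PBKDF2. The algorithm name (`PBKDF2WithHmacSHA512`), the 128-bit key length, the salt, and the password are assumptions for illustration; only the two iteration counts come from this thread. Absolute timings depend on hardware and on the security provider in use (a FIPS provider may behave differently), so no expected numbers are shown.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

// Hypothetical benchmark helper, not Elasticsearch code.
final class KdfCostSketch {

    // Derive a 128-bit key with PBKDF2 (algorithm and key size are assumptions).
    static byte[] derive(char[] password, byte[] salt, int iterations) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 128);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA512").generateSecret(spec).getEncoded();
        } finally {
            spec.clearPassword();
        }
    }

    // Wall time in milliseconds for a single derivation at the given iteration count.
    static long timeMillis(int iterations) throws GeneralSecurityException {
        byte[] salt = "fixed-demo-salt!".getBytes(StandardCharsets.UTF_8);
        long start = System.nanoTime();
        derive("keystore-password".toCharArray(), salt, iterations);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        timeMillis(10_000); // warm-up so JIT compilation does not skew the first measurement
        System.out.println("10_000 iterations:  " + timeMillis(10_000) + " ms");
        System.out.println("210_000 iterations: " + timeMillis(210_000) + " ms");
    }
}
```

PBKDF2 cost grows linearly with the iteration count, so 210_000 iterations is roughly 21 times the work of 10_000 for every keystore operation that has to derive the key.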
In the interim: #117157
Rather than muting the suite and losing signal, bump the suite timeout to account for very slow keystore operations. We should follow this up with performance improvements around keystore setup in tests. Closes: elastic#116883 (cherry picked from commit 312f831)
Rather than muting the suite and losing signal, bump the suite timeout to account for very slow keystore operations. We should follow this up with performance improvements around keystore setup in tests. Closes: elastic#116883
Build Scans:
Reproduction Line:
Applicable branches:
8.x
Reproduces locally?:
N/A
Failure History:
See dashboard
Failure Message:
Issue Reasons:
Note:
This issue was created using new test triage automation. Please report issues or feedback to es-delivery.