Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Serverless][Testing] Improve serverless force logout and enable some related tests #184202

Conversation

gergoabraham
Copy link
Contributor

@gergoabraham gergoabraham commented May 24, 2024

Summary

Looks like there are some tests being flaky because redirect after forceLogout is not performed correctly on https://cloud.elastic.co/logout?redirectTo=%2Fprojects%2F.
So far there has been tries to increase timeout to improve flakiness but there are still random fails.

This PR tries to improve on this by simply adding a retry for the force logout process, therefore the logout page will be reloaded, in order to retrigger the redirect process.

There are some other cases when there was an error message in the browser: Failed to decode downloaded font: https://cloud.elastic.co/449450163f45882ebfc2.woff. This also looks flakiness, so I'm hoping that the retry/reload mechanism helps with those cases, too.

For future flakiness

In case we still experience flakiness, maybe we can simply don't expect the redirection to finish - i don't think we care about it: when we are on this redirect page, we are already logged out.

The redirect is an external dependency from the perspective of these tests - if cloud.elastic.co cannot redirect, it shouldn't really matter.

Seemingly related tests

Skipped tests that are enabled in this PR

fixes #180401
fixes #184319

Failed but not yet skipped tests

closes #174975
closes #175159
closes #167818
closes #176413
closes #175612
closes #175785
closes #175606
closes #175599
closes #175083
closes #174439
closes #174346
closes #174566
closes #173192
closes #174111
closes #174116
closes #173950
closes #173074
closes #170899
closes #170890
closes #167597
closes #169818
closes #167585
closes #167792
closes #168751

Other

This test is related, and could be enabled, but since it's been skipped for months, the business logic has changed, so the test needs to be updated: #168037

Note

If a test fails, the issue contains its error message. If it fails again in the future with a different error, the same issue will be reopened, although it's not the same error. And, I searched for the error message amongst these issues, therefore:

  • I found test fail issues containing the same error message originally, but recently being failed on a different reason: these are of course not included in this list as the recent error reason is something different
  • and probably I did not find test fails that originally failed with a different reason, but now with this, which is unfortunate

@gergoabraham gergoabraham added release_note:skip Skip the PR/issue when compiling release notes backport:skip This commit does not require backporting Team:Defend Workflows “EDR Workflows” sub-team of Security Solution labels May 24, 2024
@gergoabraham gergoabraham self-assigned this May 24, 2024
@gergoabraham
Copy link
Contributor Author

/ci

@gergoabraham gergoabraham marked this pull request as ready for review May 24, 2024 13:10
@gergoabraham gergoabraham requested review from a team as code owners May 24, 2024 13:10
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-defend-workflows (Team:Defend Workflows)

@gergoabraham
Copy link
Contributor Author

@elasticmachine merge upstream

@gergoabraham
Copy link
Contributor Author

@elasticmachine merge upstream

Copy link
Contributor

@tonyghiani tonyghiani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gergoabraham
Copy link
Contributor Author

@elasticmachine merge upstream

await alert.accept();
}

// Timeout has been doubled here in attempt to quiet the flakiness
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doubled is a relative word. I'd suggest to update the comment about timeout "timeout is set at 40 seconds for reducing flakiness". Just so that it gets updated next time we update the timeout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// After logging out, the user can be redirected to various locations depending on the context. By default, we
// expect the user to be redirected to the login page. However, if the login page is not available for some reason,
// we should simply wait until the user is redirected *elsewhere*.
// Timeout has been doubled here in attempt to quiet the flakiness
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise here about doubled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @gergoabraham

@gergoabraham gergoabraham merged commit 22c2a67 into elastic:main May 31, 2024
17 checks passed
@gergoabraham gergoabraham deleted the improve-serverless-force-logout-and-enable-tests branch May 31, 2024 11:21
@@ -16,8 +16,7 @@ import {
export default function (providerContext: FtrProviderContext) {
const { loadTestFile, getService, getPageObjects } = providerContext;

// FLAKY: https://github.com/elastic/kibana/issues/180401
describe.skip('endpoint', function () {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @gergoabraham We saw similar flakiness before and from appex-qa perspective it is related to basic authentication in UI tests (line 42 calls svlCommonPage.login()) for serverless env when serverless env is setup with mock-idp to do SAML authentication. I tried retries before myself and flaky-test-runner proved it won't help as redirects issue still persist.

The right solution is to migrate the FTR tests to role-based SAML authentication, it is mostly 1 line change and we didn't see similar flakiness:
https://github.com/elastic/kibana/blob/main/x-pack/test_serverless/functional/page_objects/svl_common_page.ts#L73-L130
More details with examples are here

You probably saw recent email about new preferred login method for a local serverless Kibana, with that we consider deprecating svlCommonPage.login() as all e2e UI tests should be using SAML auth.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @dmlemeshko, a big thank you for letting me know! 🙌

addressing the change here:

gergoabraham added a commit that referenced this pull request Aug 5, 2024
…#189248)

## Summary

This PR re-enables our `endpoint` FTRs, and tries to address the
serverless login/forceLogout flakiness both in `endpoint` and
`integration` FTRs, by using a new `svlCommonPage.loginWithRole()` login
functionality, suggested in this comment:
#184202 (comment)

flaky runner:
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6678

closes: #186089
closes: #186086

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting release_note:skip Skip the PR/issue when compiling release notes Team:Defend Workflows “EDR Workflows” sub-team of Security Solution v8.15.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failing test: Serverless Security Functional Tests - Common Group 1.x-pack/test_serverless/functional/test_suites/common/home_page/home_page·ts - Serverless Common UI - Home Page home page "before all" hook for "has project header" Failing test: Serverless Observability Functional Tests.x-pack/test_serverless/functional/test_suites/observability/observability_log_explorer/header_menu·ts - serverless observability UI Observability Log Explorer Header menu "after all" hook for "should inject the app header menu on the top navbar" Failing test: Serverless Observability Functional Tests - Common Group 1.x-pack/test_serverless/functional/test_suites/common/platform_security/navigation/avatar_menu·ts - Serverless Common UI - Platform Security Avatar menu "before all" hook for "is displayed" Failing test: Serverless Search Screenshot Creation.x-pack/test_serverless/functional/test_suites/search/screenshot_creation/response_ops_docs/stack_connectors/connectors·ts - Screenshots - serverless search UI response ops docs stack connectors connectors "before all" hook for "connectors list screenshot" Failing test: Serverless Security Functional Tests.x-pack/test_serverless/functional/test_suites/security/ftr/landing_page·ts - serverless security UI landing page "after all" hook for "has serverless side nav"
7 participants