Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SIEM] Executes security application cypress tests on every PR #73939

Closed

Conversation

MadameSheema
Copy link
Member

Summary

We have faced some situations where our tests were failing because other teams introduced bugs in our application making our tests fails and blocking us or blocking other teams modifying/working on our source code.

Lately our failing tests have been for that reason, in order to avoid this, we want to execute or tests on every PR.

@MadameSheema MadameSheema added Team:SIEM v8.0.0 release_note:skip Skip the PR/issue when compiling release notes v7.10.0 labels Jul 31, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/siem (Team:SIEM)

}
},

'xpack-securitySolutionCypress': kibanaPipeline.functionalTestProcess('xpack-securitySolutionCypress', './test/scripts/jenkins_security_solution_cypress.sh'),
// 'xpack-visualRegression': kibanaPipeline.functionalTestProcess('xpack-visualRegression', './test/scripts/jenkins_xpack_visual_regression.sh'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove the commented out block below (I know it's not part of this PR but just saying) ... or put a comment above it about why it is commented out and important to keep around as a reminder.

Generally we just delete commented out things or remove them as we can use git history to find older information.

Copy link
Contributor

@FrankHassanabad FrankHassanabad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@cnasikas cnasikas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I think as we depend on other teams we should run our tests on every PR.

@tylersmalley
Copy link
Contributor

I believe one of the reasons for only running these tests on security changes was due to the stability of the tests. Lately, it seems most of the failures are caused by other changes which would have been addressed if they were ran on those pull requests.

@spalger @brianseeders, do either of you have issues running these on all PR's?

Copy link
Contributor

@spalger spalger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem I have with this is that the cypress tests don't have a standard debugging experience so everyone on the team needs to be educated on how to debug these tests if we start running them on every PR. This was part of the cost of doing something unique that we described.

I think we need to move these tests off of cypress and start using the unified functional testing approach so that people can debug these tests when they fail and so Operations can properly track them.

@tylersmalley
Copy link
Contributor

tylersmalley commented Jul 31, 2020

If we do decide to continue with Cypress for these tests:

@spalger
Copy link
Contributor

spalger commented Jul 31, 2020

Documentation needs added for it in https://github.com/elastic/kibana/blob/master/docs/developer/contributing/development-functional-tests.asciidoc

I disagree that writing some documentation and fixing the reporting is sufficient here. I think adding a wildly different test environment that everyone could potentially have to debug and fix tests in is a too high a cost for the supposed benefits here.

@spalger
Copy link
Contributor

spalger commented Jul 31, 2020

If SIEM is going to test parts of their application that integrate with the rest of Kibana and therefore require that the tests are run on every PR we should write them with the shared testing tools.

@MadameSheema you don't need to rewrite all of the tests using the FTR today, just move over the tests that integrate with other parts of Kibana so we can start to get coverage of the SIEM app running on every PR.

@brianseeders
Copy link
Contributor

Out of curiosity, do you have an example of a change outside of security solution that broke your tests? We would love to get to a point where we don't need to run all test suites for all PRs, so it would be nice to see a concrete example of something that's keeping us from doing so, so we can think about what to do with it in the future.

If developers are able to understand and take action on failures, I'm okay with enabling the tests for PRs. If every failure is going to lead to confusion and questions, we should make the experience a little better before turning it on. I don't know how the experience currently is, I'm just saying it should be at least on par with the experience of handling any other test failures

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

✅ unchanged

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@dhurley14
Copy link
Contributor

@brianseeders



Out of curiosity, do you have an example of a change outside of security solution that broke your tests? We would love to get to a point where we don't need to run all test suites for all PRs, so it would be nice to see a concrete example of something that's keeping us from doing so, so we can think about what to do with it in the future.

I think I have found three of the more recent examples where our tests failed due to changes made outside of the security solution plugin. I'm sure there are more examples, but I feel these are enough to provide evidence in support of some amount / all of security solution cypress tests to be run outside of changes made directly in the security solution plugin:

  1. Removal of kibana alerting’s encrypted saved objects migrations broke our tests because es_archiver data was missing updates that needed to be made in conjunction with this pr. As you can see, our tests did not run because no file from the security solution was modified in this PR but it did cause our tests to fail [SIEM] Fixes cypress build by removing alerting version within the saved object data.json.gz #73550. 
Slack convo: https://elastic.slack.com/archives/C013B57RRGE/p1595962931263500

  2. Alerting RBAC pr -> made very small changes to about 3 different files
 Adds Role Based Access-Control to the Alerting & Action plugins based on Kibana Feature Controls #67157
    If the alerting team had not made those changes to files in the security_solution, our cypress tests would not have run and those breaking changes would have put our team in deep trouble for 7.9.

  3. (Not a cypress test, but similar concept) Change in Elasticsearch caching of api keys (Improve role cache efficiency for API key roles elasticsearch#58156) broke the ability to grant api keys without a “name” which is utilized by kibana alerting (which is what our detection rules run on top of.) Our tests caught this breaking change but were skipped in favor of getting the above breaking PR in 
https://elastic.slack.com/archives/CC4SW3AUQ/p1594677817418400

@tylersmalley
Copy link
Contributor

I disagree that writing some documentation and fixing the reporting is sufficient here.

Not saying it's sufficient, just another blocker to this effort.

@mshustov
Copy link
Contributor

mshustov commented Aug 2, 2020

Once, I've experienced a hard time running APM Cypress functional tests locally, so gave up eventually and pinged the owner to help to debug the problem. Not sure the approach with custom tooling is scalable for the Kibana team size.

I think I have found three of the more recent examples where our tests failed due to changes made outside of the security solution plugin. I'm sure there are more examples, but I feel these are enough to provide evidence in support of some amount / all of security solution cypress tests to be run outside of changes made directly in the security solution plugin

The Security solutions plugin has dependencies on other plugins + the Kibana platform.

"requiredPlugins": [
"actions",
"alerts",
"data",
"dataEnhanced",
"embeddable",
"features",
"home",
"taskManager",
"inspector",
"licensing",
"maps",
"triggers_actions_ui",
"uiActions"
],
"optionalPlugins": [
"encryptedSavedObjects",
"ingestManager",
"ml",
"newsfeed",
"security",
"spaces",
"usageCollection",
"lists"
],

Why not run these tests on any change in a dependency tree? Thanks to the Kibana plugin system, we have all the deps declared. We only need to write a script to traverse the deps tree for a plugin. Or we can even extend the platform API with this functionality.

@MadameSheema
Copy link
Member Author

@spalger we don't want to be a blocker or make things harder for other teams, this is why I'm talking with my the team to see how to proceed with this, unfortunately right now we don't have the available bandwith to start with the migration.

For now, it is possible to run the tests for all the PRs but allow other teams to merge the code although something from our side fails? Just block it for security. If you consider that we can generate too much noise, at least with for all the dependencies? That would be really helpful for us to start to investigate issues asap as soon as something fails and we will not block other people. What do you think?

@spalger
Copy link
Contributor

spalger commented Aug 13, 2020

@spalger we don't want to be a blocker or make things harder for other teams, this is why I'm talking with my the team to see how to proceed with this, unfortunately right now we don't have the available bandwith to start with the migration.

How about we setup an hourly job that runs the security suite until the security team can find the time to move their tests to the common structure and the FTR?

For now, it is possible to run the tests for all the PRs but allow other teams to merge the code although something from our side fails? Just block it for security.

No, sorry, I don't think there isn't any way to only require a check for the security team that wouldn't cause a bunch of false negative reporting and decrease the usefulness/trustworthiness of CI or require a ton of changes to the reporting mechanisms we have in place.

@MadameSheema
Copy link
Member Author

How about we setup an hourly job that runs the security suite until the security team can find the time to move their tests to the common structure and the FTR?

Sounds good :)

Do you have any example we can use as a starting point?

@spalger
Copy link
Contributor

spalger commented Aug 14, 2020

I'm working on creating the job now

@MadameSheema
Copy link
Member Author

Thanks! :)

@MadameSheema MadameSheema deleted the security-tests-execution branch July 14, 2021 10:41
@MindyRS MindyRS added the Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. label Sep 23, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release_note:skip Skip the PR/issue when compiling release notes Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:SIEM v7.10.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.