[test-failed]: X-Pack Endpoint Functional Tests1.x-pack/test/security_solution_endpoint/apps/endpoint/policy_details·ts - endpoint When on the Endpoint Policy Details Page with a valid policy id "before all" hook for "should display policy view" #80978

Closed
liza-mae opened this issue Oct 19, 2020 · 15 comments
Labels
  • failed-test: A test failure on a tracked branch, potentially flaky-test
  • Feature:Policy: Security Solution Policy feature
  • Team:Defend Workflows: “EDR Workflows” sub-team of Security Solution
  • Team: SecuritySolution: Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc.
  • test-cloud

Comments

@liza-mae
Contributor

liza-mae commented Oct 19, 2020

Version: 7.10.0
Class: X-Pack Endpoint Functional Tests1.x-pack/test/security_solution_endpoint/apps/endpoint/policy_details·ts
Stack Trace:

Error: Unable to create Agent Policy via Ingest!
   at logSupertestApiErrorAndThrow (test/security_solution_endpoint/services/endpoint_policy.ts:60:11)
   at Object.createPolicy (test/security_solution_endpoint/services/endpoint_policy.ts:148:16)

Other test failures:

  • endpoint When on the Endpoint Policy Details Page and the save button is clicked "before each" hook for "should display success toast on successful save"
  • endpoint When on the Endpoint Policy Details Page when on Ingest Policy Edit Package Policy page "before each" hook for "should show callout"

Test Report: https://internal-ci.elastic.co/view/Stack%20Tests/job/elastic+estf-cloud-kibana-tests/868/testReport/

Configuration: Two Kibana instances, consistently fails in this configuration.

@liza-mae liza-mae added failed-test A test failure on a tracked branch, potentially flaky-test test-cloud labels Oct 19, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-test-triage (failed-test)

@liza-mae liza-mae added Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:Endpoint Management labels Oct 19, 2020
@elasticmachine
Contributor

Pinging @elastic/endpoint-management (Team:Endpoint Management)

@paul-tavares
Contributor

@liza-mae quick question:

re:

two Kibana instances, consistently fails in this configuration.

Would this cause two instances of the test suite to be run against Kibana concurrently?

@liza-mae
Contributor Author

Hi @paul-tavares, the two Kibana instances are behind a load balancer. Some info: https://www.elastic.co/guide/en/kibana/current/production.html#load-balancing-kibana

I am running on ESS. It is possible these failures are not related to this configuration, but I wanted to point it out as a difference and to help in reproducing/debugging the problem.

@jfsiii
Contributor

jfsiii commented Oct 19, 2020

@paul-tavares mentioned in Slack that there's a 409 conflict error. There were two recent PRs, #80506 & #79201, to enforce unique policy names.

Fleet avoids this in some tests by using an after or afterEach hook to clean up the created policies and prevent the conflict, e.g.:

after(async function () {
  await supertest
    .post(`/api/fleet/agent_policies/delete`)
    .set('kbn-xsrf', 'xxxx')
    .send({ agentPolicyId });
});

after(async () => {
  // Delete every policy created during the suite, in parallel
  const deletedPromises = createdPolicyIds.map((agentPolicyId) =>
    supertest
      .post(`/api/fleet/agent_policies/delete`)
      .set('kbn-xsrf', 'xxxx')
      .send({ agentPolicyId })
      .expect(200)
  );
  await Promise.all(deletedPromises);
});

Another option would be to make each policy name unique within the test suite (append Date.now(), etc.).
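
A minimal sketch of the unique-name option, for illustration only. `uniquePolicyName` and the `'Test policy'` base name are hypothetical and not part of the existing test services; the random suffix is an extra assumption to cover two calls landing in the same millisecond.

```typescript
// Hypothetical helper (not an existing Kibana test utility): builds a policy
// name from a base name, a timestamp, and a short random suffix.
function uniquePolicyName(base: string, now: number = Date.now()): string {
  // The random suffix reduces the chance of a clash when two calls share a timestamp.
  const suffix = Math.random().toString(36).slice(2, 8);
  return `${base} ${now}-${suffix}`;
}

// Example request body using the helper; the field values are placeholders.
const examplePolicyBody = {
  name: uniquePolicyName('Test policy'),
  description: 'Created by a functional test',
  namespace: 'default',
};
```

With names like this, a leftover policy from an earlier or parallel run would no longer block creation, although cleanup in after/afterEach would still be needed to avoid accumulating policies.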

@paul-tavares
Contributor

@jfsiii all of our tests do clean up using afterEach:

beforeEach(async () => {
  // Create a policy and navigate to Ingest app
  policyInfo = await policyTestResources.createPolicy();
  await pageObjects.ingestManagerCreatePackagePolicy.navigateToAgentPolicyEditPackagePolicy(
    policyInfo.agentPolicy.id,
    policyInfo.packagePolicy.id
  );
});
afterEach(async () => {
  if (policyInfo) {
    await policyInfo.cleanup();
  }
});

Also - this seems to be happening only in this branch (7.10) and maybe only under the given multi-kibana node setup (unclear if that is actually a contributing factor).

Can you think of any other issue in Ingest that could cause it?

@paul-tavares
Contributor

@liza-mae how do I access the screen captures taken by the failed test?

There are several other tests that failed, some of which make no sense to me and I just want to look at what was being displayed at the time of failure.

@liza-mae
Contributor Author

Hi @paul-tavares, yes you can find the screenshots this way:

  1. Click the test report link in the summary of this issue:
    https://internal-ci.elastic.co/view/Stack%20Tests/job/elastic+estf-cloud-kibana-tests/868/testReport/
  2. Click the group for the test failure, in this case xpackExtGrp2
  3. On the left hand side, click Google Cloud Storage Upload Report
  4. Search for the test title: endpoint When on the Endpoint Policy Details Page with a valid policy id
  5. The search will highlight a .png and an .html file for the test name. Click the Download link for the .png file.

I have also attached it for you :) The other failures listed in this issue have the same error displayed.

[Attached screenshot: endpoint When on the Endpoint Policy Details Page with a valid policy id before all hook]

@paul-tavares
Contributor

Thank you @liza-mae. One other thing I'm not able to identify is the commit hash for the build that was used for this env. Some of the failed tests (unrelated to this one specifically) seem to indicate running older test files against newer code.

@jfsiii
Contributor

jfsiii commented Oct 19, 2020

@paul-tavares looking at the Jenkins logs, I see the 409:

[00:00:55]               │ERROR {
[00:00:55]               │        "statusCode": 409,
[00:00:55]               │        "error": "Conflict",
[00:00:55]               │        "message": "Agent Policy '11121c50-0f97-11eb-9ba5-8336808f8cae' already exists with name 'East Coast'"
[00:00:55]               │      }
[00:00:55]               │ERROR Error: expected 200 "OK", got 409 "Conflict"

and the link back to the createPolicy from the stack trace

async createPolicy(): Promise<PolicyTestResourceInfo> {
  // create Agent Policy
  let agentPolicy: CreateAgentPolicyResponse['item'];
  try {
    const newAgentPolicyData: CreateAgentPolicyRequest['body'] = {
      name: 'East Coast',
      description: 'East Coast call center',
      namespace: 'default',
    };
    const { body: createResponse }: { body: CreateAgentPolicyResponse } = await supertest
      .post(INGEST_API_AGENT_POLICIES)
      .set('kbn-xsrf', 'xxx')
      .send(newAgentPolicyData)
      .expect(200);

I found two other files which seem to call createPolicy, policy_details & policy_list.ts

If those tests are sharing the same ES and running at the same time, they will have a potential name collision until they clean up and free up that policy name.
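
To illustrate the collision described above, here is a small in-memory simulation. It is a sketch under stated assumptions, not Fleet's implementation: `FakePolicyStore` is a hypothetical stand-in that rejects duplicate names with a 409, mirroring the behaviour seen in the Jenkins log.

```typescript
// Hypothetical in-memory store that enforces unique policy names.
// It only mimics the 200/409 behaviour seen in the log; it is not Fleet code.
class FakePolicyStore {
  private namesById: Record<string, string> = {};
  private nextId = 1;

  // Returns 409 if a policy with the same name already exists, otherwise 200 and a new id.
  create(name: string): { status: number; id?: string } {
    const taken = Object.keys(this.namesById).some((id) => this.namesById[id] === name);
    if (taken) {
      return { status: 409 };
    }
    const id = `policy-${this.nextId++}`;
    this.namesById[id] = name;
    return { status: 200, id };
  }

  // Removing a policy frees its name for reuse.
  remove(id: string): void {
    delete this.namesById[id];
  }
}

const store = new FakePolicyStore();
const suiteA = store.create('East Coast'); // first suite creates the policy
const suiteB = store.create('East Coast'); // second suite runs before the first cleans up
if (suiteA.id) {
  store.remove(suiteA.id); // first suite's cleanup hook
}
const retry = store.create('East Coast'); // the name is free again

console.log(suiteA.status, suiteB.status, retry.status); // prints: 200 409 200
```

The second create fails only while the first policy still exists, which is consistent with the idea that the failure depends on overlapping runs or missed cleanup rather than on the test logic itself.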

@liza-mae
Contributor Author

liza-mae commented Oct 19, 2020

@paul-tavares

One other thing I'm not able to identify is the commit hash for the build that was used for this env.

This was run against 7.10.0 BC2 (kibana f1c0bdd)

Some of the failed tests (unrelated to this one specifically) seem to indicate running older test files against newer code.

It is the other way around: the test files are always the latest and the code under test might be older. That is because when we fix flaky tests, I need to pull the latest tests.

@paul-tavares
Contributor

@liza-mae
Thanks again.

re: test files not being in sync with the code
That explains some of the test failures for security_solution_endpoint/, where the tests were adjusted in conjunction with the code, so those tests will now fail (as expected?). When will this be sync'd up?
For reference: this PR was merged last Friday. Once the code is brought up to that level, it should take care of the failures around page titles not syncing up.

Re: failures against Fleet APIs
I'm still not able to explain why this is happening in this branch/env. It's not an issue in master, but I will continue to look at it. Can you again confirm that the test suite is not being run more than once concurrently against the same Kibana?

@jfsiii thanks - yes, they are used heavily in our tests, but they all should clean up during afterEach. If the test suite is in fact running more than once concurrently, then I can see the problem there - but why only in this env?

@liza-mae
Contributor Author

@paul-tavares

That explains some of the test failures for security_solution_endpoint/ where the tests were adjusted in conjunction with the code, so those tests will now fail (as expected?). When will this be sync'd up?

We will sync back up once our test failures have gone down to the single digits or, preferably, zero. I don't expect that to happen soon.

I'm still not able to explain why this is happening in this branch/env. It's not an issue in master, but I will continue to look at it. Can you again confirm that the test suite is not being run more than once concurrently against the same Kibana?

I don't run the test suite concurrently on the same Kibana.

@kevinlog
Contributor

As far as I can tell, these tests failed alongside another set of unrelated tests, so I'm going to chalk it up to a build issue. I'm going to close this and will re-open if the problem persists.

@kevinlog
Contributor

Closing this particular one again. I haven't seen it fail on the last couple of 7.10 builds. I think it was due to tests being slightly out of sync.

Test not failing.
https://internal-ci.elastic.co/view/Stack%20Tests/job/elastic+estf-cloud-kibana-tests/897/testReport/
https://internal-ci.elastic.co/view/Stack%20Tests/job/elastic+estf-cloud-kibana-tests/921/testReport/

Will re-open if it occurs again. Not currently flaky in CI either.
