Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing test: Chrome X-Pack UI Functional Tests.x-pack/test/functional/apps/discover/async_scripted_fields·js - discover async search with scripted fields query return results with valid scripted field #78553

Closed
kibanamachine opened this issue Sep 26, 2020 · 41 comments · Fixed by #143346
Labels
failed-test A test failure on a tracked branch, potentially flaky-test impact:critical This issue should be addressed immediately due to a critical level of impact on the product. loe:small Small Level of Effort Project:AsyncSearch Background search, partial results, async search services. SharedUX/fix-it-week Bugs that have been groomed and queued up for the team's next fix it week

Comments

@kibanamachine
Copy link
Contributor

kibanamachine commented Sep 26, 2020

A test failed on a tracked branch

Error: retry.tryForTime timeout: Error: expected '14,005' to equal '13,301'
    at Assertion.assert (/dev/shm/workspace/parallel/2/kibana/packages/kbn-expect/expect.js:100:11)
    at Assertion.be.Assertion.equal (/dev/shm/workspace/parallel/2/kibana/packages/kbn-expect/expect.js:227:8)
    at Assertion.be (/dev/shm/workspace/parallel/2/kibana/packages/kbn-expect/expect.js:69:22)
    at /dev/shm/workspace/parallel/2/kibana/x-pack/test/functional/apps/discover/async_scripted_fields.js:122:61
    at process._tickCallback (internal/process/next_tick.js:68:7)
    at onFailure (/dev/shm/workspace/parallel/2/kibana/test/common/services/retry/retry_for_success.ts:28:9)
    at retryForSuccess (/dev/shm/workspace/parallel/2/kibana/test/common/services/retry/retry_for_success.ts:68:13)

First failure: Jenkins Build

@kibanamachine kibanamachine added the failed-test A test failure on a tracked branch, potentially flaky-test label Sep 26, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-test-triage (failed-test)

@azasypkin azasypkin added Team:Visualizations Visualization editors, elastic-charts and infrastructure triage_needed labels Sep 28, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-app (Team:KibanaApp)

@wylieconlon wylieconlon self-assigned this Sep 28, 2020
@wylieconlon
Copy link
Contributor

This fails every day or so, and I think it could be related to the timeout-related changes from async search. Running a flaky test suite to see if it's easy to reproduce.

@wylieconlon
Copy link
Contributor

Ran 42 runs on master and they all succeeded. I will test 7.x separately, and see if I can reproduce in cloud at all.

@kertal
Copy link
Member

kertal commented Sep 30, 2020

FYI @LeeDr there seems to be a kind a flakiness, from time to time showing a false hit count

@kibanamachine
Copy link
Contributor Author

New failure: Jenkins Build

@wylieconlon wylieconlon removed their assignment Oct 1, 2020
@LeeDr LeeDr self-assigned this Oct 1, 2020
@LeeDr
Copy link

LeeDr commented Oct 1, 2020

I added this test on June 29th 2020. And Rashmi revised to setRoles to the minimal roles required on July 16th.
The first failure was on August 4th. I'm finding 14 failures since then with 2 of those on 7.x builds and the other 12 on PRs. So that's out of about 100,000 builds.
Here's the most recent (October 1st 2020, 10:24:35.506) failure console output (slightly edited to remove spurious info). In this case, it never loaded any results and shows an error dialog;

[00:08:28]             └-> query return results with valid scripted field
[00:08:28]               └-> "before each" hook: global before each
[00:08:28]               │ debg TestSubjects.click(indexPattern-switch-link)
[00:08:28]               │ debg Find.clickByCssSelector('[data-test-subj="indexPattern-switch-link"]') with timeout=10000
[00:08:28]               │ debg Find.findByCssSelector('[data-test-subj="indexPattern-switch-link"]') with timeout=10000
[00:08:28]               │ debg Find.setValue('[data-test-subj="indexPattern-switcher"] input', 'logstash-*')
[00:08:28]               │ debg Find.findByCssSelector('[data-test-subj="indexPattern-switcher"] input') with timeout=10000
[00:08:28]               │ debg Find.clickByCssSelector('[data-test-subj="indexPattern-switcher"] [title="logstash-*"]') with timeout=10000
[00:08:28]               │ debg Find.findByCssSelector('[data-test-subj="indexPattern-switcher"] [title="logstash-*"]') with timeout=10000
[00:08:28]               │ debg isGlobalLoadingIndicatorVisible
[00:08:28]               │ debg TestSubjects.exists(globalLoadingIndicator)
[00:08:28]               │ debg Find.existsByDisplayedByCssSelector('[data-test-subj="globalLoadingIndicator"]') with timeout=1500
[00:08:28]               │ debg TestSubjects.exists(globalLoadingIndicator-hidden)
[00:08:28]               │ debg Find.existsByCssSelector('[data-test-subj="globalLoadingIndicator-hidden"]') with timeout=100000
[00:08:31]               │ debg QueryBar.setQuery(php* OR *jpg OR *css*)
[00:08:31]               │ debg TestSubjects.click(queryInput)
[00:08:31]               │ debg Find.clickByCssSelector('[data-test-subj="queryInput"]') with timeout=10000
[00:08:31]               │ debg Find.findByCssSelector('[data-test-subj="queryInput"]') with timeout=10000
[00:08:32]               │ debg TestSubjects.getAttribute(queryInput, value)
[00:08:32]               │ debg TestSubjects.find(queryInput)
[00:08:32]               │ debg Find.findByCssSelector('[data-test-subj="queryInput"]') with timeout=10000
[00:08:32]               │ debg TestSubjects.click(querySubmitButton)
[00:08:32]               │ debg Find.clickByCssSelector('[data-test-subj="querySubmitButton"]') with timeout=10000
[00:08:32]               │ debg Find.findByCssSelector('[data-test-subj="querySubmitButton"]') with timeout=10000
[00:08:32]               │ debg isGlobalLoadingIndicatorVisible
[00:08:32]               │ debg TestSubjects.exists(globalLoadingIndicator)
[00:08:32]               │ debg Find.existsByDisplayedByCssSelector('[data-test-subj="globalLoadingIndicator"]') with timeout=1500
[00:08:34]               │ debg TestSubjects.exists(globalLoadingIndicator-hidden)
[00:08:34]               │ debg Find.existsByCssSelector('[data-test-subj="globalLoadingIndicator-hidden"]') with timeout=100000
[00:08:35]               │ info [o.e.x.c.a.AsyncResultsService] [kibana-ci-immutable-centos-tests-xxl-1601560039535949685] failed to update expiration time for async-search [FlpWWlc2THlNUUtHYk1WelB1UzdnOVEcODF0SkRwQjdUMzZ6Q0x2YjBwREdJUToxMDg1Mg==]
[00:08:35]               │      org.elasticsearch.transport.RemoteTransportException: [kibana-ci-immutable-centos-tests-xxl-1601560039535949685][127.0.0.1:6133][indices:data/write/update[s]]
[00:08:35]               │      Caused by: org.elasticsearch.index.engine.VersionConflictEngineException: [ZVZW6LyMQKGbMVzPuS7g9Q]: version conflict, required seqNo [15], primary term [1]. current document has seqNo [16] and primary term [1]
[00:08:35]               │ proc [kibana]   log   [14:27:38.084] [error][data][elasticsearch] [resource_not_found_exception]: FlpWWlc2THlNUUtHYk1WelB1UzdnOVEcODF0SkRwQjdUMzZ6Q0x2YjBwREdJUToxMDg1Mg==
[00:08:35]               │ERROR browser[SEVERE] http://localhost:6131/internal/search/ese/FlpWWlc2THlNUUtHYk1WelB1UzdnOVEcODF0SkRwQjdUMzZ6Q0x2YjBwREdJUToxMDg1Mg== - Failed to load resource: the server responded with a status of 404 (Not Found)
[00:08:35]               │ debg TestSubjects.getVisibleText(discoverQueryHits)
[00:08:35]               │ debg TestSubjects.find(discoverQueryHits)
[00:08:35]               │ debg Find.findByCssSelector('[data-test-subj="discoverQueryHits"]') with timeout=10000
[00:08:45]               │ debg --- retry.tryForTime error: Waiting for element to be located By(css selector, [data-test-subj="discoverQueryHits"])
[00:08:45]               │      Wait timed out after 10021ms
[00:08:45]               │ debg isGlobalLoadingIndicatorVisible
[00:08:45]               │ debg TestSubjects.exists(globalLoadingIndicator)
[00:08:45]               │ debg Find.existsByDisplayedByCssSelector('[data-test-subj="globalLoadingIndicator"]') with timeout=1500
[00:08:47]               │ debg --- retry.tryForTime error: [data-test-subj="globalLoadingIndicator"] is not displayed
[00:08:47]               │ debg TestSubjects.exists(globalLoadingIndicator-hidden)
[00:08:47]               │ debg Find.existsByCssSelector('[data-test-subj="globalLoadingIndicator-hidden"]') with timeout=100000
[00:08:47]               │ debg TestSubjects.getVisibleText(discoverQueryHits)
[00:08:47]               │ debg TestSubjects.find(discoverQueryHits)
[00:08:47]               │ debg Find.findByCssSelector('[data-test-subj="discoverQueryHits"]') with timeout=10000

[00:08:58]               │ debg --- retry.tryForTime error: Waiting for element to be located By(css selector, [data-test-subj="discoverQueryHits"])
[00:08:58]               │      Wait timed out after 10037ms
[00:08:58]               │ debg isGlobalLoadingIndicatorVisible
[00:08:58]               │ debg TestSubjects.exists(globalLoadingIndicator)
[00:08:58]               │ debg Find.existsByDisplayedByCssSelector('[data-test-subj="globalLoadingIndicator"]') with timeout=1500
[00:09:00]               │ debg --- retry.tryForTime error: [data-test-subj="globalLoadingIndicator"] is not displayed
[00:09:00]               │ debg TestSubjects.exists(globalLoadingIndicator-hidden)
[00:09:00]               │ debg Find.existsByCssSelector('[data-test-subj="globalLoadingIndicator-hidden"]') with timeout=100000
[00:09:00]               │ debg TestSubjects.getVisibleText(discoverQueryHits)
[00:09:00]               │ debg TestSubjects.find(discoverQueryHits)
[00:09:00]               │ debg Find.findByCssSelector('[data-test-subj="discoverQueryHits"]') with timeout=10000
[00:09:10]               │ debg --- retry.tryForTime error: Waiting for element to be located By(css selector, [data-test-subj="discoverQueryHits"])
[00:09:10]               │      Wait timed out after 10023ms
[00:09:11]               │ info Taking screenshot "/dev/shm/workspace/parallel/3/kibana/x-pack/test/functional/screenshots/failure/discover async search with scripted fields query return results with valid scripted field.png"
[00:09:11]               │ info Current URL is: http://localhost:6131/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:%27Sep%2018,%202015%20@%2019:37:13.000%27,to:%27Sep%2023,%202015%20@%2002:30:09.000%27))&_a=(columns:!(_source),filters:!(),index:eddf62c0-85bb-11ea-a555-0b6139dddc19,interval:auto,query:(language:kuery,query:%27php*%20OR%20*jpg%20OR%20*css*%27),sort:!())
[00:09:11]               │ info Saving page source to: /dev/shm/workspace/parallel/3/kibana/x-pack/test/functional/failure_debug/html/discover async search with scripted fields query return results with valid scripted field.html
[00:09:11]               └- ✖ fail: discover async search with scripted fields query return results with valid scripted field
[00:09:11]               │      Error: retry.tryForTime timeout: TimeoutError: Waiting for element to be located By(css selector, [data-test-subj="discoverQueryHits"])
[00:09:11]               │ Wait timed out after 10023ms
[00:09:11]               │     at /dev/shm/workspace/kibana/node_modules/selenium-webdriver/lib/webdriver.js:842:17
[00:09:11]               │     at process._tickCallback (internal/process/next_tick.js:68:7)
[00:09:11]               │       at onFailure (/dev/shm/workspace/parallel/3/kibana/test/common/services/retry/retry_for_success.ts:28:9)
[00:09:11]               │       at retryForSuccess (/dev/shm/workspace/parallel/3/kibana/test/common/services/retry/retry_for_success.ts:68:13)
[00:09:11]               │ 
[00:09:11]               │ 

And the screenshot;
discover async search with scripted fields query return results with valid scripted field

@LeeDr
Copy link

LeeDr commented Oct 1, 2020

Failure on September 30th 2020, 09:49:14.546 looked the same.
. same on September 29th 2020, 11:20:22.365

On September 28th 2020, 15:13:11.238 the failure screenshot looked different. It looks like it loaded the page with 14,005 hits (which is the expected pre-query hit count), and then did the query, hit an error, and didn't clear the previous results.
discover async search with scripted fields query return results with valid scripted field (1)

I checked 3 more before this and they had the same resource_not_found error dialog.

I think I need to make the test click on the See the full error button to capture what's there.

@LeeDr
Copy link

LeeDr commented Oct 1, 2020

Found this in the Jenkins failed test log;

[00:08:47]               │ info [o.e.x.c.a.AsyncResultsService] [kibana-ci-immutable-debian-tests-xxl-1601110029429513606] failed to update expiration time for async-search [FmNISDNMTlRMVFd5RUptMF9fVjRBZnccc1FKMHRac0VSTC1USFlQSFVrZ0lqZzoxMDg0Ng==]
[00:08:47]               │      org.elasticsearch.transport.RemoteTransportException: [kibana-ci-immutable-debian-tests-xxl-1601110029429513606][127.0.0.1:6123][indices:data/write/update[s]]
[00:08:47]               │      Caused by: org.elasticsearch.index.engine.VersionConflictEngineException: [cHH3LNTLTWyEJm0__V4Afw]: version conflict, required seqNo [21], primary term [1]. current document has seqNo [22] and primary term [1]
[00:08:47]               │ proc [kibana]   log   [09:26:37.381] [error][data][elasticsearch] [resource_not_found_exception]: FmNISDNMTlRMVFd5RUptMF9fVjRBZnccc1FKMHRac0VSTC1USFlQSFVrZ0lqZzoxMDg0Ng==
[00:08:47]               │ERROR browser[SEVERE] http://localhost:6121/internal/search/ese/FmNISDNMTlRMVFd5RUptMF9fVjRBZnccc1FKMHRac0VSTC1USFlQSFVrZ0lqZzoxMDg0Ng== - Failed to load resource: the server responded with a status of 404 (Not Found)
[00:08:47]               │ debg TestSubjects.getVisibleText(discoverQueryHits)
[00:08:47]               │ debg TestSubjects.find(discoverQueryHits)
[00:08:47]               │ debg Find.findByCssSelector('[data-test-subj="discoverQueryHits"]') with timeout=10000
[00:08:47]               │ debg --- retry.tryForTime error: expected '14,005' to equal '13,301'

@LeeDr LeeDr assigned lizozom and unassigned LeeDr Oct 1, 2020
@LeeDr
Copy link

LeeDr commented Oct 1, 2020

@lizozom with this additional info from the failed test, is this something you could look into? It's not failing all that often so it may be pretty hard to reproduce.

It initially started failing around Aug 4th with the resource_no_found_exception dialog and log messages failed to update expiration time for async-search, version conflict.

But seems to have changed around September 29th to just show Not Found, with the same underlying log messages in the test debug log.

@wylieconlon wylieconlon added Project:AsyncSearch Background search, partial results, async search services. Team:AppArch labels Oct 1, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-app-arch (Team:AppArch)

@wylieconlon wylieconlon removed Feature:Discover Discover Application Team:Visualizations Visualization editors, elastic-charts and infrastructure triage_needed labels Oct 1, 2020
@wylieconlon wylieconlon added the bug Fixes for quality problems that affect the customer experience label Oct 1, 2020
@lukasolson lukasolson self-assigned this Oct 1, 2020
@tylersmalley
Copy link
Contributor

tylersmalley commented Dec 15, 2020

Skipped, blocking master/7.x

master: 003396e
7.x: 1a8a6db

@kibanamachine
Copy link
Contributor Author

New failure: Jenkins Build

1 similar comment
@kibanamachine
Copy link
Contributor Author

New failure: Jenkins Build

gmmorris added a commit to ymao1/kibana that referenced this issue Dec 15, 2020
* master: (66 commits)
  [Alerting] fixes broken Alerting Example plugin (elastic#85774)
  [APM] Service overview instances table (elastic#85770)
  [Security Solution] Unskip timeline creation Cypress test (elastic#85871)
  properly recognize enterprise licenses (elastic#85849)
  [SecuritySolution][Detections] Adds SavedObject persistence to Signals Migrations (elastic#85690)
  [TSVB] Fix functional tests flakiness and unskip them (elastic#85388)
  [Fleet] Change permissions for Fleet enroll role (elastic#85802)
  Gauge visualization can no longer be clicked to filter on values since Kibana 7.10.0 (elastic#84768)
  [Security Solution][Detections] Add alert source to detection rule action context (elastic#85488)
  [Discover] Don't display hide/show button for histogram when there's no time filter (elastic#85424)
  skip flaky suite (elastic#78553)
  License checks for alerts plugin (elastic#85649)
  skip flaky suite (elastic#84992)
  skip 'query return results valid for scripted field' elastic#78553
  Allow action types to perform their own mustache variable escaping in parameter templates (elastic#83919)
  [ML] More machine learning links in doc_links_service.ts (elastic#85365)
  Removed Alerting & Event Log deprecated fields that should not be using (elastic#85652)
  Closes elastic#79995 by adding new tab in transaction details to show related trace logs. (elastic#85859)
  Fix outdated jest snapshot
  [Maps] Surface on prem EMS (elastic#85729)
  ...
@Dosant Dosant added SharedUX/fix-it-week Bugs that have been groomed and queued up for the team's next fix it week and removed bug Fixes for quality problems that affect the customer experience labels Feb 24, 2021
@lukasolson
Copy link
Member

Resolved by #94653.

@kibanamachine
Copy link
Contributor Author

New failure: CI Build - main

@exalate-issue-sync exalate-issue-sync bot added the impact:critical This issue should be addressed immediately due to a critical level of impact on the product. label Jul 21, 2022
@tsullivan
Copy link
Member

Not sure if this test is still truly flaky.

The cause of the last failure is Error: retry.try timeout: TimeoutError: Waiting for element to be located By(css selector, [data-test-subj="indexPattern-switcher"] [title="logstash-*"])

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
failed-test A test failure on a tracked branch, potentially flaky-test impact:critical This issue should be addressed immediately due to a critical level of impact on the product. loe:small Small Level of Effort Project:AsyncSearch Background search, partial results, async search services. SharedUX/fix-it-week Bugs that have been groomed and queued up for the team's next fix it week
Projects
None yet
Development

Successfully merging a pull request may close this issue.