ES search rate very high when TBS enabled #6639

Closed
Tracked by #6894
bryce-b opened this issue Nov 16, 2021 · 2 comments · Fixed by #7211

bryce-b commented Nov 16, 2021

APM Server version (apm-server version): 7.16.0-SNAPSHOT

Description of the problem including expected versus actual behavior:
When tail-based sampling (TBS) is enabled, the ES search rate doubles.
[Screenshot: Screen Shot 2021-11-16 at 9 59 07 AM]
In the screenshot above, TBS is enabled at 21:30.

Steps to reproduce:

  1. Start a cloud deployment.
  2. Apply APM integrations to the deployment.
  3. Generate data with apm_integration_testing.
  4. Apply the following configuration to APM Server:

     apm-server:
       data_streams:
         enabled: true
       sampling:
         keep_unsampled: false
         tail:
           enabled: true
           policies:
             - sample_rate: 0.1

bryce-b added the bug label Nov 16, 2021
simitt mentioned this issue Dec 17, 2021
simitt added this to the 8.1 milestone Dec 17, 2021
simitt changed the title from "Tail Based Sampling: ES search rate very high when TBS enabled" to "ES search rate very high when TBS enabled" Dec 17, 2021
axw self-assigned this Jan 24, 2022

axw commented Feb 7, 2022

With some additional debug logging in place, I can see that the TBS subscriber is repeatedly returning the same trace IDs. Still digging...

axw commented Feb 7, 2022

OK, found the issue. When there are more than 1000 docs, the subscriber performs multiple searches but does not update the minimum constraint on _seq_no between them, causing this loop to never terminate:

for maxObservedSeqno < maxSeqno {
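
To illustrate the failure mode, here is a minimal sketch of a paged read that advances its lower bound after every search. It is not the apm-server code and not the change made in the fix; `doc`, `fakeIndex`, `newFakeIndex`, `search`, and `readUpTo` are hypothetical names, and the in-memory index only stands in for Elasticsearch. The page size of 1000 mirrors the limit mentioned above.

```go
package main

import (
	"fmt"
	"sort"
)

// doc is a hypothetical stand-in for a sampled-trace document.
type doc struct {
	SeqNo   int64
	TraceID string
}

// fakeIndex is a hypothetical in-memory stand-in for an Elasticsearch index.
// docs is kept sorted by SeqNo in ascending order.
type fakeIndex struct {
	docs     []doc
	searches int // number of searches issued, to make the cost visible
}

// newFakeIndex builds an index holding n docs with sequence numbers 0..n-1.
func newFakeIndex(n int) *fakeIndex {
	ix := &fakeIndex{}
	for i := 0; i < n; i++ {
		ix.docs = append(ix.docs, doc{SeqNo: int64(i), TraceID: fmt.Sprintf("trace-%d", i)})
	}
	return ix
}

// search returns at most size docs whose SeqNo is greater than minSeqNo,
// ordered by SeqNo.
func (ix *fakeIndex) search(minSeqNo int64, size int) []doc {
	ix.searches++
	start := sort.Search(len(ix.docs), func(i int) bool {
		return ix.docs[i].SeqNo > minSeqNo
	})
	end := start + size
	if end > len(ix.docs) {
		end = len(ix.docs)
	}
	return ix.docs[start:end]
}

// readUpTo pages through the index until maxSeqNo has been observed.
func readUpTo(ix *fakeIndex, lastSeqNo, maxSeqNo int64, pageSize int) []string {
	var traceIDs []string
	maxObservedSeqNo := lastSeqNo
	for maxObservedSeqNo < maxSeqNo {
		// The lower bound must be the highest sequence number seen so far.
		// Reusing lastSeqNo here would return the same first page forever.
		page := ix.search(maxObservedSeqNo, pageSize)
		if len(page) == 0 {
			break // nothing left to read; do not spin
		}
		for _, d := range page {
			traceIDs = append(traceIDs, d.TraceID)
		}
		maxObservedSeqNo = page[len(page)-1].SeqNo
	}
	return traceIDs
}

func main() {
	ix := newFakeIndex(2500)
	ids := readUpTo(ix, -1, 2499, 1000)
	fmt.Println(len(ids), ix.searches)
}
```

With 2500 docs and a page size of 1000, this sketch issues three searches and returns each trace ID once. If the lower bound passed to `search` stayed fixed at `lastSeqNo`, every search would return the same first 1000 docs, which would match the repeated trace IDs seen in the debug logging.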
