Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improvement] Optimize send fragment logic to reduce send fragment timeout error #9720

Merged
merged 8 commits into from
Jun 3, 2022

Conversation

morningman
Copy link
Contributor

@morningman morningman commented May 21, 2022

Proposed changes

Issue Number: close #8942

Problem Summary:

This CL mainly changes:

  1. Reducing the rpc timeout problem caused by rpc waiting for the worker thread of brpc.

    1. Merge multiple fragment instances on the same BE to send requests to reduce the number of send fragment rpcs
    2. If fragments size >= 3, use 2 phase RPC: one is to send all fragments, two is to start these fragments. So that there
      will be at most 2 RPC for each query on one BE.
  2. Set the timeout of send fragment rpc to the query timeout to ensure the consistency of users' expectation of query timeout period.

  3. Do not close the connection anymore when rpc timeout occurs.

  4. Change some log level from info to debug to simplify the fe.log content.

NOTICE:

  1. Change the definition of execPlanFragment rpc, must first upgrade BE.
  2. Remove FE config remote_fragment_exec_timeout_ms

Checklist(Required)

  1. Does it affect the original behavior: (Yes)
  2. Has unit tests been added: (No Need)
  3. Has document been added or modified: (Yes)
  4. Does it need to update dependencies: (No)
  5. Are there any changes that cannot be rolled back: (No)

Further comments

If this is a relatively large or complex change, kick off the discussion at [email protected] by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions github-actions bot added the kind/docs Categorizes issue or PR as related to documentation. label May 21, 2022
@xinyiZzz
Copy link
Contributor

LGTM

@morningman morningman added this to the v1.1 milestone May 27, 2022
@morningman morningman added the dev/1.0.1-deprecated should be merged into dev-1.0.1 branch label May 27, 2022
@morningman morningman marked this pull request as draft May 28, 2022 03:04
@morningman morningman force-pushed the send_timeout branch 3 times, most recently from fb9a06e to 75b42c2 Compare May 29, 2022 05:53
@github-actions github-actions bot added area/load Issues or PRs related to all kinds of load area/routine load labels May 29, 2022
@morningman
Copy link
Contributor Author

I have made following tests:

  1. Complex query with 47 fragments. Before: 47 RPCs. After: 2 RPCs
  2. High concurrency test for query with 3 fragments: there is no impact on QPS.
  3. High concurrency test for query with 2 fragments: these is no impact on QPS.
  4. The new BE is compatible with request from old FE.

@morningman morningman marked this pull request as ready for review May 29, 2022 06:05
@morningman
Copy link
Contributor Author

@xinyiZzz @yangzhg PTAL

if (_need_wait_execution_trigger) {
// if _need_wait_execution_trigger is true, which means this instance
// is prepared but need to wait for the signal to do the rest execution.
_fragments_ctx->wait_for_start();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a timeout here, avoid occupy the running thread too much time during exceptions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need, there is a "timeout checker" on BE side to check and notify this.

if (_need_wait_execution_trigger) {
// if _need_wait_execution_trigger is true, which means this instance
// is prepared but need to wait for the signal to do the rest execution.
_fragments_ctx->wait_for_start();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a timeout here, avoid occupy the running thread too much time during exceptions.

@yangzhg
Copy link
Member

yangzhg commented May 31, 2022

I think it is better to divide into two rpc interfaces. exec_plan_fragment processing only fragments size < 3. When 2 phase RPC is required, use two new rpc interfaces exec_plan_fragment_prepare and exec_plan_fragment_start. This may be more clear

@yangzhg
Copy link
Member

yangzhg commented May 31, 2022

I have made following tests:

  1. Complex query with 47 fragments. Before: 47 RPCs. After: 2 RPCs
  2. High concurrency test for query with 3 fragments: there is no impact on QPS.
  3. High concurrency test for query with 2 fragments: these is no impact on QPS.
  4. The new BE is compatible with request from old FE.

In addition to the observed reduction in the number of RPCs, are there any mitigations for the timeout issue?

@morningman
Copy link
Contributor Author

I think it is better to divide into two rpc interfaces. exec_plan_fragment processing only fragments size < 3. When 2 phase RPC is required, use two new rpc interfaces exec_plan_fragment_prepare and exec_plan_fragment_start. This may be more clear

Let me try

@morningman
Copy link
Contributor Author

In addition to the observed reduction in the number of RPCs, are there any mitigations for the timeout issue?

I tested a sql with 3 fragments, 1FE, 1BE, jmeter, thread num 100.
before: send fragment timeout after running a few seconds.
after: no error.

Copy link
Contributor

@dataroaring dataroaring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions
Copy link
Contributor

github-actions bot commented Jun 3, 2022

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added approved Indicates a PR has been approved by one committer. reviewed labels Jun 3, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Jun 3, 2022

PR approved by anyone and no changes requested.

@morningman morningman merged commit c996334 into apache:master Jun 3, 2022
@morningman morningman added dev/merged-1.0.1-deprecated PR has been merged into dev-1.0.1 and removed dev/1.0.1-deprecated should be merged into dev-1.0.1 branch labels Jun 3, 2022
morningman added a commit that referenced this pull request Jun 3, 2022
…meout error (#9720)

This CL mainly changes:
1. Reducing the rpc timeout problem caused by rpc waiting for the worker thread of brpc.
    1. Merge multiple fragment instances on the same BE to send requests to reduce the number of send fragment rpcs
    2. If fragments size >= 3, use 2 phase RPC: one is to send all fragments, two is to start these fragments. So that there
         will be at most 2 RPC for each query on one BE.

3. Set the timeout of send fragment rpc to the query timeout to ensure the consistency of users' expectation of query timeout period.

4. Do not close the connection anymore when rpc timeout occurs.
5. Change some log level from info to debug to simplify the fe.log content.

NOTICE:
1. Change the definition of execPlanFragment rpc, must first upgrade BE.
3. Remove FE config `remote_fragment_exec_timeout_ms`
morningman added a commit to morningman/doris that referenced this pull request Jun 7, 2022
morningman added a commit to morningman/doris that referenced this pull request Jun 8, 2022
yiguolei pushed a commit that referenced this pull request Jun 8, 2022
morningman added a commit that referenced this pull request Jun 8, 2022
yiguolei pushed a commit that referenced this pull request Sep 9, 2022
…12495)


1. For query with 1656 union, the plan thrift size will be reduced from 400MB+ to 2MB.
This optimization is introduced from #4904, but lost after #9720

2. Disable ExprSubstitutionMap.verify when debug is disable.
So that the plan time of query with 1656 union will be reduced from 20s to 2s
yiguolei pushed a commit that referenced this pull request Sep 9, 2022
…12495)

1. For query with 1656 union, the plan thrift size will be reduced from 400MB+ to 2MB.
This optimization is introduced from #4904, but lost after #9720

2. Disable ExprSubstitutionMap.verify when debug is disable.
So that the plan time of query with 1656 union will be reduced from 20s to 2s
Henry2SS pushed a commit to Henry2SS/incubator-doris that referenced this pull request Sep 14, 2022
…pache#12495)

1. For query with 1656 union, the plan thrift size will be reduced from 400MB+ to 2MB.
This optimization is introduced from apache#4904, but lost after apache#9720

2. Disable ExprSubstitutionMap.verify when debug is disable.
So that the plan time of query with 1656 union will be reduced from 20s to 2s
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. area/load Issues or PRs related to all kinds of load area/routine load dev/merged-1.0.1-deprecated PR has been merged into dev-1.0.1 kind/docs Categorizes issue or PR as related to documentation. kind/improvement reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] too many "send fragment timeout. backend id" error
5 participants