Pipeline: support multi-level feedback queue #7393
dbms/src/Flash/Pipeline/Schedule/TaskQueues/MultiLevelFeedbackQueue.h
Could you briefly describe what factors determine the priority of a task?
task_queue.push_back(std::move(task));
}

double UnitQueue::accuTimeAfterDivisor()
Maybe rename it to `normalizedTime`?
ok, renamed.
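To make the renamed helper concrete, here is a minimal sketch of how a per-level queue might normalize its accumulated execution time so that levels can be compared. The names `UnitQueueSketch`, `time_divisor` and `pickLevel` are hypothetical, and the selection rule (serve the level with the smallest normalized time) is an assumption for illustration, not code taken from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one level of a multi-level feedback queue.
struct UnitQueueSketch
{
    uint64_t accu_consume_time_ns = 0; // total execution time charged to this level
    double time_divisor = 1.0;         // assumed per-level weight; a larger divisor favours the level

    // Accumulated time scaled by the level's divisor, so that levels are comparable.
    double normalizedTime() const
    {
        assert(time_divisor > 0);
        return static_cast<double>(accu_consume_time_ns) / time_divisor;
    }
};

// Assumed policy: serve the level with the smallest normalized time.
// Returns the index of that level; ties go to the lower index.
inline size_t pickLevel(const std::vector<UnitQueueSketch> & levels)
{
    assert(!levels.empty());
    size_t best = 0;
    for (size_t i = 1; i < levels.size(); ++i)
    {
        if (levels[i].normalizedTime() < levels[best].normalizedTime())
            best = i;
    }
    return best;
}
```

Under this assumed policy, a level with a larger divisor can accumulate more raw time before another level is preferred.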
 // The executing task should yield if it takes more than `YIELD_MAX_TIME_SPENT_NS`.
-if (status != Impl::TargetStatus || execute_time_ns >= YIELD_MAX_TIME_SPENT_NS)
+if (status != Impl::TargetStatus || total_time_spent >= YIELD_MAX_TIME_SPENT_NS)
Is `YIELD_MAX_TIME_SPENT_NS` the same for queues at different levels?
I think it can be the same, because the minimum time slice of the queue is 200ms, which is greater than the 100ms here.
OK
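The reasoning in this thread can be sketched as a compile-time check plus the yield condition from the diff. The 100 ms and 200 ms values come from the discussion above; the constant name `MIN_LEVEL_TIME_SLICE_NS`, the `ExecStatus` enum and the `shouldYield` helper are hypothetical and only illustrate the idea.

```cpp
#include <cstdint>

// Values quoted in the review thread: a 100 ms yield threshold and a 200 ms minimum time slice.
constexpr uint64_t YIELD_MAX_TIME_SPENT_NS = 100ULL * 1000 * 1000;
constexpr uint64_t MIN_LEVEL_TIME_SLICE_NS = 200ULL * 1000 * 1000; // hypothetical constant name

// The yield threshold fits inside even the shortest time slice,
// which is why a single threshold can serve every level.
static_assert(YIELD_MAX_TIME_SPENT_NS < MIN_LEVEL_TIME_SLICE_NS,
              "yield threshold should be shorter than the minimum time slice");

// Hypothetical status type standing in for the task's execution status.
enum class ExecStatus
{
    RUNNING,
    WAITING,
    FINISHED
};

// Mirrors the condition in the diff: stop executing when the status changes
// or when the time budget for one scheduling turn is used up.
constexpr bool shouldYield(ExecStatus status, ExecStatus target, uint64_t total_time_spent_ns)
{
    return status != target || total_time_spent_ns >= YIELD_MAX_TIME_SPENT_NS;
}
```

If a lower-priority level ever used a time slice shorter than the yield threshold, the `static_assert` would fail and signal that a per-level threshold is needed.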
 static QueueType newTaskQueue()
 {
-    return std::make_unique<FIFOTaskQueue>();
+    return std::make_unique<CPUMultiLevelFeedbackQueue>();
Why does the CPU queue use `MultiLevelFeedbackQueue` while the IO queue uses `FIFOTaskQueue`? Also, maybe we should add a configuration variable to decide which queue to use?
I think IO-related operations should use a different type of queue, for example one that performs spill before restore.
> And I think maybe we should add a configure variable to decide which queue to used?

OK.
config added.
LGTM
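A config-driven choice of queue, as agreed in this thread, might look roughly like the sketch below. The thread does not show the setting that was actually added, so the option values `"fifo"` and `"mlfq"`, the `TaskQueueKind` enum, and every `*Sketch` type are hypothetical placeholders.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical minimal queue interface; the real queue classes have a richer API.
struct TaskQueueSketch
{
    virtual ~TaskQueueSketch() = default;
    virtual std::string name() const = 0;
};

struct FIFOQueueSketch : TaskQueueSketch
{
    std::string name() const override { return "fifo"; }
};

struct MLFQSketch : TaskQueueSketch
{
    std::string name() const override { return "mlfq"; }
};

enum class TaskQueueKind
{
    FIFO,
    MLFQ
};

// Parses an assumed config value; unknown values are rejected instead of silently defaulting.
inline TaskQueueKind parseTaskQueueKind(const std::string & value)
{
    if (value == "fifo")
        return TaskQueueKind::FIFO;
    if (value == "mlfq")
        return TaskQueueKind::MLFQ;
    throw std::invalid_argument("unknown task queue type: " + value);
}

// Factory that builds the queue selected by the config.
inline std::unique_ptr<TaskQueueSketch> newTaskQueue(TaskQueueKind kind)
{
    switch (kind)
    {
    case TaskQueueKind::FIFO:
        return std::make_unique<FIFOQueueSketch>();
    case TaskQueueKind::MLFQ:
        return std::make_unique<MLFQSketch>();
    }
    throw std::logic_error("unreachable task queue kind");
}
```

Keeping the parse step separate from the factory lets the CPU and IO schedulers read different settings while sharing one construction path.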
UnitType io_pending_time = 0; \
UnitType await_time = 0;

class LocalTaskProfileInfo
What does `Local` mean? Do we have `RemoteTaskProfileInfo`?
I think that in the future every `TaskProfileInfo` will be aggregated to calculate the amount of resources used by the query, but for now I can remove the `Local` prefix.
Renamed to `TaskProfileInfo`.
class LocalTaskProfileInfo
{
public:
    PROFILE_MEMBER(UInt64)
It seems that `PROFILE_MEMBER` is used in only one place. Is the macro necessary?
ok, removed.
class LocalTaskProfileInfo
{
public:
Maybe make these variables private and expose them through accessor methods?
ok, done
Other LGTM
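For reference, the outcome of the last few threads (no macro, private members, accessors) could take a shape like the sketch below. The five counters are the ones named in the PR description; the class name `TaskProfileInfoSketch` and all method names are hypothetical, and the real class may differ.

```cpp
#include <cstdint>

// Hypothetical sketch of a per-task profile with private counters and accessors.
class TaskProfileInfoSketch
{
public:
    // Mutators: each adds a measured duration, in nanoseconds, to one counter.
    void addCPUExecuteTime(uint64_t ns) { cpu_execute_time += ns; }
    void addCPUPendingTime(uint64_t ns) { cpu_pending_time += ns; }
    void addIOExecuteTime(uint64_t ns) { io_execute_time += ns; }
    void addIOPendingTime(uint64_t ns) { io_pending_time += ns; }
    void addAwaitTime(uint64_t ns) { await_time += ns; }

    // Read-only accessors.
    uint64_t getCPUExecuteTime() const { return cpu_execute_time; }
    uint64_t getCPUPendingTime() const { return cpu_pending_time; }
    uint64_t getIOExecuteTime() const { return io_execute_time; }
    uint64_t getIOPendingTime() const { return io_pending_time; }
    uint64_t getAwaitTime() const { return await_time; }

private:
    // Written out explicitly instead of being generated by a macro.
    uint64_t cpu_execute_time = 0;
    uint64_t cpu_pending_time = 0;
    uint64_t io_execute_time = 0;
    uint64_t io_pending_time = 0;
    uint64_t await_time = 0;
};
```

Keeping the counters private means a scheduler can read them for prioritization without being able to reset or overwrite them by accident.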
#include <Flash/Pipeline/Schedule/TaskQueues/MultiLevelFeedbackQueue.h>
#include <assert.h>
#include <common/likely.h>
Suggested change:
-#include <Flash/Pipeline/Schedule/TaskQueues/MultiLevelFeedbackQueue.h>
-#include <assert.h>
-#include <common/likely.h>
+#include <Flash/Pipeline/Schedule/TaskQueues/MultiLevelFeedbackQueue.h>
+#include <common/likely.h>
+#include <assert.h>
But clang-format likes this :)
#include <Common/Logger.h>
#include <Common/MemoryTracker.h>
#include <Flash/Pipeline/Schedule/Tasks/TaskProfileInfo.h>
#include <memory.h>
Suggested change:
-#include <Common/Logger.h>
-#include <Common/MemoryTracker.h>
-#include <Flash/Pipeline/Schedule/Tasks/TaskProfileInfo.h>
-#include <memory.h>
+#include <Common/Logger.h>
+#include <Common/MemoryTracker.h>
+#include <Flash/Pipeline/Schedule/Tasks/TaskProfileInfo.h>
+#include <memory.h>
But clang-format likes this :)
…Queue.cpp Co-authored-by: xzhangxian1008 <[email protected]>
Co-authored-by: xzhangxian1008 <[email protected]>
This pull request has been accepted and is ready to merge. Commit hash: f593ab2
What problem does this PR solve?
Issue Number: ref #6518
Problem Summary:
What is changed and how it works?
Add `TaskProfileInfo` to record `cpu_execute_time`, `cpu_pending_time`, `io_execute_time`, `io_pending_time` and `await_time` of `Task`.
… `TaskProfileInfo`.
Check List
Tests
gtest_filter=*Event*
gtest_filter=*TaskScheduler*
gtest_filter=*Executor*
gtest_filter=*ComputeServerRunner*
gtest_filter=*TestMLFQTaskQueue*
Side effects
Documentation
Release note