-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feature-wip] (memory tracker) (step2) Hook TCMalloc new/delete automatically counts to MemTracker #8476
[feature-wip] (memory tracker) (step2) Hook TCMalloc new/delete automatically counts to MemTracker #8476
Conversation
91e918c
to
d56bac7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch is very uncomfortable for reviewer due to many files it changed. I looked at about 100 files and added a few comment.
You'd better split changes into small parts. e.g. 1. implement new memory tracker by hooking new, delete but not enable; 2. enable new memory tracker by removing old usage; 3. handle null pointer for old fasion allocate 4. add memory tracker for untracked memory usage;
This way, we can review the 2nd patch quickly and review others carefully. The majority can be put into to the 2nd part, right?
be/src/runtime/thread_context.h
Outdated
return "COMPACTION"; | ||
default: | ||
return "UNKNOWN"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use array instead of switch case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found switch case to be faster than using map. Would it be more readable to use map? or other benefits.
Reference: https://stackoverflow.com/questions/6860525/c-what-is-faster-lookup-in-hashmap-or-switch-statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I means array not map. e.g. const char * xx_desc[TYPE_NUM]. But maybe compiler translates such switch case to code like array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand, it seems that compilers does compile switch constructs into array of code pointers, as said in the link above.
I changed to array as you said, which seems more concise. thx~ @dataroaring
Also, I did a simple test in the benchmark and it seems that switch constructs are faster, I didn't analyze the assembly code carefully, performance is not the bottleneck in this case.
8252b7a
to
1a387d5
Compare
This is indeed a worse way to submit a PR, but for this current PR, I would suggest leaving it as is for now to avoid some more uncontrollable factors when splitting the PR. |
Thx for your suggestion, I will pay attention to control the size of pr later, |
@@ -416,7 +416,7 @@ VOlapTablePartitionParam::VOlapTablePartitionParam(std::shared_ptr<OlapTableSche | |||
: _schema(schema), | |||
_t_param(t_param), | |||
_slots(_schema->tuple_desc()->slots()), | |||
_mem_tracker(MemTracker::create_tracker(-1, "OlapTablePartitionParam")) { | |||
_mem_tracker(MemTracker::create_virtual_tracker(-1, "OlapTablePartitionParam")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When should I create a virtual memtracker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this tracker needs to manually consume/release
The difference between virutal tracker and non-virutal tracker:
-
non-virutal tracker
In order to ensure the absolute accuracy of non-virutal mem tracker tree statistics, there are only two ways to count: one is to modify the tls mem tracker through attach or switch, and count in the tcmalloc new/delete hook; the other is to transfer memory ownership between non-virutal trackers. -
virutal tracker
Manual consume/release as before, the reasons for designing the virutal tracker: First, to transfer memory ownership between two trackers, it will release first and then consume, which is slower than calling consume/release directly on the virutal tracker; second, through parameters After blocking the virutal tracker, it will prevent the mem tracker tree from becoming more messy, and it is safer to add or delete the virutal tracker.
The non-virutal tracker is similar to the INFO log level, and the virutal tracker is similar to the DEBUG log level.
be/src/exec/tablet_sink.cpp
Outdated
@@ -48,6 +49,7 @@ NodeChannel::NodeChannel(OlapTableSink* parent, IndexChannel* index_channel, int | |||
if (_parent->_transfer_data_by_brpc_attachment) { | |||
_tuple_data_buffer_ptr = &_tuple_data_buffer; | |||
} | |||
_node_channel_tracker = MemTracker::create_tracker(-1, "NodeChannel"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add BE id to the tracker's name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The BE id of a BE mem tracker is the same = _ =, so I added the thread id.
@@ -154,7 +154,7 @@ Status AggFnEvaluator::prepare(RuntimeState* state, const RowDescriptor& desc, M | |||
_intermediate_slot_desc = intermediate_slot_desc; | |||
|
|||
_string_buffer_len = 0; | |||
_mem_tracker = mem_tracker; | |||
_mem_tracker = MemTracker::create_virtual_tracker(-1, "AggFnEvaluator", mem_tracker); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The memtracker
param of AggFnEvaluator::prepare
can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The memtracker
param is used in Expr::prepare on the next line, which I modified.
In addition, it is also the parent of the virtual tracker _mem_tracker
.
@@ -99,6 +102,9 @@ class SnapshotManager { | |||
// snapshot | |||
Mutex _snapshot_mutex; | |||
uint64_t _snapshot_base_id; | |||
|
|||
// TODO(zxy) used after | |||
std::shared_ptr<MemTracker> _mem_tracker = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the next pr, this tracker will be switched to the tls mem tracker in the public func of SnapshotManager
.
In this pr, I created all the trackers that will be used in the future, and built a complete mem tracker tree. (Perhaps it would be better to do this with a pr alone... Do you think it needs to be deleted first? = =||| )
RowBatch::RowBatch(const RowDescriptor& row_desc, int capacity, MemTracker* mem_tracker) | ||
: _mem_tracker(mem_tracker), | ||
RowBatch::RowBatch(const RowDescriptor& row_desc, int capacity) | ||
: _mem_tracker(thread_local_ctx.get()->_thread_mem_tracker_mgr->mem_tracker()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that this rowbatch is transferred from one thread to another.
If so, the _mem_tracker also need to be changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that modifying the mem tracker of rowbatch
is not directly related to the thread used.
My understanding of this question is: whether there is a rowbatch
, consume tls A mem tracker when new, release tls B mem tracker when delete, and both trackers are not equal to 0 when they are finally destructed.
The actual code now avoids the above problem. Through RowBatch::acquire_state
and RowBatch::transfer_resource_ownership
, complete the mem_tracker update and memory ownership transfer of buffers in two row_batch
, avoiding the new and delete of buffers in a rowbatch
on different trackers.
For example: In OlapScanNode::get_next
, the rowbatch
created by Scanner will be transferred to the external parameter row_batch
through RowBatch::acquire_state
, and the mem tracker of the buffer will be modified. Ownership is transferred in two row_batches via update_mem_tracker
.
But I'm not sure if the buffer mem_tracker is updated in all similar places, which requires manual maintenance to ensure that the new and delete of a buffer are in the same tracker.
- I will test in further.
Similarly, mem pool
also has a similar situation, mem pool
also provides MemPool::acquire_data
and MemPool::exchange_data
to complete the transfer of chunks. However, I used to add the tls mem tracker when allocate in each chunk
, and found that the chunk
tracker is different from the tls mem tracker when the mem pool
is destructed.
1a387d5
to
d10fc82
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR approved by at least one committer and no changes requested. |
PR approved by anyone and no changes requested. |
151a29a
to
21c861e
Compare
21c861e
to
90d03c6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR approved by at least one committer and no changes requested. |
This pr seems will make build ASAN fail. |
@yangzhg PTAL |
I reproduced the problem in ASAN and it has been fixed in the following pr |
…emory usage (#8605) In pr #8476, all memory usage of a process is recorded in the process mem tracker, and all memory usage of a query is recorded in the query mem tracker, and it is still necessary to manually call `transfer to` to track the cached memory size. We hope to separate out more detailed memory usage based on Hook TCMalloc new/delete + TLS mem tracker. In this pr, the more detailed mem tracker is switched to TLS, which automatically and accurately counts more detailed memory usage than before.
Proposed changes
Issue Number: close #7196 (step 2/3)
Problem Summary:
Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G
Implement a new way of memory statistics based on TCMalloc New/Delete Hook, MemTracker and TLS, and it is expected that all memory new/delete/malloc/free of the BE process can be counted.
Checklist(Required)
Further comments
specific purpose:
1. Accurate query memory limit.
2. Clearer tree structure of mem tracker.
3. Complete and detailed BE memory statistics.
Note, It is independent of the recording of tcmalloc hook in the thread local tracker, so the same block of memory is recorded independently in these two trackers.
Note, almost all the previous manual statistics positions are currently reserved, and some places may not be necessary, and will be revised in the future (TODO)
The difference between virutal tracker and non-virutal tracker:
non-virutal tracker
In order to ensure the absolute accuracy of non-virutal mem tracker tree statistics, there are only two ways to count: one is to modify the tls mem tracker through attach or switch, and count in the tcmalloc new/delete hook; the other is to transfer memory ownership between non-virutal trackers.
virutal tracker
Manual consume/release as before, the reasons for designing the virutal tracker: First, to transfer memory ownership between two trackers, it will release first and then consume, which is slower than calling consume/release directly on the virutal tracker; second, through parameters After blocking the virutal tracker, it will prevent the mem tracker tree from becoming more messy, and it is safer to add or delete the virutal tracker.
The non-virutal tracker is similar to the INFO log level, and the virutal tracker is similar to the DEBUG log level.
Existing performance problems and solutions:
The mem tracker shared between threads has low performance when it is frequently consumed/released in the new/delete hook;
The memory consumption of the current thread is cached in tls, and after the cumulative consumption reaches 2M, the consume / release mem tracker will be called once to avoid frequent calls.
At this stage, std::shared_ptr is used to save the mem tracker in tls. When a thread frequently switches the mem tracker, the use count of std::shared_ptr is frequently changed and the performance is low;
During an attach query, tls caches all mem trackers that have been switched and uncommitted memory consumption, and does not need to reset ptr when switching to the same mem tracker next time. In the future, the mem tracker in tls should be changed to raw pointers to solve this problem from the source (TODO)
In the next pr
more detailed memory statistics such as exec node, exprs, hash table, etc. will be realized through the mem tracker switch during the thread attach query.
Accuracy verification
If the following three values are the same, the overall statistics are accurate:
Query memory can be successfully limited
Modify session variable
set exec_mem_limit = 2147
, submit a query, it will return full OOM details:Memory exceed limit. fragment=xxx details=xxx., on backend=xxx. Memory left in process limit=xxx. current tracker <label =xxx, used=xxx, limit=xxx, failed alloc size=xxx>.
View more detailed memory statistics
Modify session variable
set mem_tracker_level = 2
, you can see INSTANCE level statistics, ref mem_tracker_level.Pformance Testing
The performance is reduced by about 1%~2% after opening the Hook TCMalloc new/delete.
Test Set: ssb LINEORDER 600w
Result:
Test SQL: