Commit
arena_memory_resource optimization: disable tracking allocated blocks by default (#732)

This is done similarly to #702.

Previously `arena_memory_resource` maintained a set of allocated blocks, but this was only used for reporting/debugging purposes. Maintaining this set requires a `set::find` at every deallocation, which can get expensive when there are many allocated blocks. This PR moves the tracking behind a default-undefined preprocessor flag. This results in some speedup in the random allocations benchmark for `arena_memory_resource`.

Tracking can be enabled by defining `RMM_POOL_TRACK_ALLOCATIONS`.

This should also fix the Spark small shuffle buffer issue: NVIDIA/spark-rapids#1711

Before:
```console
------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations
------------------------------------------------------------------------------------
BM_RandomAllocations/arena_mr/1000/1           1.36 ms         1.36 ms          457
BM_RandomAllocations/arena_mr/1000/4           1.21 ms         1.21 ms          517
BM_RandomAllocations/arena_mr/1000/64          1.22 ms         1.22 ms          496
BM_RandomAllocations/arena_mr/1000/256         1.08 ms         1.07 ms          535
BM_RandomAllocations/arena_mr/1000/1024       0.949 ms        0.948 ms          583
BM_RandomAllocations/arena_mr/1000/4096       0.853 ms        0.848 ms          680
BM_RandomAllocations/arena_mr/10000/1          98.7 ms         98.3 ms            8
BM_RandomAllocations/arena_mr/10000/4          65.4 ms         65.4 ms            9
BM_RandomAllocations/arena_mr/10000/64         16.6 ms         16.5 ms           38
BM_RandomAllocations/arena_mr/10000/256        11.2 ms         11.2 ms           48
BM_RandomAllocations/arena_mr/10000/1024       9.45 ms         9.44 ms           62
BM_RandomAllocations/arena_mr/10000/4096       9.24 ms         9.20 ms           59
BM_RandomAllocations/arena_mr/100000/1         7536 ms         7536 ms            1
BM_RandomAllocations/arena_mr/100000/4         3002 ms         3002 ms            1
BM_RandomAllocations/arena_mr/100000/64         170 ms          170 ms            3
BM_RandomAllocations/arena_mr/100000/256        107 ms          107 ms            7
BM_RandomAllocations/arena_mr/100000/1024      96.0 ms         95.7 ms            6
BM_RandomAllocations/arena_mr/100000/4096      86.7 ms         86.7 ms            6
```

After:
```console
------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations
------------------------------------------------------------------------------------
BM_RandomAllocations/arena_mr/1000/1           1.20 ms         1.20 ms          519
BM_RandomAllocations/arena_mr/1000/4           1.08 ms         1.08 ms          588
BM_RandomAllocations/arena_mr/1000/64          1.11 ms         1.11 ms          552
BM_RandomAllocations/arena_mr/1000/256        0.957 ms        0.957 ms          611
BM_RandomAllocations/arena_mr/1000/1024       0.857 ms        0.857 ms          687
BM_RandomAllocations/arena_mr/1000/4096       0.795 ms        0.793 ms          724
BM_RandomAllocations/arena_mr/10000/1          73.0 ms         73.0 ms           10
BM_RandomAllocations/arena_mr/10000/4          45.7 ms         45.7 ms           14
BM_RandomAllocations/arena_mr/10000/64         14.4 ms         14.4 ms           40
BM_RandomAllocations/arena_mr/10000/256        9.87 ms         9.82 ms           60
BM_RandomAllocations/arena_mr/10000/1024       8.72 ms         8.72 ms           69
BM_RandomAllocations/arena_mr/10000/4096       7.32 ms         7.30 ms           85
BM_RandomAllocations/arena_mr/100000/1         6384 ms         6384 ms            1
BM_RandomAllocations/arena_mr/100000/4         2480 ms         2480 ms            1
BM_RandomAllocations/arena_mr/100000/64         147 ms          147 ms            5
BM_RandomAllocations/arena_mr/100000/256        103 ms          103 ms            7
BM_RandomAllocations/arena_mr/100000/1024      78.1 ms         78.1 ms            9
BM_RandomAllocations/arena_mr/100000/4096      72.3 ms         72.3 ms            9
```

@abellina

Authors:
  - Rong Ou (@rongou)

Approvers:
  - Mark Harris (@harrism)
  - Conor Hoekstra (@codereport)

URL: #732
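
The compile-time gating described in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not RMM's actual code: `toy_resource`, `live_blocks`, and the use of `std::malloc` as the backing allocator are hypothetical stand-ins, and only the macro name `RMM_POOL_TRACK_ALLOCATIONS` comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#ifdef RMM_POOL_TRACK_ALLOCATIONS
#include <set>
#endif

// Hypothetical toy resource for illustration only; this is NOT
// rmm::mr::arena_memory_resource and does not mirror its internals.
class toy_resource {
 public:
  void* allocate(std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (p == nullptr) { return nullptr; }
    ++live_;
#ifdef RMM_POOL_TRACK_ALLOCATIONS
    // Debug-only bookkeeping: remember every block handed out.
    allocated_.insert(p);
#endif
    return p;
  }

  void deallocate(void* p) {
    if (p == nullptr) { return; }
#ifdef RMM_POOL_TRACK_ALLOCATIONS
    // This lookup is the per-deallocation cost the commit message refers to.
    auto const it = allocated_.find(p);
    assert(it != allocated_.end());
    allocated_.erase(it);
#endif
    --live_;
    std::free(p);
  }

  // Cheap counter that is available whether or not tracking is compiled in.
  std::size_t live_blocks() const { return live_; }

 private:
  std::size_t live_{0};
#ifdef RMM_POOL_TRACK_ALLOCATIONS
  std::set<void*> allocated_;  // exists only when the flag is defined
#endif
};
```

Building this sketch with `-DRMM_POOL_TRACK_ALLOCATIONS` compiles the set bookkeeping in; without the flag, the guarded regions disappear and deallocation skips the set lookup entirely.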