This repository has been archived by the owner on Nov 28, 2020. It is now read-only.

Better measurement of Acmeair memory footprint #132

Closed
kunalspathak opened this issue Aug 14, 2017 · 7 comments

Comments

@kunalspathak
Member

Have a way to measure the memory footprint of the Acmeair benchmark over the entire run duration rather than just measuring the footprint before and after. The idea is to collect RSS every second and, after the benchmark run is complete, derive a normalized number from the samples.

@kunalspathak changed the title from "Acmeair memory footprint" to "Better measurement of Acmeair memory footprint" on Aug 14, 2017

@mlippautz

I was pointed here by @bmeurer to give some hints on how V8 and Chromium handle this scenario.

V8 exposes several APIs that can be used to inspect its memory usage. Basically, anything around Isolate::GetHeapStatistics [1] could be useful. The main numbers you probably want are [2]:

  • HeapStatistics::total_physical_size() for ~RSS
  • HeapStatistics::used_heap_size() for actual payload
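
As a minimal sketch of reading these two counters from inside a node process: node's built-in `v8` module exposes `v8.getHeapStatistics()`, whose `total_physical_size` and `used_heap_size` fields correspond to the C++ accessors above. The helper name `readHeapNumbers` and the shape of the returned object are assumptions made for illustration only.

```javascript
const v8 = require('v8');

// Hypothetical helper: take one reading of the two heap counters.
function readHeapNumbers() {
  const stats = v8.getHeapStatistics();
  return {
    timeMs: Date.now(),
    totalPhysicalSize: stats.total_physical_size, // bytes, approximates the heap's share of RSS
    usedHeapSize: stats.used_heap_size,           // bytes, actual payload
  };
}

console.log(readHeapNumbers());
```

Note that these numbers only cover the V8 heap; memory allocated outside of it by the embedder is not included.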

For overall memory statistics, V8 does not emit any trace events but rather relies on the embedder, e.g. node, to sample the usage. Chromium implements this through its MemoryDumpProvider interface; more specifically, the V8 part is implemented in V8IsolateMemoryDumpProvider [3].

Chromium distinguishes between light and heavy dumps:

  • light: Sampled every few seconds; contains the numbers pointed out above.
  • heavy: Sampled rarely, e.g. at the actual start-up and tear-down of a benchmark; contains detailed statistics about on-heap objects [4].
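
The light/heavy split could be mirrored in node roughly as follows. This is only a sketch, not how Chromium implements it: `startMemoryDumps`, `lightDump` and `heavyDump` are hypothetical names, and `v8.getHeapSpaceStatistics()` (a per-space breakdown) is used as a stand-in for the detailed on-heap object statistics, which node does not expose in the same form.

```javascript
const v8 = require('v8');

// Light dump: only the cheap counters.
function lightDump() {
  const { total_physical_size, used_heap_size } = v8.getHeapStatistics();
  return { kind: 'light', timeMs: Date.now(), total_physical_size, used_heap_size };
}

// Heavy dump: per-space breakdown (assumed stand-in for detailed statistics).
function heavyDump() {
  return { kind: 'heavy', timeMs: Date.now(), spaces: v8.getHeapSpaceStatistics() };
}

// Takes a heavy dump at start and at stop, and light dumps in between.
function startMemoryDumps(intervalMs = 1000) {
  const dumps = [heavyDump()];
  const timer = setInterval(() => dumps.push(lightDump()), intervalMs);
  timer.unref(); // sampling alone should not keep the process alive
  return function stop() {
    clearInterval(timer);
    dumps.push(heavyDump());
    return dumps;
  };
}
```

A benchmark would call `const stop = startMemoryDumps();` before the run and `const dumps = stop();` once it finishes.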

[1] https://cs.chromium.org/chromium/src/v8/include/v8.h?type=cs&q=GetHeapStatistics&l=7111
[2] https://cs.chromium.org/chromium/src/v8/include/v8.h?type=cs&l=6405
[3] https://cs.chromium.org/chromium/src/gin/v8_isolate_memory_dump_provider.h?l=20
[4] https://cs.chromium.org/chromium/src/v8/src/api.cc?type=cs&q=GetHeapObjectStatisticsAtLastGC&l=8707

@kunalspathak
Member Author

Thanks @mlippautz and @bmeurer for the information. We can definitely use these APIs via node to measure memory behavior. However, I think using these APIs adds cost to the benchmark code itself, and I don't want the benchmark's CPU time or memory to be affected by it. Here is what I had in mind:

  • Start a node benchmark (a server benchmark that runs for a few minutes makes more sense than Octane-style benchmarks) and get the pid of the process.
  • Have a separate node process, call it memory-recorder, that, given the pid of a benchmark process, takes memory numbers. It doesn't have to use the Heap* APIs; it could be as simple as running the fp.sh script every xxx seconds.
  • Once the benchmark process is done, memory-recorder returns the aggregate of the memory snapshots it has taken. In other words, if we plot the memory numbers on the Y-axis against the time each snapshot was taken on the X-axis, the aggregate number represents the area under the graph. The lower the number, the better the memory performance of node.
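
The aggregate described in the last step can be sketched with the trapezoidal rule. The sample shape (`timeMs`, `rssBytes`) and both function names are assumptions for illustration; how memory-recorder actually obtains the samples is left open here. Dividing the area by the run duration is one possible way to normalize the result so that runs of different lengths stay comparable.

```javascript
// samples: [{ timeMs, rssBytes }], sorted by time.
// Returns the area under the memory-over-time graph in byte-seconds.
function areaUnderCurve(samples) {
  let area = 0;
  for (let i = 1; i < samples.length; i++) {
    const dtSec = (samples[i].timeMs - samples[i - 1].timeMs) / 1000;
    // Trapezoid between two consecutive snapshots.
    area += dtSec * (samples[i].rssBytes + samples[i - 1].rssBytes) / 2;
  }
  return area;
}

// Area divided by duration: a time-weighted average in bytes.
function timeWeightedAverage(samples) {
  if (samples.length === 0) return 0;
  const durationSec =
    (samples[samples.length - 1].timeMs - samples[0].timeMs) / 1000;
  return durationSec > 0 ? areaUnderCurve(samples) / durationSec : samples[0].rssBytes;
}

// Made-up numbers, only to show the arithmetic.
const samples = [
  { timeMs: 0, rssBytes: 100 },
  { timeMs: 1000, rssBytes: 200 },
  { timeMs: 3000, rssBytes: 200 },
];
console.log(areaUnderCurve(samples)); // 550
```

Using the trapezoid rather than a plain sum of snapshots keeps the result meaningful when the sampling interval is uneven.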

Let me know if you have any questions or suggestions.

@bmeurer
Member

bmeurer commented Aug 29, 2017

As said in the meeting, I don't think this kind of data is very accurate. The Heap has a better idea of actual memory and provides more fine-grained information. I also think the additional cost is acceptable, since you always pay the cost in the same way.

@mlippautz

The Heap* APIs just read out a few pre-computed counters. Reading those every few seconds (or even every second) will not have any noticeable impact on performance and memory.

@kunalspathak
Member Author

> I also think the additional cost is acceptable, since you always pay the cost in the same way.

That is a fair point, but do we all agree on the approach below?

> Once the benchmark process is done, memory-recorder returns the aggregate of the memory snapshots it has taken. In other words, if we plot the memory numbers on the Y-axis against the time each snapshot was taken on the X-axis, the aggregate number represents the area under the graph. The lower the number, the better the memory performance of node.

@bmeurer
Member

bmeurer commented Aug 30, 2017

SGTM

@mhdawson
Member

Given that there has not been any discussion for a year, we should probably close this. Please re-open if you think we need to start the discussion again.
