-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v8: optimize getHeapStatistics
#469
Conversation
Is it really faster? And is v8.getHeapStatistics() enough of a bottleneck to warrant the extra complexity? I imagine most applications call it at most once every few seconds. |
Not sure, I'm really bad at microbenchmarking. Was just following @trevnorris's TODO |
exports.getHeapStatistics = v8binding.getHeapStatistics; | ||
|
||
exports.getHeapStatistics = function() { | ||
var data = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is done incorrectly. The data
object should be allocated memory once and then reused for every request. There are two ways to do this. The fastest would be to create a Persistent<Object>
on Environment
and write the values to that. Another way is to simply get the data off the Object
by passing it in. I'll comment with more implementation details.
This has a memory leak since it overrides the memory allocated to the object without actually free'ing it.
$ /usr/bin/time ./iojs /tmp/get-heap-stats.js
2377.3 ns/op
2.45user 0.03system 0:02.48elapsed 100%CPU (0avgtext+0avgdata 242016maxresident)k
Using this script:
var v8 = require('v8');
var ITER = 1e6;
var t = process.hrtime();
for (var i = 0; i < ITER; i++) {
v8.getHeapStatistics()
}
t = process.hrtime(t);
console.log(((t[0] * 1e9 + t[1]) / ITER).toFixed(1) + ' ns/op');
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't memory freed on gc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vkurchatkin I'm sorry. I misread the initial implementation and interpreted the maxresident
to suggest there was a memory leak. I thought I saw var data = {};
outside of the getHeapStatistics()
function. Which would have meant every call to getHeapStatistics()
would have attached memory to the same object, leaving the previously allocated memory hanging around and unable to be collected. My mistake. :)
@vkurchatkin Here's the implementation I was thinking of: diff --git a/lib/v8.js b/lib/v8.js
index d4e4c34..2c7a2fa 100644
--- a/lib/v8.js
+++ b/lib/v8.js
@@ -13,7 +13,20 @@
// OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
'use strict';
-
var v8binding = process.binding('v8');
+var smalloc = require('smalloc');
+var data = {};
+
+smalloc.alloc(5, data, smalloc.Types.Uint32);
+
exports.getHeapStatistics = v8binding.getHeapStatistics;
-exports.setFlagsFromString = v8binding.setFlagsFromString;
+exports.getHeapStatistics = function getHeapStatistics() {
+ v8binding.getHeapStatistics(data);
+ return {
+ 'total_heap_size': data[0],
+ 'total_heap_size_executable': data[1],
+ 'total_physical_size': data[2],
+ 'used_heap_size': data[3],
+ 'heap_size_limit': data[4]
+ };
+};
diff --git a/src/node_v8.cc b/src/node_v8.cc
index 47e802e..d9694f8 100644
--- a/src/node_v8.cc
+++ b/src/node_v8.cc
@@ -22,23 +22,23 @@ using v8::Value;
void GetHeapStatistics(const FunctionCallbackInfo<Value>& args) {
- Environment* env = Environment::GetCurrent(args);
+ ASSERT(args[0]->IsObject());
+
Isolate* isolate = args.GetIsolate();
HeapStatistics s;
+
isolate->GetHeapStatistics(&s);
- Local<Object> info = Object::New(isolate);
- // TODO(trevnorris): Setting many object properties in C++ is a significant
- // performance hit. Redo this to pass the results to JS and create/set the
- // properties there.
-#define V(name) \
- info->Set(env->name ## _string(), Uint32::NewFromUnsigned(isolate, s.name()))
- V(total_heap_size);
- V(total_heap_size_executable);
- V(total_physical_size);
- V(used_heap_size);
- V(heap_size_limit);
+ Local<Object> obj = args[0].As<Object>();
+ char* data = static_cast<char*>(obj->GetIndexedPropertiesExternalArrayData());
+
+#define V(i, name) \
+ data[i] = static_cast<uint32_t>(s.name());
+ V(0, total_heap_size);
+ V(1, total_heap_size_executable);
+ V(2, total_physical_size);
+ V(3, used_heap_size);
+ V(4, heap_size_limit);
#undef V
- args.GetReturnValue().Set(info);
}
@bnoordhuis Using the same bench script posted prior, execution time goes from ~1.4us/op to ~150ns/op. IMO that's a worth while gain. |
721ffa8
to
8c8e23d
Compare
@trevnorris amended commit to reuse buffer, please take a look |
+1 |
This LGTM. |
added a commit that seems to make things slightly faster |
e651804
to
514d858
Compare
@vkurchatkin Thanks for taking the time to implement 514d858. Interestingly it doesn't offer any performance or memory footprint benefit. Seems V8 has highly optimized the ability to retrieve externally allocated array data from an |
514d858
to
c95f6a4
Compare
exports.getHeapStatistics = v8binding.getHeapStatistics; | ||
var smalloc = require('smalloc'); | ||
|
||
var data = smalloc.alloc(5, smalloc.Types.Uint32); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@trevnorris Doesn't this always leak a small amount of memory on exit? One more thing to add to my valgrind suppression file, I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. It depends on the GC to cleanup the memory, and since the object sticks around indefinitely then it'll show up. Though we could amend the second commit to free the memory on exit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it a problem? buffer pool behaves the same way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @bnoordhuis' question was more rhetorical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it was. Never mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bnoordhuis This isn't to say I don't like the idea of having a common way to cleanup memory on exit that's been allocated. It is annoying to constantly have to sort through so many extra lines in valgrind. Do you have those documented? Could use that as a starting point and possibly create a class or something that is used for allocations that need to be cleaned up on exit.
@trevnorris ping |
V(heap_size_limit); | ||
Local<Object> obj = args[0].As<Object>(); | ||
uint32_t* data = | ||
static_cast<uint32_t*>(obj->GetIndexedPropertiesExternalArrayData()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably have a CHECK_NE(data, nullptr)
and GetIndexedPropertiesExternalArrayDataType() and GetIndexedPropertiesExternalArrayDataLength() checks. The last two should maybe be ASSERT_EQ checks because they are cheap but not wholly free.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had thought about this. On the fence on whether doing the length check is necessary, since the allocation and usage is in a single close location. What about, instead, creating a constant in C++ that is used in JS for the size of the allocation. Instead of relying on keeping the numbers in both locations the same?
c95f6a4
to
88805d4
Compare
@trevnorris @bnoordhuis addressed your concerns, PTAL |
@vkurchatkin Great job. Thanks for those changes. @bnoordhuis Have any additional feedback? I think it is worth investigating creating a standardized way of allocating memory like this on an object so it's automatically cleaned up when the application exits. |
|
||
CHECK_NE(data, nullptr); | ||
ASSERT(obj->GetIndexedPropertiesExternalArrayDataType() == kHeapStatisticsBufferType); | ||
ASSERT(obj->GetIndexedPropertiesExternalArrayDataLength() == kHeapStatisticsBufferLength); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style: long lines, please wrap at 80 columns.
I realize we don't have ASSERT_EQ yet. Can you update this when #529 lands?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#529 landed. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has landed.
88805d4
to
f1f75ff
Compare
V(1, total_heap_size_executable, kTotalHeapSizeExecutableIndex); \ | ||
V(2, total_physical_size, kTotalPhysicalSizeIndex); \ | ||
V(3, used_heap_size, kUsedHeapSizeIndex); \ | ||
V(4, heap_size_limit, kHeapSizeLimitIndex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to go on about this but please don't use semicolons in x-macros, they tend to make code and error messages very confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok! at least now I know how this trick is called
f1f75ff
to
b7ce826
Compare
updated one more time |
@vkurchatkin Don't hate me but I have two more nits:
|
Since setting object properties in C++ can be slow, pass data to JS using preallocated smalloc buffer and create object in JS instead.
b7ce826
to
a16ccd2
Compare
@bnoordhuis that's ok. squashed and amended the commit |
Since setting object properties in C++ can be slow, pass data to JS using preallocated smalloc buffer and create object in JS instead. PR-URL: #469 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Trevor Norris <[email protected]>
Thanks Vladimir, landed in 5435cf2! |
Wow, that felt like a long journey. Congrats @vkurchatkin |
Thanks much @vkurchatkin. Great piece of work. |
Set statistics data to external uint32 array instead of creating object.
R=@trevnorris or @bnoordhuis ?