-
Notifications
You must be signed in to change notification settings - Fork 29.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
node_mksnapshot.cc: code cache generated based on temporary RO heap layout #47636
Comments
Thanks for reporting this, I did fbe1478 because of the same issue but I didn't realize that the RO heap could also change after serialization. I think what I was seeing from https://bugs.chromium.org/p/v8/issues/detail?id=12718 was that the code cache should be serialized from the same heap where the snapshot was taken to match the RO heap layout, but the order didn't matter as much. Option 1 sounds good enough for me for now, thanks (3 would be better in the long term, or maybe some cleverer way to encode the RO heap positions can be invented to avoid requiring the layout to be fixed). Is it safe to still use the serialized isolate for compiling code? If so I think maybe simply moving Line 1213 in 5c86f22
CreateBlob() would be enough? (I tried it in this branch, which seems to work fine with the V8 on main branch, but I'm not sure if it works with your patch)
|
Re option 1: no, unfortunately it's not enough to move code-cache serialization til after CreateBlob. It has to use precisely the same initial heap shape as 'production' isolates, i.e. an Isolate deserialized from the final snapshot. The simplest solution is to do code-cache serialization in a separate binary. Any opinions on how we should proceed on the V8 side until this is fixed? I would like to land my change (crrev.com/c/4440686) soon, but am currently blocked by the node failure at https://logs.chromium.org/logs/node-ci/buildbucket/cr-buildbucket/8783294649412924721/+/u/build/compile/stdout. Suggestion: disable |
To clarify, is it safe to deserialize that snapshot in the same process later and use the deserialized isolate to generate code cache? Or does it have to be deserialized from a new process? Does turning off shared RO heap make a difference here? I am concerned that more differences in build configs e.g. disabling code cache in V8's build would result in more breakages that go unnoticed until it gets fixed properly, it depends on how long it's going to take to fix it though. |
Or, from what I can tell from https://chromium-review.googlesource.com/c/v8/v8/+/4440686, it seems possible to make the timing of the sealing of the RO space configurable in V8 internally? If V8 keeps the old timing of sealing the RO heap for the embedder API, and uses the new extended timing for the internal default snapshot, we could have some time to tackle it without turning off code cache in Node.js while the V8 default snapshot gets the benefit. We still need to figure out how to address the potential mismatch of the code cache, but has always been the case anyway, and having the V8 default snapshot built somewhat differently from the snapshots built using the embedder API seems reasonable to me. Setting aside the code cache mismatch issue, I can see that maybe for the default V8 snapshot, the more moved from the isolate snapshot to the RO snapshot, the better, but that's not necessarily the case for snapshots built from the embedder API. |
With some custom logic (which would fully tear down RO space prior to deserialization) that'd work, but not out of the box. Currently we only fully support this in a new process.
In V8, I've moved the Also note RO space currently isn't unsealed when building custom snapshots. The steps that that led to the crash I've been seeing were:
|
I think for Node.js that might be an easier way out if we can still do it in the same process, or otherwise we need to update node.gyp to allow another circular dependency in the current circular dependency (libnode -> mkcodecache -> libnode + mksnapshot ->libnode), which isn't necessarily impossible, but would be more complicated to implement. |
We fixed this on the V8 side by preserving RO space layout through serialization-deserialization. |
See github.com/nodejs/node/issues/47636. If enabled, node creates the code cache on a transitory read-only heap state that may be invalidated when the heap snapshot is taken. This has worked so far, but will break in the near future. Disable this option until we have a solution in node. Note also the slight confusion between node_use_node_code_cache and node_use_code_cache: gn/node/BUILD.gn uses `node_use_code_cache` to set `NODE_USE_NODE_CODE_CACHE=1`. Bug: v8:13949 Change-Id: Idbe1b18a36ff6eb6508dc6265fdb69418de04881 Reviewed-on: https://chromium-review.googlesource.com/c/v8/node-ci/+/4474327 Commit-Queue: Patrick Thier <[email protected]> Reviewed-by: Patrick Thier <[email protected]> Auto-Submit: Jakob Linke <[email protected]>
This reverts commit 742268f. Reason for revert: V8 now preserves RO space layout through serialization crrev.com/c/4543306 Original change's description: > Temporarily disable node_use_node_code_cache > > See github.com/nodejs/node/issues/47636. > > If enabled, node creates the code cache on a transitory read-only heap > state that may be invalidated when the heap snapshot is taken. This > has worked so far, but will break in the near future. Disable this > option until we have a solution in node. > > Note also the slight confusion between node_use_node_code_cache > and node_use_code_cache: gn/node/BUILD.gn uses `node_use_code_cache` > to set `NODE_USE_NODE_CODE_CACHE=1`. > > Bug: v8:13949 > Change-Id: Idbe1b18a36ff6eb6508dc6265fdb69418de04881 > Reviewed-on: https://chromium-review.googlesource.com/c/v8/node-ci/+/4474327 > Commit-Queue: Patrick Thier <[email protected]> > Reviewed-by: Patrick Thier <[email protected]> > Auto-Submit: Jakob Linke <[email protected]> Fixed: v8:13949 Change-Id: I454809c6393f41ff80fc593cb2972a061d614d1c Reviewed-on: https://chromium-review.googlesource.com/c/v8/node-ci/+/4577062 Auto-Submit: Jakob Linke <[email protected]> Reviewed-by: Patrick Thier <[email protected]> Commit-Queue: Patrick Thier <[email protected]>
V8 now requires the code cache to be compiled with a finalized read-only space, so we need to serialize the snapshot to get a finalized read-only space first, then deserialize it to compile the code cache. PR-URL: #49099 Refs: #47636 Refs: https://bugs.chromium.org/p/v8/issues/detail?id=13789 Reviewed-By: Yagiz Nizipli <[email protected]>
V8 now requires the code cache to be compiled with a finalized read-only space, so we need to serialize the snapshot to get a finalized read-only space first, then deserialize it to compile the code cache. PR-URL: #49099 Refs: #47636 Refs: https://bugs.chromium.org/p/v8/issues/detail?id=13789 Reviewed-By: Yagiz Nizipli <[email protected]>
Version
No response
Platform
No response
Subsystem
No response
What steps will reproduce the bug?
No response
How often does it reproduce? Is there a required condition?
No response
What is the expected behavior? Why is that the expected behavior?
No response
What do you see instead?
fbe1478 moved code cache generation into node_mksnapshot, but I suspect this process is (and has always been) unsupported by V8. It has just worked somewhat by accident til now.
A hidden requirement of V8's code serializer is that it runs inside an Isolate that has been deserialized from the final snapshot, i.e. the same snapshot that is used in the node binary itself. Why? Because SerializeReadOnlyObjectReference encodes the RO object ref as {chunk index, offset}, but RO space layout can and does change between RO space serialization- and deserialization-time (and thus the serialized encoded RO ref may be invalid).
I'm hitting this locally (based on a V8 change I'm currently working on), note how the offset of the "load" string changes:
in node_mksnapshot:
0x0a78926543d9 <String[4]: #load> offset 82904
in node
0x32bf83c14789 <String[4]: #load> offset 83848
.This results in the code-deserializer creating a pointer pointing at a random location inside RO space.
We could fix this in several ways,
node_use_node_code_cache
@joyeecheung wdyt?
Additional information
No response
The text was updated successfully, but these errors were encountered: