Slowdown of XS worker #6661
Comments
My only hypothesis on this so far is some kind of memory fragmentation, but if the memory footprint (VmSize/RSS) is growing too, then maybe the list of active slots has a bunch of useless/dummy entries in it, or something.
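To check whether the footprint really grows alongside the slowdown, the VmSize and VmRSS fields can be sampled from /proc/<pid>/status on Linux. The following is a minimal sketch, not part of the replay tool: the function names `parseMemoryStatus` and `readMemoryFootprint` are hypothetical, and the sample text is made up for illustration.

```javascript
const fs = require('node:fs');

// Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status.
// Fields that are absent from the text are simply left out of the result.
function parseMemoryStatus(statusText) {
  const result = {};
  for (const line of statusText.split('\n')) {
    const match = /^(VmSize|VmRSS):\s+(\d+)\s+kB/.exec(line);
    if (match) {
      result[match[1]] = Number(match[2]);
    }
  }
  return result;
}

// Read the current memory footprint of a process (Linux only).
// `pid` would be the process id of the worker being observed.
function readMemoryFootprint(pid) {
  return parseMemoryStatus(fs.readFileSync(`/proc/${pid}/status`, 'utf8'));
}

// Made-up sample input, shaped like the real file, so the parser can be
// exercised on any platform.
const sample = 'Name:\tworker\nVmSize:\t  204800 kB\nVmRSS:\t   51200 kB\n';
console.log(parseMemoryStatus(sample));
```

Sampling both values at each snapshot boundary for the restarted and the non-restarted worker would show whether virtual size, resident size, or both diverge.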
Just so I don't forget, CRIU (e.g. https://criu.org/Simple_loop) is a tool that performs userspace snapshotting/restore of Linux processes, and might conceivably be useful in building some faster-to-restart reproduction tools for this (to snapshot an
We believe some of the slowdown is somewhat visible on spreadsheets behind the investigation of #6786.
A re-run of all vat transcripts under both a no-restart configuration and a forced restart after every snapshot (executing only a single worker at a time in both cases) shows that vat17 and vat18 are the most affected, with a roughly 200% slowdown. vat6 experiences a much smaller but still noticeable slowdown (25%), while vat7, which has a similar total execution time, experiences no slowdown at all. vat1 experiences a small 6% slowdown. The other vats had no meaningful execution time and were consistent across both runs.
(Execution time in seconds of 1000 deliveries between snapshots)
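For reference, here is one way the slowdown percentages could be derived from such timings. This is a sketch under an assumption: that "slowdown" means the relative increase in execution time of the no-restart run over the forced-restart run, so 200% corresponds to taking three times as long. The function name `slowdownPercent` and the numbers passed to it are illustrative only and are not taken from the measurements.

```javascript
// Relative increase of `observedSeconds` over `baselineSeconds`, in percent.
// Assumed convention: baseline = forced-restart run, observed = no-restart run.
function slowdownPercent(baselineSeconds, observedSeconds) {
  if (!(baselineSeconds > 0)) {
    throw new RangeError('baselineSeconds must be positive');
  }
  return ((observedSeconds - baselineSeconds) / baselineSeconds) * 100;
}

// Illustrative values only: a span of 1000 deliveries that takes 10 s after a
// restart and 30 s without one.
console.log(slowdownPercent(10, 30)); // 200
```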
cc @raphdev
Describe the bug
An xsnap worker that has executed without restart is slower than one freshly loaded from a recent snapshot. Its memory usage is also higher.
To Reproduce
Use the transcript-replay.js tool, configuring it to keep multiple workers (feature currently only available on branch mhofman/6588-diagnose).
Expected behavior
This kind of slowdown / memory growth should not happen as the resource usage should entirely depend on the program being executed.
Platform Environment
Additional context
Issue first discovered by @warner while investigating #6625. Confirmed by adding stats to replay tool while investigating #6588.
Screenshots