-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Startup performance #41
Comments
There isn't a ton of logs in the underlying shim that would help. @Mossaka has been working on adding open telemetry tracing to runwasi. That would be able to show more details on where we are spending time. We has some early results and I think we found it was in the setting up the container NS's that we were spending time. Cleaning up the disk operations in #40 will add some improvements. I saw an improvement in the wasmtime shim after avoiding disk reads/writes with the OCI artifacts but it was not where were spending most of our time. |
yeah this is the work @jsturtevant's referring to: containerd/runwasi#10 |
I think even only being able to set the |
A quick update on the findings of the startup performance - After adding traces to runwasi, I found that Jeager tracingThe What's next?I am following up with rust-extension folks to figure out why exactly are we waiting for such a long time to establish a connection with a socket address. I am also investigating how to properly delete the shim socket address in runwasi after the shim gets deleted, which in turn could make the startup much faster. In theory, I think |
This is excellent info, @Mossaka, thank you so much for digging into this! |
A quick update: Abel's comment at containerd/rust-extensions#263 answers why we have to wait for the socket. So the only question remains: why doesn't the runwasi shim delete it's socket address after its get closed? After digging more into the syscalls, I found that the But why does the bundle path gets deleted? I am not sure at the moment. Still trying to figure it out. It might related to youki's libcontainer (e.g. youki-dev/youki#2772). |
@Mossaka is youki-dev/youki#2780 going to affect the startup performance we're seeing here? Thanks! |
Yes, that will make the startup performance much faster |
containerd/runwasi#601 once this PR is merged, we can do a release of runwasi and then bump the version in the spin shim. It should make start up much faster |
i just bump the newest release version of the runwasi, will do some tests to see the perf gain |
Test is performed using Before#########################################
Containerd + RUNWASI with Spin SHIM
#########################################
Running experiment 1
Start timestamp in milis: 1716853258777 milliseconds
End timestamp in milis: 1716853259917 milliseconds
Time difference: 1140 milliseconds
Running experiment 2
Start timestamp in milis: 1716853260086 milliseconds
End timestamp in milis: 1716853261240 milliseconds
Time difference: 1154 milliseconds
Running experiment 3
Start timestamp in milis: 1716853261307 milliseconds
End timestamp in milis: 1716853262432 milliseconds
Time difference: 1125 milliseconds
Running experiment 4
Start timestamp in milis: 1716853262533 milliseconds
End timestamp in milis: 1716853263661 milliseconds
Time difference: 1128 milliseconds
Running experiment 5
Start timestamp in milis: 1716853264079 milliseconds
End timestamp in milis: 1716853265217 milliseconds
Time difference: 1138 milliseconds
Mean: 1137 milliseconds
Median: 1138 milliseconds
Mode: 1154 milliseconds
Standard Deviation: 10.19 milliseconds
Minimum Time: 1125 milliseconds
Maximum Time: 1154 milliseconds After#########################################
Containerd + RUNWASI with Spin SHIM
#########################################
Running experiment 1
Start timestamp in milis: 1716853849599 milliseconds
End timestamp in milis: 1716853849736 milliseconds
Time difference: 137 milliseconds
Running experiment 2
Start timestamp in milis: 1716853849817 milliseconds
End timestamp in milis: 1716853849948 milliseconds
Time difference: 131 milliseconds
Running experiment 3
Start timestamp in milis: 1716853849988 milliseconds
End timestamp in milis: 1716853850117 milliseconds
Time difference: 129 milliseconds
Running experiment 4
Start timestamp in milis: 1716853850152 milliseconds
End timestamp in milis: 1716853850273 milliseconds
Time difference: 121 milliseconds
Running experiment 5
Start timestamp in milis: 1716853850309 milliseconds
End timestamp in milis: 1716853850435 milliseconds
Time difference: 126 milliseconds
Mean: 128 milliseconds
Median: 129 milliseconds
Mode: 137 milliseconds
Standard Deviation: 5.29 milliseconds
Minimum Time: 121 milliseconds
Maximum Time: 137 milliseconds |
To close this issue, I am working on a PR to add opentelemetry capability to runwasi and its shims, which helped me to discover suprises and trace where we spend more time on. |
I'd like to understand where we spend time when starting a container; I filed #40 since I assume we spend a non-trivial amount of time when starting, but I'm not sure if there is anywhere else we could look.
@jsturtevant — is the
RUST_LOG
implementation in therust-extensions
crate helping us with tracing here?The text was updated successfully, but these errors were encountered: