Fixed memory leak #2
Conversation
71bf7e7 to 3407cc0
see #3
This is fine as an urgent customer fix, but it strikes me as pretty complex for the problem it's solving.
You're basically writing your own 2-level pool implementation because sync.Pool doesn't give you enough control over max size and lifecycle to implement this behavior. This would be more naturally expressed as a pool of pools, rather than slices and maps. And it might be faster as well.
I only briefly researched these alternatives, but you should check them out and see if they can be used:
https://github.com/jolestar/go-commons-pool
Another option would be to use a linked list instead of slices, which would help you avoid copying the slices. Not sure if that's a major part of the performance diff or not, but worth considering.
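For what it's worth, here is a rough sketch of the "pool of pools" shape with the runtimes kept on a container/list, so retiring the oldest runtime is an unlink rather than a slice copy. All of the names below are hypothetical and this is only a sketch of the idea, not a drop-in replacement for the PR's implementation:

```go
package wasmpool

import (
	"container/list"
	"context"
	"sync"

	"github.com/tetratelabs/wazero"
	"github.com/tetratelabs/wazero/api"
)

// runtimePool is one inner pool: a runtime plus the idle modules it owns.
type runtimePool struct {
	runtime wazero.Runtime
	idle    []api.Module
}

// poolOfPools keeps the inner pools on a linked list; the oldest runtime can
// be retired by unlinking the front element instead of copying a slice.
type poolOfPools struct {
	mu    sync.Mutex
	pools *list.List // elements are *runtimePool, newest at the back
}

func newPoolOfPools(ctx context.Context) *poolOfPools {
	p := &poolOfPools{pools: list.New()}
	p.pools.PushBack(&runtimePool{runtime: wazero.NewRuntime(ctx)})
	return p
}

// retireOldest closes the idle modules and runtime at the front of the list,
// then removes that element in O(1).
func (p *poolOfPools) retireOldest(ctx context.Context) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if front := p.pools.Front(); front != nil {
		rp := front.Value.(*runtimePool)
		for _, m := range rp.idle {
			_ = m.Close(ctx)
		}
		_ = rp.runtime.Close(ctx)
		p.pools.Remove(front)
	}
}
```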
	runtime.SetFinalizer(module, nil)
}
// Grab the runtime ID and remove the module from the tracking map
ptr := reflect.ValueOf(module).Pointer()
If this is slow, it might make sense to get it a single time at creation, and for the pool to vend an object which includes both the api.Module and its uintptr.
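A minimal sketch of that suggestion, assuming the tracking map stays keyed by the module's pointer. The names below are hypothetical, not identifiers from this PR:

```go
package wasmpool

import (
	"reflect"

	"github.com/tetratelabs/wazero/api"
)

// pooledModule bundles the module with its pointer, captured once at
// creation, so later lookups into the tracking map don't need reflect again.
type pooledModule struct {
	mod api.Module
	ptr uintptr
}

func newPooledModule(mod api.Module) pooledModule {
	return pooledModule{
		mod: mod,
		ptr: reflect.ValueOf(mod).Pointer(), // computed a single time
	}
}
```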
This isn't the slow portion. It's wazero.NewRuntime(ctx) that's slow (like half a second on my machine). That's why we can't just create a new one on every request. Second to that is InstantiateModule. Everything else is essentially instant in comparison. The slowdown is due to having to periodically create a new runtime, not really anything else.
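For anyone who wants to reproduce those timings on their own machine, a tiny hypothetical harness along these lines would do it (it only times NewRuntime and Close, since timing InstantiateModule needs a compiled module and config on hand; the half-second figure above is an observation from this thread, not a guarantee):

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/tetratelabs/wazero"
)

func main() {
	ctx := context.Background()

	// Time runtime creation, the cost this thread identifies as dominant.
	start := time.Now()
	r := wazero.NewRuntime(ctx)
	fmt.Println("NewRuntime:", time.Since(start))

	// Time closing the runtime, which is what releases its memory.
	start = time.Now()
	_ = r.Close(ctx)
	fmt.Println("Close:", time.Since(start))
}
```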
I may not have been clear or detailed enough in the original PR message, so I'll restate the problem that I observed.
Any solution that does not involve the removal of the wazero.Runtime will not address the leak. The issue regarding performance degradation is strictly related to calls to wazero.NewRuntime(ctx).

I'll also mention that, by definition of a leak, I'm mainly referring to an out-of-memory error with runaway memory usage. This was the only solution that I tried that did not result in an eventual out-of-memory error (with it running for over 3 hours, while the other options never lasted more than 10 minutes). We can absolutely use another solution that is simpler, but it also needs to handle the memory leak (and/or runaway memory usage resulting in an out-of-memory error).
This resolves a runaway memory leak. In a nutshell, wazero.Runtime is holding on to memory that is being allocated through api.Module. Even when the modules are closed properly, the runtime still accumulates the memory. In truth, I think there may be a bug in the glue code between Go and WASM; however, I wasn't able to track it down, and the result is that wazero.Runtime accumulates memory.

Closing and removing the reference to wazero.Runtime does free the memory; however, instantiation of wazero.Runtime is relatively slow, and it would not be feasible to create a runtime every time we need to use a regex.

Instead, I've created a custom pool structure that pools modules with a runtime and puts a cap on the number of times a module may be fetched from a runtime. Once that cap has been reached, the pool waits for all outstanding modules to be returned for that runtime (whether through Put or GC collection) before closing the runtime and removing it from the pool. This puts a soft cap on the amount of memory that a runtime can generate. With this in place, we'll still see spikes, since when the GC decides to collect the runtimes isn't deterministic, but it at least prevents runaway memory usage.

Using a multithreaded application to spam the server with queries, I was able to run out of memory in < 3 minutes on the main branch. With this pool in place, the memory usage stabilized, and I ran it for 3 hours before deciding that it was good enough. So while this doesn't fix the underlying problem, it at least fixes the memory usage error. Performance-wise, there's about a 20-25% reduction in query speed for regexes, using the multithreaded application.
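For reviewers who want the shape of the pool without reading the diff, here is a condensed, hypothetical sketch of the approach described above. The Instantiate callback, the Lease type, and the vend cap are illustrative names rather than the exact identifiers used in this PR, and the finalizer-based return path is omitted:

```go
package wasmpool

import (
	"context"
	"sync"

	"github.com/tetratelabs/wazero"
	"github.com/tetratelabs/wazero/api"
)

// Instantiate is supplied by the caller and builds a module against the
// given runtime (e.g. by calling its InstantiateModule on a compiled module).
type Instantiate func(ctx context.Context, r wazero.Runtime) (api.Module, error)

// runtimeEntry couples a runtime with how many modules it has vended and how
// many are still checked out.
type runtimeEntry struct {
	runtime     wazero.Runtime
	vended      int
	outstanding int
}

// Lease bundles a vended module with the runtime it came from, so Put can
// credit the return to the right runtime.
type Lease struct {
	Module api.Module
	entry  *runtimeEntry
}

// Pool caps how many modules a single runtime may vend; once the cap is hit
// and every outstanding module has been returned, the runtime is closed,
// which is what actually releases the accumulated memory.
type Pool struct {
	mu          sync.Mutex
	instantiate Instantiate
	maxVends    int
	current     *runtimeEntry
	retiring    []*runtimeEntry
}

func New(instantiate Instantiate, maxVends int) *Pool {
	return &Pool{instantiate: instantiate, maxVends: maxVends}
}

// Get vends a module from the current runtime, rotating to a fresh runtime
// once the vend cap has been reached.
func (p *Pool) Get(ctx context.Context) (*Lease, error) {
	p.mu.Lock()
	defer p.mu.Unlock()

	if p.current == nil || p.current.vended >= p.maxVends {
		if p.current != nil {
			p.retiring = append(p.retiring, p.current)
		}
		p.current = &runtimeEntry{runtime: wazero.NewRuntime(ctx)}
	}

	mod, err := p.instantiate(ctx, p.current.runtime)
	if err != nil {
		return nil, err
	}
	p.current.vended++
	p.current.outstanding++
	return &Lease{Module: mod, entry: p.current}, nil
}

// Put returns a module; a retired runtime with nothing outstanding is closed
// and dropped, putting a soft cap on the memory any one runtime can hold.
func (p *Pool) Put(ctx context.Context, l *Lease) {
	_ = l.Module.Close(ctx)

	p.mu.Lock()
	defer p.mu.Unlock()
	l.entry.outstanding--

	keep := p.retiring[:0]
	for _, e := range p.retiring {
		if e.outstanding == 0 {
			_ = e.runtime.Close(ctx)
			continue
		}
		keep = append(keep, e)
	}
	p.retiring = keep
}
```

The actual change also registers a runtime.SetFinalizer so modules reclaimed by the GC count as returned (the "Put or GC collection" case above); that path is left out of this sketch to keep it short.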