evaluator: parsing antler CUE configs can exhaust system memory #3452
Comments
Is it possible to reproduce this slowness via the …
Yes, although it takes a few steps:
When running …
I upgraded my box to 64 GB RAM and did some testing with CUE v0.5.0 and CUE v0.11.0. This is the resident memory reported after the config is completely parsed:

CUE v0.5.0: 9.6 GB

I appear to be stuck on v0.5.0. Or, are there any other experimental flags I can try?
That's it for now. We still have some performance and memory usage work to be done on evalv3, so that's still our focus for issues like this one.
Just adding to this that I reduced CPU and memory usage considerably by removing all the disjunctions I was using in my config schema.

CUE v0.5, with disjunctions:

CUE v0.5, without disjunctions:

CUE v0.11, without disjunctions, without CUE_EXPERIMENT=evalv3:

CUE v0.11, without disjunctions, with CUE_EXPERIMENT=evalv3 (however, this got a "field not allowed" error on a line number that doesn't make sense yet, so I'll try to sort that out later):
At least this shows that in my case, the way I'm using disjunctions (to enforce that only one field is set in a struct, where those structs are themselves used inside of recursive structs) accounts for a pretty big portion of the resource consumption. I can make a workaround for this, but I also look forward to things returning to a v0.5 level of performance, or better, one day. 🤞
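For concreteness, a minimal sketch of that kind of schema (hypothetical names, not the actual sce-tests definitions): a disjunction of closed definitions, so that only one shape can match, with the definitions recursing into themselves:

```cue
package sketch

// A node must match exactly one of the two closed shapes below.
// This is the disjunction-based "only one field set" pattern.
#Node: #Leaf | #Branch

#Leaf: {
	value: int // a leaf carries a single value
}

#Branch: {
	children: [...#Node] // recursion: each child is again a #Node
}

// A concrete value; the evaluator must resolve the disjunction for every
// node in the tree.
tree: #Node & {
	children: [
		{value: 1},
		{children: [{value: 2}, {value: 3}]},
	]
}
```

A workaround along these lines would be to collapse the disjunction into a single definition with optional fields, giving up the schema-level "exactly one field" guarantee in exchange for cheaper evaluation.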
What version of CUE are you using (`cue version`)?

Does this issue reproduce with the latest stable release?

Yes. It's the same or possibly worse in v0.10.0, with or without `CUE_EXPERIMENT=evalv3`.

What did you do?
I created a CUE package for Antler in this sce-tests repo. This is an Antler test config with 216 tests that uses both large lists (generated programmatically with Go templates) and CUE list comprehensions, which likely results in a large CUE graph.

The biggest culprit, it seems, is the Run list for my FCT tests. This creates a list of 1200 elements (using a Go template), which is used with a list comprehension to generate StreamClients. When Antler unifies the schema with the config using the CUE API, the process memory reported by `top` rises very quickly and can completely exhaust the system memory, depending on the hardware. If I comment out this list, parsing is still slow and uses a lot of system memory compared to what I'd hope for, but it's at least much faster. (A minimal sketch of the list-plus-comprehension pattern is below.)

To reproduce it, one can install Antler, pull the sce-tests repo, and run `antler vet` to parse the config. My hope is that this isn't necessary for you to do, and that just based on the description you can identify the category of performance problem referred to in the Performance umbrella issue, so I have a sense of if or when this may be improved.

Also, I might be able to work around this by avoiding large lists, but it's more flexible for users to be able to provide their own statistical distributions of wait times and flow lengths, and these lists can simply get long. On top of that, this project will eventually at least triple in size with more tests, so I'll have to solve this somehow, and am just looking for advice. Would this be any better in v0.11.0-alpha.1, or with any other config options?
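A minimal sketch of that pattern (hypothetical field names; a few literal elements stand in for the roughly 1200 template-generated ones):

```cue
package sketch

#StreamClient: {
	wait:   number // seconds to wait before starting the flow
	length: int    // flow length in bytes
}

// In the real config this list has around 1200 elements and is emitted by a
// Go template; three literal elements stand in for it here.
flows: [
	{wait: 0.01, length: 4096},
	{wait: 0.25, length: 65536},
	{wait: 1.5, length: 1048576},
]

// One StreamClient per generated flow; this comprehension is what the
// evaluator expands into a large graph when the schema is unified with the
// config.
Run: StreamClients: [for f in flows {
	#StreamClient & {wait: f.wait, length: f.length}
}]
```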
What did you expect to see?
The config to parse reasonably quickly.
What did you see instead?
Excessive memory allocations.
A Linux laptop with 8G of RAM and 8G of swap runs out of memory entirely when parsing the config.
Another box with 16G of RAM and 8G of swap is able to parse the config without running out of memory, but just barely.