Kiss-ICP with a mem limit is permanently GC-ing because _something_ is hogging memory #5685

Closed
teh-cmc opened this issue Mar 26, 2024 · 2 comments · Fixed by #6104
Labels
🪳 bug Something isn't working examples Issues relating to the Rerun examples 🚀 performance Optimization, memory use, etc

Comments


teh-cmc commented Mar 26, 2024

Via @02alexander:
image

which basically means that we're GC-ing like crazy in hopes of reducing memory, except the thing that's actually hogging memory is not part of the GC path.

Hence the question is: what is the something in question? 👁️

@teh-cmc teh-cmc added 🪳 bug Something isn't working 🚀 performance Optimization, memory use, etc examples Issues relating to the Rerun examples labels Mar 26, 2024

teh-cmc commented Mar 26, 2024

@teh-cmc teh-cmc self-assigned this Mar 26, 2024

teh-cmc commented Mar 27, 2024

TL;DR: Fixed by removing instance keys:


So, here's what's happening.

The Kiss-ICP pipeline logs a new, bigger point-cloud as its /world/map every frame.
(I assume it does so because it either can't or doesn't want to rely on visible history for whatever reason.)

/world/map number of instances over time:
image

Because that number is different every frame, we need to generate a new instance key batch matching that number of instances, every frame.
And because that number is quite large, and only getting larger and larger, that's a lot of data.
And because autogenerated keys are cached and unaffected by GC, that's a lot of data that hangs around permanently!

That's what you see on the left side of this memgraph:
image
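To put rough numbers on that, here is a toy model of a key cache that is never evicted. It is only a sketch, not the viewer's actual code: `AutogeneratedKeyCache`, `simulate`, and the 8-bytes-per-key figure are all assumptions made for illustration.

```python
# Toy model (not Rerun's actual implementation): a cache of autogenerated
# instance keys, keyed by batch length and never evicted.
KEY_SIZE_BYTES = 8  # assumption: one 64-bit integer per instance key


class AutogeneratedKeyCache:
    def __init__(self) -> None:
        self._batches: dict[int, range] = {}

    def get(self, num_instances: int) -> range:
        # Every batch length seen for the first time adds a new cache entry.
        if num_instances not in self._batches:
            self._batches[num_instances] = range(num_instances)
        return self._batches[num_instances]

    def modeled_bytes(self) -> int:
        # What the entries would cost if each were a dense array of keys.
        return sum(n * KEY_SIZE_BYTES for n in self._batches)


def simulate(points_per_frame: list[int]) -> int:
    """Feed one batch length per frame, return the modeled cache size in bytes."""
    cache = AutogeneratedKeyCache()
    for n in points_per_frame:
        cache.get(n)
    return cache.modeled_bytes()


# A map that grows by 10_000 points per frame, for 100 frames.
growing = [10_000 * (i + 1) for i in range(100)]
# A map that stays at 1_000_000 points for 100 frames.
constant = [1_000_000] * 100
```

In this model the growing map ends up pinning 404 MB of keys after 100 frames, while the constant-size map pins 8 MB, because the latter reuses a single cache entry.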

Now, because that data hangs around indefinitely and is by itself larger than the GC threshold, the GC will now trigger every frame:
image

Because the GC triggers every frame, the whole viewer slows down, which means ingestion slows down, which means raw arrow data starts to accumulate in the different layers of buffers/channels/etc.
That's why the memory permanently grows at a very alarming rate, even though the GC is running like crazy.

That's what you see on the other side of this memgraph:
image
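The "GC triggers every frame" part can be sketched with a toy loop. This assumes a simple trigger of the form "total memory above the limit", which may not match the viewer's real heuristics, and `run_frames` and its parameters are hypothetical names. It also does not model the slowdown or the backlog of raw arrow data, only the trigger itself.

```python
def run_frames(num_frames: int, limit: int, pinned: int, ingest_per_frame: int):
    """Toy model: return (number of GC passes, final total bytes).

    `pinned` stands for memory the GC cannot reclaim (e.g. cached keys),
    everything ingested per frame is assumed to be collectable.
    """
    collectable = 0
    gc_passes = 0
    for _ in range(num_frames):
        collectable += ingest_per_frame
        if pinned + collectable > limit:
            gc_passes += 1
            # The GC can only drop collectable data, never the pinned part.
            excess = pinned + collectable - limit
            collectable -= min(collectable, excess)
    return gc_passes, pinned + collectable


# Pinned memory well below the limit: the GC never needs to run.
healthy = run_frames(num_frames=10, limit=1000, pinned=100, ingest_per_frame=50)
# Pinned memory above the limit: the GC runs every frame and still cannot get under it.
starved = run_frames(num_frames=10, limit=1000, pinned=1200, ingest_per_frame=50)
```

In the second case the GC throws away everything it is allowed to on every single frame, and the total still sits above the limit.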


So, one solution to this would be to modify Kiss-ICP to log its own instance keys, so that they would get GC'd as needed.
But that's not possible since we don't expose instance keys to the SDK anymore.

Another solution would be to make instance-key generation smarter by sharing their backing buffers etc.
But that might or might not be tricky in and of itself, and then we have those address-based performance hacks around instance keys to take care of... and more importantly we're trying to get rid of instance keys to begin with!
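For reference, the buffer-sharing idea could look roughly like the sketch below: keep a single key buffer sized to the largest batch seen so far, and hand out prefix views of it. `SharedKeyBuffer` is a hypothetical name and this says nothing about how it would interact with the address-based hacks mentioned above.

```python
from array import array


class SharedKeyBuffer:
    """Toy sketch: one backing buffer of keys 0..N, shared by all batch lengths <= N."""

    def __init__(self) -> None:
        self._keys = array("Q")  # unsigned 64-bit integers

    def get(self, num_instances: int) -> memoryview:
        if num_instances > len(self._keys):
            # Allocate a new buffer instead of extending in place: an array
            # cannot be resized while memoryviews of it are still alive.
            self._keys = array("Q", range(num_instances))
        # A prefix view, no copy of the keys is made.
        return memoryview(self._keys)[:num_instances]

    def retained_bytes(self) -> int:
        # Only the current (largest) buffer is retained by the cache itself.
        return len(self._keys) * self._keys.itemsize


buf = SharedKeyBuffer()
small = buf.get(3)
large = buf.get(5)
again = buf.get(2)  # served from the 5-element buffer, no new allocation
```

With this layout the cache holds memory proportional to the largest batch only, rather than to the sum of all batch lengths ever seen.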

So the last option is simply to wait for 0.16 to land, which should get rid of (autogenerated) instance keys entirely:
