Change to streaming out the heap snapshot data #127

Merged
@d-netto merged 6 commits into v1.9.2+RAI from jfang-heapsnapshot-streaming-back-port
Feb 1, 2024

@JianFangAtRai JianFangAtRai commented Jan 11, 2024

  • Streaming the heap snapshot!

This should prevent the engine from OOMing while recording the snapshot!

Now we just need to sample the files, either online, before downloading, or offline after downloading :)

If we're gonna do it offline, we'll want to gzip the files before downloading them.

  • Allow custom filename; use original API

  • Support legacy heap snapshot interface. Add reassembly function.

  • Add tests

  • Apply suggestions from code review

  • Update src/gc-heap-snapshot.cpp

  • Change to always save the parts in the same directory

This way you can always recover from an OOM

  • Fix bug in reassembler: from_node and to_node were in the wrong order

  • Fix correctness mistake: The edges have to be reordered according to the node order. That's the whole reason this is tricky.

But I'm not sure now whether the SoAs approach is actually an optimization. It seems like we should probably prefer to inline the Edges right into the vector, rather than having to do another random lookup into the edges table?

  • Debugging messed up edge array idxs

  • Disable log message

  • Write the .nodes and .edges as binary data

  • Remove unnecessary logging

  • fix merge issues

  • attempt to add back the orphan node checking logic
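
For readers following the commit list above, here is a minimal C++ sketch of the two ideas it describes: writing edges as binary records the moment they are discovered, and regrouping them by source node during reassembly. It is an illustration under stated assumptions, not the code from this PR. The `EdgeRecord` layout, the field widths, and every function name are hypothetical. The sketch also assumes the target snapshot format expects each node's outgoing edges to appear contiguously, in node order, which is why streamed edges (emitted in discovery order) need reordering.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical fixed-width edge record; the real on-disk layout may differ.
struct EdgeRecord {
    uint8_t  type;        // edge kind
    uint64_t name_or_idx; // string-table index or element index
    uint64_t from_node;   // index of the source node
    uint64_t to_node;     // index of the target node
};

// Write one edge field by field as soon as it is discovered, so the
// snapshot never has to hold the full edge list in memory.
void write_edge(std::ostream &out, const EdgeRecord &e) {
    out.write(reinterpret_cast<const char *>(&e.type), sizeof e.type);
    out.write(reinterpret_cast<const char *>(&e.name_or_idx), sizeof e.name_or_idx);
    out.write(reinterpret_cast<const char *>(&e.from_node), sizeof e.from_node);
    out.write(reinterpret_cast<const char *>(&e.to_node), sizeof e.to_node);
}

// Read one edge back; returns false at end of stream or on a truncated record.
bool read_edge(std::istream &in, EdgeRecord &e) {
    in.read(reinterpret_cast<char *>(&e.type), sizeof e.type);
    in.read(reinterpret_cast<char *>(&e.name_or_idx), sizeof e.name_or_idx);
    in.read(reinterpret_cast<char *>(&e.from_node), sizeof e.from_node);
    in.read(reinterpret_cast<char *>(&e.to_node), sizeof e.to_node);
    return static_cast<bool>(in);
}

struct Reassembled {
    std::vector<uint64_t> edge_counts; // outgoing edge count per node
    std::vector<EdgeRecord> edges;     // grouped by from_node, in node order
};

// Stable counting sort of the streamed edges by from_node.
Reassembled group_edges_by_source(std::size_t node_count,
                                  const std::vector<EdgeRecord> &streamed) {
    Reassembled r;
    r.edge_counts.assign(node_count, 0);
    for (const EdgeRecord &e : streamed) {
        assert(e.from_node < node_count && e.to_node < node_count);
        ++r.edge_counts[e.from_node];
    }
    // Prefix sums give the first output slot for each node's edges.
    std::vector<std::size_t> next(node_count, 0);
    std::size_t offset = 0;
    for (std::size_t i = 0; i < node_count; ++i) {
        next[i] = offset;
        offset += static_cast<std::size_t>(r.edge_counts[i]);
    }
    r.edges.resize(streamed.size());
    for (const EdgeRecord &e : streamed)
        r.edges[next[e.from_node]++] = e;
    return r;
}

// Illustration only: stream edges to an in-memory buffer and read them back.
std::vector<EdgeRecord> roundtrip(const std::vector<EdgeRecord> &edges) {
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    for (const EdgeRecord &e : edges)
        write_edge(buf, e);
    std::vector<EdgeRecord> out;
    EdgeRecord e;
    while (read_edge(buf, e))
        out.push_back(e);
    return out;
}
```

The counting sort keeps the regrouping linear in the number of edges and preserves discovery order within each node. Swapping `from_node` and `to_node` in the grouping step would attach every edge to the wrong node, which is the kind of bug the reassembler fix in the list above refers to.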


PR Description

What does this PR do?

Checklist

Requirements for merging:

  • I have opened an issue or PR upstream on JuliaLang/julia: <link to JuliaLang/julia>
  • I have removed the port-to-* labels that don't apply.
  • I have opened a PR on raicode to test these changes:

---------

Co-authored-by: Nathan Daly <[email protected]>
Co-authored-by: Nathan Daly <[email protected]>
@JianFangAtRai marked this pull request as draft on Jan 11, 2024, 04:15
@github-actions bot added the labels port-to-v1.10 (This change should apply to Julia v1.10 builds), port-to-master (This change should apply to all future Julia builds), and port-to-v1.9 (This change should apply to Julia v1.9 builds) on Jan 11, 2024
@JianFangAtRai self-assigned this on Jan 11, 2024
@JianFangAtRai removed the port-to-v1.10 and port-to-master labels on Jan 11, 2024
@JianFangAtRai (Author) commented

Requirements for merging:

  • I have opened an issue or PR upstream on JuliaLang/julia: JuliaLang#52854
  • I have removed the port-to-* labels that don't apply. Kept the label port-to-v1.9.
  • I have opened a PR on raicode to test these changes: https://github.com/RelationalAI/raicode/pull/17226

remove unused k_node_number_of_fields from gc-heap-snapshot.cpp

attempt to resolve the savepoint issue on serialize_node
@NHDaly added the port-to-v1.10 and port-to-master labels on Jan 16, 2024

@NHDaly (Member) commented Jan 16, 2024

I think this actually does need to be ported to 1.10, since we'll want this in our 1.10 branch as well. Thanks for commenting on what you changed; that was helpful.
EDIT: And it doesn't need the port-to-v1.9 label, since it's already targeting 1.9.

@NHDaly removed the port-to-v1.9 label on Jan 16, 2024
@JianFangAtRai marked this pull request as ready for review on January 17, 2024, 18:18
@JianFangAtRai changed the title from "Change to streaming out the heap snapshot data (#1)" to "Change to streaming out the heap snapshot data" on Jan 18, 2024
@d-netto (Member) left a comment

LGTM.

@d-netto merged commit ba5345f into v1.9.2+RAI on Feb 1, 2024
1 check passed
@d-netto deleted the jfang-heapsnapshot-streaming-back-port branch on February 1, 2024, 15:17
@Drvi removed the port-to-v1.10 label on Feb 2, 2024
Labels
port-to-master This change should apply to all future Julia builds