Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add a mechanism to avoid including unnecessary files in the trace #2529

Closed
wants to merge 1 commit into from

Conversation

Keno
Copy link
Member

@Keno Keno commented Apr 22, 2020

This is more a kick off point for discussion than a serious PR at this point,
but I figure it's always better to discuss with working code than in the abstract.

We're planning to start collecting large numbers of rr traces from failing CI jobs.
However, at the moment, these traces are a bit large for comfort. However, they
don't need to be that large, because a good fraction of their size comes from
binaries that we're already storing on our artifact server, and thus don't need
to store separately in the trace. This PR adds an --index-dir option to rr pack.
rr will index each such directory, cataloguing all contained files by content hash.
It will then symlink the index directory into the trace, rewrite the mmap entries
to point into this symlink and as usual delete any unreferenced files. The idea is
to send this directory over to the replay machine, rewrite the symlink to a directory
containing the same immutable binaries that were in the index directory and replay
the trace that way.

This works ok, and is probably what we'll do for the short term, but it's not a
satisfying long-term solution, since we don't record any information whatsoever
about what was in the index directory. I'm thinking a better solution would be
to have a custom external record type that we could use with a custom packer
to record exactly where on our artifact server the relevant binary can be found
(which is easy, since those binaries are already content-addressed). That said,
the effort to implement that seemed not-insubstantial, so this is the poor man's
version of that.

If the version in this PR is interesting, I'm happy to clean it up and get it in
properly, but as I said, I'd like to personally move away from it in the medium
term, so I don't need it merged.

This is more a kick off point for discussion than a serious PR at this point,
but I figure it's always better to discuss with working code than in the abstract.

We're planning to start collecting large numbers of rr traces from failing CI jobs.
However, at the moment, these traces are a bit large for comfort. However, they
don't need to be that large, because a good fraction of their size comes from
binaries that we're already storing on our artifact server, and thus don't need
to store separately in the trace. This PR adds an `--index-dir` option to `rr pack`.
rr will index each such directory, cataloguing all contained files by content hash.
It will then symlink the index directory into the trace, rewrite the mmap entries
to point into this symlink and as usual delete any unreferenced files. The idea is
to send this directory over to the replay machine, rewrite the symlink to a directory
containing the same immutable binaries that were in the index directory and replay
the trace that way.

This works ok, and is probably what we'll do for the short term, but it's not a
satisfying long-term solution, since we don't record any information whatsoever
about what was in the index directory. I'm thinking a better solution would be
to have a custom `external` record type that we could use with a custom packer
to record exactly where on our artifact server the relevant binary can be found
(which is easy, since those binaries are already content-addressed). That said,
the effort to implement that seemed not-insubstantial, so this is the poor man's
version of that.

If the version in this PR is intersting, I'm happy to clean it up and get it in
properly, but as I said, I'd like to personally move away from it in the medium
term, so I don't need it merged.
@rocallahan
Copy link
Collaborator

rocallahan commented Apr 23, 2020

It makes sense to have some kind of mechanism for this, preferably one that doesn't constrain the mechanics of the external storage.

But I wonder if this actually needs rr changes? What if:

  • You run rr pack as normal.
  • You have your own tool that goes through the rr trace directory, identifying files that you're storing externally and replacing those files with some kind of metadata identifying the external resource
  • You package the resulting trace dir and store it.

Then before replay:

  • You obtain the packaged trace dir.
  • You read the saved metadata, and follow its instructions to obtain files from external storage.
  • You replay with rr normally.

@rocallahan
Copy link
Collaborator

I.e. I don't see a reason to modify the rr trace data itself to record that some files are stored externally. That would only make sense if rr itself was going to retrieve those files before/during replay. But making rr do that just seems like unnecessary coupling.

@Keno
Copy link
Member Author

Keno commented Apr 23, 2020

Yes, I think that's about right, but I think it would be good to somehow signal to rr that this has been done such that it can error during replay if the custom decompression step hasn't been done. Perhaps the packer script should embed a human-readable error string in the trace for rr to emit if it tries to replay something that hasn't been unpacked yet.

@rocallahan
Copy link
Collaborator

Yeah that would make sense. Maybe we should just have a well-known file name that rr checks for; if that file exists, then rr replay prints its contents and exits.

@Keno
Copy link
Member Author

Keno commented Apr 23, 2020

That seems like a good solution.

@GitMensch
Copy link
Contributor

Seems like this PR was found to be "not the way to go". I suggest to close it and possibly have an issue or PR for "let rr look for the magic file and print its output", if that is still relevant.

@rocallahan
Copy link
Collaborator

Yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants