Allow remapping source path prefixes in debug output #38322
Comments
Thanks for writing up the issue, @jmesmon! @rust-lang/tools: Can any of you think of a reason not to add something like this as a |
Sounds good to me! |
A Because of this, we recently submitted some patches to GCC to also support the same behaviour as Actually, I didn't know about |
I implemented this and it seems to work. I tested by building (reproducible-build test itself only tests stable symbol naming, not bit-for-bit output. In the past Rust could produce different symbol names between runs(!): see #30330.) |
Unclear if it's important for rust, but in gcc/clang multiple mappings are supported. This ends up being important in C/C++ due to Also, it could be a good idea to avoid the splitting on |
How do multiple mappings work? Are they applied in command line order? Then is the order significant? I think it must be, since |
In gcc, the handling of multiple debug_prefix_maps is to search the mappings last-on-cmdline to first-on-cmdline, applying the first prefix that matches. The last-to-first strategy is common in gcc (and other command line tools) as it is intended that later options are able to override earlier ones. |
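The last-to-first search described above can be sketched as follows. This is only an illustration of the lookup order, not gcc's or rustc's actual code; the function name `remap_last_wins` and the plain string-prefix matching are assumptions made for the example.

```rust
/// Hypothetical helper: `maps` holds (from, to) pairs in command-line order.
/// The search runs from the last pair to the first, so later options
/// override earlier ones, and the first matching prefix wins.
fn remap_last_wins(path: &str, maps: &[(&str, &str)]) -> String {
    for &(from, to) in maps.iter().rev() {
        if let Some(rest) = path.strip_prefix(from) {
            // Replace the matched prefix and keep the remainder untouched.
            return format!("{to}{rest}");
        }
    }
    // No mapping matched: the path is returned unchanged.
    path.to_string()
}
```

With the mappings `/build` and `/build/vendor` given in that order, a file under `/build/vendor` picks up the second mapping; if the two options are swapped, the broader `/build` mapping is found first and shadows the narrower one, which is why the order is significant.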
My GCC patches linked above include modifying the existing GCC behaviour to split on the final = instead of the initial one. I think that is better as well: I can imagine someone wanting to map a path that contains a =, but it is less likely that someone would want to map something to such a path. (edit: previously mentioned a space character, that was for some other thing that I got confused with) |
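As a small sketch of the difference between splitting on the final = and on the initial one (the function name `parse_mapping` is hypothetical and this is not the parser of any real compiler):

```rust
/// Hypothetical parser for a single `old=new` argument.
/// Splitting on the *final* '=' lets the `old` side contain '=' characters;
/// the `new` side then cannot contain one.
fn parse_mapping(arg: &str) -> Option<(&str, &str)> {
    arg.rsplit_once('=')
}

/// The same parser splitting on the *initial* '=' instead, for comparison.
fn parse_mapping_first(arg: &str) -> Option<(&str, &str)> {
    arg.split_once('=')
}
```

For the argument `/build/a=b=/usr/src`, the first function yields `/build/a=b` mapped to `/usr/src`, while the second yields `/build/a` mapped to `b=/usr/src`.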
Thanks for the answers! I will implement multiple mappings, last-to-first order, and splitting on the final = now. |
I'd recommend avoiding splitting on It would be a really good idea to keep all of the escaping/special characters in callers of rustc (shells, etc) just to avoid funny limitations like this ( |
How about requiring the mapping information to be provided in a file (similar to ld's |
I'd prefer to avoid needing (temporary?) external files to configure this feature. I say temporary because debug src mappings aren't like a target specification, which is a fixed, predetermined value for all platforms: they depend on the build directory and on where the source is being mapped to, which in the debug source packaging case is typically a path under And one would need to know the escaping in that file format, so it doesn't simplify things wrt allowing arbitrary paths, it just moves them somewhere else. |
So, it seems that discussion on this issue has stalled (here and over in #38348) for two reasons:
I want to move forward with this so I propose the following solutions:
The formula that produces these results is:

    fn debuginfo_path(p: Path, map: [(Path, Path)]) -> Path {
        let p = normalize(make_absolute(p))
        for (from, to) in map {
            // Exit on the *first* match, order determined by commandline option order
            // UPDATE option order is last to first, i.e. later CLI options overrule earlier ones
            if p.starts_with(from) {
                return p.replace_prefix(from, to)
            }
        }
        // No remapping done, but still normalized and absolute now
        p
    }

Note that whether paths are later stored as relative to their

Thoughts? @jmesmon @infinity0 @sanxiyn @jsgf @rust-lang/tools |
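For readers who want to experiment, the pseudocode can be turned into runnable Rust roughly as follows. This is a sketch under assumptions, not what rustc ended up implementing: the working directory is passed in explicitly as `cwd`, `normalize` is a hypothetical purely lexical helper (it does not consult the file system, so it can be wrong in the presence of symlinks), and the map is searched last-to-first as per the UPDATE comment.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical lexical normalization: drops `.` and resolves `..`
/// without touching the file system.
fn normalize(p: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for c in p.components() {
        match c {
            Component::CurDir => {}
            Component::ParentDir => {
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// `map` is in command-line order; it is searched last-to-first.
fn debuginfo_path(p: &Path, cwd: &Path, map: &[(PathBuf, PathBuf)]) -> PathBuf {
    // `join` leaves `p` as-is when it is already absolute.
    let p = normalize(&cwd.join(p));
    for (from, to) in map.iter().rev() {
        // `strip_prefix` matches whole path components only.
        if let Ok(rest) = p.strip_prefix(from) {
            return to.join(rest);
        }
    }
    // No remapping done, but still normalized and absolute now.
    p
}
```

Because `Path::strip_prefix` compares whole components, a mapping from `/build` does not apply to `/buildx/a.rs` in this sketch.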
I don't see an issue with that so long as the "commandline option order" is defined as last-to-first. Doing this bit differently from other command line utilities doesn't buy us anything (unlike having 2 separate args for from & to). It also looks like the examples decide to fix the mapping at directory name level, but that doesn't appear in the English description. I don't have anything in mind that would break that, but given that allowing partial matches would let one append '/' to the end to get full-path-element matching, I'm not sure such a restriction is a good idea. |
If there's precedent for that I'm fine with going last-to-first. |
I think it's just simpler to use. You don't have to worry if you need to append a |
This is the order used by gcc & clang for all of their "more than 1 & pick 1" options (ignoring special cases): debug-maps (as discussed earlier in this thread), optimization levels, debug info levels, include directories, etc. |
Sure, it's simpler. The issue is that it's also less flexible, and it's trivial to get the match-full-paths behavior from match-anything, but going the other direction is impossible. Again, I don't have a use case that would want partial matches, but the ease of supporting both should be considered. |
I updated the description above to reflect this. |
Yes, I know that string-based matching is more powerful. My argument is that it gives you more subtle ways to get it wrong without a clear benefit. Is it |
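The trade-off between string-based and component-based matching discussed here can be seen in a short sketch. The two function names are hypothetical; the example only contrasts `str::starts_with` with `Path::starts_with` from the Rust standard library.

```rust
use std::path::Path;

/// String-based matching: any textual prefix counts as a match.
fn string_prefix_matches(path: &str, from: &str) -> bool {
    path.starts_with(from)
}

/// Component-based matching: only whole path components count.
fn component_prefix_matches(path: &str, from: &str) -> bool {
    Path::new(path).starts_with(from)
}
```

With `from` set to `/home/user`, the string-based variant also matches `/home/username/a.rs`, which is the subtle mistake mentioned above; appending a trailing `/` to `from` avoids it. The component-based variant rejects `/home/username/a.rs` either way, at the cost of not supporting partial-name prefixes at all.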
Several points: I think splitting the option into two is very unpretty, and somewhat ambiguous - if they're just related by being adjacent, it seems like it raises a lot of questions:
In particular it means that any tool that's parsing/processing the commandline needs to know about this special case in order to avoid breaking it. Normalizing paths by eliminating Edit: I can't think of a problem with eliminating Proposal: Retain the |
I think @michaelwoerister's suggestions are cleanest; tools that want to add this value to a rustc flag probably don't want to look inside the value to search it and then select a separator. I agree that matching only full path components is best and less prone to error. I am a little concerned that GCC differs a bit from this, but the rustc code example is indeed very simple and it might be possible to ask GCC to adopt a similar approach - I have to send them a patch anyways, I may add this as well. Even if they don't adopt it, we (in the interests of standardising this behaviour) could define a standard that defines a "minimal" mapping behaviour but leave it open saying "the tool might perform further additional mappings". I am less sure about normalisation, because it has the potential to mess with the semantics of various fields. For example this:
what would you map There are two cases:
(edit: "other mappings" -> "further additional mappings") |
I'd imagine that lto in gcc/clang could mirror (the path mapping issues with) rust's function inlining as in the lto case the functions to be inlined (if the compiler chooses to do inlining) would be from another translation unit. |
I'm still digesting the above, but it looks good so far. Wishlist request: an option to make paths/filenames in compiler messages the remapped version, rather than the input version. |
Triage: there's been a ton of discussion on this issue, but I have no idea what the current state of it is. Can anyone summarize? |
#41555 was the tracking issue and it's closed now. We are using --remap-path-prefix in Debian and it works to make ripgrep reproducible. rustc itself is still not reproducible however, and I don't know why. That is tracked in #34902. There have been quite a few regressions. |
Yes, it's available on stable so I'm closing this issue. |
Hello, We in Chromium are working on integrating Rust into our distributed build system. We really appreciate the work done here, as reproducible builds are a requirement for such a plan. However, while this allows the binary output of rustc to be reproducible, we have noticed that the (fully resolved) command-line itself ends up not being so. This destroys our ability to cache build objects between bots and/or developers, as any change in the command-line requires a recompilation (such as changing optimization flags). Clang has resolved this by providing a This problem is discussed, in the context of Clang, in this blog post: https://blog.llvm.org/2019/11/deterministic-builds-with-clang-and-lld.html We would like to propose a similar flag for rustc. For simplicity we'd propose similar naming and behaviour as Clang's, in order to reuse previous work.
In terms of ordering and precedence, it can follow the exact same rules as multiple --remap-path-prefix arguments. Would it be best to discuss this here, or to open another issue? Thanks for looking. |
@danakj I think it's worth filing a new issue for this case. Does Or is it that your build system is taking the specific command-line into account and that's affecting the input side of your reproducibility? |
The build output is not affected by different |
Sure, but that's a specific property of how your build system computes cache keys right? Edit: For example if you wrap rustc in a shell script and put the |
…haelwoerister Introduce -Z remap-cwd-prefix switch This switch remaps any absolute paths rooted under the current working directory to a new value. This includes remapping the debug info in `DW_AT_comp_dir` and `DW_AT_decl_file`. Importantly, this flag does not require passing the current working directory to the compiler, such that the command line can be run on any machine (with the same input files) and produce the same results. This is critical property for debugging compiler issues that crop up on remote machines. This is based on adetaylor's rust-lang@dbc4ae7 Major Change Proposal: rust-lang/compiler-team#450 Discussed on rust-lang#38322. Would resolve issue rust-lang#87325.
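The idea behind the switch can be illustrated with a minimal sketch. This is not rustc's implementation; the function name `remap_cwd_prefix` and its signature are hypothetical. The point is that the working directory comes from the process environment rather than from the command line, so the command line itself stays identical across machines.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical sketch: remap a path rooted under `cwd` to `new_prefix`.
/// In a real tool `cwd` would come from something like
/// `std::env::current_dir()`, not from a command-line argument.
fn remap_cwd_prefix(path: &Path, cwd: &Path, new_prefix: &Path) -> PathBuf {
    match path.strip_prefix(cwd) {
        // Path is under the working directory: swap the prefix.
        Ok(rest) => new_prefix.join(rest),
        // Path is elsewhere: leave it untouched.
        Err(_) => path.to_path_buf(),
    }
}
```

Two developers building in `/home/a/proj` and `/home/b/proj` with the same new prefix would then get the same remapped paths from the same command line.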
In gcc & clang, the flag `-fdebug-prefix-map=old=new` allows changing the prefix of source files referred to in debug information.
Related to #34902, this allows one to avoid having the particular source directory that a file was built in affect the output object/executable/library contents.
It also allows source-level debugging to work in cases where the source code is installed by the debug packages to a location that differs from where it was built (debian & OpenEmbedded, at least, take advantage of this).
As an alternative, the program `debugedit` allows modifying the files after generation to adjust the paths (as far as I can tell, Fedora uses this, and potentially other rpm-based distros).