-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable PdbGit (GitLink) on our PDB files #15256
Conversation
CC @Pilchie, @dotnet/roslyn-infrastructure |
$packagesPath = Get-PackagesPath | ||
$pdbGit = join-path $packagesPath "pdbgit\$pdbGitVersion\tools\PdbGit.exe" | ||
|
||
$signToolDataPath = join-path $sourcePath "build\config\SignToolData.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could a comment be added to explain why we're involving signing configuration here? I have guesses but I'd rather not be guessing and just know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also wondering if "SignToolData" is now a poor name if it's being used for more things too.
$binaryPath = join-Path $binariesPath $relativePath | ||
$pdbPath = [IO.Path]::ChangeExtension($binaryPath, ".pdb") | ||
if (-not (test-path $pdbPath)) { | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not fail instead? This means there's bad data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two cases where it's not an error:
- The binary is a VSIX
- The PDB is a portable PDB
The former is really the only concern for our repo right now though. Will account for VSIX and error on other cases.
} | ||
|
||
write-host "Running pdbgit on $pdbPath" | ||
& $pdbGit $pdbPath -u https://github.com/dotnet/roslyn --baseDir $sourcePath |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this URL supposed to be? Is it the git repository (which I would expect should have a .git to be more canonical) or does it have some smarts here?
(I know GitHub currently supports clones without the .git, but unless you can get a contract with them saying they'll never break it...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It needs to be a recognizable name for the git repo. The tool doesn't actually use this name directly but instead uses the name to determine the repo involved. It writes a different URL to the PDB. Generally in the form of https://raw.github.com/repo/project/<commit>/...
I based this argument off of the sample on the PdbGit README file.
PdbGit.exe path-to-your.pdb -u https://github.com/username/project --baseDir C:\root-of-your-project\ --commit your-commit-ID
We need to be explicit here because in the case of mulitple remotes the tool will pick one of them. When I ran locally it was actually embedding urls to my personal fork vs. the official once hence I made it an explicit argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, makes sense. I didn't know where the README was to check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explicitly specifying the URL to avoid forks sounds good. But you should only need the --baseDir
parameter if there is no .git folder. I think you can remove that from your script here.
@jasonmalinowski responded to PR feedback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates. Good, but yes, we need to do unit test binaries too. Otherwise I promise to send @CyrusNajmabadi in your direction next time he has to debug a unit test crash. :-)
$pdbGit = join-path $packagesPath "pdbgit\$pdbGitVersion\tools\PdbGit.exe" | ||
|
||
# Linking our PDBs to Git is really only interesting for PE / PDB that are | ||
# being shipped to customers. For local developement the original PDB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about unit tests though? Investigating dumps of crashing unit tests is common and you'd want that too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasonmalinowski can you elaborate on the scenario here? Essentially when do we get a crash dump for a unit test that we need to investigate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When a unit test fails in Jenkins or the signed build. That's how we figured out the flaky access violation issue just last week.
...which I now realize means we need to run this in Jenkins too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to consider the cost of this tool. PdbGit is not a fast tool, instead it's comparitevyl slow with other tools used during build. The performance seems to scale with the size of the binary involved. This means adding this to our unit test runs will add quite a bit of time to our unit tests runs.
To give a sense of how much consider that we are linking ~60 PDBs and it takes 15 minutes to do so. If we implement source linking perfectly on our source tree that will be ~200 PDBs and hence easily adding about 1 hour to our unit test time.
By perfectly I mean that we link a PDB exactly once. Given our build setup PDBs will appear multiple times in the output tree (due to build output copying). After the first link we need to copy the PDB vs. linking again as copying is faster. That's certainly doable but takes a bit more effort.
I'm currently digging into the performance a bit to see if there are any wins we could get. At this time though I don't think it's viable for unit test legs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow. 15 minutes for 60 PDBs is awful. When I wrote (some of) the PdbGit code, I remember that there was non-performant code I was writing that could be improved if necessary (the whole "optimize after you measure" philosophy). So given how bad it is, can you please send an ETL trace of the tool running (perhaps the command line invocation on just one of your larger PDBs)? I bet we can fix it significantly (perhaps an order of magnitude).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. Then we might even be able to use the srcsrv support in the PDB to self-extract the source. :)
I'll mention you on an issue where I'll track the work to ask you more questions. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey, whatever the new thing is, I'm game. Just point me at the docs. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean including a command in the PDB to extract the source? Let's not do that, that's a security hole right there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new thing is Portable PDBs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that embedding to windows PDBs via the compiler was pushed out to Update 1: #13707
The compiler supports embedding source in portable PDBs only right now.
Please tag me as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
@@ -40,6 +40,14 @@ try | |||
$count = 0 | |||
|
|||
foreach ($relativePath in $signToolData.sign | %{ $_.values }) { | |||
|
|||
# This is a DLL we sign but do not produce. Hence it doesn't have a PDB | |||
# to be source linked. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we signing it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tmat because we were signing it 😄
Basically when I converted to SignUtil I started by doing a diagnostic run on our signed builds. That spit out the list of assemblies which were being signed. I assumed that was correct and replicated the list in the sign tool data.
If that's wrong I'm happy to remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's wrong. We are not building it in the repo so we shouldn't be signing it. It's signed by the build queue of symreader-portable repo.
This PR seems related to #3. Not to close it, but it satisfies some of the requirements presented there. |
Closing as after some discussion we are switching to using GitLink. |
@jaredpar: can you share those discussions? I'm very curious what your reasoning was and what approach you will take (e.g. msbuild integration, running gitlink.exe passing in a solution file, etc.) |
No description provided.