ObjWriter: Do not generate relocations within .debug_info for DW_AT_type #98597

Merged
merged 2 commits into from
Feb 20, 2024

Conversation

filipnavara
Member

Fixes #98377.

Neither gcc nor clang generates relocations for references within the same .debug_info section. The DWARF 5 specification, section 7.3.1, doesn't list this type of relocation as required. At least some versions of the GNU binutils linker seem to apply both the relocation and its own offset in the final file, resulting in the offset being applied twice.


@filipnavara filipnavara requested a review from am11 February 17, 2024 05:51
@filipnavara filipnavara added the community-contribution Indicates that the PR has been added by a community member label Feb 17, 2024
@am11
Member

am11 commented Feb 17, 2024

Tested with your branch, still getting the warning (with smaller offsets):

nm: DWARF error: offset (198379264) greater than or equal to .debug_str size (1049227)
nm: DWARF error: offset (202103815) greater than or equal to .debug_str size (1049227)
nm: DWARF error: offset (1484298) greater than or equal to .debug_str size (1049227)
nm: DWARF error: could not find abbrev number 83

@filipnavara
Member Author

Damn. I wonder what’s wrong with the linker. I guess I will have to reproduce the whole flow locally.

@filipnavara
Member Author

@am11 Could you please share the nine/nine.o from this attempt? I want to run them through [llvm-]dwarfdump to see if there are any other errors.

@am11
Member

am11 commented Feb 17, 2024

attempt1_gh-98597.tar.gz

@filipnavara
Member Author

attempt1_gh-98597.tar.gz

I ran it through readelf -rW obj/Release/net9.0/linux-musl-arm64/native/nine.o, and in the .rela.debug_info section I still see references to .debug_info:

0000000000000702  0000000d00000102 R_AARCH64_ABS32        0000000000000000 .debug_str + 726
000000000000070c  0000000c00000102 R_AARCH64_ABS32        0000000000000000 .debug_info + 710
000000000000071c  0000000d00000102 R_AARCH64_ABS32        0000000000000000 .debug_str + 759
0000000000000720  0000000d00000102 R_AARCH64_ABS32        0000000000000000 .debug_str + 764
000000000000072a  0000000c00000102 R_AARCH64_ABS32        0000000000000000 .debug_info + 72e
0000000000000735  0000000d00000102 R_AARCH64_ABS32        0000000000000000 .debug_str + 78d
0000000000000739  0000000d00000102 R_AARCH64_ABS32        0000000000000000 .debug_str + 79f
0000000000000743  0000000c00000102 R_AARCH64_ABS32        0000000000000000 .debug_info + 747

So, either I messed something up or it didn't use the new ILCompiler.
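
The manual check above can be turned into a quick filter. This is only a sketch: `count_self_relocs` is a hypothetical helper name, and it assumes the usual `readelf -rW` layout in which each section starts with a `Relocation section '<name>' ...` header and the symbol name is the fifth column, as in the listing above.

```shell
# Count relocations inside .rela.debug_info that target the .debug_info
# section itself. Reads `readelf -rW` output on stdin.
# (Helper name is illustrative; it is not part of binutils or the runtime repo.)
count_self_relocs() {
  awk '
    # Track which relocation section we are in (\047 is a single quote).
    /^Relocation section/ { in_info = ($3 == "\047.rela.debug_info\047") }
    # Column 5 is the symbol name in the wide readelf format shown above.
    in_info && $5 == ".debug_info" { n++ }
    END { print n + 0 }
  '
}
```

For example, `readelf -rW obj/Release/net9.0/linux-musl-arm64/native/nine.o | count_self_relocs` should print 0 once the compiler no longer emits these relocations.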

@filipnavara
Member Author

Ah, I see, I missed one place:

public void WriteInfoAbsReference(long offset)
{
    Debug.Assert(offset < uint.MaxValue);
    _infoSectionWriter.EmitSymbolReference(RelocType.IMAGE_REL_BASED_HIGHLOW, ".debug_info", offset);
}

@filipnavara
Member Author

I updated the PR. A retest would be nice. Thanks for bearing with me on this one!

@am11
Member

am11 commented Feb 17, 2024

All good! 👍
attempt2_gh-98597.tar.gz


The flow I'm using is a bit involved (and most likely inefficient). I am testing on macOS, with #98603, the "live" hello world app built via sdk. Can do the same with some simple smoke test binary as well, haven't tried. 😅

nine.tar.gz (this is helloworld + custom nuget.config)

# host osx-arm64
$ git clone https://github.com/dotnet/runtime --single-branch --depth 1; cd runtime
$ curl -sSL https://github.com/dotnet/runtime/pull/98603.patch | git apply -
$ curl -sSL https://github.com/dotnet/runtime/pull/98597.patch | git apply -
$ curl -sSLO https://github.com/dotnet/runtime/files/14319205/nine.tar.gz

$ docker run --rm -v$(pwd):/runtime -w /runtime \
    --platform linux/arm64 -it alpine

# in container REPL
$ eng/install-native-dependencies.sh

$ tar xzf nine.tar.gz -C /

# repeatable
$ rm -rf artifacts
$ ./build.sh clr+libs+packs
$ /runtime/dotnet.sh nuget locals all --clear
$ cd /nine
$ rm -rf obj bin dist packs1
$ /runtime/dotnet.sh publish -p:PublishAot=true -o dist -p:StripSymbols=false --packages packs1

$ apk add binutils
$ nm --portability --line-numbers dist/nine > /dev/null
# this time no warnings! 🎉
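
To make the last step a pass/fail check rather than an eyeball test, something like the following could be used. It is a rough sketch: `count_dwarf_errors` is a made-up helper name, and it assumes nm reports these diagnostics on stderr with the `DWARF error` wording seen earlier in this thread.

```shell
# Count "DWARF error" diagnostics in text read from stdin.
# (Hypothetical helper; the match string is taken from the nm output quoted above.)
count_dwarf_errors() {
  # grep -c prints 0 but exits with status 1 when nothing matches;
  # "|| true" keeps the function usable under "set -e".
  grep -c 'DWARF error' || true
}
```

Usage would look like `nm --portability --line-numbers dist/nine 2>&1 >/dev/null | count_dwarf_errors`, which sends only stderr into the pipe and should print 0 after the fix.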

@filipnavara filipnavara marked this pull request as ready for review February 17, 2024 15:39
@filipnavara
Member Author

filipnavara commented Feb 17, 2024

Thanks for checking!

The flow I'm using is a bit involved...

Yeah, I figured that's what you are likely doing. I tried numerous different ways:

  • Installing Alpine as a VM in UTM on macOS. It hard-crashed the kernel when trying to clone the Git repository.
  • Installing Alpine in WSL2 on ARM64 Windows. It failed horribly: the scripts no longer work, and neither does the unofficial app in the Store.
  • Cross-compiling with dotnet for linux-musl-arm64 from a linux-x64 machine. Surprisingly, that works, but it doesn't produce corrupted DWARF, likely because the LLVM linker is used.

@am11
Member

am11 commented Feb 17, 2024

That is unfortunate. Even this docker flow sometimes fails in the packs subset, but only right after creating the ILCompiler nupkg, which is the one we are interested in 😅

  oob -> Trimming linux-musl-arm64 out-of-band assemblies with ILLinker...
  Microsoft.DotNet.ILCompiler -> /runtime/artifacts/packages/Debug/Shipping/Microsoft.DotNet.ILCompiler.9.0.0-dev.nupkg
  Microsoft.DotNet.ILCompiler -> /runtime/artifacts/packages/Debug/Shipping/runtime.linux-musl-arm64.Microsoft.DotNet.ILCompiler.9.0.0-dev.nupkg
  Microsoft.NETCore.App.Ref -> 
/runtime/.dotnet/sdk/9.0.100-alpha.1.23615.4/Sdks/NuGet.Build.Tasks.Pack/build/NuGet.Build.Tasks.Pack.targets(221,5): error : Could not find a part of the path '/runtime/src/installer/pkg/sfx/Microsoft.NETCore.App'. [/runtime/src/installer/pkg/sfx/Microsoft.NETCore.App/Microsoft.NETCore.App.Ref.sfxproj]

I guess this intermittent failure is some hardlink issue of msbuild (or the way our sfxprojs are authored?). Luckily we can continue testing ilcompiler nupkg even with this broken flow. :)

@akoeplinger
Member

I guess this intermittent failure is some hardlink issue of msbuild (or the way our sfxprojs are authored?). Luckily we can continue testing ilcompiler nupkg even with this broken flow. :)

It could also be an issue with the filesystem sync of the mounted volume from the mac host, you might want to try cloning the runtime repo inside of the docker linux filesystem.

@am11
Member

am11 commented Feb 19, 2024

It could also be an issue with the filesystem sync of the mounted volume from the mac host, you might want to try cloning the runtime repo inside of the docker linux filesystem.

I noticed that when it fails, it always seems to fail on Microsoft.NETCore.App.Ref.sfxproj. Maybe there is something special about this project that is causing it? I will try to capture diagnostic dumps.

@am11
Member

am11 commented Feb 19, 2024

$ ./build.sh packs -c Release -v:diag
...
11:50:00.347   7:3>/runtime/.dotnet/sdk/9.0.100-alpha.1.23615.4/Sdks/NuGet.Build.Tasks.Pack/build/NuGet.Build.Tasks.Pack.targets(221,5): error : Could not find a part of the path '/runtime/src/installer/pkg/sfx/Microsoft.NETCore.App'. [/runtime/src/installer/pkg/sfx/Microsoft.NETCore.App/Microsoft.NETCore.App.Ref.sfxproj]
                     Assembly loaded during TaskRun (NuGet.Build.Tasks.Pack.PackTask): System.Diagnostics.StackTrace, Version=9.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a (location: /runtime/.dotnet/shared/Microsoft.NETCore.App/9.0.0-alpha.1.23614.10/System.Diagnostics.StackTrace.dll, MVID: e4055d45-a175-49e1-afcb-8df2f6d2eee3, AssemblyLoadContext: Default) (TaskId:14)
                     System.IO.DirectoryNotFoundException: Could not find a part of the path '/runtime/src/installer/pkg/sfx/Microsoft.NETCore.App'.
                        at System.IO.Enumeration.FileSystemEnumerator`1.CreateDirectoryHandle(String path, Boolean ignoreNotFound)
                        at System.IO.Enumeration.FileSystemEnumerator`1.Init()
                        at System.IO.Enumeration.FileSystemEnumerable`1..ctor(String directory, FindTransform transform, EnumerationOptions options, Boolean isNormalized)
                        at System.IO.Enumeration.FileSystemEnumerableFactory.UserFiles(String directory, String expression, EnumerationOptions options)
                        at System.IO.Directory.InternalEnumeratePaths(String path, String searchPattern, SearchTarget searchTarget, EnumerationOptions options)
                        at System.IO.Directory.GetFiles(String path, String searchPattern, EnumerationOptions enumerationOptions)
                        at NuGet.Common.PathResolver.PerformWildcardSearch(String basePath, String searchPath, Boolean includeEmptyDirectories, String& normalizedBasePath)
                        at NuGet.Packaging.PackageBuilder.ResolveSearchPattern(String basePath, String searchPath, String targetPath, Boolean includeEmptyDirectories)
                        at NuGet.Packaging.PackageBuilder.AddFiles(String basePath, String source, String destination, String exclude)
                        at NuGet.Packaging.PackageBuilder.PopulateFiles(String basePath, IEnumerable`1 files)
                        at NuGet.Commands.MSBuildProjectFactory.CreateBuilder(String basePath, NuGetVersion version, String suffix, Boolean buildIfNeeded, PackageBuilder builder)
                        at NuGet.Commands.PackCommandRunner.BuildFromProjectFile(String path)
                        at NuGet.Build.Tasks.Pack.PackTask.Execute() (TaskId:14)
                   Done executing task "PackTask" -- FAILED. (TaskId:14)

the directory exists:

$ ls /runtime/src/installer/pkg/sfx/Microsoft.NETCore.App

Directory.Build.props                    Microsoft.NETCore.App.Host.sfxproj                             Microsoft.NETCore.App.MonoCrossAOT.sfxproj  Microsoft.NETCore.App.Runtime.sfxproj  monocrossaot.sfxproj
Directory.Build.targets                  Microsoft.NETCore.App.MonoCrossAOT.Sdk.props.in                Microsoft.NETCore.App.Ref.sfxproj           PackageOverrides.txt
Microsoft.NETCore.App.Crossgen2.sfxproj  Microsoft.NETCore.App.MonoCrossAOT.UnixFilePermissions.xml.in  Microsoft.NETCore.App.Runtime.props         ReadyToRun.target

With this code (based on https://github.com/NuGet/NuGet.Client/blob/61bd0d260ea2f849dd842e1a881e99b875b085f4/src/NuGet.Core/NuGet.Common/PathUtil/PathResolver.cs#L149):

using System;
using System.IO;
using System.Linq; // needed for AsParallel()

const string path = "/runtime/src/installer/pkg/sfx/Microsoft.NETCore.App";

// Enumerate only the files directly inside the directory.
foreach (string file in System.IO.Directory.GetFiles(path, "*", SearchOption.TopDirectoryOnly).AsParallel())
{
  Console.WriteLine(file);
}

// Enumerate files in the directory and all of its subdirectories.
foreach (string file in System.IO.Directory.GetFiles(path, "*", SearchOption.AllDirectories).AsParallel())
{
  Console.WriteLine(file);
}

it lists all the files (in both Release and Debug). No idea where the mix-up is happening. 🤷‍♂️

@akoeplinger
Member

@am11 this is pretty weird, the src/installer/pkg/sfx/Microsoft.NETCore.App path should always exist and it's not a symlink. I guess I'd file a separate issue and maybe try to inject more logging into FileSystemEnumerator.Unix.cs

@am11
Member

am11 commented Feb 20, 2024

Looks like a VirtioFS instability issue: docker/for-mac#6219 (comment)

  1. switching to gRPC FUSE fixed the error:
  2. only VirtioFS mount has that issue, making a copy of /runtime to /runtime2 fixed the error as well.
  3. In that particular context of NuGet, this workaround'ish fix is safe, makes sense and fixes the problem with VirtioFS as well:
    --- a/src/NuGet.Core/NuGet.Common/PathUtil/PathResolver.cs
    +++ b/src/NuGet.Core/NuGet.Common/PathUtil/PathResolver.cs
    @@ -144,6 +144,8 @@ public static IEnumerable<string> PerformWildcardSearch(string basePath, string
                     searchOption = SearchOption.TopDirectoryOnly;
                 }
    
    +            if (!Directory.Exists(normalizedBasePath)) return Enumerable.Empty<SearchPathResult>();
    +
                 // Starting from the base path, enumerate over all files and match it using the wildcard expression provided by the user.
                 // Note: We use Directory.GetFiles() instead of Directory.EnumerateFiles() here to support Mono
                 IEnumerable<SearchPathResult> matchedFiles = from file in Directory.GetFiles(normalizedBasePath, "*.*", searchOption).AsParallel()
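
The second workaround in the list (building from a copy that lives on the container's own filesystem instead of the VirtioFS mount) could be scripted roughly as below. This is a sketch under assumptions: `copy_off_mount` is a hypothetical helper, and `/runtime` and `/runtime2` are simply the paths mentioned above.

```shell
# Copy a bind-mounted tree to a container-local directory and print the
# destination path. (Illustrative helper, not part of the runtime repo.)
copy_off_mount() {
  src=$1
  dst=$2
  mkdir -p "$dst" &&
  # "src/." copies the directory contents, including dotfiles, preserving attributes.
  cp -a "$src"/. "$dst"/ &&
  printf '%s\n' "$dst"
}
```

Inside the container this might be used as `cd "$(copy_off_mount /runtime /runtime2)" && ./build.sh packs -c Release`, so the pack step never touches the mounted volume.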

still no idea why it fails only at that particular sfxproj (after building/packing hundreds of projects and compiling loads of native code on that mount)

i'll now stop hijacking @filipnavara's thread. 😁

Member

@jkotas jkotas left a comment


Thanks

@jkotas jkotas merged commit bee4602 into dotnet:main Feb 20, 2024
110 checks passed
@filipnavara filipnavara deleted the no_debug_info_reloc branch February 20, 2024 03:26
@github-actions github-actions bot locked and limited conversation to collaborators Mar 21, 2024
Labels
area-NativeAOT-coreclr community-contribution Indicates that the PR has been added by a community member
Development

Successfully merging this pull request may close these issues.

DWARF errors with nm on linux-musl
4 participants