-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dotnet publish
fails on macOS on ARM
#246
Comments
This is really interesting - The code looks like what I'd expect to see here (no direct invocation of a disposed object): {
long fileSize;
Span<byte> hash = stackalloc byte[SHA256.HashSizeInBytes];
byte[] uncompressedHash;
string tempTarballPath = ContentStore.GetTempFile();
using (FileStream fs = File.Create(tempTarballPath))
{
using (HashDigestGZipStream gz = new(fs, leaveOpen: true))
{
using (TarWriter writer = new(gz, TarEntryFormat.Gnu, leaveOpen: true))
{
foreach (var item in fileList)
{
// Docker treats a COPY instruction that copies to a path like `/app` by
// including `app/` as a directory, with no leading slash. Emulate that here.
string containerPath = item.containerPath.TrimStart(PathSeparators);
writer.WriteEntry(item.path, containerPath);
}
} // Dispose of the TarWriter before getting the hash so the final data get written to the tar stream
uncompressedHash = gz.GetHash();
}
fileSize = fs.Length;
fs.Position = 0;
SHA256.HashData(fs, hash);
} and we're explicitly not closing any of the streams until the outer filestream exits the scope. @rainersigwald any thoughts on this one? Wondering if tere might be some kind of ARM64 interaction on hashing on macOS? |
Yeah, that tracks quite well with what I've figured out too. @eerhardt any hope this immediately rings a bell for you? |
Hmm, I tried to repro this on my macOS Ventura M1, but it didn't repro. @aeons - does it repro for you every time? Can you pass |
I wonder if an exception is happening inside sdk-container-builds/Microsoft.NET.Build.Containers/Layer.cs Lines 44 to 58 in 4ed9915
but before
Which will set the But in our sdk-container-builds/Microsoft.NET.Build.Containers/Layer.cs Lines 138 to 143 in 4ed9915
Since the I think the fix here is to flip around the ordering of calling Dispose sdk-container-builds/Microsoft.NET.Build.Containers/Layer.cs Lines 138 to 143 in 4ed9915
Dispose the cc @bartonjs - to see check to make sure what I'm saying makes sense. |
I think a better fix would be to drop CryptoStream altogether. And SHA256 while you're at it. ( |
I encounted the same issue on a real project I tried to use container builds in, and so far it happens every time. I have a binlog. From my looking at it, it doesn't contain any more information on the exception. I can email it to you if you are interested, but I would prefer not to share it here. |
It still fails, but now it seems we get the real exception:
|
Ok, this I expect is because we're using the 'convenience' wrappers for constructing Tar Entry objects, and those wrappers do user/group lookups. We've got an item logged to move away from this and do more explicit creation of the file and directory entries (partially because of this, partially to support explicit user/group mapping for rootless container support), but we haven't done that work quite yet. See #160 for the discussion around that. |
Interesting. I wonder why that fails on macOS. |
FYI - @carlossanlop Should we re-open this issue? Or open a new one? (Or move it to |
Makes sense to reopen - @aeons' still blocked from the end result, even if you did address a blocker they found. |
Any updates on this? Also seeing errors when trying to use this with macOS on ARM. |
I am also experiencing this issue on a Mac (but this one's Intel)
|
Sorry folks, no update on this one. The root cause is this line - instead of letting the TAR APIs construct a TarEntry for us, we'd need to manually create the group/user - it seems the metadata on the existing files isn't correct somehow on macOS. Neither Rainer nor I have easy access to a macOS device, though, so it will likely be slow going. |
@carlossanlop - why doesn't TarWriter work "correctly" in this case? |
@baronfel Thanks for pointing out the crashing line! I can repro in a simple example and created an issue (which automatically got linked above) with the report. Additionally, it does look like explicitly creating |
Luckily we should only need to make Directory and RegularFile entries (at least the last time I took a look at this). I think this works on Windows (as you noted) because on Windows the group/user data is mostly elided. Thanks for writing up the issue on Runtime - you wouldn't by chance be interested in trying to fix it here would you? 🎣 |
The problem is reported in the callstack to be coming from this line specifically: https://github.com/dotnet/runtime/blob/0ca364767715fee7bc9d0273eec4f30a3c6cc8f0/src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarWriter.Unix.cs#L85 It seems we are getting an ERRNO 0. According to the docs, it is a possible value for a failure to retrieve a groupname or a groupid: https://linux.die.net/man/3/getgrgid_r I assume we're using getgrgid_r, the thread-safe version of that method. It is supported by MacOSX: https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/getgrgid_r.3.html Since the method that captures the ERRNO is returning null (L85 above), we are throwing here: https://github.com/dotnet/runtime/blob/9d44b9bd4eddd5e523a44056c722bcc93bccca0e/src/libraries/Common/src/Interop/Unix/System.Native/Interop.GetGroupName.cs#L20 So my question is for @baronfel: what groupname/groupid is expected here? It seems it does not exist. Either we get the group created before executing all this tar code, or as you suggested, manually construct the entry. |
The User and group information should come from the I wonder if we're been wrong all along in our tooling - perhaps there should almost never be an inferred user or group assignment! |
I'd really like to avoid this plan to address this issue, if we can. I think it may just be masking a bug in the TarWriter itself. Anyone else using |
Heh; I locally made the requisite changes to The output is a little hard to read; it seems like there's some interleaving of Docker output with test output... but there are two main types of errors:
Here's a gist with the full test output, if it's helpful: https://gist.github.com/jacobcarpenter/0b105dc62030a6f3af8bd88b1258d6b2 |
Yes; correct. Should this Issue title be edited to remove the "on ARM" qualifier? |
I can submit a fix for |
I think that solution would be ok - from what I can see the name is really what matters. If the group or user name is present in the entry, then |
Fixing the underlying issue so others don't run into it sounds good to me. I'm not sure if it needs to be backported to .NET 7 or not. But if you think so, I wouldn't object. |
Thanks both.
Is this necessary? The tar entry header has a standard field
Do we want this to happen? This
@baronfel I think you can answer the question since sdk-container-builds is the customer - Would this repo benefit from backporting such fix to .NET 7, or should it just stay in main for .NET 8? |
I agree that leaving it unmodified is the best course of action for the .NET API - I was just trying to relate the behavior of
We would benefit a ton from a fix here - it seems like many (all?) of our users on macOS are hitting this wall and only able to succeed using Windows or Linux CI machines. For 7.0.2xx this feature is easily lit up for WebSDK projects, and for 7.0.3xx we're targeting full, automatic support for all project types via shipping in the .NET SDK, so I'd hope that usage of the tool (and therefore errors from macOS users) would also go up. |
This fix is in the runtime and has been merged and backported, so I'll close this issue for now. Thank you all for the report, discussion, and thank you @carlossanlop for the fix! |
Hi,
I tried to publish an image from a fresh web app, but hit an
ObjectDisposedException
:My system:
The text was updated successfully, but these errors were encountered: