Allow spaces in octal attributes #74358

Closed
@@ -202,19 +202,18 @@ internal static TarEntryType GetCorrectTypeFlagForFormat(TarEntryFormat format,
 /// <summary>Parses a byte span that represents an ASCII string containing a number in octal base.</summary>
 internal static T ParseOctal<T>(ReadOnlySpan<byte> buffer) where T : struct, INumber<T>
 {
-    buffer = TrimEndingNullsAndSpaces(buffer);
-
     T octalFactor = T.CreateTruncating(8u);
     T value = T.Zero;
-    foreach (byte b in buffer)
+
+    // skip leading non-octal bytes
+    int offset = 0;
+    for (; offset < buffer.Length && (buffer[offset] < (byte)'0' || buffer[offset] > (byte)'7'); ++offset);
+
+    foreach (byte b in buffer.Slice(offset))
     {
-        uint digit = (uint)(b - '0');
-        if (digit >= 8)
-        {
-            ThrowInvalidNumber();
-        }
+        if (b < (byte)'0' || b > (byte)'7') break;

-        value = checked((value * octalFactor) + T.CreateTruncating(digit));
+        value = checked((value * octalFactor) + T.CreateTruncating((uint)(b - '0')));
Comment on lines -212 to +216
Member

Which assets have octal fields with non-octal characters on either end? I'm concerned that we are basically allowing garbage and not respecting the tar spec anymore.

Member

Yeah, let's not intentionally be more lenient anywhere than we find existing tools to be.

Member Author

@am11 am11 Aug 22, 2022

@danmoseley, it is exactly as lenient as libarchive's octal parsing, which allows any non-octal character at the beginning and end.

The GNU tar and busybox implementations allow both leading and trailing whitespace and nulls.

.NET's implementation is the strictest one: we ignore nulls and spaces only when they appear at the end, and we do not allow them at the beginning.

I haven't found a formal specification that argues for or against these behaviors, so it is up to us to decide which existing implementation to follow.

Member

But ignoring nulls and spaces is not the same as ignoring any character that is either smaller than '0' or larger than '7'.

Member Author

@am11 am11 Aug 22, 2022

libarchive ignores any non-octal character. :)
https://github.com/libarchive/libarchive/blob/9e5279615033fa073d0bbae83b366359eab3b62f/contrib/untar.c#L45

If this is too lenient, I can restrict it to allow only nulls and spaces.
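For illustration, the narrower alternative mentioned above (tolerating only nulls and spaces on either end) could look roughly like the sketch below. This is not code from the PR: `OctalSketch`, `IsPadding`, and `ParseOctalPadded` are hypothetical names, the value type is fixed to `long` instead of the generic `T`, and throwing `FormatException` directly is an assumption standing in for the existing `ThrowInvalidNumber()` helper.

```csharp
using System;

// Hypothetical sketch, not the PR's code: accept NUL and space padding on
// both ends of the field, but still reject any other non-octal byte.
static class OctalSketch
{
    private static bool IsPadding(byte b) => b == 0 || b == (byte)' ';

    public static long ParseOctalPadded(ReadOnlySpan<byte> buffer)
    {
        int start = 0;
        int end = buffer.Length;

        // Trim leading and trailing nulls and spaces only.
        while (start < end && IsPadding(buffer[start])) start++;
        while (end > start && IsPadding(buffer[end - 1])) end--;

        long value = 0;
        for (int i = start; i < end; i++)
        {
            // Bytes below '0' wrap around to a large uint, so one comparison
            // rejects everything outside '0'..'7'.
            uint digit = (uint)(buffer[i] - '0');
            if (digit >= 8)
            {
                throw new FormatException("Invalid octal digit in tar header field.");
            }

            value = checked(value * 8 + digit);
        }

        return value;
    }
}
```

With this variant, a field containing `"  644 \0"` parses to 420, while a field such as `"6x4"` still throws, so garbage in the middle of a field is not silently accepted.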

}

return value;
@@ -1,4 +1,4 @@
-// Licensed to the .NET Foundation under one or more agreements.
+// Licensed to the .NET Foundation under one or more agreements.
 // The .NET Foundation licenses this file to you under the MIT license.

 using System.Collections.Generic;
@@ -125,28 +125,38 @@ public void Read_Archive_LongFileName_Over100_Under255(TarEntryFormat format, Te
 public void Read_Archive_LongPath_Over255(TarEntryFormat format, TestTarFormat testFormat) =>
     Read_Archive_LongPath_Over255_Internal(format, testFormat);

-[Fact]
-public void Read_NodeTarArchives_Successfully()
+public static IEnumerable<object[]> GetTarFiles(string subset)
 {
-    string nodeTarPath = Path.Join(Directory.GetCurrentDirectory(), "tar", "node-tar");
-    foreach (string file in Directory.EnumerateFiles(nodeTarPath, "*.tar", SearchOption.AllDirectories))
+    string path = Path.Join(Directory.GetCurrentDirectory(), "tar", subset);
+    foreach (string file in Directory.EnumerateFiles(path, "*.tar", SearchOption.AllDirectories))
     {
-        using FileStream sourceStream = File.Open(file, FileMode.Open, FileAccess.Read, FileShare.Read);
-        using var reader = new TarReader(sourceStream);
+        yield return new object[] { file };
+    }
+}

-        TarEntry? entry = null;
-        while (true)
-        {
-            Exception ex = Record.Exception(() => entry = reader.GetNextEntry());
-            Assert.Null(ex);
+[Theory]
+[MemberData(nameof(GetTarFiles), "node-tar")]
+[MemberData(nameof(GetTarFiles), "tar-rs")]
+public void Read_TestArchives_Successfully(string file)
+{
+    using FileStream sourceStream = File.Open(file, FileMode.Open, FileAccess.Read, FileShare.Read);
+    using var reader = new TarReader(sourceStream);

-            if (entry is null) break;
+    TarEntry? entry = null;
+    while ((entry = reader.GetNextEntry()) != null)
+    {
+        Assert.NotNull(entry.Name);
+        Assert.True(Enum.IsDefined(entry.EntryType));
+        Assert.True(Enum.IsDefined(entry.Format));

-            ex = Record.Exception(() => entry.Name);
-            Assert.Null(ex);
+        if (entry.EntryType == TarEntryType.Directory)
+            continue;

-            ex = Record.Exception(() => entry.Length);
-            Assert.Null(ex);
+        var ds = entry.DataStream;
+        if (ds != null && ds.Length > 0)
+        {
+            using var memoryStream = new MemoryStream();
+            ds.CopyTo(memoryStream);
         }
     }
 }
@@ -20,19 +20,6 @@ public void MalformedArchive_TooSmall()
     Assert.Throws<EndOfStreamException>(() => reader.GetNextEntry());
 }

-[Fact]
-public void MalformedArchive_HeaderSize()
Member

Why were these tests removed?

Member

@carlossanlop carlossanlop Aug 22, 2022

Right, but I am concerned that we now allow a whole header to contain invalid data. I do not think we want this level of flexibility.

The test that is being removed would've thrown at some point due to having 512 consecutive 0x1 bytes. Let's analyze each one of the fields in the header:

From V7:

  • name: This can be assumed to be valid since we store it as string and don't validate the value until we extract.
  • mode, uid, gid, size, mtime, checksum: they all expect octal numbers. With your change, they would not throw if there are non-octal chars.
  • linkname: Same as name.

From Ustar and Gnu:

  • typeflag: We should be OK with any character here, even if it's not among the ones defined in our enum (this should throw right now, but it's a bug unrelated to this change; we should allow any ASCII character. Verified: we allow values outside of the enum).
  • magic: this should throw. We only have three possible values: either it's all nulls, or it's the Ustar magic, or it's the Gnu magic.
  • version: Similar to magic.
  • uname/gname: would not throw, we store this as a string.
  • devmajor/devminor: Octal, this would pass with your change.
  • prefix: Same as name and linkname.

Then, since the size field had some data but we didn't find any octal characters in it, we would set the size to 0 with your change. So no data.
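The size-field case described above can be reproduced in isolation with a small sketch. It copies the loop shape proposed in the diff but is specialized to `long` for brevity; `LenientOctalDemo`, `ParseLenient`, and `ParseGarbageSizeField` are hypothetical names for illustration, and the 12-byte length is the size field width in the V7/ustar header layout.

```csharp
using System;

// Hypothetical stand-alone reproduction, not the PR's code.
static class LenientOctalDemo
{
    // Same shape as the proposed loop: skip leading non-octal bytes,
    // then stop at the first non-octal byte.
    public static long ParseLenient(ReadOnlySpan<byte> buffer)
    {
        int offset = 0;
        while (offset < buffer.Length &&
               (buffer[offset] < (byte)'0' || buffer[offset] > (byte)'7'))
        {
            offset++;
        }

        long value = 0;
        foreach (byte b in buffer.Slice(offset))
        {
            if (b < (byte)'0' || b > (byte)'7') break;
            value = checked(value * 8 + (b - '0'));
        }

        return value;
    }

    // A size field filled with 0x1, as in the removed MalformedArchive_HeaderSize test.
    public static long ParseGarbageSizeField()
    {
        byte[] size = new byte[12];
        Array.Fill<byte>(size, 0x1);
        return ParseLenient(size);
    }
}
```

Every byte of the field is skipped as a leading non-octal byte, so the loop over digits never runs and the result is 0 instead of an exception.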

-{
-    using MemoryStream malformed = new MemoryStream();
-    byte[] buffer = new byte[512]; // Minimum length of any header
-    Array.Fill<byte>(buffer, 0x1);
-    malformed.Write(buffer);
-    malformed.Seek(0, SeekOrigin.Begin);
-
-    using TarReader reader = new TarReader(malformed);
-    Assert.Throws<FormatException>(() => reader.GetNextEntry());
-}
-
 [Fact]
 public void EmptyArchive()
 {
@@ -39,23 +39,6 @@ public async Task MalformedArchive_TooSmall_Async()
     }
 }

-[Fact]
-public async Task MalformedArchive_HeaderSize_Async()
-{
-    await using (MemoryStream malformed = new MemoryStream())
-    {
-        byte[] buffer = new byte[512]; // Minimum length of any header
-        Array.Fill<byte>(buffer, 0x1);
-        malformed.Write(buffer);
-        malformed.Seek(0, SeekOrigin.Begin);
-
-        await using (TarReader reader = new TarReader(malformed))
-        {
-            await Assert.ThrowsAsync<FormatException>(async () => await reader.GetNextEntryAsync());
-        }
-    }
-}
-
 [Fact]
 public async Task EmptyArchive_Async()
 {