Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Base64.Decode: fixed latent bug for non-ASCII inputs #76795

Merged
merged 4 commits into from
Oct 10, 2022

Conversation

gfoidl
Copy link
Member

@gfoidl gfoidl commented Oct 9, 2022

Repro:

string base64      = "ìz/TpH7sqEkerqMweH1uSw==";    // first char is not ASCII
byte[] base64Bytes = Encoding.UTF8.GetBytes(base64);
Span<byte> data    = stackalloc byte[128];

OperationStatus status = Base64_.DecodeFromUtf8(base64Bytes, data, out int consumed, out int written);

Here the ì isn't recognized as being not part of the base64-alphabet, so wrong data is decoded. Ultimately the check

fails as the length doesn't match (base64-len is 24, base64Bytes-len is 25).
So status reports InvalidData correctly, but consumed (24) and written (17) are wrong, as they should be 0 each.

This latent bug got introduced by #70654, and unfortunately there was a test-hole that didn't catch this.
#70654 (comment) talks about the shuffle mask -- here the highest bit of the mask-byte is set so after the shuffle the bits of that vector-element are set to 0, thus

// Check for invalid input: if any "and" values from lo and hi are not zero,
// fall back on bytewise code to do error checking and reporting:
if ((lo & hi) != Vector128<byte>.Zero)
break;
doesn't detect the non-ASCII / non-base64 char.

By masking with 0xF we clear the highest bit -- actually we mask by 0x2F as that constant is used somewhere else and for shuffling (besides the highest bit) only the lower-nibble of the mask-byte is relevant.

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Oct 9, 2022
@ghost
Copy link

ghost commented Oct 9, 2022

Tagging subscribers to this area: @dotnet/area-system-memory
See info in area-owners.md if you want to be subscribed.

Issue Details

Repro:

string base64      = "ìz/TpH7sqEkerqMweH1uSw==";    // first char is not ASCII
byte[] base64Bytes = Encoding.UTF8.GetBytes(base64);
Span<byte> data    = stackalloc byte[128];

OperationStatus status = Base64_.DecodeFromUtf8(base64Bytes, data, out int consumed, out int written);

Here the ì isn't recognized as being not part of the base64-alphabet, so wrong data is decoded. Ultimately the check

fails as the length doesn't match (base64-len is 24, base64Bytes-len is 25).
So status reports InvalidData correctly, but consumed (24) and written (17) are wrong, as they should be 0 each.

This latent bug got introduced by #70654, and unfortunately there was a test-hole that didn't catch this.
#70654 (comment) talks about the shuffle mask -- here the highest bit of the mask-byte is set so after the shuffle the bits of that vector-element are set to 0, thus

// Check for invalid input: if any "and" values from lo and hi are not zero,
// fall back on bytewise code to do error checking and reporting:
if ((lo & hi) != Vector128<byte>.Zero)
break;
doesn't detect the non-ASCII / non-base64 char.

By masking with 0xF we clear the highest bit -- actually we mask by 0x2F as that constant is used somewhere else and for shuffling (besides the highest bit) only the lower-nibble of the mask-byte is relevant.

Author: gfoidl
Assignees: -
Labels:

area-System.Memory

Milestone: -

@gfoidl
Copy link
Member Author

gfoidl commented Oct 9, 2022

In CI there's lots of

{"$id":"1","innerException":null,"message":"VS403205: The organization is currently offline for migration. Once the migration has been completed the organization will come back online. Try accessing the organization at a later time. Host Id: 6fcc92e5-73a7-4f88-8d13-d9045b45fb27","typeName":"Microsoft.TeamFoundation.Framework.Server.HostShutdownException, Microsoft.TeamFoundation.Framework.Server","typeKey":"HostShutdownException","errorCode":0,"eventId":3000}
   at Microsoft.DotNet.Helix.AzureDevOps.AzureDevOpsTask.HandleFailedRequest(HttpRequestMessage req, HttpResponseMessage res) in /_/src/Microsoft.DotNet.Helix/Sdk/AzureDevOpsTask.cs:line 139
   at Microsoft.DotNet.Helix.AzureDevOps.AzureDevOpsTask.ParseResponseAsync(HttpRequestMessage req, HttpResponseMessage res) in /_/src/Microsoft.DotNet.Helix/Sdk/AzureDevOpsTask.cs:line 147
   at Microsoft.DotNet.Helix.AzureDevOps.StartAzurePipelinesTestRun.<>c__DisplayClass32_0.<<ExecuteCoreAsync>b__0>d.MoveNext() in /_/src/Microsoft.DotNet.Helix/Sdk/StartAzurePipelinesTestRun.cs:line 59
--- End of stack trace from previous location ---

When the org is online again, someone please re-trigger the build. Thanks in advance.

@kasperk81
Copy link
Contributor

When the org is online again

no clue. maintenance breaks should be publicly announced either at https://github.com/dotnet/announcements/issues or https://github.com/dotnet/core/issues

@MattGal

Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT, thank you for the fix @gfoidl !

@adamsitnik
Copy link
Member

/backport to release/7.0

@github-actions
Copy link
Contributor

Started backporting to release/7.0: https://github.com/dotnet/runtime/actions/runs/3219528468

@MattGal
Copy link
Member

MattGal commented Oct 10, 2022

When the org is online again

no clue. maintenance breaks should be publicly announced either at https://github.com/dotnet/announcements/issues or https://github.com/dotnet/core/issues

@MattGal

Things are back to normal now, but yes I agree this was less than ideal. The expectation from previous migrations was that this should have taken a few hours, but once it started it had to be completed. It was expected that an Azure Devops Banner and partners DL email was enough, but clearly it caught people by surprise and we'll take these learnings for the next time a scheduled downtime happens.

@stephentoub
Copy link
Member

It was expected that an Azure Devops Banner and partners DL email was enough

There was no indication anywhere in dotnet/runtime, at least not that I saw.

@adamsitnik adamsitnik merged commit ef6bc67 into dotnet:main Oct 10, 2022
@gfoidl gfoidl deleted the base64-decode-fix branch October 10, 2022 19:48
@ghost ghost locked as resolved and limited conversation to collaborators Nov 10, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Memory community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants