[API Proposal]: Ascii.ToUtf16 overload that treats \0 as invalid #80366

Closed
gfoidl opened this issue Jan 9, 2023 · 26 comments
Labels: api-suggestion, area-System.Buffers

Comments

@gfoidl
Member

gfoidl commented Jan 9, 2023

Background and motivation

For ASP.NET Core's StringUtilities the ASCII values of the range (0x00, 0x80) are considered valid, whilst Ascii.ToUtf16 treats the whole ASCII range [0x00, 0x80) as valid. In order to base StringUtilities on the Ascii-APIs and avoid custom vectorized code in ASP.NET Core internals \0 should be allowed to be treated as invalid. See dotnet/aspnetcore#45962 for further info.
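
To make the two ranges concrete, here is a minimal scalar sketch. The class and method names are hypothetical and exist in neither ASP.NET Core nor the runtime; only the two ranges are taken from the description above.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class AsciiRanges
{
    // Ascii.ToUtf16 semantics: every byte in [0x00, 0x80) is valid.
    public static bool IsValidFullAscii(byte b) => b < 0x80;

    // StringUtilities semantics: every byte in (0x00, 0x80) is valid,
    // so \0 is rejected in addition to the non-ASCII bytes.
    public static bool IsValidNonNullAscii(byte b) => b != 0 && b < 0x80;
}
```

The two predicates differ for exactly one input, the byte 0x00.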

API Proposal

namespace System.Buffers.Text
{
    public static class Ascii
    {
        // existing methods
+       public static OperationStatus ToUtf16(ReadOnlySpan<byte> source, Span<char> destination, out int bytesConsumed, out int charsWritten, bool treatNullAsInvalid = false);
    }
}

The new ASCII APIs will be added in .NET 8, so an optional argument could be added without a breaking change:

namespace System.Buffers.Text
{
    public static class Ascii
    {
        // existing methods
-       public static OperationStatus ToUtf16(ReadOnlySpan<byte> source, Span<char> destination, out int bytesConsumed, out int charsWritten);
+       public static OperationStatus ToUtf16(ReadOnlySpan<byte> source, Span<char> destination, out int bytesConsumed, out int charsWritten, bool treatNullAsInvalid = false);
    }
}

API Usage

    private static unsafe void GetHeaderName(ReadOnlySpan<byte> source, Span<char> buffer)
    {
        OperationStatus status = Ascii.ToUtf16(source, buffer, out _, out _, treatNullAsInvalid: true);

        if (status != OperationStatus.Done)
        {
            KestrelBadHttpRequestException.Throw(RequestRejectionReason.InvalidCharactersInHeaderName);
        }
    }

Alternative Designs

No response

Risks

The value for treatNullAsInvalid will be given as constant, so the JIT should be able to dead-code eliminate any code needed for "default case" (whole ASCII-range incl. \0), so no perf-regression should be expected.

Besides treating \0 as a special value that is optionally treated as invalid, I don't expect any other value to be considered special enough for optional exclusion.

@gfoidl added the api-suggestion label Jan 9, 2023
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost added the untriaged label Jan 9, 2023
@stephentoub
Member

stephentoub commented Jan 9, 2023

Does the ASP.NET code measurably regress if the newly-added ToUtf16 is used plus a call to IndexOf((byte)'\0') to validate there wasn't a null?

    private static unsafe void GetHeaderName(ReadOnlySpan<byte> source, Span<char> buffer)
    {
        OperationStatus status = Ascii.ToUtf16(source, buffer, out _, out _);
        if (status != OperationStatus.Done || source.IndexOf((byte)'\0') >= 0)
        {
            KestrelBadHttpRequestException.Throw(RequestRejectionReason.InvalidCharactersInHeaderName);
        }
    }

@GrabYourPitchforks had some fairly strong opinions about special-casing '\0'.
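
For readers who want to try the two-pass variant outside Kestrel, the following is a self-contained sketch. It assumes the API as it appears in the JIT listing later in this thread, `System.Text.Ascii.ToUtf16` with a single `out int charsWritten`, rather than the four-argument shape used in the snippets above. The class name, method name, and bool-returning shape are hypothetical.

```csharp
using System;
using System.Buffers;
using System.Text;

// Hypothetical stand-in for the Kestrel helper; it returns false instead of throwing.
public static class TwoPassHeaderName
{
    public static bool TryGetHeaderName(ReadOnlySpan<byte> source, Span<char> buffer, out int charsWritten)
    {
        // First pass: validate [0x00, 0x80) and widen to UTF-16.
        OperationStatus status = Ascii.ToUtf16(source, buffer, out charsWritten);

        // Second pass: reject \0, which the first pass accepts as valid ASCII.
        return status == OperationStatus.Done && source.IndexOf((byte)0) < 0;
    }
}
```

The second scan only runs when the first pass succeeds, so the extra cost is paid on the common, valid input.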

@gfoidl
Member Author

gfoidl commented Jan 9, 2023

In local micro-benchmarks, yes, mainly because the input is scanned twice (the $O(2n)$ nature). And of course it depends on the input (length and position of \0).

It would be more interesting to see real-usage benchmarks, like how that would impact Techempower, etc. But unfortunately I don't know how to run such benchmarks (at the moment).

> fairly strong opinions about special-casing \0

I'm looking forward to reading them.

@ghost

ghost commented Jan 9, 2023

Tagging subscribers to this area: @dotnet/area-system-buffers
See info in area-owners.md if you want to be subscribed.

@benaadams
Member

benaadams commented Jan 9, 2023

> Does the ASP.NET code measurably regress if the newly-added ToUtf16 is used plus a call to IndexOf((byte)'\0') to validate there wasn't a null?
>
>     private static unsafe void GetHeaderName(ReadOnlySpan<byte> source, Span<char> buffer)
>     {
>         OperationStatus status = Ascii.ToUtf16(source, buffer, out _, out _);
>         if (status != OperationStatus.Done || source.IndexOf((byte)'\0') >= 0)
>         {
>             KestrelBadHttpRequestException.Throw(RequestRejectionReason.InvalidCharactersInHeaderName);
>         }
>     }
>
> @GrabYourPitchforks had some fairly strong opinions about special-casing '\0'.

99% of the time there won't be any nulls, but headers average 800 bytes to 2kB with cookies, so scanning the headers an additional time to check for nulls can be significant.

@gfoidl
Member Author

gfoidl commented Jan 9, 2023

> headers average 800 bytes to 2kB with cookies

Thanks for these numbers!
Out of interest: how / where did you get these from?

@benaadams
Member

> > headers average 800 bytes to 2kB with cookies
>
> Thanks for these numbers! Out of interest: how / where did you get these from?

Googled average headers size 😅

As an anecdote: going to the Google homepage while logged in, my headers are 2158 bytes and it makes 27 requests to that domain (26 to other domains), so in total 58kB for that page and one domain.

@svick
Contributor

svick commented Jan 9, 2023

Wouldn't it be better to treat \0 as special at the highest level, instead of at the lowest level?

For example, I think that when parsing HTTP 1 headers, you could look for \r, \n or \0 as the first step, instead of just \r and \n, and deal with \0 at that point. That would then mean you could safely use the current version of Ascii.ToUtf16 to convert the header bytes to UTF-16.
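
A rough sketch of that idea, assuming a flat buffer of header bytes. The type and member names are hypothetical and this is not Kestrel's parser; `IndexOfAny` with three values is the existing `MemoryExtensions` overload.

```csharp
using System;

// Hypothetical line scanner, for illustration only.
public static class HeaderLineScanner
{
    public enum LineScanResult { NeedMoreData, LineFound, Rejected }

    public static LineScanResult ScanLine(ReadOnlySpan<byte> buffer, out int lineLength)
    {
        lineLength = 0;

        // One scan that stops at the first \r, \n or \0, whichever comes first.
        int index = buffer.IndexOfAny((byte)'\r', (byte)'\n', (byte)0);
        if (index < 0)
        {
            return LineScanResult.NeedMoreData;
        }

        // A \0 before the line terminator means the line itself contains \0.
        if (buffer[index] == 0)
        {
            return LineScanResult.Rejected;
        }

        lineLength = index;
        return LineScanResult.LineFound;
    }
}
```

Once a line has passed this scan, its bytes contain no \0, so a later conversion over the full [0x00, 0x80) range cannot let one through.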

@tannergooding removed the untriaged label Mar 23, 2023
@tannergooding
Member

Assigning to @GrabYourPitchforks for now until the necessary input can be given.

@GrabYourPitchforks
Member

I am strongly against this proposal. ASCII is defined as characters in the range 0x00 .. 0x7F, inclusive. Sometimes a protocol will exclude certain characters (0x00, or the entire control character range 0x00 .. 0x1F and 0x7F), but at that point you're making something tied to a particular protocol rather than something that is a general-purpose ASCII API. It's similar to the reason we don't support WTF-8 within any of our UTF-8 APIs: certain protocols may utilize it, but it doesn't belong in a general-purpose UTF-8 processing API.

Since this is protocol-specific for aspnet, I recommend the code remain in that project.

@GrabYourPitchforks
Member

> It would be more interesting to see real-usage benchmarks, like how that would impact Techempower, etc.

@sebastienros Is there a way to measure real-world impact for this API? I've laid out above my arguments against doing this - namely, that protocol-level concerns don't belong in a general-purpose API. But if there is strongly compelling evidence that this is a real perf bottleneck and the runtime layer is the only layer that can provide this functionality properly, that should be weighed in favor of this API, even against my concerns.

@sebastienros
Member

I will need to check what can be impacted in ASP.NET and see what benchmarks would exercise this code path. If someone knows which scenarios are useful here then I can start it.

@stephentoub
Member

stephentoub commented Mar 23, 2023

Do we know why ASP.NET special-cases \0 here? What happens if we just stop doing that? If ASP.NET needs to do that, is it likely that others will similarly need to special-case certain values?

I'd really like to be able to encapsulate this in a core library provided helper, for ASP.NET to use and for others to use. Vectorizing such a thing is very non-trivial. Is there a shape of an API we could come up with that would enable efficiently doing this, e.g. a default overload that is for [0, 127] but another overload that lets you opt-out one or more values or ranges, or some such thing?

Note that one of the primary uses of IndexOfAnyValues is for use in protocols, where protocols need to search for or exempt certain things. Could/should we incorporate that somehow?
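
As a sketch of what combining the two could look like with APIs that exist today: `IndexOfAnyValues` shipped in .NET 8 under the name `SearchValues<T>`, and the snippet below uses it to express "ASCII except \0" as a protocol-defined set. The class and method names are hypothetical, and this is still two passes over the input, not the single-pass overload being asked about.

```csharp
using System;
using System.Buffers;
using System.Text;

// Hypothetical helper: validation set defined by the protocol, widening done by Ascii.ToUtf16.
public static class ProtocolAscii
{
    // Allowed bytes: 0x01 .. 0x7F (all of ASCII except \0).
    private static readonly SearchValues<byte> s_allowed = SearchValues.Create(CreateAllowed());

    private static byte[] CreateAllowed()
    {
        var allowed = new byte[0x7F];
        for (int i = 0; i < allowed.Length; i++)
        {
            allowed[i] = (byte)(i + 1);
        }
        return allowed;
    }

    public static bool TryWiden(ReadOnlySpan<byte> source, Span<char> destination, out int charsWritten)
    {
        charsWritten = 0;

        // Pass 1: reject any byte outside the protocol's allowed set.
        if (source.IndexOfAnyExcept(s_allowed) >= 0)
        {
            return false;
        }

        // Pass 2: widen with the general-purpose API.
        return Ascii.ToUtf16(source, destination, out charsWritten) == OperationStatus.Done;
    }
}
```

Excluding further values or ranges only changes the contents of the allowed set, not the calling code.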

@benaadams
Member

> Do we know why ASP.NET special-cases \0 here?

Spec-wise it shouldn't. However, if it's a front-end server that passes requests to another server, and that second server uses null-terminated strings, then the request can change in the second layer and access URLs that weren't mapped to the internet, which could be a security risk (along the lines of https://en.wikipedia.org/wiki/HTTP_request_smuggling, though different).
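
A small simulation of that mismatch, under the assumption that the second server reads the value as a C-style string. The helper name and the example path are made up for illustration and do not describe a specific known attack.

```csharp
using System;
using System.Text;

// Hypothetical demo: how a null-terminated consumer sees bytes that contain \0.
public static class NullTerminationDemo
{
    // Mimics a C-style reader: everything from the first \0 onward is ignored.
    public static string ReadAsNullTerminated(ReadOnlySpan<byte> bytes)
    {
        int end = bytes.IndexOf((byte)0);
        if (end >= 0)
        {
            bytes = bytes.Slice(0, end);
        }
        return Encoding.ASCII.GetString(bytes);
    }
}
```

For the bytes of `"/admin\0.png"`, a length-prefixed reader sees a 11-character path ending in `.png`, while `ReadAsNullTerminated` returns `"/admin"`. The two layers would then apply their routing or access rules to different paths.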

@benaadams
Member

benaadams commented Mar 23, 2023

> Note that one of the primary uses of IndexOfAnyValues is for use in protocols, where protocols need to search for or exempt certain things. Could/should we incorporate that somehow?

As @svick says, it just needs to check for 3 rather than 2, then throw if it's \0; there is already an API for it:

> For example, I think that when parsing HTTP 1 headers, you could look for \r, \n or \0 as the first step, instead of just \r and \n, and deal with \0 at that point. That would then mean you could safely use the current version of Ascii.ToUtf16 to convert the header bytes to UTF-16.

@stephentoub
Member

> As @svick says, it just needs to check for 3 rather than 2, then throw if it's \0; there is already an API for it

I hadn't seen @svick's comment:

> For example, I think that when parsing HTTP 1 headers, you could look for \r, \n or \0 as the first step, instead of just \r and \n, and deal with \0 at that point. That would then mean you could safely use the current version of Ascii.ToUtf16 to convert the header bytes to UTF-16.

Is that viable? Can all of the places that call into this shared routine be updated trivially to ensure the data passed in doesn't contain a \0? If so, let's do that, add the Ascii.ToUtf16 that's for the whole [0, 127] range, update ASP.NET to use that, and call it a good day.

@stephentoub
Member

@BrennanConroy, do you have a suggestion for how we could make forward progress on this?

@BrennanConroy
Member

BrennanConroy commented Apr 29, 2023

> > For example, I think that when parsing HTTP 1 headers, you could look for \r, \n or \0 as the first step, instead of just \r and \n, and deal with \0 at that point. That would then mean you could safely use the current version of Ascii.ToUtf16 to convert the header bytes to UTF-16.
>
> Is that viable? Can all of the places that call into this shared routine be updated trivially to ensure the data passed in doesn't contain a \0? If so, let's do that, add the Ascii.ToUtf16 that's for the whole [0, 127] range, update ASP.NET to use that, and call it a good day.

Looks like it would be "easy" to do this. We could update https://github.com/dotnet/aspnetcore/blob/f62f12357c49c4f1cca502e8f4cf57353f0b320f/src/Servers/Kestrel/Core/src/Internal/Http/HttpParser.cs#LL64C37-L64C37 to instead be:

private static ReadOnlySpan<byte> Delimiters => new byte[] { ByteLF, 0 };

if (reader.TryReadToAny(out ReadOnlySpan<byte> requestLine, Delimiters, advancePastDelimiter: true))
{
    if (requestLine.Length == 0 || (reader.TryPeek(out var next) && next == 0))
    {
        RejectRequestLine(requestLine);
    }
    ParseRequestLine(handler, requestLine);
    return true;
}

This is where we start parsing the HTTP/1 request so we're already going through the entire request line looking for \n, this just updates to TryReadToAny to search for \0 at the same time, which I hope is optimized for single pass 😃.

The concerns are:

  1. We're removing the \0 check from the GetAsciiStringNonNullCharacters method which means new callers could accidentally allow null
  2. The errors returned if \0 is found are lacking detail (although a lot faster 😆)
  3. There is some HTTP/2 and HTTP/3 code calling GetAsciiStringNonNullCharacters although it looked like they were supposed to already be null character checking before calling this method
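
For anyone who wants to experiment with the `TryReadToAny` idea outside Kestrel, here is a standalone sketch. It is not the Kestrel change itself: the type names and the status enum are hypothetical, and it reads the delimiter explicitly (`advancePastDelimiter: false`) so it can tell which of the two delimiters ended the scan.

```csharp
using System;
using System.Buffers;

// Hypothetical standalone reader, for illustration only.
public static class RequestLineReader
{
    public enum RequestLineStatus { Incomplete, Complete, Rejected }

    private const byte ByteLF = (byte)'\n';
    private static ReadOnlySpan<byte> Delimiters => new byte[] { ByteLF, 0 };

    public static RequestLineStatus TryReadRequestLine(ReadOnlySequence<byte> buffer, out int lineLength)
    {
        lineLength = 0;
        var reader = new SequenceReader<byte>(buffer);

        // Single search for either \n or \0; the reader stops in front of the delimiter.
        if (!reader.TryReadToAny(out ReadOnlySpan<byte> requestLine, Delimiters, advancePastDelimiter: false))
        {
            return RequestLineStatus.Incomplete;
        }

        // Consume the delimiter and check which one was hit.
        reader.TryRead(out byte delimiter);
        if (delimiter == 0 || requestLine.Length == 0)
        {
            return RequestLineStatus.Rejected;
        }

        lineLength = requestLine.Length;
        return RequestLineStatus.Complete;
    }
}
```

Because the search stops at the first match, a \0 is only detected when it appears before the line feed, which is the part of the input that would later be converted.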

@stephentoub
Member

> Looks like it would be "easy" to do this

Thanks, I sketched it out in:
dotnet/aspnetcore@main...stephentoub:aspnetcore:asciitoutf16
Not exactly what you suggested, but similar.

A bunch of tests failed, and I haven't gone through to see which would be expected (the tests have internals access) and which would be real problems.

Any interest in picking it up and seeing how far we can run with it?

@BrennanConroy
Member

Yeah, I'll pick it up and see what the team thinks

@BrennanConroy
Member

It looks like Ascii.ToUtf16 is still slower than the custom code in aspnetcore.
dotnet/aspnetcore#45962 (comment)

#80245 is open which might be indirectly tracking part of the work to improve the performance.
And there is a recent PR that might improve performance #85266.

@stephentoub
Member

> It looks like Ascii.ToUtf16 is still slower than the custom code in aspnetcore.
> dotnet/aspnetcore#45962 (comment)

I'm not aware of any fundamental reason that should be the case. We should fix anything in the core routine that might be contributing those few additional cycles. I'd hope it's not just the difference between returning a bool and returning an OperationStatus.

cc: @adamsitnik, @GrabYourPitchforks

@BrennanConroy
Member

Grabbed the assembly of Ascii.ToUtf16 and Kestrel's TryGetAsciiString

One very obvious difference is that the core processing is not inlined in the Ascii.ToUtf16 case. That's the first thing I would try when comparing perf, but it always takes me a couple hours to figure out how to get a custom runtime again, so I haven't tried yet 😆

But if someone wants to take a look at the assembly in the meantime and see if there is anything obviously worse in the Ascii.ToUtf16 case please do!

Ascii.ToUtf16
; Total bytes of code 171
; Assembly listing for method System.Text.Ascii:ToUtf16(System.ReadOnlySpan`1[ubyte],System.Span`1[ushort],byref):int (Tier1)
; Emitting BLENDED_CODE for X64 with AVX - Windows
; Tier1 code
; optimized code
; optimized using Dynamic PGO
; rsp based frame
; partially interruptible
; with Dynamic PGO: edge weights are valid, and fgCalledCount is 17696
; 0 inlinees with PGO data; 2 single block inlinees; 0 inlinees without PGO data
G_M000_IG01:                ;; offset=0x0000
       push     r15
       push     r14
       push     rdi
       push     rsi
       push     rbp
       push     rbx
       sub      rsp, 56
       xor      eax, eax
       mov      qword ptr [rsp+0x30], rax
       mov      qword ptr [rsp+0x28], rax
       mov      rbx, r8

G_M000_IG02:                ;; offset=0x001B
       mov      rsi, bword ptr [rdx]
       mov      edi, dword ptr [rdx+0x08]
       mov      rbp, bword ptr [rcx]
       mov      ecx, dword ptr [rcx+0x08]
       cmp      ecx, edi
       jg       SHORT G_M000_IG05
       mov      r14d, ecx
       xor      r15d, r15d

G_M000_IG03:                ;; offset=0x0031
       mov      bword ptr [rsp+0x30], rbp
       mov      rcx, rbp
       mov      bword ptr [rsp+0x28], rsi
       mov      rdx, rsi
       mov      r8, r14
       call     [System.Text.Ascii:WidenAsciiToUtf16(ulong,ulong,ulong):ulong]
       mov      dword ptr [rbx], eax
       mov      ecx, 3
       cmp      r14, rax
       mov      eax, ecx
       cmove    eax, r15d

G_M000_IG04:                ;; offset=0x005A
       add      rsp, 56
       pop      rbx
       pop      rbp
       pop      rsi
       pop      rdi
       pop      r14
       pop      r15
       ret

G_M000_IG05:                ;; offset=0x0067
       mov      r14d, edi
       mov      r15d, 1
       jmp      SHORT G_M000_IG03

; Total bytes of code 114

-----------------------------------------------------------------------------------------------------------------------------------------

; Assembly listing for method System.Text.Ascii:WidenAsciiToUtf16(ulong,ulong,ulong):ulong (Tier1)
; Emitting BLENDED_CODE for X64 with AVX - Windows
; Tier1 code
; optimized code
; optimized using Dynamic PGO
; rsp based frame
; fully interruptible
; with Dynamic PGO: edge weights are valid, and fgCalledCount is 12930
; 0 inlinees with PGO data; 6 single block inlinees; 2 inlinees without PGO data
G_M000_IG01:                ;; offset=0x0000
       vzeroupper

G_M000_IG02:                ;; offset=0x0003
       xor      eax, eax
       cmp      r8, 16
       jb       SHORT G_M000_IG04
       mov      r10, rdx
       cmp      r8, 32
       jb       G_M000_IG11
       lea      r9, [r8-0x20]

G_M000_IG03:                ;; offset=0x001C
       vmovups  ymm0, ymmword ptr [rcx+rax]
       vpmovmskb r11d, ymm0
       test     r11d, r11d
       jne      SHORT G_M000_IG04
       vmovaps  ymm1, ymm0
       vpmovzxbw ymm1, ymm1
       vextracti128 xmm0, ymm0, 1
       vpmovzxbw ymm0, ymm0
       vmovups  ymmword ptr [r10], ymm1
       vmovups  ymmword ptr [r10+0x20], ymm0
       add      rax, 32
       add      r10, 64
       cmp      rax, r9
       jbe      SHORT G_M000_IG03

G_M000_IG04:                ;; offset=0x0056
       sub      r8, rax
       cmp      r8, 4
       jb       SHORT G_M000_IG07

G_M000_IG05:                ;; offset=0x005F
       lea      r10, [rax+r8-0x04]
       align    [0 bytes for IG06]

G_M000_IG06:                ;; offset=0x0064
       mov      r9d, dword ptr [rcx+rax]
       test     r9d, 0xFFFFFFFF80808080
       jne      SHORT G_M000_IG10
       vmovd    xmm0, r9
       vpmovzxbw xmm0, xmm0
       vmovd    qword ptr [rdx+2*rax], xmm0
       add      rax, 4
       cmp      rax, r10
       jbe      SHORT G_M000_IG06

G_M000_IG07:                ;; offset=0x008A
       test     r8b, 2
       jne      SHORT G_M000_IG13
       test     r8b, 1
       jne      G_M000_IG14

G_M000_IG08:                ;; offset=0x009A
       vzeroupper
       ret

G_M000_IG09:                ;; offset=0x009E
       movzx    rcx, r9b
       mov      word  ptr [rdx+2*rax], cx
       inc      rax
       shr      r9d, 8

G_M000_IG10:                ;; offset=0x00AD
       movzx    rcx, r9b
       test     cl, 128
       je       SHORT G_M000_IG09
       jmp      SHORT G_M000_IG08

G_M000_IG11:                ;; offset=0x00B8
       lea      r9, [r8-0x10]

G_M000_IG12:                ;; offset=0x00BC
       vmovups  xmm0, xmmword ptr [rcx+rax]
       vptest   xmm0, xmmword ptr [reloc @RWD00]
       jne      SHORT G_M000_IG04
       vpmovzxbw xmm1, xmm0
       vpsrldq  xmm0, xmm0, 8
       vpmovzxbw xmm0, xmm0
       vmovups  xmmword ptr [r10], xmm1
       vmovups  xmmword ptr [r10+0x10], xmm0
       add      rax, 16
       add      r10, 32
       cmp      rax, r9
       jbe      SHORT G_M000_IG12
       jmp      G_M000_IG04

G_M000_IG13:                ;; offset=0x00F8
       movzx    r9, word  ptr [rcx+rax]
       test     r9d, 0xFFFFFFFF80808080
       jne      SHORT G_M000_IG10
       movzx    r10, r9b
       mov      word  ptr [rdx+2*rax], r10w
       shr      r9d, 8
       mov      word  ptr [rdx+2*rax+0x02], r9w
       add      rax, 2
       test     r8b, 1
       je       G_M000_IG08

G_M000_IG14:                ;; offset=0x0127
       movzx    r9, byte  ptr [rcx+rax]
       test     r9b, 128
       jne      G_M000_IG08
       mov      word  ptr [rdx+2*rax], r9w
       inc      rax
       jmp      G_M000_IG08

RWD00   dq      8080808080808080h, 8080808080808080h
; Total bytes of code 323

TryGetAsciiString
; Assembly listing for method StringUtilities:TryGetAsciiString(ulong,ulong,int):bool (Tier1)
; Emitting BLENDED_CODE for X64 with AVX - Windows
; Tier1 code
; optimized code
; rsp based frame
; fully interruptible
; No PGO data
; 0 inlinees with PGO data; 8 single block inlinees; 2 inlinees without PGO data
G_M000_IG01:                ;; offset=0x0000
       vzeroupper

G_M000_IG02:                ;; offset=0x0003
       movsxd   rax, r8d
       add      rax, rcx
       lea      r8, [rax-0x20]
       cmp      rcx, r8
       ja       SHORT G_M000_IG05
       align    [0 bytes for IG03]

G_M000_IG03:                ;; offset=0x0012
       vmovups  ymm0, ymmword ptr [rcx]
       vxorps   ymm1, ymm1, ymm1
       vpcmpgtb ymm1, ymm0, ymm1
       vpmovmskb r10d, ymm1
       cmp      r10d, -1
       jne      G_M000_IG15
       vxorps   ymm1, ymm1, ymm1
       vpunpcklbw ymm1, ymm0, ymm1
       vxorps   ymm2, ymm2, ymm2
       vpunpckhbw ymm0, ymm0, ymm2
       vperm2i128 ymm2, ymm1, ymm0, 32
       vperm2i128 ymm0, ymm1, ymm0, 49
       vmovups  ymmword ptr [rdx], ymm2
       vmovups  ymmword ptr [rdx+0x20], ymm0
       add      rcx, 32
       add      rdx, 64
       cmp      rcx, r8
       jbe      SHORT G_M000_IG03

G_M000_IG04:                ;; offset=0x005E
       cmp      rcx, rax
       je       G_M000_IG13

G_M000_IG05:                ;; offset=0x0067
       lea      r8, [rax-0x10]
       cmp      rcx, r8
       ja       SHORT G_M000_IG08
       align    [0 bytes for IG06]

G_M000_IG06:                ;; offset=0x0070
       vmovups  xmm0, xmmword ptr [rcx]
       vxorps   xmm1, xmm1, xmm1
       vpcmpgtb xmm1, xmm0, xmm1
       vpmovmskb r10d, xmm1
       cmp      r10d, 0xFFFF
       jne      G_M000_IG15
       vxorps   xmm1, xmm1, xmm1
       vpunpcklbw xmm1, xmm0, xmm1
       vxorps   xmm2, xmm2, xmm2
       vpunpckhbw xmm0, xmm0, xmm2
       vmovups  xmmword ptr [rdx], xmm1
       vmovups  xmmword ptr [rdx+0x10], xmm0
       add      rcx, 16
       add      rdx, 32
       cmp      rcx, r8
       jbe      SHORT G_M000_IG06

G_M000_IG07:                ;; offset=0x00B3
       cmp      rcx, rax
       je       G_M000_IG13

G_M000_IG08:                ;; offset=0x00BC
       lea      r8, [rax-0x08]
       cmp      rcx, r8
       ja       SHORT G_M000_IG10
       align    [0 bytes for IG09]

G_M000_IG09:                ;; offset=0x00C5
       mov      r10, qword ptr [rcx]
       mov      r9, 0xFEFEFEFEFEFEFEFF
       add      r9, r10
       or       r9, r10
       mov      r11, 0x8080808080808080
       test     r9, r11
       jne      G_M000_IG15
       vmovd    xmm0, r10
       vxorps   xmm1, xmm1, xmm1
       vpunpcklbw xmm0, xmm0, xmm1
       vmovups  xmmword ptr [rdx], xmm0
       add      rcx, 8
       add      rdx, 16
       cmp      rcx, r8
       jbe      SHORT G_M000_IG09

G_M000_IG10:                ;; offset=0x0109
       lea      r8, [rax-0x04]
       cmp      rcx, r8
       ja       SHORT G_M000_IG11
       mov      r8d, dword ptr [rcx]
       lea      r10d, [r8+0xFEFEFEFF]
       or       r10d, r8d
       test     r10d, 0xFFFFFFFF80808080
       jne      SHORT G_M000_IG15
       vmovd    xmm0, r8
       vxorps   xmm1, xmm1, xmm1
       vpunpcklbw xmm0, xmm0, xmm1
       vmovd    qword ptr [rdx], xmm0
       add      rcx, 4
       add      rdx, 8

G_M000_IG11:                ;; offset=0x0142
       lea      r8, [rax-0x02]
       cmp      rcx, r8
       ja       SHORT G_M000_IG12
       movsx    r8, word  ptr [rcx]
       lea      r10d, [r8-0x101]
       movsx    r10, r10w
       or       r8d, r10d
       test     r8d, -0x7F80
       jne      SHORT G_M000_IG15
       movzx    r8, byte  ptr [rcx]
       mov      word  ptr [rdx], r8w
       movzx    r8, byte  ptr [rcx+0x01]
       mov      word  ptr [rdx+0x02], r8w
       add      rcx, 2
       add      rdx, 4

G_M000_IG12:                ;; offset=0x0180
       cmp      rcx, rax
       jae      SHORT G_M000_IG13
       cmp      byte  ptr [rcx], 0
       jle      SHORT G_M000_IG15
       movzx    rax, byte  ptr [rcx]
       mov      word  ptr [rdx], ax

G_M000_IG13:                ;; offset=0x0190
       mov      eax, 1

G_M000_IG14:                ;; offset=0x0195
       vzeroupper
       ret

G_M000_IG15:                ;; offset=0x0199
       xor      eax, eax

G_M000_IG16:                ;; offset=0x019B
       vzeroupper
       ret

; Total bytes of code 415

@GrabYourPitchforks
Member

Below are the results I'm getting on my machine. This seems very much within the range of noise.

@BrennanConroy Are you seeing different results than below?


BenchmarkDotNet v0.13.8, Windows 11 (10.0.22621.2283/22H2/2022Update/SunValley2) (Hyper-V)
Intel Core i9-10900K CPU 3.70GHz, 1 CPU, 20 logical and 10 physical cores
.NET SDK 8.0.100-rc.1.23455.8
  [Host]   : .NET 8.0.0 (8.0.23.41904), X64 RyuJIT AVX2
  .NET 8.0 : .NET 8.0.0 (8.0.23.41904), X64 RyuJIT AVX2

Job=.NET 8.0  Runtime=.NET 8.0  

| Method                      | StringLength | Mean      | Error     | StdDev    | Ratio |
|---------------------------- |-------------:|----------:|----------:|----------:|------:|
| Ascii_ToUtf16               | 4            | 4.763 ns  | 0.0336 ns | 0.0314 ns | 1.00  |
| StringUtilities_TryGetAscii | 4            | 4.123 ns  | 0.0335 ns | 0.0313 ns | 0.87  |
| Ascii_ToUtf16               | 8            | 4.968 ns  | 0.0201 ns | 0.0168 ns | 1.00  |
| StringUtilities_TryGetAscii | 8            | 4.714 ns  | 0.0307 ns | 0.0287 ns | 0.95  |
| Ascii_ToUtf16               | 16           | 5.159 ns  | 0.0405 ns | 0.0359 ns | 1.00  |
| StringUtilities_TryGetAscii | 16           | 3.625 ns  | 0.0387 ns | 0.0362 ns | 0.70  |
| Ascii_ToUtf16               | 24           | 6.823 ns  | 0.0681 ns | 0.0637 ns | 1.00  |
| StringUtilities_TryGetAscii | 24           | 5.212 ns  | 0.0336 ns | 0.0298 ns | 0.76  |
| Ascii_ToUtf16               | 128          | 8.177 ns  | 0.0519 ns | 0.0485 ns | 1.00  |
| StringUtilities_TryGetAscii | 128          | 9.791 ns  | 0.0186 ns | 0.0155 ns | 1.20  |
| Ascii_ToUtf16               | 256          | 10.232 ns | 0.0517 ns | 0.0458 ns | 1.00  |
| StringUtilities_TryGetAscii | 256          | 11.014 ns | 0.0491 ns | 0.0459 ns | 1.08  |
| Ascii_ToUtf16               | 1024         | 28.757 ns | 0.1900 ns | 0.1777 ns | 1.00  |
| StringUtilities_TryGetAscii | 1024         | 33.437 ns | 0.2068 ns | 0.1727 ns | 1.16  |

@stephentoub
Member

@BrennanConroy, can you comment on the above?

@stephentoub
Member

Closing given dotnet/aspnetcore#56578

@stephentoub closed this as not planned Jul 19, 2024
@github-actions bot locked and limited conversation to collaborators Aug 19, 2024