Avoid MemoryMarshal.Cast when transcoding from UTF-16 to UTF-8 while escaping in Utf8JsonWriter. #40996

ahsonkhan · 2019-09-10T21:29:13Z

Fixes https://github.com/dotnet/corefx/issues/40979 in master.

This is meant to be a targeted fix to be ported to 3.0.

cc @steveharter, @GrabYourPitchforks, @pranavkm, @ericstj


steveharter · 2019-09-10T21:35:25Z

src/System.Text.Json/src/System/Text/Json/Writer/JsonWriterHelper.Escaping.cs

+                fixed (char* ptr = value)
+                {
+                    idx = encoder.FindFirstCharacterToEncode(ptr, value.Length);
+                }
                goto Return;


For v5 we may want to add an overload of FindFirstCharacterToEncode(ReadOnlySpan<char>) to S.T.Encodings.Web so consumers don't have to use unsafe pinning code.

GrabYourPitchforks

Need to address the null check, otherwise LGTM.

GrabYourPitchforks · 2019-09-10T21:41:18Z

src/System.Text.Json/src/System/Text/Json/Writer/JsonWriterHelper.Escaping.cs

-                idx = encoder.FindFirstCharacterToEncodeUtf8(MemoryMarshal.Cast<char, byte>(value));
+                fixed (char* ptr = value)
+                {
+                    idx = encoder.FindFirstCharacterToEncode(ptr, value.Length);


Some implementations of the FindFirstCharacterToEncode method may not accept null pointers. We should special-case value.IsEmpty at the beginning of this method and bail early.

Will fix this and add a buggy JavaScriptEncoder implementation as a test.
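For reference, a minimal sketch of the early bail-out discussed in this thread. The class and method names are hypothetical and this is not the code merged in the PR; it assumes the project is compiled with unsafe blocks enabled and relies on FindFirstCharacterToEncode returning -1 when nothing needs escaping.

```csharp
using System;
using System.Text.Encodings.Web;

// Hypothetical helper for illustration only; not the code merged in this PR.
internal static class EscapingSketch
{
    public static unsafe int IndexOfFirstCharToEscape(ReadOnlySpan<char> value, JavaScriptEncoder encoder)
    {
        // Pinning an empty span can yield a null pointer, which a custom
        // encoder override may not expect, so return before pinning.
        if (value.IsEmpty)
        {
            return -1; // nothing to escape
        }

        fixed (char* ptr = value)
        {
            // Operates directly on UTF-16 chars; no reinterpretation as bytes.
            return encoder.FindFirstCharacterToEncode(ptr, value.Length);
        }
    }
}
```

With this shape, the encoder is never handed a null pointer, regardless of how a derived JavaScriptEncoder implements the method.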

GrabYourPitchforks · 2019-09-10T21:52:44Z

src/System.Text.Json/tests/Utf8JsonWriterTests.cs

+
+            using (var writer = new Utf8JsonWriter(output))
+            {
+                writer.WriteStringValue("\u6D4B\u8A6611");


U+6D4B and U+8A66 might be allowed through unescaped in a future version, which could break this unit test. If you're looking for something stable that's highly unlikely to ever be allowed through unescaped, consider something from the range U+E000..U+F8FF (inclusive). That block is permanently reserved for private use and I highly doubt even the "relaxed" escaper will ever allow those to pass through unescaped.

If the test doesn't rely on this being output escaped or unescaped, you're good to go. :)

might be allowed through unescaped in a future version

We have a few writer tests that ensure the default behavior escapes certain characters. If we change that behavior, we would have to/should change those tests as well, so I would prefer the tests break at that time.
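As a rough illustration of the private-use suggestion above, the sketch below writes a string through a default-configured Utf8JsonWriter and returns the output as text. The helper name is hypothetical, and it is an assumption (not something this PR states) that a test would call it with a code point such as U+E000 and check that the raw character does not appear in the output.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;

// Hypothetical helper for illustration only; not part of the PR's test suite.
internal static class PrivateUseEscapingSketch
{
    // Writes one JSON string value with the default writer options and
    // returns the UTF-8 output decoded back into a string.
    public static string WriteWithDefaultEncoder(string value)
    {
        using (var output = new MemoryStream())
        {
            using (var writer = new Utf8JsonWriter(output))
            {
                writer.WriteStringValue(value);
                writer.Flush(); // make sure the bytes reach the stream
            }

            return Encoding.UTF8.GetString(output.ToArray());
        }
    }
}
```

A test built on this could call WriteWithDefaultEncoder("\uE00011") and assert that the result does not contain the raw U+E000 character, which stays valid only as long as the default encoder keeps escaping the private-use block.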


ahsonkhan · 2019-09-11T00:32:09Z

MacOS Build x64_Debug test failures are unrelated (same as #40997 (comment)):
https://dev.azure.com/dnceng/public/_build/results?buildId=348666&view=ms.vss-test-web.build-test-results-tab

System.Security.Cryptography.OpenSsl.Tests on netcoreapp-OSX-Debug-x64-OSX.1014.Amd64.Open

System.Security.Cryptography.OpenSsl.Tests Total: 649, Errors: 0, Failed: 565, Skipped: 14, Time: 1.072s

https://helix.dot.net/api/2019-06-17/jobs/0dbc5fbf-e89f-423c-953b-735db1747ed7/workitems/System.Security.Cryptography.OpenSsl.Tests/console

…escaping in Utf8JsonWriter. (dotnet/corefx#40996)

* Avoid MemoryMarshal.Cast when transcoding from UTF-16 to UTF-8 while escaping in Utf8JsonWriter.
* Fix white space typo in the test expected string.
* Guard against empty spans where an implementation of JavascriptEncoder might not handle null ptrs correctly.
* Cleanup tests to avoid some duplication.
* Some more test clean up.

Commit migrated from dotnet/corefx@ee9995f

Avoid MemoryMarshal.Cast when transcoding from UTF-16 to UTF-8 while escaping in Utf8JsonWriter.

ae0e491

ahsonkhan added the area-System.Text.Json label Sep 10, 2019

ahsonkhan added this to the 5.0 milestone Sep 10, 2019

ahsonkhan mentioned this pull request Sep 10, 2019

[release/3.0] Avoid MemoryMarshal.Cast when transcoding from UTF-16 to UTF-8 while escaping in Utf8JsonWriter. #40997

Merged

steveharter reviewed Sep 10, 2019


steveharter approved these changes Sep 10, 2019


Fix white space typo in the test expected string.

ca565b1

GrabYourPitchforks suggested changes Sep 10, 2019


ahsonkhan added 3 commits September 10, 2019 16:44

Guard against empty spans where an implementation of JavascriptEncoder might not handle null ptrs correctly.

13375f7

Cleanup tests to avoid some duplication.

59a5953

Some more test clean up.

0e5f47a

GrabYourPitchforks approved these changes Sep 10, 2019


ahsonkhan mentioned this pull request Sep 11, 2019

Add Encoder option to writer and serializer #39524

Merged

ahsonkhan merged commit ee9995f into dotnet:master Sep 11, 2019

ahsonkhan deleted the FixEscapingStringsInWriter branch September 11, 2019 00:49
