Bug: renderToPipeableStream() emits mysterious mojibake whitespace chars in the result
#24985
Comments
I confirmed:
So I guess a regression happened somewhere in this range.
This bug may be related to #24592
Thanks for the clear repro case!
We encode strings 2048 UTF-8 bytes at a time. If the string we are encoding crosses to the next chunk but the current chunk doesn't fit an integral number of characters, we need to make sure not to send the whole buffer, only the bytes that are actually meaningful. Fixes facebook#24985. I was able to verify that this fixes the repro shared in the issue (be careful when testing because the null bytes do not show when printed to my terminal, at least). However, I don't see a clear way to add a test for this that will be resilient to small changes in how we encode the markup (since it depends on where specific multibyte characters fall against the 2048-byte boundaries).
(I am guessing #24592 is not related, since this issue seems to be with non-ASCII characters and I don't believe that one is.)
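To make the chunking behaviour described above concrete, here is a minimal sketch of encoding a string into fixed-size views and forwarding only the bytes that were actually written. It is not React's implementation: `encodeInChunks` and its `viewSize` parameter are hypothetical names used for illustration, and only the standard `TextEncoder.encodeInto` call is assumed.

```typescript
// Hypothetical helper for illustration only; not React's actual code.
const VIEW_SIZE = 2048; // chunk size mentioned in the fix description

function encodeInChunks(text: string, viewSize: number = VIEW_SIZE): Uint8Array[] {
  const encoder = new TextEncoder();
  const chunks: Uint8Array[] = [];
  let rest = text;
  while (rest.length > 0) {
    // A fresh view is zero-filled, so any unused tail consists of null bytes.
    const view = new Uint8Array(viewSize);
    // encodeInto never writes a partial character: `read` counts the UTF-16
    // code units consumed and `written` counts the bytes stored in the view.
    const { read, written } = encoder.encodeInto(rest, view);
    if (read === 0) {
      throw new Error("view is too small to hold a single character");
    }
    // Forward only the meaningful bytes, not the whole view.
    chunks.push(view.subarray(0, written));
    rest = rest.slice(read);
  }
  return chunks;
}
```

With an 8-byte view and a string of 3-byte characters, each full view carries 6 bytes. Pushing `view` instead of `view.subarray(0, written)` would leak the 2 unused null bytes into the output, which matches the symptom reported in this issue.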
@sophiebits Thank you!
For those who have this problem, here is some quick code:

    import { Transform } from "node:stream";

    function handleBrowserRequest(
      request: Request,
      responseStatusCode: number,
      responseHeaders: Headers,
      remixContext: EntryContext,
    ) {
      return new Promise((resolve, reject) => {
        let shellRendered = false;
        const { pipe, abort } = renderToPipeableStream(
          <RemixServer
            context={remixContext}
            url={request.url}
            abortDelay={ABORT_DELAY}
          />,
          {
            onShellReady() {
              shellRendered = true;
              /**
               * https://github.com/facebook/react/pull/26228
               * Null bytes are possible at the end of a chunk if we are using non-ASCII characters.
               * The fix is not released in 18.2.0, so all we can do is transform the chunks.
               */
              const body = new Transform({
                transform(chunk: Buffer, encoding, callback) {
                  // Count trailing null bytes. A UTF-8 character is at most
                  // 4 bytes long, so at most 3 bytes can be left unfilled.
                  let endingZeroes = 0;
                  while (
                    endingZeroes < 3 &&
                    chunk.at(-1 - endingZeroes) === 0
                  ) {
                    endingZeroes++;
                  }
                  callback(
                    null,
                    endingZeroes > 0
                      ? chunk.subarray(0, chunk.length - endingZeroes)
                      : chunk,
                  );
                },
              });
              responseHeaders.set("Content-Type", "text/html");
              resolve(
                new Response(createReadableStreamFromReadable(body), {
                  headers: responseHeaders,
                  status: responseStatusCode,
                }),
              );
              pipe(body);
            },
            onShellError(error: unknown) {
              reject(error);
            },
            onError(error: unknown) {
              responseStatusCode = 500;
              // Log streaming rendering errors from inside the shell. Don't log
              // errors encountered during initial shell rendering since they'll
              // reject and get logged in handleDocumentRequest.
              if (shellRendered) {
                console.error(error);
              }
            },
          },
        );
        setTimeout(abort, ABORT_DELAY);
      });
    }

Details: To fix the bug without hurting performance, we only need to check whether the last three bytes of each chunk are zeroes. A UTF-8 character is at most 4 bytes long, and the bug is only triggered when the encoder fills a fixed-length array that cannot fit a whole multibyte character at the end. For example, when encoding into an 8-byte array,
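As a small worked illustration of the fixed-length array case, the sketch below encodes three 3-byte characters into an 8-byte array and then strips the unused tail. `stripTrailingNulls` and `maxStrip` are hypothetical names. The default limit of 3 is an assumption based on UTF-8 characters being at most 4 bytes long, so a character that does not fit can leave at most 3 bytes unfilled.

```typescript
// Hypothetical helper: drop up to `maxStrip` trailing null bytes from a chunk.
function stripTrailingNulls(chunk: Uint8Array, maxStrip: number = 3): Uint8Array {
  let end = chunk.length;
  // Walk backwards over the tail, but never further than maxStrip bytes.
  while (end > 0 && chunk.length - end < maxStrip && chunk[end - 1] === 0) {
    end--;
  }
  // Return the original chunk untouched when nothing had to be removed.
  return end === chunk.length ? chunk : chunk.subarray(0, end);
}

// Encode three 3-byte characters (9 bytes in total) into an 8-byte array.
const view = new Uint8Array(8);
const { written } = new TextEncoder().encodeInto("あああ", view);
// Only two whole characters fit, so `written` is 6 and view[6], view[7] stay 0.
const cleaned = stripTrailingNulls(view);
console.log(written, cleaned.length);
```

Checking only a few bytes at the tail keeps the per-chunk cost constant, which is the point of the workaround. It assumes the markup itself never legitimately ends in a null byte.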
We have to wait for v19 or apply a patch. pnpm example: https://github.com/pnpm/pnpm.io/pull/588/files
renderToPipeableStream() emits mojibake whitespace chars in its result. This sometimes breaks the final generated HTML.
React version: 18.2.0
Steps To Reproduce
Run this test case with jest.
The current behavior
The above test case generates this result:
The expected behavior
The current behavior's result contains その上今まで��の所とは違って無暗に明るい.
But we expect this sentence to be その上今までの所とは違って無暗に明るい.
So renderToPipeableStream() should not emit mysterious whitespace chars.
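Because the stray bytes may not be visible when the output is printed to a terminal, it can help to look for them explicitly when comparing the current and expected results. The helpers below are a hypothetical sketch (`findNullOffsets` and `showNulls` are illustrative names, not part of React or jest).

```typescript
// Return the byte offsets of every null byte in an encoded chunk.
function findNullOffsets(bytes: Uint8Array): number[] {
  const offsets: number[] = [];
  bytes.forEach((byte, index) => {
    if (byte === 0) offsets.push(index);
  });
  return offsets;
}

// Make null characters in a decoded string visible as a literal "\0".
function showNulls(text: string): string {
  return text.replace(/\u0000/g, "\\0");
}
```

A test could then assert that `findNullOffsets` returns an empty array for the rendered output, which does not depend on where the characters fall relative to chunk boundaries.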