Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix an un-normalized encoding… #9

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

savetheclocktower
Copy link

…saving the C++ code from receiving a Node-style encoding name that it doesn’t recognize.


Our TextBuffer object acts as a bridge between TextEditor and lower-level APIs like pathwatcher and superstring. And superstring, in turn, presents a JavaScript API but implements lots of its internals in C++ for speed.

But that’s where a bug can slip in, because JavaScript and C++ use different conventional names for common text encodings… including UTF-8, the commonest of all (in our particular domain). Node likes to call it utf8, but the superstring C++ code insists upon the more formal UTF-8. There’s a utility function called normalizeEncoding within the superstring JavaScript that converts Node-style encoding names to C++-style encoding names.

I’m not going to be able to demonstrate exactly how this bug happens, but take a look at superstring/index.js and search for the string encoding. You’ll see that, aside from the aforementioned normalizeEncoding utility, three different methods take an encoding parameter in some form: TextBuffer#load, TextBuffer#save, and TextBuffer#baseTextMatchesFile.

You might also notice that the first two of those methods pass their encoding parameter through the normalizeEncoding function almost immediately — but baseTextMatchesFile doesn’t. After lots of analysis and reflection, I can’t come up with a single reason why it shouldn’t normalize the encoding; it would seem to save me from this exception I kept getting today while working on a new package:

Screenshot 2024-05-21 at 7 21 06 PM

I got to this method from text-buffer code — specifically subscribeToFile, which my new package apparently called in the course of adding an onDidSave callback to a TextEditor.

It feels like this code path should be hit quite often, and I’m sure I’ve gotten this exact error before, but not for ages; so I definitely don’t have reproduction steps here. But I can assert that hot-fixing this in a debugger (via a conditional breakpoint that simply does encoding = normalizeEncoding(encoding) on the appropriate line) prevents this error, and it makes logical sense why.

…saving the C++ code from receiving a Node-style encoding name that it doesn’t recognize.
@savetheclocktower
Copy link
Author

I am… not surprised that CI feels dusty here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant