Fix error caused by a single Unicode before a surrogate pair #219
This PR fixes an error that occurs when a zero-width joiner is placed before a surrogate pair (related to my comment in #217).
Problem
The current implementation always validates two consecutive Unicode sequences as a surrogate pair.
This causes an error when an independent Unicode sequence is followed by a surrogate pair, even though that input should be valid.
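For illustration only, here is a minimal Python sketch of the intended behavior, not the code changed in this PR. The function name `combine_code_units` and the list-of-integers input are assumptions made for the example. It pairs a unit with its successor only when the unit is a high surrogate, so an independent unit such as a zero-width joiner (U+200D) passes through on its own.

```python
def combine_code_units(units):
    """Combine UTF-16 code units into Unicode code points.

    Hypothetical helper for illustration; `units` is a list of integers
    in the range 0x0000..0xFFFF.
    """
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: a low surrogate must follow immediately.
            if i + 1 >= len(units) or not (0xDC00 <= units[i + 1] <= 0xDFFF):
                raise ValueError(f"unpaired high surrogate at index {i}")
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        elif 0xDC00 <= unit <= 0xDFFF:
            # Low surrogate with no preceding high surrogate.
            raise ValueError(f"unpaired low surrogate at index {i}")
        else:
            # Independent unit: emit it as-is, without looking at the next unit.
            code_points.append(unit)
            i += 1
    return code_points


# A zero-width joiner followed by a surrogate pair is valid input.
print([hex(cp) for cp in combine_code_units([0x200D, 0xD83D, 0xDE00])])
```

In this sketch, a lone high surrogate at the end of the input raises `ValueError` instead of being skipped.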
Caveats
As a side effect of this fix, an unpaired surrogate character at the end of a string now causes an error. The current implementation silently ignores it.