Inconsistent JS/JVM behavior with byte order mark (BOM) #112

Closed
jpd236 opened this issue May 23, 2020 · 1 comment
jpd236 commented May 23, 2020

I'm not sure whether this problem is still relevant, since the function in question appears to have been commented out of the current tree:

https://github.com/Kotlin/kotlinx-io/blame/master/core/commonMain/src/kotlinx/io/text/CharsetEncoder.kt

In case it is still an issue under the covers: with version 0.1.16 of the library, String(<bytes>, charset = Charsets.UTF_8) behaves inconsistently when the bytes begin with a byte order mark (BOM), depending on whether I'm targeting the JVM or JS.

On the JVM, the BOM (0xEF, 0xBB, 0xBF) is decoded to U+FEFF, which becomes the first character of the resulting string.

On JS, the BOM appears to be stripped out.
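
For anyone hitting the same discrepancy, here is a minimal sketch of one way to get the same string on both targets, assuming it is acceptable to drop a leading U+FEFF after decoding. It uses the Kotlin standard library's decodeToString() rather than the kotlinx-io 0.1.16 function described above, and decodeUtf8WithoutBom is a hypothetical helper name, not part of either library.

```kotlin
// U+FEFF is what the UTF-8 BOM bytes (0xEF, 0xBB, 0xBF) decode to.
const val BOM = "\uFEFF"

// Hypothetical helper (not a kotlinx-io API): decode UTF-8 and drop one
// leading BOM character if the platform decoder kept it. If the decoder
// already stripped the BOM, removePrefix leaves the string unchanged.
fun decodeUtf8WithoutBom(bytes: ByteArray): String =
    bytes.decodeToString().removePrefix(BOM)

fun main() {
    // BOM bytes followed by the UTF-8 encoding of "hi".
    val bytes = byteArrayOf(0xEF.toByte(), 0xBB.toByte(), 0xBF.toByte()) +
        "hi".encodeToByteArray()

    // The raw result may or may not start with U+FEFF depending on the decoder.
    val raw = bytes.decodeToString()
    println("raw starts with BOM: ${raw.startsWith(BOM)}")

    // The normalized result is the same either way.
    println(decodeUtf8WithoutBom(bytes))
}
```

Normalizing after decoding keeps the helper independent of whichever decoder the platform uses, at the cost of also dropping a U+FEFF that was intentionally placed at the start of the text.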

fzhinkin (Collaborator) commented

We're rebooting the kotlinx-io development (see #131), all issues related to the previous versions will be closed. Consider reopening it if the issue remains (or the feature is still missing) in a new version.
