Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

n-api: remove napi_get_value_string_length() #12496

Closed
wants to merge 1 commit into from

Conversation

jasongin
Copy link
Member

Fixes: nodejs/abi-stable-node#226

This API doesn't serve much purpose, and is only likely to cause
confusion and bugs. The intention was that this would return the
number of characters in a string, independent of encoding, but
that's not generally useful. In almost all cases, one of the
encoding-specific napi_get_value_string_* APIs is more correct.
(Pass a null buffer if only the encoded length is desired.)

Anyway the current implementation of napi_get_value_string_length()
is technically wrong: it returns the number of 2-byte code units of
the UTF-16 encoding, but there are actually some characters that
are encoded as two UTF-16 code units.

Note the JavaScript String.prototype.length property returns the
number of UTF-16 code units, which may be different from the number
of characters. So, getting the true character count is not common
with JavaScript, and is probably best left to specialized
internationalization libraries.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

This API doesn't serve much purpose, and is only likely to cause
confusion and bugs. The intention was that this would return the
number of characters in a string independent of encoding, but
that's not generally useful. In almost all cases, one of the
encoding-specific napi_get_value_string_* APIs is more correct.
(Pass a null buffer if only the encoded length is desired.)

Anyway the current implementation of napi_get_value_string_length()
is technically wrong: it returns the number of 2-byte code units of
the UTF-16 encoding, but there are actually some characters that
are encoded as two UTF-16 code units.

Note the JavaScript String.prototype.length property returns the
number of UTF-16 code units, which may be different from the number
of characters. So, getting the true character count is not common
with JavaScript, and is probably best left to specialized
internationalization libraries.

Fixes: nodejs/abi-stable-node#226
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. node-api Issues and PRs related to the Node-API. labels Apr 18, 2017
Copy link
Member

@mhdawson mhdawson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mhdawson
Copy link
Member

@mhdawson
Copy link
Member

CI good, landed as 468275a

@mhdawson mhdawson closed this Apr 20, 2017
mhdawson pushed a commit that referenced this pull request Apr 20, 2017
This API doesn't serve much purpose, and is only likely to cause
confusion and bugs. The intention was that this would return the
number of characters in a string independent of encoding, but
that's not generally useful. In almost all cases, one of the
encoding-specific napi_get_value_string_* APIs is more correct.
(Pass a null buffer if only the encoded length is desired.)

Anyway the current implementation of napi_get_value_string_length()
is technically wrong: it returns the number of 2-byte code units of
the UTF-16 encoding, but there are actually some characters that
are encoded as two UTF-16 code units.

Note the JavaScript String.prototype.length property returns the
number of UTF-16 code units, which may be different from the number
of characters. So, getting the true character count is not common
with JavaScript, and is probably best left to specialized
internationalization libraries.

PR-URL: #12496
Fixes: nodejs/abi-stable-node#226
Reviewed-By: Michael Dawson <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@mhdawson mhdawson self-assigned this Apr 20, 2017
@jasnell jasnell mentioned this pull request May 11, 2017
@gibfahn gibfahn mentioned this pull request Jun 15, 2017
3 tasks
gabrielschulhof pushed a commit to gabrielschulhof/node that referenced this pull request Apr 10, 2018
This API doesn't serve much purpose, and is only likely to cause
confusion and bugs. The intention was that this would return the
number of characters in a string independent of encoding, but
that's not generally useful. In almost all cases, one of the
encoding-specific napi_get_value_string_* APIs is more correct.
(Pass a null buffer if only the encoded length is desired.)

Anyway the current implementation of napi_get_value_string_length()
is technically wrong: it returns the number of 2-byte code units of
the UTF-16 encoding, but there are actually some characters that
are encoded as two UTF-16 code units.

Note the JavaScript String.prototype.length property returns the
number of UTF-16 code units, which may be different from the number
of characters. So, getting the true character count is not common
with JavaScript, and is probably best left to specialized
internationalization libraries.

PR-URL: nodejs#12496
Fixes: nodejs/abi-stable-node#226
Reviewed-By: Michael Dawson <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
Reviewed-By: James M Snell <[email protected]>
MylesBorins pushed a commit that referenced this pull request Apr 16, 2018
This API doesn't serve much purpose, and is only likely to cause
confusion and bugs. The intention was that this would return the
number of characters in a string independent of encoding, but
that's not generally useful. In almost all cases, one of the
encoding-specific napi_get_value_string_* APIs is more correct.
(Pass a null buffer if only the encoded length is desired.)

Anyway the current implementation of napi_get_value_string_length()
is technically wrong: it returns the number of 2-byte code units of
the UTF-16 encoding, but there are actually some characters that
are encoded as two UTF-16 code units.

Note the JavaScript String.prototype.length property returns the
number of UTF-16 code units, which may be different from the number
of characters. So, getting the true character count is not common
with JavaScript, and is probably best left to specialized
internationalization libraries.

Backport-PR-URL: #19447
PR-URL: #12496
Fixes: nodejs/abi-stable-node#226
Reviewed-By: Michael Dawson <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@MylesBorins MylesBorins mentioned this pull request Apr 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. node-api Issues and PRs related to the Node-API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider removing napi_get_value_string_length
6 participants