Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-47000: Add locale.getencoding() #32068

Merged
merged 13 commits into from
Apr 9, 2022
15 changes: 14 additions & 1 deletion Doc/library/locale.rst
Original file line number Diff line number Diff line change
Expand Up @@ -334,10 +334,23 @@ The :mod:`locale` module defines the following exception and functions:
locale. See also the :term:`filesystem encoding and error handler`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to copy/paste this paragraph to getencoding() doc.


.. versionchanged:: 3.7
The function now always returns ``UTF-8`` on Android or if the
The function now always returns ``"UTF-8"`` on Android or if the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The function now always returns ``"UTF-8"`` on Android or if the
The function now always returns ``"utf-8"`` on Android or if the

:ref:`Python UTF-8 Mode <utf8-mode>` is enabled.


.. function:: getencoding()

Returns the :term:`locale encoding`.
methane marked this conversation as resolved.
Show resolved Hide resolved

On Android, it always returns ``"UTF-8"``, the :term:`locale encoding` is
ignored.

This function is same to ``getpreferredencoding(False)`` except this
methane marked this conversation as resolved.
Show resolved Hide resolved
function ignore the :ref:`UTF-8 Mode <utf8-mode>`.
methane marked this conversation as resolved.
Show resolved Hide resolved

.. versionadded:: 3.11


.. function:: normalize(localename)

Returns a normalized locale code for the given locale name. The returned locale
Expand Down
6 changes: 6 additions & 0 deletions Doc/whatsnew/3.11.rst
Original file line number Diff line number Diff line change
Expand Up @@ -274,6 +274,12 @@ inspect
* Add :func:`inspect.ismethodwrapper` for checking if the type of an object is a
:class:`~types.MethodWrapperType`. (Contributed by Hakan Çelik in :issue:`29418`.)

locale
------

* Add :func:`locale.getencoding` that is same to
methane marked this conversation as resolved.
Show resolved Hide resolved
``locale.getpreferredencoding(False)`` but ignores :ref:`UTF-8 Mode <utf8-mode>`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like repeating "Python" to remind that it's unrelated to the Unix locale, but really something specific to Python which ignores the locale.

Suggested change
``locale.getpreferredencoding(False)`` but ignores :ref:`UTF-8 Mode <utf8-mode>`.
``locale.getpreferredencoding(False)`` but ignores the :ref:`Python UTF-8 Mode <utf8-mode>`.


math
----

Expand Down
20 changes: 10 additions & 10 deletions Lib/locale.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@
"setlocale", "resetlocale", "localeconv", "strcoll", "strxfrm",
"str", "atof", "atoi", "format", "format_string", "currency",
"normalize", "LC_CTYPE", "LC_COLLATE", "LC_TIME", "LC_MONETARY",
"LC_NUMERIC", "LC_ALL", "CHAR_MAX"]
"LC_NUMERIC", "LC_ALL", "CHAR_MAX", "getencoding"]

def _strcoll(a,b):
""" strcoll(string,string) -> int.
Expand Down Expand Up @@ -637,27 +637,27 @@ def resetlocale(category=LC_ALL):


try:
from _locale import _get_locale_encoding
from _locale import getencoding
except ImportError:
def _get_locale_encoding():
def getencoding():
if hasattr(sys, 'getandroidapilevel'):
# On Android langinfo.h and CODESET are missing, and UTF-8 is
# always used in mbstowcs() and wcstombs().
return 'UTF-8'
if sys.flags.utf8_mode:
return 'UTF-8'
encoding = getdefaultlocale()[1]
if encoding is None:
# LANG not set, default conservatively to ASCII
encoding = 'ascii'
# LANG not set, default to UTF-8
encoding = 'UTF-8'
methane marked this conversation as resolved.
Show resolved Hide resolved
return encoding

try:
CODESET
except NameError:
def getpreferredencoding(do_setlocale=True):
"""Return the charset that the user is likely using."""
return _get_locale_encoding()
if sys.flags.utf8_mode:
return 'UTF-8'
return getencoding()
else:
# On Unix, if CODESET is available, use that.
def getpreferredencoding(do_setlocale=True):
Expand All @@ -667,15 +667,15 @@ def getpreferredencoding(do_setlocale=True):
return 'UTF-8'

if not do_setlocale:
return _get_locale_encoding()
return getencoding()

old_loc = setlocale(LC_CTYPE)
try:
try:
setlocale(LC_CTYPE, "")
except Error:
pass
return _get_locale_encoding()
return getencoding()
finally:
setlocale(LC_CTYPE, old_loc)

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Add :func:`locale.getencoding`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can copy/paste the Doc/whatsnew/3.11.rst entry entry.

9 changes: 8 additions & 1 deletion Modules/_io/textio.c
Original file line number Diff line number Diff line change
Expand Up @@ -1145,7 +1145,14 @@ _io_TextIOWrapper___init___impl(textio *self, PyObject *buffer,
}
}
if (encoding == NULL && self->encoding == NULL) {
self->encoding = _Py_GetLocaleEncodingObject();
if (_PyRuntime.preconfig.utf8_mode) {
_Py_DECLARE_STR(utf_8, "utf-8");
self->encoding = &_Py_STR(utf_8);
Py_INCREF(self->encoding);
methane marked this conversation as resolved.
Show resolved Hide resolved
}
else {
self->encoding = _Py_GetLocaleEncodingObject();
}
if (self->encoding == NULL) {
goto error;
}
Expand Down
8 changes: 4 additions & 4 deletions Modules/_localemodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -773,14 +773,14 @@ _locale_bind_textdomain_codeset_impl(PyObject *module, const char *domain,


/*[clinic input]
_locale._get_locale_encoding
_locale.getencoding

Get the current locale encoding.
[clinic start generated code]*/

static PyObject *
_locale__get_locale_encoding_impl(PyObject *module)
/*[clinic end generated code: output=e8e2f6f6f184591a input=513d9961d2f45c76]*/
_locale_getencoding_impl(PyObject *module)
/*[clinic end generated code: output=86b326b971872e46 input=6503d11e5958b360]*/
{
return _Py_GetLocaleEncodingObject();
}
Expand Down Expand Up @@ -811,7 +811,7 @@ static struct PyMethodDef PyLocale_Methods[] = {
_LOCALE_BIND_TEXTDOMAIN_CODESET_METHODDEF
#endif
#endif
_LOCALE__GET_LOCALE_ENCODING_METHODDEF
_LOCALE_GETENCODING_METHODDEF
{NULL, NULL}
};

Expand Down
16 changes: 8 additions & 8 deletions Modules/clinic/_localemodule.c.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 6 additions & 4 deletions Python/fileutils.c
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,12 @@ _Py_device_encoding(int fd)

return PyUnicode_FromFormat("cp%u", (unsigned int)cp);
#else
if (_PyRuntime.preconfig.utf8_mode) {
_Py_DECLARE_STR(utf_8, "utf-8");
PyObject *encoding = &_Py_STR(utf_8);
Py_INCREF(encoding);
return encoding;
}
return _Py_GetLocaleEncodingObject();
#endif
}
Expand Down Expand Up @@ -890,10 +896,6 @@ _Py_GetLocaleEncoding(void)
// and UTF-8 is always used in mbstowcs() and wcstombs().
return _PyMem_RawWcsdup(L"UTF-8");
#else
const PyPreConfig *preconfig = &_PyRuntime.preconfig;
if (preconfig->utf8_mode) {
return _PyMem_RawWcsdup(L"UTF-8");
}

#ifdef MS_WINDOWS
wchar_t encoding[23];
Expand Down
8 changes: 7 additions & 1 deletion Python/initconfig.c
Original file line number Diff line number Diff line change
Expand Up @@ -1779,7 +1779,13 @@ static PyStatus
config_get_locale_encoding(PyConfig *config, const PyPreConfig *preconfig,
wchar_t **locale_encoding)
{
wchar_t *encoding = _Py_GetLocaleEncoding();
wchar_t *encoding;
if (preconfig->utf8_mode) {
encoding = _PyMem_RawWcsdup(L"UTF-8");
}
else {
encoding = _Py_GetLocaleEncoding();
}
if (encoding == NULL) {
return _PyStatus_NO_MEMORY();
}
Expand Down