Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os.environ should preserve the case of the OS keys ? #73010

Closed
tzickel mannequin opened this issue Nov 28, 2016 · 12 comments
Closed

os.environ should preserve the case of the OS keys ? #73010

tzickel mannequin opened this issue Nov 28, 2016 · 12 comments
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@tzickel
Copy link
Mannequin

tzickel mannequin commented Nov 28, 2016

BPO 28824
Nosy @pfmoore, @tjguk, @bitdancer, @zware, @eryksun, @zooba, @tzickel

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2016-11-28.18:55:46.277>
labels = ['type-bug', '3.9', '3.10', '3.11', 'library', 'OS-windows']
title = 'os.environ should preserve the case of the OS keys ?'
updated_at = <Date 2022-03-02.11:19:53.537>
user = 'https://github.com/tzickel'

bugs.python.org fields:

activity = <Date 2022-03-02.11:19:53.537>
actor = 'eryksun'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)', 'Windows']
creation = <Date 2016-11-28.18:55:46.277>
creator = 'tzickel'
dependencies = []
files = []
hgrepos = []
issue_num = 28824
keywords = []
message_count = 10.0
messages = ['281906', '281909', '281910', '281912', '281914', '281915', '281918', '387676', '414115', '414141']
nosy_count = 8.0
nosy_names = ['paul.moore', 'tim.golden', 'r.david.murray', 'benrg', 'zach.ware', 'eryksun', 'steve.dower', 'tzickel']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue28824'
versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

@tzickel
Copy link
Mannequin Author

tzickel mannequin commented Nov 28, 2016

In Windows, python's os.environ currently handles the case sensitivity different that the OS. While it's true that the OS is case insensitive, it does preserve the case that you first set it as.

For example:
C:\Users\user>set aSD=Blah
C:\Users\user>set asd
aSD=Blah

But in python:
>>> import os
>>> 'aSD' in os.environ.keys()
False

Today as more people pass environment variables to processes, it's better to behave as the OS does. Basically I think that os.environ (both in 2.7 and 3) should preserve the case as well (for when you need to access / iterate over the keys or set a key), but ignore it when you get a key.

return encode(key).upper()

@tzickel tzickel mannequin added 3.7 (EOL) end of life OS-windows labels Nov 28, 2016
@bitdancer
Copy link
Member

That unfortunately would probably break existing code. It does seem reasonable that case should be ignored on get, though, if the OS does so. So making your 'in' statement work might be acceptable, backward compatibility wise. Probably only in 3.7, though.

Is it the OS that ignores case, or just the shell?

@zooba
Copy link
Member

zooba commented Nov 28, 2016

This works fine in Python 3, and also Python 2.7 *unless* you call .keys().

PS D:\> py -2.7
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> 'path' in os.environ, 'path' in os.environ.keys()
(True, False)

PS D:\> py
Python 3.6.0b4 (default, Nov 22 2016, 05:30:12) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> 'path' in os.environ, 'path' in os.environ.keys()
(True, True)

I suspect at this point, we aren't going to change this in Python 2.7 unless someone comes up with an incredibly motivating issue.

@zooba zooba removed the 3.7 (EOL) end of life label Nov 28, 2016
@tzickel
Copy link
Mannequin Author

tzickel mannequin commented Nov 28, 2016

My issue is that somebody wants to pass a few dict like environment variables as some prefix_key=value but he wants to preserve the case of the key for usage in python so the .keys() space needs to be enumerated.

A workaround for this issue can be importing nt and using nt.environ which preserves the cases.

@eryksun
Copy link
Contributor

eryksun commented Nov 28, 2016

I've come across a few problems when passing a modified os.environ.copy() to a child process via subprocess.Popen. Ideally it shouldn't be an issue, as long as the OS or C runtime functions are used, but some troublesome programs do their own case-sensitive search for environment variables (definitely a bug). In such cases the simplest workaround is to use nt.environ.copy(), but this doesn't include any changes in the environment since startup.

@tzickel
Copy link
Mannequin Author

tzickel mannequin commented Nov 28, 2016

Steve, I've checked in Python 3.5.2, and os.environ.keys() still uppercases everything when scanning (for my use case). Has it changed since then ?

@zooba
Copy link
Member

zooba commented Nov 28, 2016

Ah, I see what you mean. In this case, we could change how the case-insensitivity is handled here, but it would only be applicable to 3.7. I'm not opposed to changing the default behavior here, but it does kind of bring up the mapping dict discussion again.

If case is important to your application, environment variables are probably the wrong way to go about passing them in anyway. Either use the value of the variable rather than the key, or find a different approach. Given 'nt.environ' is available without case remapping, I think that's the best workaround.

@zooba zooba added the 3.7 (EOL) end of life label Nov 28, 2016
@eryksun
Copy link
Contributor

eryksun commented Feb 25, 2021

On Windows, maybe the os.environ mapping could use a case-insensitive subclass of str for its keys, such as the following:

    @total_ordering
    class _CaseInsensitiveString(str):
        def __eq__(self, other):
            if not isinstance(other, str):
                return NotImplemented
            return path.normcase(self) == path.normcase(other)

        def __lt__(self, other):
            if not isinstance(other, str):
                return NotImplemented
            return path.normcase(self) < path.normcase(other)

        def __hash__(self):
            return hash(path.normcase(self))

Change encodekey() to use the above type. For example:

    def encodekey(key):
        return _CaseInsensitiveString(encode(key))

in which encode() is still check_str().

@eryksun eryksun added stdlib Python modules in the Lib dir 3.9 only security fixes 3.10 only security fixes type-bug An unexpected behavior, bug, or error and removed 3.7 (EOL) end of life labels Feb 25, 2021
@eryksun eryksun added the 3.11 only security fixes label Feb 26, 2022
@benrg
Copy link
Mannequin

benrg mannequin commented Feb 26, 2022

This issue should be marked dependent on bpo-43702 or bpo-46862, since fixing it could break third-party code unless they're fixed first.

Given 'nt.environ' is available without case remapping, I think that's the best workaround.

Right now, it's not a good workaround because it contains the environment at the time the interpreter was started, not the current environment. On Posix, _Environ takes a reference to posix.environ and uses it directly, so it does get updated. On Windows, _Environ gets a rewritten dictionary and nt.environ is just a space-wasting attractive nuisance. I think it should be replaced with getenviron() which builds a dict from the environment block each time it's called. But posix.environ is documented (though nt.environ isn't), so maybe not.

class _CaseInsensitiveString(str):

I think there should be a public class like this. It could be useful to email.message.Message and its clients like urllib. They currently store headers in a list and every operation is O(n).

The semantics are tricky. As written, it violates the requirement that equal objects have equal hashes. To fix that, you'd have to make every CIS compare unequal to every str. At that point, it probably shouldn't be a str subclass, which also has the advantage that it's not limited to strings. It can be a generic compare-by-key wrapper.

@eryksun
Copy link
Contributor

eryksun commented Feb 27, 2022

I think there should be a public class like this.

I wrote a basic implementation of _CaseInsensitiveString under the assumption that it's hidden behind the __getitem__(), __setitem__(), and __delitem__() methods of the _Environ class. I don't want to complicate the implementation.

The problem of case-insensitive names in file/registry paths and environment variables should be addressed with ntpath.normcase(). This should be based on WinAPI LCMapStringEx() with LOCALE_NAME_INVARIANT and LCMAP_UPPERCASE. The current implementation of ntpath.normcase() uses str.lower(), which depends on Python's Unicode database. This possibly differs from what Windows considers to be lower case in the invariant locale. It's also wrong because Windows uses upper case for cases-insensitive comparisons, which matters when sorting names. See bpo-42658.

Right now, it's not a good workaround because it contains the
environment at the time the interpreter was started, not the
current environment.

Note that in some cases the "current environment" is the process environment, which isn't necessarily consistent with os.environ on any platform.

On POSIX, posix.environ is created from C environ. The latter is kept in sync with changes to posix.environ via C putenv() and unsetenv(). However, directly calling os.putenv() or os.unsetenv(), or the underlying C functions, modifies the process environment without changing posix.environ. If subprocess.Popen is called without overriding env, the child inherits the process environment, not posix.environ.

On Windows, os.environ is created from nt.environ, with variables names converted to upper case. nt.environ is created from C _wenviron. The latter is created from the process environment, as returned by WinAPI GetEnvironmentStringsW(), except with variable names that begin with "=" filtered out. os.putenv() and os.unsetenv() are based on C _wputenv(), which updates C _wenviron and also updates the process environment via WinAPI SetEnvironmentVariableW(). If either os.putenv() or os.unsetenv() is called directly, or _wputenv() at a lower level, then the variables in C _wenviron (not just the letter case of the names) will be out of sync with os.environ. Additionally, if SetEnvironmentVariableW() is called directly to set or unset a variable, then both os.environ and C _wenviron will be out of sync with the process environment. If subprocess.Popen is called without overriding env, the child inherits the process environment, not os.environ or C _wenviron.

@vmatt
Copy link

vmatt commented Nov 3, 2022

If you don't have duplicate entries, and want to just uppercase all envs, you can do it with this one-liner:
environ_upper = {k.upper():v for k,v in dict(os.environ).items()}

@karrtikr
Copy link

karrtikr commented Mar 3, 2023

Given this is now documented that keys are always converted to upper case in #101754, should we close this? @zooba @brettcannon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

6 participants