Add support for newlines, backslashes, trailing comments and unquoted UTF-8 #148
I added a test case so that coverage is kept at a high level. 📈
This looks great! I'm happy to merge this in. Could you bring it up to date with master? Let me know so I can hold off on any further merges to master. The conflict seems to be due to 43af2c5.
I've just rebased, this should be good to go! I noticed two small issues when rebasing, which I've fixed and added test cases for:
|
This was also caught by Flake8 as:

./dotenv/main.py:19:2: W605 invalid escape sequence '\$'
./dotenv/main.py:19:4: W605 invalid escape sequence '\{'
./dotenv/main.py:19:8: W605 invalid escape sequence '\}'
./dotenv/main.py:19:12: W605 invalid escape sequence '\}'
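For context, W605 is reported when a regular expression is written in a normal string literal, because sequences such as `\$` are not valid Python string escapes. A minimal sketch of the usual fix, a raw string, follows. The pattern and the `find_variables` helper are hypothetical and are not the exact code in `dotenv/main.py`.

```python
import re

# Hypothetical pattern for POSIX-style ${VAR} references.
# Written as "\$\{[^\}]*\}" (no r prefix), Flake8 would flag each backslash
# sequence with W605. The raw-string prefix keeps the backslashes literal so
# that they reach the regex engine unchanged.
POSIX_VARIABLE = re.compile(r"\$\{[^\}]*\}")


def find_variables(value):
    """Return every ${...} reference found in value (illustrative helper)."""
    return POSIX_VARIABLE.findall(value)
```

Calling `find_variables("${HOME}/bin")` in this sketch returns the single reference it contains.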
This avoids the use of the `is_file` class variable by abstracting away the difference between `StringIO` and a file stream.
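The idea can be sketched roughly as follows, assuming a standalone function that receives either a path or a `StringIO`. The signature is an assumption made for illustration; the actual `get_stream` in the package may be a method with different arguments.

```python
import io
from contextlib import contextmanager


@contextmanager
def get_stream(source):
    """Yield a readable text stream for a path or an in-memory StringIO.

    Hypothetical signature, for illustration only.
    """
    if isinstance(source, io.StringIO):
        # In-memory streams belong to the caller, so they are not closed here.
        yield source
    else:
        # Real files are opened here and closed when the with-block exits.
        with io.open(source, encoding="utf-8") as stream:
            yield stream
```

Callers then write `with get_stream(source) as stream:` in both cases and no longer need a flag to tell the two kinds of source apart.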
Parsing .env files is a critical part of this package. To make it easier to change it and test it, it is important that it is done in only one place. Also, code that uses the parser now doesn't depend on the fact that each key-value binding spans exactly one line. This will make it easier to handle multiline bindings in the future.
This adds support for:

* multiline values (i.e. containing newlines or escaped \n), fixes theskumar#89
* backslashes in values, fixes theskumar#112
* trailing comments, fixes theskumar#141
* UTF-8 in unquoted values, fixes theskumar#147

Parsing is no longer line-based. That's why `parse_line` was replaced by `parse_binding`. Thanks to the previous commit, users of `parse_stream` don't have to deal with this change.

This supersedes a previous pull-request, theskumar#142, which would add support for multiline values in `Dotenv.parse` but not in the CLI (`dotenv get` and `dotenv set`).

The key-value binding regular expression was inspired by https://github.com/bkeepers/dotenv/blob/d749366b6009126b115fb7b63e0509566365859a/lib/dotenv/parser.rb#L14-L30

Parsing of escapes was fixed thanks to https://stackoverflow.com/questions/4020539/process-escape-sequences-in-a-string-in-python/24519338#24519338
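To make the shift from line-based to binding-based parsing concrete, here is a simplified sketch. The regular expression, the escape table, and the body of `parse_stream` are all assumptions written for illustration; they are not the code from this pull request. In particular, decoding escapes only inside double quotes is a choice made for this sketch, not a statement about the PR's behaviour.

```python
import io
import re

# Hypothetical, simplified binding pattern (not the regex used in the PR).
_BINDING = re.compile(
    r"""
    \s*                                  # leading whitespace, blank lines
    (?:export\s+)?                       # optional "export" prefix
    (?P<key>[A-Za-z_][A-Za-z0-9_]*)      # variable name
    \s*=\s*
    (?:
        "(?P<double>(?:\\.|[^"\\])*)"    # double-quoted, may span lines
      | '(?P<single>(?:\\.|[^'\\])*)'    # single-quoted
      | (?P<bare>[^\#\r\n]*?)            # unquoted, stops before a comment
    )
    [^\S\r\n]*(?:\#[^\r\n]*)?            # optional trailing comment
    (?:\r?\n|$)                          # end of the binding
    """,
    re.VERBOSE,
)

# Escapes are decoded one at a time, so non-ASCII text is never touched.
_ESCAPES = {"\\\\": "\\", "\\n": "\n", "\\t": "\t", "\\r": "\r",
            '\\"': '"', "\\'": "'"}
_ESCAPE = re.compile(r"""\\[\\ntr"']""")


def decode_escapes(text):
    """Replace the backslash escapes listed in _ESCAPES, leave the rest."""
    return _ESCAPE.sub(lambda m: _ESCAPES[m.group(0)], text)


def parse_stream(stream):
    """Yield (key, value) pairs; a binding may span several lines."""
    text = stream.read()
    position = 0
    while position < len(text):
        match = _BINDING.match(text, position)
        if match is None:
            # Not a binding (e.g. a comment line): skip to the next line.
            newline = text.find("\n", position)
            position = len(text) if newline == -1 else newline + 1
            continue
        position = match.end()
        if match.group("double") is not None:
            value = decode_escapes(match.group("double"))
        elif match.group("single") is not None:
            value = match.group("single")
        else:
            value = match.group("bare")
        yield match.group("key"), value


# Usage: consumers iterate over bindings and never see raw lines.
bindings = dict(parse_stream(io.StringIO('a="first\nsecond" # note\nb=é\n')))
```

Because the parser advances by match position instead of by line, a quoted value containing a real newline is returned as one binding.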
Any plans to release soon with this patch included?
Thanks @bbc2 and others for your patience. I've had a busy schedule. I'm updating the README etc. and will then make a release.
This was a breaking change for us. We had an entry |
That's interesting, I thought Bash would interpret
$ a="1\t2"
$ echo $a
1\t2
$ bash --version
GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
However, Zsh outputs a tab instead of |
I've avoided the issue by removing the speech marks for now. Unquoted values seem to handle spaces fine, so I shouldn't really need speech marks for file paths.
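A small sketch of why a quoted value with backslashes can change once escapes are decoded. The path below is hypothetical (the original entry is not shown in this thread), and `decode_escapes` is an illustrative helper that assumes quoted values have their escapes decoded while unquoted values are taken literally.

```python
import re

# Illustrative escape table; the set handled by the real parser may differ.
_ESCAPES = {"\\\\": "\\", "\\n": "\n", "\\t": "\t", "\\r": "\r"}
_ESCAPE = re.compile(r"\\[\\ntr]")


def decode_escapes(text):
    """Decode a few backslash escapes, as a quoted value might be treated."""
    return _ESCAPE.sub(lambda m: _ESCAPES[m.group(0)], text)


# Hypothetical Windows-style path, as it would appear between speech marks.
quoted_value = r"C:\temp\new"

# After decoding, "\t" has become a tab and "\n" a newline.
decoded = decode_escapes(quoted_value)

# Taken literally (the unquoted case in this sketch), the path is unchanged.
literal = quoted_value
```

Under these assumptions, doubling the backslashes inside the quotes (`C:\\temp\\new`) would decode back to the intended path.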
This was a breaking change for me. I use a local .env file to store a password for local testing, the password contains a # |
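The effect can be sketched as follows, assuming the password was unquoted. Both functions and the sample value are hypothetical; they only contrast a naive line-based split with a parser that treats an unquoted `#` as the start of a trailing comment.

```python
import re

# Hypothetical pattern: unquoted binding with an optional trailing comment.
_BARE_BINDING = re.compile(
    r"(?P<key>\w+)=(?P<value>[^#\r\n]*?)[^\S\r\n]*(?:#.*)?$"
)


def parse_line_based(line):
    """Line-based sketch: everything after the first '=' is the value."""
    key, _, value = line.partition("=")
    return key, value.strip()


def parse_with_comments(line):
    """Comment-aware sketch: an unquoted '#' begins a trailing comment."""
    match = _BARE_BINDING.match(line)
    if match is None:
        raise ValueError("not a binding: %r" % line)
    return match.group("key"), match.group("value")


line = "PASSWORD=abc#123"  # made-up value for illustration
old_style = parse_line_based(line)
new_style = parse_with_comments(line)
```

In this sketch `old_style` keeps the whole value, while `new_style` stops at the `#`, which matches the kind of truncation described in the report.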
I'm sorry it broke your use cases. I'll try to fix this soon. My plan:
|
… UTF-8 (theskumar#148)

* Fix deprecation warning for POSIX variable regex

  This was also caught by Flake8 as:

  ./dotenv/main.py:19:2: W605 invalid escape sequence '\$'
  ./dotenv/main.py:19:4: W605 invalid escape sequence '\{'
  ./dotenv/main.py:19:8: W605 invalid escape sequence '\}'
  ./dotenv/main.py:19:12: W605 invalid escape sequence '\}'

* Turn get_stream into a context manager

  This avoids the use of the `is_file` class variable by abstracting away the difference between `StringIO` and a file stream.

* Deduplicate parsing code and abstract away lines

  Parsing .env files is a critical part of this package. To make it easier to change it and test it, it is important that it is done in only one place. Also, code that uses the parser now doesn't depend on the fact that each key-value binding spans exactly one line. This will make it easier to handle multiline bindings in the future.

* Parse newline, UTF-8, trailing comment, backslash

  This adds support for:

  * multiline values (i.e. containing newlines or escaped \n), fixes theskumar#89
  * backslashes in values, fixes theskumar#112
  * trailing comments, fixes theskumar#141
  * UTF-8 in unquoted values, fixes theskumar#147

  Parsing is no longer line-based. That's why `parse_line` was replaced by `parse_binding`. Thanks to the previous commit, users of `parse_stream` don't have to deal with this change.

  This supersedes a previous pull-request, theskumar#142, which would add support for multiline values in `Dotenv.parse` but not in the CLI (`dotenv get` and `dotenv set`).

  The key-value binding regular expression was inspired by https://github.com/bkeepers/dotenv/blob/d749366b6009126b115fb7b63e0509566365859a/lib/dotenv/parser.rb#L14-L30

  Parsing of escapes was fixed thanks to https://stackoverflow.com/questions/4020539/process-escape-sequences-in-a-string-in-python/24519338#24519338
This adds support for:

* multiline values (i.e. containing newlines or escaped \n), fixes #89
* backslashes in values, fixes #112
* trailing comments, fixes #141
* UTF-8 in unquoted values, fixes #147

This supersedes a previous pull-request, #142, which would add support for multiline values in `Dotenv.parse` but not in the CLI (`dotenv get` and `dotenv set`).

The internal change is significant but I have added a lot of test cases to reduce the risk of breaking anything. Previous test cases are still present, so I wouldn't expect any major backward incompatibility.
I have written detailed commit messages and made my code as clear as possible. Let me know if anything should be improved. I'll be happy to fix anything unsatisfactory.