Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract platform-specifics from ENV to Crystal::System::Env and implement for win32 #6333

Merged
merged 11 commits into from
Jul 24, 2018

Conversation

straight-shoota
Copy link
Member

Requires #5623 (for win32).

module Crystal::System::Env
# Sets an environment variable.
def self.set(key : String, value : String) : Nil
raise ArgumentError.new("Key contains null byte") if key.byte_index(0)
Copy link
Contributor

@RX14 RX14 Jul 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key.check_no_bull_byte and down the diff

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I initially thought, too. But check_no_null_byte loses the information which of the arguments was invalid (it's String contains null byte instead of Key contains null byte). Doesn't make much difference, so I'm not sure about it...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then refactor it to take an argument

System.retry_wstr_buffer do |buffer, small_buf|
length = LibC.GetEnvironmentVariableW(key.to_utf16, buffer, buffer.size)
if 0 < length < buffer.size
break String.from_utf16(buffer[0, length])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably slightly clearer using return instead of break.

src/env.cr Outdated
value
end

# Returns `true` if the environment variable named *key* exists and `false`
# if it doesn't.
def self.has_key?(key : String) : Bool
!!LibC.getenv(key)
!!Crystal::System::Env.set?(key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename this to has_key? and remove the !!

@@ -6,4 +6,6 @@ lib LibC
alias WCHAR = UInt16
alias LPSTR = CHAR*
alias LPWSTR = WCHAR*
alias LPCWSTR = WCHAR*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LPCWSTR shouldn't be an alias since it's just constant which means nothing ABI-wise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... what should it be?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LPWSTR...

@@ -5,6 +5,8 @@ require "c/int_safe"
lib LibC
fun GetLastError : DWORD

ERROR_ENVVAR_NOT_FOUND = 203_u32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@straight-shoota straight-shoota force-pushed the jm/feature/windows-env branch 2 times, most recently from ca2905f to 7efc63e Compare July 5, 2018 08:40
@straight-shoota straight-shoota force-pushed the jm/feature/windows-env branch 3 times, most recently from 52da7cc to a68fd42 Compare July 5, 2018 09:45
@straight-shoota
Copy link
Member Author

Rebased after #5623 was merged and ready for review 👍

@straight-shoota straight-shoota force-pushed the jm/feature/windows-env branch from a68fd42 to 9bf5cb8 Compare July 5, 2018 10:38
#
# Invalid values are encoded using the unicode replacement char with
# codepoint `0xfffd`.
def self.from_utf16(pointer : Pointer(UInt16)) : {String, Pointer(UInt16)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change and as such needs some more discussion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. When reading from a pointer it needs to advance the amount of UInt16 read. We can't determine that afterwards from String#bytesize because UTF-8 and UTF-16 have different bytesizes. We could rename the method with this changed behaviour (to maybe String.from_utf16_advancing_pointer). Or change the signature in some way. But I don't think it makes much sense to have both a version of this method (with pointer) that returns the advanced pointer and one that doesn't. It's easy to just ignore the returned pointer value if you don't need it.

@asterite
Copy link
Member

asterite commented Jul 8, 2018

This is why I recommended keeping the UTF-16 implementation internal until we got it working well in Windows, otherwise we are bound to make breaking changes.

I'd say for now don't worry about breaking changes here, I doubt anyone is using these methods, and if they are, well, it's a simple fix.

@RX14
Copy link
Contributor

RX14 commented Jul 8, 2018

I'm fine with this breaking change actually

@RX14 RX14 added topic:stdlib breaking-change platform:windows Windows support based on the MSVC toolchain / Win32 API labels Jul 8, 2018
src/string.cr Outdated
def check_no_null_byte(name = nil)
if byte_index(0)
name = "`#{name}` "
raise ArgumentError.new("String #{name}contains null byte")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this on two lines??

Also, I'd rather the entire exception message was passed in, i.e. def check_no_null_byte(message = "String contains null byte").

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the first comment, disagree with the second one.

Copy link
Contributor

@RX14 RX14 Jul 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing you save by doing it the other way is a bit of typing, and the benefit of having fully customized error messages. For "Key contains null byte" vs "String `key` contains null byte" - the former is way better. Please, let's keep it simple, understandable and flexible.

Copy link
Member Author

@straight-shoota straight-shoota Jul 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two lines because I forgot a if name in the first line. To avoid empty argument to be printed. That's why the specs fail.

Maybe full message is better. But actually, I don't think there is much benefit from it. Yes, it's more flexible, but I don't see any use for that. check_no_null_byte is only used to validate method arguments and for such it makes sense to declare the name of the argument if there are more than one. But apart from that, I don't see any need for customization.

Refactor String#check_no_null_byte
@straight-shoota straight-shoota force-pushed the jm/feature/windows-env branch from ec4b780 to 5de7a7a Compare July 8, 2018 21:53
raise ArgumentError.new("String contains null byte") if byte_index(0)
def check_no_null_byte(name = nil)
if byte_index(0)
name = "`#{name}` " if name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the backticks?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To show that this refers to a name. This formatting is used in many other compiler and stdlib error messages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird, I never noticed that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's not that common

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO Key contains null byte reads way better than String 'key' contains null byte.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asterite Do you mean name should be required?

Copy link
Contributor

@RX14 RX14 Jul 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps check_no_null_byte(what = "String") then just #{what} contains null byte?

Or just go with making it fully configurable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking at the code, it always seems to be a check against an argument. To avoid repeating the argument name, maybe we can have:

class String
  macro check_no_null_byte(name)
    raise ArgumentError.new("Argument '{{name}}' contains null byte") if {{name}}.byte_index(0)
    {{name}}
  end
end

Then it can be used like this:

def some_method(arg)
  String.check_no_null_byte(arg)
  # do something
end

That way there's no string interpolation, and there's no need to repeat the argument name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is bikeshedding and overengineering. The diff is fine as-is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RX14 I was writing the same thing. All are valid options and it doesn't make a difference regarding ENV.
Anyone can send a PR later to refactor check_no_null_byte if you care enough.

@straight-shoota straight-shoota force-pushed the jm/feature/windows-env branch from 2d01d82 to 7adb2cc Compare July 9, 2018 14:24
Add spec for empty environment variable. This fails on win32 because `GetEnvironmentVariableW` doesnt differentiate between unset variable and empty string.
In contrast, `GetEnvironmentStringsW` contains empty variables.
@straight-shoota
Copy link
Member Author

straight-shoota commented Jul 10, 2018

I discovered an issue on win32: GetEnvironmentVariableW doesn't differentiate between unset variable and empty string. Both have length == 0 and return value ERROR_ENVVAR_NOT_FOUND. The latter seems incorrect in case of an empty string value. In contrast, GetEnvironmentStringsW contains variables with empty values.

This could be used to identify empty values if length == 0 but will of course have a performance impact, and it's probably not really worth it for this edge case. There will however be a discrepancy between ENV[] and ENV.each regarding empty strings.

Unless someone has a idea how to prevent this, I don't think there is anything to be done.

@RX14
Copy link
Contributor

RX14 commented Jul 11, 2018

@straight-shoota i'm pretty skeptical that GetEnvironmentVariable is behaving this way. Can you print the result of GetLastError (pp WinError.new) before and directly after the API call?

@RX14
Copy link
Contributor

RX14 commented Jul 11, 2018

This stackoverflow answer seems to show the Win32 API correctly returning the value of an empty env var: https://stackoverflow.com/questions/26993642/call-to-environment-getenvironmentvariable-affects-subsequent-calls

But it does show there's definitely weirdness around this part of the API - perhaps GetEnvironmentVariable is failing to clear GetLastError correctly as all syscalls should.

Fix implementation for empty string on win32 using `SetLastError`.
@straight-shoota
Copy link
Member Author

It actually seems to be the case that GetEnvironmentVariableW doesn't clear the last error. GetLastError is even ERROR_ENVVAR_NOT_FOUND if a positive length was returned.
And indeed, that's documented for SetLastError:

Most functions call SetLastError or SetLastErrorEx only when they fail. However, some system functions call SetLastError or SetLastErrorEx under conditions of success; those cases are noted in each function's documentation.

I fixed it by adding a call to SetLastError to set it to 0 before the call to GetEnvironmentVariableW. Now everything works 👍

@RX14
Copy link
Contributor

RX14 commented Jul 11, 2018

Oh actually, can you uncomment out the file/dir specs which were commented out due to using ENV?

@straight-shoota
Copy link
Member Author

As far as I can see, they're all pending for other reasons.

@RX14
Copy link
Contributor

RX14 commented Jul 11, 2018

@straight-shoota these should go from macroed out to just pending

# TODO: these specs don't compile on windows because ENV isn't ported

@straight-shoota straight-shoota force-pushed the jm/feature/windows-env branch from 66166cb to 0b33619 Compare July 11, 2018 21:05
@@ -582,41 +582,38 @@ describe "File" do
File.expand_path("~/a", "/tmp/gumby/ddd").should eq(File.join([home, "a"]))
end

# TODO: these specs don't compile on windows because ENV isn't ported
# TODO: remove /\A\/\// hack after this is removed from macros
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check the diff for #5623 and do this TODO too please

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not sure what you mean... reinstating %r{\A//}?

Copy link
Contributor

@RX14 RX14 Jul 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, and remove the TODO comment

@RX14
Copy link
Contributor

RX14 commented Jul 24, 2018

ping, needs a second review

@RX14 RX14 added this to the Next milestone Jul 24, 2018
@RX14 RX14 merged commit 4f720ab into crystal-lang:master Jul 24, 2018
@straight-shoota straight-shoota deleted the jm/feature/windows-env branch July 24, 2018 15:54
@straight-shoota
Copy link
Member Author

@RX14 The commit message is a mess... it should've been squashed without all these fixups.

@RX14 RX14 modified the milestones: Next, 0.26.0 Jul 30, 2018
@straight-shoota straight-shoota mentioned this pull request Aug 1, 2018
@straight-shoota straight-shoota mentioned this pull request Aug 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change platform:windows Windows support based on the MSVC toolchain / Win32 API topic:stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants