-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create RFC####-Improve-generation-of-argument-string-for-executables #90
Conversation
Add quoting specification and todos
verbatim argument preference variable prosa (other shells) prosa (rules are accepted)
batch considerations prosa: Linux, out of this RFC
comments-due-date small corrections
So great to see this RFC written! Thank you for taking this one step forward.
|
Workarounds that break:
Regarding |
One worse thing that just came to my mind: The rare cases where no escaping was necessary (because the executables don't follow the typical rules) will also break. The probably most common case could be |
Alternative: special rules for batch files
However, if the argument itself contains quotes, these are not escaped. Therefore, the corresponding element in `argv[]` has no quotes and depending on the actual string, the argument might be split into multiple `argv[]` elements. It can even occur, that the following arguments are not handled correctly. | ||
|
||
This RFC suggests making the NativeCommandParameterBinder compatible to | ||
[the typical CommandLine escaping rules](https://msdn.microsoft.com/en-us/library/17w5ykft.aspx). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First things first: Great RFC.
The linked topic's most recent version is now available at https://docs.microsoft.com/en-us/cpp/cpp/parsing-cpp-command-line-arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
|
||
### Linux | ||
|
||
While this change is important on windows, it's absolutely necessary on Linux: On Windows the CommandLine is split by the next executable and most executables follow the described rules. On Linux the CommandLine is not split by the called executable -- the .Net Core runtime splits the string. Therefore, on Linux the described rules do not only apply to many calls of external executables, they apply to ALL calls of external executables. When the proposed changes are implemented, the arguments from within PowerShell always arrive -- as expected -- as the `argv[]` array in called executables. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is hard to believe that CoreFX currently first requires assembling the arguments into a pseudo shell "command line" to assign a single string to ProcessStartInfo.Arguments
, only to split them back into arguments later - I've created https://github.com/dotnet/corefx/issues/23592 to address this.
Sadly, one of the few exceptions to those typical parsing rules is `cmd.exe`. Because of this, one cannot reliably call batch files with arbitrary arguments. (This is no problem of PowerShell, it's a cmd design problem.) Some arguments are impossible -- an uneven number of double quotes can only be sent in the last argument. To my knowledge there is no clean way to deal with this, therefore I think using the typical escaping rules is still the way to go. In many cases this is the correct way, as many batch files simply redirect their arguments to other executables and in those cases [these rules](https://msdn.microsoft.com/en-us/library/17w5ykft.aspx) apply. | ||
|
||
**Edit:** | ||
Optionally a special rule for batch files could be added: `"` in arguments of batch files won't be escaped according to the rules described in [Specification->Quoting](#quoting) -- instead, each literal `"` will be replaced by `""`. Many batch files seem to expect this and this way arguments won't be split into multiple `%1`,`%2`,... variables. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to state the perhaps obvious: in case of extension-less command names (e.g. foo
instead of foo.cmd
/ foo.bat
), PS first needs to determine if the command invoked happens to refer to a batch file or not.
|
||
### Quoting | ||
|
||
The decision if an argument needs to be quoted, will be simplified: Any Argument that contains `"`, `'` or a character that matches `char.IsWhiteSpace` will be quoted. To quote an argument (compatible to [MSVC rules](https://msdn.microsoft.com/en-us/library/17w5ykft.aspx) and `CommandLineToArgvW`): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seem to me that '
-containing tokens don't need quoting per se (I'm talking about embedded '
chars. in the literal resulting from PowerShell's own parsing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given PowerShell/PowerShell#4661, it's probably worth pointing out that whatever argument evaluates to the empty string must be represented as ""
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was also talking about "embedded '
chars". It is correct that these rules don't require quoting arguments containing '
.
The reason I included '
in the list of chars that require quoting, is that some applications mostly follow the MSVC rules, but additionally allow '
as a string delimiter (AFAIK cygwin, ruby). As I see no disadvantage for apps that strictly follow MSVC rules, I thought it might be worth to also quote arguments containing '
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty string
-> New proposal text:
Empty arguments and any argument that contains ", ' or a character that matches char.IsWhiteSpace will be quoted.
- Every occurrence of N times `\` followed by `"` will be replaced by (2*N+1) times `\` followed by `"`. (N ∈ {0,1,2,...}) | ||
- N times `\` at the end of the string is replaced by (2*N) times `\`. (N ∈ {0,1,2,...}) | ||
- `"` is added to the beginning and to the end of the string | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that CoreFX already has an internal implementation of the above rules: PasteArguments.Paste()
takes an enumerable collection of individual arguments and returns them as a "command line" string: https://github.com/dotnet/corefx/blob/5d3e42f831ec3d4c5964040e235824f779d5db53/src/Common/src/System/PasteArguments.cs
Update link for MSVC argument parsing rules
Quote empty arguments
Thanks for the RFC, @TSlivede! Sorry it took us so long to finally take a look at it. No quorum today (just @JamesWTruher, @SteveL-MSFT, and me), but we think that this makes a ton of sense irrespective of existing behavior and without regard for the breaking-ness of it. Now that we have experimental features, we think it's perfectly reasonable to go and implement this today behind an experimental feature flag, and we can figure out further down the line whether this is opt-in vs. opt-out behavior, whether there's some transition path, and if a preference variable is the right mechanism for turning it on and off. Certainly, implementing it behind an experimental feature flag will allow us to play with it and understand the extent of the breakage so we can make a call. @TSlivede given this is years later, are you still potentially interested in implementing this? |
potentially interested: Yes. But after almost 2 years, I need to make some slight adjustments. I can say more in a few days. |
Before changing the RFC, I'll ask for your thoughts here, so I don't have to change the RFC that often :-) I no longer want to add the In the RFC, I did not write anything about the behavior of I don't think, that this is a good option anymore, because now that There is another reason, why handling Option A: I therefore suggest this possible RFC text:
Option B: Alternatively, one could leave that behavior as it is:
Option C: I could possibly consider to actually replicate the quoting code to handle arguments before The advantage of option A over option B is, that it can be much better documented and explained. As by now the parameterBinder has the information which part of the argument was quoted, we could also consider to actually generate a CommandLine, that has partially quoted arguments. (Of course only for arguments that were partially quoted in the powershell command.) This would also require to have the quoting/escaping code in powershell, but it gives a real benefit: Currently this doesn't work: If you type in powershell:
then powershell sends this CommandLine to
and because The current workaround is this powershell command:
This works, because powershell tries to determine, if arguments already contain enough mached quotes, and if so, it doesn't add additional quotes. This workaround would no longer work after this RFC. The "correct"TM solution of this problem would be, that microsoft fixes However we could fix this for powershell by quoting only the part of the argument, that was previously quoted in the powershell command. The information is AFAIK available in the ParameterBinder because of the strange way, that globbing is implemented. |
Good points, @TSlivede.
I agree, so I don't think we should go out of our way to accommodate "rogue" programs - this should be a legitimate reason - perhaps the only remaining one - to use Unfortunately, as wonderful as Take @SteveL-MSFT's example: az "myargs&b" # az is az.cmd - a batch file - on Windows This breaks not only currently, but will also break if we use Note that this problem is limited to (a) Windows and (b) the fact the command line is (ultimately) parsed by another shell, Sadly, this means that we cannot use To be clear: I don't think we need to batch-file-friendly in the escaping- In short: On Windows, double-quoting must be applied not only if an argument contains a
Note:
|
As for the Options A vs. B vs. C: I think C is the best choice, though, as you point out, it requires our own implementation of the typical rules; given that need to accommodate |
Tagging in @theJasonHelmick to give a read over this one when he gets the chance |
As it is now again (almost) a year later, and there still are no answers of Powershell Team members to the questions in my comment I can now say, that I am no longer interested in implementing this, sorry. 🙁 |
Some more thoughts if someone wants to implement something:
While I agree, that this is definitely a good idea for arguments to batch files, I think that the case is less clear for calling
will no longer work. That's fine because even now I prefer to use
(I've described the If arguments with |
Another problem/inconsistency on windows: When one launches a script file by it's name (instead of explicitly calling the interpreter with the script file as argument), eg.
then powershell has to use one of the
where So far this is completely fine. However, that registry key can also contain placeholders of the form The problem is, that this code does not follow the same rules as the startup code of c/c++ application compiled with MCVS nor those of the Luckily this feature is AFAIK not used very often, so I don't think that it's necessary for powershell to consider this. But maybe someone else thinks differently about this, so I thought I'd mention it. |
As I'm giving up on this topic, I thought I could at least mention why:
If for some reason I get motivated enough in the future to approach this topic again, it's likely, that I will just implement something and create a pullrequest instead of again trying to discuss things in a RFC, sorry. 😟 |
@TSlivede Despite the Powershell Team's little interest, this is one of the most problematic issues with Powershell. It's a a major pain point for myself. I do plea with he Powershell team to strongly consider this PR and it's implications. |
Fully agreed, @musm. @TSlivede: Thanks for all the work you've done so far. In the hopes that this will pick up momentum again, let me address your
Thinking about this some more: I suggest special-casing calls to batch files only (commands that resolve to Special-casing is never a great solution, but I think it would ease a lot of pain that comes from calls such as (It is highly unfortunate that when a batch-file call from outside It is unfortunate that we therefore cannot take advantage of |
@joeyaiello, re the following and the way forward in general:
I've posted thoughts here. |
Apologies, @TSlivede. We're still working to get a handle on this older backlog of RFCs. Unfortunately, it's been very difficult to form a consensus around this issue (PowerShell/PowerShell#1995), particularly for the reasons you described with regards to inconsistent Windows executables. Right now, we're looking to improve this functionality within the work in PowerShell/PowerShell#14692 by driving an experimental feature to understand the real-world implications of the proposed fix, to be followed by a new RFC. More rationale posted in PowerShell/PowerShell#14747 |
Given all this, we're going to close this PR. Thank you for getting the conversation started with this proposal, @TSlivede |
This RFC suggests a change that fixes PowerShell/PowerShell#3049 / PowerShell/PowerShell#1995.
PS:
The pull request template suggests, that a new RFC should go into the "Draft" folder, however
https://github.com/PowerShell/PowerShell-RFC/blob/master/RFC0000-RFC-Process.md
mentions:
"New proposed drafts should be submitted as a Pull Request from the Author's fork into the Draft-Accepted folder."
So should I create a second Pull request for the "Draft Accepted" folder?
Link to file: https://github.com/TSlivede/PowerShell-RFC/blob/master/1-Draft/RFC%23%23%23%23-Improve-generation-of-argument-string-for-executables.md
This change is