-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REPL shell mode for Windows #33474
REPL shell mode for Windows #33474
Conversation
include("regex.jl") | ||
include("shell.jl") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is needed since we use substitution string in the shell.jl
file.
base/client.jl
Outdated
@@ -31,8 +31,15 @@ stackframe_lineinfo_color() = repl_color("JULIA_STACKFRAME_LINEINFO_COLOR", :bol | |||
stackframe_function_color() = repl_color("JULIA_STACKFRAME_FUNCTION_COLOR", :bold) | |||
|
|||
function repl_cmd(cmd, out) | |||
shell = shell_split(get(ENV, "JULIA_SHELL", get(ENV, "SHELL", "/bin/sh"))) | |||
shell = get(ENV, "JULIA_SHELL", nothing) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we cache this value in a reference? So that we don't have to look it up each time the Julia REPL is entered?
command = Cmd(`$shell /c $(shell_escape_CMDly(shell_escape_winsomely(cmd)))`, windows_verbatim=true) | ||
elseif shell_name in ("powershell", "pwsh") | ||
command = Cmd(`$shell -Command $(shell_escape_PWSHly(shell_escape_winsomely(cmd)))`, windows_verbatim=true) | ||
elseif shell_name == "busybox" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really necessary ? We don't even ship with busy box anymore...
I think this is from older julia code, so my suggestion is to just remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arg .. looks like it is needed for testing on windows... maybe we should look into WSL for testing on windows instead of downloading busybox?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still something about this rubs me the wrong way. We should still try to remove this and try to work around them in the tests. Thoughts welcome
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should make this the else
block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exactly do you mean by "making this the else block"?
Do you mean :
if shell_name == "cmd"
command = _CMD_execute(shell, cmd)
elseif shell_name in ("powershell", "pwsh")
command = _powershell_execute(shell, cmd)
else
command = `$shell sh -c $(shell_escape_posixly(cmd))`
Is the assumption here that if you don't want cmd
or powershell
then you probably want it to default to use a linux-based shell running in a windows environment?
base/shell.jl
Outdated
function print_shell_escaped_BATCHly(io::IO, arg::AbstractString) | ||
# see https://www.robvanderwoude.com/variableexpansion.php | ||
# the rule for ! is a bit esoteric, but doesn't hurt to add it | ||
any(in("\r\n"), arg) && throw("Encountered unsupported character by CMD") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this an ArgumentError
error? same for above
base/shell.jl
Outdated
|
||
function print_shell_escaped_BATCHly(io::IO, arg::AbstractString) | ||
# see https://www.robvanderwoude.com/variableexpansion.php | ||
# the rule for ! is a bit esoteric, but doesn't hurt to add it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's just because folks are bad at it. It seems clearly part of the list to me. Some sources list (
and )
, but—as explained in https://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts—they are not needed. Nor is @
, as is claimed by this otherwise fairly good source: http://daviddeley.com/autohotkey/parameters/parameters.htm#WINCMDRULES
5e5694a
to
083cb4a
Compare
base/shell.jl
Outdated
|
||
function print_shell_escaped_PWSHly(io::IO, arg::AbstractString) | ||
arg = replace(arg, r"[\$#;]" => s"`\0") | ||
arg = replace(arg, r"[\"|\\\\(?=\\\\*(\"|$))]" => s"\\\0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like since we are launching the process from a start-process
we need to additionally escaping "
and \
using this regex, which is described in the excellent stackoverflow answer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That left me even more confused. It ends with but it’s not this simple, and the rules have changed over time.
One thing I’m fairly confident of is that this pwsh function needs us to add \r and \n to the winsomely whitespace list to pass through correctly here (no effect if it doesn’t go to pwsh, so might as well always do it there).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I’m fairly confident of is that this pwsh function needs us to add \r and \n to the winsomely whitespace list to pass through correctly here (no effect if it doesn’t go to pwsh, so might as well always do it there).
Why's that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we needed powershell to work for this example
shell> gci C:\\Program\ Files
Note
julia> Base.shell_escape_winsomely(Base.shell_split("gci C:\\\\Program\\ Files")...) |>print
gci "C:\Program Files"
Running
gci "C:\Program Files"
in a powershell environment works. But not in julia unless we
do the additional escaping so that powershell interprets the command as a single string.
arg = replace(arg, r"[\"|\\\\(?=\\\\*(\"|$))]" => s"\\\0")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vtjnash Please see
The string needs escaping, however, in order to be passed to the new Powershell process unmodified.
" instances embedded in the string are stripped, unless they are either represented as " or tripled (""").
(" rather than the usual `" is needed to escape double quotes in this case, because PowerShell expects " when passing a command to the powershell.exe executable.)
I feel confident that we now have a correct escaping solution to powershell and also for CMD
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure what your source was for gci
, so I wrote a simple one to be able to inspect what reaches the target application. Save as gci.c
, compile with cc gci.c
, then run as a.exe
.
#include <windows.h>
#include <stdio.h>
int main (int argc, char **argv) {
printf("%s\n", GetCommandLineA());
for (int i = 0; i < argc; i++) { printf("%d: %s\n", i, argv[i]); }
}
base/shell.jl
Outdated
shell_escape_CMDly(arg::AbstractString) = sprint(print_shell_escaped_CMDly, arg) | ||
|
||
function print_shell_escaped_PWSHly(io::IO, arg::AbstractString) | ||
arg = replace(arg, r"[\$#;]" => s"`\0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\$
prevents powershell variables, ;
prevents executing two commands, #
escapes comment character
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arg = replace(arg, r"[\$#;]" => s"`\0") | |
arg = replace(arg, r"[`\"\$#;|><&(){}=]" => s"`\0") |
(eg. see expanded list at https://www.red-gate.com/simple-talk/wp-content/uploads/imported/2264-PS%20Quoting%20Wallchart_1_0_1.pdf, although I haven't yet found the case where =
is special, the rest are certainly good here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, I think it additionally also needs "\r" => "`r", "\n" => "`n", "\t" => "`t"
There was a bug here when someone passed |
@vtjnash how does this PR look now? |
Ok so I feel like I have finally converged to a solution here, and this should be ready for final review. |
My only question to @vtjnash is why do you feel like we also need to quote |
Should we default to |
base/shell.jl
Outdated
shell_escape_CMDly(arg::AbstractString) = sprint(print_shell_escaped_CMDly, arg) | ||
|
||
function print_shell_escaped_PWSHly(io::IO, arg::AbstractString) | ||
arg = replace(arg, r"[\$#;]" => s"`\0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arg = replace(arg, r"[\$#;]" => s"`\0") | |
arg = replace(arg, r"[`\"\$#;|><&(){}=]" => s"`\0") |
(eg. see expanded list at https://www.red-gate.com/simple-talk/wp-content/uploads/imported/2264-PS%20Quoting%20Wallchart_1_0_1.pdf, although I haven't yet found the case where =
is special, the rest are certainly good here)
base/client.jl
Outdated
command = Cmd(`$shell /c $(shell_escape_CMDly(shell_escape_winsomely(cmd)))`, windows_verbatim=true) | ||
elseif shell_name in ("powershell", "pwsh") | ||
# enclose in double quotes to enusre we prevent whitespace normalization | ||
command = Cmd(`$shell -Command "$(shell_escape_PWSHly(shell_escape_winsomely(cmd)))"`, windows_verbatim=true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
command = Cmd(`$shell -Command "$(shell_escape_PWSHly(shell_escape_winsomely(cmd)))"`, windows_verbatim=true) | |
command = Cmd(`$shell -Command "& $(shell_escape_PWSHly(shell_escape_winsomely(cmd)))"`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although this is only correct so long as powershell
is a no-op though. If something is getting intercepted by powershell, we need to use completely different rules. Something like:
command = Cmd(`$shell -Command "$(shell_escape_PWSHly(shell_escape_winsomely(cmd)))"`, windows_verbatim=true) | |
CommandType = read(`powershell -Command "Get-Command -- $(shell_escape_PWSH_cmdlet_ly(cmd.exec[1])) | Select-Object -ExpandProperty CommandType"`, String) | |
# TODO: while CommandType == "Alias"; CommandType = ...; end | |
if CommandType == "Application" | |
command = Cmd(`$shell -Command "& $(shell_escape_PWSHly(shell_escape_winsomely(cmd)))"`) | |
else # handle Function and Cmdlet # TODO: what is the proper handling for the other types (ExternalScript, Script, Workflow, Configuration, and Filter) | |
command = Cmd(`$shell -Command "& $(shell_escape_PWSH_cmdlet_ly(cmd))"`) | |
end |
base/shell.jl
Outdated
# Similarily, if the string as a whole or a double-quoted string inside the function body | ||
# ends in a non-empty run of \, that \ would be interpreted | ||
# as a an escape character so \ must be doubled | ||
arg = replace(arg, r"\"|\\(?=\\*(\"|$))" => s"\\\0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is all mistaken and false. I think I understand where it's coming from, it just doesn't match the behavior I observe in my testing for this stage in the pipeline.
arg = replace(arg, r"\"|\\(?=\\*(\"|$))" => s"\\\0") |
|
||
See also [`shell_escape_CMDly`](@ref). | ||
""" | ||
shell_escape_PWSHly(arg::AbstractString) = sprint(print_shell_escaped_PWSHly, arg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's my suggested source for the PWSH_cmdlet_ly
function I mention above:
shell_escape_PWSHly(arg::AbstractString) = sprint(print_shell_escaped_PWSHly, arg) | |
shell_escape_PWSHly(arg::AbstractString) = sprint(print_shell_escaped_PWSHly, arg) | |
function print_shell_escaped_PWSH_cmdlet_ly(io::IO, args::AbstractString...) | |
# often the shortest way to escape a powershell string is to double any single quotes and then wrap the whole thing in single quotes (alternatively, we could prefix all the non-word characters with a back-tick and replace newlines with \`r and \`n) | |
# but skip the escaping for common cases we always know are safe (e.g. so that named parameters are typically still interpreted correctly) | |
isword(c::AbstractChar) = '0' <= c <= '9' || 'a' <= c <= 'z' || 'A' <= c <= 'Z' || c == '_' || c == '\\' || c == ':' || c == '/' || c == '-' | |
join(io, (all(isword, arg) ? arg : "'" * replace(arg, "'" => "''") * "'" for arg in args), " ") | |
end | |
""" | |
shell_escape_PWSH_cmdlet_ly(args::AbstractString...) -> String | |
Escapes special characters so they can be appropriately used with a PowerShell cmdlet (such as `echo`). | |
See also [`shell_escape_PWSHly`](@ref) and [`shell_escape_winsomely`](@ref). | |
""" | |
print_shell_escaped_PWSH_cmdlet_ly(io::IO, args::AbstractString...) |
base/shell.jl
Outdated
shell_escape_CMDly(arg::AbstractString) = sprint(print_shell_escaped_CMDly, arg) | ||
|
||
function print_shell_escaped_PWSHly(io::IO, arg::AbstractString) | ||
arg = replace(arg, r"[\$#;]" => s"`\0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, I think it additionally also needs "\r" => "`r", "\n" => "`n", "\t" => "`t"
base/shell.jl
Outdated
|
||
function print_shell_escaped_PWSHly(io::IO, arg::AbstractString) | ||
arg = replace(arg, r"[\$#;]" => s"`\0") | ||
arg = replace(arg, r"[\"|\\\\(?=\\\\*(\"|$))]" => s"\\\0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure what your source was for gci
, so I wrote a simple one to be able to inspect what reaches the target application. Save as gci.c
, compile with cc gci.c
, then run as a.exe
.
#include <windows.h>
#include <stdio.h>
int main (int argc, char **argv) {
printf("%s\n", GetCommandLineA());
for (int i = 0; i < argc; i++) { printf("%d: %s\n", i, argv[i]); }
}
We should probably also allow |
6b46b98
to
0dfcedd
Compare
8f2b093
to
164f274
Compare
Introduce REPL shell mode on Windows. Co-authored-by: Mustafa M. <[email protected]> Co-authored-by: Jameson Nash <[email protected]>
if shell_name == "fish" | ||
shell_escape_cmd = "begin; $(shell_escape_posixly(cmd)); and true; end" | ||
command = `$shell -c $shell_escape_cmd` | ||
elseif shell_name == "pwsh" | ||
command = _powershell_execute(shell, cmd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this do the wrong escaping on Unix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I'm not sure it's worth the effort to support pwsh
on Unix. This is only used for interactive mode, after all.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Powershell's handling of arguments is known (PowerShell/PowerShell#1995) to be buggy, so can't reliably be used for launching programs anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vtjnash, that's argument handling on Windows. Presumably on Unix systems pwsh
just uses *argv
, no? I guess some amount of escaping is still needed for passing arguments through to subprocesses, but possibly the same problem does not appear on Unix because there is no CommandLineToArgvW
layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly, no (e.g. PowerShell/PowerShell#10440)
no CommandLineToArgvW layer
Sadly, it is currently present (although both broken and undocumented—and eventually supposed to be unnecessary)
should we try to get this in, in some form before feature freze? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove all of the powershell
-related content? The rest I think seems ready to go. We should add the powershell portions in a pair of followup PRs (one to add the shell_escape_PWSHly
layer and one to use it in the REPL)
command = Cmd(`$shell /c $(shell_escape_CMDly(shell_escape_winsomely(cmd)))`, windows_verbatim=true) | ||
elseif shell_name in ("powershell", "pwsh") | ||
command = Cmd(`$shell -Command $(shell_escape_PWSHly(shell_escape_winsomely(cmd)))`, windows_verbatim=true) | ||
elseif shell_name == "busybox" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should make this the else
block
run(Cmd(Cmd(["cmd /c \"break | echo \$(shell_escape_CMDly(shell_escape_CMDly(to_print)))"]), windows_verbatim=true)) | ||
``` | ||
""" | ||
shell_escape_CMDly(arg::AbstractString) = sprint(print_shell_escaped_CMDly, arg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to need tests. Also, can you split this aspect into a separate commit?
I think I'll just open a seperate PR with the suggestions |
function print_shell_escaped_CMDly(io::IO, arg::AbstractString) | ||
any(c -> c in ('\r', '\n'), arg) && throw(ArgumentError("Encountered unsupported character by CMD.")) | ||
# include " so to avoid toggling behavior of ^ | ||
arg = replace(arg, r"[%!^\"<>&|]" => s"^\0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arg = replace(arg, r"[%!^\"<>&|]" => s"^\0") | |
arg = replace(arg, r"[%)!^\"<>&|]" => s"^\0") |
Upon closer inspection of https://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts/4095133#4095133, I realized that )
is a metacharacter even though (
is not. (or we can include both here for symmetry)
I'm looking back at this PR and I really dread to revisit this PR 😆😆. PowerShell's handling of external arguments is so borked that the only reasonable solution is to use the |
Counter-proposal: do the same thing on Windows as we do everywhere else, i.e. just exec the cmd that's entered, but ship with busybox tools on Windows and put the path to the busybox tools in the Also: It would be great (and has been on my todo list for so long) to implement |
Or just put the path to busybox tools in the PATH for all Cmd evaluations? |
Closing as stale. I'll look at opening a new PR when I find a more satisfactory solution. Shipping busybox is reverting back to something that we removed and suboptimal (even though what we have currently is worse). Powershell folks have been trying to work around some issues with calling external commands. We updated some the |
Previous Attempts: #31490 #23703 #21294 #20776 #11478 #33427 #11478
Some of these approaches tried to implement special Windows rule in the
shell_escape
mechanism, I think some also tried to pass 'raw' strings to the shell, all with varying degrees of success.Fixes #6316 #23597
References
Powershell:
CMD: