Correct shell quoting in REPL #13195
Uh, wat?
Does this work for cases like
@vtjnash ...no. Oops.
This function is used to display
ae7871c to 489b7e5
Made shell quoting more comfortable so that
Corrected an error in the shell parser where backslashes within double quotes would not be properly handled. Added a test for this.
Changed the way the REPL parses shell arguments (see the respective commit). Don't use Julia's shell parser, which is the wrong tool, since Julia's backtick syntax is quite different from shell syntax, and does not support output redirection. Simply split the command via
489b7e5 to 51a292a
51a292a to bf8ecc9
Here is the use case that is corrected by my quoting changes. It has to do with double quoting, i.e. with using a
    cmd = `echo (hello)`
    run(cmd)
    ecmd = Base.shell_escape(cmd)   # escape cmd into a string that a shell can parse
    cmd2 = `sh -c $ecmd`            # hand the escaped string to a shell
    run(cmd2)
The first
With the quoting changes here, this properly outputs
Julia's "shell parsing" (as used for backticks) explicitly states that output redirection, pipes etc. are not supported via their shell metacharacters. Thus parsing a REPL shell command via the backtick mechanism makes it impossible to redirect output, unless shell quoting has loopholes and lets these metacharacters through (which leads to severe problems if one e.g. wants to quote an already-quoted string). For example, Julia's shell parsing returns the same result for these two different commands:
    echo '>a'
    echo >a
In both cases, the command ["echo", ">a"] is generated, assuming that the character ">" is not special. It is then impossible to extract the original REPL user's intention from this command. Thus the REPL needs to parse shell commands via `split`, which suffices to determine whether the command is `cd`. Everything else is left to the shell.
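The ambiguity described in this commit message can be checked directly. The following is a minimal sketch, assuming a Julia 1.x session and using `Base.shell_split`, an internal and unexported helper whose behaviour may differ between versions. It is used here as a stand-in for the word splitting that backticks perform; it is not necessarily the exact code path the PR touches.

```julia
# Base.shell_split is internal (unexported); treat this as an illustration only.
quoted   = Base.shell_split("echo '>a'")   # user wants a literal ">a" argument
unquoted = Base.shell_split("echo >a")     # user wants to redirect output to a file "a"

# Both inputs parse to the same word list, so the user's intent is lost.
println(quoted)
println(unquoted)
println(quoted == unquoted)
```

Since the parsed words are identical, nothing downstream of the parser can tell a redirection from a quoted argument, which is the motivation for leaving the raw line to the shell.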
Honestly, this seems to me to be a quixotic endeavor. Every shell has different quoting rules and significant characters. Julia's Cmd parsing is a minimal but common subset of this. If you're writing shell commands using
@StefanKarpinski The semantics of Also, the corrections (and the example above) are independent of the REPL. This is about the behaviour of Julia's backquotes to create I wouldn't call it "quixotic", as I'm claiming that it works now...
This code also gets used in non-posix shell code paths where the rules are quite different and likely under-tested. Jameson knows more and can judge the correctness there better than I can.
I think the second commit is on the right track, but you would also need to provide a custom parser to find and handle FWIW, I still think we should stop trying to use the shell at all and just directly implement
@StefanKarpinski I know that It seems you're opposed to making However, I want to add a docstring to (Actually, the first Google hit is https://groups.google.com/forum/#!msg/julia-users/vo-jupVWyhs/2mxhSHWmvgIJ, where you yourself suggest to use Maybe we should rename the function from Maybe we should introduce the function that I am trying to introduce as
I agree that we should add those features to backtick syntax (which wouldn't always produce single
It should probably be called
Regarding reimplementing shell features: I completely agree. Here is the description of my second commit -- I should have copied this into the pull request message:
Don't shell-parse REPL shell commands to make redirection etc. work
Julia's "shell parsing" (as used for backticks) explicitly states that output redirection, pipes etc. are not supported via their shell metacharacters. Thus parsing a REPL shell command via the backtick mechanism makes it impossible to redirect output, unless shell quoting has loopholes and lets these metacharacters through (which leads to severe problems if one e.g. wants to quote an already-quoted string). For example, Julia's shell parsing returns the same result for these two different commands:
    echo '>a'
    echo >a
In both cases, the command ["echo", ">a"] is generated, assuming that the character ">" is not special. It is then impossible to extract the original REPL user's intention from this command. Thus the REPL needs to parse shell commands via `split`, which suffices to determine whether the command is `cd`. Everything else is left to the shell.
@StefanKarpinski I don't think it breaks printing objects in backticks. I've added more ways to print -- in addition to escaping via single and double quotes, I also added escaping via backslashes. For example,
It would be totally fine to always escape characters that are special to Related: we may want to start making unescaped
bf8ecc9 to 48e0191
Yes, that's what the new escaping algorithm does. It has a white list of "good characters", and everything else is quoted. It tries quoting in three different ways, and chooses the shortest result.
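As an illustration of the "whitelist, three strategies, shortest wins" idea, here is a minimal sketch for Julia 1.x. It is not the PR's implementation: the names `is_safe`, `single_quoted`, `double_quoted`, `backslashed` and `quote_word` are hypothetical, the whitelist is an assumption, and the quoting rules are those of POSIX `sh` only.

```julia
# Assumed whitelist of characters that never need quoting (not the PR's actual list).
is_safe(c::Char) = (isascii(c) && isletter(c)) || isdigit(c) || c in "_-./=:@%+,"

# Strategy 1: single quotes; an embedded ' is written as '\''.
single_quoted(s) = "'" * replace(s, "'" => "'\\''") * "'"

# Strategy 2: double quotes; backslash-escape \ " $ and `, which stay special inside them.
double_quoted(s) = "\"" * join(c in "\\\"\$`" ? "\\$c" : string(c) for c in s) * "\""

# Strategy 3: backslash-escape every character outside the whitelist.
backslashed(s) = join(is_safe(c) ? string(c) : "\\$c" for c in s)

function quote_word(s::AbstractString)
    isempty(s) && return "''"              # an empty word must stay visible
    all(is_safe, s) && return String(s)    # nothing to quote
    candidates = [single_quoted(s), double_quoted(s)]
    # Backslash-newline is a line continuation in sh, so skip strategy 3 in that case.
    '\n' in s || push!(candidates, backslashed(s))
    # Pick the shortest candidate; ties go to the earlier strategy.
    return candidates[argmin(map(length, candidates))]
end

for word in ["hello", "(hello)", "a b", "it's", ""]
    println(repr(word), " => ", quote_word(word))
end
```

In this sketch the backslash form often wins by a single character (for example for `a b`), which is the readability trade-off debated in the review.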
    if isempty(cmd.exec)
        throw(ArgumentError("no cmd to execute"))
    elseif cmd.exec[1] == "cd"
        cmds = split(line)
Does this handle paths with spaces, e.g.
shell> cd f\ g
shell> cd 'f g'
shell> cd "f g"
?
No, that part seems pretty broken.
Good catch, this needs to use `Base.shell_split` instead.
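The difference between the two splitters on the three `cd` forms from the review can be sketched as follows. This assumes Julia 1.x; `Base.shell_split` is an internal, unexported helper, so its exact behaviour is version dependent.

```julia
# The three ways of writing a path with a space from the review comment.
lines = ["cd f\\ g", "cd 'f g'", "cd \"f g\""]

for line in lines
    naive  = split(line)              # whitespace only: quotes and backslashes are ignored
    parsed = Base.shell_split(line)   # honours quotes and backslash escapes
    println(line, "  split: ", length(naive), " words, shell_split: ", parsed)
end
```

With plain `split`, each line breaks into three words, so the `cd` target would be truncated at the space; `Base.shell_split` keeps `f g` together as one argument.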
I don't particularly care for the backslashing. Except for pathological cases, it can only ever be one character shorter and it's always less clear, imo. Otherwise this seems to be on the right track.
@eschnett, how about breaking this PR into two parts:
The first part is pretty uncontroversial and should be easy to merge. The second part needs more work.
I agree with @StefanKarpinski here about the backslashing, everything else is fine.
    # terminal.
    parline = "($line)"
    ttyflag = isa(STDIN, TTY) ? "-i" : ""
    unixcmd = `$shell $ttyflag -c $parline`
When stdin isn't a TTY, you get `$shell '' -c ...`, which will probably fail.
Yes it will. Got confused by string vs. command interpolation.
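The string vs. command interpolation difference can be shown with a small sketch for Julia 1.x. The helper name `shell_cmd` and the values `"sh"` and `"(echo hi)"` are hypothetical and chosen only for illustration. In a string, interpolating `""` adds nothing; in a command, it adds one empty argument, whereas an empty array adds no arguments at all.

```julia
shell = "sh"             # hypothetical shell for the example
parline = "(echo hi)"

# Interpolating an empty string into a command yields one empty argument.
ttyflag = ""
bad = `$shell $ttyflag -c $parline`
println(bad.exec)

# One possible fix: interpolate an array, which contributes zero or more arguments.
function shell_cmd(shell, line; interactive::Bool)
    flags = interactive ? ["-i"] : String[]
    return `$shell $flags -c $line`
end

println(shell_cmd(shell, parline; interactive=false).exec)
println(shell_cmd(shell, parline; interactive=true).exec)
```

The array form avoids special-casing the non-TTY branch, since an empty array simply disappears from the argument list.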
For backslashes, a good example will be escaping I'll prepare two separate pull requests.
Maybe require the backslashing approach to be shorter by two characters to win?
I already went for 10 characters (instead of two).
Fixes #10120.