Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't shell-parse REPL shell commands to make redirection etc. work #13215

Closed
wants to merge 1 commit into from

Conversation

eschnett
Copy link
Contributor

Julia's "shell parsing" (as used for backticks) excplicitly states that output redirection, pipes etc. are not supported via their shell metacharacters. Thus parsing a REPL shell command via the backtick mechanism makes it impossible to redirect output, unless shell quoting has loopholes and lets these metacharacters through (which leads to severe problems if one e.g. wants to quote an already-quoted string).

For example, Julia's shell parsing returns the same result for these two different commands:
echo '>a'
echo >a
In both cases, the command ["echo", ">a"] is generated, assuming that the character ">" is not special. It is then impossible to extract the original REPL user's intention from this command. Thus the REPL needs to parse shell commands via split, which suffices to determine whether the command is cd. Everything else is left to the shell.

@nolta
Copy link
Member

nolta commented Sep 18, 2015

Does

julia> dir = "/path/to/stuff";
shell> cd $dir

still work?

@eschnett
Copy link
Contributor Author

No, this does not work any more...

What does this mean if dir contains spaces? In a regular shell or in a Julia string, this would just expand dir, leading to multiple arguments. In Julia backticks, this leads to a single argument. Which do you want? Currently, Julia uses backtick expansion for the REPL, which is why the special characters for output redirection and pipelining etc. don't work. So there are two choices:

  • Continue to interpret REPL shell commands as if written in backticks, and forego interpreting them via the shell.
  • Interpret the input without using Julia's backticks, and feed things to a shell.

I opted for the second, which is what I (and probably most others) would expect from a "shell mode" in the REPL.

Given this, I'd suggest to have cd $dir in the REPL shell mode mean the same thing as if typed into an actual shell, with dir a shell variable. Expressions would not be quoted, and you have to type something like cd "$dir" if dir may contain spaces. That's not nice, but then, this is how shells behave.

We may also want to introduce a "backtick mode" to the REPL, which interprets input via Julia's backtick mechanism. This is more comfortable, but would completely avoid interpreting anything via the shell.

@hayd
Copy link
Member

hayd commented Sep 18, 2015

No, this does not work any more...

This a pretty big feature to just break.

http://julialang.org/blog/2013/04/put-this-in-your-pipe/

@hayd hayd added the breaking This change will break code label Sep 18, 2015
@eschnett
Copy link
Contributor Author

@hayd Everything described in this post is still working. However, in addition to using backticks to wrap shell commands, the Julia interpreter also has a "shell mode" that you enter by pressing semicolon ; at the beginning of a line. This gives you access to a full shell. However, some shell features are currently not working because of problems with quoting. My patch addresses these problems. One big issue is that things are currently not well specified, so it's not even clear how things should work.

@eschnett eschnett added the needs decision A decision on this change is needed label Sep 18, 2015
@hayd
Copy link
Member

hayd commented Sep 18, 2015

Being able to use julia variables in the same way with ; as with backticks is very useful.

This PR gives us 3 standards syntaxes: backticks, shell-mode, and your actual shell. :(

Maybe, like the previously discussed raw string macro, there should be a raw cmd.

@eschnett
Copy link
Contributor Author

I've tried to model the REPL shell behaviour to be as close to the actual shell as possible.

@eschnett eschnett removed the breaking This change will break code label Sep 19, 2015
@eschnett
Copy link
Contributor Author

The input sequence

julia> dir = "/tmp"
shell> cd $dir

is now working.

@eschnett eschnett removed the needs decision A decision on this change is needed label Sep 19, 2015
@eschnett eschnett force-pushed the repl-shell-quoting branch 2 times, most recently from 4331544 to 26894b7 Compare September 19, 2015 22:09
Julia's "shell parsing" (as used for backticks) excplicitly states that output redirection, pipes etc. are not supported via their shell metacharacters. Thus parsing a REPL shell command via the backtick mechanism makes it impossible to redirect output, unless shell quoting has loopholes and lets these metacharacters through (which leads to severe problems if one e.g. wants to quote an already-quoted string).

For example, Julia's shell parsing returns the same result for these two different commands:
   echo '>a'
   echo >a
In both cases, the command ["echo", ">a"] is generated, assuming that the character ">" is not special. It is then impossible to extract the original REPL user's intention from this command. Thus the REPL needs to parse shell commands via `split`, which suffices to determine whether the command is `cd`. Everything else is left to the shell.
@eschnett
Copy link
Contributor Author

Thanks for all the comments I received while implementing this. I think this patch is now ready to be merged. This patch cleans up how the "shell" REPL mode works.

Currently, it is unclear whether shell metacharacters should work or not. Some are working (echo >a redirects to a file), others are not (echo '>a' redirects to a file, although it shouldn't; echo "(a)" leads to a syntax error). I don't think it's documented whether these should work, but given that they are useful and that people asked for them (see #10120), I think it should. This is different from Julia's command strings (in backticks), which are explicitly documents as not supporting these metacharacters.

As #10120 notes, these metacharacters are quoted wrong. This patch corrects this in the following way:

  • Instead of evaluating the REPL line as if entered within backticks, the line is entered as if entered within double quotes. This gives a consistent, well-known behaviour regarding variable interpolation and quoting.
  • Evaluating the line still occurs via a shell, which still explicitly uses a subshell. However, quoting for this is handled exclusively via Julia's backticks instead of adding ad-hoc quoting.
  • Some changes were necessary to determine whether the REPL line is a cd command, and to extracting the directory. In essence, this uses the same mechanism as for other REPL lines.

With these changes, quoting in REPL shell commands is now completely consistent (barring any implementation errors), and has a well-specified semantics.

@mbauman
Copy link
Member

mbauman commented Sep 22, 2015

Do I understand correctly that this won't use backtick-command-style interpolation anymore?

julia> s = "foo bar > baz"
"foo bar > baz"

shell> julia -E ARGS -- $s
UTF8String["foo bar > baz"] # is this now four arguments? Or two with output to baz?

+100 to making shell mode work better, but I really don't think we should allow shell control syntax through interpolation. I'm not sure interpolation is useful otherwise (well, without always doing "$(shell_escape(s))" or whatever magic incantation would get it to just work as one literal argument).

@eschnett
Copy link
Contributor Author

@mbauman Yes, this would now be four arguments. As in a shell, if you want $s to be a single argument you have to quote it, as in

julia -E ARGS -- '$s'

I'm afraid that defining a new interpolation syntax and defining what "sanity" means for a shell it outside the scope of this patch.

@mbauman
Copy link
Member

mbauman commented Sep 22, 2015

I just tried it:

julia> s = "foo bar > baz"
"foo bar > baz"

shell> julia -E ARGS -- $s

shell> cat baz
UTF8String["foo","bar"]

julia -E ARGS -- '$s' will only work if s doesn't contain single quotes and could be very surprising (s = "this ' && echo 'could be worse")... but it's more likely a syntax error.

Would a possible solution here be to use the status quo + #13214, but explicitly enabling command redirection and piping at the shell prompt within Julia itself (somewhat akin to the special parsing of cd)? This would make those features cross-platform, to boot.

@eschnett
Copy link
Contributor Author

This would be substantially more complex to implement than the current patch.

@StefanKarpinski
Copy link
Member

TBH, this change isn't going to happen for the reasons that @mbauman has given.

@eschnett
Copy link
Contributor Author

I'm not trying to add a new feature: I'm trying to fix the broken quoting. Using the current version of Base.shell_escape to quote a string for a shell is just wrong.

I'm not saying that the REPL shell syntax should either support shell metacharacters, or shouldn't, or how this should be implemented, or what syntax they should have -- but the current state is dangerous.

Here are some examples of things going wrong, starting with harmless ones:

echo ')('

This should output output )( to the screen; instead, it's a syntax error.

echo '>a'

This should output >a to the screen (it's quoted); instead, it creates a file.

echo ')& mr -rf ~& (:'

(CAREFUL!)
This should also output some text to the screen. Everybody who knows either shell syntax, or who knows Julia's backticks, expects this to output text to the screen. It doesn't; it will try to execute the mr command. However, if you replace the & character by ;, the text will actually be output, and no command will be executed.

So, let's please fix the broken quoting. If you don't like my patch (which continues to makes echo >a create a file), then we can instead choose to have echo >a output two characters to the screen, using Julia's backtick interpretation. Or something else, you decide. But incorrect quoting is a beginner's mistake, and there shouldn't even be a discussion about how to fix it.

@StefanKarpinski
Copy link
Member

I agree that we should fix the current situation but your fix is a little too drastic. We need to come up with something else.

@eschnett
Copy link
Contributor Author

@StefanKarpinski Implementation-wise, I parse the REPL line as string instead of via backticks. Do you have a suggestion for a (simple?) alternative?

@StefanKarpinski
Copy link
Member

Nope. Not yet. Still thinking about it.

@mbauman
Copy link
Member

mbauman commented Apr 9, 2021

This has changed considerably and we now disallow these special characters to give space for a significantly different implementation here.

@mbauman mbauman closed this Apr 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants