-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
argparse does not accept options taking arguments beginning with dash (regression from optparse) #53580
Comments
Porting the a2x program to argparse from the now-deprecated optparse subtly breaks it when certain options are passed: $ a2x --asciidoc-opts --safe gitcli.txt
$ ./a2x.argparse --asciidoc-opts --safe gitcli.txt
usage: a2x [-h] [--version] [-a ATTRIBUTE] [--asciidoc-opts ASCIIDOC_OPTS]
[--copy] [--conf-file CONF_FILE] [-D PATH] [-d DOCTYPE]
[--epubcheck] [-f FORMAT] [--icons] [--icons-dir PATH] [-k]
[--lynx] [-L] [-n] [-r PATH] [-s] [--stylesheet STYLESHEET]
[--safe] [--dblatex-opts DBLATEX_OPTS] [--fop]
[--fop-opts FOP_OPTS] [--xsltproc-opts XSLTPROC_OPTS] [-v]
a2x: error: argument --asciidoc-opts: expected one argument Apparently argparse uses a heuristic to try to guess whether an argument looks like an argument or an option, going so far as to check whether it looks like a negative number (!). It should _never_ guess: the option was specified to take an argument, so the following argument should always be parsed as an argument. Small test case: >>> import optparse
>>> parser = optparse.OptionParser(prog='a2x')
>>> parser.add_option('--asciidoc-opts',
... action='store', dest='asciidoc_opts', default='',
... metavar='ASCIIDOC_OPTS', help='asciidoc options')
>>> parser.parse_args(['--asciidoc-opts', '--safe'])
(<Values at 0x7f585142ef80: {'asciidoc_opts': '--safe'}>, [])
>>> import argparse
>>> parser = argparse.ArgumentParser(prog='a2x')
>>> parser.add_argument('--asciidoc-opts',
... action='store', dest='asciidoc_opts', default='',
... metavar='ASCIIDOC_OPTS', help='asciidoc options')
>>> parser.parse_args(['--asciidoc-opts', '--safe'])
usage: a2x [-h] [--asciidoc-opts ASCIIDOC_OPTS]
a2x: error: argument --asciidoc-opts: expected one argument |
It seems like reasonable request to me to be able to allow such arguments, especially since optparse did and we want people to be able to use argparse as a replacement. Though in general I find argparse's default behavior more useful. Since argparse has been released, I'm thinking this still has to be a feature request, since argparse is *not* a drop-in replacement for optparse. |
For what it's worth, I have trouble seeing this as anything but a bug. I understand the motivation of trying to catch user errors, but in doing so, you're breaking with the behavior of every other option parsing library that I'm aware of, in favor of an arbitrary heuristic that sometimes guesses wrong. That's not the kind of behavior I expect from my Python libraries; I want them to do what I ask them to, not try to guess what I probably meant. |
I’m not sure I understand. Why is it useful for an option parsing library to heuristically decide, by default, that I didn’t actually want to pass in the valid option that I passed in? Shouldn’t that be up to the caller (or up to the program, if it explicitly decides to reject such arguments)? Keep in mind that the caller might be another script instead of a user. |
Well, even if you call it a bug, it would be an argparse design bug, and design bug fixes are feature requests from a procedural point of view. |
Note that the negative number heuristic you're complaining about doesn't actually affect your code below. The negative number heuristic is only used when you have some options that look like negative numbers. See the docs for more information: http://docs.python.org/library/argparse.html#arguments-containing Your problem is that you want "--safe" to be treated as a positional argument even though you've declared it as an option. Basically there are two reasonable interpretations of this situation. Consider something like "--conf-file --safe". Either the user wants a conf file named "--safe", or the user accidentally forgot to type the name of the conf file. Argparse assumes the latter, though either one is conceivable. Argparse assumes the latter because, while it occasionally throws an unnecessary exception, the other behavior would allow an error to pass silently. I'm definitely opposed to changing the default behavior to swallow some errors silently. If you'd like to propose an API for enabling such behavior explicitly and supply a patch and tests implementing it, I'll be happy to review it though. |
Yes it does: >>> import argparse
>>> parser = argparse.ArgumentParser(prog='a2x')
>>> parser.add_argument('--asciidoc-opts',
... action='store', dest='asciidoc_opts', default='',
... metavar='ASCIIDOC_OPTS', help='asciidoc options')
>>> parser.parse_args(['--asciidoc-opts', '-1'])
Namespace(asciidoc_opts='-1')
>>> parser.parse_args(['--asciidoc-opts', '-one'])
usage: a2x [-h] [--asciidoc-opts ASCIIDOC_OPTS]
a2x: error: argument --asciidoc-opts: expected one argument
No, it doesn’t matter whether --safe was declared as an option: argparse rejected it on the basis of beginning with a dash (as I demonstrated in my small test case, which did not declare --safe as an option, and again in the example above with -one).
But it’s not argparse’s job to decide that the valid option I passed was actually a typo for something invalid. This would be like Python rejecting the valid call Including these special heuristics by default, that (1) are different from the standard behavior of all other option parsing libraries and (2) interfere with the ability to pass certain valid options, only leads to strange inconsistencies between command line programs written in different languages, and ultimately makes the command line harder to use for everyone. The default behavior should be the standard one. |
I still disagree. You're giving the parser ambiguous input. If a parser sees "--foo --bar", and "--foo" is a valid option, but "--bar" is not, this is a legitimately ambiguous situation. Either the user really wanted "--bar", and the parser doesn't support it, or the "--bar" was meant to be the argument to the "--foo" flag. At this point, the parser must make an arbitrary decision, and argparse chooses the interpretation that the user wanted the "--bar" flag. I understand that you have a good use case for the other interpretation. That's why I suggest you come up with a patch that allows this other interpretation to be enabled when necessary. Changing the default behavior is really a non-starter unless you can propose a sensible transition strategy (as is always necessary for changing APIs in backwards incompatible ways). |
There is no ambiguity. According to the way that every standard option parsing library has worked for decades, the parser knows that --foo takes an argument, so the string after --foo is in a different grammatical context than options are, and is automatically interpreted as an argument to --foo. (It doesn’t matter whether that string begins with a dash, is a valid argument, might become a valid argument in some future version, looks like a negative number, or any other such condition.) arguments = *(positional-argument / option) [-- *(positional-argument)] This is just like how variable names in Python are in a different grammatical position than keyword argument names, so that Popen(shell) is not confused with Popen(shell=True). This is not ambiguity; it simply follows from the standard definition of the grammar. argparse’s alternative interpretation of that string as another option does not make sense because it violates the requirement that --foo has been defined to take an argument. The only justification for considering that input ambiguous is if you start assuming that argparse knows better than the user (“the user accidentally forgot to type the name of the conf file”) and try to guess what they meant. This violates the user’s expectations of how the command line should work. It also creates subtle bugs in scripts that call argparse-based programs (think about call(["program", "--foo", foo_argument]) where foo_argument comes from some complex computation or even untrusted network input).
This would not be a backwards incompatible change, since every option that previously parsed successfully would also parse in the same way after the fix. |
Er, obviously positional arguments before the first ‘--’ can’t begin with a dash (I don’t think there’s any confusion over how those should work). The point was just that the grammar unambiguously allows the argument of --foo to be any string. |
It *would* be a backwards incompatible change. Currently, if I have a parser with both a "--foo" and a "--bar" option, and my user types "--foo --bar", they get an error saying that they were missing the argument to "--foo". Under your proposal, the "--foo" option will now silently consume the "--bar" option without an error. I know this is good from your perspective, but it would definitely break some of my scripts, and I imagine it would break other people's scripts as well. As I keep saying, I'm happy to add your alternative parsing as an option (assuming you provide a patch), but I really don't think it's the right thing to do by default. Most command line programs don't have options that take other option-like things as arguments (which is the source of your problem), so in most command line programs, people want an error when they get an option they don't recognize or an option that's missing its argument. Under your proposal, more such errors will pass silently and will have to be caught by additional code in the script. |
The reporter imho is 100% right. Simply because of the fact that in the current situation, there is no way to supply an argument starting with a dash (not even for instance a filename). That is, of course, total nonsense to be dictated by the parser library. |
While I also dislike the existing behavior, note that you can get what you want by using an equal sign. >>> import argparse
>>> parser = argparse.ArgumentParser(prog='a2x')
>>> parser.add_argument('--asciidoc-opts',
... action='store', dest='asciidoc_opts', default=''
... metavar='ASCIIDOC_OPTS', help='asciidoc options')
>>> parser.parse_args(['--asciidoc-opts', '-1'])
Namespace(asciidoc_opts='-1')
>>> parser.parse_args(['--asciidoc-opts=-one'])
Namespace(asciidoc_opts='-one') I always use the equal sign, so I've never noticed this behavior before. I wish that help would display the equal sign, but that's another issue. |
Yeah, I agree it's not ideal, though note that basic unix commands have trouble with arguments staring with dashes: $ cd -links-/
-bash: cd: -l: invalid option
cd: usage: cd [-L|-P] [dir] If you're working with a file on a filesystem, the time honored workaround is to prefix with ./ $ cd ./-links-/
$ Anyway, it doesn't seem like anyone is offering to write up a patch to enable such an alternative parsing strategy, perhaps Eric's "=" workaround should be documented prominently somewhere? |
Documenting “--extra-args=--foo” or “--extra-args -- --foo” (untested, but should work) seems good. |
"--" won't work. Traditionally, this has been used to separate optional arguments from positional arguments. Continuing the "cd" example, that's what would let you cd into a directory whose name starts with a hyphen: $ cd -links-/
-bash: cd: -l: invalid option
cd: usage: cd [-L|-P] [dir]
$ cd -- -links-
$ This would also work with argparse: import argparse
parser = argparse.ArgumentParser(prog='cd')
parser.add_argument('-L', help='follow symbolic links')
parser.add_argument('-P', help='do not follow symbolic links')
parser.add_argument('dir', help='directory name')
print(parser.parse_args(['--', '-Links-'])) prints:
Continuing the example from my earlier post shows it won't work for values for optional arguments: >>> parser.parse_args(['--asciidoc-opts -- -one'])
usage: a2x [-h] [--asciidoc-opts ASCIIDOC_OPTS]
a2x: error: unrecognized arguments: --asciidoc-opts -- -one I believe it's only the '=' that will solve this problem. In fact, because of this issue, I suggest we document '=' as the preferred way to call argparse when optional arguments have values, and change all of the examples to use it. I also think it would cause less confusion (because of this issue) if the help output showed the equal sign. But I realize that's probably more controversial. |
There are some problems that ‘=’ can’t solve, such as options with nargs ≥ 2. optparse has no trouble with this: >>> parser = optparse.OptionParser()
>>> parser.add_option('-a', nargs=2)
>>> parser.parse_args(['-a', '-first', '-second'])
(<Values at 0x7fc97a93a7e8: {'a': ('-first', '-second')}>, []) But inputting those arguments is _not possible_ with argparse. >>> parser = argparse.ArgumentParser()
>>> parser.add_argument('-a', nargs=2)
>>> parser.parse_args(['-a', '-first', '-second'])
usage: [-h] [-a A A]
: error: argument -a: expected 2 argument(s) |
Good point, I hadn't thought of that. Maybe ArgumentParser needs a "don't try to be so helpful, parse like optparse" option. Which is what Steven suggested earlier, I believe. I'd take a crack at this if there's general consensus on that solution. We can change the documentation to point out the issue now, but the feature request can only go in 3.3. |
That would be a good first step. I continue to advocate making that mode the default, because it’s consistent with how every other command line program works[1], and backwards compatible with the current argparse behavior. As far as documentation for older versions, would it be reasonable to un-deprecate optparse until argparse becomes a suitable replacement? There are still lots of programmers working in Python 2.7. [1] bethard’s msg128047 is confusing positional arguments with option arguments. All UNIX commands that accept option arguments have no trouble accepting option arguments that begin with -. For example, ‘grep -e -pattern file’ is commonly used to search for patterns beginning with -. |
I'd also like to see this as the default. After all, presumably we'd like Python scripts to work like all other command line programs, and I too am unaware of any other option parsing library that works the way argparse does. But changing released behavior in the stdlib is problematic, as everyone knows. I'll look into producing a patch to add this as optional behavior, then we can separately think about changing the default. |
I don't think there's any sense in "un-deprecating" optparse because: (1) It's only deprecated in the documentation - there is absolutely nothing in the code to keep you from continuing to use it, and there are no plans to remove it from Python. (2) One (mis?)feature doesn't make the rest of the module useless. And yes Eric, it would be awesome if you could develop a patch that allows the alternate parsing to be enabled when someone wants it. We should think about deprecation strategy though. Maybe something like: == Python 3.3 == == Python 2.4 == == Python 2.5 == I'm not sure that's the right way to do it, but if the plan is to change the default at some point, we should make sure that we have a deprecation plan before we add the feature. |
s/2.4/3.4/ |
Which is why I suggested un-deprecating it in the documentation. (I want to avoid encouraging programmers to switch away from optparse until this bug is fixed.)
Perhaps you weren’t literally proposing “error_on_unknown_options=False” as the name of the new flag, but note that neither the current nor proposed behaviors have nothing to do with whether arguments look like known or unknown options. Under the proposed behavior, anything in argument position (--asciidoc-opts ___) is parsed as an argument, no matter what it looks like. So a more accurate name might be “refuse_dashed_args=False”, or more generally (in case prefix_chars != '-'), “refuse_prefixed_args=False”? |
@Éric: yes, thanks! @anders: The reason the current implementation gives you the behavior you don't want is that the first thing it does is scan the args list for things that look like flags (based on prefix_chars). It assumes that everything that looks like a flag is intended to be one, before it ever looks at how many arguments the flag before it takes or anything like that. This is the source of your problem - argparse assumes "-safe" is a flag, and as a result, there is no argument for "--asciidoc-opts'. So perhaps a better name would be something like dont_assume_everything_that_looks_like_a_flag_is_intended_to_be_one. ;-) |
Steven: Yes, the current structure of the first pass scan makes any patch problematic. It really would be an implementation of a different algorithm. I'm still interested in looking at it, though. |
Without guessing which args are options, I don't see how it's possible to implement parse_known_args(). I'd propose raising an exception if it's called and dont_assume_everything_that_looks_like_a_flag_is_intended_to_be_one (or whatever it ends up being called) is True. |
Maybe dont_assume_everything_that_looks_like_a_flag_is_intended_to_be_one should actually be a new class, e.g. parser = AllowFlagsAsPositionalArgumentsArgumentParser() Then you just wouldn't provide parse_known_args on that parser. |
[I doubt my terminology is exactly correct in this post, but I've tried my best to make it so.) The more I think about this the more I realize we can't implement a parser that doesn't make guesses about '-' prefixed args and that works with arparse's existing behavior with respect to optional arguments. For example: Unless the parser tries to guess that --bar is an optional argument by itself, it can't know that --foo has an argument or not. I guess it could look and say that if you called this with '--foo --baz', then '--baz' must be an argument for '--foo', but then you could never have an argument to '--foo' named '--bar', plus it all seems fragile. Maybe this new parser (as Steven described it) wouldn't allow a variable number of arguments to optional arguments? That is, nargs couldn't be '?', '*', or '+', only a number. |
My concern is not that optparse will be taken away. My concern is that the documentation incorrectly discourages its use. https://docs.python.org/3/library/optparse.html Given that the apparent conclusion of this bug is that argparse has also become too difficult to fix, either argparse should be deprecated for exactly the same reason, or optparse should be un-deprecated. Most programs don’t need the extra features of argparse, and optparse doesn’t have this bug, so optparse is a better default choice; the documentation should not be encouraging argparse over it. |
It is not possible to configure AWS Batch job definition parameters to pass `--key=Ref::some_argument` <https://docs.aws.amazon.com/batch/latest/userguide/job_definition_parameters.html#parameters=>, and `argparse` has a long-standing issue re. not being able to handle `--key -value`-type key/value pairs <python/cpython#53580>. The only known workaround is to use the deprecated `optparse` module.
It is not possible to configure AWS Batch job definition parameters to pass `--key=Ref::some_argument` <https://docs.aws.amazon.com/batch/latest/userguide/job_definition_parameters.html#parameters=>, and `argparse` has a long-standing issue re. not being able to handle `--key -value`-type key/value pairs <python/cpython#53580>. The only known workaround is to use the deprecated `optparse` module.
It is not possible to configure AWS Batch job definition parameters to pass `--key=Ref::some_argument` <https://docs.aws.amazon.com/batch/latest/userguide/job_definition_parameters.html#parameters=>, and `argparse` has a long-standing issue re. not being able to handle `--key -value`-type key/value pairs <python/cpython#53580>. The only known workaround is to use the deprecated `optparse` module. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Is it infeasible to just give callers the ability to explicitly disable argument permutation? |
I'm not sure what you mean by argument permutation, but personally I think that any change is ill-advised. |
I meant that the current behavior is morally equivalent to argument permutation from GNU's If any change to argparse is ill-advised, then that effectively means that argparse will receive no further development, just like optparse, so why is optparse soft-deprecated but not argparse? |
Perhaps you mean something else by “morally equivalent”, but I want to be super clear for anyone reading that this issue is not present in GNU So this isn’t some kind of GNU compatibility hack; it’s just a plain bug that the developers have decided not to fix. GNU |
I believe a better description is "can't fix without either a) breaking existing usage, or b) writing an entire new algorithm and trying to jam it into argparse without accidentally breaking anything". I believe it's better to have a different algorithm, independent of argparse. And at this point in Python's development, this library should be on PyPI, not in the stdlib. Many such libraries are already on PyPI, such as click and appeal. |
I'm hitting this error on Mac. Try to pass to the flag |
If you want to stick to argparse, I think "--name-suffix=-pr-envs-{buildid}" will work. Or you could switch to a different parser, like optparse, click, or appeal. argparse cannot handle this case without the equal sign. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: