Error on input `--a b` #52

Jasha10 · 2023-07-17T23:26:16Z

I have a reproducible error when the bot is given input string --a b.

$ gpt
Hi! I'm here to help. Type :q or Ctrl-D to exit, :c or Ctrl-C and Enter to clear
the conversation, :r or Ctrl-R to re-generate the last response. To enter
multi-line mode, enter a backslash \ followed by a new line. Exit the multi-line
mode by pressing ESC and then Enter (Meta+Enter). Try :? for help.
> --a b
Invalid argument: a. Allowed arguments: ['model', 'temperature', 'top_p']
Invalid argument: a. Allowed arguments: ['model', 'temperature', 'top_p']
NoneType: None

$ gpt --version
gpt-cli v0.1.3

Here is the output from running `$ gpt --log_file log.txt --log_level DEBUG`:

$ cat log.txt
2023-07-17 18:26:33,531 - gptcli - INFO - Starting a new chat session. Assistant config: {'messages': [], 'temperature': 0.0, 'model': 'gpt-4'}
2023-07-17 18:26:33,539 - gptcli-session - INFO - Chat started
2023-07-17 18:26:33,539 - asyncio - DEBUG - Using selector: EpollSelector
2023-07-17 18:26:35,314 - gptcli-session - ERROR - Invalid argument: a. Allowed arguments: ['model', 'temperature', 'top_p']
NoneType: None

For context, this error came up when I copy/pasted the following rustc error message into the cli using multiline> mode:

 15 | ) -> impl Iterator<Item = AdaptedRecord> {
    |      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `()` is not an iterator
    |
    = help: the trait `Iterator` is not implemented for `()`

 For more information about this error, try `rustc --explain E0277`

This resulted in Invalid argument: explain. Allowed arguments: ...

The text was updated successfully, but these errors were encountered:

Jasha10 · 2023-07-17T23:35:28Z

It looks like the root cause is that the parse_args function matches the regex pattern --(\w+)(?:\s+|=)([^\s]+) and treats it specially.

@kharvd, I'd like to submit a PR that allows the user to disable this regex matching behavior. Would that be acceptable to you?

This is a workaround for kharvd#52 Previously, submitting strings such as `--a b` to the CLI would result in an error message.

kharvd · 2023-07-18T01:58:39Z

Thanks for a detailed report. If you'd like to add a config option to disable this (and maybe even make it disabled by default), I'm on board - just want to make sure the option is there, because I use --model a lot to run the same prompt with a different model.

One alternative idea is to allow switching the model and other args via : commands now that we have :q and friends: e.g. :model gpt-4 or :temperature 0.0

Jasha10 · 2023-07-18T08:00:47Z

One alternative idea is to allow switching the model and other args via : commands

Let me make sure I understand what you have in mind: if you write :model gpt-4, would this change the model for all subsequent messages?
I believe the current behavior is that Message ... --model gpt-4 changes the model to gpt-4 only for the current message, and subsequent messages will revert back to the model that was configured at the start of the chat session.

wfjt · 2023-09-10T19:55:53Z

Was hit this as well by including a -----BEGIN AGE ENCRYPTED FILE-----\n... string in multiline mode, got an error about -BEGIN being an invalid parameter. Quite annoying having to escape 20 times for the header and footer. I tried gpt -- but that made no difference, same error.

sghael · 2024-06-22T20:25:54Z

related comment: #21 (comment)

…otes. Fixes kharvd#52

Prompt parser will get tripped up if "--" is found anywhere in the prompt and isn't surrounded by specific deliminators. Instead let's force arguments to be at the beginning of the prompt and collect no more arguments after. This way users don't have to place flags inside deliminators and we extend what we can pass to the LLMs. More context is discussed in kharvd#52 with an example.

stevenwalton · 2024-07-31T09:23:20Z

The problem is not resolved and the error can still be hit if "--" is not contained in the specified deliminators.

I was having a lengthy conversation and got

Invalid argument: grep. Allowed arguments: ['model', 'temperature', 'top_p']

gpt --version shows gpt-cli v0.2.0 but I noticed that the sub-version was not bumped.

So instead I'll give you this to verify

$ which gpt
/Users/steven/.mamba/bin/gpt
$ cat /Users/steven/.mamba/lib/python3.12/site-packages/gptcli/cli.py | grep "def parse_args" -A 34
def parse_args(input: str) -> Tuple[str, Dict[str, Any]]:
    # Extract parts enclosed in specific delimiters (triple backticks, triple quotes, single backticks)
    extracted_parts = []
    delimiters = ['```', '"""', '`']

    def replacer(match):
        for i, delimiter in enumerate(delimiters):
            part = match.group(i + 1)
            if part is not None:
                extracted_parts.append((part, delimiter))
                break
        return f"__EXTRACTED_PART_{len(extracted_parts) - 1}__"

    # Construct the regex pattern dynamically from the delimiters list
    pattern_fragments = [re.escape(d) + '(.*?)' + re.escape(d) for d in delimiters]
    pattern = re.compile('|'.join(pattern_fragments), re.DOTALL)

    input = pattern.sub(replacer, input)

    # Parse the remaining string for arguments
    args = {}
    regex = r'--(\w+)(?:=(\S+)|\s+(\S+))?'
    matches = re.findall(regex, input)

    if matches:
        for key, value1, value2 in matches:
            value = value1 if value1 else value2 if value2 else ''
            args[key] = value.strip("\"'")
        input = re.sub(regex, "", input).strip()

    # Add back the extracted parts, with enclosing backticks or quotes
    for i, (part, delimiter) in enumerate(extracted_parts):
        input = input.replace(f"__EXTRACTED_PART_{i}__", f"{delimiter}{part.strip()}{delimiter}")

    return input, args

I imported the above lines into a python terminal and passed my prompt through it and found that args captured {'grep': 'renderer'}.
I was able to verify that the following line, in isolation, gets captured by args.

>  run journalctl -b 0 --grep "renderer for":

The issue is that I did not surround the command with markdown command backticks.

This has made me realize that there is a larger parsing issue at hand here.

First, I think the regex is inaccurate. It looks for any case of the flag pattern. But instead, we should look for the flag pattern if it appears at the at the beginning of the prompt. Maybe I'm lacking a bit of imagination, but I do not think you can change the {temperature,model,top_p} mid prompt and am pretty confident this isn't supported. So I think it is fine to force flags to be at the beginning of a prompt.

pattern = \
    r"(?:--|:)" \            # Find -- or : but don't put in capture group
    r"(\w[^\s=]*)" \         # flag keyword starts with a letter and excludes whitespace or =
    r"(?:[ =])" \            # Deliminator of space or equal sign, not in capture group
    r"(\w[^\s=]*|\d+[.]\d*)" # Flag argument begins with letter and doesn't have space or = or is a number (int or float) 

args = re.findall(pattern,         # get all flag pairs
                  re.match(        # limit matching to sequential flags at beginning of prompt
                      r"^(" \      # starting anchor and open capture group 
                      + pattern \ 
                      + r"\s?)*",  # look for patterns but allow ending in space
                  input
                  ).group()        # converts back to string
)

(?:--|:)(\w[^\s=]*)(?:[ =])(\w[^\s=]*|\d+[.]\d*) matches anything in the flag formulation with a key/argument pair, so we look for a sequence of flags and argument pairs that only appear at the beginning of the prompt.

I believe this solves the problem?

I tested my string that failed earlier and the new result is []. Testing the string "--temperature=5.0 :model GPT-4 --top-p=1 --top_p=5.7 hello world --newflag=boop-de --bad=flag=doop-de-doop" yields [('temperature', '5.0'), ('model', 'GPT-4'), ('top-p', '1'), ('top_p', '5.7')] and the string "--temperature=5.0 :model GPT-4 --top-p=1 --top_p=5.7 --newflag=boop-de --bad=flag=doop-de-doop hello world" yields [('temperature', '5.0'),('model', 'GPT-4'),('top-p', '1'),('top_p', '5.7'),('newflag', 'boop-de'),('bad', 'flag')]. Also, adding "a" to the beginning of the test strings and causes an empty result.

If I understand things correctly, we can remove all the logic from @sghael, including the deliminators which also allows us to create proper codeblocks with arbitrary number of '`' (as was used for this comment). So I believe we can rewrite as

def parse_args(input: str) -> Tuple[str, Dict[str, Any]]:
    pattern = r"(?:--|:)(\w[^\s=]*)(?:[ =])(\w[^\s=]*|\d+[.]\d*)"
    arg_pattern = r"^(" + pattern + r"\s?)*"
    args = re.findall(pattern, re.match(arg_pattern, input).group())
    args = dict((k,v) for k,v in args)
    input = re.sub(arg_pattern, "", input)
    return input, args

Have I missed anything? If not, I have a PR ready to go.

stevenwalton · 2024-09-13T23:15:11Z

@kharvd, you need to redo the check for my PR because it got stuck on a build (I added no build files...). It doesn't look like I can cause a rebuild and when I matched to the current branch it doesn't auto-recheck.

allow escapable blocks of text when wrapped in triple backticks or quotes. Fixes kharvd#52

Jasha10 added a commit to Jasha10/gpt-cli that referenced this issue Jul 17, 2023

i#52: comment out parse_args function in cli.py

14fd1a4

This is a workaround for kharvd#52 Previously, submitting strings such as `--a b` to the CLI would result in an error message.

sghael added a commit to sghael/gpt-cli that referenced this issue Jun 22, 2024

allow escapable blocks of text when wrapped in triple backticks or qu…

c36913e

…otes. Fixes kharvd#52

sghael mentioned this issue Jun 22, 2024

Escapable blocks of text #83

Merged

kharvd closed this as completed in #83 Jul 24, 2024

kharvd closed this as completed in 67491ba Jul 24, 2024

stevenwalton mentioned this issue Jul 31, 2024

Fixing Prompt Parser #85

Closed

CrustyCode pushed a commit to CrustyCode/gpt-cli-clean that referenced this issue Nov 11, 2024

Escapable blocks of text (kharvd#83)

9e31e62

allow escapable blocks of text when wrapped in triple backticks or quotes. Fixes kharvd#52

CrustyCode pushed a commit to CrustyCode/gpt-cli-clean that referenced this issue Nov 11, 2024

Escapable blocks of text (kharvd#83)

824fa13

allow escapable blocks of text when wrapped in triple backticks or quotes. Fixes kharvd#52

TimDaub mentioned this issue Dec 5, 2024

--abc causes headaches #101

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error on input `--a b` #52

Error on input `--a b` #52

Jasha10 commented Jul 17, 2023 •

edited

Loading

Jasha10 commented Jul 17, 2023

kharvd commented Jul 18, 2023

Jasha10 commented Jul 18, 2023

wfjt commented Sep 10, 2023

sghael commented Jun 22, 2024

stevenwalton commented Jul 31, 2024

stevenwalton commented Sep 13, 2024

Error on input --a b #52

Error on input --a b #52

Comments

Jasha10 commented Jul 17, 2023 • edited Loading

Jasha10 commented Jul 17, 2023

kharvd commented Jul 18, 2023

Jasha10 commented Jul 18, 2023

wfjt commented Sep 10, 2023

sghael commented Jun 22, 2024

stevenwalton commented Jul 31, 2024

stevenwalton commented Sep 13, 2024

Error on input `--a b` #52

Error on input `--a b` #52

Jasha10 commented Jul 17, 2023 •

edited

Loading