Error on input --a b
#52
Comments
It looks like the root cause is that the
@kharvd, I'd like to submit a PR that allows the user to disable this regex matching behavior. Would that be acceptable to you?
This is a workaround for kharvd#52 Previously, submitting strings such as `--a b` to the CLI would result in an error message.
Thanks for the detailed report. If you'd like to add a config option to disable this (and maybe even make it disabled by default), I'm on board. I just want to make sure the option is there, because I use
One alternative idea is to allow switching the model and other args via
Let me make sure I understand what you have in mind: if you write
I was hit by this as well by including a
Related comment: #21 (comment)
The prompt parser gets tripped up if "--" is found anywhere in the prompt and isn't surrounded by specific delimiters. Instead, let's force arguments to be at the beginning of the prompt and collect no more arguments after that. This way users don't have to place flags inside delimiters, and we extend what we can pass to the LLMs. More context is discussed in kharvd#52 with an example.
The problem is not resolved, and the error can still be hit if "--" is not contained in the specified delimiters. I was having a lengthy conversation and got `Invalid argument: grep. Allowed arguments: ['model', 'temperature', 'top_p']`
So instead I'll give you this to verify:
$ which gpt
/Users/steven/.mamba/bin/gpt
$ cat /Users/steven/.mamba/lib/python3.12/site-packages/gptcli/cli.py | grep "def parse_args" -A 34
def parse_args(input: str) -> Tuple[str, Dict[str, Any]]:
    # Extract parts enclosed in specific delimiters (triple backticks, triple quotes, single backticks)
    extracted_parts = []
    delimiters = ['```', '"""', '`']

    def replacer(match):
        for i, delimiter in enumerate(delimiters):
            part = match.group(i + 1)
            if part is not None:
                extracted_parts.append((part, delimiter))
                break
        return f"__EXTRACTED_PART_{len(extracted_parts) - 1}__"

    # Construct the regex pattern dynamically from the delimiters list
    pattern_fragments = [re.escape(d) + '(.*?)' + re.escape(d) for d in delimiters]
    pattern = re.compile('|'.join(pattern_fragments), re.DOTALL)
    input = pattern.sub(replacer, input)

    # Parse the remaining string for arguments
    args = {}
    regex = r'--(\w+)(?:=(\S+)|\s+(\S+))?'
    matches = re.findall(regex, input)
    if matches:
        for key, value1, value2 in matches:
            value = value1 if value1 else value2 if value2 else ''
            args[key] = value.strip("\"'")
        input = re.sub(regex, "", input).strip()

    # Add back the extracted parts, with enclosing backticks or quotes
    for i, (part, delimiter) in enumerate(extracted_parts):
        input = input.replace(f"__EXTRACTED_PART_{i}__", f"{delimiter}{part.strip()}{delimiter}")

    return input, args

I imported the above lines into a Python terminal and passed my prompt through it and found that
The issue is that I did not surround the command with markdown backticks. This has made me realize that there is a larger parsing issue at hand here. First, I think the regex is inaccurate: it matches the flag pattern anywhere in the prompt. Instead, we should only match the flag pattern when it appears at the beginning of the prompt. Maybe I'm lacking a bit of imagination, but I do not think you can change the {temperature,model,top_p} mid-prompt, and I am pretty confident this isn't supported. So I think it is fine to force flags to be at the beginning of a prompt.

pattern = (
    r"(?:--|:)"               # match "--" or ":" without capturing it
    r"(\w[^\s=]*)"            # flag keyword: starts with a word character, no whitespace or "="
    r"(?:[ =])"               # delimiter: a space or an equals sign, not captured
    r"(\w[^\s=]*|\d+[.]\d*)"  # flag value: a word-like token or a number (int or float)
)
args = re.findall(
    pattern,  # get all flag pairs
    re.match(  # limit matching to sequential flags at the beginning of the prompt
        r"^(" + pattern + r"\s?)*",  # anchor at the start; allow a space after each flag
        input,
    ).group(),  # the matched prefix, as a string
)
I believe this solves the problem? I tested my string that failed earlier and the new result is
If I understand things correctly, we can remove all the logic from @sghael, including the delimiters, which also allows us to create proper code blocks with an arbitrary number of '`' (as was used for this comment). So I believe we can rewrite it as:

def parse_args(input: str) -> Tuple[str, Dict[str, Any]]:
    pattern = r"(?:--|:)(\w[^\s=]*)(?:[ =])(\w[^\s=]*|\d+[.]\d*)"
    arg_pattern = r"^(" + pattern + r"\s?)*"
    # re.match always succeeds here because arg_pattern can match the empty string
    args = dict(re.findall(pattern, re.match(arg_pattern, input).group()))
    input = re.sub(arg_pattern, "", input)
    return input, args

Have I missed anything? If not, I have a PR ready to go.
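To make the proposal above easier to check, here is a minimal, self-contained sketch of the anchored approach. The pattern is copied from the comment above; the helper name `parse_leading_args` and the sample prompts are hypothetical, and this is not the code that was merged into gptcli.

```python
import re
from typing import Dict, Tuple

# Flag pattern copied from the proposal in this thread.
FLAG = r"(?:--|:)(\w[^\s=]*)(?:[ =])(\w[^\s=]*|\d+[.]\d*)"
# Zero or more flags, anchored at the start of the prompt.
# A non-capturing outer group is used; only the full match is needed.
LEADING_FLAGS = re.compile(r"^(?:" + FLAG + r"\s?)*")


def parse_leading_args(prompt: str) -> Tuple[str, Dict[str, str]]:
    """Hypothetical helper: split leading flags from the rest of the prompt."""
    head = LEADING_FLAGS.match(prompt)  # always matches, possibly the empty string
    args = dict(re.findall(FLAG, head.group()))
    return prompt[head.end():], args


# Flags at the start are collected; a later "--color" is left in the prompt.
print(parse_leading_args("--model gpt-4 --temperature 0.2 explain grep --color"))
# A "--" sequence in the middle of a prompt is no longer treated as a flag.
print(parse_leading_args("how does git log --grep work?"))
# An input that itself starts with "--a b" is still parsed as a flag.
print(parse_leading_args("--a b"))
```

Under this sketch, mid-prompt `--` sequences pass through untouched, but a prompt that begins with `--a b` (the input in the issue title) is still read as an argument named `a`, so the caller would still need to decide how to handle unknown leading flags.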
@kharvd, you need to redo the check for my PR because it got stuck on a build (I added no build files...). It doesn't look like I can trigger a rebuild, and when I matched it to the current branch it didn't auto-recheck.
allow escapable blocks of text when wrapped in triple backticks or quotes. Fixes kharvd#52
I have a reproducible error when the bot is given the input string `--a b`.
Here is the output from running `$ gpt --log_file log.txt --log_level DEBUG`:
For context, this error came up when I copy/pasted the following rustc error message into the CLI using `multiline>` mode:
This resulted in `Invalid argument: explain. Allowed arguments: ...`
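The behavior reported here can be reproduced in isolation. The sketch below assumes the flag regex quoted from `gptcli/cli.py` elsewhere in this thread and the allowed argument names shown in the error message there; the helper `find_flags` and the rustc-style sample line are hypothetical stand-ins, since the original pasted message is not preserved.

```python
import re

# Flag regex as quoted from gptcli/cli.py elsewhere in this thread.
FLAG_REGEX = r'--(\w+)(?:=(\S+)|\s+(\S+))?'
# Allowed names as listed in the error message quoted in this thread.
ALLOWED = {"model", "temperature", "top_p"}


def find_flags(prompt: str) -> dict:
    """Hypothetical helper: collect every --key value pair the unanchored regex finds."""
    return {key: (v1 or v2) for key, v1, v2 in re.findall(FLAG_REGEX, prompt)}


for prompt in ["--a b", "For more information, try rustc --explain E0308"]:
    flags = find_flags(prompt)
    invalid = sorted(set(flags) - ALLOWED)
    print(prompt, "->", flags, "invalid:", invalid)
```

Because the regex is not anchored, any `--word` sequence in pasted text is picked up as an argument, which would explain why `a` and `explain` are reported as invalid.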