Trigger tool detection only on complete lines to help nested code block start detection #251

Merged

Conversation

jrmi
Contributor

@jrmi jrmi commented Nov 10, 2024

TL;DR: This MR fixes a bug that occurs in streaming mode when a markdown code block is nested inside a tooluse code block.

I think it's related to this issue #111

To reproduce

  • Start gptme
  • Submit the following prompt: Create the file foo.md and add a small explanation on how works the async in python with a code example

The response to this prompt looks something like this:

    [Some introduction text]

    ```save foo.md
    [In file explanation]

    ```python
    [python code block] 
    ```
    [In file conclusion]
    ```
    [end of llm answer]

In streaming mode, tooluse detection stops immediately after the three backticks that precede the python code block type, because the check runs after every single character.

Solution proposed

The idea of this MR is to wait for the end of a line (i.e. a \n char) before running the tool check. That way, we know the next three backticks aren't the end of the code block, since they are followed by the code block type (python here), and the tooluse detection algorithm can handle that.
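To make the difference concrete, here is a minimal sketch that compares checking after every char with checking only at line ends. It is not gptme's actual implementation: `is_block_closed` and `stop_index` are hypothetical names, and the rule "a fence with a type opens a block, a bare fence closes one" is an assumption made for illustration.

```python
from typing import Optional

FENCE = "`" * 3  # three backticks, built this way so the example embeds cleanly


def is_block_closed(text: str) -> bool:
    """Hypothetical check: True once the outermost fenced block has been closed.

    Assumed rule: a fence followed by a type opens a block,
    a bare fence closes the innermost open block.
    """
    depth = 0
    for line in text.split("\n"):
        stripped = line.strip()
        if not stripped.startswith(FENCE):
            continue
        if stripped != FENCE:
            depth += 1  # fence with a type: opens a (possibly nested) block
        elif depth > 0:
            depth -= 1  # bare fence: closes the innermost open block
            if depth == 0:
                return True
    return False


def stop_index(stream: str, per_line: bool) -> Optional[int]:
    """Feed the stream char by char; return how many chars were consumed
    when the check first reports a closed block (None if it never does)."""
    buf = ""
    for ch in stream:
        buf += ch
        if per_line and ch != "\n":
            continue  # proposed behaviour: only check on complete lines
        if is_block_closed(buf):
            return len(buf)
    return None


stream = (
    "Intro\n"
    f"{FENCE}save foo.md\n"
    "Explanation\n"
    "\n"
    f"{FENCE}python\n"
    "print('hi')\n"
    f"{FENCE}\n"
    "Conclusion\n"
    f"{FENCE}\n"
    "Trailing text\n"
)

early = stop_index(stream, per_line=False)  # stops inside the nested opener
late = stop_index(stream, per_line=True)    # stops at the real closing fence
print(repr(stream[:early][-20:]))
print(repr(stream[:late][-20:]))
```

With the per-char check, the buffer momentarily ends in a bare fence before the type arrives, so the check fires too early. With the per-line check, the type is already part of the line when the check runs.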

It's not perfect, as it still fails when the nested block has no code block type, but it should work in more situations while we wait for a more robust solution.
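The remaining limitation can be sketched as follows, again under an assumed rule (a fence with a type opens, a bare fence closes) and with a hypothetical helper name, `depth_after_each_line`. When the nested block has no type, its opening fence is indistinguishable from a closing fence:

```python
from typing import List

FENCE = "`" * 3  # three backticks


def depth_after_each_line(text: str) -> List[int]:
    """Nesting depth after each line, under the assumed open/close rule."""
    depth = 0
    depths = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            if stripped == FENCE:
                depth = max(depth - 1, 0)  # bare fence: treated as a close
            else:
                depth += 1  # fence with a type: treated as an open
        depths.append(depth)
    return depths


# Nested block with a type: the outer block closes on the last line.
tagged = "\n".join([f"{FENCE}save foo.md", f"{FENCE}python", "print(1)", FENCE, FENCE])
# Nested block without a type: the outer block appears closed on line 2.
untagged = "\n".join([f"{FENCE}save foo.md", FENCE, "print(1)", FENCE, FENCE])

print(depth_after_each_line(tagged))    # [1, 2, 2, 1, 0]
print(depth_after_each_line(untagged))  # [1, 0, 0, 0, 0]
```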

I also think waiting for the end of the line before trying to detect tools is a good idea in itself. I don't see any benefit in running the check on every single char, unless I missed something.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

👍 Looks good to me! Reviewed everything up to 2a3a910 in 28 seconds

More details
  • Looked at 48 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. tests/test_codeblock.py:61
  • Draft comment:
    Good addition of test_extract_codeblocks_unfinished_nested to verify behavior with unfinished nested code blocks.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The test test_extract_codeblocks_unfinished_nested is correctly added to verify the new behavior of detecting unfinished nested code blocks. This ensures that the tool detection logic is working as intended.

Workflow ID: wflow_2c382HSdk4AdGcsF


@jrmi jrmi changed the title from "Draft: Trigger tool detection only on complete lines to help nested code block start detection" to "Trigger tool detection only on complete lines to help nested code block start detection" on Nov 10, 2024
@ErikBjare
Owner

Very good fix/workaround/improvement!

Thanks a lot :)

@ErikBjare ErikBjare merged commit 5a228d4 into ErikBjare:master Nov 10, 2024
4 of 7 checks passed
@ErikBjare
Owner

ErikBjare commented Nov 15, 2024

This was certainly an improvement!

I ran into it failing to write this, though:

Sure, let's save that:

```save rag.py
    ...
    return ToolSpec(
        name="rag",
        desc="RAG (Retrieval-Augmented Generation) for context-aware assistance",
        instructions="""
        Use RAG to index and search project documentation.

        Commands:
        - index [paths...] - Index documents in specified paths
        - search <query> - Search indexed documents
        - status - Show index status
        """,
        examples="""
        > User: Index the current directory
        ```rag index```

        > User: Search for documentation about functions
        ```rag search function documentation```

        > User: Show index status
        ```rag status```
        """,
        block_types=["rag"],
        execute=execute_rag,
        available=True,
    )

tool = init_rag()
```

(...message keeps going, doesn't get interrupted to execute save tool)

I assume this is due to the presence of ``` in the file, although those are indented, so they really shouldn't be treated as fences.
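One possible direction, sketched here purely as an assumption and not as what gptme does, is to only treat a line as a fence when the backticks start at column zero. `closes_at_line` is a hypothetical helper, and the message below is a shortened stand-in for the one above:

```python
from typing import Optional

FENCE = "`" * 3  # three backticks


def closes_at_line(text: str, column_zero_only: bool) -> Optional[int]:
    """Return the 0-based index of the line that closes the outer block,
    or None if the block never appears to close."""
    depth = 0
    for i, line in enumerate(text.splitlines()):
        # Either require the fence at column zero, or ignore indentation.
        candidate = line.rstrip() if column_zero_only else line.strip()
        if not candidate.startswith(FENCE):
            continue
        if candidate != FENCE:
            depth += 1  # fence followed by other text: counted as an opener
        elif depth > 0:
            depth -= 1  # bare fence: closes the innermost open block
            if depth == 0:
                return i
    return None


message = "\n".join([
    f"{FENCE}save rag.py",
    '        examples="""',
    "        > User: Show index status",
    f"        {FENCE}rag status{FENCE}",
    '        """,',
    "tool = init_rag()",
    FENCE,
    "(...message keeps going)",
])

# Ignoring indentation, the indented example line counts as an extra opener,
# so the outer block never appears to close.
print(closes_at_line(message, column_zero_only=False))  # None
# Requiring column zero skips the indented line; the block closes at line 6.
print(closes_at_line(message, column_zero_only=True))   # 6
```

This would not help when the nested fences themselves start at column zero, so it is at best a partial mitigation.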

2 participants