feat: implement anthropic-style computer tool #225

ErikBjare · 2024-10-24T18:08:04Z

Fixes #216

gptme wrote all of it initially; I just gave it Anthropic's reference implementation.

The allowlisted read-commands were super nice, made it look through the code really fast! Really nice work @brayo-pip!

Doesn't fully work yet, but damn close. Hit Anthropic daily rate limit when working on it, so pushing now.

- get going
- get it working in Docker like Anthropic
- get it working locally
- make sure docs are correct
- only enable the computer tool by default inside of the computer-use context
- figure out what is causing the delays
  - I seem to have fixed some of the delay by removing a speed limit argument that I accidentally added? Maybe?
  - Still delays when starting new terminal windows
  - I don't remember seeing these delays on my macOS machine
- make the UI nice (nice enough, for now)
- share the original conversation where gptme wrote most of it as an example
  - Add a way to share conversations #32

Important

Introduces a new computer tool for GUI automation using X11, with Docker support and updated documentation.

Behavior:
- Introduces a new computer tool for GUI automation using X11 in gptme/tools/computer.py.
- Adds routes /computer and /chat in api.py to serve new interfaces.
- Updates start_x11.sh to initialize Xvfb, tint2, mutter, VNC, and noVNC for GUI automation.
Docker:
- Adds Dockerfile.computer for building a Docker image with X11 support.
- Updates Makefile with build-docker-computer target.
Documentation:
- Updates docs/examples.rst, docs/server.rst, and docs/tools.rst with instructions for using the new computer tool.
Misc:
- Adds desktop entry files for applications in image/.config/tint2/applications/.
- Updates .dockerignore to include scripts/start_x11.sh.

This description was created by Ellipsis for fe9a633. It will automatically update as commits are pushed.

ellipsis-dev

❌ Changes requested. Reviewed everything up to c110124 in 1 minute and 11 seconds

More details

Looked at 1024 lines of code in 17 files
Skipped 1 files when reviewing.
Skipped posting 4 drafted comments based on config settings.

1. gptme/tools/computer.py:94

Draft comment:
Consider adding exception handling for subprocess.Popen to manage potential errors gracefully.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is relevant to the changes made in the diff, specifically the run_xdotool function. The current implementation does handle errors by checking the return code and raising an exception, but the comment suggests a more graceful approach. This could be considered a code quality improvement, as it might involve more specific exception handling or logging.
The current error handling might be sufficient for the intended use case, and adding more exception handling could complicate the code unnecessarily. The comment does not specify what 'graceful' handling would entail.
While the current error handling is functional, the suggestion for more graceful handling could improve the robustness of the code, especially in a production environment where more detailed error information might be useful.
The comment is about a change made in the diff and suggests a potential code quality improvement. It should be kept as it provides a clear and actionable suggestion.
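
For reference, a minimal sketch of what "graceful" handling could look like here, assuming the helper shells out and should surface failures as one exception type. `run_command` and `CommandError` are hypothetical names for illustration, not the actual `run_xdotool` in this PR:

```python
import subprocess


class CommandError(RuntimeError):
    """Raised when an external command cannot be run or exits non-zero."""


def run_command(args: list[str], timeout: float = 10.0) -> str:
    """Run a command and return its stdout; raise CommandError on any failure."""
    try:
        proc = subprocess.run(
            args, capture_output=True, text=True, timeout=timeout, check=False
        )
    except FileNotFoundError as e:
        # The binary is not installed or not on PATH.
        raise CommandError(f"command not found: {args[0]}") from e
    except subprocess.TimeoutExpired as e:
        raise CommandError(f"command timed out after {timeout}s: {args}") from e
    if proc.returncode != 0:
        # Include stderr so the caller sees why the command failed.
        raise CommandError(
            f"{args[0]} exited with {proc.returncode}: {proc.stderr.strip()}"
        )
    return proc.stdout
```

A caller would then wrap something like `run_command(["xdotool", "getmouselocation"])` in a single `except CommandError` instead of handling three separate failure modes.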

2. scripts/start_x11.sh:10

Draft comment:
Consider verifying if the Xvfb process is alive in addition to checking the lock file to avoid false positives.
Reason this comment was not posted:
Comment did not seem useful.
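
A rough sketch of the idea, written in Python rather than shell for illustration. It assumes a POSIX system and the usual X lock file layout (`/tmp/.X<display>-lock` containing the server PID); the function name and the display number are placeholders, not taken from `start_x11.sh`:

```python
import os
from pathlib import Path


def x_server_alive(display: int = 1, lock_dir: str = "/tmp") -> bool:
    """Return True only if the lock file exists AND its PID is a live process."""
    lock = Path(lock_dir) / f".X{display}-lock"
    try:
        pid = int(lock.read_text().strip())
    except (FileNotFoundError, ValueError):
        return False  # no lock file, or contents are not a PID
    if pid <= 0:
        return False  # never signal process groups by accident
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False  # stale lock left behind by a dead server
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

This avoids the false positive where a crashed Xvfb leaves its lock file behind.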

3. scripts/start_x11.sh:19

Draft comment:
Consider making the timeout for wait_for_xvfb configurable to allow flexibility in different environments.
Reason this comment was not posted:
Confidence changes required: 50%
The wait_for_xvfb function uses a fixed timeout to wait for Xvfb to start. This could be improved by making the timeout configurable.
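
One way to make the timeout configurable, sketched in Python for illustration. The `XVFB_TIMEOUT` environment variable and the `wait_until` helper are assumptions, not something the script currently reads or defines:

```python
import os
import time
from typing import Callable, Optional


def wait_until(
    ready: Callable[[], bool],
    timeout: Optional[float] = None,
    interval: float = 0.1,
) -> bool:
    """Poll ready() until it returns True or the timeout (in seconds) expires."""
    if timeout is None:
        # Hypothetical override; falls back to 10 seconds when unset.
        timeout = float(os.environ.get("XVFB_TIMEOUT", "10"))
    deadline = time.monotonic() + timeout
    while True:
        if ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Slow CI machines could then raise the limit without editing the script.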

4. scripts/start_x11.sh:52

Draft comment:
Ensure xdotool is installed or handle the case where it is missing to prevent script failures.
Reason this comment was not posted:
Confidence changes required: 50%
The start_x11.sh script uses xdotool to check for running processes, which may not be installed in all environments. This could lead to failures if xdotool is missing.
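
A small sketch of a preflight check using `shutil.which`, assuming the caller wants a list of missing binaries up front. The helper name is hypothetical:

```python
import shutil


def missing_binaries(names: list[str]) -> list[str]:
    """Return the subset of names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]
```

For example, `missing_binaries(["xdotool", "scrot"])` returning a non-empty list could be turned into a clear error message before any GUI automation starts.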

Workflow ID: wflow_IXuPzrPgFdbAI4PE

codecov-commenter · 2024-10-24T18:10:52Z

Codecov Report

Attention: Patch coverage is 33.62832% with 75 lines in your changes missing coverage. Please review.

Project coverage is 72.67%. Comparing base (a6b41aa) to head (fe9a633).
Report is 1 commit behind head on master.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| gptme/tools/computer.py | 29.59% | 69 Missing ⚠️ |
| gptme/server/api.py | 66.66% | 2 Missing ⚠️ |
| gptme/tools/vision.py | 0.00% | 2 Missing ⚠️ |
| gptme/tools/__init__.py | 83.33% | 1 Missing ⚠️ |
| gptme/tools/screenshot.py | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #225      +/-   ##
==========================================
- Coverage   74.16%   72.67%   -1.50%     
==========================================
  Files          59       60       +1     
  Lines        3731     3842     +111     
==========================================
+ Hits         2767     2792      +25     
- Misses        964     1050      +86

| Flag | Coverage Δ | |
|---|---|---|
| anthropic/claude-3-haiku-20240307 | `71.65% <33.62%> (-1.41%)` | ⬇️ |
| openai/gpt-4o-mini | `71.34% <33.62%> (-1.35%)` | ⬇️ |

0xbrayo · 2024-10-24T18:56:47Z

Super excited for this. Especially if it can later be integrated with local models. Could be game-changing.

ellipsis-dev

👍 Looks good to me! Incremental review on 2653012 in 28 seconds

More details

Looked at 19 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. scripts/Dockerfile.computer:61

Draft comment:
The poetry install command is run twice, once on line 58 and again on line 61. This is redundant and can be optimized by combining them into a single command after copying the project files.
Reason this comment was not posted:
Confidence changes required: 50%
The Dockerfile has a redundant poetry install command after copying the entire project. This can be optimized by combining the two poetry install commands into one after copying the project files.

Workflow ID: wflow_AAC9AJUFRdScP8tk

ellipsis-dev

❌ Changes requested. Incremental review on 0dc3d40 in 50 seconds

More details

Looked at 497 lines of code in 13 files
Skipped 0 files when reviewing.
Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_negvnhzCgZJITmER

ellipsis-dev

👍 Looks good to me! Incremental review on 6b44dd6 in 24 seconds

More details

Looked at 497 lines of code in 13 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. gptme/tools/computer.py:94

Draft comment:
Using shell=True can be a security risk if the command string is constructed from external input. Consider using a list of arguments instead of a single string for the command.
Reason this comment was not posted:
Confidence changes required: 50%
The use of shell=True in subprocess.Popen can be a security risk if the command string is constructed from external input. In this case, it seems to be used with a fixed command, but it's still a good practice to avoid it when not necessary.
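
To illustrate the difference, here is a minimal sketch that passes text to a child process as a single argument, with no shell in between. It uses the Python interpreter as a stand-in for `xdotool` so it runs anywhere; the helper name is hypothetical:

```python
import subprocess
import sys


def echo_argument(text: str) -> str:
    """Pass text as one argv entry to a child process and return what it saw."""
    proc = subprocess.run(
        # A list of arguments: nothing here is parsed by a shell, so
        # characters like ; && $() reach the child process literally.
        [sys.executable, "-c", "import sys; print(sys.argv[1])", text],
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.rstrip("\n")
```

With the real tool, the equivalent shape would presumably be something like `["xdotool", "type", "--", text]`, which keeps model-generated text from being interpreted as shell syntax.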

2. gptme/tools/computer.py:98

Draft comment:
Consider handling exceptions more specifically to ensure that subsequent actions are not attempted if a critical command fails.
Reason this comment was not posted:
Confidence changes required: 50%
The run_xdotool function is used multiple times in the computer_action function. If any of these commands fail, the error is caught and printed, but the function continues execution. This might lead to unexpected behavior if subsequent commands depend on the success of previous ones.
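
A sketch of the fail-fast behaviour this comment is asking for, assuming each action is a sequence of named steps. `run_steps` is a hypothetical helper, not code from this PR:

```python
from typing import Callable, Sequence


def run_steps(steps: Sequence[tuple[str, Callable[[], None]]]) -> list[str]:
    """Run steps in order; stop at the first failure and report what completed."""
    completed: list[str] = []
    for name, step in steps:
        try:
            step()
        except Exception as e:
            # Abort here so later steps never run against a bad state.
            raise RuntimeError(
                f"step {name!r} failed after completing {completed}"
            ) from e
        completed.append(name)
    return completed
```

For a drag action, for instance, a failed mouse-down would then prevent the subsequent move and mouse-up from running.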

3. scripts/Dockerfile.computer:98

Draft comment:
Consider adding a health check for the VNC service on port 6080 to ensure it's running correctly.
Reason this comment was not posted:
Confidence changes required: 50%
The Dockerfile exposes ports 6080 and 8080, but the health check only checks port 8080. It might be beneficial to also check the status of the VNC service on port 6080 to ensure it's running correctly.
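
A minimal sketch of a port probe that such a health check could call, assuming a plain TCP connect is enough to tell whether the service on port 6080 is listening. The function name is hypothetical:

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

A health check could then require both `port_open("localhost", 8080)` and `port_open("localhost", 6080)`.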

Workflow ID: wflow_eKurPjikSF9JTuYp

ErikBjare · 2024-11-01T10:50:34Z

.dockerignore

@@ -10,6 +10,8 @@ gptme.toml

 # Build scripts
 scripts
+!scripts/start_x11.sh


Suggested change

!scripts/start_x11.sh

ErikBjare · 2024-11-01T11:02:55Z

I'm getting weird delays, but it works!

ellipsis-dev

👍 Looks good to me! Incremental review on 48b994e in 16 seconds

More details

Looked at 71 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. scripts/computer_home/entrypoint.sh:8

Draft comment:
Consider making the --tools option configurable instead of hardcoding computer,vision. This will allow more flexibility for different use cases.
Reason this comment was not posted:
Confidence changes required: 50%
The --tools option in entrypoint.sh is hardcoded to computer,vision, which might not be flexible for different use cases. It would be better to allow this to be configurable.

2. gptme/server/cli.py:29

Draft comment:
Consider adding validation for the --tools option to ensure only supported tools are enabled. Provide feedback if an invalid tool is specified.
Reason this comment was not posted:
Confidence changes required: 50%
The --tools option is added to the CLI, but there is no validation or feedback if an invalid tool is specified. This could lead to confusion if a user makes a typo or uses an unsupported tool.
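
A sketch of what such validation might look like, assuming `--tools` takes a comma-separated string and the set of available tool names is known at parse time. `parse_tools` is a hypothetical helper:

```python
def parse_tools(value: str, available: set[str]) -> list[str]:
    """Split a comma-separated tool list and reject names that are not available."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in names if name not in available]
    if unknown:
        raise ValueError(
            f"Unknown tool(s): {', '.join(unknown)}. "
            f"Available: {', '.join(sorted(available))}"
        )
    return names
```

A typo such as `--tools computer,vison` would then fail immediately with a message listing the valid names.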

Workflow ID: wflow_h9cTu6UXxhHAXr42

ellipsis-dev

👍 Looks good to me! Incremental review on 73f093f in 27 seconds

More details

Looked at 403 lines of code in 8 files
Skipped 0 files when reviewing.
Skipped posting 4 drafted comments based on config settings.

1. gptme/server/static/main.js:227

Draft comment:
The response from await req.json(); is not being used. Consider utilizing the response for further processing or error handling if needed.
Reason this comment was not posted:
Confidence changes required: 50%
In gptme/server/static/main.js, the sendMessage method has a potential issue where the await req.json(); call is not used. The response is not being utilized, which might be necessary for further processing or error handling.

2. gptme/tools/__init__.py:69

Draft comment:
Raising a ValueError for missing tools in the allowlist is a change from logging a warning. Ensure this change is intentional, as it could lead to crashes if the allowlist contains unavailable tools.
Reason this comment was not posted:
Confidence changes required: 50%
In gptme/tools/__init__.py, the init_tools function raises a ValueError if a tool in the allowlist is not found. This is a change from the previous behavior of logging a warning. This change might be intentional, but it could also lead to unexpected crashes if the allowlist contains a tool that is not available.

3. gptme/tools/computer.py:190

Draft comment:
The raise ValueError(f"Invalid action: {action}") is outside of any conditional block, which means it will always raise an error if none of the conditions are met. Ensure all valid actions are handled before this line.
Reason this comment was not posted:
Confidence changes required: 50%
In gptme/tools/computer.py, the computer function has a potential issue where the raise ValueError(f"Invalid action: {action}") is outside of any conditional block. This means it will always raise an error if none of the conditions are met, which might not be the intended behavior if the function is expected to handle all valid actions.
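
A trailing `raise` after a chain of conditionals is a common fallthrough pattern. A dictionary dispatch makes the same intent explicit; the sketch below assumes two illustrative actions and only builds the `xdotool` argument list rather than running it. The handler names and the `HANDLERS` table are hypothetical:

```python
from typing import Callable


def _mouse_move(x: int, y: int) -> list[str]:
    return ["xdotool", "mousemove", str(x), str(y)]


def _left_click() -> list[str]:
    return ["xdotool", "click", "1"]


# Illustrative subset; the real tool supports more actions.
HANDLERS: dict[str, Callable[..., list[str]]] = {
    "mouse_move": _mouse_move,
    "left_click": _left_click,
}


def build_command(action: str, **kwargs) -> list[str]:
    """Look up the handler for an action; unknown actions raise ValueError."""
    handler = HANDLERS.get(action)
    if handler is None:
        raise ValueError(f"Invalid action: {action}")
    return handler(**kwargs)
```

With this shape, the error path is visibly tied to the lookup instead of depending on every earlier branch returning.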

4. gptme/tools/screenshot.py:21

Draft comment:
The --overwrite option used with scrot is not valid and might cause the command to fail. Consider removing or replacing it with a valid option.
Reason this comment was not posted:
Confidence changes required: 50%
In gptme/tools/screenshot.py, the _screenshot function uses scrot with the --overwrite option, which is not a valid option for scrot. This might cause the command to fail on Linux systems.

Workflow ID: wflow_QCdaJNj8BdOfHcyS

ellipsis-dev

👍 Looks good to me! Incremental review on 136aa2a in 29 seconds

More details

Looked at 27 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_ando1II8li6zgoUG

ellipsis-dev

👍 Looks good to me! Incremental review on f5d9586 in 27 seconds

More details

Looked at 27 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. gptme/tools/__init__.py:68

Draft comment:
Move the check for tool.name in tools_default_disabled before the tool.init call to avoid unnecessary initialization of tools that are disabled by default.
Reason this comment was not posted:
Comment looked like it was already resolved.
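
For context, the ordering described in this comment can be sketched as follows. The `Tool` dataclass and `init_enabled` function are simplified stand-ins written for illustration, not the actual `init_tools` implementation:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tool:
    name: str
    init: Optional[Callable[[], None]] = None


def init_enabled(
    tools: list[Tool],
    default_disabled: set[str],
    allowlist: Optional[set[str]] = None,
) -> list[str]:
    """Initialize only tools that are enabled, checking before calling init()."""
    enabled = []
    for tool in tools:
        if allowlist is not None:
            if tool.name not in allowlist:
                continue  # an explicit allowlist overrides the defaults
        elif tool.name in default_disabled:
            continue  # skipped before init(), so no setup work is wasted
        if tool.init is not None:
            tool.init()
        enabled.append(tool.name)
    return enabled
```

Under these assumptions, a tool like `computer` stays uninitialized unless it is named in the allowlist.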

Workflow ID: wflow_UNcVdyHL7BLxu3VE

ellipsis-dev

❌ Changes requested. Incremental review on 0b5e796 in 29 seconds

More details

Looked at 12 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_TZJ08dwhSxiL9Jqg

ellipsis-dev

👍 Looks good to me! Incremental review on fe9a633 in 16 seconds

More details

Looked at 31 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. docs/server.rst:43

Draft comment:
Ensure the updated Docker run command is correct and tested, as it removes the port 5000 mapping. This could affect access to the basic chat interface.
Reason this comment was not posted:
Comment did not seem useful.

Workflow ID: wflow_bwhVDG0x0ZroBHaj

ellipsis-dev bot reviewed Oct 24, 2024

ErikBjare force-pushed the dev/computer-use branch from 304dc96 to 09b45d5 Compare October 24, 2024 18:13

ellipsis-dev bot reviewed Oct 28, 2024

ErikBjare changed the title ~~feat: started working on anthropic-style computer tool~~ feat: implement anthropic-style computer tool Oct 29, 2024

ellipsis-dev bot reviewed Nov 1, 2024

ErikBjare and others added 3 commits November 1, 2024 11:46

feat: started working on anthropic-style computer tool

652eb43

Apply suggestions from code review

f3b491d

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

fix: progress on computer use

0d1a9f3

ErikBjare force-pushed the dev/computer-use branch from 0dc3d40 to 6b44dd6 Compare November 1, 2024 10:46

ErikBjare added 8 commits November 1, 2024 11:46

fix: added Dockerfile.server

2eff75d

fix: fixed vnc in computer use webui

cb6ffd5

docs: fixed docs for computer use

20fb541

fix: rewrote computer_action function to not be a generator

067adbc

docs: fixed server docs for computer use

0ec7539

docs: refactored computer use warning into seperate file

65f2250

fix: optimized Dockerfile.computer for faster rebuilds

ea240bb

fix: refactor and misc fixes to computer use

6b44dd6

ellipsis-dev bot reviewed Nov 1, 2024

ErikBjare commented Nov 1, 2024

ErikBjare mentioned this pull request Nov 1, 2024

feat: add platform info to the system prompt #171

Merged

fix: enable select tools in computer use context

48b994e

ellipsis-dev bot reviewed Nov 1, 2024

fix: multiple fixes to computer use and web ui

73f093f

ellipsis-dev bot reviewed Nov 1, 2024

ErikBjare force-pushed the dev/computer-use branch from 136aa2a to f5d9586 Compare November 1, 2024 12:39

fix: disable computer tool unless explicitly enabled

f5d9586

ellipsis-dev bot reviewed Nov 1, 2024

fix: removed deleted file from .dockerignore

0b5e796

ellipsis-dev bot reviewed Nov 1, 2024

docs: minor fix to computer use docs

fe9a633

ellipsis-dev bot reviewed Nov 1, 2024

ErikBjare merged commit 175167e into master Nov 1, 2024
7 checks passed

ErikBjare mentioned this pull request Nov 1, 2024

Complete "computer use" support #216

Open
