Since Anthropic just announced their computer use feature (see #50 (comment)), we should finish ours, since we already have screenshots working.
We can already take screenshots; we just need to enable acting on them by clicking or sending input.
To avoid burning tons of tokens, we should probably run it in a loop that doesn't stack up lots of view/interact steps in the context, maybe using a looping subagent or some kind of context-efficient tool-use loop that runs until the goal is achieved (we might need something like this generally for automation goals). We should study how they do it.
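A rough sketch of what such a context-efficient loop could look like, just to make the idea concrete. Everything here is hypothetical and not existing gptme API: `Action`, `LoopState`, `run_loop`, and the `take_screenshot` / `decide` / `act` callables are placeholder names. The assumption is that each step only passes the latest screenshot plus a few short text summaries of earlier steps to the model, instead of the full view/interact history.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical action type: kind is "click", "type", or "done"."""
    kind: str
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class LoopState:
    goal: str
    # One short text note per step, kept instead of old screenshots.
    summaries: List[str] = field(default_factory=list)
    finished: bool = False


def run_loop(
    goal: str,
    take_screenshot: Callable[[], bytes],
    decide: Callable[[str, bytes, List[str]], Action],
    act: Callable[[Action], None],
    max_steps: int = 20,
    keep_summaries: int = 5,
) -> LoopState:
    """Observe/act until the model reports "done" or max_steps is reached."""
    state = LoopState(goal=goal)
    for step in range(max_steps):
        # Only the newest screenshot is ever sent; older ones are dropped.
        screenshot = take_screenshot()
        # Only the last few summaries are sent, so context stays bounded.
        context = state.summaries[-keep_summaries:]
        action = decide(goal, screenshot, context)
        if action.kind == "done":
            state.summaries.append(f"step {step}: done")
            state.finished = True
            break
        act(action)
        state.summaries.append(
            f"step {step}: {action.kind} at ({action.x}, {action.y}) {action.text!r}"
        )
    return state
```

In this sketch `decide` would wrap the model call and `act` would do the actual clicking or typing; both are left as callables so the loop itself can be tested without a display. A looping subagent would be an alternative way to get the same bounded context.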
They run it in a Docker container and stream it in a webapp using VNC. We should make it possible to do it this way with gptme, but I think gptme should be able to control the local system first, and a Docker system second.