You are working with Scrapybara, a TypeScript SDK for deploying and managing remote desktop instances for AI agents. Use this guide to properly interact with the SDK.
CORE SDK USAGE:
- Initialize client: import { ScrapybaraClient } from "scrapybara"; const client = new ScrapybaraClient({ apiKey: "KEY" });
- Instance lifecycle:
const instance = await client.startUbuntu({ timeoutHours: 1 });
await instance.pause(); // Pause to save resources
await instance.resume({ timeoutHours: 1 }); // Resume work
await instance.stop(); // Terminate and clean up
- Instance types:
const ubuntuInstance = await client.startUbuntu(); // supports bash, computer, edit, browser
const browserInstance = await client.startBrowser(); // supports computer, browser
const windowsInstance = await client.startWindows(); // supports computer
CORE INSTANCE OPERATIONS:
- Screenshots: const base64Image = (await instance.screenshot()).base64Image;
- Bash commands: await instance.bash({ command: "ls -la" });
- Mouse control: await instance.computer({ action: "mouse_move", coordinate: [x, y] });
- Click actions: await instance.computer({ action: "left_click" });
- File operations: await instance.file.read({ path: "/path/file" }); await instance.file.write({ path: "/path/file", content: "data" });
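When a field is read from the result of an async call, the call has to be awaited before the property access. The sketch below shows the difference using a hypothetical stand-in function (`fakeScreenshot` is not part of the SDK; it only mimics an async call that resolves to an object with a `base64Image` field).

```typescript
// Hypothetical stand-in for an async SDK call; not the real Scrapybara client.
interface ScreenshotResult {
  base64Image: string;
}

async function fakeScreenshot(): Promise<ScreenshotResult> {
  return { base64Image: "aGVsbG8=" };
}

// Await the call first, then read the field from the resolved value.
async function readImage(): Promise<string> {
  const result = await fakeScreenshot();
  return result.base64Image;
}

// Reading the field straight off the pending promise finds nothing:
// a Promise object has no `base64Image` property.
function readImageFromPromise(): unknown {
  const pending = fakeScreenshot() as unknown as Record<string, unknown>;
  return pending["base64Image"];
}
```

The same pattern applies to any call whose result is unpacked inline: wrap the awaited call in parentheses, or assign it to a variable first.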
ACT SDK (Primary Focus):
- Purpose: Enables building computer use agents with unified tools and model interfaces
- Core components:
1. Model: Handles LLM integration (currently Anthropic)
import { anthropic } from "scrapybara/anthropic";
const model = anthropic(); // Or anthropic({ apiKey: "KEY" }) to use your own key
2. Tools: Interface for computer interactions
- bashTool: Run shell commands
- computerTool: Mouse/keyboard control
- editTool: File operations
import { bashTool, computerTool, editTool } from "scrapybara/tools";
const tools = [
bashTool(instance),
computerTool(instance),
editTool(instance),
];
3. Prompt:
- system: system prompt; using UBUNTU_SYSTEM_PROMPT, BROWSER_SYSTEM_PROMPT, or WINDOWS_SYSTEM_PROMPT is recommended
- prompt: simple user prompt
- messages: list of messages
- Only include either prompt or messages, not both
const { messages, steps, text, output, usage } = await client.act({
model: anthropic(),
tools,
system: UBUNTU_SYSTEM_PROMPT,
prompt: "Task",
onStep: handleStep
});
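The "either prompt or messages, not both" rule can be checked before a request is sent. This is a minimal sketch of such a guard; `ActInput` and `assertPromptOrMessages` are hypothetical names introduced here for illustration, and the real SDK types may differ.

```typescript
// Hypothetical input shape for illustration; not the SDK's own type.
interface ActInput {
  prompt?: string;
  messages?: unknown[];
}

// Throws unless exactly one of `prompt` or `messages` is supplied.
function assertPromptOrMessages(input: ActInput): void {
  const hasPrompt = typeof input.prompt === "string" && input.prompt.length > 0;
  const hasMessages = Array.isArray(input.messages) && input.messages.length > 0;
  if (hasPrompt === hasMessages) {
    throw new Error("Provide exactly one of `prompt` or `messages`.");
  }
}
```

Calling the guard on the options object just before `client.act` would surface the mistake locally rather than in an API response.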
MESSAGE HANDLING:
- Response Structure: Messages are structured with roles (user/assistant/tool) and typed content
- Content Types:
- TextPart: Simple text content
{ type: "text", text: "content" }
- ImagePart: Base64 or URL images
{ type: "image", image: "base64...", mimeType: "image/png" }
- ToolCallPart: Tool invocations
{
type: "tool-call",
toolCallId: "id",
toolName: "bash",
args: { command: "ls" }
}
- ToolResultPart: Tool execution results
{
type: "tool-result",
toolCallId: "id",
toolName: "bash",
result: "output",
isError: false
}
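The part shapes above lend themselves to a discriminated union on `type`. The sketch below declares local types that mirror those shapes and walks a message list; it assumes each message carries `content` as an array of parts, and the helper names (`collectText`, `failedToolNames`) are hypothetical rather than SDK exports.

```typescript
// Local type sketches mirroring the listed part shapes; not imported from the SDK.
interface TextPart { type: "text"; text: string }
interface ImagePart { type: "image"; image: string; mimeType?: string }
interface ToolCallPart {
  type: "tool-call";
  toolCallId: string;
  toolName: string;
  args: Record<string, unknown>;
}
interface ToolResultPart {
  type: "tool-result";
  toolCallId: string;
  toolName: string;
  result: unknown;
  isError?: boolean;
}
type Part = TextPart | ImagePart | ToolCallPart | ToolResultPart;

// Assumption: content is an array of parts.
interface Message {
  role: "user" | "assistant" | "tool";
  content: Part[];
}

// Join all text parts written by the given role, one per line.
function collectText(messages: Message[], role: Message["role"]): string {
  const chunks: string[] = [];
  for (const message of messages) {
    if (message.role !== role) continue;
    for (const part of message.content) {
      if (part.type === "text") chunks.push(part.text);
    }
  }
  return chunks.join("\n");
}

// List the tool names whose results were flagged as errors.
function failedToolNames(messages: Message[]): string[] {
  const names: string[] = [];
  for (const message of messages) {
    for (const part of message.content) {
      if (part.type === "tool-result" && part.isError) names.push(part.toolName);
    }
  }
  return names;
}
```

Checking `part.type` narrows the union, so `part.text` and `part.toolName` are only accessible on the variants that define them.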
STEP HANDLING:
// Access step information in callbacks
const handleStep = (step: Step) => {
console.log(`Text: ${step.text}`);
if (step.toolCalls) {
for (const call of step.toolCalls) {
console.log(`Tool: ${call.toolName}`);
}
}
if (step.toolResults) {
for (const result of step.toolResults) {
console.log(`Result: ${result.result}`);
}
}
console.log(`Tokens: ${step.usage?.totalTokens ?? 'N/A'}`);
};
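Beyond logging each step as it arrives, the collected steps can be summarized once the run finishes. The following sketch tallies tool calls by name; `StepLike` is a hypothetical local type covering only the fields used here, not the SDK's `Step` type.

```typescript
// Hypothetical subset of a step; only the fields this helper reads.
interface StepLike {
  text: string;
  toolCalls?: { toolName: string }[];
}

// Count how many times each tool was invoked across all steps.
function countToolCalls(steps: StepLike[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const step of steps) {
    for (const call of step.toolCalls ?? []) {
      counts[call.toolName] = (counts[call.toolName] ?? 0) + 1;
    }
  }
  return counts;
}
```

A summary like this is a quick way to see whether an agent leaned on bash or on GUI actions for a given task.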
STRUCTURED OUTPUT:
Use the schema parameter to define a desired structured output. The response's output field will contain the validated typed data returned by the model.
import { z } from "zod";
const schema = z.object({
posts: z.array(z.object({
title: z.string(),
url: z.string(),
points: z.number(),
})),
});
const { output } = await client.act({
model: anthropic(),
tools,
schema,
system: UBUNTU_SYSTEM_PROMPT,
prompt: "Get the top 10 posts on Hacker News",
});
const posts = output.posts;
TOKEN USAGE:
- Track token usage through TokenUsage objects
- Fields: promptTokens, completionTokens, totalTokens
- Available in both Step and ActResponse objects
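Per-step usage can be added up to cross-check a run's total. This is a small sketch under the assumption that each step may or may not carry a usage object with the three fields listed above; the `TokenUsage` interface here is declared locally and `sumUsage` is a hypothetical helper.

```typescript
// Local mirror of the three documented fields; not imported from the SDK.
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

// Add up usage across steps, skipping steps that report none.
function sumUsage(steps: { usage?: TokenUsage }[]): TokenUsage {
  const total: TokenUsage = { promptTokens: 0, completionTokens: 0, totalTokens: 0 };
  for (const step of steps) {
    if (!step.usage) continue;
    total.promptTokens += step.usage.promptTokens;
    total.completionTokens += step.usage.completionTokens;
    total.totalTokens += step.usage.totalTokens;
  }
  return total;
}
```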
Here's a brief example of how to use the Scrapybara SDK:
import { ScrapybaraClient } from "scrapybara";
import { anthropic } from "scrapybara/anthropic";
import { UBUNTU_SYSTEM_PROMPT } from "scrapybara/prompts";
import { bashTool, computerTool, editTool } from "scrapybara/tools";
const client = new ScrapybaraClient();
const instance = await client.startUbuntu();
await instance.browser.start();
const { messages, steps, text, output, usage } = await client.act({
model: anthropic(),
tools: [
bashTool(instance),
computerTool(instance),
editTool(instance),
],
system: UBUNTU_SYSTEM_PROMPT,
prompt: "Go to the YC website and fetch the HTML",
onStep: (step) => console.log(`${step.text}\n`),
});
await instance.browser.stop();
await instance.stop();
EXECUTION PATTERNS:
1. Basic agent execution:
const { messages, steps, text, output, usage } = await client.act({
model: anthropic(),
tools,
system: "System context here",
prompt: "Task description"
});
2. Browser automation:
const cdpUrl = (await instance.browser.start()).cdpUrl;
const authStateId = (await instance.browser.saveAuth({ name: "default" })).authStateId; // Save auth
await instance.browser.authenticate({ authStateId }); // Reuse auth
3. File management:
await instance.file.write({ path: "/tmp/data.txt", content: "content" });
const content = (await instance.file.read({ path: "/tmp/data.txt" })).content;
IMPORTANT GUIDELINES:
- Always stop instances after use to prevent unnecessary billing
- Use async/await for all operations as they are asynchronous
- Handle API errors with try/catch blocks
- Default timeout is 60s; customize with timeout parameter or requestOptions
- Instance auto-terminates after 1 hour by default
- For browser operations, always start the browser before using browserTool
- Prefer bash commands over GUI interactions for launching applications
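One way to make sure an instance is always stopped, even when the work throws, is to wrap the lifecycle in a helper with a `finally` block. The sketch below is written against a hypothetical `Stoppable` interface (anything with an async `stop()`), so it does not depend on the SDK's own types; `withInstance` is an illustrative name, not an SDK function.

```typescript
// Hypothetical minimal interface: anything that can be stopped asynchronously.
interface Stoppable {
  stop(): Promise<unknown>;
}

// Start an instance, run the work, and stop the instance whether or not the work succeeds.
async function withInstance<T extends Stoppable, R>(
  start: () => Promise<T>,
  work: (instance: T) => Promise<R>,
): Promise<R> {
  const instance = await start();
  try {
    return await work(instance);
  } finally {
    // Runs on both success and failure, so billing stops either way.
    await instance.stop();
  }
}
```

Assuming the client from this guide, a call might look like `await withInstance(() => client.startUbuntu(), async (instance) => { /* agent work */ })`.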
ERROR HANDLING:
import { ApiError } from "scrapybara/core";
try {
await client.startUbuntu();
} catch (e) {
if (e instanceof ApiError) {
console.error(`Error ${e.statusCode}: ${e.body}`);
}
}
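Transient failures are sometimes worth retrying rather than only logging. The sketch below shows one possible policy: retry on status 429 or 5xx with exponential backoff. It assumes thrown errors expose a numeric `statusCode`, as in the example above; the retry policy, the `StatusError` shape, and the helper names are illustrative assumptions, not SDK behavior.

```typescript
// Hypothetical error shape: only the field this policy inspects.
interface StatusError {
  statusCode?: number;
}

// Treat rate limiting (429) and server errors (5xx) as retryable.
function isRetryable(e: unknown): boolean {
  const status = (e as StatusError | null | undefined)?.statusCode;
  return status === 429 || (typeof status === "number" && status >= 500);
}

// Retry `fn` up to `attempts` times, doubling the delay after each failure.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (e) {
      lastError = e;
      // Give up on non-retryable errors or when attempts are exhausted.
      if (!isRetryable(e) || i === attempts - 1) throw e;
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
    }
  }
  // Only reached when `attempts` is zero or negative.
  throw lastError;
}
```

Client errors such as 400 or 404 are rethrown immediately, since repeating the same request would not be expected to help.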
BROWSER TOOL OPERATIONS:
- Required setup:
const cdpUrl = (await instance.browser.start()).cdpUrl;
const tools = [browserTool(instance)];
- Commands: goTo, getHtml, evaluate, click, type, screenshot, getText, getAttribute
- Always handle browser authentication states appropriately
ENV VARIABLES & CONFIGURATION:
- Set env vars: await instance.env.set({ API_KEY: "value" });
- Get env vars: const vars = (await instance.env.get()).variables;
- Delete env vars: await instance.env.delete(["VAR_NAME"]);
Remember to handle resources properly and implement appropriate error handling in your code. This SDK is primarily designed for AI agent automation tasks, so structure your code accordingly.