Agents SDK Overview
computer-agents is the TypeScript SDK that provides a unified interface for building multi-agent systems. Create LLM agents for reasoning and computer agents for execution, with seamless switching between local and cloud runtimes.
Installation
cd computer-agents
pnpm install
pnpm build
Or add to your project:
npm install computer-agents
# or: pnpm add computer-agents
Quick example
import { Agent, run, LocalRuntime } from 'computer-agents';
// Computer agent for execution
const executor = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime: new LocalRuntime(),
  workspace: './my-repo',
  instructions: 'You are a software developer.'
});
const result = await run(executor, 'Add input validation to the API');
console.log(result.finalOutput);
Two agent types
Testbase has exactly two agent types. This simplicity is intentional—specialized roles like “planner” or “reviewer” are workflow patterns you compose manually, not built-in types.
LLM Agents (agentType: 'llm')
Purpose: Reasoning, planning, analysis, and review
Execution: OpenAI API (direct function calling)
Configuration:
const planner = new Agent({
  name: 'Planner',
  agentType: 'llm',
  model: 'gpt-4o',
  instructions: 'Create detailed implementation plans.',
  temperature: 0.7,
  maxTokens: 4000
});
Key properties:
- agentType: 'llm' (required)
- model - OpenAI model name (default: 'gpt-4o')
- instructions - System prompt for the agent
- tools - Custom function tools (optional)
- handoffs - Other agents this agent can delegate to
- temperature, maxTokens - Standard OpenAI parameters
Use cases:
- Creating implementation plans
- Reviewing code quality
- Analyzing requirements
- Making decisions
- Coordinating workflows
No runtime required - LLM agents execute directly via OpenAI API.
Computer Agents (agentType: 'computer')
Purpose: Code execution, file operations, and terminal commands
Execution: Codex SDK (computer-use agent)
Configuration:
import { Agent, run, LocalRuntime } from 'computer-agents';
const executor = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime: new LocalRuntime({
    debug: true
  }),
  workspace: './project',
  instructions: 'Execute code changes as requested.'
});
Key properties:
- agentType: 'computer' (required)
- runtime - LocalRuntime or CloudRuntime (required)
- workspace - Git repository path (required)
- instructions - System prompt for the agent
- mcpServers - MCP server configurations
- reasoningEffort - 'none' | 'low' | 'medium' | 'high'
Use cases:
- Writing and modifying code
- Running tests and builds
- Installing packages
- Executing shell commands
- Deploying applications
Runtime required - Computer agents need LocalRuntime or CloudRuntime.
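The per-type requirements above can be checked before an agent is constructed. The sketch below is only an illustration and is not part of the SDK: `AgentConfigLike` and `validateAgentConfig` are hypothetical names, and the rules encoded are just the ones stated in this section (computer agents need a runtime and a workspace, LLM agents need neither).

```typescript
// Hypothetical, trimmed-down config shape; the SDK's real types may differ.
interface AgentConfigLike {
  name: string;
  agentType: 'llm' | 'computer';
  runtime?: unknown; // stands in for LocalRuntime | CloudRuntime
  workspace?: string;
}

// Returns a list of problems; an empty list means the config passes these checks.
function validateAgentConfig(config: AgentConfigLike): string[] {
  const problems: string[] = [];
  if (config.agentType === 'computer') {
    if (config.runtime === undefined || config.runtime === null) {
      problems.push("runtime is required for agentType 'computer'");
    }
    if (!config.workspace) {
      problems.push("workspace is required for agentType 'computer'");
    }
  }
  return problems;
}
```

Calling `validateAgentConfig({ name: 'Dev', agentType: 'computer' })` would report both missing fields, which may be easier to act on than a failure at run time.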
Agent configuration reference
interface AgentConfig {
  // Core (required)
  name: string;
  agentType: 'llm' | 'computer';
  // Computer agents only
  runtime?: LocalRuntime | CloudRuntime; // Required for 'computer'
  workspace?: string; // Required for 'computer'
  // LLM agents only
  model?: string; // Default: 'gpt-4o'
  tools?: ToolDefinition[]; // Custom function tools
  // Common (optional)
  instructions?: string; // System prompt
  mcpServers?: McpServerConfig[]; // Unified MCP config
  reasoningEffort?: 'none' | 'low' | 'medium' | 'high';
  temperature?: number;
  maxTokens?: number;
  handoffs?: Agent[]; // Agents to delegate to
}
Execution with run()
The run() function executes a task and returns results:
import { run } from 'computer-agents';
const result = await run(agent, 'Create a Python calculator');
// Result structure
interface RunResult {
  finalOutput: string; // Main result from agent
  state: AgentState; // Full execution state
  // ... other metadata
}
Input formats
Simple string:
await run(agent, 'Add error handling');
With options:
await run(agent, {
  input: 'Add error handling',
  outputSchema: zodSchema // For structured output
});
Session continuity
Testbase automatically maintains session continuity across multiple run() calls:
import { Agent, run, LocalRuntime } from 'computer-agents';
const agent = new Agent({
  name: 'Developer', // name is a required field (see the configuration reference)
  agentType: 'computer',
  runtime: new LocalRuntime(),
  workspace: './project'
});
// First run - new session
await run(agent, 'Create app.py');
console.log(agent.currentThreadId); // "thread-abc-123"
// Second run - continues same session!
await run(agent, 'Add error handling');
console.log(agent.currentThreadId); // "thread-abc-123" (same!)
// Third run - still the same session
await run(agent, 'Add tests');
console.log(agent.currentThreadId); // "thread-abc-123" (same!)
// Reset when needed
agent.resetSession();
await run(agent, 'New project');
console.log(agent.currentThreadId); // "thread-xyz-789" (new!)
Key benefits:
- No manual session management
- Agent remembers previous conversation
- Natural incremental workflows
- Matches human development patterns
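An incremental workflow like the one above can be wrapped in a small helper. This is a rough sketch under stated assumptions: `runInSession`, `SessionAgent`, `RunOutput`, and `RunFn` are hypothetical names, and the agent shape is inferred only from the `currentThreadId` and `resetSession()` members used in this section. The run function is passed in so the sketch does not depend on the SDK's exact signatures.

```typescript
// Minimal shapes assumed from the examples in this section; the SDK's real types may be richer.
interface SessionAgent {
  currentThreadId?: string;
  resetSession(): void;
}
interface RunOutput {
  finalOutput: string;
}
type RunFn<A> = (agent: A, input: string) => Promise<RunOutput>;

// Hypothetical helper: runs tasks one after another on the same agent,
// so each task builds on the session left by the previous one.
async function runInSession<A extends SessionAgent>(
  runFn: RunFn<A>,
  agent: A,
  tasks: string[],
  startFresh = false
): Promise<string[]> {
  if (startFresh) {
    agent.resetSession(); // drop earlier context before the first task
  }
  const outputs: string[] = [];
  for (const task of tasks) {
    // Sequential on purpose: awaiting each run keeps the tasks in order.
    const result = await runFn(agent, task);
    outputs.push(result.finalOutput);
  }
  return outputs;
}
```

With the SDK this might be called as `await runInSession(run, agent, ['Create app.py', 'Add error handling', 'Add tests'])`, assuming `run` and the agent are compatible with the shapes above.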
Runtimes
Runtimes determine where and how computer agents execute.
LocalRuntime
Local machine execution:
import { Agent, LocalRuntime } from 'computer-agents';
const runtime = new LocalRuntime({
  debug: true // Show execution details
});
const agent = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime,
  workspace: './project'
});
Characteristics:
- Fast (no network overhead)
- Free (no cloud costs)
- Direct file access
- Good for development
CloudRuntime
GCE VM execution with workspace sync:
import { Agent, CloudRuntime } from 'computer-agents';
const runtime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY,
  debug: true
});
const agent = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime,
  workspace: './project'
});
Characteristics:
- Isolated environment
- Usage billing
- Workspace persistence (GCS)
- Good for production
Switch runtimes easily:
import { Agent, LocalRuntime, CloudRuntime } from 'computer-agents';
// Development
const runtime = new LocalRuntime();
// Production: use this declaration instead of the one above
// const runtime = new CloudRuntime({
//   apiKey: process.env.TESTBASE_API_KEY
// });
// Same agent code works with both!
const agent = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime, // Just change this
  workspace: './project'
});
See Cloud Platform for detailed CloudRuntime documentation.
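One way to avoid editing code when moving between environments is to pick the runtime from configuration. The following is a minimal sketch, not an SDK feature: `selectRuntime` and `RuntimeFactories` are hypothetical names, and the rule "use the cloud whenever `TESTBASE_API_KEY` is set" is an assumption made for illustration. The constructors are supplied by the caller, so the sketch makes no claim about the SDK's constructor signatures.

```typescript
// Hypothetical factory pair: callers supply how each runtime is constructed.
interface RuntimeFactories<R> {
  local: () => R;
  cloud: (apiKey: string) => R;
}

// Picks the cloud runtime only when a non-empty API key is present; otherwise falls back to local.
function selectRuntime<R>(
  env: Record<string, string | undefined>,
  factories: RuntimeFactories<R>
): R {
  const apiKey = env['TESTBASE_API_KEY'];
  if (apiKey !== undefined && apiKey !== '') {
    return factories.cloud(apiKey);
  }
  return factories.local();
}
```

With the SDK this could look like `selectRuntime(process.env, { local: () => new LocalRuntime(), cloud: (apiKey) => new CloudRuntime({ apiKey }) })`. An explicit flag may suit some projects better than inferring the choice from the presence of a key.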
Multi-agent workflows
Build sophisticated workflows by composing agents manually.
Pattern: Plan → Execute → Review
import { Agent, run, LocalRuntime } from 'computer-agents';
// Step 1: LLM creates plan
const planner = new Agent({
  name: 'Planner',
  agentType: 'llm',
  model: 'gpt-4o',
  instructions: 'Create detailed implementation plans.'
});
// Step 2: Computer agent executes
const executor = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime: new LocalRuntime(),
  workspace: './project',
  instructions: 'Execute implementation plans.'
});
// Step 3: LLM reviews result
const reviewer = new Agent({
  name: 'Reviewer',
  agentType: 'llm',
  model: 'gpt-4o',
  instructions: 'Review code for quality and correctness.'
});
// Run the workflow
const task = 'Add user authentication';
const plan = await run(planner, `Plan: ${task}`);
const implementation = await run(executor, plan.finalOutput);
const review = await run(reviewer, `Review: ${implementation.finalOutput}`);
console.log('Review:', review.finalOutput);
Pattern: Iterative refinement
// Reuses the executor and reviewer agents from the previous pattern
const maxAttempts = 3; // arbitrary cap so the loop always terminates
let approved = false;
let task = 'Add user authentication';
for (let attempt = 1; attempt <= maxAttempts && !approved; attempt++) {
  // Computer agent implements
  const implementation = await run(executor, task);
  // LLM agent reviews
  const review = await run(reviewer, `Review: ${implementation.finalOutput}`);
  if (review.finalOutput.includes('APPROVED')) {
    approved = true;
    console.log('Implementation approved!');
  } else {
    // Feed the review back as the next task
    task = `Fix these issues: ${review.finalOutput}`;
  }
}
Pattern: Agent handoffs
Use the OpenAI Agents SDK handoff system. Define each agent before any agent that lists it in handoffs, because a const cannot be referenced before its declaration:
const reviewer = new Agent({
  name: 'Reviewer',
  agentType: 'llm',
  model: 'gpt-4o',
  instructions: 'Review implementations for quality.'
});
const executor = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime: new LocalRuntime(),
  workspace: './project',
  handoffs: [reviewer], // Can delegate to reviewer
  instructions: 'Execute plans and request review.'
});
const planner = new Agent({
  name: 'Planner',
  agentType: 'llm',
  model: 'gpt-4o',
  handoffs: [executor], // Can delegate to executor
  instructions: 'Create plans and delegate execution.'
});
// Start with planner - it manages the workflow
const result = await run(planner, 'Build a REST API for users');
Pattern: Parallel execution
// Execute multiple independent tasks in parallel
const tasks = [
  'Add input validation',
  'Update documentation',
  'Fix TypeScript errors'
];
const results = await Promise.all(
  tasks.map(task => run(executor, task))
);
console.log('All tasks completed:', results.map(r => r.finalOutput));
Error handling
try {
  const result = await run(agent, 'Complex task');
  console.log('Success:', result.finalOutput);
} catch (error) {
  // A caught value is `unknown` in strict TypeScript, so narrow it before reading .message
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('Runtime required')) {
    console.error('Computer agents need a runtime!');
  } else if (message.includes('Insufficient Credits')) {
    console.error('Out of credits. Please add more.');
  } else {
    console.error('Execution failed:', message);
  }
}
Best practices
1. Use clear instructions
// ❌ Vague
instructions: 'Do coding stuff'
// ✅ Clear
instructions: 'You are a senior software developer. Write clean, well-tested code following the project conventions. Always include error handling and documentation.'
2. Compose workflows explicitly
// ✅ Explicit workflow (you control the flow)
const plan = await run(planner, task);
const code = await run(executor, plan.finalOutput);
const review = await run(reviewer, code.finalOutput);
// This is better than magic orchestration - you see exactly what happens
3. Use session continuity
// ✅ Natural incremental development
await run(agent, 'Create the user model');
await run(agent, 'Add validation');
await run(agent, 'Add tests');
// Agent remembers context from previous tasks
4. Choose the right runtime
// ✅ Development: LocalRuntime (fast, free)
const devRuntime = new LocalRuntime();
// ✅ Production: CloudRuntime (isolated, billed)
const prodRuntime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY
});
Next steps
- MCP Integration - Connect external tools and services
- Cloud Connection - Deploy to production with CloudRuntime
- Advanced Patterns - Complex workflows and optimization
- Cloud Platform - Billing, API keys, and operations
Examples
Check out the ready-to-run examples:
cd computer-agents/examples/testbase
# Basic computer agent
node basic-computer-agent.mjs
# Cloud execution
TESTBASE_API_KEY=key node computer-agent-cloud.mjs
# Multi-agent workflow
node multi-agent-workflow.mjs
# Session continuity
node hello-world.mjs
You’re ready to build! The SDK is simple, powerful, and production-ready.