Agents SDK Overview

computer-agents is the TypeScript SDK that provides a unified interface for building multi-agent systems. Create LLM agents for reasoning and computer agents for execution, with seamless switching between local and cloud runtimes.

Installation

Build from a local checkout of the repository:


cd computer-agents
pnpm install
pnpm build

Or add to your project:


npm install computer-agents
# or: pnpm add computer-agents

Quick example


import { Agent, run, LocalRuntime } from 'computer-agents';
 
// Computer agent for execution
const executor = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime: new LocalRuntime(),
  workspace: './my-repo',
  instructions: 'You are a software developer.'
});
 
const result = await run(executor, 'Add input validation to the API');
console.log(result.finalOutput);

Two agent types

Testbase has exactly two agent types. This is intentional: specialized roles like "planner" or "reviewer" are workflow patterns you compose yourself, not built-in types.

LLM Agents (`agentType: 'llm'`)

Purpose: Reasoning, planning, analysis, and review

Execution: OpenAI API (direct function calling)

Configuration:


const planner = new Agent({
  name: 'Planner',
  agentType: 'llm',
  model: 'gpt-4o',
  instructions: 'Create detailed implementation plans.',
  temperature: 0.7,
  maxTokens: 4000
});

Key properties:

agentType: 'llm' (required)
model - OpenAI model name (default: 'gpt-4o')
instructions - System prompt for the agent
tools - Custom function tools (optional)
handoffs - Other agents this agent can delegate to
temperature, maxTokens - Standard OpenAI parameters

Use cases:

Creating implementation plans
Reviewing code quality
Analyzing requirements
Making decisions
Coordinating workflows

No runtime required - LLM agents execute directly via the OpenAI API.

Computer Agents (`agentType: 'computer'`)

Purpose: Code execution, file operations, and terminal commands

Execution: Codex SDK (computer-use agent)

Configuration:


import { Agent, run, LocalRuntime } from 'computer-agents';
 
const executor = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime: new LocalRuntime({
    debug: true
  }),
  workspace: './project',
  instructions: 'Execute code changes as requested.'
});

Key properties:

agentType: 'computer' (required)
runtime - LocalRuntime or CloudRuntime (required)
workspace - Git repository path (required)
instructions - System prompt for the agent
mcpServers - MCP server configurations
reasoningEffort - 'none' | 'low' | 'medium' | 'high'

Use cases:

Writing and modifying code
Running tests and builds
Installing packages
Executing shell commands
Deploying applications

Runtime required - Computer agents need LocalRuntime or CloudRuntime.

Agent configuration reference


interface AgentConfig {
  // Core (required)
  name: string;
  agentType: 'llm' | 'computer';
 
  // Computer agents only
  runtime?: LocalRuntime | CloudRuntime;  // Required for 'computer'
  workspace?: string;                      // Required for 'computer'
 
  // LLM agents only
  model?: string;                          // Default: 'gpt-4o'
  tools?: ToolDefinition[];                // Custom function tools
 
  // Common (optional)
  instructions?: string;                   // System prompt
  mcpServers?: McpServerConfig[];          // Unified MCP config
  reasoningEffort?: 'none' | 'low' | 'medium' | 'high';
  temperature?: number;
  maxTokens?: number;
  handoffs?: Agent[];                      // Agents to delegate to
}
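
The "Required for 'computer'" comments above describe a rule that optional fields alone do not enforce. As a minimal sketch, a pre-flight check could look like the following. `AgentConfigLike` and `validateAgentConfig` are hypothetical names used only for illustration and are not part of the SDK; the runtime is typed as `unknown` because the real runtime classes are not needed for the check.

```typescript
// Hypothetical stand-in for the fields of AgentConfig that matter here.
// This is not the SDK's own type.
interface AgentConfigLike {
  name: string;
  agentType: 'llm' | 'computer';
  runtime?: unknown;
  workspace?: string;
}

// Returns a list of human-readable problems. An empty list means the
// config satisfies the "required for 'computer'" rules documented above.
function validateAgentConfig(config: AgentConfigLike): string[] {
  const problems: string[] = [];
  if (config.agentType === 'computer') {
    if (config.runtime === undefined || config.runtime === null) {
      problems.push("runtime is required for agentType 'computer'");
    }
    if (!config.workspace) {
      problems.push("workspace is required for agentType 'computer'");
    }
  }
  return problems;
}
```

Running a check like this before constructing an agent would surface a missing runtime or workspace earlier than a failed `run()` call.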

Execution with `run()`

The run() function executes a task and returns results:


import { run } from 'computer-agents';
 
const result = await run(agent, 'Create a Python calculator');
 
// Result structure
interface RunResult {
  finalOutput: string;        // Main result from agent
  state: AgentState;          // Full execution state
  // ... other metadata
}

Input formats

Simple string:


await run(agent, 'Add error handling');

With options:


await run(agent, {
  input: 'Add error handling',
  outputSchema: zodSchema  // For structured output
});

Session continuity

Testbase automatically maintains session continuity across multiple run() calls:


const agent = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime: new LocalRuntime(),
  workspace: './project'
});
 
// First run - new session
await run(agent, 'Create app.py');
console.log(agent.currentThreadId);  // "thread-abc-123"
 
// Second run - continues same session!
await run(agent, 'Add error handling');
console.log(agent.currentThreadId);  // "thread-abc-123" (same!)
 
// Third run - still the same session
await run(agent, 'Add tests');
console.log(agent.currentThreadId);  // "thread-abc-123" (same!)
 
// Reset when needed
agent.resetSession();
await run(agent, 'New project');
console.log(agent.currentThreadId);  // "thread-xyz-789" (new!)

Key benefits:

No manual session management
Agent remembers previous conversation
Natural incremental workflows
Matches human development patterns

Runtimes

Runtimes determine where and how computer agents execute.

LocalRuntime

Local machine execution:


import { Agent, LocalRuntime } from 'computer-agents';

const runtime = new LocalRuntime({
  debug: true  // Show execution details
});

const agent = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime,
  workspace: './project'
});

Characteristics:

Fast (no network overhead)
Free (no cloud costs)
Direct file access
Good for development

CloudRuntime

GCE VM execution with workspace sync:


import { Agent, CloudRuntime } from 'computer-agents';

const runtime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY,
  debug: true
});

const agent = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime,
  workspace: './project'
});

Characteristics:

Isolated environment
Usage billing
Workspace persistence (GCS)
Good for production

Switch runtimes easily:


// Development
const runtime = new LocalRuntime();

// Production (use this declaration instead of the one above)
// const runtime = new CloudRuntime({
//   apiKey: process.env.TESTBASE_API_KEY
// });

// Same agent code works with both!
const agent = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime,  // Just change this
  workspace: './project'
});

See Cloud Platform for detailed CloudRuntime documentation.
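
If the runtime should follow the deployment environment rather than a hand edit, the decision could be isolated in one function. The sketch below is an assumption-laden illustration: `chooseRuntime` and `RuntimeChoice` are hypothetical names, and using `NODE_ENV` as the switch is a common convention rather than something the SDK prescribes. Only `TESTBASE_API_KEY` comes from the examples above.

```typescript
type RuntimeChoice =
  | { kind: 'local' }
  | { kind: 'cloud'; apiKey: string };

// Picks a runtime from an environment-like record. NODE_ENV is an
// assumed convention; TESTBASE_API_KEY is the variable used above.
function chooseRuntime(env: Record<string, string | undefined>): RuntimeChoice {
  if (env['NODE_ENV'] !== 'production') {
    return { kind: 'local' };
  }
  const apiKey = env['TESTBASE_API_KEY'];
  if (!apiKey) {
    // Fail early instead of constructing a cloud runtime without a key.
    throw new Error('TESTBASE_API_KEY must be set to use the cloud runtime');
  }
  return { kind: 'cloud', apiKey };
}
```

With the SDK imported, the returned choice could be mapped to `new LocalRuntime()` or `new CloudRuntime({ apiKey })` and passed as `runtime`, with `process.env` supplied as the argument.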

Multi-agent workflows

Build sophisticated workflows by composing agents manually.

Pattern: Plan → Execute → Review


import { Agent, run, LocalRuntime } from 'computer-agents';
 
// Step 1: LLM creates plan
const planner = new Agent({
  name: 'Planner',
  agentType: 'llm',
  model: 'gpt-4o',
  instructions: 'Create detailed implementation plans.'
});

// Step 2: Computer agent executes
const executor = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime: new LocalRuntime(),
  workspace: './project',
  instructions: 'Execute implementation plans.'
});

// Step 3: LLM reviews result
const reviewer = new Agent({
  name: 'Reviewer',
  agentType: 'llm',
  model: 'gpt-4o',
  instructions: 'Review code for quality and correctness.'
});
 
// Run the workflow
const task = 'Add user authentication';
const plan = await run(planner, `Plan: ${task}`);
const implementation = await run(executor, plan.finalOutput);
const review = await run(reviewer, `Review: ${implementation.finalOutput}`);
 
console.log('Review:', review.finalOutput);
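
The three chained `run()` calls above follow a general shape: each stage receives the previous stage's output. A minimal sketch of that shape as a reusable helper follows. `Stage` and `runPipeline` are hypothetical names, not SDK exports, and the helper deliberately knows nothing about agents so it can be tried with plain functions.

```typescript
// A stage turns the previous stage's output into the next output.
type Stage = (input: string) => Promise<string>;

// Runs stages in order, feeding each output into the next stage.
// A rejection from any stage stops the pipeline and propagates.
async function runPipeline(input: string, stages: Stage[]): Promise<string> {
  let current = input;
  for (const stage of stages) {
    current = await stage(current);
  }
  return current;
}
```

With the agents defined above, a stage might wrap a call such as `run(planner, ...)` and return its `finalOutput`. Keeping the stages as an explicit array preserves the "you control the flow" property of the manual version.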

Pattern: Iterative refinement


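const maxAttempts = 5;  // Stop eventually if the reviewer never approves
let approved = false;
let task = 'Add user authentication';

for (let attempt = 1; attempt <= maxAttempts && !approved; attempt++) {
  // Computer agent implements
  const implementation = await run(executor, task);

  // LLM agent reviews
  const review = await run(reviewer, `Review: ${implementation.finalOutput}`);

  // Assumes the reviewer is instructed to reply with APPROVED only on success
  if (review.finalOutput.includes('APPROVED')) {
    approved = true;
    console.log('Implementation approved!');
  } else {
    // Feed the reviewer's issues back as the next task
    task = `Fix these issues: ${review.finalOutput}`;
  }
}

if (!approved) {
  console.warn(`Not approved after ${maxAttempts} attempts.`);
}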
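
Checking for the substring `APPROVED` also matches text such as "NOT APPROVED", so the loop above depends on how the reviewer phrases its reply. One possible hardening, sketched below, is to ask the reviewer for a fixed first line and parse only that. The `VERDICT: ...` line format and the `parseVerdict` helper are assumptions made for this example; the reviewer's `instructions` would have to request that format.

```typescript
type Verdict = 'approved' | 'changes_requested' | 'unknown';

// Reads the verdict from the first non-empty line of the review.
// The "VERDICT: ..." format is an assumed convention, not an SDK feature.
function parseVerdict(reviewText: string): Verdict {
  let firstLine = '';
  for (const line of reviewText.split('\n')) {
    const trimmed = line.trim();
    if (trimmed.length > 0) {
      firstLine = trimmed;
      break;
    }
  }
  if (/^VERDICT:\s*APPROVED$/i.test(firstLine)) return 'approved';
  if (/^VERDICT:\s*CHANGES_REQUESTED$/i.test(firstLine)) return 'changes_requested';
  // Anything else is treated as undecided rather than guessed.
  return 'unknown';
}
```

Treating `'unknown'` like `'changes_requested'` keeps the loop conservative: an unparseable review never counts as approval.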

Pattern: Agent handoffs

Use the OpenAI Agents SDK handoff system:


// Declare agents before they are referenced in `handoffs`
const reviewer = new Agent({
  name: 'Reviewer',
  agentType: 'llm',
  model: 'gpt-4o',
  instructions: 'Review implementations for quality.'
});

const executor = new Agent({
  name: 'Developer',
  agentType: 'computer',
  runtime: new LocalRuntime(),
  workspace: './project',
  handoffs: [reviewer],  // Can delegate to reviewer
  instructions: 'Execute plans and request review.'
});

const planner = new Agent({
  name: 'Planner',
  agentType: 'llm',
  model: 'gpt-4o',
  handoffs: [executor],  // Can delegate to executor
  instructions: 'Create plans and delegate execution.'
});
 
// Start with planner - it manages the workflow
const result = await run(planner, 'Build a REST API for users');

Pattern: Parallel execution


// Execute multiple independent tasks in parallel
const tasks = [
  'Add input validation',
  'Update documentation',
  'Fix TypeScript errors'
];
 
const results = await Promise.all(
  tasks.map(task => run(executor, task))
);
 
console.log('All tasks completed:', results.map(r => r.finalOutput));
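
With `Promise.all`, a single rejected task rejects the whole batch and the other results are lost to the caller. A sketch of a variant that records each failure next to its task follows. `runAllIsolated` and `TaskOutcome` are hypothetical names, and the task runner is passed in as a function so the helper does not depend on the SDK. This page does not say whether one agent instance supports concurrent runs; if it does not, the runner could create a separate agent per task.

```typescript
type TaskOutcome =
  | { task: string; ok: true; output: string }
  | { task: string; ok: false; error: string };

// Runs every task concurrently and records failures per task, so one
// rejection does not discard the results of the others.
async function runAllIsolated(
  tasks: string[],
  runTask: (task: string) => Promise<string>,
): Promise<TaskOutcome[]> {
  return Promise.all(
    tasks.map(async (task): Promise<TaskOutcome> => {
      try {
        return { task, ok: true, output: await runTask(task) };
      } catch (error) {
        const message = error instanceof Error ? error.message : String(error);
        return { task, ok: false, error: message };
      }
    }),
  );
}
```

Outcomes come back in the same order as the input tasks, so failed entries can be retried or reported individually.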

Error handling


try {
  const result = await run(agent, 'Complex task');
  console.log('Success:', result.finalOutput);
} catch (error) {
  // The caught value is `unknown` in strict TypeScript, so narrow it first
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('Runtime required')) {
    console.error('Computer agents need a runtime!');
  } else if (message.includes('Insufficient Credits')) {
    console.error('Out of credits. Please add more.');
  } else {
    console.error('Execution failed:', message);
  }
}
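
When several call sites need the same checks, the message matching can live in one place. The sketch below assumes only the two message substrings shown above. `classifyRunError` and `RunErrorKind` are hypothetical names, and matching on message text is fragile, so this is best treated as a stopgap unless the SDK exposes typed errors or error codes.

```typescript
type RunErrorKind = 'missing-runtime' | 'insufficient-credits' | 'other';

// Maps a thrown value to a coarse category based on its message text.
// The matched substrings are the ones used in the example above.
function classifyRunError(error: unknown): RunErrorKind {
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('Runtime required')) return 'missing-runtime';
  if (message.includes('Insufficient Credits')) return 'insufficient-credits';
  return 'other';
}
```

A caller can then `switch` on the returned kind instead of repeating string checks in every `catch` block.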

Best practices

1. Use clear instructions


// ❌ Vague
instructions: 'Do coding stuff'
 
// ✅ Clear
instructions: 'You are a senior software developer. Write clean, well-tested code following the project conventions. Always include error handling and documentation.'

2. Compose workflows explicitly


// ✅ Explicit workflow (you control the flow)
const plan = await run(planner, task);
const code = await run(executor, plan.finalOutput);
const review = await run(reviewer, code.finalOutput);
 
// This is better than magic orchestration - you see exactly what happens

3. Use session continuity


// ✅ Natural incremental development
await run(agent, 'Create the user model');
await run(agent, 'Add validation');
await run(agent, 'Add tests');
// Agent remembers context from previous tasks

4. Choose the right runtime


// ✅ Development: LocalRuntime (fast, free)
const devRuntime = new LocalRuntime();
 
// ✅ Production: CloudRuntime (isolated, billed)
const prodRuntime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY
});

Next steps

MCP Integration - Connect external tools and services
Cloud Connection - Deploy to production with CloudRuntime
Advanced Patterns - Complex workflows and optimization
Cloud Platform - Billing, API keys, and operations

Examples

Check out the ready-to-run examples:


cd computer-agents/examples/testbase
 
# Basic computer agent
node basic-computer-agent.mjs
 
# Cloud execution
TESTBASE_API_KEY=key node computer-agent-cloud.mjs
 
# Multi-agent workflow
node multi-agent-workflow.mjs
 
# Session continuity
node hello-world.mjs

You're ready to build. Start with LocalRuntime while developing, then switch to CloudRuntime for production.