Advanced Usage

Once you have agents running locally or in the cloud, use the techniques below to make them production-ready.

Switch providers dynamically

Agents can change execution mode mid-session by swapping the runtime client. This is useful for fallbacks or progressive rollouts.


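// Assumed import path: this guide names the package computer-agents; adjust if your setup differs.
import { Agent, run, TestbaseCloud } from 'computer-agents';

async function withCloudFallback(task: string) {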
  const localWorker = new Agent({
    name: 'Hybrid Worker',
    agentType: 'worker',
    workspace: './repo'
  });
 
  try {
    return await run(localWorker, task);
  } catch (error) {
    console.warn('Local run failed, retrying in cloud', error);
 
    const cloud = new TestbaseCloud({ apiKey: process.env.TESTBASE_API_KEY });
    const cloudWorker = await cloud.createCloudAgent({
      name: 'Cloud Worker',
      agentType: 'worker',
      workspace: './repo'
    });
 
    return run(cloudWorker, task);
  }
}
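
For progressive rollouts, one option is to route a stable percentage of tasks to the cloud. The sketch below is only an illustration: `bucketOf`, `chooseMode`, and the choice of a task or tenant key are hypothetical and not part of the SDK. It uses an FNV-1a-style hash so the same key always lands in the same bucket.

```typescript
// Map a key to a stable bucket in [0, 99] using an FNV-1a-style 32-bit hash.
function bucketOf(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % 100;
}

// Choose an execution mode: keys whose bucket falls below cloudPercent go to the cloud.
function chooseMode(key: string, cloudPercent: number): 'local' | 'cloud' {
  return bucketOf(key) < cloudPercent ? 'cloud' : 'local';
}
```

Raising `cloudPercent` only moves more keys to the cloud; keys already routed there stay there, which keeps behaviour predictable during a rollout.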

Intercept results

`run` resolves to an object that includes execution artefacts (chat history, plan, diff summaries). Persist them for auditing, and redact sensitive values before sharing them outside the team.


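const result = await run(agent, 'Implement feature flagging');

// Redact every occurrence of the secret before storing. String.replace with a
// string pattern only replaces the first match, so split/join is used instead.
const secret = process.env.SECRET;
const safeSummary = secret
  ? result.finalOutput.split(secret).join('<redacted>')
  : result.finalOutput;
await auditLog.write({ summary: safeSummary, plan: result.plan });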

Custom toolchains

Combine MCP servers with traditional tools (tools array) for LLM agents that need HTTP calls or vector searches without Codex CLI.
Register additional MCP servers dynamically per task when context requires it (e.g., attach GitHub only for release workflows).
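
Per-task MCP registration can be sketched as a small selection function. Everything named here is an assumption for illustration: the `McpServerConfig` shape, the server names and URLs, and the `selectMcpServers` helper. How the resulting list is handed to an agent depends on the SDK and is not shown.

```typescript
// Hypothetical shape for an MCP server entry; the real SDK type may differ.
interface McpServerConfig {
  name: string;
  url: string;
}

// Servers attached to every task (illustrative values only).
const baseServers: McpServerConfig[] = [
  { name: 'docs', url: 'https://mcp.example.com/docs' },
];

// Servers attached only when the task text matches a pattern.
const conditionalServers: Array<{ when: RegExp; server: McpServerConfig }> = [
  { when: /\brelease\b/i, server: { name: 'github', url: 'https://mcp.example.com/github' } },
];

// Build the MCP server list for a single task.
function selectMcpServers(task: string): McpServerConfig[] {
  const extras = conditionalServers
    .filter(({ when }) => when.test(task))
    .map(({ server }) => server);
  return [...baseServers, ...extras];
}
```

With this table, a task such as "Prepare the release notes" would get the GitHub server in addition to the base set, while unrelated tasks would not.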

Testing strategies

Unit test prompts: treat agent configurations as code. Export helper functions that build agents with deterministic instructions and assert on the prompt template.
Replay sessions: rerun previously recorded tasks through `run` and diff the resulting artefacts against the recorded ones. Because session logs include the entire conversation, you can replay regression cases with minimal setup.
Mock Cloud SDK: stub TestbaseCloud methods when running unit tests locally by providing a fake client that returns deterministic responses.
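
The first and third strategies might look like the sketch below. The `AgentConfig` and `CloudClient` interfaces are hypothetical, narrowed to the one cloud call used in this guide, and the `instructions` field and `buildReviewerConfig` helper are assumptions for illustration rather than confirmed SDK API.

```typescript
// Hypothetical config shape; field names beyond name/agentType/workspace are assumptions.
interface AgentConfig {
  name: string;
  agentType: 'planner' | 'worker' | 'reviewer';
  workspace: string;
  instructions?: string;
}

// Hypothetical minimal interface covering only createCloudAgent.
interface CloudClient {
  createCloudAgent(config: AgentConfig): Promise<AgentConfig & { id: string }>;
}

// Helper under test: builds a reviewer config with fixed, assertable instructions.
function buildReviewerConfig(workspace: string): AgentConfig {
  return {
    name: 'Reviewer',
    agentType: 'reviewer',
    workspace,
    instructions: 'Review the latest diff for regressions. Reply with PASS or FAIL.',
  };
}

// Fake client: records every call and returns predictable IDs, with no network access.
class FakeCloud implements CloudClient {
  calls: AgentConfig[] = [];

  async createCloudAgent(config: AgentConfig) {
    this.calls.push(config);
    return { ...config, id: `fake-${this.calls.length}` };
  }
}
```

Code that accepts a `CloudClient` rather than constructing `TestbaseCloud` itself can be given a `FakeCloud` in tests, and assertions can then inspect `calls` to check what would have been sent.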

Error handling patterns

Error	Trigger	Remediation
`AuthenticationError`	Invalid API key for Cloud SDK	Refresh credentials; confirm `tb_{env}_...` key.
`ContainerTimeoutError`	The container exceeded `timeout` during execution	Increase `timeout`, review logs for hanging commands.
`McpServerError`	MCP invocation failed or timed out	Inspect `error.toolName`, retry selectively, or detach the MCP server.
`WorkspaceSyncError`	GCS sync conflict	Ensure uploads don’t overwrite critical files; consider persisting diffs before re-running.

Wrap idempotent calls (e.g., container creation) in retry logic, and escalate to humans when diffs are ambiguous.
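
A minimal retry wrapper with exponential backoff could look like this. It is a sketch under two assumptions: that the SDK errors expose their class name through `error.name`, and that timeouts and MCP failures are safe to retry while authentication failures are not. `withRetry` and `RETRYABLE` are hypothetical names.

```typescript
// Error names taken from the table above; which ones are safe to retry is an assumption.
const RETRYABLE = new Set(['ContainerTimeoutError', 'McpServerError']);

async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const name = error instanceof Error ? error.name : '';
      // Stop immediately on errors a retry cannot fix, or when attempts are exhausted.
      if (!RETRYABLE.has(name) || attempt === maxAttempts) break;
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** (attempt - 1)));
    }
  }
  throw lastError;
}
```

Only wrap operations that are safe to repeat; a run that has already modified the workspace may need a diff check before it is retried.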

Observability hooks

`cloud.getLogs(containerId, lines)` fetches recent container logs. Surface them in dashboards or Slack alerts when health checks fail.
Track result.metrics (if enabled) to understand token usage and turnaround time per agent role.
Use session IDs as trace anchors in your logging system; the Agents SDK returns them via result.sessionId.
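
Using the session ID as a trace anchor can be as simple as emitting one structured log line per run. In this sketch, `RunResultLike` and `toLogRecord` are hypothetical, and the fields inside `metrics` (`totalTokens`, `durationMs`) are assumed names; only `sessionId` and `metrics` themselves are mentioned in this guide.

```typescript
// Hypothetical subset of a run result; the fields inside metrics are assumptions.
interface RunResultLike {
  sessionId: string;
  metrics?: { totalTokens?: number; durationMs?: number };
}

// Serialize one run as a JSON log line keyed by the session ID.
function toLogRecord(role: string, result: RunResultLike): string {
  return JSON.stringify({
    traceId: result.sessionId,
    role,
    // Metrics may be disabled, so missing values are logged as null.
    totalTokens: result.metrics?.totalTokens ?? null,
    durationMs: result.metrics?.durationMs ?? null,
  });
}
```

Because every line carries the same `traceId`, planner, worker, and reviewer runs for one session can be grouped in whatever logging system you use.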

Integration with orchestrators

computer-agents includes an orchestrator type for chaining planners/workers/reviewers. When you need deeper control, compose your own loop:


// Assumes `cloud` is a TestbaseCloud client created beforehand, as in the fallback example.
const planner = new Agent({ name: 'Planner', agentType: 'planner', workspace: './repo' });
const worker = await cloud.createCloudAgent({ name: 'Worker', agentType: 'worker', workspace: './repo' });
const reviewer = new Agent({ name: 'Reviewer', agentType: 'reviewer', workspace: './repo' });
 
const plan = await run(planner, 'Plan multi-tenant billing');
await run(worker, `Implement the following plan:\n${plan.finalOutput}`);
const review = await run(reviewer, 'Review the latest diff for regressions');
 
if (review.reviewPassed === false) {
  // kick off another iteration or notify a human
}
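
The "another iteration" branch can be turned into a bounded loop. The sketch below takes the work and review steps as plain functions so that it stays independent of the SDK; `StepResult`, `Step`, and `implementWithReview` are hypothetical names, and feeding the reviewer's `finalOutput` back to the worker is one possible design rather than the SDK's own behaviour.

```typescript
// Hypothetical result shape; finalOutput and reviewPassed are the fields used above.
interface StepResult {
  finalOutput: string;
  reviewPassed?: boolean;
}

type Step = (task: string) => Promise<StepResult>;

// Run work/review cycles until the review passes or maxIterations is reached.
async function implementWithReview(
  work: Step,
  review: Step,
  plan: string,
  maxIterations = 3,
): Promise<{ passed: boolean; iterations: number }> {
  let feedback = '';
  for (let i = 1; i <= maxIterations; i++) {
    await work(`Implement the following plan:\n${plan}${feedback}`);
    const verdict = await review('Review the latest diff for regressions');
    // Mirrors the check above: only an explicit false counts as a failed review.
    if (verdict.reviewPassed !== false) return { passed: true, iterations: i };
    feedback = `\n\nAddress this review feedback:\n${verdict.finalOutput}`;
  }
  return { passed: false, iterations: maxIterations };
}
```

With the agents defined above, the steps could be supplied as `(task) => run(worker, task)` and `(task) => run(reviewer, task)`. When the function returns `passed: false`, that is the point to notify a human.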

By layering these patterns you can evolve from simple scripted runs to resilient, monitored automation that scales with your product.