Testbase Cloud Platform

Testbase Cloud provides production-ready infrastructure for running computer agents in isolated environments with automatic workspace persistence, usage tracking, and billing.

Overview

The cloud platform runs on a GCE (Google Compute Engine) VM with GCS (Google Cloud Storage) for workspace persistence. It provides a REST API for executing computer agent tasks with the same Codex SDK that powers local execution.

┌──────────────────────────────┐
│ CloudRuntime (SDK)           │
│ • Upload workspace to GCS    │
│ • POST task to API           │
│ • Download results from GCS  │
└──────────────────────────────┘

┌──────────────────────────────┐
│ GCE VM (34.170.205.13:8080)  │
│ • Express API server         │
│ • Codex SDK execution        │
│ • API key authentication     │
│ • Usage tracking & billing   │
│ • SQLite database            │
└──────────────────────────────┘

┌──────────────────────────────┐
│ Google Cloud Storage         │
│ • gs://testbase-workspaces   │
│ • Workspace persistence      │
│ • Session metadata           │
│ • Thread cache               │
└──────────────────────────────┘

Quick start

1. Get an API key

Contact your administrator or create one via the admin API:

    curl -X POST http://34.170.205.13:8080/admin/keys \
      -H "Authorization: Bearer $ADMIN_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "My Production Key",
        "keyType": "standard",
        "prefix": "tb_prod_"
      }'

Response:

    {
      "id": "...",
      "key": "tb_prod_abc123...",
      "warning": "⚠️ Store this key securely - it cannot be retrieved again!"
    }

2. Set environment variables

    export TESTBASE_API_KEY=tb_prod_abc123...
    export OPENAI_API_KEY=sk-...

3. Use CloudRuntime

    import { Agent, run, CloudRuntime } from 'computer-agents';

    const runtime = new CloudRuntime({
      apiKey: process.env.TESTBASE_API_KEY,
      debug: true // See execution details
    });

    const agent = new Agent({
      agentType: 'computer',
      runtime,
      workspace: './my-project',
      instructions: 'You are a software developer.'
    });

    const result = await run(agent, 'Create a Python calculator');
    console.log(result.finalOutput);

    // Usage and billing info automatically included
    console.log('Tokens used:', result.usage?.totalTokens);
    console.log('Cost:', result.usage?.totalCost);
    console.log('Balance remaining:', result.billing?.balanceAfter);

Core capabilities

Isolated execution

Each task runs in a fresh environment on the GCE VM:

  • No cross-contamination between tasks
  • Clean state for every execution
  • Secure API key authentication
  • Rate limiting for protection

Workspace persistence

Workspaces are synchronized to/from GCS:

  • Before execution: Upload local workspace to GCS
  • During execution: VM operates on workspace
  • After execution: Download updated workspace to local

This provides:

  • File persistence between runs
  • Team collaboration on shared workspaces
  • Session continuity across multiple tasks
  • Recovery after VM restarts
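
The upload, execute, download cycle described above can be sketched as follows. This is a minimal illustration and not the CloudRuntime implementation: the `WorkspaceStore` and `TaskExecutor` interfaces and the `runWithWorkspaceSync` function are hypothetical names introduced here, and the sketch assumes that nothing is downloaded when execution fails.

```typescript
// Hypothetical interfaces: illustrative only, not part of the computer-agents SDK.
interface WorkspaceStore {
  upload(localDir: string, workspaceId: string): Promise<void>;
  download(workspaceId: string, localDir: string): Promise<void>;
}

interface TaskExecutor {
  execute(workspaceId: string, task: string): Promise<string>;
}

// Runs one task with a full workspace sync before and after execution.
async function runWithWorkspaceSync(
  store: WorkspaceStore,
  executor: TaskExecutor,
  workspaceId: string,
  localDir: string,
  task: string,
): Promise<string> {
  // Before execution: push the local workspace to remote storage.
  await store.upload(localDir, workspaceId);
  // During execution: the remote side operates on the uploaded workspace.
  const output = await executor.execute(workspaceId, task);
  // After execution: pull the updated workspace back to the local directory.
  await store.download(workspaceId, localDir);
  return output;
}
```

Injecting the store and executor keeps the ordering logic testable with in-memory fakes, without touching GCS or the VM.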

Usage tracking & billing

Every execution is tracked for billing purposes:

  • Token usage: Input and output tokens measured
  • Cost calculation: Based on OpenAI pricing (1.5x markup)
  • Balance tracking: Real-time credit balance
  • Spending limits: Optional daily/monthly caps

Key types:

  • Standard keys - Pay-per-token with billing
  • Internal keys - Unlimited usage (testing only)

Session continuity

Thread IDs are cached in GCS, enabling:

  • Multi-turn conversations across executions
  • Context preservation between tasks
  • Automatic session resumption
  • Manual session reset when needed
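
To make the resume-or-reset behavior concrete, here is a small sketch of a thread cache. It uses an in-memory `Map` as a stand-in for the cache kept in GCS; the `ThreadCache` class, the `buildExecuteBody` helper, and the choice to key the cache by workspace ID are all assumptions made for illustration.

```typescript
// Hypothetical in-memory stand-in for the thread cache that the platform keeps in GCS.
class ThreadCache {
  private threads = new Map<string, string>();

  // Returns the cached thread ID for a workspace, or undefined when no session exists yet.
  get(workspaceId: string): string | undefined {
    return this.threads.get(workspaceId);
  }

  // Remembers the thread ID returned by an execution so the next task can resume it.
  set(workspaceId: string, threadId: string): void {
    this.threads.set(workspaceId, threadId);
  }

  // Manual session reset: the next task starts a new thread.
  reset(workspaceId: string): void {
    this.threads.delete(workspaceId);
  }
}

interface ExecuteBody {
  task: string;
  workspaceId: string;
  sessionId?: string;
}

// Builds a request body, attaching sessionId only when a thread is cached.
function buildExecuteBody(cache: ThreadCache, workspaceId: string, task: string): ExecuteBody {
  const body: ExecuteBody = { task, workspaceId };
  const sessionId = cache.get(workspaceId);
  if (sessionId !== undefined) body.sessionId = sessionId;
  return body;
}
```

After each execution, a caller would store the returned session ID with `cache.set(...)` so that the next task in the same workspace continues the conversation.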

Configuration

Default settings

    // These are the defaults - no configuration needed
    const runtime = new CloudRuntime({
      apiKey: process.env.TESTBASE_API_KEY,
      apiUrl: 'http://34.170.205.13:8080', // Hardcoded
      debug: false
    });

Custom configuration

    const runtime = new CloudRuntime({
      apiKey: process.env.TESTBASE_API_KEY,
      apiUrl: 'http://34.170.205.13:8080', // Override if needed
      debug: true, // Show execution logs
      timeout: 300000 // 5 minutes (optional)
    });

Performance characteristics

CloudRuntime has overhead from workspace synchronization:

Workspace Size                 Upload    Execution   Download   Total Overhead
Small (< 10 files, < 1MB)      ~1-2s     ~5-15s      ~1-2s      ~2-4s
Medium (50-100 files, 10MB)    ~5-10s    ~5-15s      ~5-10s     ~10-20s
Large (500+ files, 50MB+)      ~20-30s   ~5-15s      ~20-30s    ~40-60s

Optimization tips:

  • Use LocalRuntime for development (no sync overhead)
  • Keep workspaces reasonably sized
  • Consider incremental sync for large projects (planned v2.0)
  • Use .gitignore to exclude unnecessary files

See Limitations for details.

API key management

Creating keys

Via admin API:

    POST /admin/keys
    Authorization: Bearer $ADMIN_API_KEY
    Content-Type: application/json

    {
      "name": "Production Key",
      "keyType": "standard", // or "internal"
      "prefix": "tb_prod_",
      "expiresIn": 365, // days (optional)
      "permissions": ["execute", "read", "write"]
    }

Key types

Standard keys:

  • Pay-per-token billing
  • Credit balance required
  • Daily/monthly spending limits
  • Usage tracked in database

Internal keys:

  • Unlimited usage
  • No billing or tracking
  • For testing/development only
  • Bypasses budget checks

Authentication

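Include the API key in requests:

    // Via CloudRuntime (automatic)
    const runtime = new CloudRuntime({
      apiKey: process.env.TESTBASE_API_KEY
    });

    // Via direct HTTP (manual); /execute expects a POST with a JSON body
    const response = await fetch('http://34.170.205.13:8080/execute', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.TESTBASE_API_KEY}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ task: 'Create hello.py', workspaceId: 'my-project-abc123' })
    });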

Billing system

Pricing

Based on OpenAI rates with 1.5x markup:

  • Input tokens: $0.015 per 1K tokens
  • Output tokens: $0.045 per 1K tokens
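
As a worked example of these rates, the sketch below estimates the cost of a single execution. The `estimateCost` helper and the constant names are hypothetical; only the two per-1K rates come from the list above. The token counts are the ones shown in the sample `/execute` response in this document.

```typescript
// Rates from the pricing list above, in USD per 1,000 tokens.
const INPUT_RATE_PER_1K = 0.015;
const OUTPUT_RATE_PER_1K = 0.045;

// Hypothetical helper: estimates the cost of one execution in USD.
function estimateCost(inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1000) * INPUT_RATE_PER_1K + (outputTokens / 1000) * OUTPUT_RATE_PER_1K;
}

// 6548 input tokens and 108 output tokens, as in the sample /execute response.
const cost = estimateCost(6548, 108);
console.log(cost.toFixed(5)); // prints "0.10308"
```

The result matches the `totalCost` of 0.10308 reported in that sample response (0.09822 for input plus 0.00486 for output).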

Budget protection

Automatic protection prevents overspending:

Balance check:

  • Blocks execution if balance ≤ 0
  • Returns HTTP 402 Payment Required

Spending limits:

  • Optional daily/monthly caps
  • Returns HTTP 429 when exceeded
  • Per-key configuration

Real-time tracking:

  • Token usage counted per execution
  • Balance deducted immediately
  • Transaction history logged
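
The checks above could be combined into a single pre-execution decision, sketched below. The `checkBudget` function and its types are hypothetical. The order of the checks and the `spent >= limit` comparison are assumptions, since the document only states which HTTP status each condition returns.

```typescript
type KeyType = 'standard' | 'internal';

// Hypothetical state shape; the limit fields mirror the GET /billing/account response.
interface BudgetState {
  creditsBalance: number;
  spentToday: number;
  spentThisMonth: number;
  dailyLimit?: number; // optional cap
  monthlyLimit?: number; // optional cap
}

type BudgetDecision =
  | { allowed: true }
  | { allowed: false; status: 402 | 429; reason: string };

// Sketch of a budget check that runs before a task is executed.
function checkBudget(keyType: KeyType, state: BudgetState): BudgetDecision {
  // Internal keys bypass budget checks entirely.
  if (keyType === 'internal') return { allowed: true };
  // Balance check: block with 402 Payment Required when balance <= 0.
  if (state.creditsBalance <= 0) {
    return { allowed: false, status: 402, reason: 'Insufficient balance' };
  }
  // Spending limits: block with 429 when an optional cap has been reached.
  if (state.dailyLimit !== undefined && state.spentToday >= state.dailyLimit) {
    return { allowed: false, status: 429, reason: 'Daily spending limit reached' };
  }
  if (state.monthlyLimit !== undefined && state.spentThisMonth >= state.monthlyLimit) {
    return { allowed: false, status: 429, reason: 'Monthly spending limit reached' };
  }
  return { allowed: true };
}
```

Returning a decision object rather than throwing lets an API handler map the result directly to an HTTP response.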

Checking your balance

Via SDK result:

    const result = await run(agent, task);
    console.log('Balance:', result.billing?.balanceAfter);
    console.log('Total spent:', result.billing?.totalSpent);

Via REST API:

    GET /billing/account
    Authorization: Bearer $TESTBASE_API_KEY

Response:

    {
      "creditsBalance": 9.89,
      "totalSpent": 0.11,
      "dailyLimit": 10,
      "monthlyLimit": 300
    }

Adding credits

Admins can add credits via API:

    POST /billing/admin/:apiKeyId/credits
    Authorization: Bearer $ADMIN_API_KEY
    Content-Type: application/json

    {
      "amount": 100,
      "description": "Credit purchase - $100"
    }

REST API reference

POST /execute

Execute a computer agent task.

Headers:

    Authorization: Bearer YOUR_API_KEY
    Content-Type: application/json

Request body:

    {
      "task": "Create hello.py",
      "workspaceId": "my-project-abc123",
      "sessionId": "thread-xyz" // optional
    }

Response:

    {
      "output": "Created hello.py successfully",
      "sessionId": "thread-xyz",
      "workspaceId": "my-project-abc123",
      "usage": {
        "inputTokens": 6548,
        "outputTokens": 108,
        "totalTokens": 6656,
        "totalCost": 0.10308
      },
      "billing": {
        "balanceAfter": 9.89692,
        "totalSpent": 0.10308
      }
    }
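
For callers that use the REST API directly rather than CloudRuntime, the request and response shapes above can be captured as TypeScript types. The sketch below is one possible client helper, not part of the SDK: `executeTask` and `FetchLike` are hypothetical names, the interfaces are inferred from the examples above, and the error messages are illustrative. The 402 and 429 handling follows the statuses described under budget protection.

```typescript
// Shapes inferred from the request and response examples above.
interface ExecuteRequest {
  task: string;
  workspaceId: string;
  sessionId?: string;
}

interface ExecuteResponse {
  output: string;
  sessionId: string;
  workspaceId: string;
  usage: { inputTokens: number; outputTokens: number; totalTokens: number; totalCost: number };
  billing: { balanceAfter: number; totalSpent: number };
}

// Minimal subset of the fetch API so that a fake can be injected (hypothetical helper type).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ status: number; ok: boolean; json(): Promise<unknown> }>;

// Hypothetical helper: posts a task and maps the documented 402/429 statuses to errors.
async function executeTask(
  fetchFn: FetchLike,
  baseUrl: string,
  apiKey: string,
  request: ExecuteRequest,
): Promise<ExecuteResponse> {
  const response = await fetchFn(`${baseUrl}/execute`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(request),
  });
  if (response.status === 402) throw new Error('Insufficient balance (HTTP 402)');
  if (response.status === 429) throw new Error('Limit exceeded (HTTP 429)');
  if (!response.ok) throw new Error(`Execution failed (HTTP ${response.status})`);
  // The cast trusts the server; a stricter client would validate the payload.
  return (await response.json()) as ExecuteResponse;
}
```

In a runtime with a global `fetch`, the helper can be called as `executeTask(fetch, 'http://34.170.205.13:8080', apiKey, { task, workspaceId })`.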

GET /health

Check API health and version.

Response:

    {
      "status": "healthy",
      "project": "firechatbot-a9654",
      "timestamp": "2025-01-20T10:00:00.000Z",
      "uptime": 3600
    }

GET /billing/account

Get billing account details.

Headers:

Authorization: Bearer YOUR_API_KEY

Response:

    {
      "id": "...",
      "creditsBalance": 9.89,
      "totalSpent": 0.11,
      "dailyLimit": 10,
      "monthlyLimit": 300
    }

See API Documentation for complete reference.

Deployment

The cloud infrastructure is deployed on:

  • GCP Project: firechatbot-a9654
  • VM: testbase-ubuntu-vm (GCE)
  • Zone: us-central1-a (region us-central1)
  • External IP: 34.170.205.13 (API on port 8080)
  • Storage: gs://testbase-workspaces

Deploying updates

    cd computer-agents/packages/cloud-infrastructure

    # Build
    pnpm build

    # Deploy to VM
    pnpm deploy

Viewing logs

    # SSH to VM
    gcloud compute ssh testbase-ubuntu-vm \
      --project=firechatbot-a9654 \
      --zone=us-central1-a

    # View service logs
    sudo journalctl -u testbase-cloud -f

    # Check service status
    sudo systemctl status testbase-cloud

See Operations Guide for detailed deployment documentation.

Security

API key authentication

All endpoints (except /health) require authentication:

  • Keys are SHA-256 hashed in database
  • Plain keys only shown once at creation
  • No way to retrieve plain key after creation
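
The hash-at-rest scheme described above might look like the following sketch, which uses Node's built-in `crypto` module. The helper names and the key format (prefix plus 32 random bytes in hex) are assumptions; the document only states that keys are SHA-256 hashed and shown once.

```typescript
import { createHash, randomBytes, timingSafeEqual } from 'node:crypto';

// Hypothetical helper: generates a key from a prefix plus 32 random bytes in hex.
function generateApiKey(prefix: string): string {
  return prefix + randomBytes(32).toString('hex');
}

// SHA-256 digest of the plain key, hex-encoded. Only this value would be stored.
function hashApiKey(plainKey: string): string {
  return createHash('sha256').update(plainKey).digest('hex');
}

// Hashes the presented key and compares it to the stored hash in constant time.
function verifyApiKey(presentedKey: string, storedHash: string): boolean {
  const presented = Buffer.from(hashApiKey(presentedKey), 'hex');
  const stored = Buffer.from(storedHash, 'hex');
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return presented.length === stored.length && timingSafeEqual(presented, stored);
}
```

Because only the digest is stored, the plain key cannot be recovered from the database, which is why it is displayed only at creation time.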

Budget protection

Standard keys have automatic budget controls:

  • Balance checks before execution
  • Spending limits enforced
  • Usage tracked per key
  • Transaction logs for audit

Rate limiting

Global rate limits protect the API:

  • 100 requests per 15 min per IP
  • 30 execute requests per 15 min per IP
  • Configurable per-key limits (planned)
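
One way to enforce per-IP limits like these is a fixed-window counter, sketched below. The document states the limits but not the algorithm, so the fixed-window approach and the `FixedWindowRateLimiter` class are assumptions made for illustration.

```typescript
// Fixed-window rate limiter sketch, keyed by client IP.
class FixedWindowRateLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true when the request is allowed, false when the IP is over its limit.
  allow(ip: string, now: number = Date.now()): boolean {
    const current = this.windows.get(ip);
    if (!current || now - current.start >= this.windowMs) {
      // No window yet, or the previous one has expired: start a new window.
      this.windows.set(ip, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.maxRequests) return false;
    current.count += 1;
    return true;
  }
}

const FIFTEEN_MINUTES_MS = 15 * 60 * 1000;
// Limits taken from the list above.
const globalLimiter = new FixedWindowRateLimiter(100, FIFTEEN_MINUTES_MS); // all requests
const executeLimiter = new FixedWindowRateLimiter(30, FIFTEEN_MINUTES_MS); // execute requests
```

A fixed window is simple but allows short bursts at window boundaries; a sliding window or token bucket would smooth that out at the cost of more bookkeeping.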

Network security

  • VM firewall rules restrict access
  • HTTPS recommended for production (nginx reverse proxy)
  • API keys transmitted via headers (not query params)

Limitations

Current limitations (v1.0)

  1. Workspace sync overhead - Full sync on every execution
  2. No streaming - Blocks until task completes
  3. No task interruption - Cannot cancel running tasks
  4. Single VM - Limited concurrent execution
  5. Network overhead - Upload/download via gsutil

Planned improvements

v2.0 (Q2 2025):

  • Incremental workspace sync (only changed files)
  • Streaming support via Codex SDK
  • Direct GCS mount (eliminate sync overhead)

v2.1 (Q3 2025):

  • Task interruption support
  • Budget warnings and alerts
  • Enhanced observability

v3.0 (Q4 2025):

  • Auto-scaling with multiple VMs
  • Team collaboration features
  • PostgreSQL migration

See Architecture for complete roadmap.

When to use CloudRuntime

Use CloudRuntime when:

  • You are deploying to production
  • You need isolated execution
  • You need usage tracking or billing
  • Your team collaborates on shared workspaces
  • Workspace persistence matters

Use LocalRuntime when:

  • You are developing or testing
  • You need fast iteration cycles
  • Your repository is large (> 500 files)
  • You have no billing or isolation requirements
  • Workspace sync overhead is unacceptable

Next steps

The cloud platform is production-ready with comprehensive billing, authentication, and monitoring.
