Testbase Cloud Platform

Testbase Cloud provides production-ready infrastructure for running computer agents in isolated environments with automatic workspace persistence, usage tracking, and billing.

Overview

The cloud platform runs on a GCE (Google Compute Engine) VM with GCS (Google Cloud Storage) for workspace persistence. It provides a REST API for executing computer agent tasks with the same Codex SDK that powers local execution.


┌──────────────────────────────┐
│    CloudRuntime (SDK)        │
│  • Upload workspace to GCS   │
│  • POST task to API          │
│  • Download results from GCS │
└──────────────────────────────┘
              │
              ▼
┌──────────────────────────────┐
│  GCE VM (34.170.205.13:8080) │
│  • Express API server        │
│  • Codex SDK execution       │
│  • API key authentication    │
│  • Usage tracking & billing  │
│  • SQLite database           │
└──────────────────────────────┘
              │
              ▼
┌──────────────────────────────┐
│ Google Cloud Storage         │
│  • gs://testbase-workspaces  │
│  • Workspace persistence     │
│  • Session metadata          │
│  • Thread cache              │
└──────────────────────────────┘

Quick start

1. Get an API key

Contact your administrator or create one via the admin API:


curl -X POST http://34.170.205.13:8080/admin/keys \
  -H "Authorization: Bearer $ADMIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My Production Key",
    "keyType": "standard",
    "prefix": "tb_prod_"
  }'

Response:


{
  "id": "...",
  "key": "tb_prod_abc123...",
  "warning": "⚠️ Store this key securely - it cannot be retrieved again!"
}

2. Set environment variables


export TESTBASE_API_KEY=tb_prod_abc123...
export OPENAI_API_KEY=sk-...

3. Use CloudRuntime


import { Agent, run, CloudRuntime } from 'computer-agents';
 
const runtime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY,
  debug: true  // See execution details
});
 
const agent = new Agent({
  agentType: 'computer',
  runtime,
  workspace: './my-project',
  instructions: 'You are a software developer.'
});
 
const result = await run(agent, 'Create a Python calculator');
console.log(result.finalOutput);
 
// Usage and billing info automatically included
console.log('Tokens used:', result.usage?.totalTokens);
console.log('Cost:', result.usage?.totalCost);
console.log('Balance remaining:', result.billing?.balanceAfter);

Core capabilities

Isolated execution

Each task runs in a fresh environment on the GCE VM:

No cross-contamination between tasks
Clean state for every execution
Secure API key authentication
Rate limiting for protection

Workspace persistence

Workspaces are synchronized to/from GCS:

Before execution: Upload local workspace to GCS
During execution: VM operates on workspace
After execution: Download updated workspace to local

This enables:

File persistence between runs
Team collaboration on shared workspaces
Session continuity across multiple tasks
Recovery after VM restarts
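
The three-phase cycle above can be sketched as follows. This is an illustration only: `WorkspaceStore`, `InMemoryStore`, and `runWithSync` are hypothetical names rather than part of the computer-agents SDK, and an in-memory map stands in for the GCS bucket.

```typescript
// Hypothetical sketch of the upload -> execute -> download cycle.
// An in-memory map stands in for GCS; none of these names come from the SDK.
type Files = Record<string, string>; // path -> file contents

interface WorkspaceStore {
  upload(workspaceId: string, files: Files): Promise<void>;
  download(workspaceId: string): Promise<Files>;
}

class InMemoryStore implements WorkspaceStore {
  private buckets = new Map<string, Files>();

  async upload(workspaceId: string, files: Files): Promise<void> {
    // Copy so later edits to the caller's object do not leak into the store.
    this.buckets.set(workspaceId, { ...files });
  }

  async download(workspaceId: string): Promise<Files> {
    return { ...(this.buckets.get(workspaceId) ?? {}) };
  }
}

// `execute` stands in for the VM-side task: it receives the remote copy
// of the workspace and returns the updated files.
async function runWithSync(
  store: WorkspaceStore,
  workspaceId: string,
  local: Files,
  execute: (remote: Files) => Promise<Files>
): Promise<Files> {
  await store.upload(workspaceId, local);           // before execution
  const remote = await store.download(workspaceId); // VM reads the workspace
  const updated = await execute(remote);            // during execution
  await store.upload(workspaceId, updated);         // VM writes results back
  return store.download(workspaceId);               // after execution
}
```

Because every run copies the whole workspace in both directions, the cost of this cycle grows with workspace size rather than with the size of the change.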

Usage tracking & billing

Every execution is tracked for billing purposes:

Token usage: Input and output tokens measured
Cost calculation: Based on OpenAI pricing (1.5x markup)
Balance tracking: Real-time credit balance
Spending limits: Optional daily/monthly caps

Key types:

Standard keys - Pay-per-token with billing
Internal keys - Unlimited usage (testing only)

Session continuity

Thread IDs are cached in GCS, enabling:

Multi-turn conversations across executions
Context preservation between tasks
Automatic session resumption
Manual session reset when needed
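
A minimal sketch of how a thread-ID cache might support resumption and reset. `ThreadCache` and its methods are hypothetical; the platform caches thread IDs in GCS, whereas this example uses a `Map` so it can run on its own.

```typescript
// Hypothetical sketch: a thread-ID cache keyed by workspace.
// A Map stands in for the GCS-backed cache described above.
class ThreadCache {
  private threads = new Map<string, string>();

  // Return the cached thread ID, if any, so the next task can resume the conversation.
  resume(workspaceId: string): string | undefined {
    return this.threads.get(workspaceId);
  }

  // Record the thread ID returned by an execution.
  remember(workspaceId: string, threadId: string): void {
    this.threads.set(workspaceId, threadId);
  }

  // Manual reset: the next task starts without a cached thread.
  reset(workspaceId: string): void {
    this.threads.delete(workspaceId);
  }
}

const cache = new ThreadCache();
cache.remember('my-project-abc123', 'thread-xyz');
const resumed = cache.resume('my-project-abc123');    // 'thread-xyz'
cache.reset('my-project-abc123');
const afterReset = cache.resume('my-project-abc123'); // undefined
```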

Configuration

Default settings


// These are the defaults - no configuration needed
const runtime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY,
  apiUrl: 'http://34.170.205.13:8080',  // Built-in default
  debug: false
});

Custom configuration


const runtime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY,
  apiUrl: 'http://34.170.205.13:8080',  // Override if needed
  debug: true,  // Show execution logs
  timeout: 300000  // 5 minutes (optional)
});

Performance characteristics

CloudRuntime has overhead from workspace synchronization:

Workspace size                 Upload     Execution   Download   Sync overhead (upload + download)
Small (< 10 files, < 1 MB)     ~1-2s      ~5-15s      ~1-2s      ~2-4s
Medium (50-100 files, 10 MB)   ~5-10s     ~5-15s      ~5-10s     ~10-20s
Large (500+ files, 50 MB+)     ~20-30s    ~5-15s      ~20-30s    ~40-60s

Optimization tips:

Use LocalRuntime for development (no sync overhead)
Keep workspaces reasonably sized
For large projects, note that incremental sync is planned for v2.0 and is not available yet
Use .gitignore to exclude unnecessary files

See Limitations for details.

API key management

Creating keys

Via admin API:


POST /admin/keys
Authorization: Bearer $ADMIN_API_KEY
Content-Type: application/json
 
{
  "name": "Production Key",
  "keyType": "standard",
  "prefix": "tb_prod_",
  "expiresIn": 365,
  "permissions": ["execute", "read", "write"]
}

keyType is "standard" or "internal". expiresIn is given in days and is optional.
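
For callers building this request in code, the body can be expressed as a typed payload. The field names mirror the request above. The `CreateKeyRequest` type, the `expiryDate` helper, and the assumption that expiry is counted in 24-hour days from creation are illustrative and not confirmed by the API.

```typescript
// Hypothetical helper: field names mirror the request body above,
// but the validation rule and the expiry arithmetic are assumptions.
type KeyType = 'standard' | 'internal';

interface CreateKeyRequest {
  name: string;
  keyType: KeyType;
  prefix: string;
  expiresIn?: number;     // days
  permissions?: string[];
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Compute when a key would expire, or null if no expiry was requested.
function expiryDate(req: CreateKeyRequest, createdAt: Date): Date | null {
  if (req.expiresIn === undefined) return null;
  if (!Number.isInteger(req.expiresIn) || req.expiresIn <= 0) {
    throw new RangeError('expiresIn must be a positive whole number of days');
  }
  return new Date(createdAt.getTime() + req.expiresIn * MS_PER_DAY);
}

const request: CreateKeyRequest = {
  name: 'Production Key',
  keyType: 'standard',
  prefix: 'tb_prod_',
  expiresIn: 365,
  permissions: ['execute', 'read', 'write'],
};

const expires = expiryDate(request, new Date('2025-01-20T10:00:00.000Z'));
console.log(expires?.toISOString()); // 2026-01-20T10:00:00.000Z
```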

Key types

Standard keys:

Pay-per-token billing
Credit balance required
Daily/monthly spending limits
Usage tracked in database

Internal keys:

Unlimited usage
No billing or tracking
For testing/development only
Bypasses budget checks

Authentication

Include API key in requests:


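// Via CloudRuntime (automatic)
const runtime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY
});

// Via direct HTTP (manual); /execute is a POST endpoint
fetch('http://34.170.205.13:8080/execute', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ task: 'Create hello.py', workspaceId: 'my-project-abc123' })
});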

Billing system

Pricing

Based on OpenAI rates with 1.5x markup:

Input tokens: $0.015 per 1K tokens
Output tokens: $0.045 per 1K tokens
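
As a worked example, the rates above reproduce the `totalCost` shown in the sample POST /execute response (6548 input tokens and 108 output tokens). The function name and the rounding to five decimal places are assumptions made for this sketch.

```typescript
// Rates are taken from the list above; the function name and rounding are assumptions.
const INPUT_RATE_PER_1K = 0.015;  // USD per 1K input tokens
const OUTPUT_RATE_PER_1K = 0.045; // USD per 1K output tokens

function estimateCost(inputTokens: number, outputTokens: number): number {
  const cost =
    (inputTokens / 1000) * INPUT_RATE_PER_1K +
    (outputTokens / 1000) * OUTPUT_RATE_PER_1K;
  // Round to 5 decimal places to avoid floating-point noise in the result.
  return Math.round(cost * 1e5) / 1e5;
}

// 6.548 * 0.015 + 0.108 * 0.045 = 0.09822 + 0.00486
console.log(estimateCost(6548, 108)); // 0.10308
```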

Budget protection

Automatic protection prevents overspending:

Balance check:

Blocks execution if balance ≤ 0
Returns HTTP 402 Payment Required

Spending limits:

Optional daily/monthly caps
Returns HTTP 429 when exceeded
Per-key configuration

Real-time tracking:

Token usage counted per execution
Balance deducted immediately
Transaction history logged
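
The balance check and spending limits can be read as a pre-execution gate. The sketch below is one possible shape for that gate, not the server's implementation: `BillingAccount`, `checkBudget`, the `spentToday` and `spentThisMonth` fields, and the order of the checks are all assumptions.

```typescript
// Hypothetical pre-execution check; the account shape and function name are assumptions.
interface BillingAccount {
  keyType: 'standard' | 'internal';
  creditsBalance: number;
  spentToday: number;
  spentThisMonth: number;
  dailyLimit?: number;   // optional caps
  monthlyLimit?: number;
}

type BudgetDecision =
  | { allowed: true }
  | { allowed: false; status: 402 | 429; reason: string };

function checkBudget(account: BillingAccount): BudgetDecision {
  // Internal keys bypass budget checks entirely.
  if (account.keyType === 'internal') return { allowed: true };

  // Balance check: block when the balance is zero or negative.
  if (account.creditsBalance <= 0) {
    return { allowed: false, status: 402, reason: 'Insufficient credits' };
  }
  // Spending limits: only enforced when a cap is configured.
  if (account.dailyLimit !== undefined && account.spentToday >= account.dailyLimit) {
    return { allowed: false, status: 429, reason: 'Daily spending limit reached' };
  }
  if (account.monthlyLimit !== undefined && account.spentThisMonth >= account.monthlyLimit) {
    return { allowed: false, status: 429, reason: 'Monthly spending limit reached' };
  }
  return { allowed: true };
}
```

Returning a decision object rather than throwing keeps the HTTP status next to the rule that produced it, which makes the mapping to 402 and 429 easy to audit.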

Checking your balance

Via SDK result:


const result = await run(agent, task);
console.log('Balance:', result.billing?.balanceAfter);
console.log('Total spent:', result.billing?.totalSpent);

Via REST API:


GET /billing/account
Authorization: Bearer $TESTBASE_API_KEY

Response:


{
  "creditsBalance": 9.89,
  "totalSpent": 0.11,
  "dailyLimit": 10,
  "monthlyLimit": 300
}

Adding credits

Admins can add credits via API:


POST /billing/admin/:apiKeyId/credits
Authorization: Bearer $ADMIN_API_KEY
Content-Type: application/json
 
{
  "amount": 100,
  "description": "Credit purchase - $100"
}

REST API reference

POST /execute

Execute a computer agent task.

Headers:


Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

Request body:


{
  "task": "Create hello.py",
  "workspaceId": "my-project-abc123",
  "sessionId": "thread-xyz"
}

sessionId is optional.

Response:


{
  "output": "Created hello.py successfully",
  "sessionId": "thread-xyz",
  "workspaceId": "my-project-abc123",
  "usage": {
    "inputTokens": 6548,
    "outputTokens": 108,
    "totalTokens": 6656,
    "totalCost": 0.10308
  },
  "billing": {
    "balanceAfter": 9.89692,
    "totalSpent": 0.10308
  }
}
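
For callers that use the endpoint directly instead of CloudRuntime, a small typed wrapper might look like the following. The interfaces mirror the request and response fields shown above. `FetchLike` and `executeTask` are hypothetical names, and the error handling is an assumption, since the error response body is not documented here.

```typescript
// Minimal typed client for POST /execute. Field names follow the request and
// response shown above; FetchLike and executeTask are hypothetical names.
interface ExecuteRequest {
  task: string;
  workspaceId: string;
  sessionId?: string;
}

interface ExecuteResponse {
  output: string;
  sessionId: string;
  workspaceId: string;
  usage: { inputTokens: number; outputTokens: number; totalTokens: number; totalCost: number };
  billing: { balanceAfter: number; totalSpent: number };
}

// Narrow structural type so any fetch-compatible function (or a test stub) can be passed in.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function executeTask(
  fetchImpl: FetchLike,
  baseUrl: string,
  apiKey: string,
  request: ExecuteRequest
): Promise<ExecuteResponse> {
  const response = await fetchImpl(`${baseUrl}/execute`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(request),
  });
  if (!response.ok) {
    // Surface the status code so callers can distinguish, for example, 402 from 429.
    throw new Error(`POST /execute failed with HTTP ${response.status}`);
  }
  // The response is trusted to match the documented shape; no runtime validation is done.
  return (await response.json()) as ExecuteResponse;
}
```

In Node 18+ or a browser, the global `fetch` should be usable as `fetchImpl`. Taking it as a parameter keeps the wrapper testable without network access.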

GET /health

Check API health and uptime.

Response:


{
  "status": "healthy",
  "project": "firechatbot-a9654",
  "timestamp": "2025-01-20T10:00:00.000Z",
  "uptime": 3600
}

GET /billing/account

Get billing account details.

Headers:


Authorization: Bearer YOUR_API_KEY

Response:


{
  "id": "...",
  "creditsBalance": 9.89,
  "totalSpent": 0.11,
  "dailyLimit": 10,
  "monthlyLimit": 300
}

See API Documentation for complete reference.

Deployment

The cloud infrastructure is deployed on:

GCP Project: firechatbot-a9654
VM: testbase-ubuntu-vm (GCE)
Zone: us-central1-a
External IP: 34.170.205.13:8080
Storage: gs://testbase-workspaces

Deploying updates


cd computer-agents/packages/cloud-infrastructure
 
# Build
pnpm build
 
# Deploy to VM
pnpm deploy

Viewing logs


# SSH to VM
gcloud compute ssh testbase-ubuntu-vm \
  --project=firechatbot-a9654 \
  --zone=us-central1-a
 
# View service logs
sudo journalctl -u testbase-cloud -f
 
# Check service status
sudo systemctl status testbase-cloud

See Operations Guide for detailed deployment documentation.

Security

API key authentication

All endpoints except /health require an API key. Keys are handled as follows:

Keys are SHA-256 hashed in database
Plain keys only shown once at creation
No way to retrieve plain key after creation
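
A minimal sketch of hash-only key storage using Node's built-in `crypto` module. The function names, the 24-byte random key body, and the constant-time comparison are assumptions for illustration. The source states only that keys are SHA-256 hashed and shown once.

```typescript
import { createHash, randomBytes, timingSafeEqual } from 'node:crypto';

// Hypothetical sketch of hash-only key storage; names and key length are assumptions.
function hashKey(plainKey: string): string {
  return createHash('sha256').update(plainKey).digest('hex');
}

// Generate a key, returning the plain value (shown once) and the hash (stored).
function generateKey(prefix: string): { plainKey: string; keyHash: string } {
  const plainKey = prefix + randomBytes(24).toString('hex');
  return { plainKey, keyHash: hashKey(plainKey) };
}

// Verify a presented key against the stored hash using a constant-time comparison.
function verifyKey(presentedKey: string, storedHash: string): boolean {
  const presented = Buffer.from(hashKey(presentedKey), 'hex');
  const stored = Buffer.from(storedHash, 'hex');
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return presented.length === stored.length && timingSafeEqual(presented, stored);
}
```

Because only the hash is stored, a lost key cannot be recovered and has to be replaced with a new one.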

Budget protection

Standard keys have automatic budget controls:

Balance checks before execution
Spending limits enforced
Usage tracked per key
Transaction logs for audit

Rate limiting

Global rate limits protect the API:

100 requests per 15 min per IP
30 execute requests per 15 min per IP
Configurable per-key limits (planned)
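
The limits above can be illustrated with a fixed-window counter per IP. This is a sketch under assumptions: the server's actual algorithm (fixed window, sliding window, or token bucket) is not documented here, and `FixedWindowLimiter` is a hypothetical name.

```typescript
// Hypothetical fixed-window limiter; the server's actual algorithm is not documented here.
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it should be rejected.
  // `now` is injectable so the behavior can be tested without waiting.
  allow(ip: string, now: number = Date.now()): boolean {
    const current = this.windows.get(ip);
    if (!current || now - current.start >= this.windowMs) {
      this.windows.set(ip, { start: now, count: 1 }); // start a new window
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

const FIFTEEN_MINUTES = 15 * 60 * 1000;
const allRequests = new FixedWindowLimiter(100, FIFTEEN_MINUTES);    // all endpoints
const executeRequests = new FixedWindowLimiter(30, FIFTEEN_MINUTES); // /execute only
```

A request to /execute would need to pass both limiters, since it counts toward the general limit as well as the execute-specific one.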

Network security

VM firewall rules restrict access
HTTPS recommended for production (nginx reverse proxy)
API keys transmitted via headers (not query params)

Limitations

Current limitations (v1.0)

Workspace sync overhead - Full sync on every execution
No streaming - Blocks until task completes
No task interruption - Cannot cancel running tasks
Single VM - Limited concurrent execution
Network overhead - Upload/download via gsutil

Planned improvements

v2.0 (Q2 2025):

Incremental workspace sync (only changed files)
Streaming support via Codex SDK
Direct GCS mount (eliminate sync overhead)

v2.1 (Q3 2025):

Task interruption support
Budget warnings and alerts
Enhanced observability

v3.0 (Q4 2025):

Auto-scaling with multiple VMs
Team collaboration features
PostgreSQL migration

See Architecture for complete roadmap.

When to use CloudRuntime

✅ Use CloudRuntime when:

Production deployments
Isolated execution required
Usage tracking/billing needed
Team collaboration on shared workspaces
Workspace persistence important

❌ Use LocalRuntime when:

Development and testing
Fast iteration cycles needed
Large repositories (> 500 files)
No billing/isolation requirements
Workspace sync overhead unacceptable

Next steps

API Documentation - Complete REST API reference
Operations Guide - Deployment and maintenance
Billing Guide - Detailed billing documentation
API Key Management - Key creation and management

The cloud platform is production-ready with comprehensive billing, authentication, and monitoring.