Testbase Cloud Platform
Testbase Cloud provides production-ready infrastructure for running computer agents in isolated environments with automatic workspace persistence, usage tracking, and billing.
Overview
The cloud platform runs on a GCE (Google Compute Engine) VM with GCS (Google Cloud Storage) for workspace persistence. It provides a REST API for executing computer agent tasks with the same Codex SDK that powers local execution.
┌──────────────────────────────┐
│ CloudRuntime (SDK)           │
│ • Upload workspace to GCS    │
│ • POST task to API           │
│ • Download results from GCS  │
└──────────────────────────────┘
               │
               ▼
┌──────────────────────────────┐
│ GCE VM (34.170.205.13:8080)  │
│ • Express API server         │
│ • Codex SDK execution        │
│ • API key authentication     │
│ • Usage tracking & billing   │
│ • SQLite database            │
└──────────────────────────────┘
               │
               ▼
┌──────────────────────────────┐
│ Google Cloud Storage         │
│ • gs://testbase-workspaces   │
│ • Workspace persistence      │
│ • Session metadata           │
│ • Thread cache               │
└──────────────────────────────┘
Quick start
1. Get an API key
Contact your administrator or create one via the admin API:
curl -X POST http://34.170.205.13:8080/admin/keys \
  -H "Authorization: Bearer $ADMIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My Production Key",
    "keyType": "standard",
    "prefix": "tb_prod_"
  }'
Response:
{
  "id": "...",
  "key": "tb_prod_abc123...",
  "warning": "⚠️ Store this key securely - it cannot be retrieved again!"
}
2. Set environment variables
export TESTBASE_API_KEY=tb_prod_abc123...
export OPENAI_API_KEY=sk-...
3. Use CloudRuntime
import { Agent, run, CloudRuntime } from 'computer-agents';
const runtime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY,
  debug: true // See execution details
});
const agent = new Agent({
  agentType: 'computer',
  runtime,
  workspace: './my-project',
  instructions: 'You are a software developer.'
});
const result = await run(agent, 'Create a Python calculator');
console.log(result.finalOutput);
// Usage and billing info automatically included
console.log('Tokens used:', result.usage?.totalTokens);
console.log('Cost:', result.usage?.totalCost);
console.log('Balance remaining:', result.billing?.balanceAfter);
Core capabilities
Isolated execution
Each task runs in a fresh environment on the GCE VM:
- No cross-contamination between tasks
- Clean state for every execution
- Secure API key authentication
- Rate limiting for protection
Workspace persistence
Workspaces are synchronized to/from GCS:
- Before execution: Upload local workspace to GCS
- During execution: VM operates on workspace
- After execution: Download updated workspace to the local machine
This ensures:
- Files persist between runs
- Team collaboration on shared workspaces
- Session continuity across multiple tasks
- Recovery after VM restarts
Usage tracking & billing
Every execution is tracked for billing purposes:
- Token usage: Input and output tokens measured
- Cost calculation: Based on OpenAI pricing (1.5x markup)
- Balance tracking: Real-time credit balance
- Spending limits: Optional daily/monthly caps
Key types:
- Standard keys - Pay-per-token with billing
- Internal keys - Unlimited usage (testing only)
Session continuity
Thread IDs are cached in GCS, enabling:
- Multi-turn conversations across executions
- Context preservation between tasks
- Automatic session resumption
- Manual session reset when needed
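The resume/reset behaviour listed above can be sketched with an in-memory stand-in for the GCS thread cache. `ThreadCache` and its method names are hypothetical, and keying the cache by workspace ID is an assumption; the real cache layout in GCS is not documented here.

```typescript
// Hypothetical in-memory stand-in for the GCS-backed thread cache.
// Assumption: one cached thread ID per workspace ID.
class ThreadCache {
  private readonly threads = new Map<string, string>();

  // Returns the cached thread ID, or undefined when a fresh session is needed.
  resume(workspaceId: string): string | undefined {
    return this.threads.get(workspaceId);
  }

  // Stores the thread ID returned by an execution so the next task can continue it.
  remember(workspaceId: string, threadId: string): void {
    this.threads.set(workspaceId, threadId);
  }

  // Manual session reset: the next task starts without prior context.
  reset(workspaceId: string): void {
    this.threads.delete(workspaceId);
  }
}

const cache = new ThreadCache();
cache.remember("my-project-abc123", "thread-xyz");
console.log(cache.resume("my-project-abc123"));
cache.reset("my-project-abc123");
console.log(cache.resume("my-project-abc123"));
```

In this sketch a missing entry simply means the next execution opens a new thread, which mirrors the "automatic resumption, manual reset" split described above.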
Configuration
Default settings
// These are the defaults - no configuration needed
const runtime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY,
  apiUrl: 'http://34.170.205.13:8080', // Hardcoded
  debug: false
});
Custom configuration
const runtime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY,
  apiUrl: 'http://34.170.205.13:8080', // Override if needed
  debug: true, // Show execution logs
  timeout: 300000 // 5 minutes (optional)
});
Performance characteristics
CloudRuntime has overhead from workspace synchronization:
| Workspace Size | Upload | Execution | Download | Total Overhead |
|---|---|---|---|---|
| Small (< 10 files, < 1MB) | ~1-2s | ~5-15s | ~1-2s | ~2-4s |
| Medium (50-100 files, 10MB) | ~5-10s | ~5-15s | ~5-10s | ~10-20s |
| Large (500+ files, 50MB+) | ~20-30s | ~5-15s | ~20-30s | ~40-60s |
Optimization tips:
- Use LocalRuntime for development (no sync overhead)
- Keep workspaces reasonably sized
- Consider incremental sync for large projects (planned for v2.0)
- Use .gitignore to exclude unnecessary files
See Limitations for details.
API key management
Creating keys
Via admin API:
POST /admin/keys
Authorization: Bearer $ADMIN_API_KEY
Content-Type: application/json
{
  "name": "Production Key",
  "keyType": "standard", // or "internal"
  "prefix": "tb_prod_",
  "expiresIn": 365, // days (optional)
  "permissions": ["execute", "read", "write"]
}
Key types
Standard keys:
- Pay-per-token billing
- Credit balance required
- Daily/monthly spending limits
- Usage tracked in database
Internal keys:
- Unlimited usage
- No billing or tracking
- For testing/development only
- Bypasses budget checks
Authentication
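Include the API key in requests:
// Via CloudRuntime (automatic)
const runtime = new CloudRuntime({
  apiKey: process.env.TESTBASE_API_KEY
});
// Via direct HTTP (manual)
fetch('http://34.170.205.13:8080/execute', {
  method: 'POST', // documented as POST /execute
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ task: 'Create hello.py', workspaceId: 'my-project-abc123' })
});
Billing system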
Pricing
Based on OpenAI rates with 1.5x markup:
- Input tokens: $0.015 per 1K tokens
- Output tokens: $0.045 per 1K tokens
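As a worked example of these rates, the sketch below recomputes the cost of a run with 6548 input tokens and 108 output tokens, the figures used in the POST /execute sample response. `estimateUsage` is a hypothetical helper, not part of the SDK, and the rounding to five decimal places is an assumption made only to hide floating-point noise.

```typescript
// Per-1K-token rates copied from the pricing list above (USD).
const INPUT_RATE_PER_1K = 0.015;
const OUTPUT_RATE_PER_1K = 0.045;

interface UsageEstimate {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  totalCost: number;
}

// Hypothetical helper: estimates the billed cost of one execution.
function estimateUsage(inputTokens: number, outputTokens: number): UsageEstimate {
  const rawCost =
    (inputTokens / 1000) * INPUT_RATE_PER_1K +
    (outputTokens / 1000) * OUTPUT_RATE_PER_1K;
  return {
    inputTokens,
    outputTokens,
    totalTokens: inputTokens + outputTokens,
    // Assumption: round to 5 decimal places; the platform's rounding rule is not documented.
    totalCost: Math.round(rawCost * 1e5) / 1e5,
  };
}

const estimate = estimateUsage(6548, 108);
console.log(estimate.totalTokens, estimate.totalCost); // prints 6656 0.10308
```

Input tokens account for $0.09822 and output tokens for $0.00486, which sums to the $0.10308 shown in the sample response.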
Budget protection
Automatic protection prevents overspending:
Balance check:
- Blocks execution if balance ≤ 0
- Returns HTTP 402 Payment Required
Spending limits:
- Optional daily/monthly caps
- Returns HTTP 429 when exceeded
- Per-key configuration
Real-time tracking:
- Token usage counted per execution
- Balance deducted immediately
- Transaction history logged
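The checks above can be sketched as a single pre-execution gate. This is a minimal illustration, not the server's implementation: `checkBudget`, `BudgetState`, the order of the checks, and treating "exceeded" as "spent at or above the cap" are all assumptions. The status codes (402 and 429) and the internal-key bypass come from this document.

```typescript
type KeyType = "standard" | "internal";

// Hypothetical snapshot of a key's billing state before a task runs.
interface BudgetState {
  keyType: KeyType;
  creditsBalance: number;
  spentToday: number;
  spentThisMonth: number;
  dailyLimit?: number; // optional cap
  monthlyLimit?: number; // optional cap
}

type BudgetDecision =
  | { allowed: true }
  | { allowed: false; status: 402 | 429; reason: string };

function checkBudget(state: BudgetState): BudgetDecision {
  // Internal keys bypass budget checks entirely.
  if (state.keyType === "internal") return { allowed: true };
  // Balance check: block when balance <= 0 (HTTP 402 Payment Required).
  if (state.creditsBalance <= 0) {
    return { allowed: false, status: 402, reason: "Insufficient credit balance" };
  }
  // Spending limits: assumption that reaching the cap counts as exceeding it.
  if (state.dailyLimit !== undefined && state.spentToday >= state.dailyLimit) {
    return { allowed: false, status: 429, reason: "Daily spending limit reached" };
  }
  if (state.monthlyLimit !== undefined && state.spentThisMonth >= state.monthlyLimit) {
    return { allowed: false, status: 429, reason: "Monthly spending limit reached" };
  }
  return { allowed: true };
}

console.log(
  checkBudget({
    keyType: "standard",
    creditsBalance: 9.89,
    spentToday: 0.11,
    spentThisMonth: 0.11,
    dailyLimit: 10,
    monthlyLimit: 300,
  }),
);
```

Callers of the API can use the same distinction in reverse: a 402 means credits must be added, while a 429 means waiting for the limit window to roll over or raising the cap.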
Checking your balance
Via SDK result:
const result = await run(agent, task);
console.log('Balance:', result.billing?.balanceAfter);
console.log('Total spent:', result.billing?.totalSpent);
Via REST API:
GET /billing/account
Authorization: Bearer $TESTBASE_API_KEY
Response:
{
  "creditsBalance": 9.89,
  "totalSpent": 0.11,
  "dailyLimit": 10,
  "monthlyLimit": 300
}
Adding credits
Admins can add credits via API:
POST /billing/admin/:apiKeyId/credits
Authorization: Bearer $ADMIN_API_KEY
Content-Type: application/json
{
  "amount": 100,
  "description": "Credit purchase - $100"
}
REST API reference
POST /execute
Execute a computer agent task.
Headers:
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
Request body:
{
  "task": "Create hello.py",
  "workspaceId": "my-project-abc123",
  "sessionId": "thread-xyz" // optional
}
Response:
{
  "output": "Created hello.py successfully",
  "sessionId": "thread-xyz",
  "workspaceId": "my-project-abc123",
  "usage": {
    "inputTokens": 6548,
    "outputTokens": 108,
    "totalTokens": 6656,
    "totalCost": 0.10308
  },
  "billing": {
    "balanceAfter": 9.89692,
    "totalSpent": 0.10308
  }
}
GET /health
Check API health and version.
Response:
{
  "status": "healthy",
  "project": "firechatbot-a9654",
  "timestamp": "2025-01-20T10:00:00.000Z",
  "uptime": 3600
}
GET /billing/account
Get billing account details.
Headers:
Authorization: Bearer YOUR_API_KEY
Response:
{
  "id": "...",
  "creditsBalance": 9.89,
  "totalSpent": 0.11,
  "dailyLimit": 10,
  "monthlyLimit": 300
}
See API Documentation for complete reference.
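For callers that do not use CloudRuntime, the POST /execute request shape documented in this reference can be sketched as a small typed builder. The interface and function names (`ExecuteRequest`, `HttpRequest`, `buildExecuteRequest`) are hypothetical; only the endpoint path, headers, and body fields come from this page.

```typescript
// Request body fields as documented for POST /execute.
interface ExecuteRequest {
  task: string;
  workspaceId: string;
  sessionId?: string; // optional: continue an existing thread
}

// Hypothetical plain description of an HTTP call, independent of any HTTP client.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildExecuteRequest(apiUrl: string, apiKey: string, request: ExecuteRequest): HttpRequest {
  return {
    // Strip trailing slashes so the path is not doubled.
    url: `${apiUrl.replace(/\/+$/, "")}/execute`,
    method: "POST",
    headers: {
      // Keys travel in a header, never in query parameters.
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(request),
  };
}

const req = buildExecuteRequest("http://34.170.205.13:8080", "YOUR_API_KEY", {
  task: "Create hello.py",
  workspaceId: "my-project-abc123",
});
console.log(req.method, req.url);
```

The result can be handed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Keeping the builder free of network calls makes it easy to unit test.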
Deployment
The cloud infrastructure is deployed on:
- GCP Project: firechatbot-a9654
- VM: testbase-ubuntu-vm (GCE)
- Zone: us-central1-a
- External IP: 34.170.205.13:8080
- Storage: gs://testbase-workspaces
Deploying updates
cd computer-agents/packages/cloud-infrastructure
# Build
pnpm build
# Deploy to VM
pnpm deploy
Viewing logs
# SSH to VM
gcloud compute ssh testbase-ubuntu-vm \
--project=firechatbot-a9654 \
--zone=us-central1-a
# View service logs
sudo journalctl -u testbase-cloud -f
# Check service status
sudo systemctl status testbase-cloud
See Operations Guide for detailed deployment documentation.
Security
API key authentication
All endpoints (except /health) require authentication:
- Keys are SHA-256 hashed in database
- Plain keys only shown once at creation
- No way to retrieve plain key after creation
Budget protection
Standard keys have automatic budget controls:
- Balance checks before execution
- Spending limits enforced
- Usage tracked per key
- Transaction logs for audit
Rate limiting
Global rate limits protect the API:
- 100 requests per 15 min per IP
- 30 execute requests per 15 min per IP
- Configurable per-key limits (planned)
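To make the limits above concrete, here is a minimal fixed-window limiter keyed by IP address. It is an illustration only: the document does not say which algorithm or library the server uses, and `FixedWindowLimiter` is a hypothetical name. Only the 100 and 30 requests per 15 minutes figures come from the list above.

```typescript
// Fixed-window counter: each IP gets `limit` requests per `windowMs` window.
// Assumption: the real server may use a different algorithm (e.g. a sliding window).
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the request is allowed; `now` is injectable for testing.
  allow(ip: string, now: number = Date.now()): boolean {
    const current = this.windows.get(ip);
    if (current === undefined || now - current.start >= this.windowMs) {
      // First request from this IP, or the previous window has expired.
      this.windows.set(ip, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

const FIFTEEN_MINUTES_MS = 15 * 60 * 1000;
const globalLimiter = new FixedWindowLimiter(100, FIFTEEN_MINUTES_MS);
const executeLimiter = new FixedWindowLimiter(30, FIFTEEN_MINUTES_MS);

// An /execute request would need to pass both limiters in this sketch.
const clientIp = "203.0.113.7";
console.log(globalLimiter.allow(clientIp) && executeLimiter.allow(clientIp));
```

Clients that batch many tasks from one IP should expect the tighter /execute limit to be reached first, so spacing requests or adding backoff is a reasonable precaution.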
Network security
- VM firewall rules restrict access
- HTTPS recommended for production (nginx reverse proxy)
- API keys transmitted via headers (not query params)
Limitations
Current limitations (v1.0)
- Workspace sync overhead - Full sync on every execution
- No streaming - Blocks until task completes
- No task interruption - Cannot cancel running tasks
- Single VM - Limited concurrent execution
- Network overhead - Upload/download via gsutil
Planned improvements
v2.0 (Q2 2025):
- Incremental workspace sync (only changed files)
- Streaming support via Codex SDK
- Direct GCS mount (eliminate sync overhead)
v2.1 (Q3 2025):
- Task interruption support
- Budget warnings and alerts
- Enhanced observability
v3.0 (Q4 2025):
- Auto-scaling with multiple VMs
- Team collaboration features
- PostgreSQL migration
See Architecture for complete roadmap.
When to use CloudRuntime
✅ Use CloudRuntime when:
- Production deployments
- Isolated execution required
- Usage tracking/billing needed
- Team collaboration on shared workspaces
- Workspace persistence important
❌ Use LocalRuntime when:
- Development and testing
- Fast iteration cycles needed
- Large repositories (> 500 files)
- No billing/isolation requirements
- Workspace sync overhead unacceptable
Next steps
- API Documentation - Complete REST API reference
- Operations Guide - Deployment and maintenance
- Billing Guide - Detailed billing documentation
- API Key Management - Key creation and management
The cloud platform is production-ready with comprehensive billing, authentication, and monitoring.