Execution Engine
The Execution Engine carries out the routing decision. It takes the ranked provider list from the Scoring Layer, sends the request, handles streaming responses, retries on transient failures, and walks the fallback chain when a provider is down.
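As a rough illustration of the flow just described, the sketch below walks a ranked candidate list and moves on to the next provider when an attempt fails. All names here (`Candidate`, `AttemptRecord`, `executeWithFallback`, the `send` callback) are hypothetical and not taken from the Layerr codebase; retries, streaming, and trace persistence are deliberately left out.

```typescript
// Hypothetical types for illustration only.
interface Candidate {
  provider: string;
  score: number;
}

interface AttemptRecord {
  provider: string;
  ok: boolean;
  error?: string;
}

interface ChainResult<T> {
  result?: T;
  attempts: AttemptRecord[];
}

// Try each candidate in ranked order and stop at the first success.
async function executeWithFallback<T>(
  ranked: Candidate[],
  send: (candidate: Candidate) => Promise<T>,
): Promise<ChainResult<T>> {
  const attempts: AttemptRecord[] = [];
  for (const candidate of ranked) {
    try {
      const result = await send(candidate);
      attempts.push({ provider: candidate.provider, ok: true });
      return { result, attempts };
    } catch (err) {
      // Record the failure and continue down the chain.
      attempts.push({
        provider: candidate.provider,
        ok: false,
        error: err instanceof Error ? err.message : String(err),
      });
    }
  }
  // Exhaustion: every candidate failed, so only the attempt log is returned.
  return { attempts };
}
```

Returning the attempt log in both the success and the exhaustion case is one way to make sure a full trace is available whichever branch is taken.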
Request Lifecycle

    Ranked Candidate List
        |
        v
    [1] Attempt Primary    --> Send to top-scored provider
        |
        v
    [2] Stream / Await     --> Handle streaming or blocking response
        |
        v
    [3] Success?           --> Yes: return result + record trace
        |                      No: continue to fallback
        v
    [4] Attempt Fallback 1 --> Send to next candidate
        |
        v
    [5] Exhaustion?        --> Yes: return error with full trace
        |                      No: continue chain
        v
    [6] Record Trace       --> Persist full execution trace

Core Components
Request Builder
- File: gateway/handler.ts
- Transforms internal Layerr request format into provider-specific formats (OpenAI, Anthropic, Ollama, custom)
- Handles API key injection from the Secrets Manager
- Sets headers, timeouts, and retry policies
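To make the transformation step concrete, here is a minimal sketch of a request builder for two provider formats. The `InternalRequest` shape, the `buildRequest` function, and the `HttpRequest` return type are assumptions made for this example; the actual internal Layerr format and the contents of gateway/handler.ts are not shown in this document. The endpoint URLs and header names are those of the public OpenAI Chat Completions and Anthropic Messages APIs.

```typescript
// Hypothetical internal request shape (illustrative only).
interface InternalRequest {
  model: string;
  system?: string;
  messages: { role: "user" | "assistant"; content: string }[];
  maxTokens: number;
  stream: boolean;
}

interface HttpRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
  timeoutMs: number;
}

type ProviderKind = "openai" | "anthropic";

function buildRequest(
  kind: ProviderKind,
  req: InternalRequest,
  apiKey: string, // assumed to be supplied by the Secrets Manager
  timeoutMs: number,
): HttpRequest {
  if (kind === "openai") {
    // OpenAI-style chat requests carry the system prompt as the first message.
    const messages = req.system
      ? [{ role: "system", content: req.system }, ...req.messages]
      : req.messages;
    return {
      url: "https://api.openai.com/v1/chat/completions",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: req.model,
        messages,
        max_tokens: req.maxTokens,
        stream: req.stream,
      }),
      timeoutMs,
    };
  }
  // Anthropic's Messages API takes the system prompt as a top-level field.
  return {
    url: "https://api.anthropic.com/v1/messages",
    headers: {
      "Content-Type": "application/json",
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
    },
    body: JSON.stringify({
      model: req.model,
      system: req.system,
      messages: req.messages,
      max_tokens: req.maxTokens,
      stream: req.stream,
    }),
    timeoutMs,
  };
}
```

Keeping the builder a pure function (no network call, key passed in as an argument) makes it easy to unit test and keeps secrets handling in one place.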
Provider Adapters
- OpenAI-compatible: providers/openai-compat/adapter.ts
- Ollama: providers/ollama/adapter.ts
- Each adapter normalizes request/response formats so the rest of the system is provider-agnostic
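The normalization idea can be sketched as a small adapter interface with one implementation per provider family. The `ProviderAdapter` and `NormalizedResponse` types below are hypothetical and are not the interfaces defined in the adapter files listed above. The raw field names follow the public non-streaming response formats of OpenAI-compatible chat completions (`choices`, `usage.prompt_tokens`, `usage.completion_tokens`) and Ollama's `/api/chat` (`message`, `prompt_eval_count`, `eval_count`).

```typescript
// Hypothetical normalized shape shared by all adapters.
interface NormalizedResponse {
  text: string;
  tokensIn: number;
  tokensOut: number;
}

interface ProviderAdapter {
  name: string;
  parseResponse(raw: unknown): NormalizedResponse;
}

const openAiCompatAdapter: ProviderAdapter = {
  name: "openai-compat",
  parseResponse(raw) {
    const r = raw as {
      choices?: { message?: { content?: string } }[];
      usage?: { prompt_tokens?: number; completion_tokens?: number };
    };
    return {
      text: r.choices?.[0]?.message?.content ?? "",
      tokensIn: r.usage?.prompt_tokens ?? 0,
      tokensOut: r.usage?.completion_tokens ?? 0,
    };
  },
};

const ollamaAdapter: ProviderAdapter = {
  name: "ollama",
  parseResponse(raw) {
    const r = raw as {
      message?: { content?: string };
      prompt_eval_count?: number;
      eval_count?: number;
    };
    return {
      text: r.message?.content ?? "",
      // Missing counters are treated as zero in this sketch.
      tokensIn: r.prompt_eval_count ?? 0,
      tokensOut: r.eval_count ?? 0,
    };
  },
};
```

Callers then work only with `NormalizedResponse`, so adding a provider means adding one adapter rather than touching downstream code.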
Streaming Handler
- File: server.ts (primary handler for /v1/chat/completions)
- Manages Server-Sent Events (SSE) streams
- Supports cancellation mid-stream
- Counts tokens in real time for cost attribution
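The SSE handling can be illustrated with a small incremental parser. This is a simplified sketch, not the code in server.ts: `parseSse`, `consume`, and `StreamState` are hypothetical names, the parser assumes events are separated by `\n\n` (the SSE format also permits `\r\n` and `\r` line endings), and the payload shape assumes OpenAI-style streaming chunks that end with a `[DONE]` sentinel.

```typescript
// Split a text buffer into complete SSE events.
// The trailing partial event, if any, is returned as `rest` for the next read.
function parseSse(buffer: string): { data: string[]; rest: string } {
  const parts = buffer.split("\n\n");
  const rest = parts.pop() ?? "";
  const data: string[] = [];
  for (const part of parts) {
    const lines = part
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, "")); // drop one optional space
    if (lines.length > 0) data.push(lines.join("\n"));
  }
  return { data, rest };
}

interface StreamState {
  text: string;
  done: boolean;
}

// Fold OpenAI-style delta payloads into the accumulated text.
function consume(state: StreamState, payloads: string[]): StreamState {
  let { text, done } = state;
  for (const payload of payloads) {
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const chunk = JSON.parse(payload) as {
      choices?: { delta?: { content?: string } }[];
    };
    text += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return { text, done };
}
```

For mid-stream cancellation, one common approach is to pass an `AbortController`'s `signal` to `fetch` and call `abort()` when the client disconnects; whether Layerr does this is not stated here.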
Retry & Circuit Breaker
- File: runtime/protection/classifier.ts
- Classifies errors as transient (retryable) or permanent
- Implements exponential backoff with jitter
- Circuit breaker pattern: after N consecutive failures, provider is temporarily removed from rotation
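The two mechanisms in this list can be sketched as follows. This is an illustrative example under stated assumptions, not the logic in runtime/protection/classifier.ts: the "full jitter" variant of backoff, the default base and cap values, the cooldown behaviour, and the names `backoffDelayMs` and `CircuitBreaker` are all choices made for the example.

```typescript
// Exponential backoff with full jitter:
// a random delay in [0, min(capMs, baseMs * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  baseMs = 250, // assumed default
  capMs = 8000, // assumed default
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold: number, cooldownMs: number, now: () => number = Date.now) {
    this.threshold = threshold; // N consecutive failures
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(): void {
    this.failures += 1;
    // Open (or re-open) the breaker once the threshold is reached.
    if (this.failures >= this.threshold) this.openedAt = this.now();
  }

  // Available unless the breaker is open and still cooling down.
  isAvailable(): boolean {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.cooldownMs;
  }
}
```

Injecting `random` and `now` keeps both pieces deterministic under test. After the cooldown the provider is let back into rotation; a single further failure re-opens the breaker because the failure count is only reset on success.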
Error Classification
The protection classifier (runtime/protection/classifier.ts) categorises errors:
| Error Type | Examples | Action |
|---|---|---|
| Transient | 429 rate limit, 503 unavailable, timeout | Retry with backoff |
| Auth | 401, 403 | Fail fast, alert admin |
| Content | 400 bad request, context too long | Fail fast, log for analysis |
| Provider Down | Connection refused, DNS failure | Fall back immediately, mark unhealthy |
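The table above maps naturally onto a small classification function. The sketch below is one possible encoding and is not the contents of runtime/protection/classifier.ts: the `ProviderError` shape, the `classifyError` and `actionFor` names, and the extra `unknown` category for cases the table does not cover are assumptions. `ECONNREFUSED` and `ENOTFOUND` are the Node.js error codes for a refused connection and a failed DNS lookup.

```typescript
type ErrorCategory = "transient" | "auth" | "content" | "provider_down" | "unknown";
type Action = "retry_with_backoff" | "fail_fast" | "fallback";

// Hypothetical error shape for this example.
interface ProviderError {
  status?: number; // HTTP status, if a response was received
  code?: string; // Node-style network error code
  timedOut?: boolean;
}

function classifyError(err: ProviderError): ErrorCategory {
  if (err.timedOut) return "transient";
  if (err.code === "ECONNREFUSED" || err.code === "ENOTFOUND") return "provider_down";
  switch (err.status) {
    case 429:
    case 503:
      return "transient";
    case 401:
    case 403:
      return "auth";
    case 400:
      return "content";
    default:
      // Not covered by the table; treated conservatively in this sketch.
      return "unknown";
  }
}

function actionFor(category: ErrorCategory): Action {
  switch (category) {
    case "transient":
      return "retry_with_backoff";
    case "provider_down":
      return "fallback";
    default:
      // auth, content, and unknown all fail fast here.
      return "fail_fast";
  }
}
```

For example, `actionFor(classifyError({ status: 429 }))` yields `"retry_with_backoff"`, while a refused connection yields `"fallback"`.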
Timeout Profiles
| Profile | Timeout | Use Case |
|---|---|---|
| Fast | 5 seconds | Autocomplete, quick fixes |
| Standard | 30 seconds | General coding tasks |
| Deep | 120 seconds | Architecture design, complex reasoning |
| Custom | User-defined | Specialised workloads |
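One way to apply these profiles is a lookup table plus a generic deadline wrapper. The sketch below is an assumption about how such a mechanism could look, not the definitions in runtime/timeout/profiles.ts; `TIMEOUT_MS`, `resolveTimeoutMs`, `withTimeout`, and the `{ customMs }` shape for the Custom profile are hypothetical names.

```typescript
type ProfileName = "fast" | "standard" | "deep";

// Values taken from the table above, in milliseconds.
const TIMEOUT_MS: Record<ProfileName, number> = {
  fast: 5_000,
  standard: 30_000,
  deep: 120_000,
};

// A custom profile is modelled here as an explicit millisecond value.
function resolveTimeoutMs(profile: ProfileName | { customMs: number }): number {
  return typeof profile === "string" ? TIMEOUT_MS[profile] : profile.customMs;
}

// Reject if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying request. Pairing the deadline with an abort signal on the outgoing request would be needed to actually release the connection.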
Execution Trace Format
Every execution is recorded as a trace with:

    interface ExecutionTrace {
      traceId: string;
      workspaceId: string;
      request: {
        intent: IntentClassification;
        workload: WorkloadProfile;
        strategy: Strategy;
        providers: ProviderCandidate[]; // ranked list
      };
      attempts: Attempt[];
      finalOutcome: {
        provider: string;
        model: string;
        latencyMs: number;
        tokensIn: number;
        tokensOut: number;
        costUsd: number;
        qualityScore: number;
      };
      timestamps: {
        routedAt: Date;
        firstTokenAt: Date;
        completedAt: Date;
      };
    }

File Reference
| File | What It Does |
|---|---|
| server.ts | Main server entrypoint. All API routes flow through here |
| gateway/handler.ts | Request transformation and gateway routing |
| providers/openai-compat/adapter.ts | OpenAI-compatible API adapter |
| providers/ollama/adapter.ts | Ollama local model adapter |
| providers/resolution.ts | Provider URL resolution and health checking |
| providers/registry.ts | Provider registration and metadata |
| runtime/protection/classifier.ts | Error classification and retry logic |
| runtime/timeout/profiles.ts | Timeout profile definitions |
| src/features/runtime/execution/executionModel.ts | Frontend execution data model |
| src/features/runtime/lifecycle/lifecycleModel.ts | Execution lifecycle state machine |
Integration
- Scoring Layer → provides ranked candidate list
- Secrets Manager → provides API keys for provider auth
- Provider Health → receives success/failure signals for real-time health updates
- Replay → receives full execution traces for storage
- Economics → receives token counts and costs for attribution
- Explainability → receives execution details for post-hoc explanations