Multi-Agent Coordination
Hierarchical agent coordination where work flows down the org chart and results flow up, with full observability via REST API and WebSocket event streaming.
Maximus supports hierarchical agent coordination. An orchestrator delegates to managers, managers delegate to workers, and every action is traceable in real time. This document covers hierarchy setup, delegation patterns, task lifecycle, safety mechanisms, observability, and the full API reference.
Hierarchy Setup
Agent hierarchy is defined using the reportsTo field in each agent's Markdown frontmatter. An agent without reportsTo is a root (typically the orchestrator). Agents with reportsTo are children that can only receive delegated work from their parent.
Delegation is enforced in code, not decided by agents: the runtime validates the hierarchy before spawning any child work. An agent cannot route work to itself or delegate to arbitrary agents.
Example Agent Definitions
Three agent definition files establishing an orchestrator → manager → worker chain:
Root coordinator — no reportsTo
---
name: orchestrator
description: Top-level coordinator that breaks work into streams
model: opus
maxTurns: 50
---

You coordinate complex projects by breaking them into work streams and delegating to specialized managers. You synthesize results from managers into cohesive deliverables.
Manager — reportsTo: orchestrator
---
name: research-manager
description: Manages research tasks and coordinates research workers
model: sonnet
maxTurns: 30
reportsTo: orchestrator
skills:
  - github-operations
---

You manage research workflows. When given a research objective, break it into focused tasks and delegate to research workers. Aggregate findings into structured reports.
Worker — reportsTo: research-manager
---
name: research-worker
description: Executes focused research tasks
model: haiku
maxTurns: 20
reportsTo: research-manager
skills:
  - github-operations
---

You execute focused research tasks. Gather information, analyze it, and return structured findings to your manager.
Resulting Hierarchy
orchestrator (root -- no reportsTo)
|
+-- research-manager (reportsTo: orchestrator)
    |
    +-- research-worker (reportsTo: research-manager)
The AgentRegistry.canDelegateTo(from, to) method validates that the target agent's reportsTo matches the delegating agent's name. See packages/core/src/agents/registry.ts.
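The check described above can be sketched in a few lines. This is a minimal illustration, not the contents of registry.ts: SketchRegistry, AgentDefinition, and register are hypothetical names, and the real AgentRegistry may perform additional validation.

```typescript
// Hypothetical minimal registry; the real AgentRegistry may differ.
interface AgentDefinition {
  name: string;
  reportsTo?: string;
}

class SketchRegistry {
  private agents = new Map<string, AgentDefinition>();

  register(agent: AgentDefinition): void {
    this.agents.set(agent.name, agent);
  }

  // True only when `to` is registered and names `from` as its direct parent.
  canDelegateTo(from: string, to: string): boolean {
    const target = this.agents.get(to);
    return target !== undefined && target.reportsTo === from;
  }
}

const registry = new SketchRegistry();
registry.register({ name: "orchestrator" });
registry.register({ name: "research-manager", reportsTo: "orchestrator" });
registry.register({ name: "research-worker", reportsTo: "research-manager" });

console.log(registry.canDelegateTo("orchestrator", "research-manager")); // true
console.log(registry.canDelegateTo("orchestrator", "research-worker")); // false
```

Under this rule the orchestrator cannot skip a level: it must go through research-manager to reach research-worker.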
Delegation Patterns
Maximus provides two coordination primitives:
How Delegation Works
The Delegator class (packages/core/src/delegation/delegator.ts) executes this sequence:
1. Validate hierarchy: confirms registry.canDelegateTo(from, to) returns true.
2. Check circuit breakers: ensures chain depth and concurrent task limits are not exceeded.
3. Check token budget: if budgetCeiling is set, verifies the trace has not exceeded it.
4. Create task: creates a Task record in the TaskStore with status created.
5. Transition to assigned: task status moves to assigned.
6. Acquire agent lock: prevents concurrent sessions on the same agent.
7. Transition to in-progress: task status moves to in-progress and the agent session starts.
8. Run child session: calls engine.runAgent(), which starts a Claude SDK session.
9. Record usage and complete: on success, records token usage and transitions the task to completed.
10. Handle failure: on error, transitions the task to failed and propagates the error to the parent.
Fan-Out
A manager can delegate to multiple workers in parallel. Each delegation creates its own task, acquires its own lock, and runs independently. The maxConcurrent circuit breaker limits how many can run simultaneously within a trace.
const results = await Promise.all([
  delegator.delegate({
    fromAgent: "research-manager",
    toAgent: "worker-1",
    prompt: "Research topic A",
    traceId,
  }),
  delegator.delegate({
    fromAgent: "research-manager",
    toAgent: "worker-2",
    prompt: "Research topic B",
    traceId,
  }),
]);
Context Passing
Context is passed as a structured message (prompt + relevant prior output), not raw conversation history. The parent agent decides what context is relevant:
const result = await delegator.delegate({
  fromAgent: "orchestrator",
  toAgent: "research-manager",
  prompt: `Analyze the quarterly report. Here is the summary from finance: ${financeResult.output}`,
  traceId,
});
Results Flow Back
The parent receives the child's SessionResult.output. The parent can then act on it, delegate further, or return it up the chain.
Error Handling
On failure, the task is marked failed with the error message. The error propagates to the parent agent, which can decide to retry, escalate, or abort.
HierarchyViolationError: Delegation target does not report to sender
CircuitBreakerError: Chain depth or concurrent task limit exceeded
BudgetExceededError: Token budget ceiling reached
Task Lifecycle
Every delegation creates a first-class Task entity that tracks the full lifecycle of that unit of work.
State Machine
      +----------+
      | created  |
      +----+-----+
           |
      +----v-----+
      | assigned |
      +----+-----+
           |
   +-------v-------+
   |  in-progress  |
   +---+-------+---+
       |       |
+------v--+ +--v------+
|completed| | failed  |
+---------+ +---------+
Transitions are strictly enforced. The only valid paths are:
created → assigned → in-progress
in-progress → completed or failed
States cannot be skipped. See packages/core/src/tasks/lifecycle.ts for the VALID_TRANSITIONS map.
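One way such a transition table could look is sketched below. The name VALID_TRANSITIONS comes from the source, but its shape here is an assumption, and canTransition and transition are hypothetical helpers added for illustration.

```typescript
type TaskStatus = "created" | "assigned" | "in-progress" | "completed" | "failed";

// Assumed shape: each status maps to the statuses it may move to next.
const VALID_TRANSITIONS: Record<TaskStatus, readonly TaskStatus[]> = {
  created: ["assigned"],
  assigned: ["in-progress"],
  "in-progress": ["completed", "failed"],
  completed: [], // terminal
  failed: [], // terminal
};

// Hypothetical helper: is `to` directly reachable from `from`?
function canTransition(from: TaskStatus, to: TaskStatus): boolean {
  return VALID_TRANSITIONS[from].includes(to);
}

// Hypothetical helper: return the new status, or throw on an illegal move.
function transition(current: TaskStatus, next: TaskStatus): TaskStatus {
  if (!canTransition(current, next)) {
    throw new Error(`Invalid task transition: ${current} -> ${next}`);
  }
  return next;
}
```

Keeping the rules in a single lookup table means a skipped state (for example created directly to in-progress) is rejected by the same code path as any other illegal move.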
Task Fields
Each task tracks these fields (defined in packages/shared/src/tasks.ts):
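A rough TypeScript shape for a task, inferred from the task objects in the REST API responses later in this document. It is a sketch under that assumption, not a copy of packages/shared/src/tasks.ts: which fields are optional is a guess, and the real definition may include fields not shown here.

```typescript
// Status values taken from the state machine above.
type TaskStatus = "created" | "assigned" | "in-progress" | "completed" | "failed";

// Inferred from the REST API response examples; optionality is assumed.
interface Task {
  id: string;
  parentTaskId?: string | null; // null for a task at the root of a chain
  agentName: string;
  status: TaskStatus;
  prompt: string;
  result?: string; // present once the task has completed
  traceId: string;
  tokenUsage: number;
  createdAt: number; // epoch milliseconds
  updatedAt: number;
  completedAt?: number;
}

const sampleTask: Task = {
  id: "task_abc",
  parentTaskId: null,
  agentName: "research-manager",
  status: "completed",
  prompt: "Analyze the quarterly report",
  result: "Key findings: ...",
  traceId: "abc123",
  tokenUsage: 1500,
  createdAt: 1710936000000,
  updatedAt: 1710936005000,
  completedAt: 1710936005000,
};
```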
TaskStore API
The TaskStore class (packages/core/src/tasks/store.ts) provides:
Tasks are stored in memory for v1, so they can be queried via the REST API only while the server is running.
Safety: Budgets & Breakers
Token Budgets
Token budgets are configurable per delegation chain via the budgetCeiling field on DelegationRequest. The BudgetTracker (packages/core/src/tasks/budget.ts) accumulates usage per traceId and blocks delegation when the ceiling is reached.
await delegator.delegate({
  fromAgent: "orchestrator",
  toAgent: "research-manager",
  prompt: "...",
  traceId,
  budgetCeiling: 100000, // Max tokens for this entire chain
});
If the chain's accumulated usage reaches or exceeds the ceiling, BudgetExceededError is thrown.
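The accumulate-then-check behavior can be sketched as follows. SketchBudgetTracker and its record, getUsage, and check methods are hypothetical; only the per-traceId accumulation, the "reaches or exceeds" rule, and the used and ceiling fields on the error come from this document.

```typescript
// Stand-in for the real error class; `used` and `ceiling` mirror the documented fields.
class BudgetExceededError extends Error {
  readonly used: number;
  readonly ceiling: number;

  constructor(used: number, ceiling: number) {
    super(`Token budget exceeded: used ${used} of ${ceiling}`);
    this.name = "BudgetExceededError";
    this.used = used;
    this.ceiling = ceiling;
  }
}

// Hypothetical tracker: sums token usage per trace ID.
class SketchBudgetTracker {
  private usage = new Map<string, number>();

  record(traceId: string, tokens: number): void {
    this.usage.set(traceId, this.getUsage(traceId) + tokens);
  }

  getUsage(traceId: string): number {
    return this.usage.get(traceId) ?? 0;
  }

  // Throws once accumulated usage reaches or exceeds the ceiling.
  // With no ceiling configured, the check is a no-op.
  check(traceId: string, ceiling?: number): void {
    if (ceiling === undefined) return;
    const used = this.getUsage(traceId);
    if (used >= ceiling) throw new BudgetExceededError(used, ceiling);
  }
}
```

Because the check runs before a delegation starts, a single child session can still push the trace past the ceiling; the ceiling blocks the next delegation rather than interrupting a running one. That ordering follows the delegation sequence described earlier and is otherwise an assumption about the real tracker.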
Circuit Breakers
Two circuit breakers prevent runaway delegation:
When either limit is reached, CircuitBreakerError is thrown with the reason (max_depth or max_concurrent) and the current value.
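A minimal sketch of the two checks, assuming the limits are passed in as plain numbers. BreakerLimits, maxDepth, and checkBreakers are hypothetical names, and the limit values in the usage lines are illustrative rather than defaults taken from Maximus.

```typescript
type BreakerReason = "max_depth" | "max_concurrent";

// Stand-in for the real error class; `reason` and `value` mirror the documented fields.
class CircuitBreakerError extends Error {
  readonly reason: BreakerReason;
  readonly value: number;

  constructor(reason: BreakerReason, value: number) {
    super(`Circuit breaker tripped: ${reason} (current value ${value})`);
    this.name = "CircuitBreakerError";
    this.reason = reason;
    this.value = value;
  }
}

// Hypothetical configuration shape.
interface BreakerLimits {
  maxDepth: number;
  maxConcurrent: number;
}

// Throws when the current chain depth or the number of running tasks
// in the trace has reached its limit; otherwise returns normally.
function checkBreakers(depth: number, running: number, limits: BreakerLimits): void {
  if (depth >= limits.maxDepth) {
    throw new CircuitBreakerError("max_depth", depth);
  }
  if (running >= limits.maxConcurrent) {
    throw new CircuitBreakerError("max_concurrent", running);
  }
}

// Illustrative limits only.
const exampleLimits: BreakerLimits = { maxDepth: 5, maxConcurrent: 10 };
checkBreakers(2, 3, exampleLimits); // within limits, does not throw
```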
Agent Write Lock
A per-agent write lock (AgentLock) prevents concurrent sessions targeting the same agent. The lock is acquired before runAgent() and released in a finally block, ensuring cleanup even on failure.
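The acquire, run, release-in-finally pattern can be sketched as below. SketchAgentLock and withAgentLock are hypothetical; in particular, this sketch rejects a second session immediately, whereas the real AgentLock might queue or wait instead.

```typescript
// Hypothetical lock: tracks which agents currently have an active session.
class SketchAgentLock {
  private held = new Set<string>();

  // Assumption: a second acquire for the same agent fails fast rather than waiting.
  acquire(agentName: string): void {
    if (this.held.has(agentName)) {
      throw new Error(`Agent "${agentName}" already has an active session`);
    }
    this.held.add(agentName);
  }

  release(agentName: string): void {
    this.held.delete(agentName);
  }

  isLocked(agentName: string): boolean {
    return this.held.has(agentName);
  }
}

// Runs `run` while holding the lock; the finally block guarantees release
// whether `run` resolves or rejects.
async function withAgentLock<T>(
  lock: SketchAgentLock,
  agentName: string,
  run: () => Promise<T>,
): Promise<T> {
  lock.acquire(agentName);
  try {
    return await run();
  } finally {
    lock.release(agentName);
  }
}
```

A caller would wrap the session start in withAgentLock, so that a thrown error from the session cannot leave the agent permanently locked.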
Error Types
import {
  HierarchyViolationError,
  CircuitBreakerError,
  BudgetExceededError,
} from "@maximus/core";

try {
  await delegator.delegate(request);
} catch (error) {
  if (error instanceof HierarchyViolationError) {
    // fromAgent cannot delegate to toAgent
  } else if (error instanceof CircuitBreakerError) {
    // error.reason: "max_depth" | "max_concurrent"
    // error.value: current depth or concurrent count
  } else if (error instanceof BudgetExceededError) {
    // error.used: tokens used so far
    // error.ceiling: configured ceiling
  }
}
Observability
Trace IDs
A trace ID is generated at the root of a delegation chain and propagated to all child tasks and sessions. Every task and event within the chain shares the same traceId, enabling end-to-end tracing.
import { nanoid } from "nanoid";

const traceId = nanoid(); // Generated once at root

await delegator.delegate({
  fromAgent: "orchestrator",
  toAgent: "manager",
  prompt: "...",
  traceId,
});
// All child tasks and events carry this traceId
Event Interface
Every AgentEvent (defined in packages/shared/src/events.ts) carries traceId and parentSessionId fields:
interface AgentEvent {
  id: string;
  timestamp: number;
  sessionId: string;
  agentName: string;
  type: AgentEventType;
  payload: Record<string, unknown>;
  traceId?: string;
  parentSessionId?: string;
}
Task Lifecycle Events
Events emitted by the Delegator:
Agent Session Events
Structured Logging
All events flow through the EventBus (packages/core/src/events/bus.ts). The server uses pino for structured logging with trace context attached to log lines for correlation.
REST API Reference
The server exposes a REST API for querying tasks, agents, and system health. All endpoints return JSON.
Health Check
# GET /api/health
curl http://localhost:3000/api/health
Response:
{ "status": "ok", "timestamp": 1710936000000 }
List Tasks
# All tasks
curl http://localhost:3000/api/tasks

# Filter by trace ID
curl "http://localhost:3000/api/tasks?traceId=abc123"

# Filter by agent name
curl "http://localhost:3000/api/tasks?agentName=research-manager"

# Filter by status
curl "http://localhost:3000/api/tasks?status=completed"

# Combine filters
curl "http://localhost:3000/api/tasks?traceId=abc123&status=in-progress"
Response:
{
"tasks": [
{
"id": "task_abc",
"parentTaskId": null,
"agentName": "research-manager",
"status": "completed",
"prompt": "Analyze the quarterly report",
"result": "Key findings: ...",
"traceId": "abc123",
"tokenUsage": 1500,
"createdAt": 1710936000000,
"updatedAt": 1710936005000,
"completedAt": 1710936005000
}
]
}
Get Task by ID
# GET /api/tasks/:id
curl http://localhost:3000/api/tasks/task_abc
Response:
{
"task": {
"id": "task_abc",
"agentName": "research-manager",
"status": "completed",
"prompt": "Analyze the quarterly report",
"result": "Key findings: ...",
"traceId": "abc123",
"tokenUsage": 1500,
"createdAt": 1710936000000,
"updatedAt": 1710936005000,
"completedAt": 1710936005000
}
}
Returns 404 if the task does not exist.
List Agents
# GET /api/agents
curl http://localhost:3000/api/agents
Response:
{
"agents": [
{
"name": "orchestrator",
"description": "Top-level coordinator",
"model": "opus",
"skills": []
},
{
"name": "research-manager",
"description": "Manages research tasks",
"model": "sonnet",
"reportsTo": "orchestrator",
"skills": ["github-operations"]
}
]
}
Get Org Chart
# GET /api/agents/org-chart
curl http://localhost:3000/api/agents/org-chart
Response:
{
"agents": [
{ "name": "orchestrator", "description": "Top-level coordinator" },
{ "name": "research-manager", "reportsTo": "orchestrator", "description": "Manages research tasks" },
{ "name": "research-worker", "reportsTo": "research-manager", "description": "Executes research tasks" }
]
}
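Since the org-chart response is a flat list, a client that wants a tree has to link children to parents itself. The sketch below shows one way to do that; OrgChartAgent, OrgNode, and buildOrgTree are hypothetical client-side names, not part of the Maximus API.

```typescript
// Shape of one entry in the org-chart response shown above.
interface OrgChartAgent {
  name: string;
  reportsTo?: string;
  description?: string;
}

// Hypothetical tree node for client-side display.
interface OrgNode {
  name: string;
  children: OrgNode[];
}

// Links each agent under its reportsTo parent and returns the roots.
// An agent whose parent is missing from the list is treated as a root.
function buildOrgTree(agents: OrgChartAgent[]): OrgNode[] {
  const nodes = new Map<string, OrgNode>();
  for (const agent of agents) {
    nodes.set(agent.name, { name: agent.name, children: [] });
  }

  const roots: OrgNode[] = [];
  for (const agent of agents) {
    const node = nodes.get(agent.name)!;
    const parentNode = agent.reportsTo ? nodes.get(agent.reportsTo) : undefined;
    if (parentNode) {
      parentNode.children.push(node);
    } else {
      roots.push(node);
    }
  }
  return roots;
}

const orgTree = buildOrgTree([
  { name: "orchestrator" },
  { name: "research-manager", reportsTo: "orchestrator" },
  { name: "research-worker", reportsTo: "research-manager" },
]);
console.log(JSON.stringify(orgTree, null, 2));
```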
WebSocket Event Streaming
The server provides real-time event streaming over WebSocket. All EventBus events are broadcast to connected clients via the EventBridge (packages/server/src/ws/bridge.ts). The WebSocket endpoint is at /ws on the same port as the HTTP server (single-port architecture using noServer WebSocket upgrade).
Connecting
# Using wscat
wscat -c ws://localhost:3000/ws
Frame Format
All messages are JSON frames with the WebSocketFrame structure (packages/server/src/ws/frames.ts):
interface WebSocketFrame {
  type: "event" | "connected" | "error";
  event?: string; // Event type (for "event" frames)
  payload: Record<string, unknown>;
  seq: number; // Sequential frame number
}
Welcome Frame
On connection, clients receive a welcome frame:
{
"type": "connected",
"payload": { "message": "Connected to Maximus event stream" },
"seq": 0
}
Event Frames
Task and agent events are delivered as frames with sequential numbering:
{
"type": "event",
"event": "task:created",
"payload": {
"id": "evt_abc",
"timestamp": 1710936000000,
"sessionId": "",
"agentName": "research-manager",
"type": "task:created",
"payload": { "taskId": "task_abc", "parentTaskId": null },
"traceId": "abc123"
},
"seq": 1
}
Sequential Numbering
The seq field increments globally across all frames. Clients can detect dropped frames by checking for gaps in the sequence. A gap indicates frames were skipped (due to backpressure or disconnection).
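A client-side gap check might look like the sketch below. GapDetector is a hypothetical helper; it treats the first frame it sees as the baseline, on the assumption that a client connecting mid-stream may not start at seq 0.

```typescript
// Only the seq field of a frame matters for gap detection.
interface SeqFrame {
  seq: number;
}

// Hypothetical helper: reports how many frames were missed before each one.
class GapDetector {
  private lastSeq: number | null = null;

  // Returns 0 for the first frame and for contiguous frames,
  // otherwise the count of sequence numbers that were skipped.
  observe(frame: SeqFrame): number {
    const missed =
      this.lastSeq === null ? 0 : Math.max(0, frame.seq - this.lastSeq - 1);
    this.lastSeq = frame.seq;
    return missed;
  }
}

const gapDetector = new GapDetector();
for (const seq of [0, 1, 2, 5]) {
  const missed = gapDetector.observe({ seq });
  if (missed > 0) {
    console.log(`Missed ${missed} frame(s) before seq ${seq}`);
  }
}
```

When a gap is reported, the client can re-query the REST API for the affected trace rather than trying to reconstruct the missing events.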
Backpressure Handling
If a client's send buffer exceeds 64KB (BACKPRESSURE_THRESHOLD), frames are skipped for that client to prevent memory buildup. Slow clients may miss events — use the REST API to query tasks for the authoritative state.
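The skip-on-backpressure rule can be sketched as follows. SocketLike and broadcast are hypothetical; SocketLike models only the two socket members the sketch needs (a bufferedAmount byte count and a send method), and the actual EventBridge implementation may differ.

```typescript
// 64KB threshold named in the text above.
const BACKPRESSURE_THRESHOLD = 64 * 1024;

// Minimal stand-in for a WebSocket connection.
interface SocketLike {
  bufferedAmount: number; // bytes queued but not yet sent
  send(data: string): void;
}

// Sends the frame to every client whose buffer is at or under the threshold.
// Returns how many clients were skipped for this frame.
function broadcast(clients: Iterable<SocketLike>, frame: object): number {
  const data = JSON.stringify(frame);
  let skipped = 0;
  for (const client of clients) {
    if (client.bufferedAmount > BACKPRESSURE_THRESHOLD) {
      skipped++; // slow client: drop this frame instead of buffering more
      continue;
    }
    client.send(data);
  }
  return skipped;
}
```

Dropping frames per client keeps one slow consumer from growing server memory, at the cost of that consumer seeing gaps in seq.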
Client-Side Filtering
The server broadcasts all events to all connected clients. Filter on the client side by inspecting the payload fields:
const ws = new WebSocket("ws://localhost:3000/ws");

ws.onmessage = (event) => {
  const frame = JSON.parse(event.data);
  if (frame.type !== "event") return;

  // Filter by trace ID
  if (frame.payload.traceId === "abc123") {
    console.log(`[${frame.event}]`, frame.payload);
  }

  // Filter by agent name
  if (frame.payload.agentName === "research-manager") {
    console.log(`[${frame.event}]`, frame.payload);
  }
};
Quick Start Example
A complete example: define 3 agents, start the server, delegate work, and observe via API and WebSocket.
1. Define Agents
Create the agent files shown in Hierarchy Setup above:
orchestrator: Root coordinator
research-manager: Manages research workers, reportsTo: orchestrator
research-worker: Executes tasks, reportsTo: research-manager
2. Start the Server
import { AgentEngine } from "@maximus/core";
import { createApp } from "@maximus/server";

const engine = new AgentEngine({
  agentsDir: "./agents",
  skillsDir: "./skills",
});
await engine.initialize();

const { server } = createApp(engine);
server.listen(3000, () => {
  console.log("Maximus running on http://localhost:3000");
});
3. Delegate Work
import { nanoid } from "nanoid";

const delegator = engine.getDelegator();
const traceId = nanoid();

const result = await delegator.delegate({
  fromAgent: "orchestrator",
  toAgent: "research-manager",
  prompt: "Research the top 3 competitors and summarize their strengths",
  traceId,
});

console.log("Result:", result.output);
4. Query via REST API
# Check all tasks in the delegation chain
curl "http://localhost:3000/api/tasks?traceId=${TRACE_ID}"

# View the org chart
curl http://localhost:3000/api/agents/org-chart

# Check system health
curl http://localhost:3000/api/health
5. Observe via WebSocket
# Stream events in real-time
wscat -c ws://localhost:3000/ws
You will see frames for task:created, task:assigned, session:start, agent:message, task:completed, and more — all carrying the traceId for correlation.