Multi-Agent Coordination

Hierarchical agent coordination where work flows down the org chart and results flow up, with full observability via REST API and WebSocket event streaming.

Maximus supports hierarchical agent coordination. An orchestrator delegates to managers, managers delegate to workers, and every action is traceable in real-time. This document covers hierarchy setup, delegation patterns, task lifecycle, safety mechanisms, observability, and the full API reference.

Hierarchy Setup reportsTo

Agent hierarchy is defined using the reportsTo field in each agent's Markdown frontmatter. An agent without reportsTo is a root (typically the orchestrator). Agents with reportsTo are children that can only receive delegated work from their parent.

Delegation is code-enforced, not agent-decided — the runtime validates hierarchy before spawning any child work. An agent cannot self-route or delegate to arbitrary agents.

Example Agent Definitions

Three agent definition files establishing an orchestrator → manager → worker chain:

agents/orchestrator.md

Root coordinator — no reportsTo

---
name: orchestrator
description: Top-level coordinator that breaks work into streams
model: opus
maxTurns: 50
---

You coordinate complex projects by breaking them into work streams
and delegating to specialized managers. You synthesize results from
managers into cohesive deliverables.

agents/research-manager.md

Manager — reportsTo: orchestrator

---
name: research-manager
description: Manages research tasks and coordinates research workers
model: sonnet
maxTurns: 30
reportsTo: orchestrator
skills:
  - github-operations
---

You manage research workflows. When given a research objective,
break it into focused tasks and delegate to research workers.
Aggregate findings into structured reports.

agents/research-worker.md

Worker — reportsTo: research-manager

---
name: research-worker
description: Executes focused research tasks
model: haiku
maxTurns: 20
reportsTo: research-manager
skills:
  - github-operations
---

You execute focused research tasks. Gather information, analyze it,
and return structured findings to your manager.

Resulting Hierarchy

orchestrator          (root -- no reportsTo)
  |
  +-- research-manager  (reportsTo: orchestrator)
        |
        +-- research-worker  (reportsTo: research-manager)

The AgentRegistry.canDelegateTo(from, to) method validates that the target agent's reportsTo matches the delegating agent's name. See packages/core/src/agents/registry.ts.

Delegation Patterns Primitives

Maximus provides two coordination primitives:

Primitive	Direction	Validation
Delegate	Parent to child (hierarchical)	Target's `reportsTo` must match sender
Message	Peer to peer (same level)	Both agents must share the same `reportsTo`

How Delegation Works

The Delegator class (packages/core/src/delegation/delegator.ts) executes this sequence:

Validate hierarchy

Confirms registry.canDelegateTo(from, to) returns true

Check circuit breakers

Ensures chain depth and concurrent task limits are not exceeded

Check token budget

If budgetCeiling is set, verifies the trace has not exceeded it

Create task

Creates a Task record in the TaskStore with status created

Transition to assigned

Task status moves to assigned

Acquire agent lock

Prevents concurrent sessions on the same agent

Transition to in-progress

Task status moves to in-progress, agent session starts

Run child session

Calls engine.runAgent() which starts a Claude SDK session

Record usage and complete

On success, records token usage and transitions task to completed

Handle failure

On error, transitions task to failed, propagates error to parent

Fan-Out Parallel

A manager can delegate to multiple workers in parallel. Each delegation creates its own task, acquires its own lock, and runs independently. The maxConcurrent circuit breaker limits how many can run simultaneously within a trace.

const results = await Promise.all([
  delegator.delegate({
    fromAgent: "research-manager",
    toAgent: "worker-1",
    prompt: "Research topic A",
    traceId,
  }),
  delegator.delegate({
    fromAgent: "research-manager",
    toAgent: "worker-2",
    prompt: "Research topic B",
    traceId,
  }),
]);

Context Passing

Context is passed as a structured message (prompt + relevant prior output), not raw conversation history. The parent agent decides what context is relevant:

const result = await delegator.delegate({
  fromAgent: "orchestrator",
  toAgent: "research-manager",
  prompt: `Analyze the quarterly report. Here is the summary from finance:

  ${financeResult.output}`,
  traceId,
});

Results Flow Back

The parent receives the child's SessionResult.output. The parent can then act on it, delegate further, or return it up the chain.

Error Handling

On failure, the task is marked failed with the error message. The error propagates to the parent agent who can decide to retry, escalate, or abort.

HierarchyViolationError

Delegation target does not report to sender

CircuitBreakerError

Chain depth or concurrent task limit exceeded

BudgetExceededError

Token budget ceiling reached

Task Lifecycle States

Every delegation creates a first-class Task entity that tracks the full lifecycle of that unit of work.

State Machine

              +----------+
              |  created |
              +----+-----+
                   |
              +----v-----+
              | assigned  |
              +----+-----+
                   |
           +------v-------+
           | in-progress   |
           +---+-------+---+
               |       |
        +------v--+ +--v------+
        |completed| |  failed |
        +---------+ +---------+

Transitions are strictly enforced. The only valid paths are:

Forward

created → assigned → in-progress

Terminal

in-progress → completed or failed

No skipping states. See packages/core/src/tasks/lifecycle.ts for the VALID_TRANSITIONS map.

Task Fields

Each task tracks these fields (defined in packages/shared/src/tasks.ts):

Field	Type	Description
`id`	string	Unique task ID (nanoid)
`parentTaskId`	string?	ID of the parent task (for delegation chains)
`agentName`	string	Name of the agent assigned to this task
`status`	TaskStatus	Current lifecycle state
`prompt`	string	The work instruction
`result`	string?	Output on completion
`error`	string?	Error message on failure
`traceId`	string	Trace ID linking all tasks in a delegation chain
`tokenUsage`	number	Token cost recorded for this task
`createdAt`	number	Timestamp of creation
`updatedAt`	number	Timestamp of last update
`completedAt`	number?	Timestamp of completion or failure

TaskStore API

The TaskStore class (packages/core/src/tasks/store.ts) provides:

Method	Description
`create(params)`	Create a new task with status `created`
`get(id)`	Get a task by ID (throws if not found)
`transition(id, status, update?)`	Transition task to new status with optional field updates
`getByTraceId(traceId)`	Get all tasks in a delegation chain
`getChainDepth(traceId)`	Compute max delegation depth by walking `parentTaskId` links
`getActiveConcurrentCount(traceId)`	Count tasks with `in-progress` or `assigned` status
`getAll()`	Get all tasks

Tasks are stored in-memory for v1. They are queryable via the REST API while the server is running.

Safety Budgets & Breakers

Token Budgets

Token budgets are configurable per delegation chain via the budgetCeiling field on DelegationRequest. The BudgetTracker (packages/core/src/tasks/budget.ts) accumulates usage per traceId and blocks delegation when the ceiling is reached.

await delegator.delegate({
  fromAgent: "orchestrator",
  toAgent: "research-manager",
  prompt: "...",
  traceId,
  budgetCeiling: 100000, // Max tokens for this entire chain
});

If the chain's accumulated usage reaches or exceeds the ceiling, BudgetExceededError is thrown.

Circuit Breakers

Two circuit breakers prevent runaway delegation:

Breaker	Default	Description
`maxDepth`	5	Maximum delegation chain depth (orchestrator → manager → worker = depth 2)
`maxConcurrent`	10	Maximum concurrent active tasks within a trace

When either limit is reached, CircuitBreakerError is thrown with the reason (max_depth or max_concurrent) and the current value.

Agent Write Lock

A per-agent write lock (AgentLock) prevents concurrent sessions targeting the same agent. The lock is acquired before runAgent() and released in a finally block, ensuring cleanup even on failure.

Error Types

import {
  HierarchyViolationError,
  CircuitBreakerError,
  BudgetExceededError,
} from "@maximus/core";

try {
  await delegator.delegate(request);
} catch (error) {
  if (error instanceof HierarchyViolationError) {
    // fromAgent cannot delegate to toAgent
  } else if (error instanceof CircuitBreakerError) {
    // error.reason: "max_depth" | "max_concurrent"
    // error.value: current depth or concurrent count
  } else if (error instanceof BudgetExceededError) {
    // error.used: tokens used so far
    // error.ceiling: configured ceiling
  }
}

Observability Tracing

Trace IDs

A trace ID is generated at the root of a delegation chain and propagated to all child tasks and sessions. Every task and event within the chain shares the same traceId, enabling end-to-end tracing.

const traceId = nanoid(); // Generated once at root
await delegator.delegate({
  fromAgent: "orchestrator",
  toAgent: "manager",
  prompt: "...",
  traceId,
});
// All child tasks and events carry this traceId

Event Interface

Every AgentEvent (defined in packages/shared/src/events.ts) carries traceId and parentSessionId fields:

interface AgentEvent {
  id: string;
  timestamp: number;
  sessionId: string;
  agentName: string;
  type: AgentEventType;
  payload: Record<string, unknown>;
  traceId?: string;
  parentSessionId?: string;
}

Task Lifecycle Events

Events emitted by the Delegator:

Event	When
`task:created`	Task created in store
`task:assigned`	Task assigned to agent
`task:completed`	Agent finished successfully
`task:failed`	Agent encountered an error

Agent Session Events

Event	When
`session:start`	Agent session begins
`session:end`	Agent session ends
`agent:message`	Agent produces a text message
`agent:tool_call`	Agent calls a tool
`agent:tool_result`	Tool returns a result
`agent:delegation`	Agent delegates to a child
`agent:completion`	Agent completes its turn
`agent:error`	Agent encounters an error

Structured Logging

All events flow through the EventBus (packages/core/src/events/bus.ts). The server uses pino for structured logging with trace context attached to log lines for correlation.

REST API Reference

The server exposes a REST API for querying tasks, agents, and system health. All endpoints return JSON.

Health Check

# GET /api/health
curl http://localhost:3000/api/health

Response:

{ "status": "ok", "timestamp": 1710936000000 }

List Tasks

# All tasks
curl http://localhost:3000/api/tasks

# Filter by trace ID
curl "http://localhost:3000/api/tasks?traceId=abc123"

# Filter by agent name
curl "http://localhost:3000/api/tasks?agentName=research-manager"

# Filter by status
curl "http://localhost:3000/api/tasks?status=completed"

# Combine filters
curl "http://localhost:3000/api/tasks?traceId=abc123&status=in-progress"

Response:

{
  "tasks": [
    {
      "id": "task_abc",
      "parentTaskId": null,
      "agentName": "research-manager",
      "status": "completed",
      "prompt": "Analyze the quarterly report",
      "result": "Key findings: ...",
      "traceId": "abc123",
      "tokenUsage": 1500,
      "createdAt": 1710936000000,
      "updatedAt": 1710936005000,
      "completedAt": 1710936005000
    }
  ]
}

Get Task by ID

# GET /api/tasks/:id
curl http://localhost:3000/api/tasks/task_abc

Response:

{
  "task": {
    "id": "task_abc",
    "agentName": "research-manager",
    "status": "completed",
    "prompt": "Analyze the quarterly report",
    "result": "Key findings: ...",
    "traceId": "abc123",
    "tokenUsage": 1500,
    "createdAt": 1710936000000,
    "updatedAt": 1710936005000,
    "completedAt": 1710936005000
  }
}

Returns 404 if the task does not exist.

List Agents

# GET /api/agents
curl http://localhost:3000/api/agents

Response:

{
  "agents": [
    {
      "name": "orchestrator",
      "description": "Top-level coordinator",
      "model": "opus",
      "skills": []
    },
    {
      "name": "research-manager",
      "description": "Manages research tasks",
      "model": "sonnet",
      "reportsTo": "orchestrator",
      "skills": ["github-operations"]
    }
  ]
}

Get Org Chart

# GET /api/agents/org-chart
curl http://localhost:3000/api/agents/org-chart

Response:

{
  "agents": [
    { "name": "orchestrator", "description": "Top-level coordinator" },
    { "name": "research-manager", "reportsTo": "orchestrator", "description": "Manages research tasks" },
    { "name": "research-worker", "reportsTo": "research-manager", "description": "Executes research tasks" }
  ]
}

WebSocket Event Streaming Real-time

The server provides real-time event streaming over WebSocket. All EventBus events are broadcast to connected clients via the EventBridge (packages/server/src/ws/bridge.ts). The WebSocket endpoint is at /ws on the same port as the HTTP server (single-port architecture using noServer WebSocket upgrade).

Connecting

# Using wscat
wscat -c ws://localhost:3000/ws

Frame Format

All messages are JSON frames with the WebSocketFrame structure (packages/server/src/ws/frames.ts):

interface WebSocketFrame {
  type: "event" | "connected" | "error";
  event?: string;       // Event type (for "event" frames)
  payload: Record<string, unknown>;
  seq: number;          // Sequential frame number
}

Welcome Frame

On connection, clients receive a welcome frame:

{
  "type": "connected",
  "payload": { "message": "Connected to Maximus event stream" },
  "seq": 0
}

Event Frames

Task and agent events are delivered as frames with sequential numbering:

{
  "type": "event",
  "event": "task:created",
  "payload": {
    "id": "evt_abc",
    "timestamp": 1710936000000,
    "sessionId": "",
    "agentName": "research-manager",
    "type": "task:created",
    "payload": { "taskId": "task_abc", "parentTaskId": null },
    "traceId": "abc123"
  },
  "seq": 1
}

Sequential Numbering

The seq field increments globally across all frames. Clients can detect dropped frames by checking for gaps in the sequence. A gap indicates frames were skipped (due to backpressure or disconnection).

Backpressure Handling

If a client's send buffer exceeds 64KB (BACKPRESSURE_THRESHOLD), frames are skipped for that client to prevent memory buildup. Slow clients may miss events — use the REST API to query tasks for the authoritative state.

Client-Side Filtering

The server broadcasts all events to all connected clients. Filter on the client side by inspecting the payload fields:

const ws = new WebSocket("ws://localhost:3000/ws");

ws.onmessage = (event) => {
  const frame = JSON.parse(event.data);
  if (frame.type !== "event") return;

  // Filter by trace ID
  if (frame.payload.traceId === "abc123") {
    console.log(`[${frame.event}]`, frame.payload);
  }

  // Filter by agent name
  if (frame.payload.agentName === "research-manager") {
    console.log(`[${frame.event}]`, frame.payload);
  }
};

Quick Start Example End-to-End

A complete example: define 3 agents, start the server, delegate work, and observe via API and WebSocket.

1. Define Agents

Create the agent files shown in Hierarchy Setup above:

agents/orchestrator.md

Root coordinator

agents/research-manager.md

Manages research workers, reportsTo: orchestrator

agents/research-worker.md

Executes tasks, reportsTo: research-manager

2. Start the Server

import { AgentEngine } from "@maximus/core";
import { createApp } from "@maximus/server";

const engine = new AgentEngine({
  agentsDir: "./agents",
  skillsDir: "./skills",
});
await engine.initialize();

const { server } = createApp(engine);
server.listen(3000, () => {
  console.log("Maximus running on http://localhost:3000");
});

3. Delegate Work

import { nanoid } from "nanoid";

const delegator = engine.getDelegator();
const traceId = nanoid();

const result = await delegator.delegate({
  fromAgent: "orchestrator",
  toAgent: "research-manager",
  prompt: "Research the top 3 competitors and summarize their strengths",
  traceId,
});

console.log("Result:", result.output);

4. Query via REST API

# Check all tasks in the delegation chain
curl "http://localhost:3000/api/tasks?traceId=${TRACE_ID}"

# View the org chart
curl http://localhost:3000/api/agents/org-chart

# Check system health
curl http://localhost:3000/api/health

5. Observe via WebSocket

# Stream events in real-time
wscat -c ws://localhost:3000/ws

You will see frames for task:created, task:assigned, session:start, agent:message, task:completed, and more — all carrying the traceId for correlation.

Multi-Agent Coordination

Hierarchy Setup reportsTo

Example Agent Definitions

Resulting Hierarchy

Delegation Patterns Primitives

How Delegation Works

Validate hierarchy

Check circuit breakers

Check token budget

Create task

Transition to assigned

Acquire agent lock

Transition to in-progress

Run child session

Record usage and complete

Handle failure

Fan-Out Parallel

Context Passing

Results Flow Back

Error Handling

Task Lifecycle States

State Machine

Task Fields

TaskStore API

Safety Budgets & Breakers

Token Budgets

Circuit Breakers

Agent Write Lock

Error Types

Observability Tracing

Trace IDs

Event Interface

Task Lifecycle Events

Agent Session Events

Structured Logging

REST API Reference

Health Check

List Tasks

Get Task by ID

List Agents

Get Org Chart

WebSocket Event Streaming Real-time

Connecting

Frame Format

Welcome Frame

Event Frames

Sequential Numbering

Backpressure Handling

Client-Side Filtering

Quick Start Example End-to-End

1. Define Agents

2. Start the Server

3. Delegate Work

4. Query via REST API

5. Observe via WebSocket

Related Documentation