Stop Reaching for Multi-Agent — Start With Stateful Sessions
Most teams jump to multi-agent orchestration before they've exhausted what a single agent with proper state management can do

Most teams building AI agents hit the same inflection point. The first prototype works — a single agent, a single prompt, a single conversation. Then requirements pile up: "Can we add a research step?" "Can it hand off to a specialist?" "Can two agents collaborate?"
The instinct is to reach for multi-agent orchestration. Add another agent. Wire up communication. Introduce a supervisor. Before long, you're debugging race conditions in a system where a single stateful session would have been enough.
The pattern we see again and again: teams jump to multi-agent before they've exhausted what a single agent with proper state management can do. And they pay for it in complexity they didn't need to take on.
The complexity cliff is real
A single agent with access to tools, persistent state, and session history is already capable of remarkable work. It can look up user information, update records, maintain context across dozens of turns, and adapt its behavior based on accumulated state.
The moment you introduce a second agent, you're no longer building a chatbot — you're building a distributed system. You inherit coordination overhead, shared-state problems, message-passing semantics, and failure modes that didn't exist before. Every message between agents is a potential point of failure, data loss, or inconsistency.
That's not an argument against multi-agent systems. They solve real problems. But they should be the answer to "I've genuinely exhausted what a single agent can do," not "I think my problem sounds like it needs two agents."
What stateful sessions actually give you
The core insight is that a well-designed session — one with proper state management — collapses a surprising number of "multi-agent" use cases back into single-agent problems.
Consider what a session can track:
- Conversation history: the full message thread, persisted across interactions
- Resources: persistent state that the agent reads and writes, synced to your application in real time
- Variables: internal state for intermediate results that don't need to reach the client
- Inputs: immutable context set at session creation (user identity, configuration, policies)
Here's a concrete example. In Octavus, a session stores all of these as first-class concepts:
input:
  COMPANY_NAME:
    type: string
    description: The company name for support context
  USER_ID:
    type: string
    description: Current user's ID
    optional: true
resources:
  CONVERSATION_SUMMARY:
    type: string
    description: Running summary of the conversation
    default: ''
  USER_CONTEXT:
    type: unknown
    description: Cached user information
    default: {}
variables:
  CHAT_TRANSCRIPT:
    type: string
    description: Serialized conversation for processing
The input is set once at session creation — immutable context like user identity and configuration. Resources persist across every interaction and sync to the client, so your frontend can react to state changes (a running conversation summary, for example). Variables are internal — intermediate results the agent needs but the user doesn't see.
This isn't a key-value store bolted onto a prompt. It's a structured state model with clear scoping rules:
| Type | Scope | Persistence | Synced to Client |
| --- | --- | --- | --- |
| Input | Session | Immutable | At creation |
| Resources | Session | Across triggers | Yes, via callbacks |
| Variables | Session | Across triggers | No, internal only |
When a resource updates, your frontend receives a resource-update event in the stream. No polling. No separate state sync layer. The agent writes to CONVERSATION_SUMMARY, and your UI renders it.
useOctavusChat({
  onResourceUpdate: (name, value) => {
    if (name === 'CONVERSATION_SUMMARY') {
      setSummary(value);
    }
  },
});
This is the kind of state management that, without first-class session support, drives teams toward multi-agent setups. "We need one agent to summarize and another to respond" is often better solved as "we need one agent that writes a summary to a resource after each turn."
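To make the scoping rules concrete, here is a minimal TypeScript sketch of the three scopes as an in-memory model. It is not the Octavus SDK: `SessionState`, `setResource`, `setVariable`, and the listener signature are hypothetical names chosen for illustration. It assumes only what the table states: inputs are fixed at creation, resource writes notify the client, and variable writes stay internal.

```typescript
// Hypothetical in-memory model of the three state scopes (not the Octavus SDK).
type ResourceListener = (name: string, value: unknown) => void;

class SessionState {
  // Inputs are frozen at creation and never change afterwards.
  readonly inputs: Readonly<Record<string, unknown>>;
  private resources = new Map<string, unknown>();
  private variables = new Map<string, unknown>();
  private onResourceUpdate: ResourceListener;

  constructor(
    inputs: Record<string, unknown>,
    onResourceUpdate: ResourceListener = () => {},
  ) {
    this.inputs = Object.freeze({ ...inputs });
    this.onResourceUpdate = onResourceUpdate;
  }

  // Resource writes are stored and pushed to the client listener.
  setResource(name: string, value: unknown): void {
    this.resources.set(name, value);
    this.onResourceUpdate(name, value);
  }

  getResource(name: string): unknown {
    return this.resources.get(name);
  }

  // Variable writes stay internal: no listener is called.
  setVariable(name: string, value: unknown): void {
    this.variables.set(name, value);
  }

  getVariable(name: string): unknown {
    return this.variables.get(name);
  }
}

// Usage: one agent keeps a running summary and a private transcript.
const clientEvents: Array<[string, unknown]> = [];
const session = new SessionState(
  { COMPANY_NAME: 'Acme Corp' },
  (name, value) => {
    clientEvents.push([name, value]);
  },
);
session.setResource('CONVERSATION_SUMMARY', 'User asked about billing.');
session.setVariable('CHAT_TRANSCRIPT', 'user: hi');
```

In this sketch only the summary write reaches `clientEvents`; the transcript never leaves the session. That separation is what lets one agent both summarize and respond without a second agent.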
Handlers replace coordinators
The second pattern that pulls teams toward multi-agent prematurely is sequential task execution. "First we need to research, then analyze, then respond." That sounds like a pipeline of agents.
But look at what a single agent's handler can do:
handlers:
  request-human:
    Serialize conversation:
      block: serialize-thread
      format: markdown
      output: CONVERSATION_TEXT
    Generate summary:
      block: next-message
      output: SUMMARY
    Create ticket:
      block: tool-call
      tool: create-support-ticket
      input:
        summary: SUMMARY
      output: TICKET
This handler serializes the conversation, generates a summary, and creates a support ticket — three distinct steps, executed in sequence, within one session. No inter-agent communication. No coordinator. No shared message bus.
Each block writes its output to a named variable, and later blocks read from those variables: the ticket step, for example, takes SUMMARY as its input. State flows through the handler like data through a pipeline. It's declarative, inspectable, and runs within a single session's state boundary.
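The execution model described here can be sketched in a few lines of TypeScript. This is an illustration under assumptions, not how Octavus runs handlers internally: `Block`, `runHandler`, and the stand-in `run` functions are hypothetical, and the blocks are synchronous to keep the example short, whereas real model and tool calls would be asynchronous.

```typescript
// Hypothetical sketch of sequential block execution over shared variables.
type Variables = Record<string, unknown>;

interface Block {
  name: string;
  output: string; // variable that receives this block's result
  run: (vars: Readonly<Variables>) => unknown;
}

// Runs blocks in order; each result is stored under the block's output name
// so that later blocks can read it.
function runHandler(blocks: Block[], initial: Variables = {}): Variables {
  const vars: Variables = { ...initial };
  for (const block of blocks) {
    vars[block.output] = block.run(vars);
  }
  return vars;
}

// Stand-ins for serialize-thread, next-message, and tool-call.
const requestHuman: Block[] = [
  {
    name: 'Serialize conversation',
    output: 'CONVERSATION_TEXT',
    run: (v) => (v['MESSAGES'] as string[]).join('\n'),
  },
  {
    name: 'Generate summary',
    output: 'SUMMARY',
    run: (v) =>
      `Summary of ${(v['CONVERSATION_TEXT'] as string).split('\n').length} messages`,
  },
  {
    name: 'Create ticket',
    output: 'TICKET',
    run: (v) => ({ id: 'T-1', summary: v['SUMMARY'] }),
  },
];

const result = runHandler(requestHuman, {
  MESSAGES: ['user: hi', 'agent: hello'],
});
```

Everything the three steps share lives in one `vars` object, so there is nothing to synchronize and a single place to inspect when a step misbehaves.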
When you actually need workers
Sometimes a task genuinely doesn't fit within a conversational agent's handler. Maybe it requires a different model. Maybe it needs its own tool set. Maybe it's a discrete unit of work that you want to reuse across multiple agents.
That's what workers are for — task-based agents that execute a sequence of steps and return an output value. They're not multi-agent orchestration. They're composable subtasks.
# A worker: research-assistant
input:
  TOPIC:
    type: string
    description: Topic to research
variables:
  RESEARCH_DATA:
    type: string
  ANALYSIS:
    type: string
steps:
  Start research:
    block: start-thread
    thread: research
    model: anthropic/claude-sonnet-4-5
    system: research-system
    tools: [web-search]
    maxSteps: 5
  Add research request:
    block: add-message
    thread: research
    role: user
    prompt: research-prompt
    input: [TOPIC]
  Generate research:
    block: next-message
    thread: research
    output: RESEARCH_DATA
  Start analysis:
    block: start-thread
    thread: analysis
    model: anthropic/claude-sonnet-4-5
    system: analysis-system
  Add analysis request:
    block: add-message
    thread: analysis
    role: user
    prompt: analysis-prompt
    input: [RESEARCH_DATA]
  Generate analysis:
    block: next-message
    thread: analysis
    output: ANALYSIS
output: ANALYSIS
The key difference between a worker and "another agent" is scope. A worker runs, returns a value, and is done. It doesn't maintain its own conversation. It doesn't coordinate with other agents. It's a function call with LLM steps inside.
An interactive agent can call workers deterministically from a handler:
handlers:
  user-message:
    Research topic:
      block: run-worker
      worker: research-assistant
      input:
        TOPIC: USER_MESSAGE
      output: RESEARCH_RESULT
Or the LLM can call workers as tools when it decides they're needed:
agent:
  model: anthropic/claude-sonnet-4-5
  system: system
  workers: [research-assistant, generate-title]
  agentic: true
This gives you decomposition without coordination overhead. Each worker gets its own threads, its own model configuration, even its own tool set — but the parent agent's session manages the lifecycle.
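The "function call with LLM steps inside" framing can be shown directly. The sketch below models a worker as a plain TypeScript function. `WorkerFn`, `research`, and `analyze` are hypothetical stand-ins for the model-backed steps, and the calls are synchronous only to keep the example short; none of this is the Octavus runtime.

```typescript
// Hypothetical: a worker modeled as a function from typed input to one output.
type WorkerFn<I, O> = (input: I) => O;

// Stand-ins for the model-backed research and analysis steps.
const research = (topic: string): string => `notes on ${topic}`;
const analyze = (data: string): string => `analysis of ${data}`;

const researchAssistant: WorkerFn<{ TOPIC: string }, string> = ({ TOPIC }) => {
  // Worker-local variables: created per call, discarded on return.
  const RESEARCH_DATA = research(TOPIC);
  const ANALYSIS = analyze(RESEARCH_DATA);
  return ANALYSIS;
};

// The parent session owns the lifecycle: it passes an input, receives a value,
// and stores that value in its own state.
const parentVars: Record<string, unknown> = { USER_MESSAGE: 'vector databases' };
parentVars['RESEARCH_RESULT'] = researchAssistant({
  TOPIC: parentVars['USER_MESSAGE'] as string,
});
```

Nothing in the worker outlives the call, and the parent never sees `RESEARCH_DATA`. There is no second conversation to keep consistent, which is the practical difference from a peer agent.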
Sessions survive disconnects
One advantage of treating sessions as the primary unit of state: they're designed to survive interruptions.
Sessions expire after a configurable period of inactivity (24 hours by default). When a user returns, you check the session status and either resume or restore:
const result = await client.agentSessions.getMessages(sessionId);
if (result.status === 'active') {
  // Session is live: resume where we left off
  return { sessionId: result.sessionId, messages: result.messages };
}
// Session expired: restore from stored messages
if (storedMessages.length > 0) {
  const restored = await client.agentSessions.restore(
    sessionId,
    storedMessages,
    { COMPANY_NAME: 'Acme Corp' },
  );
  if (restored.restored) {
    return { sessionId: restored.sessionId, messages: storedMessages };
  }
}
// Nothing to restore: start fresh
const newSessionId = await client.agentSessions.create('support-chat', {
  COMPANY_NAME: 'Acme Corp',
});
return { sessionId: newSessionId, messages: [] };
In a multi-agent system, restoring state after a disconnect means restoring state for every agent involved, plus whatever coordination state existed between them. In a single-agent session, it's one restore call.
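The branching in that flow is small enough to isolate as a pure function, which makes it easy to unit test without a live client. This is a sketch under assumptions: `chooseSessionAction` is a hypothetical helper, and `'expired'` is an assumed label for any non-active status, since the snippet above only shows the `'active'` value. It also leaves out the fallback to a fresh session when a restore call fails.

```typescript
// Hypothetical helper mirroring the resume / restore / create decision.
// 'expired' is an assumed name for any status other than 'active'.
type SessionStatus = 'active' | 'expired';
type SessionAction = 'resume' | 'restore' | 'create';

function chooseSessionAction(
  status: SessionStatus,
  storedMessageCount: number,
): SessionAction {
  // A live session is resumed as-is.
  if (status === 'active') return 'resume';
  // An expired session is rebuilt from stored messages when any exist;
  // otherwise a new session is created.
  return storedMessageCount > 0 ? 'restore' : 'create';
}
```

With a single session there are three outcomes to cover. With several cooperating agents, the same decision would have to be made, and kept consistent, for each of them.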
The progression that works
Based on what we've seen teams build successfully, the progression looks like this:
1. Start with a single agent and proper sessions. Define your inputs, resources, and variables. Use handlers with multiple blocks for sequential work. This handles more than you think.
2. Extract workers for discrete, reusable tasks. When a subtask needs its own model, tools, or system prompt — and especially when you want to reuse it across agents — pull it into a worker. Workers are composable units, not autonomous entities.
3. Introduce multi-agent only for genuinely autonomous coordination. When you need agents that maintain their own long-running state, make independent decisions, and communicate asynchronously — that's when multi-agent orchestration earns its complexity.
Most teams building production agents today are somewhere between step 1 and step 2. The trap is jumping to step 3 because the architecture sounds right, not because the problem demands it.
Simplicity is not a consolation prize
There's a perception that single-agent systems are somehow less sophisticated. That "real" agent work requires multiple agents talking to each other.
But the engineering that matters most — session lifecycle, state persistence, tool execution, streaming, error handling, restore mechanics — lives entirely within the single-agent boundary. Getting those primitives right is the hard work. Multi-agent adds a coordination layer on top of that work. It doesn't replace it.
If your single-agent sessions aren't solid, your multi-agent system won't be either. Every weakness in your state management gets multiplied by the number of agents involved.
Start with sessions. Get the state model right. Layer complexity only when the problem forces your hand.




