Headless Agents: When Your Agent Doesn't Need a Chat Interface
Workers, pipelines, and the shift from conversational AI to backend agent execution

The most interesting agents won't have a chat interface. They'll run in the background — processing data, executing multi-step workflows, calling tools, returning structured output — all without a user typing a single message.
The industry is starting to figure this out. After two years of shoehorning everything into a chat bubble, the realization is landing: the "assistant" paradigm only covers a fraction of what agents can do. The real value is in headless execution — agents as backend primitives, invoked by your code, not your users.
But running an agent without a conversation UI introduces a set of problems that most frameworks don't address. How do you define a multi-step execution flow declaratively? How do you mix different models at different stages? How do you pipe the output of one LLM call into the next? How do you get a structured return value instead of a stream of chat messages?
This post walks through how headless agents work in practice using Octavus's worker protocol — a format built specifically for task-based, UI-less agent execution.
Chat Agents vs. Worker Agents
Most agent frameworks model everything as a conversation. You send a message, the agent responds, maybe it calls some tools along the way, and the conversation continues. This is the interactive pattern — it assumes a human in the loop and a persistent back-and-forth.
Workers flip this model. A worker takes structured input, runs a defined sequence of steps, and returns an output value. There's no conversation to maintain, no session to keep alive across interactions. It's a function call, except the function body is an LLM pipeline.
Here's how the two compare:
| Aspect | Interactive Agent | Worker Agent |
| --- | --- | --- |
| Structure | triggers + handlers + agent | steps + output |
| LLM Config | Global agent: section | Per-thread via start-thread |
| Invocation | Fire a named trigger | Direct execution with input |
| Session | Persists across triggers (24h TTL) | Single execution |
| Result | Streaming chat | Streaming + output value |
The interactive format is right for chatbots, support agents, anything conversational. Workers are right for everything else: background jobs, data pipelines, content generation, classification tasks, research workflows, scheduled automations.
Anatomy of a Worker Protocol
Workers are defined in YAML, same as interactive agents. But the structure is simpler — no triggers, no handlers, just inputs, steps, and an output.
Here's a research worker that takes a topic, does web research, then produces a structured analysis:
input:
  TOPIC:
    type: string
    description: Topic to research
  DEPTH:
    type: string
    optional: true
    default: medium

variables:
  RESEARCH_DATA:
    type: string
  ANALYSIS:
    type: string

tools:
  web-search:
    description: Search the web
    parameters:
      query: { type: string }

steps:
  Start research:
    block: start-thread
    thread: research
    model: anthropic/claude-sonnet-4-5
    system: research-system
    input: [TOPIC, DEPTH]
    tools: [web-search]
    maxSteps: 5

  Add research request:
    block: add-message
    thread: research
    role: user
    prompt: research-prompt
    input: [TOPIC, DEPTH]

  Generate research:
    block: next-message
    thread: research
    output: RESEARCH_DATA

  Start analysis:
    block: start-thread
    thread: analysis
    model: anthropic/claude-sonnet-4-5
    system: analysis-system

  Add analysis request:
    block: add-message
    thread: analysis
    role: user
    prompt: analysis-prompt
    input: [RESEARCH_DATA]

  Generate analysis:
    block: next-message
    thread: analysis
    output: ANALYSIS

output: ANALYSIS
A few things to notice:
Multiple threads, independently configured. The research phase uses web-search with maxSteps: 5 (allowing agentic tool loops). The analysis phase uses a different system prompt and no tools. Both threads happen to use the same model here, but each start-thread block sets its own model, tools, and settings, so you're not locked into a single configuration for the whole execution.
Data flows through variables. RESEARCH_DATA captures the output of the research thread, then gets passed as input to the analysis thread. Variables are the connective tissue between steps.
The output field declares the return value. When this worker finishes, the caller gets back whatever's in ANALYSIS. It's not a chat message; it's a declared, typed value (a string here) that your code can use downstream.
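To make the data flow concrete, here is a minimal sketch of how a runner might interpret the three block types used above. It is not Octavus's implementation: the Step type, runSteps, and the ModelFn stub are hypothetical names, inputs and variables are assumed to share one scope, and prompt templating is reduced to appending variable values to the prompt name.

```typescript
// Hypothetical, simplified model of the three block types used in the protocol above.
type Message = { role: 'system' | 'user' | 'assistant'; content: string };
type ModelFn = (model: string, messages: Message[]) => Promise<string>;

type Step =
  | { block: 'start-thread'; thread: string; model: string; system: string }
  | { block: 'add-message'; thread: string; role: 'user'; prompt: string; input: string[] }
  | { block: 'next-message'; thread: string; output: string };

type Thread = { model: string; messages: Message[] };

async function runSteps(
  steps: Step[],
  input: Record<string, string>,
  outputVar: string,
  callModel: ModelFn,
): Promise<string> {
  // Assumption: worker inputs and declared variables live in a single scope.
  const vars: Record<string, string> = { ...input };
  const threads = new Map<string, Thread>();

  for (const step of steps) {
    if (step.block === 'start-thread') {
      // Each thread carries its own model and starts from its own system prompt.
      threads.set(step.thread, {
        model: step.model,
        messages: [{ role: 'system', content: step.system }],
      });
      continue;
    }

    const thread = threads.get(step.thread);
    if (!thread) throw new Error(`Thread not started: ${step.thread}`);

    if (step.block === 'add-message') {
      // Stand-in for prompt templating: append the named variables to the prompt name.
      const values = step.input.map((name) => `${name}=${vars[name] ?? ''}`).join('; ');
      thread.messages.push({ role: step.role, content: `${step.prompt}: ${values}` });
    } else {
      // next-message: call the model and capture the reply in the named variable.
      const reply = await callModel(thread.model, thread.messages);
      thread.messages.push({ role: 'assistant', content: reply });
      vars[step.output] = reply;
    }
  }

  const result = vars[outputVar];
  if (result === undefined) throw new Error(`Output variable never set: ${outputVar}`);
  return result;
}
```

The point of the sketch is the hand-off: the second thread never sees the first thread's messages, only the value that was written into RESEARCH_DATA.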
Running Workers from Code
On the server side, you have two ways to execute a worker: generate() for simple run-to-completion execution, and execute() for streaming.
The simple path:
import { OctavusClient } from '@octavus/server-sdk';

const client = new OctavusClient({
  baseUrl: 'https://octavus.ai',
  apiKey: process.env.OCTAVUS_API_KEY!,
});

const { output } = await client.workers.generate(
  'research-assistant-id',
  { TOPIC: 'AI safety', DEPTH: 'detailed' },
  {
    tools: {
      // searchWeb is your own search function, defined elsewhere in your codebase.
      'web-search': async ({ query }) => await searchWeb(query),
    },
  },
);

console.log('Result:', output);
generate() runs the worker to completion and hands you the output. Tool handlers execute on your infrastructure — the LLM decides when to call web-search, but your code does the actual searching.
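That division of labour can be pictured as a bounded loop: the model proposes a tool call, your handler runs it locally, and the result is fed back until the model produces text or the step budget runs out. The sketch below is illustrative and not the SDK's internals. ModelTurn, TurnFn, and runToolLoop are hypothetical names, and maxSteps is assumed here to mean "maximum model turns", which may not match Octavus's exact semantics.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// One model turn either requests a tool call or produces the final text.
type ModelTurn =
  | { kind: 'tool-call'; tool: string; args: Record<string, unknown> }
  | { kind: 'text'; text: string };

// Stand-in for the LLM: given the tool results so far, decide the next turn.
type TurnFn = (history: string[]) => Promise<ModelTurn>;

async function runToolLoop(
  nextTurn: TurnFn,
  tools: Record<string, ToolHandler>,
  maxSteps: number,
): Promise<string> {
  const history: string[] = [];

  for (let step = 0; step < maxSteps; step++) {
    const turn = await nextTurn(history);
    if (turn.kind === 'text') return turn.text;

    const handler = tools[turn.tool];
    if (!handler) throw new Error(`No handler registered for tool: ${turn.tool}`);

    // The handler runs in your process; only its result goes back to the model.
    const result = await handler(turn.args);
    history.push(`${turn.tool} -> ${JSON.stringify(result)}`);
  }

  // Assumption: exhausting the budget is treated as an error in this sketch.
  throw new Error(`No final answer after ${maxSteps} steps`);
}
```

Keeping the handler map on the caller's side is what lets credentials, rate limits, and network access stay inside your own infrastructure.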
When you need visibility into what's happening mid-execution, use execute():
const events = client.workers.execute(
  'research-assistant-id',
  { TOPIC: 'AI safety' },
  {
    tools: {
      'web-search': async ({ query }) => await searchWeb(query),
    },
  },
);

for await (const event of events) {
  switch (event.type) {
    case 'worker-start':
      console.log(`Started: ${event.workerSlug}`);
      break;
    case 'block-start':
      console.log(`Step: ${event.blockName}`);
      break;
    case 'text-delta':
      process.stdout.write(event.delta);
      break;
    case 'worker-result':
      console.log('Output:', event.output);
      break;
  }
}
The streaming API emits fine-grained events: step transitions, text deltas, tool calls, and the final output. This is useful for progress tracking, logging, or piping worker events to a client over SSE.
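Because the events arrive as an async iterable, forwarding them over SSE is mostly a formatting problem. The sketch below rests on assumptions: the WorkerEvent union lists only the four event types and fields that appear in the loop above (the real SDK emits more), and toSseFrames is a hypothetical helper, not an SDK function.

```typescript
// Only the event shapes used in the example above; the SDK's real union is larger.
type WorkerEvent =
  | { type: 'worker-start'; workerSlug: string }
  | { type: 'block-start'; blockName: string }
  | { type: 'text-delta'; delta: string }
  | { type: 'worker-result'; output: unknown };

// Convert worker events into Server-Sent Events frames:
// an "event:" line, a "data:" line, and a blank line to terminate the frame.
async function* toSseFrames(events: AsyncIterable<WorkerEvent>): AsyncGenerator<string> {
  for await (const event of events) {
    // JSON.stringify escapes newlines, so each payload fits on a single data: line.
    yield `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
  }
}
```

In an HTTP handler, each yielded frame would be written to a response served with Content-Type: text/event-stream, and a browser EventSource could then listen for the named events.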
Composing Workers into Larger Systems
Workers get more interesting when you compose them. An interactive agent can call workers as sub-tasks — either deterministically from a handler, or agentically where the LLM decides when to invoke them.
First, declare the worker in your interactive agent's protocol:
workers:
  research-assistant:
    description: Researching topic
    display: stream
    tools:
      search: web-search  # Map worker's "search" tool to parent's "web-search"
Then call it from a handler:
handlers:
  user-message:
    Run research:
      block: run-worker
      worker: research-assistant
      input:
        TOPIC: USER_MESSAGE
      output: RESEARCH_RESULT
Or let the LLM call it as a tool:
agent:
  model: anthropic/claude-sonnet-4-5
  system: system
  workers: [research-assistant]
  agentic: true
The tool mapping line, search: web-search, deserves attention. It assumes the worker protocol defines a tool called search (the research worker shown earlier names its tool web-search, so the left-hand side of your mapping has to match whatever your worker actually declares). The parent agent has a handler for web-search. The mapping connects them, so when the worker's LLM calls search, the parent's web-search handler executes. Tools stay on your infrastructure; workers compose cleanly without duplicating handler code.
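One way to picture the mapping is as a lookup performed before a worker's tool call is dispatched. This is a conceptual sketch only: resolveWorkerTool is a hypothetical function, and the fallback behaviour (unmapped names are looked up unchanged on the parent) is an assumption for illustration, not documented Octavus behaviour.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// Resolve a tool name used inside a worker to a handler registered on the parent.
function resolveWorkerTool(
  workerToolName: string,
  mapping: Record<string, string>, // worker tool name -> parent tool name
  parentHandlers: Record<string, ToolHandler>,
): ToolHandler {
  // Assumption: names without an explicit mapping are looked up unchanged.
  const parentName = mapping[workerToolName] ?? workerToolName;

  const handler = parentHandlers[parentName];
  if (!handler) {
    throw new Error(`Worker tool "${workerToolName}" has no parent handler "${parentName}"`);
  }
  return handler;
}
```

With a mapping of { search: 'web-search' }, a worker call to search resolves to the parent's web-search handler, so the handler is written once and reused by every worker that maps onto it.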
Where Headless Agents Make Sense
The pattern unlocks a category of work that doesn't fit the chat paradigm:
Scheduled jobs. A worker that runs nightly, pulls metrics from three APIs, synthesizes a summary, and posts it to Slack. No conversation needed — just input, process, output.
Pipeline stages. A content moderation worker that classifies user submissions, extracts metadata, and returns a structured verdict. Plug it into your existing pipeline as an async function call.
Agent-to-agent delegation. An interactive support agent that hands off research to a worker, gets back structured findings, and weaves them into the conversation. The user sees a seamless response; under the hood, two agents collaborated.
Background enrichment. A worker triggered by a webhook that takes a new customer record, enriches it with public data, scores it, and writes the result to your database.
In each case, the agent is a backend component — invoked by code, returning data, running without a UI. The declarative protocol means you can version it, review it in a PR, and validate it before deployment. The execution is still an LLM pipeline with tool calls and reasoning. You just stripped away the chat chrome.
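Treating a worker as a pipeline stage usually means wrapping the call in an ordinary typed function and validating what comes back before it moves downstream. The sketch below uses the moderation case as an example. Everything in it is hypothetical: RunWorker stands in for whatever client call you use, content-moderation and SUBMISSION are made-up identifiers, and the Verdict shape is invented for illustration.

```typescript
// Stand-in for the client call that runs a worker to completion.
type RunWorker = (workerId: string, input: Record<string, string>) => Promise<unknown>;

// Hypothetical verdict shape for the moderation example.
type Verdict = { label: 'allow' | 'review' | 'block'; reason: string };

// Narrow an untyped worker output to the shape the pipeline expects.
function parseVerdict(output: unknown): Verdict {
  // Accept either a JSON string or an already-parsed object.
  const value: unknown = typeof output === 'string' ? JSON.parse(output) : output;
  if (typeof value !== 'object' || value === null) {
    throw new Error('Verdict must be an object');
  }

  const { label, reason } = value as Record<string, unknown>;
  if (label !== 'allow' && label !== 'review' && label !== 'block') {
    throw new Error(`Unexpected label: ${String(label)}`);
  }
  if (typeof reason !== 'string') throw new Error('Verdict is missing a reason');

  return { label, reason };
}

// The pipeline stage: callers see a plain async function, not an agent.
async function moderate(runWorker: RunWorker, submission: string): Promise<Verdict> {
  const output = await runWorker('content-moderation', { SUBMISSION: submission });
  return parseVerdict(output);
}
```

Passing runWorker in as a parameter keeps the stage testable with a stub, and the validation step means a malformed model response fails loudly at the boundary instead of leaking into the rest of the pipeline.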
The Shift Away from Chat
The chat interface was a useful starting point for agent development. It gave everyone a familiar interaction model and made demos easy. But it also anchored our thinking — we started treating "agent" as a synonym for "chatbot with tools."
The more useful framing: an agent is a program where some of the logic is delegated to a language model. Sometimes that program needs a conversation loop. Often it doesn't. The worker pattern makes the "often it doesn't" case a first-class citizen — defined declaratively, executed programmatically, composed with other agents, and returning structured output your code can use.
Agents are moving into the backend. The interesting question isn't whether they'll have a UI. It's how we make them reliable, composable, and observable when they don't.




