Your Agent's Tools Are the Wrong Abstraction

Most teams building agents start with tools. A function to search the database, another to create a ticket, a third to call an external API. They write them inline, wire them up to the model, and move on.

Then the second agent shows up.

Suddenly that search-products function needs to work in a different context. The ticket creation tool needs slightly different validation. The database query tool from agent A doesn't have the same auth context as agent B. Copy-paste begins, then divergence, then confusion about which version is canonical.

This is a familiar pattern. It's the same one that played out with microservices, internal APIs, and shared libraries. And the fix is the same too: tools aren't agent code — they're organizational infrastructure. Treat them accordingly.

The Single-Agent Trap

When you're building one agent, tool design doesn't feel like it matters. You define a function, give it a description the model can read, and implement the logic right there in your handler. Maybe you're using a framework that encourages this — tools as closures, defined alongside your prompt.

This works fine. Until it doesn't.

The problems surface when you need to:

Share a tool across agents — Two agents need to look up user accounts, but one has admin context and the other doesn't.
Test tools independently — Your tool is tangled up with session state and model behavior, making unit tests painful.
Version tool behavior — The product search tool changed its return shape, and now three agents are broken.
Audit tool usage — Which agents can create support tickets? Who knows — it's scattered across codebases.

These aren't edge cases. They're the normal trajectory of any team that moves past a proof of concept.

Declarations vs. Implementations

There's a design principle that helps here: separate what a tool is from how it works.

This is the same split that made Terraform, Kubernetes, and CI/CD pipelines tractable. You declare the interface in one place, implement it in another. The declaration is portable. The implementation is contextual.

In Octavus, this separation is built into the protocol layer. Tools are declared in YAML — their name, description, parameters, and types:

tools:
  get-user-account:
    description: >
      Retrieves the user's account information including name, email,
      subscription plan, and account creation date.
    parameters:
      userId:
        type: string
        description: The user ID to look up

  create-support-ticket:
    description: Create a support ticket for the user
    parameters:
      summary:
        type: string
        description: Brief summary of the issue
      priority:
        type: string
        description: Ticket priority (low, medium, high, urgent)

The model sees this declaration. It decides when to call a tool based on the description and parameter schema. But the actual implementation lives in your backend — in your server code, with your auth, your database, your business logic:

const session = client.agentSessions.attach(sessionId, {
  tools: {
    'get-user-account': async (args) => {
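      const user = await db.users.findById(args.userId as string);
      // Assumes findById resolves to null/undefined for an unknown ID;
      // return a structured error the model can act on instead of throwing.
      if (!user) {
        return { error: 'User not found' };
      }
      return {
        name: user.name,
        email: user.email,
        plan: user.subscription.plan,
      };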
    },

    'create-support-ticket': async (args) => {
      const ticket = await ticketService.create({
        summary: args.summary as string,
        priority: args.priority as string,
        source: 'ai-agent',
      });
      return { ticketId: ticket.id };
    },
  },
});

This split matters for a few reasons. The declaration is versioned with the protocol — it's a YAML file in a git repo. The implementation can vary by deployment, by environment, by request context. Two agents can share the same tool declaration but wire up different implementations based on their auth boundaries.
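
To make that last point concrete, here is a minimal sketch of two handler sets built for the same get-user-account declaration. Nothing in it is part of the Octavus SDK: the AuthContext type, the UserStore interface, and the makeAccountHandlers factory are hypothetical names, and an in-memory store stands in for a real database.

```typescript
// Hypothetical types for illustration; not part of any SDK.
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => Promise<unknown>;

interface User {
  id: string;
  name: string;
  email: string;
  plan: string;
}

interface UserStore {
  findById(id: string): Promise<User | undefined>;
}

interface AuthContext {
  role: 'admin' | 'self-service';
  // ID of the authenticated user making the request.
  userId: string;
}

// One tool name, one factory; behavior depends on the caller's auth context.
function makeAccountHandlers(
  store: UserStore,
  auth: AuthContext,
): Record<string, ToolHandler> {
  return {
    'get-user-account': async (args) => {
      const requestedId = String(args.userId);
      // Non-admin callers may only read their own account.
      if (auth.role !== 'admin' && requestedId !== auth.userId) {
        return { error: 'forbidden' };
      }
      const user = await store.findById(requestedId);
      if (!user) {
        return { error: 'not-found' };
      }
      return { name: user.name, email: user.email, plan: user.plan };
    },
  };
}

// In-memory stand-in for a database, which also keeps the handler easy to unit test.
function inMemoryStore(users: User[]): UserStore {
  const byId = new Map<string, User>();
  for (const u of users) {
    byId.set(u.id, u);
  }
  return { findById: async (id) => byId.get(id) };
}
```

An admin-facing agent would attach the handlers built with role 'admin', and a customer-facing agent the 'self-service' variant; the YAML declaration stays the same in both cases. Because the handler's only dependencies are the injected store and auth context, it can be tested without a model or a session.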

Where Tools Actually Run

There's a related question that too many teams gloss over: where does the tool execute?

A growing number of platforms run tool code on the platform's own infrastructure. Your function gets serialized, shipped to a remote environment, and executed somewhere you don't control. This works for demos. It falls apart when you need to query your production database, hit an internal API behind a VPN, or comply with data residency requirements.

Tool execution belongs on your infrastructure. The data is there. The auth is there. The compliance boundaries are there.

In Octavus, server tools run in your backend process. When the model calls a tool, the platform pauses the stream and sends the tool request to your SDK. Your handler runs locally, and the result flows back to continue the conversation:

LLM decides to call tool
    → Platform pauses stream
    → Server SDK executes your handler locally
    → Results sent back to LLM
    → Stream continues

You get full control: your database connections, your request context, your error handling. And when a tool doesn't make sense on the server — browser geolocation, user confirmations, clipboard access — it gets forwarded to the client instead. No server handler registered? The SDK routes the call to the client automatically.

// Server: only handles what belongs on the server
const session = client.agentSessions.attach(sessionId, {
  tools: {
    'get-user-account': async (args) => {
      return await db.users.findById(args.userId as string);
    },
    // 'get-browser-location' — no handler here, forwarded to client
    // 'confirm-purchase' — no handler here, forwarded to client
  },
});

// Client: handles what belongs in the browser
const chat = useOctavusChat({
  transport,
  clientTools: {
    'get-browser-location': async () => {
      const pos = await new Promise<GeolocationPosition>((resolve, reject) => {
        navigator.geolocation.getCurrentPosition(resolve, reject);
      });
      return { lat: pos.coords.latitude, lng: pos.coords.longitude };
    },
    'confirm-purchase': 'interactive',
  },
});

Server tools for data. Client tools for the browser and user interaction. The routing is automatic — you just decide where each handler lives.
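
The routing rule is simple enough to sketch in a few lines. This is not the Octavus SDK's implementation, only an illustration of the decision described above; routeToolCall, ToolCall, and the forwardToClient callback are hypothetical names.

```typescript
// Hypothetical types for illustration; not part of any SDK.
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => Promise<unknown>;

interface ToolCall {
  name: string;
  args: ToolArgs;
}

interface RoutedResult {
  ranOn: 'server' | 'client';
  result: unknown;
}

// If a server handler is registered for the tool, run it locally;
// otherwise hand the call to the client and wait for its answer.
async function routeToolCall(
  call: ToolCall,
  serverTools: Record<string, ToolHandler>,
  forwardToClient: (call: ToolCall) => Promise<unknown>,
): Promise<RoutedResult> {
  // Own-property check so inherited keys like "toString" never count as handlers.
  const handler = Object.prototype.hasOwnProperty.call(serverTools, call.name)
    ? serverTools[call.name]
    : undefined;

  if (handler) {
    return { ranOn: 'server', result: await handler(call.args) };
  }
  return { ranOn: 'client', result: await forwardToClient(call) };
}
```

The useful property is that the set of registered server handlers is the single source of truth for where a call executes: removing a handler on the server moves that tool to the client without touching the declaration.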

Composability Is the Actual Goal

The conversation around tool standards right now — MCP, function calling schemas, universal tool servers — is pointed in the right direction. But standardizing the wire format is only half the problem. The other half is organizational: how do teams share, reuse, and govern tools across agents?

If every agent defines its own tools from scratch, you end up with the same function implemented twelve times with twelve slightly different interfaces. If every tool is a one-off closure inside a route handler, there's no way to audit what capabilities your agents actually have.

The pattern that works is the one that's worked in every other layer of software infrastructure:

Declare interfaces centrally — Tool definitions live in protocols, versioned and reviewable.
Implement behind the interface — Handlers live in your backend, with access to your real dependencies.
Compose, don't duplicate — Multiple agents reference the same tool declaration. Implementations vary by context, not by copy-paste.
Enforce boundaries — Server tools stay on the server. Client tools stay on the client. Auth context is per-request, not per-agent.

This isn't a novel idea. It's just the boring infrastructure pattern applied to a new domain. Declare the shape, implement the behavior, share the interface.
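
One payoff of central declarations is that the audit question from earlier ("which agents can create support tickets?") becomes a lookup. A minimal sketch, assuming each agent's protocol has already been parsed into a list of declared tool names; the AgentProtocol shape, the helper functions, and the agent names are all hypothetical.

```typescript
// Hypothetical shape: one entry per agent, listing the tool names its protocol declares.
interface AgentProtocol {
  agent: string;
  tools: string[];
}

// Build an index from tool name to the agents that declare it.
function indexByTool(protocols: AgentProtocol[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const protocol of protocols) {
    for (const tool of protocol.tools) {
      const agents = index.get(tool) ?? [];
      agents.push(protocol.agent);
      index.set(tool, agents);
    }
  }
  return index;
}

// Answer "which agents can call this tool?" with a sorted list of agent names.
function agentsWithTool(protocols: AgentProtocol[], tool: string): string[] {
  return (indexByTool(protocols).get(tool) ?? []).slice().sort();
}

// Made-up example data.
const exampleProtocols: AgentProtocol[] = [
  { agent: 'support-agent', tools: ['get-user-account', 'create-support-ticket'] },
  { agent: 'billing-agent', tools: ['get-user-account'] },
];

const ticketCreators = agentsWithTool(exampleProtocols, 'create-support-ticket');
```

With inline closures scattered across route handlers there is nothing to index; with declarations in versioned files, a check like this can run in CI or in a review script.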

The Uncomfortable Part

There's a reason teams skip this and go straight to inline tool definitions. It's faster. When you're prototyping, you don't want to think about tool governance or shared interfaces. You want the agent to call a function and get a result.

That instinct is fine for a prototype. But the teams shipping agents to production are learning what every platform team has learned before: the unglamorous work of defining clean interfaces, owning execution on their own infrastructure, and building for reuse is what separates a demo from a system.

The model call is the easy part. The tool layer is where the engineering lives.
