Cullis: a federated trust fabric for AI agents
Mastio is the control point each organization installs for its own AI agents: identity, policy, audit, MCP gateway. Court is the layer that federates Mastios across different organizations by routing sealed envelopes it cannot open. Two layers, two roles, two adoption timelines.
In short
Cullis is not a single product: it is two layers designed to be adopted separately.
Mastio is the enterprise control point for AI agents. It issues a cryptographic identity to each agent, acts as an MCP gateway toward internal tools and servers, applies the policies the organization decides, and writes a non-repudiable audit log. It lives inside the organization, runs even air-gapped, and requires no external counterpart to function.
Court is the federation layer between Mastios in different organizations. It routes envelopes encrypted end-to-end between sender and receiver: it sees who talked to whom and when, never sees the content, never holds keys that would let it impersonate a Mastio or an agent. A Court compromise is a metadata leak, never a confidentiality or non-repudiation breach.
An organization adopts Mastio first, on its own, to put its own agents in order. It adds Court (its own private Court, a consortium Court, or a managed one) when it needs to talk to agents in other organizations. The same Mastio works in both modes, with no redeploy.
The problem
Companies are trying to bring AI into their workflows, and they all hit the same wall: how do you give an agent access to internal systems without leaving the door wide open?
The state of the art moves through four phases, each one increasing the value of AI inside the company and the amount of trust that has to be managed explicitly.
Phase 1: copy and paste. An employee who wants to automate a task opens an LLM chat product, copies context out of the internal system, pastes into the chat, copies the answer back. IT sees nothing. Sensitive data leaves the perimeter. Each department improvises differently and no one is accountable for any of it. This is the current state in many companies today.
Phase 2: centralized gateways. To restore order, two families of gateways are emerging, usually distinct. On one side, MCP gateways, which expose internal tools and services to agents and mediate calls toward those systems. On the other, AI gateways, which mediate calls toward LLM models (multi-provider routing, cost tracking, rate limiting). Both solve part of the problem (visibility, access control, traffic governance), but they share the same structural defect: they remain designed for users, not for agents. The caller identity is the user holding the API key, not the agent making the call. If Alex has five different agents, the audit log says “user Alex called endpoint X” five times, and no one knows which of the five did what.
Phase 3: every user will have a personal agent. This is where the market is heading. The agent no longer lives in the browser, it lives on the user’s client, orchestrates tasks, calls services through the gateway. The volume of calls explodes by an order of magnitude, and each call is still attributed to the user, not to the agent. The policy “Alex can access X” is no longer enough: what is needed is “Alex’s planning agent, but not their research agent, can access X”. That is something you cannot write today.
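What such a rule would look like can be sketched in a few lines. The point is only that the policy subject must be the (user, agent) pair, not the user alone; the dictionary rule store, the agent names, and the resource names below are illustrative assumptions, not the Cullis policy language.

```python
# Hypothetical per-agent policy table: the subject is (user, agent),
# not the user alone. Names are invented for illustration.
POLICY = {
    ("alex", "planning-agent"): {"calendar-api", "crm-api"},
    ("alex", "research-agent"): {"public-web"},
}

def is_allowed(user: str, agent: str, resource: str) -> bool:
    """Allow only if this specific agent of this user is granted the resource."""
    return resource in POLICY.get((user, agent), set())

# "Alex's planning agent, but not their research agent, can access X":
print(is_allowed("alex", "planning-agent", "crm-api"))  # True
print(is_allowed("alex", "research-agent", "crm-api"))  # False
```

With a user-keyed gateway the first argument is all you have, and the two calls above become indistinguishable.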
At this level, security problems also begin: things that are edge cases today and will be daily concerns tomorrow. Prompt injections that exfiltrate data through a trusted agent. Confused deputy: a high-privilege agent acting on behalf of a low-privilege user. Agent identity spoofing. Agents calling each other with no one knowing who authorized what.
Phase 4: agents must speak across organizations. The Acme agent needs to ask the Globex agent to verify a supplier. The clinical agent at one hospital needs to call the triage agent at a partner network. One insurance company needs to exchange a claim file with another when the claim involves them both. At this level there is no solution: an MCP gateway is a construct internal to one organization, bilateral API keys do not scale beyond two or three counterparts, SFTP and partner portals are the current fallback, and auditing a conversation between organizations is a manual reconciliation between two SIEMs each side controls at home.
The stack that handles all this trust today does not exist as such. The pieces exist (TLS, OIDC, MCP, the gateways themselves), but there is no layer that says “here is this agent’s identity, here is the policy that governs what it can do, here is the non-repudiable audit of what it did, and all of this still works when the agent talks to one from a different company”.
Cullis is that layer.
The components
Cullis has three pieces, each with a specific role in making sure every message is identified, authorized, and auditable.
Mastio: the enterprise control point
Mastio is the component each organization installs inside its own perimeter. It is the guardian of the perimeter for everything that concerns AI agents. It does five things:
- Identity authority. It holds the organization’s CA, issues X.509 certificates for each agent, manages rotation, maintains the revocation list.
- MCP gateway for agents. It mediates agent calls toward internal MCP tools and servers. Here is a structural difference from existing MCP gateways: the caller that authenticates is the agent with its cryptographic identity, not the user with an API key. From this single choice flow all the properties the others cannot give (per-agent policy, per-agent audit, per-agent non-repudiation).
- LLM gateway for agents. It also mediates calls toward models, not only those toward MCP tools. The same per-agent identity applies on both sides: the audit log records which agent called which model with which prompt, not “user Alex called provider X”. Detailed in the next section.
- Policy engine. It applies the rules the organization decides on which agents can do what, toward which internal tools, toward which external agents.
- Non-repudiable audit log. It writes every event to an append-only log with a hash chain and a temporal anchor (RFC 3161), so the history is reconstructable and tamper-evident even to whoever administers the database.
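The audit-log property can be illustrated with a minimal hash chain. This is a toy sketch, not Mastio’s actual schema: the field names are invented, and the periodic RFC 3161 anchoring of the chain head to an external TSA is noted in a comment rather than implemented.

```python
import hashlib, json, time

def append_event(chain: list, event: dict) -> dict:
    """Append an event to a toy hash chain: each entry commits to the
    previous entry's hash, so later tampering breaks every link after it.
    (In the real system the chain head would periodically be anchored to
    an external TSA per RFC 3161.)"""
    prev = chain[-1]["hash"] if chain else "0" * 64
    digest = hashlib.sha256(
        json.dumps({"event": event, "prev": prev}, sort_keys=True).encode()
    ).hexdigest()
    chain.append({"event": event, "prev": prev, "ts": time.time(), "hash": digest})
    return chain[-1]

def verify(chain: list) -> bool:
    """Recompute every link; a single modified past event is detected."""
    prev = "0" * 64
    for entry in chain:
        expect = hashlib.sha256(
            json.dumps({"event": entry["event"], "prev": prev}, sort_keys=True).encode()
        ).hexdigest()
        if entry["hash"] != expect or entry["prev"] != prev:
            return False
        prev = entry["hash"]
    return True

chain = []
append_event(chain, {"agent": "agent-a", "action": "tool_call", "tool": "crm"})
append_event(chain, {"agent": "agent-a", "action": "llm_call", "model": "m1"})
print(verify(chain))             # True
chain[0]["event"]["tool"] = "x"  # tamper with history
print(verify(chain))             # False
```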
Mastio works on its own, with no external counterpart. An organization can adopt it to put its own agents in order with no commitment to an external network.
Court: the federation layer
Court is the component that comes into play when an agent in one organization needs to talk to an agent in another. It routes envelopes encrypted end-to-end between the sender’s Mastio and the receiver’s.
The key point about Court is what it does not do:
- It never reads message content (encrypted end-to-end with keys Court does not hold).
- It does not hold keys that would let it impersonate a Mastio or an agent.
- It is not the primary source of cross-org non-repudiation (more on this below).
Court’s persistent state is an audit chain of routing metadata (who published what, when, from which Mastio), partitioned per organization and anchored to an external TSA (RFC 3161), so that Court itself is a tamper-evident witness of what it has seen. The payload remains invisible to Court.
Cross-org non-repudiation does not depend on Court’s chain: it derives from the cross-org dual-write between the two local chains of the involved Mastios. For each message exchanged between Acme and Globex, Mastio Acme appends the event to its own local chain and Mastio Globex appends the same event to its own. Three months later, a regulator can compare the two chains independently: if they agree, that is clean non-repudiation; if they disagree, the divergence point identifies exactly the contested event. Court’s chain is a third parallel witness, useful in disputes about the routing path, but its compromise does not undermine the non-repudiation of past conversations, because the two cross-verified Mastio chains each stay intact on their own database.
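The regulator’s check described above reduces to a simple comparison. A sketch, with invented event records:

```python
def first_divergence(chain_a: list, chain_b: list):
    """Compare the two organizations' chains event by event; return the
    index of the first contested event, or None if they fully agree."""
    for i, (a, b) in enumerate(zip(chain_a, chain_b)):
        if a != b:
            return i
    if len(chain_a) != len(chain_b):          # one side is missing events
        return min(len(chain_a), len(chain_b))
    return None

acme   = [{"msg": "m1"}, {"msg": "m2"}, {"msg": "m3"}]
globex = [{"msg": "m1"}, {"msg": "m2-altered"}, {"msg": "m3"}]
print(first_divergence(acme, acme))    # None -> the chains agree
print(first_divergence(acme, globex))  # 1    -> the contested event
```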
A Court compromise is therefore a metadata leak and a service interruption for in-flight messages, never a confidentiality or non-repudiation breach. An organization can choose to use a private Court (its own), a consortium Court (run by a neutral third party), or a Cullis-managed Court.
The agent-side bridge: Connector, SDK, SPIRE
Wherever the agent lives, a component is needed that holds the private key (never exported, never transmitted), terminates the mTLS channel toward Mastio, signs outgoing envelopes, and validates incoming ones. This role is filled by one of three distinct products, chosen based on where the agent runs.
Connector is the daemon for laptops and desktops, the product designed for the end user (developer, knowledge worker). It has a local web dashboard and a tray icon, it is zero-CLI: whoever installs it opens the dashboard, approves enrollment toward the enterprise Mastio, and from that moment the agent in their coding tool (any MCP-aware client) talks to Mastio through the Connector.
SDK (cullis_sdk) is the Python library a backend service includes in its own code. Same cryptographic role as Connector, but in-process instead of sidecar. It is the right product for a ticketing agent, a scheduled job, a microservice that talks to an enterprise database: anything that does not have an end user to whom a dashboard would be shown.
SPIRE integration is for organizations that already have SPIRE as a workload identity fabric in production. The workload presents an SVID issued by SPIRE, Mastio accepts it through the chain to the Org CA, and the SPIFFE identity becomes the internal agent_id. There is no new daemon to install and no key binding to manage: SPIRE already does the work, Mastio reuses it.
The three products share the same cryptographic function in three different form factors. Each one covers one of the enrollment flows described in a later section.
Cullis Chat and Frontdesk: the conversational interface
The agent-side bridge is the “machine to Mastio” entry primitive. Mastio also exposes a second entry primitive, “user to Mastio”, through a conversational interface.
Cullis Chat is the SPA (single-page app) that lets a user talk to an AI agent inside Cullis. It renders the conversations, the tool calls, surfaces the principal model (users, agents, sessions), and is audited like everything else. It runs on the local Connector, under the same workload identity as the power user who runs it.
Cullis Frontdesk is the server-side packaging of the same SPA for multi-user deployment. Installed once per organization, regular users sign in through SSO from a browser, each request is signed with the user’s own principal through a multi-user KMS (covered in the “Topography” section).
The two deployments map to two usage profiles. Cullis Chat on a single power user’s laptop, with a local Connector, fits developers and advanced users who manage their own workload identity. Cullis Frontdesk fits the average employee, who opens a browser, signs in via SSO, and talks to the organization’s AI agents without installing anything.
The unified gateway: tools and models under the same control
The aspect of Mastio that deserves its own section is the symmetry between the two flows it mediates. When you describe a gateway for AI agents, the first thing that comes to mind is “agent asks for a tool, the gateway answers”. That is half of the flow. The other half is “agent calls the model, the model answers”. Both are exit points from the perimeter, both can carry sensitive data, both have to be mediated by the same control plane if per-agent identity and audit are to hold everywhere.
If the gateway only mediates toward tools but leaves direct access to the model open, there is a direct path agent → external LLM provider that the control plane does not see. The problem is not theoretical: an agent that already has access to internal data can exfiltrate it by writing it into the prompt to the external model, and the provider receives it in cleartext while the audit log records nothing. The same holds in the other direction: a prompt injection inserted in the model’s reply directs the agent to do things it should not have done. Without a gateway on the model side as well, two of the main attack vectors against AI agents today remain uncovered.
Mastio mediates both sides. The same process that sees agent → tool calls also sees agent → model calls, under the same per-agent identity, inside the same non-repudiable audit log. It is not a second product, it is a second arrow leaving the same control plane. That is why Mastio’s responsibilities are five, not four.
Technically, routing toward LLM providers is implemented by embedding an open-source LLM router library (LiteLLM, MIT licensed). It speaks to roughly a hundred providers through a single chat-completion schema. Reusing it means not reinventing adapters for each provider, and concentrating where a standard layer does not yet exist: per-agent identity on the call, the non-repudiable audit chain, and active safety on prompts and results. Mastio does not aim to be one more LLM router, it aims to be the trust layer above the router.
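The division of labor, router below and trust layer above, can be sketched as a wrapper. The router is injected as a plain callable standing in for a real call such as LiteLLM’s completion API; the policy table, agent IDs, and audit fields are all illustrative assumptions, not the Mastio implementation.

```python
import time

def mediated_completion(agent_id: str, model: str, prompt: str,
                        route_fn, policy: dict, audit_log: list) -> str:
    """Hypothetical trust layer above an LLM router: per-agent policy
    check, routing via an injected callable (a stand-in for e.g.
    litellm.completion), and an audit record keyed to the agent's
    identity rather than a user API key."""
    if model not in policy.get(agent_id, set()):
        audit_log.append({"agent": agent_id, "model": model, "decision": "deny"})
        raise PermissionError(f"{agent_id} may not call {model}")
    reply = route_fn(model=model, prompt=prompt)  # the router does the actual call
    audit_log.append({"agent": agent_id, "model": model,
                      "decision": "allow", "ts": time.time()})
    return reply

log = []
policy = {"planning-agent": {"gpt-x"}}
fake_router = lambda model, prompt: f"reply from {model}"
print(mediated_completion("planning-agent", "gpt-x", "hi", fake_router, policy, log))
```

The audit entry names the agent, not the user holding a key, which is the property the surrounding text argues existing routers cannot give.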
For organizations that already operate an enterprise AI gateway in production (any of the established ones, or an internal proxy built on top of a router library), Mastio sits in front without asking to replace anything. The existing gateway keeps doing its routing and cost-tracking work, Mastio adds the per-agent identity, the cross-org audit, and the federation on top. This holds for those who have already invested in a model-side solution and want the trust layer that does not exist there yet. The two operational variants (embedded and brownfield) are detailed in the topography section.
Topography: how Cullis deploys
Cullis deploys in two topographies, a simpler one (intra-org) and a richer one (cross-org). The difference is the number of Mastios involved and the presence or not of a Court. On both, the two operational variants of the AI gateway (embedded or brownfield) apply, depending on what model-side infrastructure the organization already has in place.
Intra-org: standalone Mastio
One organization, one Mastio, no Court. Typical components:
- A VM or a Kubernetes pod running Mastio
- Postgres for state (bindings, audit, issued certificates, enrollment sessions)
- Vault or enterprise KMS for the Org CA
- Reverse proxy (nginx, traefik) in front of Mastio for mTLS toward agents
- On each developer’s laptop: the Connector
- On internal backend services: the SDK embedded in the code
- On Kubernetes workloads: SPIRE as the attestation authority, integrated with Mastio
Two access modes apply on top of this base, depending on who is using the agent.
Mode A · direct workload identity. A power user runs Cullis Chat on their own laptop, talking to the local Connector. Or a backend service runs the SDK in-process. Or a Kubernetes workload uses SPIRE. In all three, the agent process itself holds a workload identity (X.509 + SPIFFE) and authenticates to Mastio with mTLS.
Mode B · shared frontdesk with user principals. Cullis Frontdesk is installed inside the organization, regular users sign in through SSO from a browser. Each request is signed with the user’s own principal via a multi-user KMS. The KMS is the same Org KMS Mastio uses for the Org CA, accessed through a per-user namespace.
The typical flow in either mode is agent (or user) → bridge or Frontdesk → mTLS to Mastio → Mastio mediates toward internal MCP tools or toward the LLM. Everything stays inside the enterprise perimeter, nothing leaves toward third parties unless it is the target LLM provider.
Cross-org: two Mastios and a Court
When the organization has to talk to agents in other organizations, a Court is added (its own, consortium, or managed) and federation with the counterpart Mastio is established. There is Mastio A in Acme + Mastio B in Globex, each with its own Org CA, and a Court reachable from both.
The flow: Acme agent → Mastio Acme (signs, applies local policy, encrypts E2E with the public key of Mastio Globex) → Court (sees only routing metadata) → Mastio Globex (verifies signature, applies local policy, decrypts) → destination Globex agent through its local Connector or SDK. Audit is dual-write: one chain on Mastio Acme, one on Mastio Globex, both anchored via TSA for non-repudiation. Court never sees the payload and holds no keys that would let it impersonate a Mastio or an agent.
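The flow above can be compressed into a runnable sketch. The cryptography here is deliberately a toy (a keyed XOR stream for sealing and a shared-secret HMAC for the signature, both stand-ins for the real asymmetric primitives); what the sketch preserves is the structure: Court logs metadata and forwards a ciphertext it cannot open, and only the receiving Mastio decrypts and verifies.

```python
import hashlib, hmac, json

def toy_seal(key: bytes, data: bytes) -> bytes:
    """Toy stand-in for public-key E2E encryption: a keyed XOR stream.
    Applying it twice with the same key recovers the plaintext."""
    stream = (hashlib.sha256(key).digest() * (len(data) // 32 + 1))[:len(data)]
    return bytes(a ^ b for a, b in zip(data, stream))

def mastio_acme_send(signing_key: bytes, globex_key: bytes, payload: dict) -> dict:
    body = json.dumps(payload).encode()
    return {"meta": {"from": "mastio-acme", "to": "mastio-globex"},
            "ciphertext": toy_seal(globex_key, body),
            "sig": hmac.new(signing_key, body, "sha256").hexdigest()}

def court_route(envelope: dict, court_log: list) -> dict:
    court_log.append(envelope["meta"])  # Court records routing metadata only
    return envelope                     # the payload stays opaque to Court

def mastio_globex_receive(signing_key: bytes, globex_key: bytes, env: dict) -> dict:
    body = toy_seal(globex_key, env["ciphertext"])  # XOR stream is its own inverse
    assert hmac.new(signing_key, body, "sha256").hexdigest() == env["sig"]
    return json.loads(body)

court_log = []
env = mastio_acme_send(b"acme-sign", b"globex-e2e", {"claim": "C-42"})
env = court_route(env, court_log)
print(mastio_globex_receive(b"acme-sign", b"globex-e2e", env))  # {'claim': 'C-42'}
print(court_log)  # [{'from': 'mastio-acme', 'to': 'mastio-globex'}]
```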
Both AI gateway variants in either topography
LLM egress always goes through Mastio, but the path between Mastio and the providers has two operational variants.
- Embedded (default): routing is inside the Mastio process (LLM router as a library). No new containers or services to deploy, the provider keys live in the enterprise KMS, the audit of the call to the provider is the same chain as everything else.
- Brownfield: routing is delegated to an existing AI gateway already operated by the organization. Mastio does not act as the LLM gateway here: it translates the Cullis identity into the credential the existing gateway expects, applies policy at the boundary, and audits the crossing. The existing gateway keeps its own controls (RBAC, budget, rate limit).
The choice between embedded and brownfield is operational, not functional: from the trust plane’s point of view the two variants are equivalent.
The three adoption flows
Every agent, before it can call tools or talk to other agents, must have an identity in the organization’s Mastio. There are three enrollment flows, chosen according to where the agent lives.
Interactive laptops (Connector device-code)
The case of developers using a coding agent on their own machine, and tomorrow of anyone who will have a personal agent running next to them.
The developer launches the Connector, which generates a private key locally that never leaves the device. The Connector requests an enrollment session from the organization’s Mastio, an admin approves it from the dashboard, and the Connector completes the handshake by signing a challenge with the freshly generated key. From that moment the laptop is an enrolled agent, recognized by Mastio, subject to organization policy, present in the audit log.
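The handshake can be sketched with a toy one-time signature (Lamport, built from hashes alone), which keeps the example self-contained while preserving the property that matters: the private key stays on the device and Mastio verifies possession from public material only. The real flow uses X.509 keys and a proper signature scheme; all names here are illustrative.

```python
import hashlib, secrets

def keygen():
    """Toy Lamport one-time keypair: sk never leaves the laptop; only the
    hashes (pk) are sent to Mastio in the enrollment session request."""
    sk = [(secrets.token_bytes(16), secrets.token_bytes(16)) for _ in range(256)]
    pk = [(hashlib.sha256(a).digest(), hashlib.sha256(b).digest()) for a, b in sk]
    return sk, pk

def bits_of(msg: bytes) -> str:
    return bin(int.from_bytes(hashlib.sha256(msg).digest(), "big"))[2:].zfill(256)

def sign(sk, msg: bytes) -> list:
    """Reveal one preimage per bit of the message digest."""
    return [pair[int(b)] for pair, b in zip(sk, bits_of(msg))]

def verify(pk, msg: bytes, sig: list) -> bool:
    return all(hashlib.sha256(s).digest() == pk[i][int(b)]
               for i, (s, b) in enumerate(zip(sig, bits_of(msg))))

# Enrollment, compressed: the Connector generates the keypair and shares
# only pk; after admin approval Mastio issues a challenge, and the
# Connector proves possession of sk by signing it locally.
sk, pk = keygen()                      # on the laptop; sk never leaves
challenge = secrets.token_bytes(32)    # issued by Mastio
response = sign(sk, challenge)         # signed locally by the Connector
print(verify(pk, challenge, response)) # Mastio's check -> True
```

A Lamport key is single-use, which is why it works for this one-shot challenge but not for the long-lived agent identity; the real Connector keeps a reusable X.509 key for that.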
Internal services and enterprise MCP servers (BYOCA)
The case of models and services developed in-house and made available to all employees: a ticketing agent, an agent that talks to an enterprise database, an MCP server that exposes an internal API, a scheduled job that processes documents.
The AI team (or platform team) generates a certificate for the agent signed by the enterprise CA (Vault, an internal CA, an enterprise PKI) and registers it with Mastio. Mastio verifies the chain, verifies possession of the key, and from that moment the service is enrolled. The backend service uses the SDK (cullis_sdk) to hold the key and sign calls toward Mastio in-process. BYOCA stands for “Bring Your Own CA”: an organization that already has certificate issuance processes reuses them, with no need to go through an interactive flow. This is the flow that typically populates Mastio’s “gateway slice” (the set of internal MCP servers accessible to employee agents).
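The chain check at registration time can be shown structurally. Real Mastio would verify actual X.509 signatures; this toy walks issuer links through plain dicts, with invented CA names, only to make the decision visible.

```python
# Hypothetical BYOCA check: walk the issuer chain of the presented
# certificate up to a trusted enterprise root. Dicts stand in for real
# X.509 certificates; no signatures are verified in this sketch.
ENTERPRISE_ROOTS = {"acme-root-ca"}

def chain_ok(cert: dict, certs_by_name: dict) -> bool:
    seen = set()
    while cert["issuer"] != cert["name"]:       # not yet at a self-signed cert
        if cert["name"] in seen:                # guard against issuer loops
            return False
        seen.add(cert["name"])
        issuer = certs_by_name.get(cert["issuer"])
        if issuer is None:                      # broken chain
            return False
        cert = issuer
    return cert["name"] in ENTERPRISE_ROOTS     # did we reach a trusted root?

store = {
    "acme-root-ca":    {"name": "acme-root-ca",    "issuer": "acme-root-ca"},
    "acme-infra-ca":   {"name": "acme-infra-ca",   "issuer": "acme-root-ca"},
    "ticketing-agent": {"name": "ticketing-agent", "issuer": "acme-infra-ca"},
}
print(chain_ok(store["ticketing-agent"], store))  # True
```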
Kubernetes workloads with an existing identity fabric (SPIRE)
The case of organizations that already adopted SPIRE as workload identity fabric for their Kubernetes clusters and want the same workloads to appear inside Mastio with the SPIFFE identity SPIRE already attests.
Mastio treats SPIRE as the attestation authority: the workload presents an SVID issued by SPIRE, Mastio verifies the chain up to the Org CA that signed the SPIRE intermediate, and accepts the SPIFFE identity as agent_id. Credential rotation is automatic (typical TTL ~1 hour). This flow covers the most mature enterprise slice, and is addressed to those who have already done the work of SPIRE adoption and do not want to manage a second credentials system in parallel.
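The identity mapping can be sketched in a few lines. The agent_id naming scheme below is an invented illustration, not the real Cullis convention; the only claims taken from the text are that the trust domain is checked and the SPIFFE identity becomes the internal agent_id.

```python
from urllib.parse import urlparse

def agent_id_from_svid(spiffe_id: str, trust_domain: str) -> str:
    """Map a SPIFFE ID to a hypothetical internal agent_id, assuming Mastio
    accepts only its own trust domain. The namespace/service-account path
    layout is the common SPIRE-on-Kubernetes shape, not a Cullis requirement."""
    u = urlparse(spiffe_id)
    if u.scheme != "spiffe" or u.netloc != trust_domain:
        raise ValueError(f"untrusted SPIFFE ID: {spiffe_id}")
    parts = u.path.strip("/").split("/")
    # spiffe://acme.example/ns/billing/sa/invoicer -> "billing/invoicer" (toy)
    return "/".join(parts[1::2]) if len(parts) >= 4 else u.path.strip("/")

print(agent_id_from_svid("spiffe://acme.example/ns/billing/sa/invoicer",
                         "acme.example"))  # billing/invoicer
```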