Designing Safe Hand-offs: Making Agent-to-Agent Collaboration Probably Safe
- Daniel Bertrand

- Mar 15
Plain-English idea: when one AI agent asks another for help, that “handoff” should be as clear and controlled as a purchase order—not a casual DM. Safe hand-offs let you keep the speed of automation without the risk of quiet, hard-to-spot incidents.

Why hand-offs matter
Most teams don’t run a single agent; they run several (triage, finance, research, HR, etc.). The moment one agent can ask another to “just do it,” you gain speed, but you also open the door to misunderstanding or abuse. A sloppy hand-off can:
blur who actually made the decision,
mix outside requests with inside authority,
and hide risky actions inside “normal” teamwork.
The failure modes (in human terms)
The vague ask: “Can you send them the full report?” (Which report? To whom? Is that allowed?)
Borrowed authority: a low-access bot asks a high-access bot to run a sensitive task—no one notices.
Hidden shortcuts: agents use informal phrasing or code-words; your filters see “normal” text, not a risky request.
Design goals (no jargon)
Clarity: every hand-off says what, why, where, and who—in a small, structured form.
Control: sensitive requests get a human click.
Trace: you can reconstruct who asked for what and what happened next.
These principles align with practical guidance from OWASP (avoid “excessive agency”), threat-aware posture from NCSC, and governance practices promoted by NIST and MITRE.
The Safe Hand-off Card (one screen, five fields)
Replace free-text agent DMs with a tiny “handoff card” that every agent must fill. Keep it boring and consistent:
Intent (pick one): review / summarize / update / export / delete / invite
Data type (pick one): Public / Internal / Sensitive
Destination (pick one): named lists, folders, or systems only (no open text)
Reason (one sentence): why this is needed
Source (pick one): internal / external (did this start with an email, web page, upload, ticket?)
Rule of thumb: if Intent is export, delete, or invite, or if Source is external, require human approval before the receiving agent acts.
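The five fields map naturally onto a small typed record. The following is a minimal sketch in Python, not a reference implementation: the class and field names (`HandoffCard`, `needs_human_approval`) and the two entries in `ALLOWED_DESTINATIONS` are assumptions made for illustration, with the destinations borrowed from this post's own examples.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    REVIEW = "review"
    SUMMARIZE = "summarize"
    UPDATE = "update"
    EXPORT = "export"
    DELETE = "delete"
    INVITE = "invite"


class DataType(Enum):
    PUBLIC = "Public"
    INTERNAL = "Internal"
    SENSITIVE = "Sensitive"


class Source(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


# Hypothetical pick-list; a real deployment would load its own named destinations.
ALLOWED_DESTINATIONS = frozenset({"Finance-Metrics list", "S3-Payroll-Archive"})

# Intents that the rule of thumb sends to a human.
HIGH_RISK_INTENTS = frozenset({Intent.EXPORT, Intent.DELETE, Intent.INVITE})


@dataclass(frozen=True)
class HandoffCard:
    intent: Intent
    data_type: DataType
    destination: str  # must be one of ALLOWED_DESTINATIONS, never free text
    reason: str       # one sentence explaining why this is needed
    source: Source

    def __post_init__(self) -> None:
        # Reject anything that is not on the pick-list or has no stated reason.
        if self.destination not in ALLOWED_DESTINATIONS:
            raise ValueError(f"destination not on the pick-list: {self.destination!r}")
        if not self.reason.strip():
            raise ValueError("reason must be a non-empty sentence")

    def needs_human_approval(self) -> bool:
        # High-risk intent or external origin means a human click is required.
        return self.intent in HIGH_RISK_INTENTS or self.source is Source.EXTERNAL
```

Using enums rather than strings means a misspelled intent fails when the card is built, not when the receiving agent acts on it.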
The approval map (one slide leaders can adopt)
No approval: read-only or internal review of Public/Internal data.
Manager approval: any export of Internal data.
Data owner approval: any action (export/delete/invite) on Sensitive data, or any request originating from external sources.
Keep the rule short enough to remember, and post it where requests appear.
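The approval map is small enough to express as one function. The sketch below rests on two assumptions that are not in the map itself: "read-only" is taken to mean the review and summarize intents, and any combination the map does not mention (for example, an update to Internal data) falls back to manager approval. The names `Approval` and `required_approval` are illustrative.

```python
from enum import Enum


class Approval(Enum):
    NONE = "no approval"
    MANAGER = "manager approval"
    DATA_OWNER = "data owner approval"


# Assumption: these two intents are what the map means by "read-only".
READ_ONLY_INTENTS = {"review", "summarize"}
HIGH_RISK_INTENTS = {"export", "delete", "invite"}


def required_approval(intent: str, data_type: str, source: str) -> Approval:
    """Return the approval level the map asks for, checking the strictest rule first."""
    # Any request that started outside the organisation goes to the data owner.
    if source == "external":
        return Approval.DATA_OWNER
    # Export, delete, or invite on Sensitive data also goes to the data owner.
    if data_type == "Sensitive" and intent in HIGH_RISK_INTENTS:
        return Approval.DATA_OWNER
    # Any export of Internal data needs a manager.
    if data_type == "Internal" and intent == "export":
        return Approval.MANAGER
    # Read-only work on Public or Internal data needs no approval.
    if data_type in {"Public", "Internal"} and intent in READ_ONLY_INTENTS:
        return Approval.NONE
    # Assumption: the map is silent on everything else, so ask a manager.
    return Approval.MANAGER
```

Checking the strictest rule first means a request that matches several rows always gets the heavier approval.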

The “broker” pattern (your safety gate)
Don’t let agents message each other directly. Route all hand-offs through a simple broker that:
enforces the card (no card, no action),
checks the approval map,
and logs the full story (who asked, what was requested, approvals, results).
This sounds technical, but many teams implement it with a form, a small workflow, and a shared log. Your security team can later add analytics. (This also pairs well with playbooks from CISA on secure automation.)
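For teams that do want to script it, the broker's three jobs fit in a few dozen lines. This is a rough sketch under stated assumptions: the `Broker` class, its `submit` method, the outcome strings, and the log field names are all hypothetical, and the approval check uses the simple rule of thumb (export, delete, invite, or external origin needs a named approver) rather than the full approval map.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional

REQUIRED_FIELDS = ("intent", "data_type", "destination", "reason", "source")
INTENTS = {"review", "summarize", "update", "export", "delete", "invite"}
DATA_TYPES = {"Public", "Internal", "Sensitive"}
SOURCES = {"internal", "external"}
HIGH_RISK_INTENTS = {"export", "delete", "invite"}


@dataclass
class Broker:
    destinations: frozenset                   # named pick-list of allowed destinations
    log: list = field(default_factory=list)   # shared log of every hand-off

    def _card_problem(self, card: dict) -> Optional[str]:
        """Return why the card is invalid, or None if it is complete."""
        missing = [f for f in REQUIRED_FIELDS if not str(card.get(f) or "").strip()]
        if missing:
            return "missing fields: " + ", ".join(missing)
        if card["intent"] not in INTENTS:
            return "intent is not on the pick-list"
        if card["data_type"] not in DATA_TYPES:
            return "data type is not on the pick-list"
        if card["source"] not in SOURCES:
            return "source is not on the pick-list"
        if card["destination"] not in self.destinations:
            return "free-text destination"
        return None

    def submit(self, requester: str, receiver: str, card: dict,
               action: Callable[[dict], object],
               approved_by: Optional[str] = None) -> str:
        """Enforce the card, check approval, run the action if allowed, log everything."""
        problem = self._card_problem(card)
        if problem is not None:
            outcome, result = "blocked", problem
        elif (card["intent"] in HIGH_RISK_INTENTS or card["source"] == "external") \
                and approved_by is None:
            outcome, result = "pending_approval", None
        else:
            outcome, result = "executed", action(card)
        # Log the full story: who asked, what was requested, approvals, results.
        self.log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "requester": requester, "receiver": receiver, "card": dict(card),
            "approved_by": approved_by, "outcome": outcome, "result": result,
        })
        return outcome
```

The receiving agent's work is passed in as `action`, so the broker is the only code path that can trigger it, and blocked or pending requests are logged just like executed ones.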
What “good” looks like (examples)
Good:
“Intent: summarize • Data: Internal • Destination: Finance-Metrics list • Reason: monthly headcount • Source: internal → Auto-approved.”
Needs approval:
“Intent: export • Data: Sensitive • Destination: S3-Payroll-Archive • Reason: audit request • Source: external → Data owner approval required.”
Blocked:
“Intent: export • Data: Sensitive • Destination: ‘someone@vendor.com’ (free text) → No free-text destinations allowed.”
Red flags your broker should catch
Free-text anything (intent, destination, recipients).
External → sensitive requests without approval.
“First-time” actions for a given agent (new tool, new destination).
After-hours big moves (large exports at 2 a.m. with no approver online).
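These four checks can be sketched as a single function the broker runs on each hand-off. Everything numeric here is an assumption to tune locally: the after-hours window (20:00 to 06:00), the "large export" threshold (10,000 rows), and the record field names (`agent`, `row_count`, `approver_online`, and so on) are illustrative, not prescribed by the card.

```python
from datetime import datetime

# Hypothetical pick-list of named destinations.
DESTINATIONS = frozenset({"Finance-Metrics list", "S3-Payroll-Archive"})
# Assumptions: local definitions of "after hours" and "large".
AFTER_HOURS_START, AFTER_HOURS_END = 20, 6
LARGE_EXPORT_ROWS = 10_000


def red_flags(handoff: dict, seen_pairs: set) -> list:
    """Return a list of red-flag labels for one hand-off record.

    seen_pairs holds (agent, destination) tuples observed in earlier hand-offs.
    """
    flags = []
    # 1. Free text where a pick-list value was expected.
    if handoff["destination"] not in DESTINATIONS:
        flags.append("free-text destination")
    # 2. External request touching Sensitive data with no approver recorded.
    if (handoff["source"] == "external" and handoff["data_type"] == "Sensitive"
            and not handoff.get("approved_by")):
        flags.append("external -> sensitive without approval")
    # 3. First time this agent has used this destination.
    if (handoff["agent"], handoff["destination"]) not in seen_pairs:
        flags.append("first-time destination for this agent")
    # 4. Large export outside working hours with no approver online.
    hour = handoff["requested_at"].hour
    after_hours = hour >= AFTER_HOURS_START or hour < AFTER_HOURS_END
    if (after_hours and handoff["intent"] == "export"
            and handoff.get("row_count", 0) >= LARGE_EXPORT_ROWS
            and not handoff.get("approver_online", False)):
        flags.append("large after-hours export with no approver online")
    return flags
```

A flag is a prompt for review rather than an automatic block; whether any of these should also stop the request is a policy choice this sketch leaves open.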
Roles & responsibilities (so nothing slips)
Executives: endorse the card + broker as the standard; make approvals an expectation, not a favour.
Managers: own your agents’ approval lists and review them monthly.
Front-line staff: write clear “reasons” on the card; treat external requests as advice, not orders.
IT/Sec: implement the broker, lock destinations to a pick-list, and alert on “external → sensitive” sequences.

Launch plan (one week, low lift)
Day 1–2: Publish the five-field hand-off card and the approval map.
Day 3–4: Turn off free-text DMs between agents; route through the broker (even if it’s a lightweight form).
Day 5: Limit destinations to named lists/folders/systems.
Day 6–7: Review the first week’s hand-offs; tighten anything confusing.
Metrics that prove it’s working
% of hand-offs using the card (goal: 100%)
Number of blocked “external → sensitive” requests (should trend down)
Time from request to approval (should stay short and predictable)
% of hand-offs with clear “Reason” filled (goal: >95%)
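If the broker keeps a structured log, these four numbers can be computed directly from it. The sketch below assumes a hypothetical log format in which each entry is a dict with `used_card`, `reason`, `outcome`, `source`, `data_type`, and `approval_seconds` (seconds from request to approval, or `None` when no approval was involved). It uses the median as one reasonable reading of "short and predictable".

```python
from statistics import median


def handoff_metrics(log: list) -> dict:
    """Summarise a list of hand-off log entries into the four headline metrics."""
    total = len(log)

    def pct(count: int):
        # Percentages are undefined for an empty log, so report None.
        return None if total == 0 else 100 * count / total

    used_card = sum(1 for e in log if e.get("used_card"))
    with_reason = sum(1 for e in log if (e.get("reason") or "").strip())
    blocked_ext_sensitive = sum(
        1 for e in log
        if e.get("outcome") == "blocked"
        and e.get("source") == "external"
        and e.get("data_type") == "Sensitive"
    )
    waits = [e["approval_seconds"] for e in log
             if e.get("approval_seconds") is not None]

    return {
        "pct_using_card": pct(used_card),                      # goal: 100
        "blocked_external_sensitive": blocked_ext_sensitive,   # should trend down
        "median_approval_seconds": median(waits) if waits else None,
        "pct_with_reason": pct(with_reason),                   # goal: > 95
    }
```

Running this over each week's log and comparing the results week to week gives the trend lines the goals above refer to.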
Copy-paste checklist
No direct agent DMs; all hand-offs use the card
Approvals required for export/delete/invite and any external-origin request
Destinations are pick-lists; free-text disabled
Logs show who/what/why/where for each hand-off
Weekly 15-minute review of outliers (new tools, after-hours, large exports)


