Stop writing approval gates in code
You needed production deploys to require a human sign-off. So
you wrote it: an if in the deploy tool that checks
a flag, posts to Slack, and waits. It worked. Then the refund
tool needed the same gate. Then the delete-user tool. Now the
rule lives in three places, each a little different — and
nobody can tell you, in one sentence, what your approval policy
actually is.
That's the trouble with governance written as application code. It feels like the natural place to put it. It's the wrong place.
Three things go wrong when governance is code
It drifts. Three hand-written copies of "require approval" do not stay identical. One forgets to write an audit line. One has a 30-second timeout, another waits forever. Six months in, the gates are cousins, not clones — and the differences are all accidental.
It can't be reviewed as a whole. Ask "what in
this system needs human approval?" and there's no answer to
point at. There's only a grep. Your policy isn't a document or
a config — it's an emergent property of scattered
if statements, and emergent properties are exactly
what you don't want a security policy to be.
It's tangled with the real work. The gate lives in the same function as the deploy. You can't test "the approval policy" in isolation, because there is no such thing — there's only the deploy function, with a gate knotted through it. Auditors can't read it. New engineers don't see it until they trip over it.
The same gate, declared
Here is that approval gate in mcp-flowgate. The whole thing:
proxy:
  expose:
    - name: deploy.prod
      executor: { kind: human, queue: prod-deployments }
With kind: human, a model's call to deploy.prod no longer
triggers a deploy. The runtime records a
human.approval.requested event, returns a pending
status, and halts. The action does not run. A person resolves
the queue.
The rule isn't buried inside the deploy logic. It's a property
of the exposure, sitting in a config file, where anyone can
read it without reading code. "What needs approval?" is now a
question you answer by searching one file for
kind: human.
It's a layer, not a flag
Here's the part that makes declared governance more than cosmetic. That gate isn't one clever line — it's three independent layers, and they compose.
First, a transition can be tagged actor: human.
The runtime checks the caller and rejects an agent or anonymous
submitter with ACTOR_MISMATCH — before the
executor runs at all.
Second, the human executor itself only ever
records an approval request and returns a queued status. It
physically does not advance the workflow. So even if a config
author forgets the actor: human tag, the executor
still won't fire the action.
Third, you can stack a permission guard on top for
role-based control over who may resolve it.
Three layers, each enough on its own, none depending on the others being correct. That's defense in depth — and you get it by declaring, not by remembering to write all three checks into every tool by hand.
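To see why the layers don't lean on each other, here is a minimal Python model of the three checks. It is a sketch, not mcp-flowgate's implementation: Caller, Result, submit, human_executor, and may_resolve are hypothetical names, and only the ACTOR_MISMATCH code, the pending status, and the human.approval.requested event come from the description above.

```python
from dataclasses import dataclass, field

# Hypothetical model of the three layers; these names are not mcp-flowgate's API.

@dataclass
class Caller:
    kind: str                                   # "human", "agent", or "anonymous"
    roles: set = field(default_factory=set)

@dataclass
class Result:
    status: str
    events: list = field(default_factory=list)

def submit(caller, required_actor, executor):
    # Layer 1: the actor check runs before the executor is touched.
    if required_actor is not None and caller.kind != required_actor:
        return Result("ACTOR_MISMATCH")
    return executor()

def human_executor(queue):
    # Layer 2: only records a request and reports pending.
    # There is no code path here that performs the action.
    return Result("pending", [("human.approval.requested", queue)])

def may_resolve(caller, required_role):
    # Layer 3: a permission check on who may resolve the queued request.
    return caller.kind == "human" and required_role in caller.roles
```

Calling submit with required_actor=None models the forgotten actor: human tag. The agent gets past layer 1, and layer 2 still returns pending without running anything.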
The rest of governance has the same shape
Approval gates aren't special. Every guardrail you'd otherwise hand-code is declarable:
- name: deploy.prod
  inputSchema:
    type: object
    required: [environment, version]
    properties:
      environment: { type: string, enum: [staging, production] }
  guards:
    - { kind: evidence, requires: [tests_passed] }
  reliability:
    timeoutMs: 30000
    retry: { maxAttempts: 3, backoff: exponential }
  executor: { kind: cli, connection: kubectl }

- inputSchema is JSON Schema. It runs before the executor. Bad input is rejected with INPUT_SCHEMA_VIOLATION and never reaches your tool. You didn't write a validation function.
- guards are preconditions: permission, role, a small expr language, or evidence that some named fact was recorded. You didn't write a precondition check.
- reliability is timeout, retry with backoff, and fallback executors. You didn't write a retry loop, and you didn't write three subtly different ones.
Each of these would otherwise be code, copied into every tool that needs it, drifting on its own schedule. Declared, each rule lives once, at the level where the rule actually belongs.
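The ordering is the part worth internalizing: validation happens first, and a rejected call never touches the executor. A rough Python sketch of that ordering follows. It hand-checks only the required and enum keywords, which is a small slice of JSON Schema, and violations and call_tool are hypothetical helpers rather than anything mcp-flowgate exposes.

```python
# Hypothetical sketch: check input against a tiny subset of JSON Schema
# (required + enum only) before the executor is ever called.

SCHEMA = {
    "type": "object",
    "required": ["environment", "version"],
    "properties": {
        "environment": {"type": "string", "enum": ["staging", "production"]},
    },
}

def violations(schema, payload):
    """Return a list of problems; an empty list means the input passed."""
    problems = []
    for key in schema.get("required", []):
        if key not in payload:
            problems.append(f"missing required field: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key in payload and "enum" in rule and payload[key] not in rule["enum"]:
            problems.append(f"{key} must be one of {rule['enum']}")
    return problems

def call_tool(payload, executor):
    problems = violations(SCHEMA, payload)
    if problems:
        # Bad input stops here; the executor is not invoked.
        return {"error": "INPUT_SCHEMA_VIOLATION", "details": problems}
    return {"result": executor(payload)}
```

The executor only ever sees payloads that already passed, so the tool itself carries no validation code.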
Put the rule where it's invariant
That last phrase — "the level where the rule belongs" — is worth slowing down on, because it's how declared governance stops drift from creeping back.
A guard can sit at three levels, and the levels mean different things. On a capability, a guard says "this action can never run without X" — and it travels with the capability everywhere it's used. On a wrapper — a capability that layers policy onto another — a guard says "whenever this policy is applied, also require Y." On a transition, a guard says "going from this state to that one specifically needs Z."
This sounds like filing. It's actually the safeguard. Put "requires tests passed" on the capability, and no workflow that uses it can forget the rule — remembering isn't the workflow's job, because the rule is invariant and lives at the invariant level. Application code has no clean way to express "this is true everywhere this thing is used." Config makes you name the level, and naming it is what keeps the rule from being three slightly different rules a year from now.
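One way to picture the three levels is as lists that get merged whenever a transition is evaluated. The Python below is an illustration under that assumption only: the dictionary shapes and the guard strings are made up for the example, and the real merge order inside mcp-flowgate is not something this article specifies.

```python
def effective_guards(capability, wrappers, transition):
    """Collect guards from all three levels for one transition."""
    guards = list(capability.get("guards", []))      # invariant: always included
    for wrapper in wrappers:
        guards.extend(wrapper.get("guards", []))     # applied policy
    guards.extend(transition.get("guards", []))      # this specific state change
    return guards

# Hypothetical capability and wrapper definitions.
deploy = {"name": "deploy.prod", "guards": ["evidence:tests_passed"]}
audited = {"name": "audited", "guards": ["permission:deployer"]}

# Two different workflows using the same capability.
hotfix = effective_guards(deploy, [], {"guards": []})
release = effective_guards(deploy, [audited], {"guards": ["evidence:approval"]})
```

Neither workflow mentions tests_passed, yet both end up requiring it, because the rule was attached to the capability rather than to either caller.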
Gates that depend on facts, not flags
The most interesting guard kind is evidence, and
it's worth its own section because it changes what a gate
is.
A normal guard checks a value. An evidence guard checks that
something happened. When an executor succeeds, it can
record evidence — a named kind, like tests_passed.
An evidence guard on a later transition requires that named
evidence to exist for this workflow before the transition can
fire.
So "you can't deploy until the tests passed" stops being a
boolean somebody set. It becomes "a test run actually recorded
tests_passed evidence on this workflow." You can't
satisfy it by flipping a flag, and you can't satisfy it by
reordering calls — the evidence has to have been genuinely
produced by a step that ran.
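The difference between a flag and a fact is easy to show in a few lines of Python. This is a toy model with hypothetical names (Workflow, run_step, can_deploy), assuming evidence is an append-only record written only when a step succeeds.

```python
class Workflow:
    def __init__(self):
        # Append-only (kind, produced_by) records; there is no setter for a flag.
        self.evidence = []

    def run_step(self, name, action, records=None):
        """Run a step and record evidence only if the action succeeds."""
        ok = action()
        if ok and records:
            self.evidence.append((records, name))
        return ok

    def has_evidence(self, kind):
        return any(k == kind for k, _ in self.evidence)

def can_deploy(workflow):
    # Evidence guard: legal only if the named fact was recorded on this workflow.
    return workflow.has_evidence("tests_passed")
```

A failed test run leaves nothing behind, so the guard stays closed until a run actually succeeds on that workflow.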
Evidence also counts. The object form requires N records of a kind — a real two-person rule, declared in two lines:
# one approval is not enough — require two
guards:
  - { kind: evidence, requires: [{ kind: approval, count: 2 }] }
That's a quorum. The transition is illegal until two separate
approval records exist. Writing that correctly by hand —
counting distinct approvers, not double-counting one, not
racing — is exactly the kind of code that goes subtly wrong.
Declared, it's a guard with a count.
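For a sense of what that guard replaces, here is roughly the minimum a hand-written quorum has to get right, sketched in Python. ApprovalLog is a hypothetical class, and keying on the approver's id is an assumption about how "distinct" would be decided.

```python
import threading

class ApprovalLog:
    """Hand-rolled quorum: distinct approvers, safe under concurrent calls."""

    def __init__(self):
        self._lock = threading.Lock()
        self._approvers = set()        # a set, so one person is never counted twice

    def approve(self, approver_id):
        with self._lock:               # serialize writers so no approval is lost
            self._approvers.add(approver_id)

    def quorum_met(self, count):
        with self._lock:
            return len(self._approvers) >= count
```

Swap the set for a list, or drop the lock, and the code still looks right in review. That is the failure mode the declared count avoids.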
Retries that don't double-fire
One more piece of hand-rolled governance worth retiring: the idempotency key. A guarded action with retries is great until the action has side effects — a payment, a created PR. A retry of a call that timed out but actually succeeded can do the work twice.
Declare idempotencyKey: true and the runtime
computes one key per submit and reuses it across every retry
and every fallback executor. The REST executor sends
it as an Idempotency-Key header; the CLI executor
exposes it as an environment variable. You declared "retry
this safely." You didn't hand-thread a key through a retry loop
and hope you covered the fallback path too.
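The property being promised is "one key per submit, no matter how many attempts." A small Python sketch of that property follows. The function name submit_once and the way executors receive the key as an argument are assumptions for illustration; per the description above, the real executors pass it as a header or an environment variable.

```python
import uuid

def submit_once(executors, max_attempts=3):
    """Try each executor in order (primary first, then fallbacks), retrying
    each up to max_attempts times, with a single key for the whole submit."""
    key = str(uuid.uuid4())            # computed once, before any attempt
    for executor in executors:
        for _ in range(max_attempts):
            try:
                return executor(key)
            except TimeoutError:
                continue               # retry, reusing the same key
    raise RuntimeError("all executors failed")
```

Because the key is created outside both loops, a downstream service that deduplicates on it sees every retry and every fallback as the same request.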
Why config — and where the line is drawn
There's a fair objection here: configuration languages have a habit of growing into bad programming languages. Give config an expression syntax and soon it has loops, and soon your "config" is unreviewable code wearing a YAML costume.
mcp-flowgate draws that line on purpose, and the docs are blunt
about it. The expression syntax is deliberately tiny —
$.scope.path references, basic comparisons, a
handful of operator objects. No embedded scripting language.
The reasoning, straight from the project's own design notes:
- Declarative purity. A YAML map is data you can diff and reason about. An embedded expression language is code hiding in data.
- Static validation. A tiny syntax can be checked at config-load time — without running anything.
- Audit clarity. Audit events carry the literal expression. Two small guards document themselves; one dense scripting line doesn't.
- No smuggling. Once expressions can do real work, application logic creeps out of executors and into config — exactly the mixing this whole approach exists to prevent.
The point was never "config can do anything." It's that governance — gates, guards, retries, schemas — is a different kind of thing from application logic, and it earns the right to live somewhere you can see all of it at once.
What declared governance gives you for free
Because the rules are data, two things fall out at no extra cost.
Every guard evaluation emits a guard.evaluated
audit event. Every rejected transition emits
transition.rejected. The audit log isn't a thing
you remembered to add — it's the running record of your policy
being enforced, call by call.
And the config gets a linter. mcp-flowgate check
validates it in CI — catching dangling transition targets,
unreachable states, and dead-end non-terminal states before
they ship. Governance scattered through application code
doesn't get a linter. It gets an incident.
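Those three checks are ordinary graph questions, which is why they are cheap once the workflow is data. The Python below is a sketch of the idea, not the check command's actual logic, and the workflow dictionary shape (initial, terminal, transitions) is invented for the example.

```python
def lint(workflow):
    """Return a list of structural problems in a workflow definition.

    Expected shape (hypothetical):
    {"initial": str, "terminal": set, "transitions": {state: [target, ...]}}
    """
    transitions = workflow["transitions"]
    states = set(transitions) | workflow["terminal"]
    problems = []

    # 1. Dangling targets: a transition points at a state that does not exist.
    for source, targets in transitions.items():
        for target in targets:
            if target not in states:
                problems.append(f"dangling target: {source} -> {target}")

    # 2. Unreachable states: walk the graph from the initial state.
    reachable = {workflow["initial"]}
    frontier = [workflow["initial"]]
    while frontier:
        state = frontier.pop()
        for target in transitions.get(state, []):
            if target in states and target not in reachable:
                reachable.add(target)
                frontier.append(target)
    for state in sorted(states - reachable):
        problems.append(f"unreachable state: {state}")

    # 3. Dead ends: a non-terminal state with no way out.
    for state in sorted(states - workflow["terminal"]):
        if not transitions.get(state):
            problems.append(f"dead end: {state}")

    return problems
```

Nothing in lint executes a tool or calls a service. It only reads the declared structure, which is what lets a check like this run in CI.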
The test
Here's the one question worth asking of your own system. If an auditor (or a new engineer, or you in a year) asked "what here requires human approval, and what happens when a call fails?", could you answer by opening one file?
If the answer is "no, you'd have to read the code of every tool," your governance is application code. Move it. Make it something you can point at.
The governance guide walks through every guard kind, actor enforcement, and reliability in full, with worked configs.