Stop writing approval gates in code

You needed production deploys to require a human sign-off. So you wrote it: an if statement in the deploy tool that checks a flag, posts to Slack, and waits. It worked. Then the refund tool needed the same gate. Then the delete-user tool. Now the rule lives in three places, each a little different, and nobody can tell you, in one sentence, what your approval policy actually is.

That's the trouble with governance written as application code. It feels like the natural place to put it. It's the wrong place.

Three things go wrong when governance is code

It drifts. Three hand-written copies of "require approval" do not stay identical. One forgets to write an audit line. One has a 30-second timeout, another waits forever. Six months in, the gates are cousins, not clones — and the differences are all accidental.

It can't be reviewed as a whole. Ask "what in this system needs human approval?" and there's no answer to point at. There's only a grep. Your policy isn't a document or a config — it's an emergent property of scattered if statements, and emergent properties are exactly what you don't want a security policy to be.

It's tangled with the real work. The gate lives in the same function as the deploy. You can't test "the approval policy" in isolation, because there is no such thing — there's only the deploy function, with a gate knotted through it. Auditors can't read it. New engineers don't see it until they trip over it.

The same gate, declared

Here is that approval gate in mcp-flowgate. The whole thing:

gateway.yaml
proxy:
  expose:
    - name: deploy.prod
      executor: { kind: human, queue: prod-deployments }

With kind: human, a model calling deploy.prod no longer triggers a deploy. The runtime records a human.approval.requested event, returns a pending status, and halts. The action does not run. A person resolves the queue.

The rule isn't buried inside the deploy logic. It's a property of the exposure, sitting in a config file, where anyone can read it without reading code. "What needs approval?" is now a question you answer by searching one file for kind: human.

It's a layer, not a flag

Here's the part that makes declared governance more than cosmetic. That gate isn't one clever line — it's three independent layers, and they compose.

First, a transition can be tagged actor: human. The runtime checks the caller and rejects an agent or anonymous submitter with ACTOR_MISMATCH — before the executor runs at all.

Second, the human executor itself only ever records an approval request and returns a queued status. By design, it never advances the workflow. So even if a config author forgets the actor: human tag, the executor still won't fire the action.

Third, you can stack a permission guard on top for role-based control over who may resolve it.

Three layers, each enough on its own, none depending on the others being correct. That's defense in depth — and you get it by declaring, not by remembering to write all three checks into every tool by hand.
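
To make the layering concrete, here is a minimal Python sketch of how three independent checks might compose. It is an illustration of the idea, not mcp-flowgate's implementation: the `Caller` type, the `submit` function, the dictionary shape of a transition, and the `PERMISSION_DENIED` code are all assumptions made up for this example. Only `ACTOR_MISMATCH` and the `human.approval.requested` event name come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Caller:
    kind: str                                      # "human", "agent", or "anonymous"
    permissions: set = field(default_factory=set)


def submit(caller, transition, audit, action):
    """Apply each layer in turn; any one of them can stop the call by itself."""
    # Actor layer: checked before any executor runs.
    required_actor = transition.get("actor")
    if required_actor is not None and caller.kind != required_actor:
        return {"status": "rejected", "code": "ACTOR_MISMATCH"}

    # Permission layer (optional). "PERMISSION_DENIED" is a hypothetical code.
    needed = transition.get("permission")
    if needed is not None and needed not in caller.permissions:
        return {"status": "rejected", "code": "PERMISSION_DENIED"}

    # Executor layer: a human executor records the request and stops.
    # It never calls `action`, whatever the other layers decided.
    if transition["executor"] == "human":
        audit.append("human.approval.requested")
        return {"status": "pending"}

    return {"status": "done", "result": action()}
```

Note that dropping the `actor` key from the transition still leaves the action unreachable through the human executor, which is the property the second layer is meant to guarantee.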

The rest of governance has the same shape

Approval gates aren't special. Every guardrail you'd otherwise hand-code is declarable:

gateway.yaml
- name: deploy.prod
  inputSchema:
    type: object
    required: [environment, version]
    properties:
      environment: { type: string, enum: [staging, production] }
  guards:
    - { kind: evidence, requires: [tests_passed] }
  reliability:
    timeoutMs: 30000
    retry: { maxAttempts: 3, backoff: exponential }
  executor: { kind: cli, connection: kubectl }

  • inputSchema is JSON Schema. It runs before the executor. Bad input is rejected with INPUT_SCHEMA_VIOLATION and never reaches your tool. You didn't write a validation function.
  • guards are preconditions: permission, role, a small expr language, or evidence that some named fact was recorded. You didn't write a precondition check.
  • reliability is timeout, retry with backoff, and fallback executors. You didn't write a retry loop, and you didn't write three subtly different ones.
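
For a sense of what those three declarations replace, here is a rough Python sketch of the order of operations: validate, then check guards, then execute with retries. It is a hand-rolled approximation under stated assumptions, not how mcp-flowgate works internally. The `validate` helper understands only `required` and `enum`, the `GUARD_FAILED` and `RETRIES_EXHAUSTED` codes are invented for the example, the 0.1-second base delay is arbitrary, and `timeoutMs` is not modeled at all.

```python
import time

# Mirrors the YAML above as a plain dict (only the keys this sketch uses).
EXPOSURE = {
    "inputSchema": {
        "required": ["environment", "version"],
        "properties": {"environment": {"enum": ["staging", "production"]}},
    },
    "guards": [{"kind": "evidence", "requires": ["tests_passed"]}],
    "reliability": {"retry": {"maxAttempts": 3}},
}


def validate(schema, payload):
    """Check only `required` and `enum`, the two keywords this sketch supports."""
    if any(key not in payload for key in schema.get("required", [])):
        return False
    for key, rule in schema.get("properties", {}).items():
        if key in payload and "enum" in rule and payload[key] not in rule["enum"]:
            return False
    return True


def call(exposure, payload, evidence, executor, sleep=time.sleep):
    # 1. Input schema runs first; bad input never reaches the executor.
    if not validate(exposure["inputSchema"], payload):
        return {"status": "rejected", "code": "INPUT_SCHEMA_VIOLATION"}
    # 2. Guards are preconditions; here, named evidence must already exist.
    for guard in exposure.get("guards", []):
        missing = [kind for kind in guard["requires"] if kind not in evidence]
        if missing:
            return {"status": "rejected", "code": "GUARD_FAILED", "missing": missing}
    # 3. Reliability: retry with exponential backoff between attempts.
    max_attempts = exposure["reliability"]["retry"]["maxAttempts"]
    for attempt in range(max_attempts):
        try:
            return {"status": "done", "result": executor(payload)}
        except Exception:
            if attempt + 1 < max_attempts:
                sleep(0.1 * 2 ** attempt)
    return {"status": "failed", "code": "RETRIES_EXHAUSTED"}
```

Every tool that needs these checks would otherwise carry its own copy of roughly this much code, which is where the drift comes from.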

Each of these would otherwise be code, copied into every tool that needs it, drifting on its own schedule. Declared, each rule lives once, at the level where the rule actually belongs.

Put the rule where it's invariant

That last phrase — "the level where the rule belongs" — is worth slowing down on, because it's how declared governance stops drift from creeping back.

A guard can sit at three levels, and the levels mean different things. On a capability, a guard says "this action can never run without X" — and it travels with the capability everywhere it's used. On a wrapper — a capability that layers policy onto another — a guard says "whenever this policy is applied, also require Y." On a transition, a guard says "going from this state to that one specifically needs Z."

This sounds like filing. It's actually the safeguard. Put "requires tests passed" on the capability, and no workflow that uses it can forget the rule — remembering isn't the workflow's job, because the rule is invariant and lives at the invariant level. Application code has no clean way to express "this is true everywhere this thing is used." Config makes you name the level, and naming it is what keeps the rule from being three slightly different rules a year from now.
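
One way to picture the three levels is as a function that gathers every guard that applies to a transition. The sketch below is a simplified model, not the project's resolution algorithm: the `wraps` and `capability` keys and the idea of representing guards as plain strings are assumptions chosen to keep the example short.

```python
def effective_guards(capabilities, transition):
    """Collect guards from the capability, any wrappers around it, and the transition."""
    chain = []
    name = transition["capability"]
    while name is not None:
        capability = capabilities[name]
        chain.append(capability)
        name = capability.get("wraps")      # hypothetical key: the wrapped capability

    guards = []
    for capability in reversed(chain):      # innermost capability first
        guards.extend(capability.get("guards", []))
    guards.extend(transition.get("guards", []))
    return guards


CAPABILITIES = {
    "deploy": {"guards": ["tests_passed"]},                           # invariant
    "deploy.audited": {"wraps": "deploy", "guards": ["change_ticket"]},  # wrapper policy
}

# A workflow that says nothing about tests still inherits the capability's guard.
print(effective_guards(CAPABILITIES, {"capability": "deploy"}))
```

The workflow author in the last line never mentioned tests, yet the guard is in the result, because it was attached where it is invariant.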

Gates that depend on facts, not flags

The most interesting guard kind is evidence, and it's worth its own section because it changes what a gate is.

A normal guard checks a value. An evidence guard checks that something happened. When an executor succeeds, it can record evidence — a named kind, like tests_passed. An evidence guard on a later transition requires that named evidence to exist for this workflow before the transition can fire.

So "you can't deploy until the tests passed" stops being a boolean somebody set. It becomes "a test run actually recorded tests_passed evidence on this workflow." You can't satisfy it by flipping a flag, and you can't satisfy it by reordering calls — the evidence has to have been genuinely produced by a step that ran.

Evidence also counts. The object form requires N records of a kind — a real two-person rule, declared in two lines:

gateway.yaml
# one approval is not enough — require two
guards:
  - { kind: evidence, requires: [{ kind: approval, count: 2 }] }

That's a quorum. The transition is illegal until two separate approval records exist. Writing that correctly by hand — counting distinct approvers, not double-counting one, not racing — is exactly the kind of code that goes subtly wrong. Declared, it's a guard with a count.
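
For comparison, here is roughly what a hand-written version of that guard has to get right. This is a sketch under assumptions, not the runtime's logic: the record shape (`kind` plus an optional `actor`) is invented, and the choice to count two records from the same actor only once is an assumption about what "separate" should mean.

```python
from collections import Counter


def normalize(requirement):
    """Accept the string form ("tests_passed") or the object form ({kind, count})."""
    if isinstance(requirement, str):
        return requirement, 1
    return requirement["kind"], requirement.get("count", 1)


def evidence_guard(requires, records):
    """Return (kind, needed, found) for every requirement that is not yet met."""
    # Assumption: records of one kind from the same actor count once.
    distinct = {(record["kind"], record.get("actor")) for record in records}
    found = Counter(kind for kind, _ in distinct)
    unmet = []
    for requirement in requires:
        kind, needed = normalize(requirement)
        if found[kind] < needed:
            unmet.append((kind, needed, found[kind]))
    return unmet


records = [
    {"kind": "approval", "actor": "alice"},
    {"kind": "approval", "actor": "alice"},   # same approver twice
]
print(evidence_guard([{"kind": "approval", "count": 2}], records))
```

An empty list means the guard passes. The double approval from one person leaves the requirement unmet in this sketch, which is the detail a naive `len(records) >= 2` check misses.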

Retries that don't double-fire

One more piece of hand-rolled governance worth retiring: the idempotency key. A guarded action with retries is great until the action has side effects — a payment, a created PR. A retry of a call that timed out but actually succeeded can do the work twice.

Declare idempotencyKey: true and the runtime computes one key per submit and reuses it across every retry and every fallback executor. The REST executor sends it as an Idempotency-Key header; the CLI executor exposes it as an environment variable. You declared "retry this safely." You didn't hand-thread a key through a retry loop and hope you covered the fallback path too.
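
The hand-threaded version looks something like the Python sketch below. It assumes a random UUID as the key and executors that are plain callables taking the key as a second argument. How mcp-flowgate actually derives its key is not stated here, so treat `uuid4` as a stand-in.

```python
import uuid


def submit(executors, payload, max_attempts=3):
    """Try each executor in order, reusing one idempotency key for every attempt."""
    key = str(uuid.uuid4())              # computed once per submit, never per attempt
    for executor in executors:           # primary first, then any fallbacks
        for _ in range(max_attempts):
            try:
                return executor(payload, key)
            except TimeoutError:
                continue                 # retry with the same key
    raise RuntimeError("all executors failed")
```

The easy mistake is generating the key inside the inner loop, or forgetting to pass it to the fallback. Either one lets a timed-out but successful call run a second time.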

Why config — and where the line is drawn

There's a fair objection here: configuration languages have a habit of growing into bad programming languages. Give config an expression syntax and soon it has loops, and soon your "config" is unreviewable code wearing a YAML costume.

mcp-flowgate draws that line on purpose, and the docs are blunt about it. The expression syntax is deliberately tiny — $.scope.path references, basic comparisons, a handful of operator objects. No embedded scripting language. The reasoning, straight from the project's own design notes:

  • Declarative purity. A YAML map is data you can diff and reason about. An embedded expression language is code hiding in data.
  • Static validation. A tiny syntax can be checked at config-load time — without running anything.
  • Audit clarity. Audit events carry the literal expression. Two small guards document themselves; one dense scripting line doesn't.
  • No smuggling. Once expressions can do real work, application logic creeps out of executors and into config — exactly the mixing this whole approach exists to prevent.
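
The static-validation point is easier to see with a toy. The Python sketch below defines a deliberately tiny expression form and checks it without evaluating anything. The operator-object shape (`{"eq": [left, right]}`) and the four operator names are assumptions for illustration; the real syntax may differ, and only the `$.scope.path` reference style comes from the text above.

```python
import operator
import re

OPS = {"eq": operator.eq, "ne": operator.ne, "lt": operator.lt, "gt": operator.gt}
REF = re.compile(r"^\$\.[A-Za-z_]\w*(\.[A-Za-z_]\w*)*$")


def check(expr):
    """Load-time validation: list problems without evaluating the expression."""
    if not isinstance(expr, dict) or len(expr) != 1:
        return ["expression must be a single-key operator object"]
    ((op, args),) = expr.items()
    problems = []
    if op not in OPS:
        problems.append(f"unknown operator: {op}")
    if not isinstance(args, list) or len(args) != 2:
        problems.append(f"{op} takes exactly two operands")
        return problems
    for arg in args:
        if isinstance(arg, str) and arg.startswith("$") and not REF.match(arg):
            problems.append(f"malformed reference: {arg}")
    return problems


def resolve(arg, scopes):
    """Follow a $.scope.path reference into nested dicts; pass literals through."""
    if isinstance(arg, str) and arg.startswith("$."):
        value = scopes
        for part in arg[2:].split("."):
            value = value[part]
        return value
    return arg


def evaluate(expr, scopes):
    ((op, args),) = expr.items()
    left, right = (resolve(arg, scopes) for arg in args)
    return OPS[op](left, right)
```

Because the grammar has no calls, loops, or assignments, `check` can be exhaustive in a few lines. That stops being true the moment the expression language can run arbitrary code.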

The point was never "config can do anything." It's that governance — gates, guards, retries, schemas — is a different kind of thing from application logic, and it earns the right to live somewhere you can see all of it at once.

What declared governance gives you for free

Because the rules are data, two things fall out at no extra cost.

Every guard evaluation emits a guard.evaluated audit event. Every rejected transition emits transition.rejected. The audit log isn't a thing you remembered to add — it's the running record of your policy being enforced, call by call.

And the config gets a linter. mcp-flowgate check validates it in CI — catching dangling transition targets, unreachable states, and dead-end non-terminal states before they ship. Governance scattered through application code doesn't get a linter. It gets an incident.
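
Those three checks are ordinary graph checks once the workflow is data. The Python sketch below shows the general technique on a made-up workflow shape (`initial`, `states`, `transitions`, `to`, `terminal` are all assumed key names). It is not the implementation of mcp-flowgate check.

```python
def lint(workflow):
    """Report dangling targets, dead-end non-terminal states, and unreachable states."""
    states = workflow["states"]
    problems = []

    for name, state in states.items():
        targets = [t["to"] for t in state.get("transitions", [])]
        for target in targets:
            if target not in states:
                problems.append(f"dangling target: {name} -> {target}")
        if not targets and not state.get("terminal", False):
            problems.append(f"dead end: {name}")

    # Depth-first walk from the initial state to find what is reachable.
    seen, stack = set(), [workflow["initial"]]
    while stack:
        current = stack.pop()
        if current in seen or current not in states:
            continue
        seen.add(current)
        stack.extend(t["to"] for t in states[current].get("transitions", []))

    problems.extend(f"unreachable: {name}" for name in states if name not in seen)
    return problems


WORKFLOW = {
    "initial": "draft",
    "states": {
        "draft": {"transitions": [{"to": "review"}, {"to": "archived"}]},
        "review": {},                    # no way out, not marked terminal
        "done": {"terminal": True},      # nothing leads here
    },
}
print(lint(WORKFLOW))
```

None of this is possible when the state machine exists only as control flow spread across functions.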

The test

Here's the one question worth asking of your own system. If an auditor, a new engineer, or you in a year asked "what here requires human approval, and what happens when a call fails?", could you answer by opening one file?

If the answer is "no, you'd have to read the code of every tool," your governance is application code. Move it. Make it something you can point at.

The governance guide walks through every guard kind, actor enforcement, and reliability in full, with worked configs.