
Deterministic chaining: when you don't need the LLM to decide

Your deploy pipeline is four steps: lint, test, build, deploy. The first three either pass or they don't — there's no judgment in them. But your agent treats all four the same way. It stops and thinks before every one, including the three that were never decisions.

That's wasted work, and it's the most wasteful kind: the model spending its expensive attention on steps that have a known, computable outcome.

The round-trip problem

Walk through what happens without chaining. The model starts the workflow. It gets back the lint step. It reads the response. It reasons about it. It picks the transition. It submits. It waits for lint to finish. It gets the result. It reasons again. It picks the next transition. It submits. It waits for tests. And on.

Every one of those is a full LLM round trip, and each round trip costs you four things:

  • Input tokens to re-read the workflow state
  • Output tokens to "reason" about a choice that isn't one
  • Network latency, end to end
  • API cost that scales with how long the conversation has grown

Put rough numbers on it. Take a 10-step pipeline where 8 of the steps are computable. Without chaining, that's 8 round trips that exist only to produce the answer "yes, proceed." Each one re-sends the workflow state the model has to read, and spends output tokens (the expensive kind, typically 3 to 5 times the price of input) reasoning about a step with one possible outcome. Eight times. Per run. Every run. A pipeline that needs the model at two steps ends up paying for ten round trips.
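To see how that adds up, here is a back-of-the-envelope sketch in Python. Every number in it (token counts, prices, the 4x output multiplier) is a made-up placeholder rather than a measurement, and it assumes each round trip re-reads the same amount of state, which understates the real cost as the conversation grows.

```python
# Back-of-the-envelope cost model. Every number here is an illustrative
# assumption, not a measurement or a real price list.
INPUT_PRICE = 1.0    # cost units per 1,000 input tokens (assumed)
OUTPUT_PRICE = 4.0   # output assumed 4x input, inside the 3-5x range

def round_trip_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one model round trip, in arbitrary cost units."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1000

def pipeline_cost(round_trips: int, input_tokens: int = 2000, output_tokens: int = 150) -> float:
    """Total cost if every round trip re-reads the same amount of state."""
    return round_trips * round_trip_cost(input_tokens, output_tokens)

total_steps = 10
computable_steps = 8

unchained = pipeline_cost(total_steps)                     # model wakes for every step
chained = pipeline_cost(total_steps - computable_steps)    # model wakes only for decisions

print(f"unchained: {unchained:.2f}, chained: {chained:.2f}, saved: {unchained - chained:.2f}")
```

Swap in your own token counts and prices; the ratio matters more than the absolute figures.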

Tag it and forget it

The fix is one field on a transition: actor: deterministic.

deploy-pipeline.yaml
lint:
  transitions:
    run_lint:
      target: test
      actor: deterministic
      executor: { kind: cli, command: lint-check }
test:
  transitions:
    run_tests:
      target: build
      actor: deterministic
      executor: { kind: cli, command: test-runner }
build:
  transitions:
    build_artifact:
      target: ready_to_deploy
      actor: deterministic
ready_to_deploy:
  goal: Confirm deployment
  transitions:
    deploy:
      target: deployed
      actor: agent

When a state has only deterministic transitions, the runtime chains through them on its own. The model calls workflow.start. The runtime sees that the first state's only move is deterministic, so it executes lint. Succeeds. Moves to test. Executes. Succeeds. Moves to build. Executes. Succeeds. It arrives at ready_to_deploy — which has an agent transition — and the chain stops.

Three executor calls. Zero LLM round trips. One response back to the model, at the first point where there's an actual decision to make.
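The loop behind that behavior can be sketched in a few lines of Python. This is a toy model, not mcp-flowgate's implementation: the dict layout and the run_chain function are hypothetical, shaped after the YAML above, and the sketch assumes a state with only deterministic transitions has exactly one of them.

```python
# Toy model of the chaining loop. This is NOT mcp-flowgate's implementation;
# the dict layout and run_chain are hypothetical, shaped after the YAML above.
WORKFLOW = {
    "lint": {"run_lint": {"target": "test", "actor": "deterministic"}},
    "test": {"run_tests": {"target": "build", "actor": "deterministic"}},
    "build": {"build_artifact": {"target": "ready_to_deploy", "actor": "deterministic"}},
    "ready_to_deploy": {"deploy": {"target": "deployed", "actor": "agent"}},
    "deployed": {},
}

def run_chain(workflow, state):
    """Follow deterministic transitions until a state needs a decision."""
    trace = []
    while True:
        transitions = workflow[state]
        # Stop at a terminal state or at any state offering a non-deterministic move.
        if not transitions or any(t["actor"] != "deterministic" for t in transitions.values()):
            return state, trace
        # Assumption: take the first (only) deterministic transition.
        name, transition = next(iter(transitions.items()))
        trace.append(name)  # a real runtime would run the executor here
        state = transition["target"]

state, trace = run_chain(WORKFLOW, "lint")
print(state, trace)
```

The sketch leaves out executors, failures, and the depth limit so the core idea stays visible: the loop only returns when there is something for the model to decide.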

What the model gets back

When the chain hands control back, the response carries everything the model needs to make the deploy decision well:

  • A chain array tracing each auto-executed step and its result
  • A guidance object — the goal and instructions for the current state
  • The links for the legal transitions — here, just deploy

The model reads the lint output, the test count, the build artifact — it lands at the decision with full context. And it spent nothing getting there. The intermediate steps never even appeared in its list of options; deterministic transitions are hidden from the links array.

Work that happens the moment a state is entered

There's a companion to deterministic transitions worth knowing: the onEnter action. A state can declare work that runs automatically as soon as the workflow arrives there — and stash the result into context for later guards to read.

The pattern is "as soon as you reach this state, run the analysis." A risk-review state, on entry, runs the risk analysis and writes its score into context; the transitions out of that state then have a real number to guard on — remediate if the score is too high, request approval if it's acceptable. The model didn't have to call the analysis and didn't have to decide to. Arriving at the state was enough to make the facts exist.
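As a rough illustration of the pattern, the Python sketch below runs an entry action, merges its result into context, and then checks which transitions' guards pass. Only the onEnter name comes from the text; analyze_risk, RISK_THRESHOLD, and the state layout are hypothetical stand-ins.

```python
# Toy model of an onEnter action feeding a guard. Names other than onEnter
# (analyze_risk, RISK_THRESHOLD, the state layout) are hypothetical.
RISK_THRESHOLD = 0.7  # assumed cutoff for illustration

def analyze_risk(context):
    """Stand-in analysis: more changed files means more risk."""
    return min(1.0, context["changed_files"] / 20)

STATE = {
    "onEnter": lambda ctx: {"risk_score": analyze_risk(ctx)},
    "transitions": {
        "remediate": lambda ctx: ctx["risk_score"] > RISK_THRESHOLD,
        "request_approval": lambda ctx: ctx["risk_score"] <= RISK_THRESHOLD,
    },
}

def enter(state, context):
    """Run the entry action, merge its result, and list transitions whose guards pass."""
    context = {**context, **state["onEnter"](context)}
    legal = [name for name, guard in state["transitions"].items() if guard(context)]
    return context, legal

print(enter(STATE, {"changed_files": 4}))
print(enter(STATE, {"changed_files": 18}))
```

The guards never call the analysis themselves; they only read the score that entry left behind.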

When a step breaks mid-chain

If the test step fails, the chain stops at the failure — and the model doesn't have to start over. The response includes:

  • The partial chain trace — lint succeeded, tests failed
  • The error from the failed executor
  • A recovery link to retry just the step that broke

The model sees what worked, what didn't, and has a link to try again from the failure point. A broken step in the middle of an automated chain is still a response with a way forward — the same recovery property every other call in mcp-flowgate has.
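Here is a minimal Python sketch of that shape: run the steps in order, and on the first failure return the partial trace, the error, and a pointer to the step worth retrying. The response keys ("chain", "error", "retry") are hypothetical stand-ins, not mcp-flowgate's actual field names.

```python
# Toy model of a mid-chain failure. The response keys ("chain", "error",
# "retry") are hypothetical stand-ins, not mcp-flowgate's actual field names.
def lint():
    return "0 warnings"

def run_tests():
    raise RuntimeError("2 tests failed")

def build():
    return "artifact.tar.gz"

STEPS = [("lint", lint), ("test", run_tests), ("build", build)]

def run_steps(steps):
    """Run steps in order; on failure, return what happened so far plus a retry hint."""
    chain = []
    for name, executor in steps:
        try:
            chain.append({"step": name, "ok": True, "result": executor()})
        except Exception as exc:
            chain.append({"step": name, "ok": False})
            return {"chain": chain, "error": str(exc), "retry": name}
    return {"chain": chain, "error": None, "retry": None}

response = run_steps(STEPS)
print(response)
```

Note that build never runs, and lint's result survives in the trace, so a retry can resume from the test step without repeating earlier work.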

Where a chain stops

A chain isn't a loop that runs until it's tired. It stops at exactly four things, and it's worth knowing all four:

  • A decision point — any state with a non-deterministic transition
  • A terminal state
  • The depth limit (maxChainDepth, default 50)
  • An executor failure

The depth limit is the safety net. If a config accidentally wires a cycle of deterministic transitions — it shouldn't, but mistakes happen — the chain stops at the limit instead of running forever. You can set it per workflow.
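The four stop conditions fit into one small loop. The Python sketch below is a toy model: only the maxChainDepth name and its default of 50 come from the text, while the function, the reason strings, and the exact way depth is counted are assumptions.

```python
# Toy model of the four stop conditions. The function and reason strings are
# hypothetical; only maxChainDepth and its default of 50 come from the text.
MAX_CHAIN_DEPTH = 50

def run_chain(workflow, state, executors, max_depth=MAX_CHAIN_DEPTH):
    """Return (final_state, stop_reason, steps_taken)."""
    steps = 0
    while True:
        transitions = workflow.get(state, {})
        if not transitions:
            return state, "terminal", steps
        if any(t["actor"] != "deterministic" for t in transitions.values()):
            return state, "decision", steps
        if steps >= max_depth:
            return state, "depth_limit", steps
        name, transition = next(iter(transitions.items()))
        try:
            executors.get(name, lambda: None)()  # missing executors are no-ops here
        except Exception:
            return state, "executor_failure", steps
        state = transition["target"]
        steps += 1

# A misconfigured workflow: two deterministic states pointing at each other.
CYCLE = {
    "a": {"to_b": {"target": "b", "actor": "deterministic"}},
    "b": {"to_a": {"target": "a", "actor": "deterministic"}},
}
print(run_chain(CYCLE, "a", {}))
```

Without the depth check, the cyclic config would spin forever; with it, the run ends after 50 steps with a reason the caller can act on.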

Chains that branch on real outcomes

Here's where it gets more capable than "run these steps in order." Sometimes the next step depends on what the last one returned — and that's still not a judgment call. "Run the tests; if they pass, go green; if they fail, go red" has a computable answer. The destination just depends on the result.

A transition can declare branches. After the executor runs and its output is mapped into context, the runtime picks the destination — the first branch whose when guard passes wins:

tdd.yaml
run_tests:
  target: red                            # default fallback
  executor:
    kind: cli
    connection: shell
    args: ["-c", "cargo test"]
    treatNonZeroAsFailure: false      # exit code is data, not failure
  output:
    passed: "$.output.success"
  branches:
    - when:   { kind: expr, expr: "$.context.passed == true" }
      target: green
    - when:   { kind: expr, expr: "$.context.passed == false" }
      target: red

Two details make this work. treatNonZeroAsFailure: false tells the CLI executor to capture a non-zero exit code as data (output.success: false) instead of erroring the transition. And the branches pick the path from that data. A test run that fails is no longer an exception; it's an outcome the chain routes on.
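The first-match-wins rule is easy to model. In the Python sketch below, guards are plain callables rather than the expression strings the runtime evaluates, and pick_target and map_output are hypothetical helpers written for illustration.

```python
# Toy model of branch selection. Guards are plain Python callables here; the
# real guards are expression strings evaluated by the runtime.
def pick_target(transition, context):
    """Return the target of the first branch whose guard passes, else the default."""
    for branch in transition.get("branches", []):
        if branch["when"](context):
            return branch["target"]
    return transition["target"]

RUN_TESTS = {
    "target": "red",  # default fallback
    "branches": [
        {"when": lambda ctx: ctx.get("passed") is True, "target": "green"},
        {"when": lambda ctx: ctx.get("passed") is False, "target": "red"},
    ],
}

def map_output(exit_code):
    """Mimic treatNonZeroAsFailure: false, where the exit code becomes data."""
    return {"passed": exit_code == 0}

print(pick_target(RUN_TESTS, map_output(0)))  # tests passed
print(pick_target(RUN_TESTS, map_output(1)))  # tests failed
print(pick_target(RUN_TESTS, {}))             # no branch matches: default target
```

The last call shows why the default target matters: if the output mapping produced nothing, no guard passes and the transition still has somewhere to go.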

That's the mechanism behind the tdd example, which enforces a real red → green → refactor cycle — the chain runs the tests and routes itself, no model involved, right up until there's something genuine to decide.

Chains of chains: composing whole workflows

A deterministic step doesn't have to be a single command. With the workflow executor kind, a transition can run an entire sub-workflow as one step:

deploy-with-lock.yaml
acquire_lock:
  transitions:
    start:
      target: deploying
      executor:
        kind: workflow                 # run an entire sub-workflow as one step
        definitionId: with_artifact_lock
        input: { artifact: "$.context.artifact_name" }

The runtime starts the named sub-workflow, polls it to a terminal state, and returns its result — all as a single step in the parent. That makes a real pattern composable: acquire a lock (a sub-workflow), deploy, release the lock (another sub-workflow), each stage its own governed flow with its own guards and audit trail. A sub-workflow can itself use a workflow executor, so to keep that from recursing forever there's a hard depth cap of 10.
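A recursion cap like that can be sketched as a depth counter passed down through nested runs. The Python below is a toy model: the cap of 10 comes from the text, but the data layout, the command names, the error type, and exactly where the count starts are all assumptions.

```python
# Toy model of nested workflow executors with a recursion cap. The cap of 10
# comes from the text; the data layout and error type are hypothetical.
MAX_WORKFLOW_DEPTH = 10

class WorkflowDepthError(RuntimeError):
    pass

def run_workflow(definitions, definition_id, depth=0):
    """Run a workflow's steps in order, recursing into sub-workflows."""
    if depth > MAX_WORKFLOW_DEPTH:
        raise WorkflowDepthError(f"sub-workflow nesting exceeded {MAX_WORKFLOW_DEPTH}")
    log = []
    for step in definitions[definition_id]:
        if step["kind"] == "workflow":
            log.extend(run_workflow(definitions, step["definitionId"], depth + 1))
        else:
            log.append(step["command"])
    return log

DEFINITIONS = {
    "deploy_with_lock": [
        {"kind": "workflow", "definitionId": "with_artifact_lock"},
        {"kind": "cli", "command": "deploy"},
    ],
    "with_artifact_lock": [{"kind": "cli", "command": "acquire-lock"}],
    "loops_forever": [{"kind": "workflow", "definitionId": "loops_forever"}],
}

print(run_workflow(DEFINITIONS, "deploy_with_lock"))
```

Running the self-referencing loops_forever definition raises WorkflowDepthError instead of recursing until the interpreter gives up.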

Deterministic chaining, then, isn't only "run these commands in a line." It's "run everything that's computable" — and "computable" can be a whole workflow.

Automated, not invisible

Auto-executed doesn't mean unrecorded. A chain leaves a complete trail: each deterministic step emits a chain.step event, a finished chain emits chain.completed, a failed one emits chain.failed, and every branch decision emits a transition.branched event naming the matched branch and the chosen target.

There's a second kind of trace, too. A successful executor can record evidence — the CLI executor logs a cli_output record on every successful command. So the steps a chain ran without waking the model still leave behind facts that a later guard, in the governed part of the workflow, can require. The auto-run lint and test steps don't just pass — they leave proof that they passed. You get the speed of automation with no audit gap and no honesty gap.

When to reach for it — and when not to

Any step whose outcome is computable rather than a judgment call is a candidate: linting, testing, building, data validation, format conversion, sending a notification. If a human wouldn't need to stop and think about it, the model doesn't either.

The honest flip side: chaining is for computable steps only. If a step needs judgment — even a little — it belongs to an agent transition, and it should stay there. Tagging something deterministic to shave a round trip off a step that actually needs a decision doesn't save you anything. It just gets you a confident wrong answer faster.

The model's attention is the expensive part of your system. Deterministic chaining is the discipline of not spending it on steps that were never decisions — and saving it for the ones that are.

The chaining guide has the full YAML reference, and the deploy-pipeline example is a complete, runnable workflow.