Agentic loops are just state machines

2026-04-22 · 3 min read
AI · Agents · Engineering

The first agent I shipped to production looped forever. The user typed a question, the agent reasoned about it, called a search tool, got results that didn't quite match, reasoned again, called the same search tool with a slightly rephrased query, got the same results back. Round and round. There was no terminal state. The model would have happily kept reasoning for the rest of its life if we hadn't killed the process.

The fix was three lines of code: a max_iterations counter and an explicit done transition when the agent's confidence crossed a threshold. After that I stopped thinking about agents as systems that "think" and started thinking about them as state machines.
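A minimal sketch of that fix, with illustrative names (the real agent's step function and threshold are not shown here): a hard iteration cap plus an explicit terminal transition on confidence.

```python
# Hypothetical sketch: a bounded agent loop with an explicit "done" transition.
# `step` stands in for one reason/act/observe cycle and returns
# (answer, confidence, context) — an assumed interface, not a real API.

MAX_ITERATIONS = 8
CONFIDENCE_THRESHOLD = 0.9

def run_agent(question, step):
    context = []
    answer = None
    for _ in range(MAX_ITERATIONS):
        answer, confidence, context = step(question, context)
        if confidence >= CONFIDENCE_THRESHOLD:
            return answer  # explicit terminal transition
    return answer  # cap reached: fail closed with the best answer so far
```

The cap is the important part: even if the confidence signal is useless, the loop now has a guaranteed terminal state.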

ReAct is observe, reason, act, observe

The ReAct pattern (observe, reason, act, observe) is presented in papers like it's a novel paradigm. It isn't. It's a four-state finite state machine with one self-loop and one terminal transition. Each state is whatever bundle you call the current context: system prompt, accumulated memory, the scratchpad, the last tool result. The transitions are LLM responses and tool calls. The terminal state is whatever condition you defined: a finish tool, a max-iterations cap, a regex match on the output.
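Written out explicitly, the machine is small enough to fit in a dictionary. This is a sketch with illustrative state names, not a prescribed schema; the point is that every legal move is enumerable.

```python
# ReAct as an explicit transition table. Keys are states; values are the
# set of states legally reachable from them.

TRANSITIONS = {
    "observe": {"reason"},       # fold the latest result into context
    "reason":  {"act", "done"},  # the model picks a tool or finishes
    "act":     {"observe"},      # run the tool, capture the result
    "done":    set(),            # terminal: no outgoing transitions
}

def is_legal(state, next_state):
    return next_state in TRANSITIONS[state]
```

Once the table exists, "the agent did something weird" becomes "the agent attempted an illegal transition", which is a thing you can catch and log.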

That's the whole pattern. Once you accept it, the "agent" stops being mystical and starts being engineered.

Why this reframing matters

Evals become tractable. The moment you have states and transitions, you can enumerate paths and assert invariants. After a search transition, the next state must contain a citation. After a tool_error, the next state must not call the same tool with the same arguments. These are properties you can test, not vibes you have to feel out.
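Those two invariants can be checked mechanically over a run trace. The trace schema below (dicts with transition, tool, args, and state keys) is a hypothetical shape, not a real logging format.

```python
# Sketch: assert transition-level invariants over an agent trace.
# Each entry is assumed to look like
# {"transition": ..., "tool": ..., "args": ..., "state": ...}.

def check_invariants(trace):
    violations = []
    for i in range(1, len(trace)):
        prev, curr = trace[i - 1], trace[i]
        # After a search transition, the next state must contain a citation.
        if prev["transition"] == "search" and "citation" not in curr["state"]:
            violations.append((i, "no citation after search"))
        # After a tool_error, never retry the same tool with the same args.
        if prev["transition"] == "tool_error" and \
           (curr.get("tool"), curr.get("args")) == (prev.get("tool"), prev.get("args")):
            violations.append((i, "identical retry after tool_error"))
    return violations
```

A check like this runs in CI against recorded traces; no model call required.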

Debugging stops being archaeology. When a run goes sideways, you log the transition that broke the contract, not the entire conversation history. The trace stops being a wall of text and starts being a sequence of state changes.

Reliability work gets a vocabulary. Idempotent transitions. Terminal guards. Dead-state recovery. These are all standard FSM problems with standard FSM solutions. The agent literature reinvents half of them under new names.
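Two of those standard solutions fit in a few lines. A sketch with illustrative state names, assuming a loop that asks a guard function for the next state each iteration:

```python
# Hypothetical guard combining a terminal guard (force termination at a
# step budget) with dead-state recovery (route unknown states back to a
# safe state instead of crashing).

KNOWN_STATES = {"observe", "reason", "act", "done"}

def next_state(proposed, step_count, max_steps=10):
    if step_count >= max_steps:
        return "done"    # terminal guard: the run cannot exceed its budget
    if proposed not in KNOWN_STATES:
        return "reason"  # dead-state recovery: re-enter a safe state
    return proposed
```

Neither idea is agent-specific; both are decades-old FSM hygiene.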

Practical implication

Stop building code that "thinks". Build state machines that emit reasoning as a side effect.

The thinking framing is what made my first agent loop forever. I'd written a function that called the model, parsed the output, and decided whether to keep going based on what the model said. There was no state, just a conversation. The model was perfectly happy to keep being conversational forever.

The FSM framing forces you to ask the right questions up front. What are the states? What transitions exist between them? What's the terminal state? Which transitions are reachable from which states? When you can't answer those questions, you haven't designed a system. You've written a while True loop.

The next time someone tells you their agent "decides" something, ask them to draw the FSM. If they can't, the agent doesn't decide anything. It transitions, badly.