21 Jun 2026 · 9 min read

From Infrastructure as Code to Infra as Agent

Terraform taught us to declare machines instead of building them by hand. The next step is infrastructure that is operated by agents — and agents that are themselves declared, reviewed and promoted like any other artifact.

Infrastructure as Code won so completely that it is hard to remember what it replaced. Servers were once built by hand and described in wikis that were wrong by Thursday. Each machine was a small biography — who installed what, when, in what order — and when its author left the company, the machine became a haunted house.

Then we learned to declare instead of build. The server became a file; the file lived in git; the diff became the unit of change; review became the moment of safety. “Why is production like this?” stopped being archaeology and became git log.

That migration is about to run again, one layer up. IaC made infrastructure legible to humans. The next era — call it Infra as Agent — makes infrastructure operable by agents: the layer that watches, scales, patches and heals your systems stops being a pile of scripts and becomes something that reasons. Terraform declared the machines. Now we must declare the operator.

The new snowflake servers

Look honestly at how most organisations run their first agents, and you will recognise the haunted house immediately. The typical agent is configured in a vendor dashboard. Its system prompt was last edited by someone who has since changed teams, at an unrecorded time, for an unrecorded reason. Its API keys are long-lived and shared. Its tool list grew by accretion. Nobody can say what changed before the Tuesday it started behaving differently, because nothing records changes. It is a pet — a clever, credentialed, internet-connected pet.

This matters far more than the snowflake servers ever did, because this snowflake acts. If the agent layer is going to operate the infrastructure, the agent layer must be held to a higher standard than the infrastructure — its entire definition (model, instructions, tools with their scopes, policies, budgets, eval criteria) captured in declarative files, in a repository, behind review.

Fig. 1 — The Infra as Agent pipeline. The same shape as every delivery pipeline you trust, applied to the operator itself.

What changes when the operator is a file

Each stage of that pipeline carries an old discipline into new terrain.

The definition is the truth. If the prompt in production differs from the prompt in git, that is drift — detectable, alertable, correctable — not a mystery.
Review gains a new question. Alongside “is this code correct?” sits “what can this change persuade the agent to do?” A widened tool scope deserves the scrutiny of an IAM policy change, because that is what it is.
Evals are the test suite. Deterministic code gets unit tests; probabilistic behaviour gets evals — a battery of scenarios with graded expectations, run on every change. A prompt edit that drops the eval score blocks the merge, exactly as a failing test does. Without evals, every prompt change is a deploy with no tests on a system with no types.
Rollout is progressive. New agent versions run in shadow first, proposing without acting; then under approval gates; then autonomously within budgets. Trust is promoted through environments, like any artifact.
Rollback is boring. A regression in judgement is reverted like a regression in code. You can roll back a personality now. It should feel exactly as undramatic as rolling back a deploy.

The model is a dependency

One habit from the IaC years matters more than the rest: pinning. The model behind your operating agent is a dependency, and model upgrades are dependency bumps — sometimes minor, sometimes breaking, never to be absorbed silently. Pin versions explicitly, run the eval suite against the candidate, read the diff in behaviour, then merge. The teams that learned this with database drivers already know the choreography; only the substance is new.

What Infra as Agent looks like in practice

The endpoint is not science fiction; the pieces ship today. A diagnostic agent declared as a Kubernetes resource, reconciled by a controller, holding read-only tools to Prometheus and the API server. A remediation agent whose write scopes were merged one pull request at a time, each expansion justified by a track record the evals can show. A capacity agent that proposes Terraform diffs but never applies them — the apply belongs to the same pipeline every human change goes through.

Notice what is common to all of these: the agent operates the infrastructure, but the repository operates the agent. Authority flows from git, through review, through evals, into a scoped runtime — never from a dashboard, never from folklore.

The terrain ahead

Terraform’s deepest gift was not automation. It was legibility — the ability of an organisation to read its own infrastructure, reason about change, and hand systems between people without folklore. The agentic layer needs that gift more urgently than the servers ever did, because this layer acts, persuades and errs in ways a misconfigured VM never could.

The tooling will consolidate; the YAML will mutate; the names will shift. The practice will not: declared, reviewed, evaluated, promoted, reverted. We spent two decades learning to operate machines we could read. We are not going to be operated for by machines we cannot.

— Continue reading

All field notes →

Work with us