kagent: running AI agents the Kubernetes-native way
A CNCF sandbox project treats agents as cluster citizens: declared as custom resources, reconciled by a controller, governed by the RBAC and GitOps you already trust. Why that pattern matters more than the project.
Kubernetes has a gravitational pull. Databases resisted it, then arrived. CI resisted it, then arrived. Machine learning training resisted it loudly, then arrived with operators and schedulers. Now it is the agents’ turn, and the project leading that descent into the cluster is kagent — started at Solo.io, donated to the CNCF as a sandbox project, and quietly becoming the reference answer to a question every platform team is about to be asked: where do the agents run?
Agents as cluster citizens
kagent’s core move is philosophical, and it will outlive any particular codebase: an agent is not a script someone runs, it is a declared resource the cluster reconciles. You describe your agents the way you describe Deployments — in YAML, in git — and a controller makes reality match the declaration.
- An Agent resource carries the instructions and the set of tools the agent may use.
- A ModelConfig names which model serves the intelligence, and how to reach it.
- Tool servers — speaking MCP, the Model Context Protocol — expose capabilities like querying Prometheus, inspecting Argo CD applications, or reading Kubernetes objects.
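The declare-and-reconcile idea fits in a few lines. What follows is a minimal Python sketch of the pattern, not kagent's controller code: `AgentSpec`, its fields and `reconcile` are hypothetical names invented for illustration, and a plain dictionary stands in for the cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical stand-in for a declared agent; not kagent's actual schema."""
    instructions: str
    model_config: str
    tools: tuple[str, ...] = ()


def reconcile(desired: dict[str, AgentSpec], actual: dict[str, AgentSpec]) -> list[str]:
    """Mutate `actual` until it matches `desired`; return the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(f"create {name}")
        elif actual[name] != spec:
            actions.append(f"update {name}")
        actual[name] = spec
    # Anything running that is no longer declared gets removed.
    for name in list(actual):
        if name not in desired:
            actions.append(f"delete {name}")
            del actual[name]
    return actions


# Declared state (what lives in git) versus observed state (what is running).
desired = {"sre-helper": AgentSpec("Diagnose failing pods.", "default-model", ("k8s-read",))}
actual = {"old-agent": AgentSpec("Stale instructions.", "default-model")}

print(reconcile(desired, actual))  # ['create sre-helper', 'delete old-agent']
print(reconcile(desired, actual))  # [] (already converged, nothing to do)
```

The second call returning no actions is the point: reconciliation is idempotent, so the declaration in git stays the single source of truth.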
Why declaration beats invocation
The genius of this arrangement is everything the agent inherits for free. The moment an agent is a pod that a controller creates from a custom resource, your existing platform answers the hard questions:
- Who can change the agent’s instructions? Whoever can merge to the repo Argo CD watches — same as every other change.
- What can the agent reach? Whatever its ServiceAccount and NetworkPolicies allow — no more, reviewed like everything else.
- What did it do last Tuesday? The audit log and the traces, in the same observability stack as the rest of the estate.
- How do we roll back a bad agent? The same way you roll back a bad Deployment. It is just a resource.
Compare this with the alternative quietly accumulating in most organisations: agents defined in someone’s SaaS dashboard, holding long-lived API keys, modified by whoever has the password, with no diff, no review and no audit trail. The cluster-citizen pattern is not merely tidier. It is the difference between an agent you can account for and one you can only apologise for.
What we actually use it for
The natural first assignments are the ones where the agent’s knowledge is the cluster itself. Why is this pod CrashLooping? Which deployment correlates with the latency regression? Is this Istio routing rule doing what its author believed? kagent ships with tooling aimed exactly at this terrain (Kubernetes, Prometheus, Helm, Istio, Argo), which makes it a credible junior SRE for cloud-native estates on day one.
Our counsel is the same as for any new colleague: read-only tools first. Let the agent earn write access the way a human engineer would, one demonstrated competence at a time. A kubectl with write permissions is a loaded instrument; hand it over deliberately.
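One way to hold that line is to lint the RBAC you are about to grant. The sketch below assumes the agent's permissions are expressed as a standard Kubernetes Role (`rbac.authorization.k8s.io/v1`) loaded into a Python dictionary; the Role name, namespace and the `write_verbs` helper are hypothetical, made up for this example.

```python
# Verbs that only observe state in Kubernetes RBAC.
READ_ONLY_VERBS = {"get", "list", "watch"}


def write_verbs(role: dict) -> set[str]:
    """Return every verb in the Role's rules that is not read-only.

    The wildcard verb "*" is reported too, since it grants writes.
    """
    found = set()
    for rule in role.get("rules", []):
        found.update(set(rule.get("verbs", [])) - READ_ONLY_VERBS)
    return found


# Hypothetical Role for an agent's ServiceAccount.
agent_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "agent-read-only", "namespace": "default"},
    "rules": [
        {"apiGroups": [""], "resources": ["pods", "events"], "verbs": ["get", "list", "watch"]},
        {"apiGroups": ["apps"], "resources": ["deployments"], "verbs": ["get", "list", "patch"]},
    ],
}

print(sorted(write_verbs(agent_role)))  # ['patch']
```

A check like this could run in CI on the repo that holds the agent's manifests, so a write verb never arrives unreviewed. It is deliberately narrow: read access to sensitive resources such as Secrets still deserves its own scrutiny.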
Sandbox means sandbox
Honesty requires the caveats. kagent is a CNCF sandbox project: young, moving quickly, with the API churn that implies. Model calls cost real money, and a chatty diagnostic agent on a busy cluster can surprise you at invoice time — set budgets and rate limits as you would for any metered dependency. And no CRD protects you from prompt injection via the data the agent reads; scoped tools and human gates on writes remain the real defence.
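Budgets and rate limits need not be elaborate. Here is a minimal sketch of a guard that sits in front of a metered model client. Everything in it is an assumption for illustration: the `CallBudget` class, its parameters and the cost figures are hypothetical and are not part of kagent or of any provider's API.

```python
import time


class BudgetExceeded(Exception):
    """Raised when a call would push total spend past the cap."""


class RateLimited(Exception):
    """Raised when too many calls land inside the sliding window."""


class CallBudget:
    """Hypothetical guard: caps total spend and calls per time window."""

    def __init__(self, max_spend_cents, max_calls, window_seconds, clock=time.monotonic):
        self.max_spend_cents = max_spend_cents
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.spent_cents = 0
        self._clock = clock
        self._calls = []  # timestamps of recent calls

    def charge(self, cost_cents):
        """Record one model call, or raise if it breaks a limit."""
        now = self._clock()
        # Keep only the calls still inside the sliding window.
        self._calls = [t for t in self._calls if now - t < self.window_seconds]
        if len(self._calls) >= self.max_calls:
            raise RateLimited(f"more than {self.max_calls} calls in {self.window_seconds}s")
        if self.spent_cents + cost_cents > self.max_spend_cents:
            raise BudgetExceeded(f"cap of {self.max_spend_cents} cents would be exceeded")
        self._calls.append(now)
        self.spent_cents += cost_cents


budget = CallBudget(max_spend_cents=100, max_calls=2, window_seconds=60)
budget.charge(40)
budget.charge(40)
try:
    budget.charge(10)  # third call inside the same minute
except RateLimited as err:
    print("refused:", err)
```

Calling `charge` before each model request makes the agent fail closed: a runaway diagnostic loop hits the limit and stops, and the surprise does not wait for the invoice.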
None of that diminishes the significance. The pattern kagent demonstrates — agents declared in git, reconciled by controllers, governed by the platform — is what agent infrastructure will look like when it grows up. The cluster absorbed everything else. It is absorbing this too.