AgentOps: Operating Claude Code at Scale
Running Claude Code on one developer's laptop is straightforward. Running it across a team of ten, with autonomous sessions in CI/CD, shared cost budgets, and audit requirements — that's AgentOps. This guide covers the operational patterns that make scaled Claude Code usage sustainable.
What AgentOps means
AgentOps is the practice of operating AI agents the way you'd operate any production system: with monitoring, cost controls, governance, and incident response. It's DevOps for AI coding agents.
Session monitoring
Every Claude Code session generates a transcript — a full record of what it read, what it changed, and what commands it ran. At scale, you need visibility into these sessions:
- Session logs — Collect transcripts from all team members and CI/CD runs into a central location. See Centralised Logging for implementation patterns.
- Activity dashboards — Track sessions per day, tasks completed, error rates, and cost per session.
- Anomaly detection — Flag sessions that run unusually long, consume excessive tokens, or repeatedly fail. These often indicate unclear task definitions.
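A minimal anomaly-flagging sketch. It assumes session records have already been collected into dicts; the field names (`id`, `duration_s`, `tokens`, `failed`) and the thresholds are illustrative assumptions, not Claude Code's actual transcript schema:

```python
# Flag sessions that run unusually long, burn excessive tokens, or fail.
# Thresholds are illustrative; tune them to your team's baseline.
MAX_DURATION_S = 30 * 60   # 30 minutes
MAX_TOKENS = 500_000       # per-session token ceiling

def flag_anomalies(sessions):
    """Return (session_id, reasons) pairs for sessions worth a human look."""
    flagged = []
    for s in sessions:
        reasons = []
        if s["duration_s"] > MAX_DURATION_S:
            reasons.append("long-running")
        if s["tokens"] > MAX_TOKENS:
            reasons.append("token-heavy")
        if s.get("failed"):
            reasons.append("failed")
        if reasons:
            flagged.append((s["id"], reasons))
    return flagged
```

Running this over the day's collected transcripts gives a short review queue instead of a wall of logs; repeated flags on the same kind of task usually point back to an unclear task definition.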
Cost management
Token costs scale with usage. Without controls, a busy team can run up significant bills:
- Per-session budgets — Cap each session's spend, for example with turn or token limits on headless runs, so a runaway session stops before it blows the budget.
- Team-level budgets — Use Anthropic's API dashboard to set monthly spending limits per team or project.
- Cost attribution — Tag sessions by project, team, or purpose so you can see where the money goes.
- Model selection — Not every task needs the most capable model. Use Sonnet for routine tasks and reserve Opus for complex architecture work.
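Cost attribution can be as simple as summing tagged session records. A sketch, assuming each session carries a hypothetical `cost_usd` field and a tag such as `project`:

```python
from collections import defaultdict

def cost_by_tag(sessions, tag="project"):
    """Sum session cost (USD) grouped by a tag such as project or team."""
    totals = defaultdict(float)
    for s in sessions:
        totals[s.get(tag, "untagged")] += s["cost_usd"]
    return dict(totals)
```

The "untagged" bucket matters: a large untagged total means your tagging discipline, not your spend, is the first thing to fix.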
Real cost numbers
From our experience running Claude Code across multiple projects:
- A typical developer uses $15–40/day in token costs on the API plan
- CI/CD automation (PR reviews, test generation) adds $5–15/day per active repo
- The Max plan at $200/month breaks even against API pricing once a developer uses it for 2+ hours daily
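The break-even claim is simple arithmetic. Assuming roughly 20 working days a month and an illustrative API burn rate of about $5/hour (an assumption for the worked example, not an official figure):

```python
MAX_PLAN_USD = 200           # monthly Max plan price (from the text)
WORKDAYS = 20                # assumed working days per month
API_RATE_USD_PER_HOUR = 5.0  # illustrative API burn rate, not an official figure

def break_even_hours_per_day(plan_usd=MAX_PLAN_USD,
                             rate=API_RATE_USD_PER_HOUR,
                             workdays=WORKDAYS):
    """Daily hours of API-priced usage at which the flat plan costs the same."""
    return plan_usd / (rate * workdays)
```

At these assumed numbers, 200 / (5 × 20) gives 2 hours/day; plug in your own observed burn rate to get your team's actual threshold.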
Governance
As Claude Code becomes part of your development process, governance questions arise:
- Who can run autonomous sessions? — Not every developer should have unsupervised agent access from day one. Start with interactive use and graduate to autonomous as trust is established.
- What can agents access? — Use permission modes to control whether Claude Code can run arbitrary commands, access the network, or modify certain files.
- Audit trail — Every session should be traceable to a person, a project, and a purpose. This is essential for regulated industries.
- Change approval — Autonomous agents should create PRs, not push directly. Human review remains the gate.
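An audit trail only needs a consistent record per session. A sketch of such a record; the schema is hypothetical, not a Claude Code log format:

```python
import time

def audit_record(session_id, user, project, purpose, permission_mode="default"):
    """Tie a session to a person, a project, and a purpose.

    The field set is a sketch of what a regulated-industry audit entry
    might need; extend it with whatever your compliance process requires.
    """
    return {
        "session_id": session_id,
        "user": user,
        "project": project,
        "purpose": purpose,
        "permission_mode": permission_mode,
        "timestamp": int(time.time()),
    }
```

Writing one of these per session (interactive or autonomous) into your centralised log store is what makes the "traceable to a person, a project, and a purpose" requirement checkable after the fact.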
CI/CD integration
Claude Code excels in CI/CD pipelines for tasks like:
- Automated PR review — Run Claude Code on every PR to check for common issues, suggest improvements, and flag security concerns.
- Test generation — When new code is pushed, Claude Code can generate missing tests.
- Dependency updates — Schedule Claude Code to review and update dependencies, creating PRs with changelogs.
- Documentation sync — Keep docs up to date by running Claude Code after feature merges.
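A sketch of wiring the PR-review task into a pipeline by invoking Claude Code headlessly. `claude -p` runs a single non-interactive prompt; the prompt text, diff range, and the idea of posting stdout as a PR comment are illustrative:

```python
import subprocess

def build_review_command(diff_range="origin/main...HEAD"):
    """Assemble a headless Claude Code invocation for automated PR review.

    The prompt wording and diff range are illustrative assumptions.
    """
    prompt = (f"Review the changes in `git diff {diff_range}` for bugs, "
              "security issues, and missing tests. Reply with findings only.")
    return ["claude", "-p", prompt]

def run_review(diff_range="origin/main...HEAD"):
    # In CI: capture stdout, then post it as a PR comment (posting omitted here).
    result = subprocess.run(build_review_command(diff_range),
                            capture_output=True, text=True)
    return result.stdout
```

Keeping the command assembly separate from the subprocess call makes the invocation easy to unit-test and to reuse across the PR-review, test-generation, and dependency-update jobs.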
Team onboarding
Rolling out Claude Code to a team works best in phases:
- Pilot — One or two experienced developers use it for a month. They document what works and what doesn't.
- Expand — Roll out to the full team with the patterns the pilot established. Share skills, CLAUDE.md files, and cost expectations.
- Automate — Add CI/CD integrations and autonomous workflows once the team is comfortable with interactive use.
- Optimise — Review costs, quality metrics, and team feedback. Adjust skills, guardrails, and processes.
Next steps
AgentOps depends on good guardrails and centralised logging. For multi-agent coordination, see Hierarchies.