Local Models with Claude Code
Anthropic's API sends your code to their servers for processing. For most businesses this is acceptable: under Anthropic's commercial terms, API inputs are not used for model training by default. But some teams need their code to stay on-premises, whether for regulated industries, government contracts, or simply a preference for data sovereignty. Local models give you AI-assisted development without the data leaving your infrastructure.
When to consider local models
- Regulatory requirements — Your industry mandates that source code stays within your infrastructure or specific geographic regions.
- Classified or sensitive code — Government, defence, or financial code that cannot be transmitted to third-party APIs.
- Cost control at scale — Very high-volume usage (hundreds of sessions daily) where API costs exceed the cost of running your own infrastructure.
- Latency requirements — Models co-located with your infrastructure can cut round-trip time compared with calls to Anthropic's servers, which matters for rapid iteration.
Options for local/private deployment
GCP Vertex AI
Google Cloud hosts Claude models on Vertex AI. Your code is processed within GCP's infrastructure under your project's data residency settings. This is the easiest path to "private" Claude usage:
- Same model quality as the Anthropic API
- Data stays within your GCP project
- Governed by your existing GCP agreements and compliance certifications
- No infrastructure to manage — it's a managed API
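In practice, pointing Claude Code at Vertex AI is a handful of environment variables. The variable names below follow Claude Code's documented Vertex AI integration at the time of writing; the project ID and region are placeholders, so verify both against the current docs and your project's model availability.

```shell
# Assumes gcloud is installed and Claude models are enabled in your project.
gcloud auth application-default login

# Tell Claude Code to route requests through Vertex AI.
export CLAUDE_CODE_USE_VERTEX=1
export ANTHROPIC_VERTEX_PROJECT_ID=my-gcp-project   # placeholder project ID
export CLOUD_ML_REGION=europe-west1                 # pick a region where Claude is available
```

Because authentication rides on your existing gcloud credentials, no separate API key is involved.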
AWS Bedrock
Amazon Bedrock provides similar managed access to Claude models within AWS infrastructure. If your team is already on AWS, this keeps everything in one cloud:
- VPC integration for network isolation
- IAM-based access control
- CloudTrail logging for audit compliance
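The Bedrock path is similar, leaning on standard AWS credential resolution. These variable names match Claude Code's documented Bedrock integration at the time of writing; the region is a placeholder.

```shell
# Assumes AWS credentials are already configured (aws configure, SSO, or an
# instance role) and Claude model access is enabled in the Bedrock console.
export CLAUDE_CODE_USE_BEDROCK=1
export AWS_REGION=eu-west-1   # placeholder; use a region where Claude is available
```

IAM policies and CloudTrail then apply to Claude Code sessions the same way they apply to any other Bedrock caller.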
Self-hosted open models
For complete infrastructure control, run open-source models on your own hardware. Open-weight models such as Llama and Mistral can run locally and, fronted by a compatible gateway, serve Claude Code through its provider configuration. Trade-offs:
- Full data sovereignty — nothing leaves your network
- Lower capability than Claude for complex coding tasks
- Requires GPU infrastructure (significant upfront cost)
- You manage updates, scaling, and availability
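Claude Code speaks the Anthropic Messages API, so a self-hosted model is typically fronted by a translation gateway (LiteLLM is one option) that exposes an Anthropic-compatible endpoint. A minimal sketch, assuming such a gateway is already running locally; the URL, token, and model name are placeholders for whatever your gateway actually exposes.

```shell
# Point Claude Code at a local gateway that translates the Anthropic
# Messages API to your self-hosted model.
export ANTHROPIC_BASE_URL=http://localhost:4000     # placeholder gateway URL
export ANTHROPIC_AUTH_TOKEN=local-dev-key           # placeholder gateway key
export ANTHROPIC_MODEL=llama-3.3-70b                # the model name your gateway serves
```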
Configuring Claude Code for alternative providers
Claude Code supports configuring alternative API endpoints. You can point it at Vertex AI, Bedrock, or any endpoint that speaks the Anthropic Messages API; a translation gateway can bridge OpenAI-compatible backends:
- Set the API endpoint to your Vertex AI or Bedrock URL
- Configure authentication using your cloud provider's credentials
- Select the model — the same Claude models are available through these providers
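Once the provider environment is configured, model selection and launch look the same regardless of backend. `ANTHROPIC_MODEL` is Claude Code's documented override; the model ID below is illustrative, since exact IDs vary by provider.

```shell
# Model ID is a placeholder; check your provider's catalog for current IDs.
export ANTHROPIC_MODEL='claude-sonnet-4-5'
claude   # launches Claude Code against whichever provider the environment selects
```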
Hybrid approaches
Most teams don't go fully local. A hybrid approach balances privacy with capability:
- Sensitive repos use Vertex AI/Bedrock — Code stays within your cloud infrastructure.
- Non-sensitive repos use the Anthropic API — Simpler setup, potentially lower cost.
- Local models for routine tasks — Use smaller, local models for code formatting, simple refactoring, and boilerplate generation. Save the full Claude models for complex work.
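One way to make a hybrid split stick is per-repository configuration, so nobody has to remember which backend a given project uses. The sketch below uses the `env` key in a project-level `.claude/settings.json`, which Claude Code reads on startup; the path and key follow the documented settings convention, but verify against the current docs, and treat the region as a placeholder.

```shell
# Hypothetical setup: force Bedrock for one sensitive repository.
cd sensitive-repo
mkdir -p .claude
cat > .claude/settings.json <<'EOF'
{
  "env": {
    "CLAUDE_CODE_USE_BEDROCK": "1",
    "AWS_REGION": "eu-west-1"
  }
}
EOF
```

Repositories without such a file fall back to whatever the user's global configuration specifies.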
Cost comparison
Cost varies significantly by approach:
- Anthropic API — Pay per token. Most cost-effective for low to medium usage. The Max plan offers predictable pricing.
- Vertex AI / Bedrock — Per-token pricing broadly comparable to the Anthropic API; check current list prices. Costs may be offset by existing cloud spend commitments.
- Self-hosted — High fixed cost (GPU hardware or cloud GPU instances), low marginal cost per token. Only economical at very high volumes.
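The self-hosting decision reduces to a break-even calculation: fixed monthly infrastructure cost divided by the per-token API price you would otherwise pay. A quick sketch, with both figures as illustrative assumptions rather than real quotes:

```shell
# Break-even sketch: all prices are illustrative assumptions, not quotes.
api_cost_per_mtok=10   # blended $/million tokens via a managed API
gpu_monthly=3000       # fixed monthly cost of a self-hosted GPU node

# Monthly token volume at which self-hosting starts to pay off.
breakeven_mtok=$(awk -v a="$api_cost_per_mtok" -v g="$gpu_monthly" \
  'BEGIN { printf "%.0f", g / a }')
echo "Break-even: ${breakeven_mtok}M tokens/month"
```

Under these assumed numbers the break-even is 300 million tokens per month; below that volume, the managed APIs are cheaper before you even account for operations staff time.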
Quality considerations
Not all models are equal for coding tasks. Claude Opus and Sonnet are specifically tuned for code understanding, generation, and review. Open-source alternatives are improving rapidly but still lag behind for:
- Large codebase comprehension
- Multi-file refactoring
- Complex debugging and root cause analysis
- Following nuanced instructions in CLAUDE.md and skills
Test any alternative model thoroughly on your actual codebase before committing to it for production use.
Next steps
Local models fit into a broader AgentOps strategy. Combine with guardrails for access control and centralised logging for visibility across both local and cloud-hosted sessions.