Complete pricing for Claude Haiku, Sonnet, and Opus โ with real cost scenarios for developers and teams building AI applications.
Anthropic charges per million tokens (MTok). Input and output tokens are priced separately โ output is typically 4-5x more expensive.
| Model | Input (per MTok) | Output (per MTok) | Context | Best For |
|---|---|---|---|---|
|
Claude 3.5 Haiku Fast
|
$0.80 | $4.00 | 200K | High-volume tasks, classification, quick responses |
|
Claude 3.5 Sonnet Popular
|
$3.00 | $15.00 | 200K | Production AI features, coding, analysis, complex Q&A |
|
Claude 3 Haiku
|
$0.25 | $1.25 | 200K | Ultra high-volume, cost-sensitive applications |
|
Claude 3 Opus Powerful
|
$15.00 | $75.00 | 200K | Complex reasoning, research, multi-step analysis |
Anthropic adjusts pricing when new models launch. Get notified immediately โ before it affects your bill.
Get Free Price Alerts Free Pricing APIPicking the wrong model is the most common reason teams overpay for Claude. Here's the decision framework:
| Use Case | Recommended Model | Why |
|---|---|---|
| Chatbots, customer support | Claude 3.5 Haiku | Fast, cheap, handles most queries well |
| Coding assistance, code review | Claude 3.5 Sonnet | Best code quality; speed is acceptable |
| Content generation (blog, marketing) | Claude 3.5 Sonnet | Output quality matters; cost per article is still low |
| High-volume classification / extraction | Claude 3 Haiku | Cheapest per token; quality sufficient for structured tasks |
| Complex research, legal analysis | Claude 3 Opus | Highest capability for nuanced reasoning tasks |
| RAG / document Q&A | Claude 3.5 Sonnet | Handles long context well; accurate retrieval synthesis |
| Streaming chat interfaces | Claude 3.5 Haiku | Fastest time-to-first-token for real-time UX |
Anthropic offers prompt caching for repeated context โ if you have a long system prompt sent with every request, cached input tokens cost 90% less. This is huge for apps with 1K+ token system prompts used at scale.
Most teams start with Sonnet by default. Run a quality eval on Haiku first โ for many tasks (summarization, classification, FAQ) the output is indistinguishable at 3-5x lower cost.
Every token in your prompt costs money. Audit your system prompts for filler text. Use structured formats (JSON, bullet points) instead of verbose prose. A 2K-token system prompt costs $6/month per 1M requests โ 1K tokens saves $3/month at that scale.
Claude sometimes generates more tokens than needed. Use max_tokens to cap responses. For a Q&A bot, 300-500 tokens is usually enough โ setting this prevents runaway costs from occasional verbose responses.
Anthropic's Batch API offers 50% discounts on asynchronous requests processed within 24 hours. Ideal for bulk document processing, overnight report generation, or any task that doesn't need real-time response.
Anthropic has generally reduced prices as newer, more efficient models are released:
| Date | Model | Change | Details |
|---|---|---|---|
| Nov 2024 | Claude 3.5 Haiku | Price cut vs Claude 3 Haiku | Input $0.80/MTok (Claude 3 Haiku was $0.25, but Claude 3.5 Haiku offers much higher capability) |
| Jun 2024 | Claude 3.5 Sonnet | Same price, 2x capability | Launched at same $3/$15 price as Claude 3 Sonnet with significantly better performance |
| Mar 2024 | Claude 3 Family | New pricing model | Haiku/Sonnet/Opus launched โ replaced Claude 2 with per-MTok pricing structure |
| Jan 2024 | Claude 2 | Deprecated | Claude 2 pricing retired as Claude 3 family launched at competitive per-token rates |
Both APIs use per-token pricing. Here's how they compare at similar capability levels:
| Model Tier | Anthropic Claude | OpenAI Equivalent | Input Cost |
|---|---|---|---|
| Fast / Cheap | Claude 3 Haiku ($0.25) | GPT-4o Mini ($0.15) | OpenAI cheaper for bulk tasks |
| Balanced | Claude 3.5 Sonnet ($3.00) | GPT-4o ($2.50) | Similar pricing, Claude often preferred for long docs |
| Most Capable | Claude 3 Opus ($15.00) | GPT-4 ($30.00) | Claude significantly cheaper at top tier |
Claude's main advantages: larger context window (200K vs 128K), stronger performance on long-document tasks, and better instruction-following for complex prompts. OpenAI has an edge on image input tasks and multimodal workflows.
No. You must add a payment method before making API calls. New accounts receive some free credits to test. Claude.ai (the consumer app) has a free tier but does not provide API access. For development and testing, Anthropic's free credits are usually sufficient to evaluate the API.
One token is approximately 4 characters or 0.75 words. A typical 750-word document is about 1,000 tokens. Code is often denser โ 1,000 tokens of Python might be only 300-400 lines. You can use Anthropic's tokenizer tool to count tokens before making requests.
Rate limits are per-workspace and depend on your usage tier. New accounts start with lower limits (e.g., 5 requests/minute for Opus). As you spend more, limits automatically increase. You can also request higher limits via the Anthropic developer console. Tier 4 (highest) allows millions of tokens per minute.
The Batch API processes requests asynchronously (results delivered within 24 hours) at 50% off standard pricing. Claude 3.5 Sonnet via Batch API costs $1.50 input / $7.50 output instead of $3/$15. Ideal for document processing pipelines, nightly report generation, and non-real-time workloads.
You are charged for tokens processed, including input tokens on failed requests. If Claude generates an error mid-response, you're charged for input tokens and any output tokens generated before the failure. This makes it important to handle retries carefully and set appropriate timeouts.
Anthropic adjusts pricing with new model releases. Set up a free alert and know instantly when pricing changes โ before it impacts your team's budget.
Set Up Free Alert Explore Free APIRelated Reading