How much does the Claude API cost?

Claude API pricing is per million tokens (MTok). Claude 3.5 Haiku: $0.80 input / $4 output per MTok. Claude 3.5 Sonnet: $3 input / $15 output per MTok. Claude 3 Opus: $15 input / $75 output per MTok. Most applications use Haiku or Sonnet; Opus is for complex reasoning tasks. A 1,000-token request (typical chat message) costs roughly $0.003 with Sonnet.

What is a token in Anthropic's pricing?

Anthropic charges per token where 1 token ≈ 4 characters or 0.75 words. A typical 750-word document is roughly 1,000 tokens. Pricing is per million tokens (MTok). Input tokens (your prompt) and output tokens (Claude's response) are priced separately, with output tokens typically 4-5x more expensive.

Which Claude model should I use for my app?

Claude 3.5 Haiku is best for high-volume tasks needing fast responses (classification, extraction, simple Q&A) at lowest cost. Claude 3.5 Sonnet balances capability and cost — best for most production AI features. Claude Opus is for complex reasoning, analysis, and coding tasks where quality matters more than cost.

Does Anthropic offer free tier for the Claude API?

Anthropic does not offer a free tier for the Claude API. You must add a payment method before making API calls. However, new accounts receive free credits to test the API. Claude.ai (the consumer product) has a free tier but it does not include API access.

Has Anthropic changed Claude API pricing recently?

Anthropic has generally reduced API prices as new models launch. Claude 3.5 Haiku launched at lower prices than Claude 3 Haiku. Claude 3.5 Sonnet offers significantly better performance at the same price as the original Claude 3 Sonnet. Anthropic provides price change notices via email and their developer console.

Anthropic Claude API Pricing 2026

Complete pricing for Claude Haiku, Sonnet, and Opus — with real cost scenarios for developers and teams building AI applications.

Claude API Pricing by Model

Anthropic charges per million tokens (MTok). Input and output tokens are priced separately — output is typically 4-5x more expensive.

Model	Input (per MTok)	Output (per MTok)	Context	Best For
Claude 3.5 Haiku Fast	$0.80	$4.00	200K	High-volume tasks, classification, quick responses
Claude 3.5 Sonnet Popular	$3.00	$15.00	200K	Production AI features, coding, analysis, complex Q&A
Claude 3 Haiku	$0.25	$1.25	200K	Ultra high-volume, cost-sensitive applications
Claude 3 Opus Powerful	$15.00	$75.00	200K	Complex reasoning, research, multi-step analysis

Quick math: A typical 750-word message (1K tokens prompt + 500 token response) costs approximately $0.004 with Sonnet or $0.001 with Haiku. Most apps do well on Sonnet; switch to Haiku for high-frequency classification tasks.

Real Cost Scenarios

Scenario 1: SaaS Customer Support Bot (Claude 3.5 Haiku)

Monthly tickets handled 10,000

Avg input tokens per ticket (context + question) 2,000 tokens

Avg output tokens per ticket (response) 300 tokens

Input cost: 20M tokens × $0.80/MTok $16.00

Output cost: 3M tokens × $4.00/MTok $12.00

Total monthly cost $28/month

Scenario 2: AI Coding Assistant (Claude 3.5 Sonnet)

Developer sessions per month 500

Avg input tokens per session (code context + question) 8,000 tokens

Avg output tokens per session (code + explanation) 2,000 tokens

Input cost: 4M tokens × $3/MTok $12.00

Output cost: 1M tokens × $15/MTok $15.00

Total monthly cost $27/month

Scenario 3: Content Generation Platform (Claude 3.5 Sonnet)

Articles generated per month 1,000

Avg input tokens (brief + instructions) 500 tokens

Avg output tokens (1,500-word article) 2,000 tokens

Input cost: 0.5M tokens × $3/MTok $1.50

Output cost: 2M tokens × $15/MTok $30.00

Total monthly cost $31.50/month

Scenario 4: High-Volume Document Analysis (Claude 3 Haiku)

Documents processed per month 100,000

Avg input tokens per doc (document + prompt) 3,000 tokens

Avg output tokens per doc (extraction result) 200 tokens

Input cost: 300M tokens × $0.25/MTok $75.00

Output cost: 20M tokens × $1.25/MTok $25.00

Total monthly cost $100/month

Monitor Claude API Pricing Changes

Anthropic adjusts pricing when new models launch. Get notified immediately — before it affects your bill.

Get Free Price Alerts Free Pricing API

Claude Model Comparison: Which One Should You Use?

Picking the wrong model is the most common reason teams overpay for Claude. Here's the decision framework:

Use Case	Recommended Model	Why
Chatbots, customer support	Claude 3.5 Haiku	Fast, cheap, handles most queries well
Coding assistance, code review	Claude 3.5 Sonnet	Best code quality; speed is acceptable
Content generation (blog, marketing)	Claude 3.5 Sonnet	Output quality matters; cost per article is still low
High-volume classification / extraction	Claude 3 Haiku	Cheapest per token; quality sufficient for structured tasks
Complex research, legal analysis	Claude 3 Opus	Highest capability for nuanced reasoning tasks
RAG / document Q&A	Claude 3.5 Sonnet	Handles long context well; accurate retrieval synthesis
Streaming chat interfaces	Claude 3.5 Haiku	Fastest time-to-first-token for real-time UX

Claude API Cost Optimization Tips

1. Cache Your System Prompts (Prompt Caching)

Anthropic offers prompt caching for repeated context — if you have a long system prompt sent with every request, cached input tokens cost 90% less. This is huge for apps with 1K+ token system prompts used at scale.

2. Start with Haiku, Upgrade Only If Needed

Most teams start with Sonnet by default. Run a quality eval on Haiku first — for many tasks (summarization, classification, FAQ) the output is indistinguishable at 3-5x lower cost.

3. Trim Your Context

Every token in your prompt costs money. Audit your system prompts for filler text. Use structured formats (JSON, bullet points) instead of verbose prose. A 2K-token system prompt costs $6/month per 1M requests — 1K tokens saves $3/month at that scale.

4. Set Token Limits on Output

Claude sometimes generates more tokens than needed. Use max_tokens to cap responses. For a Q&A bot, 300-500 tokens is usually enough — setting this prevents runaway costs from occasional verbose responses.

5. Batch Non-Urgent Requests

Anthropic's Batch API offers 50% discounts on asynchronous requests processed within 24 hours. Ideal for bulk document processing, overnight report generation, or any task that doesn't need real-time response.

Rule of thumb: If you're spending more than $200/month on the Claude API, invest 2 hours in prompt optimization and model selection. Most teams cut costs 40-60% without sacrificing quality.

Claude API Price History

Anthropic has generally reduced prices as newer, more efficient models are released:

Date	Model	Change	Details
Nov 2024	Claude 3.5 Haiku	Price cut vs Claude 3 Haiku	Input $0.80/MTok (Claude 3 Haiku was $0.25, but Claude 3.5 Haiku offers much higher capability)
Jun 2024	Claude 3.5 Sonnet	Same price, 2x capability	Launched at same $3/$15 price as Claude 3 Sonnet with significantly better performance
Mar 2024	Claude 3 Family	New pricing model	Haiku/Sonnet/Opus launched — replaced Claude 2 with per-MTok pricing structure
Jan 2024	Claude 2	Deprecated	Claude 2 pricing retired as Claude 3 family launched at competitive per-token rates

Claude API vs OpenAI API: Cost Comparison

Both APIs use per-token pricing. Here's how they compare at similar capability levels:

Model Tier	Anthropic Claude	OpenAI Equivalent	Input Cost
Fast / Cheap	Claude 3 Haiku ($0.25)	GPT-4o Mini ($0.15)	OpenAI cheaper for bulk tasks
Balanced	Claude 3.5 Sonnet ($3.00)	GPT-4o ($2.50)	Similar pricing, Claude often preferred for long docs
Most Capable	Claude 3 Opus ($15.00)	GPT-4 ($30.00)	Claude significantly cheaper at top tier

Claude's main advantages: larger context window (200K vs 128K), stronger performance on long-document tasks, and better instruction-following for complex prompts. OpenAI has an edge on image input tasks and multimodal workflows.

See full OpenAI API pricing breakdown →

Frequently Asked Questions

Does Anthropic offer a free tier for the Claude API?

No. You must add a payment method before making API calls. New accounts receive some free credits to test. Claude.ai (the consumer app) has a free tier but does not provide API access. For development and testing, Anthropic's free credits are usually sufficient to evaluate the API.

What counts as a token?

One token is approximately 4 characters or 0.75 words. A typical 750-word document is about 1,000 tokens. Code is often denser — 1,000 tokens of Python might be only 300-400 lines. You can use Anthropic's tokenizer tool to count tokens before making requests.

How do Anthropic's rate limits work?

Rate limits are per-workspace and depend on your usage tier. New accounts start with lower limits (e.g., 5 requests/minute for Opus). As you spend more, limits automatically increase. You can also request higher limits via the Anthropic developer console. Tier 4 (highest) allows millions of tokens per minute.

What is the Batch API and how much does it save?

The Batch API processes requests asynchronously (results delivered within 24 hours) at 50% off standard pricing. Claude 3.5 Sonnet via Batch API costs $1.50 input / $7.50 output instead of $3/$15. Ideal for document processing pipelines, nightly report generation, and non-real-time workloads.

Does Anthropic charge for failed requests?

You are charged for tokens processed, including input tokens on failed requests. If Claude generates an error mid-response, you're charged for input tokens and any output tokens generated before the failure. This makes it important to handle retries carefully and set appropriate timeouts.

Never Miss a Claude API Price Change

Anthropic adjusts pricing with new model releases. Set up a free alert and know instantly when pricing changes — before it impacts your team's budget.

Set Up Free Alert Explore Free API