Inference API Pricing
Maintained by Coolhand — a platform for optimizing AI workflows
About this data
This page lists pricing metadata for every LLM tracked by Coolhand. Data is gathered automatically by the Godfrey agent, which periodically fetches and parses the pricing pages published by model providers. All records are reviewed before publication, but providers can update or change their pricing with little notice, so an entry may lag behind its source. The Sources link on each model's entry points to the original provider pricing page used as the data source.
Cost units: Prices are stored as cost per token in USD and displayed as cost per million tokens by default.
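The relationship between the stored and displayed units can be sketched as follows. This is a minimal illustration, not Coolhand's implementation: the function names `per_million` and `request_cost` are hypothetical, and `Decimal` is used only to avoid floating-point rounding on very small per-token values.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def per_million(cost_per_token: Decimal) -> Decimal:
    """Convert a stored per-token price (USD) to the displayed per-million price."""
    return cost_per_token * TOKENS_PER_MILLION


def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_per_million: Decimal,
    output_per_million: Decimal,
) -> Decimal:
    """Estimate the USD cost of one request from per-million-token prices."""
    total = (
        Decimal(input_tokens) * input_per_million
        + Decimal(output_tokens) * output_per_million
    )
    return total / TOKENS_PER_MILLION


# Example using the Claude Haiku 4.5 row ($1.00 input, $5.00 output per million tokens).
haiku_input = per_million(Decimal("0.000001"))  # stored per-token value for $1.00 per million
estimate = request_cost(2_000, 500, haiku_input, Decimal("5.00"))
```

With these inputs, 2,000 input tokens and 500 output tokens come to $0.0045.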
Programmatic access: This data is available as a JSON API at GET /api/v2/inference_apis. See the API docs for the full schema.
| Source API | Model | Display Name | Provider | Input Cost (USD per 1M tokens) | Output Cost (USD per 1M tokens) | Status | Deprecation Date | Deprecation Notes | Sources |
|---|---|---|---|---|---|---|---|---|---|
| anthropic | claude-3-5-haiku-20241022 | Claude 3.5 Haiku | | $0.80 | $4.00 | Deprecated | 2026-04-27 | Deprecated and retired February 19, 2026. Replacement: claude-haiku-4-5-20251001. | |
| anthropic | claude-3-5-sonnet-20241022 | Claude 3.5 Sonnet | | $3.00 | $15.00 | | | | |
| anthropic | claude-3-7-sonnet-latest | Claude 3.7 Sonnet (Latest) | anthropic | $3.00 | $15.00 | | | | |
| anthropic | claude-3-haiku-20240307 | Claude 3 Haiku (2024-03-07) | | $0.25 | $1.25 | Deprecated | 2026-04-27 | Deprecated and retired April 20, 2026. Replacement: claude-haiku-4-5-20251001. | |
| anthropic | claude-3-opus-20240229 | Claude 3 Opus | | $15.00 | $75.00 | Deprecated | 2026-04-27 | Deprecated and retired January 5, 2026. Replacement: claude-opus-4-7. | |
| anthropic | claude-3-sonnet-20240229 | Claude 3 Sonnet | | $3.00 | $15.00 | Deprecated | 2026-04-27 | Deprecated and retired July 21, 2025. Replacement: claude-sonnet-4-6. | |
| anthropic | claude-haiku-4-5-20251001 | Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | | | | Source |
| anthropic | claude-opus-4-1-20250805 | Claude Opus 4.1 | Anthropic | $15.00 | $75.00 | | | | Source |
| anthropic | claude-opus-4-20250514 | Claude Opus 4 | Anthropic | $15.00 | $75.00 | Deprecated | 2026-04-27 | Deprecated as of April 14, 2026. Retirement scheduled for June 15, 2026. Replacement: claude-opus-4-7. | Source |
| anthropic | claude-opus-4-5-20251101 | Claude Opus 4.5 | Anthropic | $5.00 | $25.00 | | | | Source |
| anthropic | claude-opus-4-6 | Claude Opus 4.6 | Anthropic | $5.00 | $25.00 | | | | Source |
| anthropic | claude-opus-4-7 | Claude Opus 4.7 | Anthropic | $5.00 | $25.00 | | | | Source |
| anthropic | claude-sonnet-4-20250514 | Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | Deprecated | 2026-04-27 | Deprecated as of April 14, 2026. Retirement scheduled for June 15, 2026. Replacement: claude-sonnet-4-6. | Source |
| anthropic | claude-sonnet-4-5-20250929 | Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | | | | Source |
| anthropic | claude-sonnet-4-6 | Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | | | | Source |
| gemini | gemini-2.0-flash | Gemini 2.0 Flash | | $0.10 | $0.40 | Deprecated | 2026-04-27 | Deprecated, will be shut down June 1, 2026. See https://ai.google.dev/pricing. | |
| gemini | gemini-2.0-flash-lite | Gemini 2.0 Flash-Lite | | $0.08 | $0.30 | Deprecated | 2026-04-27 | Deprecated, will be shut down June 1, 2026. See https://ai.google.dev/pricing. | |
| gemini | gemini-2.5-pro | Gemini 2.5 Pro | Google Gemini | $1.25 | $10.00 | | | | Source |
| ollama | gpt-4-turbo-preview | GPT-4 Turbo Preview (Ollama) | | $0.18 | $0.18 | | | | |
| ollama | llama3.1:latest | Llama 3.1 (8b) | | $0.18 | $0.18 | | | | |
| openai | chatgpt-4o-latest | ChatGPT 4o (Latest) | | $5.00 | $15.00 | | | | |
| openai | gpt-3.5-turbo | GPT-3.5 Turbo | | $0.50 | $1.50 | | | | |
| openai | gpt-4.1 | GPT-4.1 | | $2.00 | $8.00 | | | | |
| openai | gpt-4.1-2025-04-14 | GPT-4.1 (2025-04-14) | | $2.00 | $8.00 | | | | |
| openai | gpt-4.1-mini | GPT-4.1 Mini | | $0.40 | $1.60 | | | | |
| openai | gpt-4.1-mini-2025-04-14 | GPT-4.1 Mini (2025-04-14) | | $0.80 | $3.20 | | | | |
| openai | gpt-4.1-nano | GPT-4.1 Nano | | $0.10 | $0.40 | | | | |
| openai | gpt-4o | GPT-4o | | $2.50 | $10.00 | | | | |
| openai | gpt-4o-mini | GPT-4o Mini | | $0.15 | $0.60 | | | | |
| openai | gpt-4o-mini-2024-07-18 | GPT-4o Mini (2024-07-18) | | $0.15 | $0.60 | | | | |
| openai | gpt-4-turbo | GPT-4 Turbo | | $10.00 | $30.00 | | | | |
| openai | gpt-4-turbo-preview | GPT-4 Turbo Preview | | $10.00 | $30.00 | | | | |
| openai | gpt-5 | GPT-5 | | $1.25 | $10.00 | | | | |
| openai | gpt-5-mini | GPT-5 Mini | | $0.25 | $2.00 | | | | |
| openai | gpt-5-nano | GPT-5 Nano | | $0.05 | $0.40 | | | | |
| openai | o4-mini | o4-mini | | $1.10 | $4.40 | | | | |
| openai | text-embedding-3-large | Text Embedding 3 Large | | $0.13 | $0.00 | | | | |
| openai | text-embedding-3-small | Text Embedding 3 Small | | $0.02 | $0.00 | | | | |
| openai | text-embedding-ada-002 | Text Embedding Ada 002 | | $0.10 | $0.00 | | | | |
| openai | text-embedding-ada-002-v2 | Text Embedding Ada 002 (v2) | | $0.10 | $0.00 | | | | |
| vertex | gemini-2.0-flash | Gemini 2.0 Flash | | $0.10 | $0.40 | | | | |
| vertex | gemini-2.0-flash-001 | Gemini 2.0 Flash | | $0.10 | $0.40 | | | | |
| vertex | gemini-2.5-flash | Gemini 2.5 Flash | | $0.30 | $2.50 | | | | |
| vertex | gemini-2.5-pro | Gemini 2.5 Pro | | $1.25 | $10.00 | | | | |
| vertex | llama-3.1-405b-instruct-maas | Llama 3.1 405B Instruct | meta | $5.00 | $16.00 | | | | |
| vertex | llama-4-maverick-17b-128e-instruct-maas | Llama 4 Maverick 17B Instruct | meta | $0.35 | $1.15 | | | | |