Per-request AI API cost
Estimate how much one model call costs by entering input tokens and output tokens. This is the fastest way to price a chat, agent step, RAG answer, or automation task.
AI API pricing intelligence
Estimate API spend from token counts, compare GPT and Claude pricing, and understand how input tokens, output tokens, cached context, and request volume affect production AI costs.
Compare base input, cached input, and output token rates. Output-heavy workflows can become expensive even when the prompt is small.
Use the comparison page to see when a cheaper model tier is enough and when a premium model may still be worth the higher per-token price.
Most AI APIs charge per million tokens. Input tokens are the instructions, retrieved context, tool results, and conversation history you send to the model. Output tokens are the text or structured response the model generates.
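The per-million-token arithmetic above can be sketched in a few lines. This is a minimal illustration, not any provider's billing logic: the function name `request_cost` and the rates ($3.00 per million input tokens, $15.00 per million output tokens) are hypothetical placeholders, so substitute the figures from your provider's current price list.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Return the cost in dollars of one model call.

    Rates are expressed in dollars per one million tokens.
    """
    total = input_tokens * input_rate_per_m + output_tokens * output_rate_per_m
    return total / 1_000_000


# Hypothetical rates, chosen only to make the arithmetic easy to follow.
INPUT_RATE = 3.00    # dollars per 1M input tokens (assumed)
OUTPUT_RATE = 15.00  # dollars per 1M output tokens (assumed)

cost = request_cost(2_000, 500, INPUT_RATE, OUTPUT_RATE)
print(f"One request: ${cost:.4f}")
```

With these assumed rates, 2,000 input tokens contribute $0.0060 and 500 output tokens contribute $0.0075, so the smaller output side already accounts for more than half of the request cost.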
For RAG systems and agents, cost depends on more than one prompt. You need to estimate the average context size, the number of model calls per user task, and the percentage of repeated context that can use cached-input pricing.
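As a rough sketch of that estimate, the function below multiplies a per-call cost by the number of calls per user task and bills the cached share of the context at a lower rate. Everything here is an assumption for illustration: the name `task_cost`, the idea that one cached fraction applies uniformly to every call, and the rates themselves. Real providers apply their own caching rules, such as minimum cacheable sizes or separate cache-write charges, which this sketch ignores.

```python
def task_cost(avg_context_tokens: float, avg_output_tokens: float,
              calls_per_task: int, cached_fraction: float,
              input_rate: float, cached_rate: float, output_rate: float) -> float:
    """Estimate the dollar cost of one user task made of several model calls.

    cached_fraction is the share of input tokens billed at the cached rate
    (0.0 to 1.0). All rates are dollars per one million tokens.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached_tokens = avg_context_tokens * cached_fraction
    fresh_tokens = avg_context_tokens - cached_tokens
    per_call = (fresh_tokens * input_rate
                + cached_tokens * cached_rate
                + avg_output_tokens * output_rate) / 1_000_000
    return per_call * calls_per_task


# Hypothetical agent task: 5 calls, 8,000 context tokens each, half of them cached.
cost = task_cost(8_000, 400, calls_per_task=5, cached_fraction=0.5,
                 input_rate=3.00, cached_rate=0.30, output_rate=15.00)
print(f"One task: ${cost:.4f}")
```

Setting `cached_fraction` to 0 gives the uncached baseline, which makes it easy to see how much of the task cost caching could plausibly remove.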
AI API pricing is what model providers charge for processing input tokens, generating output tokens, and sometimes reading cached context or using special tools.
Output tokens usually require more compute because the model generates them one at a time, with each new token depending on the tokens before it, whereas input tokens can be processed in parallel. That is why output token pricing is often higher than input token pricing.
Start with average input tokens, average output tokens, expected request volume, and cache hit rate. Then compare models by per-request and per-1,000-request cost.
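The comparison described above might look like the following sketch. The model names (`premium-model`, `budget-model`), their rates, and the workload numbers are all invented for illustration, and the cache hit rate is treated here as the fraction of input tokens billed at the cached rate, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class ModelRates:
    name: str
    input_rate: float   # dollars per 1M fresh input tokens
    cached_rate: float  # dollars per 1M cached input tokens
    output_rate: float  # dollars per 1M output tokens


def per_request_cost(m: ModelRates, avg_input: float, avg_output: float,
                     cache_hit_rate: float) -> float:
    """Dollar cost of one average request for the given model."""
    cached = avg_input * cache_hit_rate
    fresh = avg_input - cached
    return (fresh * m.input_rate + cached * m.cached_rate
            + avg_output * m.output_rate) / 1_000_000


def compare(models, avg_input, avg_output, cache_hit_rate, monthly_requests):
    """Return one row per model, cheapest per-request cost first."""
    rows = []
    for m in models:
        cost = per_request_cost(m, avg_input, avg_output, cache_hit_rate)
        rows.append({
            "model": m.name,
            "per_request": cost,
            "per_1000": cost * 1_000,
            "monthly": cost * monthly_requests,
        })
    return sorted(rows, key=lambda r: r["per_request"])


# Hypothetical models and rates; replace with real price-list values.
models = [
    ModelRates("premium-model", input_rate=3.00, cached_rate=0.30, output_rate=15.00),
    ModelRates("budget-model", input_rate=0.50, cached_rate=0.05, output_rate=2.00),
]

rows = compare(models, avg_input=3_000, avg_output=600,
               cache_hit_rate=0.4, monthly_requests=100_000)
for r in rows:
    print(f"{r['model']:<14} ${r['per_request']:.5f}/request  "
          f"${r['per_1000']:.2f}/1,000 requests  ${r['monthly']:.2f}/month")
```

Reporting the per-1,000-request figure alongside the per-request figure keeps small per-call differences visible once they are multiplied by production volume.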