Models & Pricing
The prices listed below are per 1M tokens. A token is the smallest unit of text the model recognizes; it can be a word, a number, or a punctuation mark. Billing is based on the total number of input and output tokens.
| Model | Context Length | Input Price | Output Price (Thinking) | Output Price (Non-Thinking) |
|---|---|---|---|---|
| deepseek-v4-flash | 128K | $0.10 / 1M | $0.40 / 1M | $0.30 / 1M |
| deepseek-v4-pro | 128K | $2.00 / 1M | $8.00 / 1M | $4.00 / 1M |
| deepseek-chat (deprecated) | 64K | $0.14 / 1M | — | $0.28 / 1M |
| deepseek-reasoner (deprecated) | 64K | $0.55 / 1M | $2.19 / 1M | — |
💡 For all models, the input cache hit price has been reduced to 1/10 of the launch price (effective 2026/04/26). deepseek-v4-pro is currently offered at a 75% discount until 2026/05/31.
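As a rough illustration of how the per-1M prices in the table translate into a charge, the sketch below estimates the cost of a single request. The `PRICES` dictionary and the `estimate_cost` function are hypothetical names introduced here, not part of any official SDK. The sketch also assumes the table prices apply as listed, and it does not model cache-hit pricing or the temporary discount.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the table above.
# The dictionary layout is an illustrative assumption, not an official schema.
PRICES = {
    "deepseek-v4-flash": {
        "input": Decimal("0.10"),
        "thinking": Decimal("0.40"),
        "non_thinking": Decimal("0.30"),
    },
    "deepseek-v4-pro": {
        "input": Decimal("2.00"),
        "thinking": Decimal("8.00"),
        "non_thinking": Decimal("4.00"),
    },
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(model, input_tokens, output_tokens, thinking=False):
    """Estimate the USD cost of one request.

    Ignores cache-hit pricing and promotional discounts.
    """
    price = PRICES[model]
    output_key = "thinking" if thinking else "non_thinking"
    total = (
        Decimal(input_tokens) * price["input"]
        + Decimal(output_tokens) * price[output_key]
    )
    return total / TOKENS_PER_PRICE_UNIT


# Example: 200K input tokens and 50K non-thinking output tokens.
print(estimate_cost("deepseek-v4-flash", 200_000, 50_000))
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding error when summed.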
Deduction Rules
Expense = number of tokens × price, with the price taken from the table above (quoted per 1M tokens). Fees are deducted from your topped-up balance or granted balance; when both are available, the granted balance is used first.
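The deduction order described above can be sketched as follows. This is a minimal illustration, not the billing system's actual implementation: the function name `deduct` is hypothetical, and raising an error when the combined balance is insufficient is an assumption, since that case is not specified here.

```python
from decimal import Decimal


def deduct(expense, granted, topped_up):
    """Deduct an expense, drawing on the granted balance first.

    Returns the new (granted, topped_up) balances.
    Raising on insufficient funds is an assumption made for this sketch.
    """
    from_granted = min(expense, granted)
    remaining = expense - from_granted
    if remaining > topped_up:
        raise ValueError("insufficient balance")
    return granted - from_granted, topped_up - remaining


# Expense for 120K input tokens at $0.10 per 1M tokens.
expense = Decimal(120_000) * Decimal("0.10") / Decimal(1_000_000)

# The granted balance is used up before the topped-up balance is touched.
granted, topped_up = deduct(expense, Decimal("0.01"), Decimal("5.00"))
print(granted, topped_up)
```

In this example the expense exceeds the granted balance, so the granted balance drops to zero and only the remainder is taken from the topped-up balance.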