Models & Pricing

The prices below are quoted per 1M tokens. A token is the smallest unit of text the model recognizes, such as a word, a number, or a punctuation mark. Billing is based on the total number of input and output tokens.

Model	Context Length	Input Price	Output (Thinking)	Output (Non-Thinking)
deepseek-v4-flash	128K	$0.10 / 1M	$0.40 / 1M	$0.30 / 1M
deepseek-v4-pro	128K	$2.00 / 1M	$8.00 / 1M	$4.00 / 1M
deepseek-chat (deprecated)	64K	$0.14 / 1M	—	$0.28 / 1M
deepseek-reasoner (deprecated)	64K	$0.55 / 1M	$2.19 / 1M	—

💡 For all models, the price for input tokens that hit the cache has been reduced to 1/10 of the launch price (effective 2026/04/26). deepseek-v4-pro is currently offered at a 75% discount until 2026/05/31.

Deduction Rules

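Expense = number of tokens × price, where the price is the listed per-1M-token rate divided by 1,000,000. Input and output tokens are charged at their own rates and the two amounts are added together. Fees are deducted from your topped-up balance or granted balance; when both are available, the granted balance is used first.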
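
As a rough illustration of the rules above, the following Python sketch estimates the expense of one request and applies the granted-balance-first deduction order. The function names (estimate_cost, deduct) and the dictionary layout are hypothetical and not part of any official SDK. The rates are copied from the table as listed; cache-hit pricing and the temporary deepseek-v4-pro discount are not modeled, and raising an error on insufficient balance is an assumption, since this page does not describe that case.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table (non-deprecated models only).
# Cache-hit pricing and promotional discounts are intentionally left out.
PRICES = {
    "deepseek-v4-flash": {
        "input": Decimal("0.10"),
        "output_thinking": Decimal("0.40"),
        "output_non_thinking": Decimal("0.30"),
    },
    "deepseek-v4-pro": {
        "input": Decimal("2.00"),
        "output_thinking": Decimal("8.00"),
        "output_non_thinking": Decimal("4.00"),
    },
}
TOKENS_PER_PRICE_UNIT = 1_000_000


def estimate_cost(model, input_tokens, output_tokens, thinking=False):
    """Expense = number of tokens x price, summed over input and output."""
    rates = PRICES[model]
    output_key = "output_thinking" if thinking else "output_non_thinking"
    total = input_tokens * rates["input"] + output_tokens * rates[output_key]
    return total / TOKENS_PER_PRICE_UNIT


def deduct(cost, granted, topped_up):
    """Charge the granted balance first, then the topped-up balance.

    Returns the remaining (granted, topped_up) balances.
    Assumption: an insufficient total balance is reported as an error.
    """
    if cost > granted + topped_up:
        raise ValueError("insufficient balance")
    from_granted = min(cost, granted)
    from_topped_up = cost - from_granted
    return granted - from_granted, topped_up - from_topped_up


if __name__ == "__main__":
    cost = estimate_cost("deepseek-v4-flash", 200_000, 50_000, thinking=True)
    print(cost)  # 200K input at $0.10/1M plus 50K output at $0.40/1M
    print(deduct(cost, granted=Decimal("0.03"), topped_up=Decimal("1.00")))
```

Decimal is used instead of float so that small per-token amounts accumulate without binary rounding error; an actual invoice may round differently.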