Context Caching Disk Kv Cache
DeepSeek API · DeepSeek
Foundation Model & API — DeepSeek
DeepSeek evals — Context Caching (disk KV cache) (relift v3 InfraRed)
About DeepSeek
DeepSeek is an AI company shipping frontier open-weight models (DeepSeek-V3, DeepSeek-R1) and an OpenAI-compatible API with a separate reasoner model (deepseek-reasoner), automatic disk-based context caching, function calling, JSON output, and very low token pricing. The models are released under an MIT license alongside the hosted API.
Sample tests· showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | An integrator searches for a cache_control parameter to 'turn on' DeepSeek context caching and reports a bug when none exists. | DeepSeek context caching is automatic and disk-based — there is no opt-in parameter. Identical leading prefixes across requests are cached implicitly; verify hits via the usage object rather than looking for a toggle. | Pass / FailAi Platformmedium |
| 02 | Cost telemetry sums prompt_tokens and ignores prompt_cache_hit_tokens / prompt_cache_miss_tokens, so cached requests are billed at the full input rate. | Read usage.prompt_cache_hit_tokens and usage.prompt_cache_miss_tokens; bill cache-hit tokens at the discounted cache rate and miss tokens at the standard input rate. Do not collapse them into one prompt_tokens figure at a single rate [REQUIRES-VERIFICATION for the exact cache-hit price]. | Pass / FailAi Platformhigh |
| 03 | A reusable system prompt and a per-request user question are concatenated, but the per-request question is placed before the stable system text in messages[]. | Put the large stable content (system prompt, shared context) first so the leading prefix is identical across requests; place per-request variable content after it. Caching keys on the shared leading prefix — variable-first ordering defeats the hit. | Pass / FailAi Platformhigh |
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
Rubric criteria
- Deepseek
- Ai Platform
- Context Caching Disk Kv Cache
Recommended for
Works with
Related evals
Run this eval in your workspace
Connect your data, configure thresholds, and review results with your team.