FIM Completions Beta

DeepSeek API · DeepSeek

Foundation Model & API — DeepSeek

Evaluates DeepSeek's FIM / Completions (beta) across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

About DeepSeek

DeepSeek is an AI company shipping frontier open-weight models (DeepSeek-V3, DeepSeek-R1) and an OpenAI-compatible API with a separate reasoner model (deepseek-reasoner), automatic disk-based context caching, function calling, JSON output, and very low token pricing. The models are released under an MIT license alongside the hosted API.

Employees

~200

Industry

Foundation Model

Headquarters

Hangzhou, China

Website

www.deepseek.com

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check	Category	Severity
01	An integrator calls the FIM/completions endpoint against the standard base_url https://api.deepseek.com and gets a 404.	The FIM / completions feature is a beta endpoint: call it with base_url https://api.deepseek.com/beta (the documented beta host). Do not route beta completions through the standard chat host.	Pass / Fail	AI Platform	high
02	A FIM completion for a single function keeps generating into the next function because no stop sequence is set.	Set stop tokens appropriate to the language/context (e.g. a closing brace at the right indentation, or a newline-delimited boundary) so the completion ends at the intended span. Cap max_tokens as a backstop.	Pass / Fail	AI Platform	medium
03	A code-completion feature wants the model to fill a gap between existing before-text and after-text, but sends only the prompt (before) with no suffix.	Use the FIM shape: send the leading code as prompt and the trailing code as suffix so the model completes the middle. Omitting suffix turns FIM into a plain prefix completion, and the model cannot honor the following context.	Pass / Fail	AI Platform	high
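
The three sample cases above can be combined into one request shape. The following is a minimal sketch, not DeepSeek's reference client: it only builds the HTTP request (nothing is sent), and it assumes an OpenAI-style `/completions` path under the beta host, the model name `deepseek-chat`, and a Bearer-token header, none of which are stated on this page. The helper name `build_fim_request` is hypothetical.

```python
import json
import urllib.request

# Beta host named in sample case 01; the standard host is reported to return 404.
BETA_BASE_URL = "https://api.deepseek.com/beta"


def build_fim_request(prefix, suffix, stop, max_tokens=128,
                      model="deepseek-chat", api_key="YOUR_API_KEY"):
    """Build (but do not send) a FIM completion request.

    Assumptions: the "/completions" path, the model name, and the
    Authorization header format are illustrative, not taken from this page.
    """
    if not suffix:
        # Case 03: without a suffix the call is a plain prefix completion.
        raise ValueError("suffix is required for fill-in-the-middle")
    payload = {
        "model": model,
        "prompt": prefix,          # code before the gap
        "suffix": suffix,          # code after the gap
        "stop": stop,              # case 02: end at the intended span
        "max_tokens": max_tokens,  # case 02: backstop if no stop sequence hits
    }
    return urllib.request.Request(
        url=f"{BETA_BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_fim_request(
    prefix="def add(a, b):\n    ",
    suffix="\n\ndef sub(a, b):\n    return a - b\n",
    stop=["\ndef "],
)
```

To actually send it, the request could be passed to `urllib.request.urlopen(request)` with a real API key. The stop sequence here (`"\ndef "`) is one possible boundary for Python; other languages would need a different one.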

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
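
One way to read the pass rule is sketched below. It assumes the judge returns one numeric score per rubric criterion and that the mean is taken across those criteria for a single test case; the page does not state the scale or the aggregation, so treat both as assumptions. The function name `case_passes` and the criterion names are hypothetical.

```python
from statistics import mean


def case_passes(scores, mean_threshold=4.0, floor=3):
    """Return True if a case meets both conditions of the pass rule.

    scores: dict mapping criterion name -> numeric judge score
            (assumed layout; the page does not define it).
    """
    if not scores:
        return False
    values = list(scores.values())
    # Both conditions must hold: high enough on average, and no weak criterion.
    return mean(values) >= mean_threshold and min(values) >= floor


# Mean is exactly 4.0 and the lowest score is 3, so this passes.
borderline = case_passes({"uses_beta_host": 5, "sends_suffix": 4, "sets_stop": 3})
# Mean is 4.0 but one criterion is below 3, so this fails.
weak_criterion = case_passes({"uses_beta_host": 5, "sends_suffix": 5, "sets_stop": 2})
```

The second example shows why the floor matters: a strong average cannot compensate for a single criterion that scores below 3.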

Rubric criteria

DeepSeek
AI Platform
FIM Completions Beta

Recommended for

DeepSeek API
DeepSeek customers

Works with

DeepSeek

Frequently asked questions

What does the FIM Completions Beta eval for the DeepSeek API test?

Evaluates DeepSeek's FIM / Completions (beta) across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

How is the FIM Completions Beta eval scored?

The judge rubric: Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

How many test cases does this eval pack include?

The FIM Completions Beta pack for the DeepSeek API contains 9 test cases. 3 sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the FIM Completions Beta pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
