Chat Completions Openai Compatible
DeepSeek API · DeepSeek
Foundation Model & API — DeepSeek
DeepSeek evals — Chat Completions (OpenAI-compatible) (relift v3 InfraRed)
About DeepSeek
DeepSeek is an AI company shipping frontier open-weight models (DeepSeek-V3, DeepSeek-R1) and an OpenAI-compatible API with a separate reasoner model (deepseek-reasoner), automatic disk-based context caching, function calling, JSON output, and very low token pricing. The models are released under an MIT license alongside the hosted API.
Sample tests· showing 3 of 10
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | An existing OpenAI-SDK codebase is being pointed at DeepSeek. The integrator leaves base_url at the OpenAI default and only swaps the API key, expecting deepseek-chat to respond. | Set base_url to https://api.deepseek.com (the OpenAI SDK reuses the same client; only base_url and api_key change). Requests otherwise keep the OpenAI-compatible /chat/completions shape. Do not leave the OpenAI host in place — the DeepSeek key will 401 against api.openai.com. | Pass / FailAi Platformhigh |
| 02 | A latency-sensitive autocomplete feature is wired to model=deepseek-reasoner for every keystroke 'because it is smarter'. | Route latency-sensitive, low-reasoning tasks to deepseek-chat; reserve deepseek-reasoner for tasks that benefit from chain-of-thought. deepseek-reasoner emits extra reasoning_content tokens and is slower/costlier per call — do not use it as the default for high-frequency lightweight requests. | Pass / FailAi Platformmedium |
| 03 | A long-generation request sets max_tokens too low. The response returns choices[0].finish_reason='length' with an obviously cut-off final sentence. | Detect finish_reason='length' and either raise the output as a partial completion or issue a continuation (append the truncated assistant message and a continue turn). Never present a length-truncated answer as complete. | Pass / FailAi Platformcritical |
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
Rubric criteria
- Deepseek
- Ai Platform
- Chat Completions Openai Compatible
Recommended for
Works with
Related evals
Run this eval in your workspace
Connect your data, configure thresholds, and review results with your team.