Prompt Library And Versioning
Portkey evals — Prompt Library & Versioning (relift v3 InfraRed)
About Portkey
Portkey is an AI gateway for production LLM apps. It provides a unified, OpenAI-compatible API across 200+ models with provider routing and fallbacks, semantic and simple caching, input/output guardrails (PII redaction, prompt-injection detection, content moderation), request-level observability and traces, a versioned prompt library, virtual keys with per-key budgets and rate limits, and workspace RBAC with audit logs.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Prompt id pp-abc has versions v1, v2, v3; production should always run v2 until explicitly promoted to v3. | Call POST /v1/prompts/pp-abc/completions with an explicit version (e.g., version='v2' or via the prompt_id-vN form per docs); do not let 'latest' be implicit. Deploy promotion is an explicit operator action in the dashboard or via API. | Pass / Fail · AI Platform · high |
| 02 | Prompt template has {{customer_name}} and {{ticket_id}}; caller sends variables={customer_name:'Acme', ticket_id:42}. | Send variables as a JSON object on the request body; Portkey substitutes {{var}} into the template per docs. Missing keys default per template configuration (or fail closed). Validate types before sending, because the model's rendering of a numeric id may differ from a string. | Pass / Fail · AI Platform · medium |
| 03 | Template references {{order_id}} but caller omits order_id from variables. | Gateway should fail closed with a clear error naming the missing variable. Do not silently send the unrendered template to the upstream model, because that makes the model confabulate. Operator should validate required variables client-side before sending. | Pass / Fail · AI Platform · critical |
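The client-side checks these three tests describe (an explicit version pin, variables sent as a JSON object, and failing closed on a missing variable) can be sketched without touching the network. This is a minimal illustration, not Portkey's SDK: the helper names, the `version` body field, and the choice to normalize every value to a string are assumptions, and the exact request shape should be taken from the Portkey docs. Only the path `/v1/prompts/{prompt_id}/completions` and the `{{var}}` placeholder syntax come from the table above.

```python
import re

# Matches {{name}} placeholders as they appear in the sample tests.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def required_variables(template: str) -> set[str]:
    """Return the set of variable names referenced by the template."""
    return set(PLACEHOLDER.findall(template))


def build_completion_request(prompt_id: str, version: str,
                             template: str, variables: dict) -> dict:
    """Build (but do not send) a prompt completion request.

    Hypothetical helper: the 'version' body field is an assumption;
    the docs may instead expect a prompt_id-vN style identifier.
    """
    # Test 01: never rely on an implicit 'latest'.
    if not version or version == "latest":
        raise ValueError("pin an explicit prompt version, e.g. 'v2'")

    # Test 03: fail closed and name every missing variable.
    missing = sorted(required_variables(template) - set(variables))
    if missing:
        raise ValueError("missing template variables: " + ", ".join(missing))

    # Test 02: normalize types so a numeric id renders the same way
    # every time (assumed policy: send everything as a string).
    normalized = {name: str(value) for name, value in variables.items()}

    return {
        "method": "POST",
        "path": f"/v1/prompts/{prompt_id}/completions",
        "body": {"version": version, "variables": normalized},
    }
```

Calling `build_completion_request("pp-abc", "v2", "Hi {{customer_name}}, ticket {{ticket_id}}", {"customer_name": "Acme", "ticket_id": 42})` returns a request description whose variables are all strings, while omitting `ticket_id` raises a `ValueError` that names it. Validating before the request leaves the client complements, rather than replaces, the gateway failing closed.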
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
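One possible reading of this threshold is sketched below. It assumes each rubric criterion receives a numeric score (the scale is not stated here), that the mean is taken across criteria, and that the floor of 3 applies to every individual criterion. The function name and the example criterion names are hypothetical.

```python
from statistics import mean


def eval_passes(scores: dict[str, float],
                mean_threshold: float = 4.0,
                floor: float = 3.0) -> bool:
    """Return True if the mean score meets the threshold and no
    single criterion falls below the floor."""
    if not scores:
        return False  # nothing graded: treat as a fail
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# A strong average does not rescue a criterion that falls below the floor.
print(eval_passes({"version_pinning": 5, "variables": 4, "fail_closed": 3}))
print(eval_passes({"version_pinning": 5, "variables": 5, "fail_closed": 2}))
```

Under this reading the first call passes (mean 4.0, minimum 3) and the second fails even though its mean is also 4.0, because one criterion scored 2.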
Rubric criteria
- Portkey
- AI Platform
- Prompt Library And Versioning