Eval Library
For LangSmith · AI Platform

Prompt Hub

LLM observability and evaluation — LangSmith

LangSmith evals — Prompt Hub (relift v3)

About LangSmith

LangSmith is LangChain's LLM observability and evaluation platform: tracing, datasets, evaluators (LLM-as-judge, code, and human), experiments, prompt management, and online monitoring used by AI teams to measure and improve LLM apps in production.

Employees

~200

Industry

LLM Observability

Headquarters

San Francisco, CA

Sample tests · showing 3 of 8

# · Input · Expected behavior · Check
01
Input: GitHub Action owns prompt text; must create new commit not silent overwrite.
Expected behavior: Use client.push_prompt with object containing prompt template and commit message; record returned commit hash for deployment pin.
Check: Pass / Fail · AI Platform · high
02
Input: Incident traced to prompt drift; deploy must use pull_prompt('support-agent:abc123').
Expected behavior: Call client.pull_prompt with owner/name:commit_hash syntax; avoid bare latest in prod; log pinned hash in service startup.
Check: Pass / Fail · AI Platform · critical
03
Input: Team wants prompt+model pair versioned together for evaluate matrices.
Expected behavior: Push the prompt and model together as a single chain object, then pass include_model=True to pull_prompt so the pull retrieves the chain with model kwargs; document model name in commit metadata for experiment traceability.
Check: Pass / Fail · AI Platform · medium
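
The pinning behavior expected in test 02 can be sketched as a small guard around pull_prompt. This is a minimal illustration, not LangSmith's own code: require_pinned and load_pinned_prompt are hypothetical helper names, the hex-hash pattern is an assumption based on the 'abc123' example, and the client is any object exposing pull_prompt(identifier), such as a langsmith Client.

```python
import logging
import re

logger = logging.getLogger("prompt_pin")

# Assumption: commit hashes are lowercase hex, at least 6 characters,
# matching the 'abc123' style in the sample test. Tags such as 'latest'
# or 'prod' do not match and are rejected.
_COMMIT_RE = re.compile(r"^[0-9a-f]{6,64}$")


def require_pinned(identifier: str) -> str:
    """Return identifier unchanged if it has the form name:commit_hash."""
    name, sep, commit = identifier.partition(":")
    if not sep or not name:
        raise ValueError(f"prompt {identifier!r} is not pinned to a commit")
    if not _COMMIT_RE.match(commit):
        raise ValueError(f"{commit!r} does not look like a commit hash")
    return identifier


def load_pinned_prompt(client, identifier: str):
    """Validate the pin, log it, then pull through the supplied client."""
    pinned = require_pinned(identifier)
    # Logging the pinned hash at startup makes prompt drift easier to trace.
    logger.info("loading prompt %s", pinned)
    return client.pull_prompt(pinned)
```

Calling load_pinned_prompt(client, "support-agent:abc123") at service startup would pull the pinned commit, while "support-agent" or "support-agent:latest" would raise before any network call. The hex check is a heuristic: a tag that happens to consist of six or more hex characters would pass it.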

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Penalize failure_modes.
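
One way to read this grading rule is as a pass/fail check over judged labels. The sketch below is an assumption about how such a grader could be wired, not the platform's actual scoring: the Expected fields mirror the names above, while grade, criteria_met, and failures_seen are hypothetical, and deciding which criteria or failure modes a response exhibits is left to an upstream judge (LLM, code, or human).

```python
from dataclasses import dataclass, field


@dataclass
class Expected:
    ideal_behavior: str                      # reference description for the judge
    rubric: list[str]                        # criteria a passing response must meet
    failure_modes: list[str] = field(default_factory=list)


def grade(expected: Expected, criteria_met: set[str], failures_seen: set[str]) -> dict:
    """Pass only if every rubric criterion is met and no failure mode fired."""
    missing = [c for c in expected.rubric if c not in criteria_met]
    triggered = [f for f in expected.failure_modes if f in failures_seen]
    return {
        "passed": not missing and not triggered,
        "missing": missing,        # rubric criteria the judge did not find
        "triggered": triggered,    # failure modes the judge observed
    }
```

Treating any triggered failure mode as an automatic fail is the strictest interpretation of "penalize"; a weighted score would be an equally plausible design.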

Rubric criteria

  • LangSmith
  • AI Platform
  • Prompt Hub

Recommended for

LangSmith customers
