LangSmith evals — Tracing & Runs (relift v3)

About LangSmith

LangSmith is LangChain's LLM observability and evaluation platform. It provides tracing, datasets, evaluators (LLM-as-judge, code, and human), experiments, prompt management, and online monitoring, which AI teams use to measure and improve LLM apps in production.

Employees: ~200
Industry: LLM Observability
Headquarters: San Francisco, CA

Sample tests · showing 3 of 8

01
Input: A FastAPI service uses @traceable on chat and on a nested retriever; operators see flat sibling runs instead of a parent/child hierarchy in the project trace view.
Expected behavior: Set parent_run_id on the child via RunTree, or ensure the nested @traceable calls share the active run context, so that the retriever run (run_type=tool) appears under the LLM span with the correct hierarchy.
Check: Pass / Fail · AI Platform · high

02
Input: A gateway finishes auth and forwards the request to a worker; the traces appear as two disconnected roots unless the request headers carry the parent context.
Expected behavior: At the gateway, inject the langsmith-trace (and baggage) headers from get_current_run_tree().to_headers(); the worker calls RunTree.from_headers() or tracing_context(parent=headers) before any @traceable work.
Check: Pass / Fail · AI Platform · critical

03
Input: An ASGI app lacks automatic root runs; operators wrap handlers manually and inconsistently.
Expected behavior: Add the LangSmith TracingMiddleware early in the Starlette/FastAPI stack so that each request scope creates a root run and nested @traceable calls attach as children.
Check: Pass / Fail · AI Platform · medium
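
The three sample tests share one mechanism: a child run can only attach to a parent it can see, either through the active in-process context or through headers that cross the process boundary. The sketch below is a toy, standard-library model of that mechanism and is not the LangSmith SDK. `Run`, `start_run`, `RECORDED`, `demo`, and the `x-toy-trace` header are hypothetical names chosen for illustration; the real header format and run fields are not modeled here.

```python
import contextvars
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional

# Toy model only: every name here is hypothetical, not part of the LangSmith SDK.
_current_run = contextvars.ContextVar("current_run", default=None)
RECORDED: List["Run"] = []  # stands in for the tracing backend


@dataclass
class Run:
    name: str
    trace_id: str
    parent_id: Optional[str] = None
    id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def to_headers(self) -> Dict[str, str]:
        # Carry just enough context for another process to attach to this run.
        return {"x-toy-trace": f"{self.trace_id}:{self.id}"}


@contextmanager
def start_run(name: str, headers: Optional[Dict[str, str]] = None) -> Iterator[Run]:
    parent = _current_run.get()
    if parent is not None:
        # Same process: inherit from the active run (the nested-decorator case).
        run = Run(name, trace_id=parent.trace_id, parent_id=parent.id)
    elif headers and "x-toy-trace" in headers:
        # Different process: rebuild the parent link from propagated headers.
        trace_id, parent_id = headers["x-toy-trace"].split(":")
        run = Run(name, trace_id=trace_id, parent_id=parent_id)
    else:
        # No context at all: this run becomes a new, disconnected root.
        run = Run(name, trace_id=uuid.uuid4().hex)
    token = _current_run.set(run)
    RECORDED.append(run)
    try:
        yield run
    finally:
        _current_run.reset(token)


def demo():
    with start_run("gateway") as gateway:
        with start_run("retriever") as child:
            pass
        headers = gateway.to_headers()
    # The gateway context has exited, so only the headers can link the worker.
    with start_run("worker", headers=headers) as linked:
        pass
    with start_run("worker") as orphan:
        pass
    return gateway, child, linked, orphan
```

Calling `demo()` returns four runs. The nested run and the header-linked worker both point at the gateway run and share its trace, while the worker started without headers becomes a separate root, which is the failure described in test 02.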

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric, and penalize any of the listed failure_modes.
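
One way to read this rule as a Pass / Fail check is sketched below. The field names `ideal_behavior`, `rubric`, and `failure_modes` come from the text above; the `Expected` and `EvalCase` classes, the `grade` function, and the sample rubric strings are assumptions made for illustration, and treating any observed failure mode as an automatic fail is only one possible interpretation of "penalize". The per-criterion judgments are taken as inputs because they would come from a human or LLM judge comparing the response with `ideal_behavior`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Expected:
    ideal_behavior: str
    rubric: List[str]


@dataclass
class EvalCase:
    input: str
    expected: Expected
    failure_modes: List[str] = field(default_factory=list)


def grade(case: EvalCase, rubric_met: Dict[str, bool], observed_failures: List[str]) -> bool:
    """Pass only if every rubric criterion is met and no listed failure mode was observed."""
    missing = [c for c in case.expected.rubric if not rubric_met.get(c, False)]
    penalized = [m for m in observed_failures if m in case.failure_modes]
    return not missing and not penalized


# Illustrative values only; the real rubric and failure modes are not shown on this page.
case = EvalCase(
    input="Gateway forwards to worker; traces appear as two disconnected roots.",
    expected=Expected(
        ideal_behavior="Propagate parent context in headers from gateway to worker.",
        rubric=["propagates headers", "worker restores parent"],
    ),
    failure_modes=["starts a new root in the worker"],
)
```

With this sketch, a response passes only when the judge marks both rubric criteria as met and reports none of the case's failure modes.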

Rubric criteria

  • LangSmith
  • AI Platform
  • Tracing Runs

Recommended for

LangSmith customers
