Tracing Runs
LangSmith evals — Tracing & Runs (relift v3)
About LangSmith
LangSmith is LangChain's LLM observability and evaluation platform: tracing, datasets, evaluators (LLM-as-judge, code, and human), experiments, prompt management, and online monitoring used by AI teams to measure and improve LLM apps in production.
Employees: ~200
Industry: LLM Observability
Headquarters: San Francisco, CA
Website: www.langchain.com/langsmith
Sample tests · showing 3 of 8
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | FastAPI service uses @traceable on chat and a nested @traceable retriever; operators see flat siblings instead of parent/child in the project trace view. | Set parent_run_id on the child via RunTree or ensure @traceable nesting shares the active run context so retriever run_type=tool appears under the LLM span with correct hierarchy. | Pass / Fail · AI Platform · high |
| 02 | Gateway finishes auth and forwards to worker; traces appear as two disconnected roots unless headers carry parent context. | Inject langsmith-trace (and baggage) headers from get_current_run_tree().to_headers() at the gateway; worker uses RunTree.from_headers() or tracing_context(parent=headers) before @traceable work. | Pass / Fail · AI Platform · critical |
| 03 | ASGI app lacks automatic roots; operators manually wrap handlers inconsistently. | Add LangSmith TracingMiddleware early in the Starlette/FastAPI stack so request scope creates a root run; nested @traceable calls attach as children. | Pass / Fail · AI Platform · medium |
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Penalize failure_modes.
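
One way to read this grading rule is as a score that rewards rubric criteria met and subtracts for each failure mode observed. The sketch below is an assumption, not the platform's actual grader: the `Verdict` type, the `grade` function, the 0.25 penalty, and the sample criteria are all hypothetical. Comparing a response against `expected.ideal_behavior` is left to whichever judge (LLM, code, or human) produces the verdict.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Verdict:
    """Hypothetical judge output for one test case."""
    rubric_hits: Dict[str, bool]  # rubric criterion -> was it satisfied?
    failure_modes_triggered: List[str] = field(default_factory=list)


def grade(rubric: List[str], failure_modes: List[str], verdict: Verdict,
          penalty: float = 0.25) -> float:
    """Fraction of rubric criteria met, minus a fixed penalty per known failure mode."""
    met = sum(1 for criterion in rubric if verdict.rubric_hits.get(criterion, False))
    base = met / len(rubric) if rubric else 0.0
    hits = sum(1 for mode in verdict.failure_modes_triggered if mode in failure_modes)
    return max(0.0, base - penalty * hits)


# Illustrative data only; these criteria and failure modes are made up for the sketch.
rubric = ["child has parent_run_id", "single trace_id", "correct run_type", "no orphan roots"]
failure_modes = ["flat siblings", "disconnected roots"]
verdict = Verdict(
    rubric_hits={
        "child has parent_run_id": True,
        "single trace_id": True,
        "correct run_type": True,
        "no orphan roots": False,
    },
    failure_modes_triggered=["disconnected roots"],
)
score = grade(rubric, failure_modes, verdict)
```

A Pass / Fail check would then compare `score` against a threshold chosen by the team.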
Rubric criteria
- LangSmith
- AI Platform
- Tracing Runs