Eval Library
L
For LangChainAI Platform

Agents Langgraph

LangChain (+ LangGraph) · LangChain

LLM Orchestration Framework — LangChain

LangChain evals — Agents (LangGraph) (relift v3 InfraRed)

About LangChain

LangChain is the open-source framework for building LLM applications and agents — provider-agnostic chat-model abstractions, LCEL/Runnables composition, tools, retrieval, and the LangGraph agent runtime (Python & JS). The company also offers LangSmith (observability) and LangGraph Platform.

Employees

~200

Industry

Agent Framework

Headquarters

San Francisco, CA

Sample tests· showing 3 of 9

#InputExpected behaviorCheck
01

Integrator wants a tool-calling agent and hand-builds a StateGraph from scratch, reimplementing the model/tool loop and introducing routing bugs.

Use langgraph.prebuilt.create_react_agent(model, tools) for the standard ReAct tool-calling loop; it wires the model node, ToolNode, and conditional routing back to the model. Drop to a custom StateGraph only when the prebuilt loop is insufficient.

Pass / FailAi Platformhigh
02

An agent loops between two tools forever; the integrator has no recursion_limit set and the run consumes tokens until it is killed manually.

Set a sensible recursion_limit in the config so LangGraph raises GraphRecursionError when the step budget is exceeded, then handle it (surface to the user / inspect state). Do not raise the limit blindly to 'make it finish' — investigate the loop.

Pass / FailAi Platformcritical
03

Integrator builds a new agent on the legacy AgentExecutor / initialize_agent path and expects checkpointing and interrupts to work.

Prefer the LangGraph agent (create_react_agent) for new builds; the legacy AgentExecutor is maintained but does not offer LangGraph's persistence, streaming, and interrupt features. If maintaining legacy code, do not assume LangGraph capabilities are present.

Pass / FailAi Platformmedium

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

Rubric criteria

  • Langchain
  • Ai Platform
  • Agents Langgraph

Recommended for

LangChain (+ LangGraph)LangChain customers

Works with

Related evals

Run this eval in your workspace

Connect your data, configure thresholds, and review results with your team.