AI LangChain Nodes
n8n evals — AI / LangChain Nodes (relift v3 InfraRed)
About n8n
n8n is a source-available (fair-code licensed) workflow automation platform. Workflows are composed visually from 1000+ nodes, including AI/LangChain nodes (AI Agent, vector stores, memory, tools). The platform provides triggers (webhook, schedule, poll, form, chat), credentials encrypted at rest, queue-mode execution (Redis-backed workers), self-hosted (Docker/Kubernetes) and n8n Cloud deployment options, and source control and embedding for teams.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | Operator drops an AI Agent node, an OpenAI Chat Model node, and three HTTP Request Tool nodes onto the canvas but only connects the Chat Model. The agent responds without ever calling the tools. | Each tool node must be connected to the AI Agent's `ai_tool` input (the dotted-line port labelled 'Tool'). Verify the agent's exposed tools list reflects all three. Tool nodes provide their `name` and `description` to the LLM; confirm both are set so the model can route. | Pass / Fail | AI Platform | critical |
| 02 | AI Agent should return `{intent, entities[]}`. A downstream Set node reads `$json.output.intent`, but `intent` is sometimes undefined. | Wire a Structured Output Parser (or Auto-fixing Output Parser) to the agent's `ai_outputParser` input with a JSON schema for `{intent, entities[]}`. The parser validates the output and, with auto-fixing, re-prompts the model on schema violation. Downstream code should still handle the parsed object's absence defe… | Pass / Fail | AI Platform | high |
| 03 | Operator builds a chat UI off the Chat Trigger, but the user sees the full LLM response only after 8 s, with no streaming. | Verify whether the Chat Trigger + AI Agent path supports response streaming on the configured n8n version [REQUIRES-VERIFICATION]. If not, operator-side UX should show a typing indicator until completion, or front the chat with a streaming-capable shim. Do not promise token-by-token UX without con… | Pass / Fail | AI Platform | medium |
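The wiring check in test 01 and the absence handling in test 02 can be illustrated offline. This is a minimal sketch, not n8n tooling: it assumes an exported workflow stores sub-node links as `connections[<source node>]["ai_tool"]`, a nested list of `{node, type, index}` objects, and that the agent result arrives under an `output` key as in `$json.output.intent`. The node names (`Get Order` and so on) and the helpers `missing_tools` and `read_intent` are hypothetical.

```python
from typing import Any, Dict, List, Optional, Set, Tuple


def missing_tools(workflow: Dict[str, Any], agent: str, expected: List[str]) -> List[str]:
    """Return the expected tool nodes that have no `ai_tool` link into `agent`."""
    wired: Set[str] = set()
    # Assumed export shape: connections[source][conn_type] -> list of lists of links.
    for source, by_type in workflow.get("connections", {}).items():
        for group in by_type.get("ai_tool", []):
            for link in group or []:
                if link.get("node") == agent:
                    wired.add(source)
    return sorted(set(expected) - wired)


def read_intent(item: Dict[str, Any]) -> Tuple[Optional[str], List[Any]]:
    """Read `output.intent` and `output.entities`, tolerating a missing or malformed output."""
    output = item.get("output")
    if not isinstance(output, dict):
        return None, []
    intent = output.get("intent")
    entities = output.get("entities")
    return (
        intent if isinstance(intent, str) else None,
        entities if isinstance(entities, list) else [],
    )


# Hypothetical workflow fragment: only one of the three tools is wired to the agent.
workflow = {
    "connections": {
        "OpenAI Chat Model": {
            "ai_languageModel": [[{"node": "AI Agent", "type": "ai_languageModel", "index": 0}]]
        },
        "Get Order": {"ai_tool": [[{"node": "AI Agent", "type": "ai_tool", "index": 0}]]},
    }
}

unwired = missing_tools(workflow, "AI Agent", ["Get Order", "Get Invoice", "Get Customer"])
print(unwired)  # ['Get Customer', 'Get Invoice']
```

Passing the expected tool names explicitly, instead of detecting tools by node type, keeps the check independent of how tool node types are named in a given n8n version.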
How this eval is graded
Grade against `expected.ideal_behavior` and `expected.rubric`. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
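One reading of this pass rule can be written down directly. The sketch below assumes the mean is taken across the rubric criteria and that each criterion carries a single numeric score; the score scale is not stated in the source, and the function name `eval_passes` and the example scores are hypothetical.

```python
from statistics import mean
from typing import Dict


def eval_passes(scores: Dict[str, float], mean_threshold: float = 4.0, floor: float = 3.0) -> bool:
    """Pass when the mean score is at least `mean_threshold` and no criterion is below `floor`."""
    if not scores:
        return False  # nothing was graded, so there is nothing to pass
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# Hypothetical scores for the three rubric criteria.
ok = eval_passes({"n8n": 5, "AI Platform": 4, "AI LangChain Nodes": 3})
low_floor = eval_passes({"n8n": 5, "AI Platform": 5, "AI LangChain Nodes": 2})
low_mean = eval_passes({"n8n": 4, "AI Platform": 4, "AI LangChain Nodes": 3})
print(ok, low_floor, low_mean)  # True False False
```

The second case shows why the floor matters: its mean is still 4.0, but a single criterion at 2 fails the eval.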
Rubric criteria
- n8n
- AI Platform
- AI LangChain Nodes