For n8n · AI Platform

AI LangChain Nodes

n8n (self-host + Cloud) · n8n

Workflow Automation — n8n

n8n evals — AI / LangChain Nodes (relift v3 InfraRed)

About n8n

n8n is an open-source workflow automation platform. Workflows are composed visually from 1000+ nodes, including AI/LangChain nodes (AI Agent, vector stores, memory, tools), and are started by triggers (webhook, schedule, poll, form, chat). The platform provides credentials encrypted at rest, queue-mode execution (Redis-backed workers), self-hosted (Docker/Kubernetes) and n8n Cloud deployment options, and source control and embedding for teams.

Employees

~100

Industry

Workflow Automation

Headquarters

Berlin, Germany

Website

n8n.io

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Operator drops an AI Agent node, an OpenAI Chat Model node, and three HTTP Request Tool nodes onto the canvas but only connects the Chat Model. The agent responds without ever calling the tools.

Each tool node must be connected to the AI Agent's `ai_tool` input (the dotted-line port labelled 'Tool'). Verify the agent's exposed tools list reflects all three. Tool nodes provide their `name` + `description` to the LLM — confirm both are set so the model can route.

Pass / Fail · AI Platform · critical
02

AI Agent should return `{intent, entities[]}`. Downstream Set node reads `$json.output.intent` but sometimes intent is undefined.

Wire a Structured Output Parser (or Auto-fixing Output Parser) to the agent's `ai_outputParser` input with a JSON schema for `{intent, entities[]}`. Parser validates and (with auto-fixing) re-prompts the model on schema violation. Downstream code should still handle the parsed object's absence defe…

Pass / Fail · AI Platform · high
03

Operator builds a chat UI off the Chat Trigger but the user sees the full LLM response only after 8s — no streaming.

Verify whether the Chat Trigger + AI Agent path supports response streaming on the configured n8n version [REQUIRES-VERIFICATION] — if not, operator-side UX should show a typing indicator until completion, or front the chat with a streaming-capable shim. Do not promise token-by-token UX without con…

Pass / Fail · AI Platform · medium
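The checks in tests 01 and 02 can be sketched in TypeScript. This is a minimal illustration, not n8n's own API: it assumes an exported workflow JSON in which `connections` is keyed by source node name and sub-nodes point at the agent through an `ai_tool` connection type, and it assumes the agent item exposes the parsed result under `output` as in the test input. The node names and the helpers `unwiredTools` and `readIntent` are hypothetical; verify the JSON shape against an export from your own n8n version.

```typescript
// Assumed minimal shape of an exported n8n workflow; real exports carry more fields.
interface ConnectionTarget { node: string; type: string; index: number }
interface Workflow {
  nodes: { name: string; type: string }[];
  // connections[sourceNodeName][connectionType] = outputs, each a list of targets
  connections: Record<string, Record<string, ConnectionTarget[][]>>;
}

// Names of nodes that have an `ai_tool` connection into the given agent node.
function toolsWiredTo(workflow: Workflow, agentName: string): string[] {
  const wired: string[] = [];
  for (const [source, byType] of Object.entries(workflow.connections)) {
    const outputs = byType["ai_tool"] ?? [];
    const reachesAgent = outputs.some((targets) =>
      (targets ?? []).some((t) => t.node === agentName),
    );
    if (reachesAgent) wired.push(source);
  }
  return wired;
}

// Test 01: expected tool nodes that are not wired to the agent.
function unwiredTools(workflow: Workflow, agentName: string, expected: string[]): string[] {
  const wired = new Set(toolsWiredTo(workflow, agentName));
  return expected.filter((name) => !wired.has(name));
}

interface ParsedIntent { intent: string; entities: unknown[] }

// Test 02: return the parsed object only when it matches {intent, entities[]}.
// Returns null instead of letting a downstream read hit `undefined`.
function readIntent(json: unknown): ParsedIntent | null {
  if (typeof json !== "object" || json === null) return null;
  const output = (json as { output?: unknown }).output;
  if (typeof output !== "object" || output === null) return null;
  const { intent, entities } = output as { intent?: unknown; entities?: unknown };
  if (typeof intent !== "string" || !Array.isArray(entities)) return null;
  return { intent, entities };
}

// Hypothetical workflow mirroring test 01: only the chat model and one tool are wired.
const exampleWorkflow: Workflow = {
  nodes: [
    { name: "AI Agent", type: "agent" },
    { name: "OpenAI Chat Model", type: "chatModel" },
    { name: "Tool A", type: "httpRequestTool" },
    { name: "Tool B", type: "httpRequestTool" },
    { name: "Tool C", type: "httpRequestTool" },
  ],
  connections: {
    "OpenAI Chat Model": {
      ai_languageModel: [[{ node: "AI Agent", type: "ai_languageModel", index: 0 }]],
    },
    "Tool A": { ai_tool: [[{ node: "AI Agent", type: "ai_tool", index: 0 }]] },
  },
};
```

Calling `unwiredTools(exampleWorkflow, "AI Agent", ["Tool A", "Tool B", "Tool C"])` lists the tools the agent cannot call, and `readIntent($json)` gives downstream logic an explicit `null` branch to handle when the parser output is missing or malformed.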

How this eval is graded

Grade against `expected.ideal_behavior` and `expected.rubric`. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
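One reading of this threshold can be sketched as follows. The sketch assumes each rubric criterion receives a single numeric score and that the mean is taken across criteria; the source does not state the score scale or how scores are aggregated, so the function name `passesRubric` and the score map are illustrative only.

```typescript
// Pass rule as stated: mean score >= 4.0 and no individual criterion below 3.
// Assumption: one numeric score per criterion, mean taken across criteria.
function passesRubric(scores: Record<string, number>): boolean {
  const values = Object.values(scores);
  if (values.length === 0) return false; // nothing graded, so nothing passes
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return mean >= 4.0 && values.every((v) => v >= 3);
}

// Hypothetical scores keyed by the rubric criteria of this eval.
const exampleScores: Record<string, number> = {
  "n8n": 5,
  "AI Platform": 4,
  "AI LangChain Nodes": 3,
};
const examplePasses = passesRubric(exampleScores);
```

Both conditions matter: a run scoring 5, 5, 2 has the same mean as the example above but fails because one criterion falls below 3.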

Rubric criteria

  • n8n
  • AI Platform
  • AI LangChain Nodes

Recommended for

n8n (self-host + Cloud) · n8n customers
