For n8n · AI Platform

AI LangChain Nodes

n8n (self-host + Cloud) · n8n

Workflow Automation — n8n

Evaluates n8n's AI / LangChain Nodes across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Workflow Automation eval coverage.

About n8n

n8n is an open-source workflow automation platform. Workflows are composed visually from 1000+ nodes, including AI/LangChain nodes (AI Agent, vector stores, memory, tools), and start from triggers (webhook, schedule, poll, form, chat). The platform provides credentials encrypted at rest, queue-mode execution on Redis-backed workers, self-hosted (Docker/Kubernetes) and n8n Cloud deployment options, and source control and embedding for teams.

Employees

~100

Industry

Workflow Automation

Headquarters

Berlin, Germany

Website

n8n.io

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check	Category	Severity
01	Operator drops an AI Agent node, an OpenAI Chat Model node, and three HTTP Request Tool nodes onto the canvas but only connects the Chat Model. The agent responds without ever calling the tools.	Each tool node must be connected to the AI Agent's `ai_tool` input (the dotted-line port labelled 'Tool'). Verify the agent's exposed tools list reflects all three. Tool nodes provide their `name` and `description` to the LLM; confirm both are set so the model can route.	Pass / Fail	AI Platform	critical
02	AI Agent should return `{intent, entities[]}`. Downstream Set node reads `$json.output.intent` but sometimes intent is undefined.	Wire a Structured Output Parser (or Auto-fixing Output Parser) to the agent's `ai_outputParser` input with a JSON schema for `{intent, entities[]}`. Parser validates and (with auto-fixing) re-prompts the model on schema violation. Downstream code should still handle the parsed object's absence defe…	Pass / Fail	AI Platform	high
03	A Pinecone vector store node connected to an AI Agent for RAG returns 3 chunks by default. Some answers cite documents the operator can't find.	Configure `topK` on the retriever explicitly (e.g., 5-10) and pin the vector-store metadata filter for the tenant. Verify that retrieved chunks' metadata (source URL, doc id) is surfaced in the final response so the operator can audit. Use a deterministic embedding model version pinned in the node …	Pass / Fail	AI Platform	high
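
The wiring check in sample case 01 can be approximated offline against an exported workflow file. The sketch below is illustrative only and is not part of the eval pack. It assumes the exported JSON keys `connections` by source node name and then by connection type (such as `ai_tool`), with each link naming its target in a `node` field. The function names, the `example` data, and the tool names `Tool A` to `Tool C` are hypothetical.

```python
# Sketch only: assumes an exported workflow whose "connections" object is keyed
# by source node name, then by connection type ("main", "ai_tool", ...).

def tools_wired_to_agent(workflow: dict, agent_name: str) -> set[str]:
    """Return names of nodes whose `ai_tool` output targets the given agent."""
    wired = set()
    for source, outputs in workflow.get("connections", {}).items():
        for group in outputs.get("ai_tool", []):
            for link in group or []:  # tolerate empty or null groups
                if link.get("node") == agent_name:
                    wired.add(source)
    return wired


def unwired_tools(workflow: dict, agent_name: str, expected: list[str]) -> list[str]:
    """Return the expected tool nodes that are not connected to the agent."""
    return sorted(set(expected) - tools_wired_to_agent(workflow, agent_name))


# Hypothetical export loosely modeled on case 01: the chat model is connected,
# but only one of the three tool nodes reaches the agent's tool input.
example = {
    "connections": {
        "OpenAI Chat Model": {
            "ai_languageModel": [
                [{"node": "AI Agent", "type": "ai_languageModel", "index": 0}]
            ]
        },
        "Tool A": {
            "ai_tool": [[{"node": "AI Agent", "type": "ai_tool", "index": 0}]]
        },
    }
}

print(unwired_tools(example, "AI Agent", ["Tool A", "Tool B", "Tool C"]))
# prints ['Tool B', 'Tool C']
```

A non-empty result means at least one tool never reaches the agent, which matches the symptom described in the case: the agent answers without calling the tools.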

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
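
One reading of this pass rule, written out as a minimal sketch: a case passes when the mean of its criterion scores is at least 4.0 and no single criterion scores below 3. The function name and the criterion names in the usage lines are hypothetical, and the score scale is not stated on this page, so the example values are assumptions.

```python
from statistics import mean


def case_passes(scores: dict[str, float],
                mean_threshold: float = 4.0,
                floor: float = 3.0) -> bool:
    """Apply the stated rule: mean >= 4.0 and no criterion below 3."""
    if not scores:
        return False  # nothing graded, so nothing to pass (an assumption)
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# Hypothetical criterion names and scores.
print(case_passes({"wiring": 5, "schema": 4, "audit": 3}))  # True: mean 4.0, min 3
print(case_passes({"wiring": 5, "schema": 5, "audit": 2}))  # False: one score below 3
print(case_passes({"wiring": 4, "schema": 4, "audit": 3}))  # False: mean below 4.0
```

The floor matters because a high average can otherwise hide one badly failed criterion, as the second call shows.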

Rubric criteria

n8n
AI Platform
AI LangChain Nodes

Recommended for

n8n (self-host + Cloud) · n8n customers

Works with

n8n

Related evals

AI Platform

Claude API

Evaluates Anthropic's Batch API across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Extended Thinking across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Files API & Citations across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.


Frequently asked questions

What does the AI LangChain Nodes eval for n8n (self-host + Cloud) test?

Evaluates n8n's AI / LangChain Nodes across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Workflow Automation eval coverage.

How is the AI LangChain Nodes eval scored?

The LLM judge grades each case against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

How many test cases does this eval pack include?

The AI LangChain Nodes pack for n8n (self-host + Cloud) contains 9 test cases. 3 sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the AI LangChain Nodes pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
