Composio evals — MCP Server (relift v3 InfraRed)

About Composio

Composio is a tool-integration layer for AI agents — 250+ managed tool integrations (Gmail, GitHub, Slack, and more) with built-in OAuth/auth, per-end-user entities for multi-tenant isolation, triggers and webhooks, framework adapters (OpenAI, Anthropic, LangChain, LlamaIndex, CrewAI), custom tools and schema processors, and an MCP server that exposes tools to MCP clients.

Employees

~40

Industry

Agent Tooling

Headquarters

San Francisco, CA

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

An operator points an MCP client (e.g. Claude Desktop) at the Composio MCP server and wants a specific set of tools available.

Configure the MCP server endpoint with the intended toolset (selected apps/actions) and connect the MCP client to that URL. Expose only the tools the client needs rather than the full catalog. The exact MCP endpoint and config shape must be verified against current docs [REQUIRES-VERIFICATION].

Pass / Fail · AI Platform · high
02

Two end users both use an MCP client backed by the Composio MCP server. Each must only reach their own connected accounts.

Scope MCP access per entity so a client session can only invoke actions on its own entity's connections. Confirm the documented per-entity scoping model [REQUIRES-VERIFICATION]; never expose one shared connection set to all MCP clients.

Pass / Fail · AI Platform · critical
03

The MCP client lists available tools; the Composio MCP server must advertise each tool's name, description, and input schema.

Advertise tools over MCP with complete, valid input schemas and descriptions so the client's model can call them correctly. Ensure the schema matches what execute will accept; broken/empty schemas cause invalid calls.

Pass / Fail · AI Platform · medium
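The check in test 03 can be sketched as a small validator over the tool list a client receives. This is a minimal illustration, assuming each tool arrives as a dictionary with the name, description, and inputSchema fields used in an MCP tools/list result. The function name find_schema_problems and the sample tool names are hypothetical and are not part of Composio's API.

```python
from typing import Any


def find_schema_problems(tools: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each problematic tool name to its issues; an empty dict means all passed."""
    problems: dict[str, list[str]] = {}
    for index, tool in enumerate(tools):
        name = tool.get("name")
        label = name if isinstance(name, str) and name else f"<unnamed tool #{index}>"
        issues: list[str] = []

        if label != name:
            issues.append("missing name")

        description = tool.get("description")
        if not isinstance(description, str) or not description.strip():
            issues.append("missing description")

        schema = tool.get("inputSchema")
        if not isinstance(schema, dict) or not schema:
            issues.append("missing or empty input schema")
        else:
            if schema.get("type") != "object":
                issues.append("input schema type is not 'object'")
            properties = schema.get("properties", {})
            if not isinstance(properties, dict):
                issues.append("input schema properties is not an object")
            else:
                # Every required field should also be declared under properties.
                undeclared = [f for f in schema.get("required", []) if f not in properties]
                if undeclared:
                    issues.append(f"required fields not declared: {undeclared}")

        if issues:
            problems[label] = issues
    return problems


# Hypothetical tool entries, for illustration only.
good_tool = {
    "name": "EXAMPLE_SEND_EMAIL",
    "description": "Send an email to one recipient.",
    "inputSchema": {
        "type": "object",
        "properties": {"to": {"type": "string"}},
        "required": ["to"],
    },
}
broken_tool = {"name": "EXAMPLE_BROKEN", "description": "", "inputSchema": {}}

report = find_schema_problems([good_tool, broken_tool])
```

A structural check like this only catches empty or malformed advertisements. Confirming that the schema matches what execute accepts still requires calling the tool with schema-conforming arguments.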

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. A test passes only if the mean score across rubric criteria is at least 4.0 and no single criterion scores below 3.
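The pass rule can be written as a short function. This is a sketch under two assumptions: scores are numeric per-criterion values, and the mean is taken across the rubric criteria for one test. The function name passes_rubric and the criterion names are hypothetical.

```python
from statistics import mean


def passes_rubric(scores: dict[str, float],
                  min_mean: float = 4.0,
                  min_score: float = 3.0) -> bool:
    """Pass only if the mean is at least min_mean and no criterion is below min_score."""
    values = list(scores.values())
    if not values:
        return False  # nothing was graded, so nothing can pass
    return mean(values) >= min_mean and min(values) >= min_score


# Hypothetical criterion names, for illustration only.
strong = {"correctness": 5, "scoping": 4, "schema": 3}     # mean 4.0, lowest 3
one_weak = {"correctness": 5, "scoping": 5, "schema": 2}   # mean 4.0, lowest 2
low_mean = {"correctness": 4, "scoping": 4, "schema": 3}   # mean below 4.0
```

Here strong passes, one_weak fails on the per-criterion floor despite its mean of 4.0, and low_mean fails on the mean alone.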

Rubric criteria

  • Composio
  • AI Platform
  • MCP Server

Recommended for

Composio · Composio customers
