Composio evals — Entities & Multi-tenancy (relift v3 InfraRed)

About Composio

Composio is a tool-integration layer for AI agents — 250+ managed tool integrations (Gmail, GitHub, Slack, and more) with built-in OAuth/auth, per-end-user entities for multi-tenant isolation, triggers and webhooks, framework adapters (OpenAI, Anthropic, LangChain, LlamaIndex, CrewAI), custom tools and schema processors, and an MCP server that exposes tools to MCP clients.

Employees: ~40

Industry: Agent Tooling

Headquarters: San Francisco, CA

Sample tests (showing 3 of 9)

Each test lists a number, an input scenario, the expected behavior, and a check (grading type, category, severity).
Test 01

Input: A SaaS app has thousands of end users. The team wires the agent so every user's actions run under one shared entity.

Expected behavior: Map each end user to a distinct, stable entity_id (e.g. the app's user id) so connections and executions are isolated per user. A single shared entity collapses every user's connections into one identity and breaks isolation.

Check: Pass / Fail. Category: AI Platform. Severity: critical.
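The mapping that test 01 expects can be sketched without any SDK. The following is a minimal illustration only: entity_id_for and ConnectionStore are hypothetical names, and the in-memory store is a toy stand-in for per-entity connection storage, not Composio's API. It shows why one entity per user keeps connections apart while a single shared entity does not.

```python
from collections import defaultdict
from typing import Dict, Optional


def entity_id_for(app_user_id: str) -> str:
    """Derive a stable entity_id from the app's own user id (hypothetical helper)."""
    cleaned = str(app_user_id).strip()
    if not cleaned:
        raise ValueError("app_user_id must be non-empty")
    return cleaned


class ConnectionStore:
    """Toy in-memory stand-in: connections are keyed by (entity_id, app)."""

    def __init__(self) -> None:
        self._by_entity: Dict[str, Dict[str, str]] = defaultdict(dict)

    def connect(self, entity_id: str, app: str, account: str) -> None:
        self._by_entity[entity_id][app] = account

    def lookup(self, entity_id: str, app: str) -> Optional[str]:
        return self._by_entity.get(entity_id, {}).get(app)


# One entity per end user: each user's Gmail connection stays separate.
isolated = ConnectionStore()
isolated.connect(entity_id_for("1001"), "gmail", "alice@example.com")
isolated.connect(entity_id_for("1002"), "gmail", "bob@example.com")

# One shared entity: in this toy store the second user replaces the first.
shared = ConnectionStore()
shared.connect("shared", "gmail", "alice@example.com")
shared.connect("shared", "gmail", "bob@example.com")
```

The overwrite in the shared case is a property of this toy store. The point it illustrates is the one the test makes: with a shared entity there is no per-user key left to tell the two users' connections apart.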
Test 02

Input: An execute call is issued without specifying entity_id. Composio applies a default entity.

Expected behavior: Treat omission of entity_id as a bug in any multi-tenant flow: it falls back to a default/shared entity. Require entity_id explicitly on every per-user execute, and reserve the default entity only for single-tenant/admin tooling.

Check: Pass / Fail. Category: AI Platform. Severity: high.
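One way to enforce the rule in test 02 is a thin guard in front of whatever function performs the execute call. This is a sketch under stated assumptions: require_entity_id and execute_for_user are hypothetical helpers, the execute callable is injected rather than taken from any SDK, and the default entity is assumed to be named "default".

```python
from typing import Any, Callable, Dict, Optional

DEFAULT_ENTITY_ID = "default"  # assumption: the name of the fallback entity


class MissingEntityError(ValueError):
    """Raised when a per-user execute would run without an explicit entity."""


def require_entity_id(entity_id: Optional[str], *, allow_default: bool = False) -> str:
    """Return entity_id unchanged, or raise if it is missing or the default."""
    if entity_id is None or not entity_id.strip():
        raise MissingEntityError("entity_id is required for per-user execution")
    if entity_id == DEFAULT_ENTITY_ID and not allow_default:
        raise MissingEntityError("default entity is reserved for admin tooling")
    return entity_id


def execute_for_user(
    execute: Callable[..., Any],
    action: str,
    params: Dict[str, Any],
    entity_id: Optional[str] = None,
) -> Any:
    """Call the injected execute function only after entity_id is validated."""
    checked = require_entity_id(entity_id)
    return execute(action=action, params=params, entity_id=checked)
```

Single-tenant or admin code paths can opt in with require_entity_id(DEFAULT_ENTITY_ID, allow_default=True), which keeps the use of the default entity visible at the call site instead of implicit.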
Test 03

Input: A single user connects Gmail, GitHub, and Slack. The agent must pick the right connection per action.

Expected behavior: Within one entity, resolve the connection matching the action's app/toolkit (GITHUB_* -> the GitHub connection). Do not assume one connection per entity; select by app and confirm it is ACTIVE before executing.

Check: Pass / Fail. Category: AI Platform. Severity: medium.
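The selection logic in test 03 can be sketched as a pure function over one entity's connections. Connection, app_for_action, and select_connection are hypothetical names, and deriving the app from the text before the first underscore is a simplification that follows the GITHUB_* example above; a real integration would more likely read the app from the action's metadata.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Connection:
    """Hypothetical record of one connected account within an entity."""
    id: str
    app: str
    status: str


def app_for_action(action: str) -> str:
    """Take the prefix before the first underscore as the app name."""
    return action.split("_", 1)[0].lower()


def select_connection(connections: Iterable[Connection], action: str) -> Connection:
    """Pick the ACTIVE connection whose app matches the action's prefix."""
    app = app_for_action(action)
    matches = [c for c in connections if c.app.lower() == app]
    if not matches:
        raise LookupError(f"no {app} connection for this entity")
    active = [c for c in matches if c.status == "ACTIVE"]
    if not active:
        raise RuntimeError(f"{app} connection exists but is not ACTIVE")
    return active[0]


# One entity holding three connections, one of them not yet usable.
user_connections = [
    Connection(id="c1", app="gmail", status="ACTIVE"),
    Connection(id="c2", app="github", status="ACTIVE"),
    Connection(id="c3", app="slack", status="INITIATED"),
]
```

Raising different errors for "no connection" and "connection not ACTIVE" lets the caller decide whether to prompt the user to connect the app or to finish an authorization that is still pending.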

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Each rubric criterion is scored individually; a test passes when the mean criterion score is >= 4.0 and no criterion scores below 3.
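The pass rule can be written as a short check. This sketch assumes one numeric score per rubric criterion and reads the rule as "mean of the criterion scores at least 4.0, and every criterion at least 3"; the function name, the criterion names in the usage note, and the thresholds as parameters are illustrative, and the scoring scale itself is not stated here.

```python
from statistics import mean
from typing import Mapping


def passes(scores: Mapping[str, float], min_mean: float = 4.0, floor: float = 3.0) -> bool:
    """Return True when the mean score meets min_mean and no score is under floor."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= floor
```

For example, scores of 5, 4, and 3 pass (mean 4.0, lowest 3), while 5, 5, and 2 fail on the floor even though the mean is also 4.0.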

Rubric criteria

  • Composio
  • AI Platform
  • Entities and Multi-tenancy

Recommended for

Composio, Composio customers
