Eval Library
For Composio · AI Platform

Connected Accounts And Auth

Agent Tooling & Integrations — Composio

Composio evals — Connected Accounts & Auth (relift v3 InfraRed)

About Composio

Composio is a tool-integration layer for AI agents. It offers 250+ managed tool integrations (Gmail, GitHub, Slack, and more) with built-in OAuth/auth, per-end-user entities for multi-tenant isolation, triggers and webhooks, framework adapters (OpenAI, Anthropic, LangChain, LlamaIndex, CrewAI), custom tools and schema processors, and an MCP server that exposes tools to MCP clients.

Employees: ~40

Industry: Agent Tooling

Headquarters: San Francisco, CA

Sample tests · showing 3 of 9

01
Input: A new end user must connect their Google account. The agent initiates a managed OAuth connection request via Composio.
Expected behavior: Initiate a connection request scoped to the user's entity_id; Composio returns a redirect URL. Send the user through that URL to complete the OAuth authorization-code flow. Do not attempt to collect the user's Google password or handle the token exchange yourself.
Check: Pass / Fail · AI Platform · critical

02
Input: The user finishes the OAuth consent screen and is redirected back. The agent must confirm the connection is usable before executing actions.
Expected behavior: After the redirect, poll/await the connected account until its status is ACTIVE before executing any action against it. Treat INITIATED as not-yet-usable. Only execute once the connection is confirmed active.
Check: Pass / Fail · AI Platform · high

03
Input: An integration uses an API key (not OAuth). The agent must establish an API-key connection for the user.
Expected behavior: Initiate an API-key/bearer connection by collecting the credential over a secure channel and handing it to Composio's connection request so Composio stores it. Reference the resulting connection by id/entity for execution; do not keep the raw key in agent state or prompt.
Check: Pass / Fail · AI Platform · critical
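The lifecycle that tests 01 to 03 describe can be sketched in Python as follows. This is a minimal illustration against an in-memory stand-in, not the Composio SDK: `FakeConnectionStore`, `initiate_oauth`, `initiate_api_key`, `complete_consent`, `wait_until_active`, `execute_action`, and the example URL are all hypothetical names chosen for this sketch. Only the `entity_id` scoping, the redirect URL, and the INITIATED and ACTIVE statuses come from the tests themselves.

```python
import itertools
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Connection:
    id: str
    entity_id: str                      # per-end-user scope, as in test 01
    app: str
    status: str = "INITIATED"           # statuses named in test 02
    redirect_url: Optional[str] = None  # only set for OAuth connections


class FakeConnectionStore:
    """Hypothetical stand-in for the integration layer (not a real SDK class)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._connections = {}
        self._secrets = {}  # credentials live here, never in agent state

    def _new(self, entity_id, app, **kwargs):
        conn = Connection(f"conn_{next(self._ids)}", entity_id, app, **kwargs)
        self._connections[conn.id] = conn
        return conn

    def initiate_oauth(self, entity_id, app):
        conn = self._new(entity_id, app)
        # Placeholder URL; a real service would return its own consent link.
        conn.redirect_url = f"https://auth.example.invalid/consent?c={conn.id}"
        return conn

    def initiate_api_key(self, entity_id, app, api_key):
        # Assumption of this sketch: a stored key makes the connection usable.
        conn = self._new(entity_id, app, status="ACTIVE")
        self._secrets[conn.id] = api_key
        return conn

    def complete_consent(self, connection_id):
        """Simulates the user finishing the consent screen."""
        self._connections[connection_id].status = "ACTIVE"

    def get_status(self, connection_id):
        return self._connections[connection_id].status


def wait_until_active(store, connection_id, timeout_s=5.0, interval_s=0.01):
    """Poll until the connection is ACTIVE; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if store.get_status(connection_id) == "ACTIVE":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def execute_action(store, connection_id, action):
    """Refuse to run anything against a connection that is not ACTIVE."""
    if store.get_status(connection_id) != "ACTIVE":
        raise RuntimeError(f"connection {connection_id} is not ACTIVE")
    # The agent passes only the connection id, never the raw credential.
    return {"connection_id": connection_id, "action": action, "ok": True}
```

In this sketch the agent would call `initiate_oauth`, hand `redirect_url` to the user, and gate every `execute_action` call behind `wait_until_active`. The raw API key appears only as an argument to `initiate_api_key` and is not a field of the returned `Connection`, which mirrors the requirement in test 03 to reference the connection by id rather than hold the key.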

How this eval is graded

Grade each test against expected.ideal_behavior and expected.rubric. Each rubric criterion is scored individually; a test passes when the mean score across criteria is >= 4.0 and no single criterion scores below 3.
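The pass rule can be expressed as a small check. The sketch below is one reading of it, assuming the mean is taken across the criterion scores of a single test and that "below 3" means strictly less than 3. The function name `passes` and the criterion names in the usage line are hypothetical.

```python
from statistics import mean


def passes(scores, mean_threshold=4.0, floor=3):
    """Return True when the mean score meets the threshold and no score is under the floor.

    `scores` maps a criterion name to its numeric score.
    """
    if not scores:
        raise ValueError("no criterion scores to grade")
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, `passes({"scoping": 5, "redirect": 4, "no_password": 3})` returns True (mean 4.0, minimum 3), while replacing the scores with 5, 5, and 2 keeps the mean at 4.0 but fails on the floor.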

Rubric criteria

  • Composio
  • AI Platform
  • Connected Accounts And Auth

Recommended for

Composio · Composio customers
