Custom Tools And Processing
Composio evals — Custom Tools & Processing (relift v3 InfraRed)
About Composio
Composio is a tool-integration layer for AI agents. It provides 250+ managed tool integrations (Gmail, GitHub, Slack, and more) with built-in OAuth and authentication handling, per-end-user entities for multi-tenant isolation, triggers and webhooks, framework adapters (OpenAI, Anthropic, LangChain, LlamaIndex, CrewAI), custom tools and schema processors, and an MCP server that exposes tools to MCP clients.
Sample tests (showing 3 of 9)
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | An operator defines a custom tool wrapping their internal API but gives it no input schema, so the model passes free-form arguments. | Define the custom action with an explicit input schema (typed, required fields, descriptions) so the model produces valid arguments and the handler can validate them. A schemaless custom tool yields unvalidated, ambiguous calls. | Pass / Fail | AI Platform | high |
| 02 | A custom action's raw response includes an internal access token field. A post-processor returns the response to the model. | Use the post-processor to strip secrets/credentials and trim large fields before the result reaches the model. The post-processor must remove the token field, not just rename it. Confirm secret fields are gone from the model-visible payload. | Pass / Fail | AI Platform | critical |
| 03 | Multiple processors are configured; the operator is unsure whether pre runs before the platform's normalization and where post sits. | Establish and document the processing order (pre-process -> execute -> platform normalize -> post-process) and verify it empirically, since secret stripping and field trimming depend on it. The exact ordering relative to platform normalization is unverified [REQUIRES-VERIFICATION]; test it, don't assume. | Pass / Fail | AI Platform | medium |
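
The three sample tests can be illustrated with a small, framework-agnostic sketch. It deliberately does not use the Composio SDK: the schema layout, the `lookup_order` field names, the secret deny-list, the trimming threshold, and the `traced` helper are all hypothetical, chosen only to show the shape of each check. The stages are chained locally in the order the eval proposes, so the recorded trace demonstrates the tracing technique and is not evidence of how the platform actually orders its processors.

```python
from typing import Any, Callable

# Hypothetical input schema for a custom tool wrapping an internal API (test 01).
# Field names are illustrative and not taken from any real action.
LOOKUP_ORDER_SCHEMA: dict[str, dict[str, Any]] = {
    "order_id": {"type": str, "required": True, "description": "Internal order identifier"},
    "include_items": {"type": bool, "required": False, "description": "Also return line items"},
}


def validate_args(schema: dict[str, dict[str, Any]], args: dict[str, Any]) -> dict[str, Any]:
    """Reject unknown, missing, or wrongly typed arguments before the handler runs."""
    unknown = set(args) - set(schema)
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    for name, spec in schema.items():
        if name not in args:
            if spec["required"]:
                raise ValueError(f"missing required argument: {name}")
            continue
        if not isinstance(args[name], spec["type"]):
            raise TypeError(f"{name} must be {spec['type'].__name__}")
    return args


SECRET_KEYS = {"access_token", "refresh_token", "api_key"}  # assumed deny-list
MAX_FIELD_CHARS = 200  # assumed trimming threshold


def strip_and_trim(payload: Any) -> Any:
    """Post-processor (test 02): drop secret fields entirely and truncate long strings."""
    if isinstance(payload, dict):
        return {
            key: strip_and_trim(value)
            for key, value in payload.items()
            if str(key).lower() not in SECRET_KEYS
        }
    if isinstance(payload, list):
        return [strip_and_trim(item) for item in payload]
    if isinstance(payload, str) and len(payload) > MAX_FIELD_CHARS:
        return payload[:MAX_FIELD_CHARS] + "...[truncated]"
    return payload


def traced(name: str, fn: Callable[[Any], Any], trace: list[str]) -> Callable[[Any], Any]:
    """Wrap a stage so that each invocation appends its name to a shared trace (test 03)."""
    def wrapper(value: Any) -> Any:
        trace.append(name)
        return fn(value)
    return wrapper


def fake_handler(args: dict[str, Any]) -> dict[str, Any]:
    # Stand-in for the internal API; it returns a secret the model must never see.
    return {"order_id": args["order_id"], "status": "shipped", "access_token": "s3cr3t"}


trace: list[str] = []
pre = traced("pre", lambda a: validate_args(LOOKUP_ORDER_SCHEMA, a), trace)
execute = traced("execute", fake_handler, trace)
normalize = traced("normalize", lambda r: r, trace)  # stand-in for platform normalization
post = traced("post", strip_and_trim, trace)

# Local chaining in the proposed order; a real check would register the traced
# processors with the platform and inspect the trace it produces.
result = post(normalize(execute(pre({"order_id": "A-1001"}))))
print(result)
print(trace)
```

The post-processor filters keys out of the payload rather than renaming or masking them, which is the distinction test 02 checks for. For test 03, the same `traced` wrappers could be attached to real processors so that the recorded order comes from the platform rather than from local chaining.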
How this eval is graded
Grade each response against expected.ideal_behavior and expected.rubric. A pass requires a mean score of at least 4.0 across the rubric criteria, with no single criterion scoring below 3.
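
As a rough illustration of the pass rule, the sketch below assumes each rubric criterion receives one numeric score and that the mean is taken across criteria. The function name, parameter names, and criterion labels are hypothetical; the scoring scale itself is not specified here.

```python
from statistics import mean


def eval_passes(scores: dict[str, float], min_mean: float = 4.0, floor: float = 3.0) -> bool:
    """Return True when the mean score meets min_mean and no criterion falls below floor."""
    if not scores:
        return False  # nothing graded, so nothing can pass
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= floor


# Hypothetical criterion labels, used only for illustration.
print(eval_passes({"schema": 5, "secret_stripping": 5, "ordering": 3}))
print(eval_passes({"schema": 5, "secret_stripping": 5, "ordering": 2}))
```

The second call fails even though its mean is exactly 4.0, because one criterion is below the floor of 3.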
Rubric criteria
- Composio
- AI Platform
- Custom Tools And Processing