Tool Routing V1

Tool Router · Gemini

Gemini eval for choosing when to use tools, avoiding unnecessary calls, and selecting the right Workspace action.

About Gemini

Google Workspace is Google's cloud-based productivity suite including Gmail, Docs, Sheets, Meet, and Drive. Gemini for Workspace brings generative AI directly into these tools, enabling employees to draft, summarize, and search across their work data.

Employees: ~182,000

Industry: Cloud Productivity & AI

Headquarters: Mountain View, CA

Sample tests · showing 3 of 6

01 · needs drive lookup
  Input: A user asks for a file in Drive and wants the latest version. Choose the file search tool rather than guessing from memory.
  Expected behavior: resolution_label: use_drive_search · tool_choice: drive_search · should_use_tool: true
  Check: Pass / Fail

02 · direct answer sufficient
  Input: The answer is already explicitly stated in the prompt. Do not call any external tool; answer directly and briefly.
  Expected behavior: resolution_label: answer_directly · should_use_tool: false · tool_call_count: 0
  Check: Pass / Fail

03 · needs meet transcript
  Input: A user asks for the action items from the last meeting. Use the Meet transcript or summary tool instead of inventing them from memory.
  Expected behavior: resolution_label: use_meet_transcript · tool_choice: meet_transcript · should_use_tool: true
  Check: Pass / Fail
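
The expected-behavior fields in the sample tests lend themselves to a simple field-by-field comparison. The sketch below is one possible way to do that check, not the eval's actual harness: the `SAMPLE_TESTS` dictionary and the `check_decision` helper are hypothetical names, and the shape of the model's decision (a flat dictionary) is an assumption. Only the field names and values come from the samples above.

```python
from typing import Any

# Expected fields copied from the three sample tests shown on this page.
# The dictionary layout itself is an assumption made for illustration.
SAMPLE_TESTS: dict[str, dict[str, Any]] = {
    "01": {
        "resolution_label": "use_drive_search",
        "tool_choice": "drive_search",
        "should_use_tool": True,
    },
    "02": {
        "resolution_label": "answer_directly",
        "should_use_tool": False,
        "tool_call_count": 0,
    },
    "03": {
        "resolution_label": "use_meet_transcript",
        "tool_choice": "meet_transcript",
        "should_use_tool": True,
    },
}


def check_decision(expected: dict[str, Any], actual: dict[str, Any]) -> bool:
    """Return True only if every expected field appears in `actual`
    with the same type and value. Extra fields in `actual` are ignored."""
    for key, value in expected.items():
        if key not in actual:
            return False
        # Compare types as well, so that False and 0 are not treated as equal.
        if type(actual[key]) is not type(value) or actual[key] != value:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical model decision for test 02: it answered directly
    # but still made one tool call, so the check should fail.
    decision = {
        "resolution_label": "answer_directly",
        "should_use_tool": False,
        "tool_call_count": 1,
    }
    print("Pass" if check_decision(SAMPLE_TESTS["02"], decision) else "Fail")
```

Ignoring extra fields keeps the check tolerant of decisions that report more than the test asks about; a stricter harness could reject them instead.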

How this eval is graded

Evaluate whether the response is grounded, policy-aligned, and operationally useful. Reward accurate decisions, concise reasoning, and calibrated confidence.

Pass threshold: a criterion passes at a judge score of 4 or higher.

Rubric criteria

  • Right-Tool Selection
  • Tool Abstention
  • Action Discipline
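
The pass threshold stated above can be applied per criterion as in the following minimal sketch. It assumes the judge returns one integer score for each rubric criterion; the score scale, the `grade` function, and the input format are all hypothetical. The page does not say how per-criterion results combine into an overall verdict, so the sketch reports each criterion separately.

```python
# Threshold and criterion names are taken from this page; everything else
# (function name, input shape) is an illustrative assumption.
PASS_THRESHOLD = 4
CRITERIA = ("Right-Tool Selection", "Tool Abstention", "Action Discipline")


def grade(scores: dict[str, int]) -> dict[str, bool]:
    """Map each rubric criterion to True if its judge score meets the threshold."""
    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"missing judge scores for: {', '.join(missing)}")
    return {name: scores[name] >= PASS_THRESHOLD for name in CRITERIA}


if __name__ == "__main__":
    # Hypothetical judge scores for a single response.
    result = grade({
        "Right-Tool Selection": 5,
        "Tool Abstention": 4,
        "Action Discipline": 3,
    })
    for name, passed in result.items():
        print(f"{name}: {'Pass' if passed else 'Fail'}")
```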

Recommended for

Right-Tool Selection · Tool Abstention · Action Discipline
