
Evals for Google Meet
9 evaluation packs covering adversarial robustness, safety gates, workflow quality, and operator-level checks for Google Meet AI products.
About Google Meet
Google Workspace is Google's cloud-based productivity suite including Gmail, Docs, Sheets, Meet, and Drive. Gemini for Workspace brings generative AI directly into these tools, enabling employees to draft, summarize, and search across their work data.
Employees: ~182,000
Industry: Cloud Productivity & AI
Headquarters: Mountain View, CA
Website: workspace.google.com
Available eval packs for Google Meet
9 packs ready to run.
Live Meeting Assistant Grounding V1
PII Leakage, Answer Relevance
Answer live questions from meeting context only when the answer is supported by the transcript or shared materials.
Live Assistant Grounding V1
Answer Relevance
Meet live assistant eval for grounded Q&A during an active meeting.
Recap And Actions V1
Conciseness
Eval for meeting recap quality, action-item extraction, and ownership tracking from live or recorded meeting context.
Recording Privacy V1
PII Leakage
Meet recording, captions, and privacy-policy eval for compliant handling under recording constraints.
Transcript Privacy V1
PII Leakage
Eval for transcript summarization, caption fidelity, recording consent, and privacy-safe handling of sensitive meeting content.
Transcript Summary V1
Knowledge Retention
Meet transcript summarization eval for fidelity, compression, and factual recall.
Meeting Recap And Actions V1
PII Leakage, Hallucination
Turn meeting notes and partial transcripts into crisp recaps with owners, deadlines, and next steps.
Recording Caption And Privacy Policy V1
PII Leakage
Apply recording and captioning policy correctly, and avoid leaking private meeting information when consent is unclear.
Transcript Summarization Accuracy V1
PII Leakage, Correctness
Produce accurate summaries of meeting transcripts with the right decisions, action items, and open questions.
Why eval Google Meet AI
Google Meet's AI features ship behind brand promises about accuracy, safety, and reliability. Buyers and integrators need to know those promises hold up under adversarial prompts, edge-case workflows, and the long tail of real customer inputs — not just the demo path.
The Corsac eval library for Google Meet measures the four dimensions that teams care about most when deploying search and knowledge agents:
- Adversarial robustness: does the agent resist prompt injection, jailbreaks, and social-engineering attempts?
- Workflow quality: does it complete the task buyers were shown in the demo, on inputs that don't look like the demo?
- Safety gates: does it escalate or refuse when it should, and only then?
- Operator quality: does it preserve analyst trust by surfacing the right context at the right time?
Every eval pack above is hand-authored against Google Meet's public product surface and runnable in Corsac with your own data.