For Mem0 · AI Platform · Knowledge Retention

Memory Lifecycle

Mem0 (Platform + OSS) · Mem0

Agent Memory — Mem0

Mem0 evals — Memory Lifecycle (get/update/delete/history) (relift v3 InfraRed)

About Mem0

Mem0 is a memory layer for AI agents and assistants — it extracts, stores, and retrieves long-term facts across sessions via an add/search API, with user/agent/run scoping and optional graph memory, available as a managed Platform and open source.

Employees: ~30

Industry: Agent Memory

Headquarters: San Francisco, CA

Website: mem0.ai

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Operator calls m.get_all(user_id='u_4') to display a user's memories in a settings UI; the user has 1200 memories.

Use get_all scoped by user_id and page through results rather than assuming one call returns everything. Render with each memory's id so the UI can target update/delete. Do not call get_all without a scope in a multi-tenant store.

Pass / Fail · AI Platform · high
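The paging behavior expected in test 01 can be sketched as follows. This is a minimal illustration against an in-memory stand-in, not the Mem0 client: `FakeMemoryStore`, `iter_user_memories`, and the `page` / `page_size` parameters are assumptions made for the example, and the real client may expose pagination differently.

```python
from typing import Dict, Iterator, List


class FakeMemoryStore:
    """In-memory stand-in for a memory client. NOT the real Mem0 API."""

    def __init__(self, rows: List[Dict[str, str]]) -> None:
        self._rows = rows

    def get_all(self, user_id: str, page: int = 1, page_size: int = 100) -> List[Dict[str, str]]:
        # Hypothetical paging parameters; check the real client before relying on them.
        scoped = [r for r in self._rows if r["user_id"] == user_id]
        start = (page - 1) * page_size
        return scoped[start:start + page_size]


def iter_user_memories(store, user_id: str, page_size: int = 100) -> Iterator[Dict[str, str]]:
    """Yield every memory for one user, fetching one page at a time."""
    if not user_id:
        # Refuse unscoped reads in a multi-tenant store.
        raise ValueError("user_id is required")
    page = 1
    while True:
        batch = store.get_all(user_id=user_id, page=page, page_size=page_size)
        if not batch:
            break  # an empty page means there is nothing left to read
        yield from batch
        page += 1


# 1200 memories for u_4, plus one row that belongs to a different user.
rows = [{"id": f"m_{i}", "user_id": "u_4", "memory": f"fact {i}"} for i in range(1200)]
rows.append({"id": "m_other", "user_id": "u_5", "memory": "someone else's fact"})
store = FakeMemoryStore(rows)

memories = list(iter_user_memories(store, "u_4"))
# Keep the id next to the text so the UI can target update/delete.
ui_rows = [(m["id"], m["memory"]) for m in memories]
```

Stopping on the first empty page keeps the loop independent of any total-count field, which the stand-in does not provide.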
02

After a search returns memory id 'm_abc', the agent wants the full current state of just that memory and calls m.get('m_abc').

Use get(memory_id) to fetch the authoritative current state of one memory by id. Handle the not-found case (already deleted) explicitly. Do not re-run a broad search to find a memory you already hold the id for.

Pass / Fail · AI Platform · medium
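A minimal sketch of the lookup in test 02, with the not-found case handled explicitly. The stand-in store, the `MemoryNotFound` exception, and the `fetch_memory` helper are hypothetical; how the real client signals a missing id (an exception, `None`, or an error response) is an assumption here, so the helper tolerates both an exception and a `None` return.

```python
from typing import Dict, Optional


class MemoryNotFound(Exception):
    """Hypothetical error type; the real client may signal a miss differently."""


class FakeMemoryStore:
    """In-memory stand-in for a memory client. NOT the real Mem0 API."""

    def __init__(self, rows: Dict[str, Dict[str, str]]) -> None:
        self._rows = dict(rows)

    def get(self, memory_id: str) -> Dict[str, str]:
        if memory_id not in self._rows:
            raise MemoryNotFound(memory_id)
        return self._rows[memory_id]


def fetch_memory(store, memory_id: str) -> Optional[Dict[str, str]]:
    """Return the current state of one memory, or None if it no longer exists."""
    try:
        record = store.get(memory_id)
    except MemoryNotFound:
        # The memory was deleted after the search returned its id.
        return None
    # A client that returns None for a miss is passed through unchanged.
    return record


store = FakeMemoryStore({"m_abc": {"id": "m_abc", "memory": "prefers dark mode"}})

found = fetch_memory(store, "m_abc")
missing = fetch_memory(store, "m_deleted")
```

Fetching by id avoids a second broad search, and returning `None` gives the caller a single branch to handle a memory that has already been deleted.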
03

A user invokes their right to be forgotten; the operator calls m.delete_all(user_id='u_gone').

Call delete_all with the user_id scope so ONLY that subject's memories are removed. Verify the scope argument is present and correct before issuing — an unscoped delete_all would wipe the entire store. Confirm with a follow-up get_all(user_id) returning empty.

Pass / Fail · AI Platform · critical
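The guard-then-verify pattern expected in test 03 might look like the sketch below. It runs against an in-memory stand-in whose unscoped `delete_all` wipes everything, mirroring the hazard the test describes; `FakeMemoryStore` and `forget_user` are hypothetical names, not part of Mem0.

```python
from typing import Dict, List, Optional


class FakeMemoryStore:
    """In-memory stand-in for a memory client. NOT the real Mem0 API."""

    def __init__(self, rows: List[Dict[str, str]]) -> None:
        self._rows = list(rows)

    def get_all(self, user_id: str) -> List[Dict[str, str]]:
        return [r for r in self._rows if r["user_id"] == user_id]

    def delete_all(self, user_id: Optional[str] = None) -> None:
        if user_id is None:
            # The hazard: no scope means every subject's memories are removed.
            self._rows.clear()
        else:
            self._rows = [r for r in self._rows if r["user_id"] != user_id]


def forget_user(store, user_id: str) -> int:
    """Delete one subject's memories and verify that none remain."""
    if not isinstance(user_id, str) or not user_id.strip():
        # Never let an empty or missing scope reach delete_all.
        raise ValueError("a non-empty user_id is required")
    removed = len(store.get_all(user_id=user_id))
    store.delete_all(user_id=user_id)
    if store.get_all(user_id=user_id):
        raise RuntimeError(f"memories still present for {user_id!r}")
    return removed


store = FakeMemoryStore([
    {"id": "m_1", "user_id": "u_gone", "memory": "home address"},
    {"id": "m_2", "user_id": "u_gone", "memory": "birthday"},
    {"id": "m_3", "user_id": "u_keep", "memory": "favorite editor"},
])

removed_count = forget_user(store, "u_gone")
others_left = len(store.get_all(user_id="u_keep"))
```

Validating the scope inside the helper means a caller cannot trigger the unscoped path by passing an empty string or `None`.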

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A test passes when the mean score across rubric criteria is at least 4.0 and no single criterion scores below 3.
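One reading of the pass rule above, as a small sketch: the mean of the per-criterion scores must reach 4.0 and the lowest score must be at least 3. The function name and the criterion names are hypothetical, and the page does not state the scoring scale, so none is enforced here.

```python
from statistics import mean
from typing import Mapping


def passes(scores: Mapping[str, float], mean_threshold: float = 4.0, floor: float = 3.0) -> bool:
    """Return True when the mean meets the threshold and no criterion falls below the floor."""
    if not scores:
        return False  # nothing was graded, so nothing can pass
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# Hypothetical criterion names, used only to illustrate the rule.
strong = passes({"scoping": 5, "paging": 4, "ids_rendered": 4})
one_low = passes({"scoping": 5, "paging": 5, "ids_rendered": 2})
mediocre = passes({"scoping": 4, "paging": 4, "ids_rendered": 3})
```

The second case has a mean of exactly 4.0 but still fails because one criterion is below 3; the third clears the floor but its mean falls short of 4.0.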

Rubric criteria

  • Mem0
  • AI Platform
  • Memory Lifecycle

Recommended for

Mem0 (Platform + OSS) · Mem0 customers
