For Mem0 · AI Platform · Knowledge Retention

Memory Extraction And Consolidation

Mem0 (Platform + OSS) · Mem0

Agent Memory — Mem0

Mem0 evals — Memory Extraction & Consolidation (relift v3 InfraRed)

About Mem0

Mem0 is a memory layer for AI agents and assistants. It extracts, stores, and retrieves long-term facts across sessions through an add/search API, with user, agent, and run scoping and optional graph memory. It is available as a managed Platform and as open source.

Employees: ~30

Industry: Agent Memory

Headquarters: San Francisco, CA

Website: mem0.ai

Sample tests (showing 3 of 9)

# · Input · Expected behavior · Check
01

A turn contains 'Haha thanks! Anyway, I'm gluten-free and I have two kids.' with infer=True.

Extraction should persist the durable facts ('gluten-free', 'has two children') and drop conversational filler ('Haha thanks'). Verify the stored memories are the salient facts; do not expect pleasantries to be remembered or treat their absence as data loss.

Pass / Fail · AI Platform · high
02

Stored: 'Budget is $50/month'. New turn: 'Actually my budget is $100/month'.

Consolidation should UPDATE the budget memory to $100/month (or DELETE+ADD) so a single current value remains. Read the event to confirm the old value did not linger as a competing memory that could be retrieved later.

Pass / Fail · AI Platform · critical
03

A travel app sets custom_categories (e.g., 'destinations', 'dietary', 'loyalty programs') so extracted memories are tagged into the operator's taxonomy.

Configure custom categories per the documented Platform/project setting so extracted memories are classified into the operator's taxonomy and become filterable by category. Verify categories appear on stored memories; do not rely on default categories if a custom taxonomy is required.

Pass / Fail · AI Platform · medium
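
The pass/fail checks for tests 01 and 02 could be sketched roughly as follows. This is an illustration only, not the grader this eval uses: it assumes each add call yields a list of entries with a "memory" string and an "event" of ADD, UPDATE, or DELETE, and it uses simple keyword matching where a real grader would likely apply the rubric. The function names and the keyword lists are hypothetical.

```python
from typing import Dict, List

# Assumed shape of one entry in an add() result (an assumption, not a spec):
# {"memory": "Budget is $100/month", "event": "UPDATE"}
MemoryEvent = Dict[str, str]

WRITE_EVENTS = ("ADD", "UPDATE")


def extraction_ok(events: List[MemoryEvent],
                  required: List[str],
                  filler: List[str]) -> bool:
    """Test 01: every durable fact was written and no filler phrase was."""
    stored = [e["memory"].lower() for e in events
              if e.get("event") in WRITE_EVENTS]
    has_facts = all(any(k.lower() in m for m in stored) for k in required)
    has_filler = any(f.lower() in m for m in stored for f in filler)
    return has_facts and not has_filler


def consolidation_ok(events: List[MemoryEvent],
                     old_value: str,
                     new_value: str) -> bool:
    """Test 02: the new value replaced the old one via UPDATE or DELETE+ADD."""
    def seen(event: str, value: str) -> bool:
        return any(e.get("event") == event and value in e["memory"]
                   for e in events)

    replaced = seen("UPDATE", new_value) or (
        seen("DELETE", old_value) and seen("ADD", new_value))
    # The old value must not be written again alongside the new one.
    old_rewritten = seen("ADD", old_value) or seen("UPDATE", old_value)
    return replaced and not old_rewritten
```

Note that a bare ADD of the new budget fails the second check, because nothing in the events shows the old $50/month memory being updated or deleted, so it could still be retrieved later.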

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
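
A minimal sketch of the threshold rule, under one reading of it: each rubric criterion receives a numeric score (a 1 to 5 scale is assumed here), the mean is taken across criteria, and a single low criterion fails the test regardless of the mean. The function name and the criterion names are hypothetical.

```python
from statistics import mean
from typing import Dict


def passes(scores: Dict[str, float],
           mean_floor: float = 4.0,
           criterion_floor: float = 3.0) -> bool:
    """Return True when the mean score meets mean_floor and
    no single criterion falls below criterion_floor."""
    if not scores:
        return False  # nothing graded, so nothing can pass
    values = list(scores.values())
    return mean(values) >= mean_floor and min(values) >= criterion_floor


# Mean is 4.0 and the lowest criterion is 3, so this passes.
print(passes({"salience": 5, "consolidation": 4, "taxonomy": 3}))
# Mean is 4.0 but one criterion scored 2, so this fails.
print(passes({"salience": 5, "consolidation": 5, "taxonomy": 2}))
```

The second floor matters because a strong average can otherwise hide one badly failed criterion.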

Rubric criteria

  • Mem0
  • AI Platform
  • Memory Extraction And Consolidation

Recommended for

Mem0 (Platform + OSS) · Mem0 customers
