Puzzle
For Puzzle · Accounting & Finance · BI Analyst

Audit Readiness Traceability V1


Wave 2 production eval for Puzzle focused on audit readiness traceability.

About Puzzle

Puzzle is an AI-native accounting platform that automates bookkeeping and financial close for startups and growing companies. Its software ingests transactions, reconciles accounts, and surfaces anomalies in real time — reducing close time from weeks to days.

Employees

~60

Industry

Accounting Software

Headquarters

San Francisco, CA

Website

puzzle.io

Sample tests · showing 3 of 16

01 · journal without evidence
  Input: Handle journal_without_evidence scenario for Puzzle. Quality constraint: must block posting until evidence link is captured.
  Expected behavior: should_flag_painpoint: false · risk_level: none · required_actions: no action required
  Check: Pass / Fail

02 · reconciling item ageing
  Input: Handle reconciling_item_ageing scenario for Puzzle. Quality constraint: must include owner and due date per stale item.
  Expected behavior: should_flag_painpoint: true · risk_level: high · required_actions: state constraints, identify owner…
  Check: Pass / Fail

03 · multi entity mapping
  Input: Handle multi_entity_mapping scenario for Puzzle. Quality constraint: must identify entity-level mapping assumptions.
  Expected behavior: should_flag_painpoint: true · risk_level: high · required_actions: state constraints, identify owner…
  Check: Pass / Fail
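
The page does not say how the Pass / Fail check is computed for each sample test. As a rough illustration only, the sketch below compares a response's structured fields against the expected values listed in the three samples above. The `Expected` class, the `check` function, and the exact-match comparison are assumptions made for this example, not Puzzle's or the eval's actual implementation. `required_actions` is left out because it is truncated in the listing for tests 02 and 03.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expected:
    """Expected fields for one scenario (hypothetical structure)."""
    should_flag_painpoint: bool
    risk_level: str


# Values copied from the three sample tests shown on the page.
EXPECTED = {
    "journal_without_evidence": Expected(False, "none"),
    "reconciling_item_ageing": Expected(True, "high"),
    "multi_entity_mapping": Expected(True, "high"),
}


def check(scenario: str, output: dict) -> str:
    """Return "Pass" if the output matches the expected fields exactly.

    Exact matching is an assumption; the real check may be judge-based.
    """
    exp = EXPECTED[scenario]
    ok = (
        output.get("should_flag_painpoint") is exp.should_flag_painpoint
        and output.get("risk_level") == exp.risk_level
    )
    return "Pass" if ok else "Fail"
```

For example, `check("reconciling_item_ageing", {"should_flag_painpoint": True, "risk_level": "high"})` returns `"Pass"`, while the same call with `"risk_level": "low"` returns `"Fail"`.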

How this eval is graded

Responses are evaluated on risk detection accuracy, escalation quality, and the actionability of guidance under persona-specific constraints.

Pass threshold: a criterion passes at a judge score of 4 or higher.

Rubric criteria

  • Evidence Traceability
  • Control Narrative Quality
  • Risk Exposure Detection
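
The pass threshold can be read as a simple per-criterion comparison. The sketch below applies the stated rule (a judge score of 4 or higher passes) to the three rubric criteria. The `grade` function and the input shape are assumptions for illustration. The page does not state the judge's score scale or how criterion results combine into an overall result, so neither is modeled here.

```python
PASS_THRESHOLD = 4  # from the page: a criterion passes at a judge score of 4 or higher

# Rubric criteria as listed on the page.
CRITERIA = (
    "Evidence Traceability",
    "Control Narrative Quality",
    "Risk Exposure Detection",
)


def grade(scores: dict) -> dict:
    """Map each rubric criterion to True (pass) or False (fail).

    `scores` is a hypothetical mapping of criterion name to judge score.
    """
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing judge scores for: {missing}")
    return {c: scores[c] >= PASS_THRESHOLD for c in CRITERIA}
```

With scores of 5, 4, and 3 for the three criteria in order, `grade` marks the first two as passing and Risk Exposure Detection as failing.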

Recommended for

Evidence Traceability · Control Narrative Quality · Risk Exposure Detection

