Puzzle
For Puzzle · Accounting & Finance · BI Analyst

Ingest Painpoint Eval V1

Core Workflow Ingest · Puzzle

Source-traceable ingest painpoint eval for Puzzle.

About Puzzle

Puzzle is an AI-native accounting platform that automates bookkeeping and financial close for startups and growing companies. Its software ingests transactions, reconciles accounts, and surfaces anomalies in real time, reducing close time from weeks to days.

Employees

~60

Industry

Accounting Software

Headquarters

San Francisco, CA

Website

puzzle.io

Sample tests · showing 3 of 12

# · Input · Expected behavior · Check

01
Input: Evaluate workflow signal for Puzzle::close_reconciliation. This appears stable and should not be flagged.
Expected behavior: should_flag_painpoint: false · painpoint_label: none · severity: none
Check: Pass / Fail

02
Input: Evaluate workflow signal for Puzzle::journal_review. Team reports repeated friction tied to classification_confidence_blindspot.
Expected behavior: should_flag_painpoint: true · painpoint_label: classification_confidence_blindspot · severity: high
Check: Pass / Fail

03
Input: Evaluate workflow signal for Puzzle::month_end_close. Team reports repeated friction tied to dependency_visibility_gap.
Expected behavior: should_flag_painpoint: true · painpoint_label: dependency_visibility_gap · severity: medium
Check: Pass / Fail
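As an illustration of how the Check column could be computed, here is a minimal Python sketch. It assumes the expected behavior is a "key: value" string joined by "·", as shown in the sample tests, and that a test passes only when all three fields match exactly. The names `Expected`, `parse_expected`, and `check` are hypothetical and are not part of any Puzzle API; the page does not state how the real check is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expected:
    """The three fields shown in the 'Expected behavior' column."""
    should_flag_painpoint: bool
    painpoint_label: str
    severity: str


def parse_expected(text: str) -> Expected:
    """Parse a 'key: value · key: value' string into an Expected record."""
    fields = {}
    for part in text.split("·"):
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()
    return Expected(
        should_flag_painpoint=fields["should_flag_painpoint"] == "true",
        painpoint_label=fields["painpoint_label"],
        severity=fields["severity"],
    )


def check(expected: Expected, actual: Expected) -> str:
    """Assumed rule: exact match on all three fields, otherwise Fail."""
    return "Pass" if expected == actual else "Fail"


# Sample test 02: the label is right but the severity is miscalibrated.
expected = parse_expected(
    "should_flag_painpoint: true · "
    "painpoint_label: classification_confidence_blindspot · "
    "severity: high"
)
actual = Expected(True, "classification_confidence_blindspot", "medium")
print(check(expected, actual))  # Fail
```

Exact matching is the strictest reading of the table. A real harness might score severity separately from the flag and label.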

How this eval is graded

Evaluate whether the model correctly identifies workflow painpoints, calibrates severity, and proposes actionable fixes with traceable reasoning.

Pass threshold: a criterion passes at a judge score of 4 or higher.

Rubric criteria

  • Ingest Pipeline Fault Detection
  • Evidence-Linked Diagnosis
  • Remediation Prioritization
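
The pass threshold can be sketched in a few lines of Python. This assumes each rubric criterion receives one integer judge score. The page does not state the score scale or how criteria combine into an overall verdict, so the sketch reports per-criterion results only. `PASS_THRESHOLD`, `CRITERIA`, and `grade` are hypothetical names.

```python
PASS_THRESHOLD = 4  # from the page: passes at a judge score of 4 or higher

CRITERIA = (
    "Ingest Pipeline Fault Detection",
    "Evidence-Linked Diagnosis",
    "Remediation Prioritization",
)


def grade(scores: dict[str, int]) -> dict[str, bool]:
    """Map each rubric criterion to True (pass) or False (fail)."""
    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"missing judge scores for: {missing}")
    return {name: scores[name] >= PASS_THRESHOLD for name in CRITERIA}


# Illustrative scores only, not real eval results.
result = grade({
    "Ingest Pipeline Fault Detection": 5,
    "Evidence-Linked Diagnosis": 4,
    "Remediation Prioritization": 3,
})
print(result)
```

With these made-up scores, the first two criteria pass and Remediation Prioritization fails, since 3 is below the threshold.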

Recommended for

Ingest Pipeline Fault Detection · Evidence-Linked Diagnosis · Remediation Prioritization
