Your AI did the math. Would you stake money on it — without checking?
An AI agent will tell you "gross margin improved to 44.8%" or "this invoice totals $48,200" with total confidence. Often it's right. Often enough it isn't — and in an agent workflow something downstream acts on that number: it releases a payment, files a report, triggers a trade. Confidence is not correctness.
"Just have another model check it" doesn't fix this
Asking a second LLM to grade the first one fails for three reasons. It's non-deterministic — the same claim can pass one minute and fail the next. It fails in correlated ways — models tend to be confidently wrong about the same things. And most importantly, a model checking a model is not an independent, accountable attestor. "We asked GPT and it said the number was fine" is not something a counterparty, an auditor, or a finance team will accept.
What NumProof does
NumProof verifies a numeric or financial claim deterministically, using exact rational arithmetic and symbolic math (optionally backed by Lean 4 proofs), with no model in the loop. You get one of three answers:
- VERIFY — the claim holds, exactly.
- REFUTE — it's false, with a concrete counterexample.
- ABSTAIN — it isn't exactly decidable, so NumProof says so instead of guessing.
Same input, same verdict, every time — with cell- and formula-level provenance when you hand it a spreadsheet.
# BASE is assumed to hold the API base URL
curl -s "$BASE/verify" -H "Content-Type: application/json" \
  -d '{"claim":"gross margin is 60% when gross profit is 600 and revenue is 1000"}'
# -> {"verdict":"VERIFY", ...}
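For intuition about what exact rational arithmetic buys you, here is a minimal local sketch using Python's standard-library Fraction. It is not NumProof's implementation: the helper name check_margin_claim and the conditions under which it returns ABSTAIN are assumptions made for illustration.

```python
from fractions import Fraction

def check_margin_claim(gross_profit, revenue, claimed_percent):
    """Hypothetical helper: return (verdict, exact_margin_percent).

    Inputs are decimal strings so that no binary-float rounding enters.
    """
    try:
        gp, rev, claimed = (Fraction(s) for s in (gross_profit, revenue, claimed_percent))
    except (ValueError, ZeroDivisionError):
        return "ABSTAIN", None   # an input is not an exact rational number
    if rev == 0:
        return "ABSTAIN", None   # margin is undefined without revenue
    actual = gp / rev * 100      # exact: Fraction arithmetic never rounds
    return ("VERIFY" if actual == claimed else "REFUTE"), actual

print(check_margin_claim("600", "1000", "60"))   # the claim from the curl example
print(check_margin_claim("600", "1000", "61"))   # a false claim, returned with the exact value
print(check_margin_claim("600", "0", "60"))      # undefined, so no guess is made

# Why exact arithmetic: binary floats cannot represent 0.1, 0.2 or 0.3 exactly.
print(0.1 + 0.2 == 0.3)                                      # False
print(Fraction("0.1") + Fraction("0.2") == Fraction("0.3"))  # True
```

Because every step is exact, the same inputs give the same verdict on every run, which is the property the three-verdict model depends on.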
The part that actually matters: a receipt you can re-check yourself
Every verdict can come with a signed Verification Receipt. A second party runs the open-source re-checker — numproof-verify — which recovers the signer and independently re-derives the verdict using commodity libraries (stdlib Fraction + sympy). No trust in NumProof required.
An agent can compute a number for itself. What it cannot do is issue an independent, signed attestation that a counterparty will accept. That asymmetry — not the arithmetic — is the product.
Where this fits
- Agent-to-agent payments: release the USDC only when the number verifies; withhold on REFUTE.
- Spreadsheet & report audits: footing, cross-foot, balance-sheet ties, margins — with provenance.
- Covenant & ratio checks: DSCR, Debt/EBITDA, current ratio, against a rule pack.
- Inside AI products: verify numbers before your app ships them to a user.
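A covenant check of the kind listed above might look roughly like the sketch below. The rule-pack layout, the field names, and the thresholds are all hypothetical, as is the choice to return ABSTAIN for a non-positive denominator; NumProof's actual rule-pack format is not described on this page.

```python
from fractions import Fraction
import operator

# Hypothetical rule pack: name -> (numerator field, denominator field, comparison, threshold).
RULE_PACK = {
    "DSCR": ("net_operating_income", "debt_service", operator.ge, Fraction("1.25")),
    "Debt/EBITDA": ("total_debt", "ebitda", operator.le, Fraction("3.5")),
    "Current ratio": ("current_assets", "current_liabilities", operator.ge, Fraction("1.2")),
}

def check_covenants(financials, rule_pack=RULE_PACK):
    """Return {covenant name: (verdict, exact ratio or None)}."""
    results = {}
    for name, (num_key, den_key, compare, threshold) in rule_pack.items():
        num = Fraction(financials[num_key])
        den = Fraction(financials[den_key])
        if den <= 0:
            # Assumption: a zero or negative denominator makes the ratio
            # meaningless, so the check abstains rather than guessing.
            results[name] = ("ABSTAIN", None)
            continue
        ratio = num / den
        verdict = "VERIFY" if compare(ratio, threshold) else "REFUTE"
        results[name] = (verdict, ratio)
    return results

# Illustrative figures only, passed as strings to keep them exact.
financials = {
    "net_operating_income": "500000",
    "debt_service": "400000",        # DSCR is exactly 5/4, right on the threshold
    "total_debt": "2400000",
    "ebitda": "600000",              # Debt/EBITDA is exactly 4, above the 3.5 limit
    "current_assets": "300000",
    "current_liabilities": "0",      # ratio undefined
}
results = check_covenants(financials)
for name, (verdict, ratio) in results.items():
    print(name, verdict, ratio)
```

The DSCR case sits exactly on its threshold, which is where float rounding tends to flip a pass into a fail; with exact fractions the comparison is unambiguous.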
Try it
Run the live demo (no key) · Re-check a receipt yourself
API · CLI · MCP server · x402 pay-per-call (USDC on Base). Building a finance or agent workflow? Request a pilot — or email support@numproof.com.