Copilot Promise Ledger for SaaS Support Ops
graduated [TRIANGULATED] · filter 11.0/15 · spread ±3.0 · signals: 2 independent
What is this?
When a SaaS support team deploys Intercom Fin, Decagon, Sierra, or Ada, the vendor's dashboard reports a rosy auto-resolution rate. What the head of support ops cannot independently see is whether the copilot's customer-facing commitments ('we'll fix this by Friday', 'I've escalated this', 'refund processed') actually land in Zendesk reality. The vendor sells the headline number; the ops lead absorbs the SLA breaches and CSAT damage when promises don't hold.

Copilot Promise Ledger is a thin overlay on Zendesk/Intercom/Front. The ops lead flags copilot replies containing commitments (or imports the vendor's reply category); the ledger then waits for the resolution event (ticket close, CSAT score, SLA breach log) and grades each promise against reality. After 4–8 weeks, the lead has a miss-pattern catalog by commitment type, which informs renewal negotiations and prompt-tightening asks.

AE is uniquely suited for three reasons: the 6-pattern autopsy taxonomy clusters recurring failure modes (e.g. temporal blindness on date promises), the sub-24h grading loop matches the 3–14 day ticket resolution window, and ground truth lives in Zendesk, so there is no rubric subjectivity and no need for internal trace access.
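The grading loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a design taken from the source: the `Promise` and `Resolution` records, the `grade` and `miss_catalog` functions, and the commitment-type labels are all hypothetical names, and no real Zendesk, Intercom, or Front API is called. It assumes each flagged promise carries a due time and that the resolution event supplies a close time and an SLA-breach flag.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Iterable, Optional, Tuple


class Grade(Enum):
    KEPT = "kept"
    MISSED = "missed"
    PENDING = "pending"


@dataclass
class Promise:
    """A copilot commitment flagged by the ops lead (hypothetical schema)."""
    ticket_id: str
    commitment_type: str  # e.g. "date_promise", "escalation", "refund"
    due: datetime


@dataclass
class Resolution:
    """Resolution event pulled from the helpdesk (hypothetical schema)."""
    ticket_id: str
    resolved_at: Optional[datetime]  # None while the ticket is still open
    sla_breached: bool = False


def grade(promise: Promise, resolution: Optional[Resolution], now: datetime) -> Grade:
    """Grade one promise against what actually happened on the ticket."""
    if resolution is not None and resolution.resolved_at is not None:
        on_time = resolution.resolved_at <= promise.due and not resolution.sla_breached
        return Grade.KEPT if on_time else Grade.MISSED
    # No resolution yet: the promise is missed once its due time has passed.
    return Grade.MISSED if now > promise.due else Grade.PENDING


def miss_catalog(graded: Iterable[Tuple[Promise, Grade]]) -> Counter:
    """Count missed promises per commitment type."""
    return Counter(p.commitment_type for p, g in graded if g is Grade.MISSED)
```

Keeping `PENDING` as its own grade means an open ticket is not counted as a miss before its due time, which matters when resolution windows run to several days.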
Why did we consider it?
As AI copilots become table stakes in SaaS support, an independent reality-graded ledger of copilot promises is the natural buyer-side instrument — and AE's prediction-grading taxonomy is already the right shape to build it.
What breaks?
- Modern support agents use deterministic tool-calling rather than text-based promises, making the core extraction premise obsolete.
- Verifying real-world resolutions requires deep integrations with external systems (Stripe, Jira), violating the 'NOT multi-tenant SaaS' and solo developer constraints.
- The output is merely a vendor complaint log, lacking the actionable ROI needed to secure £100K-£300K ARR.
What did we learn?
Engine verdict: GATHER_MORE_SIGNAL (WORTH_SKIMMING). ⚠ 4 load-bearing contradiction(s) found. Real trust gap and clean AE-fit, but no observed buyer and episodic-vs-recurring tension unresolved — run Week 1 outbound before building.
Filter scores
Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.
| Axis | What it measures |
|---|---|
| data moat | Does this product accumulate proprietary data that compounds? |
| 10x model test | Does a better model make this more valuable, or redundant? |
| fast feedback loops | Can outputs be graded against reality in <30 days? |
| solo founder feasible | Can a solo operator build and run this without a team? |
| AI providers can't eat it | Do hyperscalers have structural reasons NOT to build this? |
Composite median: 11.0 / 15. Graduation threshold: 9.0. IQR across runs: 3.0.
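As an illustration of how a composite of this kind could be derived, the sketch below sums the five axis scores for each run, then takes the median and interquartile range of the per-run totals and compares the median with the graduation threshold. The aggregation order (median of per-run totals rather than sum of per-axis medians), the quantile method, the function name `composite`, and the run scores are all assumptions for the example; the report does not state how the engine computes these values, and the numbers are not its actual per-run results.

```python
from statistics import median, quantiles

AXES = [
    "data moat",
    "10x model test",
    "fast feedback loops",
    "solo founder feasible",
    "AI providers can't eat it",
]
GRADUATION_THRESHOLD = 9.0  # from the report


def composite(runs):
    """Summarise independent filter runs (assumed aggregation, see lead-in)."""
    for run in runs:
        for axis in AXES:
            if not 0 <= run[axis] <= 3:
                raise ValueError(f"{axis} score out of range: {run[axis]}")
    totals = sorted(sum(run[axis] for axis in AXES) for run in runs)
    # Inclusive quartiles interpolate between observed totals only.
    q1, _, q3 = quantiles(totals, n=4, method="inclusive")
    mid = median(totals)
    return {"median": mid, "iqr": q3 - q1, "graduated": mid >= GRADUATION_THRESHOLD}


# Hypothetical scores for three runs, one value per axis in AXES order.
hypothetical_runs = [
    dict(zip(AXES, [2, 2, 2, 2, 1])),
    dict(zip(AXES, [2, 2, 3, 2, 2])),
    dict(zip(AXES, [3, 2, 3, 2, 2])),
]
result = composite(hypothetical_runs)
```

With only three runs the interquartile range depends heavily on the quantile method chosen, so a spread figure from a different method would not be directly comparable.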
Evidence
Signal A — Primary source
M365 Copilot is designed as a general purpose tool to help workers digest information by summarizing emails, meetings, or documents, create new ...
Signal D — Demand proxy
Found: yes.
Summary: Hacker News discussion surfaces community frustration that Copilot broke audit logs and Microsoft lacks transparency about it, directly evidencing demand for independent copilot accountability tooling in enterprise workflows.
Source: https://news.ycombinator.com/item?id=44957454
Reason: HN thread 'Copilot broke audit logs, but Microsoft won't tell customers' shows real practitioner concern about copilot observability and audit-trail integrity, the exact trust gap the Promise Ledger targets. Remaining results (Reddit SaaS feedback, Facebook vibe-coding post) a…
Evaluation history
| When | Stage | Phase |
|---|---|---|
| 2026-05-13 08:34 | deep_council_verdict | graduated |
| 2026-05-13 08:30 | deep_claude_take | graduated |
| 2026-05-13 08:28 | deep_90day_plan | graduated |
| 2026-05-13 08:27 | deep_risk | graduated |
| 2026-05-13 08:26 | deep_distribution | graduated |
| 2026-05-13 08:24 | deep_pricing | graduated |
| 2026-05-13 08:23 | deep_moat | graduated |
| 2026-05-13 08:22 | deep_buyer_sim | graduated |
| 2026-05-13 08:21 | deep_icp | graduated |
| 2026-05-13 08:20 | deep_competitor | graduated |
| 2026-05-13 08:18 | deep_market_reality | graduated |
| 2026-05-13 08:13 | filter_score | scored |
| 2026-05-13 08:07 | filter_score | scored |
| 2026-05-13 07:55 | filter_score | scored |
| 2026-05-13 07:50 | filter_score | scored |
| 2026-05-13 07:42 | filter_score | scored |
| 2026-05-13 07:37 | filter_score | scored |
| 2026-05-13 07:24 | filter_score | scored |
| 2026-05-13 07:18 | filter_score | scored |
| 2026-05-13 07:12 | evidence_search | argument |
| 2026-05-13 01:30 | evidence_search | argument |
| 2026-05-12 23:36 | evidence_search | argument |
| 2026-05-12 21:48 | evidence_search | argument |
| 2026-05-12 20:06 | evidence_search | argument |
| 2026-05-12 18:06 | evidence_search | argument |
| 2026-05-12 16:18 | evidence_search | argument |
| 2026-05-12 14:24 | evidence_search | argument |
| 2026-05-12 12:42 | evidence_search | argument |
| 2026-05-12 10:48 | evidence_search | argument |
| 2026-05-12 09:06 | evidence_search | argument |
| 2026-05-12 07:18 | evidence_search | argument |
| 2026-05-12 05:36 | evidence_search | argument |
| 2026-05-11 19:36 | evidence_search | argument |
| 2026-05-11 18:06 | evidence_search | argument |
| 2026-05-11 16:36 | evidence_search | argument |
| 2026-05-11 15:06 | evidence_search | argument |
| 2026-05-11 13:36 | evidence_search | argument |
| 2026-05-11 06:36 | evidence_search | argument |
| 2026-05-11 05:24 | evidence_search | argument |
| 2026-05-11 04:42 | evidence_search | argument |
| 2026-05-11 04:18 | evidence_search | argument |
| 2026-05-11 03:48 | evidence_search | argument |
| 2026-05-11 03:18 | evidence_search | argument |
| 2026-05-11 02:48 | evidence_search | argument |
| 2026-05-11 01:00 | evidence_search | argument |
| 2026-05-11 00:12 | evidence_search | argument |
| 2026-05-11 00:06 | evidence_search | argument |
| 2026-05-11 00:00 | audience_simulation | argument |
| 2026-05-10 23:54 | red_team_kill | argument |
| 2026-05-10 23:48 | steelman | argument |
| 2026-05-10 23:44 | genesis | argument |