AI-Aided Architecture Decision Ledger for Startup Engineering Leads
Phase: ranked; [TRIANGULATED]; filter 8.0/15; spread ±1.5; signals: 2 independent
What is this?
A lightweight ledger product for startups with 10-50 engineers. Engineers log the AI-suggested technical decisions they adopt (architecture choices, library picks, debugging approaches, code-review fixes) in about 30 seconds via a web form or Slack, and the CTO or Head of Engineering subscribes. Two to six weeks later, AE cross-references each decision against git history, the incident tracker, and PR revert logs, classifies the outcome using the 6-pattern autopsy taxonomy (for example: did the AI suggestion exhibit Fatal Grounding Immunity? Did the engineer launder its confidence?), and produces a calibration profile per engineer and per AI tool. The pain removed: CTOs sense that their teams accept AI recommendations uncritically during high-load weeks, but they have no operationalised counterweight. The 67% self-awareness gap (RAND, March 2026) is direct demand evidence at the consumer level; engineering teams are the AI-heaviest professional segment and show the same dynamic, with the addition of a buyer (the CTO) who is accountable for the downstream blowup. AE's adversarial debate plus autopsy taxonomy uniquely classify WHY a recommendation failed, not just THAT it failed.
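As a rough illustration of the data flow described above, the sketch below models a ledger entry and a calibration profile. Everything named here is an assumption made for illustration: the `LedgerEntry` fields, the outcome labels (`held`, `reverted`, `incident`), and the `calibration_profile` helper are hypothetical, and the sketch does not attempt to reproduce the 6-pattern autopsy taxonomy.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical outcome labels, not the real autopsy taxonomy.
FAILED_OUTCOMES = {"reverted", "incident"}


@dataclass
class LedgerEntry:
    engineer: str
    ai_tool: str
    decision: str
    logged_on: date
    outcome: Optional[str] = None  # filled in at review time, 2-6 weeks later


def calibration_profile(entries, key):
    """Group reviewed entries by `key` ('engineer' or 'ai_tool').

    Returns {group: (reviewed_count, failure_rate)}.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [reviewed, failed]
    for entry in entries:
        if entry.outcome is None:
            continue  # not yet cross-referenced against git or incidents
        bucket = counts[getattr(entry, key)]
        bucket[0] += 1
        bucket[1] += entry.outcome in FAILED_OUTCOMES
    return {g: (n, failed / n) for g, (n, failed) in counts.items()}


# Made-up entries, purely to exercise the helper.
entries = [
    LedgerEntry("alice", "tool_a", "swap ORM", date(2026, 4, 1), "held"),
    LedgerEntry("alice", "tool_b", "retry wrapper", date(2026, 4, 3), "reverted"),
    LedgerEntry("bob", "tool_a", "cache layer", date(2026, 4, 5), "incident"),
    LedgerEntry("bob", "tool_a", "queue choice", date(2026, 4, 20)),  # pending
]
per_tool = calibration_profile(entries, "ai_tool")
per_engineer = calibration_profile(entries, "engineer")
```

Entries without an outcome are skipped, so a profile only reflects decisions that have already been through the delayed review.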
Why did we consider it?
ADRs (architecture decision records) give AE a familiar wrapper, the autopsy taxonomy gives it a defensible moat, and the CTO buyer gives a solo UK operator a clean path to £100-300K in recurring revenue.
What breaks?
- Fatal UX paradox: relies on manual developer logging for micro-decisions, guaranteeing near-zero compliance.
- Abandons the AE's <24h fast feedback loop for a 2-6 week lagging indicator.
- Requires brittle, complex enterprise integrations (Git, Jira, PagerDuty) unsuited for a solo/weekend founder.
- Native AI dev platforms (e.g., CODITECT, Cursor) are already automating ADR generation and telemetry, rendering manual third-party ledgers obsolete.
What did we learn?
Still in evaluation (phase: ranked). No verdict yet.
Filter scores
Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.
| Axis | What it measures |
|---|---|
| data moat | Does this product accumulate proprietary data that compounds? |
| 10x model test | Does a better model make this more valuable, or redundant? |
| fast feedback loops | Can outputs be graded against reality in <30 days? |
| solo founder feasible | Can a solo operator build and run this without a team? |
| AI providers can't eat it | Do hyperscalers have structural reasons NOT to build this? |
Composite median: 8.0 / 15. Graduation threshold: 9.0. IQR across runs: 1.5.
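The arithmetic behind a composite like this can be sketched as follows. The per-run scores below are hypothetical, chosen only so that the totals land on the reported 8.0 median and 1.5 IQR; they are not the actual run data. The sketch also assumes that the composite is the median of per-run totals, that the IQR uses inclusive quartiles, and that graduation means meeting or exceeding the threshold. None of these is stated in the page.

```python
from statistics import median, quantiles

GRADUATION_THRESHOLD = 9.0

# Hypothetical scores for three runs, five axes each (0-3), listed in the
# axis order of the table above. NOT the real run data.
runs = [
    [2, 1, 1, 1, 2],  # total 7
    [2, 2, 1, 1, 2],  # total 8
    [2, 2, 2, 2, 2],  # total 10
]

totals = [sum(scores) for scores in runs]

# Assumption: composite = median of the per-run totals.
composite = median(totals)

# Assumption: IQR from inclusive quartiles (linear interpolation).
q1, _, q3 = quantiles(totals, n=4, method="inclusive")
iqr = q3 - q1

# Assumption: a hypothesis graduates at or above the threshold.
graduates = composite >= GRADUATION_THRESHOLD

print(f"composite={composite:.1f} iqr={iqr:.1f} graduates={graduates}")
```

With only three runs, the IQR depends heavily on the quartile method, so the spread figure is better read as a coarse disagreement indicator than as a precise statistic.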
Evidence
Signal A — Primary source
This position paper argues that AI-assisted software engineering requires explicit mechanisms for tracking the epistemic status and temporal...
Signal D — Demand proxy
Found: yes.
Summary: Multiple HN and Reddit discussions show engineering community concern about uncritical AI adoption, accountability gaps when AI suggestions fail, and skepticism about generative AI delivering on promises, all validating demand for a tool that tracks whether AI-suggested decisions actually worked.
Sources:
- https://news.ycombinator.com/item?id=46605587
- https://news.ycombinator.com/item?id=42269227
- https://www.reddit.com/r/vibecoding/comments/1kprxpl/read_a_software_engineering_blog_if_you_think/
Reason: [20] HN thread 'Generative AI isn't going all that well…
Evaluation history
| When | Stage | Phase |
|---|---|---|
| 2026-05-13 02:54 | filter_score | scored |
| 2026-05-13 02:48 | filter_score | scored |
| 2026-05-13 02:42 | filter_score | scored |
| 2026-05-13 02:37 | evidence_search | argument |
| 2026-05-13 00:18 | evidence_search | argument |
| 2026-05-12 22:24 | evidence_search | argument |
| 2026-05-12 20:42 | evidence_search | argument |
| 2026-05-12 18:48 | evidence_search | argument |
| 2026-05-12 16:54 | evidence_search | argument |
| 2026-05-12 15:06 | evidence_search | argument |
| 2026-05-12 13:18 | evidence_search | argument |
| 2026-05-12 11:24 | evidence_search | argument |
| 2026-05-12 09:42 | evidence_search | argument |
| 2026-05-12 08:00 | evidence_search | argument |
| 2026-05-12 06:12 | evidence_search | argument |
| 2026-05-11 20:24 | evidence_search | argument |
| 2026-05-11 18:42 | evidence_search | argument |
| 2026-05-11 17:12 | evidence_search | argument |
| 2026-05-11 15:42 | evidence_search | argument |
| 2026-05-11 14:18 | evidence_search | argument |
| 2026-05-11 12:48 | evidence_search | argument |
| 2026-05-11 12:18 | evidence_search | argument |
| 2026-05-11 11:48 | evidence_search | argument |
| 2026-05-11 11:24 | evidence_search | argument |
| 2026-05-11 11:12 | evidence_search | argument |
| 2026-05-11 11:00 | evidence_search | argument |
| 2026-05-11 10:48 | evidence_search | argument |
| 2026-05-11 09:54 | evidence_search | argument |
| 2026-05-11 09:48 | evidence_search | argument |
| 2026-05-11 09:42 | evidence_search | argument |
| 2026-05-11 09:36 | audience_simulation | argument |
| 2026-05-11 09:25 | red_team_kill | argument |
| 2026-05-11 09:18 | steelman | argument |
| 2026-05-11 09:15 | genesis | argument |