Architectural Bet Ledger for Engineering Orgs

ranked [TRIANGULATED] filter 9.0/15 spread ±1.5 signals: 2 independent

What is this?

A post-decision outcome ledger for Architecture Decision Records (ADRs) and technical RFCs at 30-150-person engineering orgs. When a team merges an ADR ('we'll stay on Postgres through 50M rows because write QPS will stay under 2k'), the product extracts the stated assumptions, invariants, and falsification triggers from the ADR text itself. 30-90 days later, the engineering lead resolves the ADR in one click against observable signals: did the migration ship, did an incident arise from the named risk, did a falsification trigger fire?

AE grades each ADR's reasoning against its outcome and applies the six-pattern autopsy (including cosmetic confidence, premise-conclusion severing, and epistemological shielding) to the ADR text, which is rich, deliberate, and written for review.

Buyer: VP Engineering / CTO who wants to know which of their org's architectural bets are holding and which assumption classes systematically decay. No per-engineer surveillance, no PR-time friction, no sandbagging incentive: the unit is the collective decision, not the individual. Source code never leaves the customer's environment; the ADR text is the artifact AE was built to autopsy.
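
As a rough illustration of what one ledger entry might hold, the sketch below models the Postgres example as a record with assumptions, falsification triggers, and a review date. The class name, field names, and the ADR id are hypothetical; nothing here is AE's actual schema, and the 90-day default simply takes the upper end of the 30-90 day window described above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Outcome(Enum):
    UNRESOLVED = "unresolved"
    HOLDING = "holding"
    FALSIFIED = "falsified"


@dataclass
class Bet:
    """One architectural bet extracted from a merged ADR (hypothetical schema)."""
    adr_id: str
    decision: str
    assumptions: list[str]
    falsification_triggers: list[str]
    merged_on: date
    review_after_days: int = 90  # assumption: upper end of the 30-90 day window
    outcome: Outcome = Outcome.UNRESOLVED

    def due_on(self) -> date:
        """Date on which the engineering lead is asked to resolve the bet."""
        return self.merged_on + timedelta(days=self.review_after_days)

    def resolve(self, trigger_fired: bool) -> None:
        """One-click resolution: a fired trigger falsifies the bet."""
        self.outcome = Outcome.FALSIFIED if trigger_fired else Outcome.HOLDING


bet = Bet(
    adr_id="ADR-0001",  # hypothetical identifier
    decision="Stay on Postgres through 50M rows",
    assumptions=["Write QPS stays under 2k"],
    falsification_triggers=["Sustained write QPS at or above 2k"],
    merged_on=date(2026, 1, 15),  # illustrative date
)
print(bet.due_on())
```

Reducing resolution to a single boolean is a deliberate simplification; the real product would presumably weigh several observable signals per bet.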

Why did we consider it?

ADRs looked like an ideal substrate for AE's autopsy engine: deliberate text, collective decisions, and fast resolution, sold to a buyer who currently has zero visibility into the org's architectural batting average.

What breaks?

Temporal Mismatch: Architectural bets take years to resolve, far beyond the 30-90 day resolution window, which completely neutralizes AE's fast-feedback-loop advantage.
Volume Starvation: A 30-150 person org does not generate enough ADRs to provide statistically significant behavioral data for the VP Eng.
Attribution Decay: Architecture rots due to shifting business requirements (entropy), making binary grading of the original ADR text practically impossible.
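
To make the Volume Starvation concern concrete, here is a minimal sketch that computes a Wilson score interval for an org's "bets holding" rate. The counts (12 resolved ADRs in a year, 8 holding) are invented for illustration; the source gives no ADR volumes.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical counts: 12 ADRs resolved in a year, 8 judged "holding".
low, high = wilson_interval(8, 12)
print(f"holding rate 95% interval: {low:.2f} to {high:.2f}")
```

With these assumed counts the interval runs from roughly 0.39 to 0.86, too wide to separate a strong architectural track record from a coin flip, which is the substance of the objection.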

What did we learn?

Still in evaluation (phase: ranked). No verdict yet.

Filter scores

Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.

Axis	What it measures
data moat	Does this product accumulate proprietary data that compounds?
10x model test	Does a better model make this more valuable, or redundant?
fast feedback loops	Can outputs be graded against reality in <30 days?
solo founder feasible	Can a solo operator build and run this without a team?
AI providers can't eat it	Do hyperscalers have structural reasons NOT to build this?

Composite median: 9.0 / 15. Graduation threshold: 9.0. IQR across runs: 1.5.
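
One plausible reading of how these figures are derived is sketched below: sum the five axis scores per run, then take the median and interquartile range of the three run totals. The per-axis scores are invented (the source does not list them) and were chosen only so the totals come out to 8, 9, and 11. The aggregation order, the inclusive quartile method, and the >= comparison against the threshold are all assumptions.

```python
from statistics import median, quantiles

GRADUATION_THRESHOLD = 9.0  # from the text above

# Hypothetical scores: one list per run, one 0-3 score per axis,
# in the same order as the axis table. Not the actual run data.
runs = [
    [2, 1, 2, 2, 1],
    [2, 2, 1, 2, 2],
    [3, 2, 2, 2, 2],
]


def composite(run: list[int]) -> int:
    """Sum of the five axis scores for a single run (maximum 15)."""
    if len(run) != 5 or any(not 0 <= score <= 3 for score in run):
        raise ValueError("expected five axis scores, each between 0 and 3")
    return sum(run)


totals = [composite(run) for run in runs]
composite_median = median(totals)

# Quartiles across the three run totals; "inclusive" interpolates
# between the observed totals rather than extrapolating past them.
q1, _, q3 = quantiles(totals, n=4, method="inclusive")
iqr = q3 - q1

graduates = composite_median >= GRADUATION_THRESHOLD  # assumed comparison
print(composite_median, iqr, graduates)
```

An alternative reading takes the median per axis first and sums those medians; the two orders can give different composites, so the choice would need confirming against the scoring pipeline.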

Evidence

Signal B — Competitor with documented gap

https://github.com/architecture-decision-record/architecture-decision-record

This OSS project provides ADR templates, formats, and examples for recording architectural decisions, but has no post-decision outcome tracking, no assumption/invariant extraction from ADR text, no falsification-trigger monitoring, and no reasoning-quality grading or autopsy patterns. It solves the 'record the decision' step but not the 'evaluate the bet' step that the hypothesis targets.

Signal D — Demand proxy

Found: yes

Summary: Active Hacker News discussions show engineering practitioners debating accountability and consequences for architectural decisions, indicating real pain around the lack of systematic outcome tracking for engineering bets.

Sources:
https://news.ycombinator.com/item?id=47861731
https://news.ycombinator.com/item?id=47862687

Reason: Result 21 discusses Uber's $8M ledger mistake and organizational accountability; result 23 directly debates whether engineers should face consequences for bad architectural decisions, with comments noting that engineering orgs ship fl…

Evaluation history

When	Stage	Phase
2026-05-14 08:54	evidence_search	ranked
2026-05-14 08:30	evidence_search	ranked
2026-05-14 08:06	evidence_search	ranked
2026-05-14 07:36	evidence_search	ranked
2026-05-14 07:01	evidence_search	ranked
2026-05-14 05:24	evidence_search	ranked
2026-05-14 05:00	evidence_search	ranked
2026-05-14 04:48	evidence_search	ranked
2026-05-14 04:42	evidence_search	ranked
2026-05-14 04:36	evidence_search	ranked
2026-05-14 04:24	evidence_search	ranked
2026-05-14 04:19	evidence_search	ranked
2026-05-14 04:12	evidence_search	ranked
2026-05-14 04:06	evidence_search	ranked
2026-05-14 03:54	evidence_search	ranked
2026-05-14 03:48	evidence_search	ranked
2026-05-14 03:42	evidence_search	ranked
2026-05-14 03:36	evidence_search	ranked
2026-05-14 03:30	evidence_search	ranked
2026-05-14 03:24	evidence_search	ranked
2026-05-14 03:18	evidence_search	ranked
2026-05-14 03:12	evidence_search	ranked
2026-05-14 03:06	filter_score	scored
2026-05-14 03:00	filter_score	scored
2026-05-14 02:54	filter_score	scored
2026-05-14 02:48	evidence_search	argument
2026-05-14 02:42	audience_simulation	argument
2026-05-14 02:36	red_team_kill	argument
2026-05-14 02:24	steelman	argument
2026-05-14 02:20	genesis	argument