Architectural Bet Ledger for Engineering Orgs

ranked [TRIANGULATED] | filter 9.0/15 | spread ±1.5 | signals: 2 independent
What is this?
A post-decision outcome ledger for Architecture Decision Records (ADRs) and technical RFCs at engineering orgs of 30-150 people. When a team merges an ADR (for example, 'we'll stay on Postgres through 50M rows because write QPS will stay under 2k'), the product extracts the stated assumptions, invariants, and falsification triggers from the ADR text itself. 30-90 days later, the engineering lead resolves each ADR in one click against observable signals: did the migration ship, did the named risk cause an incident, did a falsification trigger fire? AE then grades each ADR's reasoning against the outcome and applies its 6-pattern autopsy (patterns include cosmetic confidence, premise-conclusion severing, and epistemological shielding) to the ADR text, which is rich, deliberate, and written for review.
Buyer: a VP Engineering or CTO who wants to know which of the org's architectural bets are holding and which classes of assumption systematically decay. There is no per-engineer surveillance, no PR-time friction, and no sandbagging incentive, because the unit is the collective decision, not the individual. Code never leaves the org; the ADR text is the artifact AE was built to autopsy.
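A minimal sketch of what one ledger entry could look like, assuming a simple in-memory model. Every name here (`LedgerEntry`, `Assumption`, `Outcome`, `review_after_days`) and the merge date are hypothetical illustrations, not the product's actual schema; only the Postgres example comes from the description above.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from enum import Enum
from typing import List, Optional


class Outcome(Enum):
    UNRESOLVED = "unresolved"
    HOLDING = "holding"
    FALSIFIED = "falsified"


@dataclass
class Assumption:
    """One stated assumption and the signal that would falsify it."""
    text: str
    falsification_trigger: str
    outcome: Outcome = Outcome.UNRESOLVED


@dataclass
class LedgerEntry:
    """One merged ADR, tracked as a collective bet (hypothetical schema)."""
    adr_title: str
    merged_on: date
    review_after_days: int = 90  # the description gives a 30-90 day window
    assumptions: List[Assumption] = field(default_factory=list)

    def review_due(self) -> date:
        """Date on which the engineering lead is asked to resolve the entry."""
        return self.merged_on + timedelta(days=self.review_after_days)

    def resolve(self, index: int, outcome: Outcome) -> None:
        """Record the lead's one-click resolution for a single assumption."""
        self.assumptions[index].outcome = outcome

    def holding_rate(self) -> Optional[float]:
        """Share of resolved assumptions still holding; None if none resolved."""
        resolved = [a for a in self.assumptions if a.outcome is not Outcome.UNRESOLVED]
        if not resolved:
            return None
        return sum(a.outcome is Outcome.HOLDING for a in resolved) / len(resolved)


entry = LedgerEntry(
    adr_title="Stay on Postgres through 50M rows",
    merged_on=date(2026, 1, 1),  # hypothetical merge date
    assumptions=[
        Assumption(
            text="Write QPS will stay under 2k",
            falsification_trigger="Sustained write QPS at or above 2k",
        )
    ],
)
print(entry.review_due())
```

Keeping the outcome on the assumption rather than on an author reflects the stated design constraint that the unit of grading is the collective decision.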
Why did we consider it?
ADRs are the perfect substrate for AE's autopsy engine: deliberate text, collective decisions, fast resolution, and a buyer who currently has zero visibility into the org's architectural batting average.
What breaks?
  • Temporal Mismatch: Architectural bets take years to resolve, completely neutralizing AE's fast feedback loop advantage.
  • Volume Starvation: A 30-150 person org does not generate enough ADRs to provide statistically significant behavioral data for the VP Eng.
  • Attribution Decay: Architecture rots due to shifting business requirements (entropy), making binary grading of the original ADR text practically impossible.
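The Volume Starvation objection can be made concrete with a rough confidence-interval calculation. The sketch below uses the standard Wilson score interval for a proportion; the ADR counts (12 and 120 resolved ADRs) are hypothetical inputs chosen only to show how interval width depends on volume, not figures from any real org.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Hypothetical volumes: same observed holding rate, ten times the ADR count.
for held, resolved in [(8, 12), (80, 120)]:
    low, high = wilson_interval(held, resolved)
    print(f"{held}/{resolved} holding: interval {low:.2f} to {high:.2f}")
```

With the same observed holding rate, the interval for the smaller sample is much wider, which is the statistical form of the concern raised in the bullet above.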
What did we learn?
Still in evaluation (phase: ranked). No verdict yet.

Filter scores

Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.

Axis                       What it measures
data moat                  Does this product accumulate proprietary data that compounds?
10x model test             Does a better model make this more valuable, or redundant?
fast feedback loops        Can outputs be graded against reality in <30 days?
solo founder feasible      Can a solo operator build and run this without a team?
AI providers can't eat it  Do hyperscalers have structural reasons NOT to build this?
Composite median: 9.0 / 15. Graduation threshold: 9.0. IQR across runs: 1.5.
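One way the composite figures could be derived, as a sketch only. The page does not show per-axis scores or state the aggregation method, so two things here are assumptions: the per-run scores below are invented purely for illustration, and the composite is assumed to be the median of per-run totals with the IQR taken over those same totals (using the inclusive quantile method). Treating "at or above the threshold" as graduating is also an assumption.

```python
from statistics import median, quantiles

AXES = [
    "data moat",
    "10x model test",
    "fast feedback loops",
    "solo founder feasible",
    "AI providers can't eat it",
]
GRADUATION_THRESHOLD = 9.0  # from the page

# HYPOTHETICAL per-axis scores (0-3 each), one list per run, in AXES order.
# These are NOT the real scores; they are chosen only to illustrate the math.
runs = {
    "run_a": [2, 1, 1, 2, 2],
    "run_b": [2, 2, 1, 2, 2],
    "run_c": [3, 2, 1, 3, 2],
}


def composite_summary(runs):
    """Median and IQR of per-run totals (assumed aggregation method)."""
    for scores in runs.values():
        if len(scores) != len(AXES) or any(not 0 <= s <= 3 for s in scores):
            raise ValueError("each run needs one 0-3 score per axis")
    totals = sorted(sum(scores) for scores in runs.values())
    q1, _, q3 = quantiles(totals, n=4, method="inclusive")
    return {"totals": totals, "median": float(median(totals)), "iqr": q3 - q1}


summary = composite_summary(runs)
graduates = summary["median"] >= GRADUATION_THRESHOLD  # assumed: threshold is inclusive
print(summary, graduates)
```

If the real pipeline instead sums per-axis medians, the composite can differ from the median of run totals, so the aggregation order is worth stating alongside the score.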

Evidence

Signal B — Competitor with documented gap

This OSS project provides ADR templates, formats, and examples for recording architectural decisions, but has no post-decision outcome tracking, no assumption/invariant extraction from ADR text, no falsification-trigger monitoring, and no reasoning-quality grading or autopsy patterns. It solves the 'record the decision' step but not the 'evaluate the bet' step that the hypothesis targets.

Signal D — Demand proxy

Found: yes
Summary: Active Hacker News discussions show engineering practitioners debating accountability and consequences for architectural decisions, indicating real pain around the lack of systematic outcome tracking for engineering bets.
Sources:
  • https://news.ycombinator.com/item?id=47861731
  • https://news.ycombinator.com/item?id=47862687
Reason: Result 21 discusses Uber's $8M ledger mistake and organizational accountability; result 23 directly debates whether engineers should face consequences for bad architectural decisions, with comments noting that engineering orgs ship fl…

Evaluation history

When              Stage                Phase
2026-05-14 08:54  evidence_search      ranked
2026-05-14 08:30  evidence_search      ranked
2026-05-14 08:06  evidence_search      ranked
2026-05-14 07:36  evidence_search      ranked
2026-05-14 07:01  evidence_search      ranked
2026-05-14 05:24  evidence_search      ranked
2026-05-14 05:00  evidence_search      ranked
2026-05-14 04:48  evidence_search      ranked
2026-05-14 04:42  evidence_search      ranked
2026-05-14 04:36  evidence_search      ranked
2026-05-14 04:24  evidence_search      ranked
2026-05-14 04:19  evidence_search      ranked
2026-05-14 04:12  evidence_search      ranked
2026-05-14 04:06  evidence_search      ranked
2026-05-14 03:54  evidence_search      ranked
2026-05-14 03:48  evidence_search      ranked
2026-05-14 03:42  evidence_search      ranked
2026-05-14 03:36  evidence_search      ranked
2026-05-14 03:30  evidence_search      ranked
2026-05-14 03:24  evidence_search      ranked
2026-05-14 03:18  evidence_search      ranked
2026-05-14 03:12  evidence_search      ranked
2026-05-14 03:06  filter_score         scored
2026-05-14 03:00  filter_score         scored
2026-05-14 02:54  filter_score         scored
2026-05-14 02:48  evidence_search      argument
2026-05-14 02:42  audience_simulation  argument
2026-05-14 02:36  red_team_kill        argument
2026-05-14 02:24  steelman             argument
2026-05-14 02:20  genesis              argument