
Estimate Calibration Ledger for Startup CTOs (Jira/Linear-native)

ranked [TRIANGULATED] filter 9.5/15 spread ±2.5 signals: 2 independent
What is this?
A per-engineering-manager calibration ledger for the CTO of a 10-30 engineer startup. AE reads sprint-commit tickets via a Jira/Linear webhook (the feature, estimate, and rationale text are already present in the ticket description) and at sprint close reads ship/slip ground truth from the same source. No new form is required. The six-pattern taxonomy is repositioned as a rhetorical-pattern detector, not an engineering predictor: AE tags which rationales exhibit patterns such as Cosmetic Confidence, Premise-Conclusion Severing, and Temporal Blindness, then accumulates outcomes per pattern per manager. After 4-6 sprints the CTO receives a quarterly ledger: 'Manager A's rationales with unnamed blockers shipped 28% of the time; Manager B's named-dependency rationales shipped 71%.' The product is the longitudinal correlation, not the per-sprint critique. The CTO uses it in 1:1s, capacity planning, and board roadmap defense. The per-sprint challenge sheet becomes a free byproduct; the ledger is what £200-400/mo buys.
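The accumulation step described above can be sketched as a simple tally. This is a minimal illustration, not AE's implementation: the `TicketOutcome` record, the `build_ledger` function, and the sample rows are all hypothetical, and the pattern tag is assumed to have been assigned already by the detector.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TicketOutcome:
    """One sprint-commit ticket after sprint close (hypothetical record shape)."""
    manager: str
    pattern: str   # rhetorical pattern tagged on the rationale text
    shipped: bool  # ship/slip ground truth read at sprint close


def build_ledger(outcomes):
    """Return {(manager, pattern): (shipped, total, ship_rate)}."""
    counts = defaultdict(lambda: [0, 0])  # [shipped count, total count]
    for o in outcomes:
        cell = counts[(o.manager, o.pattern)]
        cell[0] += int(o.shipped)
        cell[1] += 1
    return {
        key: (shipped, total, shipped / total)
        for key, (shipped, total) in counts.items()
    }


# Made-up sample data, for illustration only.
sample = [
    TicketOutcome("Manager A", "Temporal Blindness", False),
    TicketOutcome("Manager A", "Temporal Blindness", False),
    TicketOutcome("Manager A", "Temporal Blindness", True),
    TicketOutcome("Manager A", "Temporal Blindness", False),
    TicketOutcome("Manager B", "Cosmetic Confidence", True),
    TicketOutcome("Manager B", "Cosmetic Confidence", True),
    TicketOutcome("Manager B", "Cosmetic Confidence", False),
]

ledger = build_ledger(sample)
for (manager, pattern), (shipped, total, rate) in sorted(ledger.items()):
    print(f"{manager} / {pattern}: {shipped}/{total} shipped ({rate:.0%})")
```

With only a handful of tickets per cell the rates are noisy, which is consistent with the ledger being delivered after 4-6 sprints rather than per sprint.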
Why did we consider it?
AE's graded-prediction backbone plus zero-friction Jira/Linear ingest produces a per-manager calibration ledger CTOs will pay £200-400/mo for because it compounds over sprints and defends roadmap decisions to boards.
What breaks?
  • Data Starvation: Startup Jira/Linear tickets lack the rich rationale text required for rhetorical pattern analysis; context lives in Slack/GitHub.
  • Hawthorne Effect: EMs will game the system by writing defensive, sanitized ticket descriptions once they know they are being algorithmically graded.
  • Go-to-Market Mismatch: Selling a £4,800/yr engineering surveillance tool requires high-touch sales and security reviews, incompatible with an introverted evening/weekend founder.
What did we learn?
Still in evaluation (phase: ranked). No verdict yet.

Filter scores

Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.

Axis | What it measures
data moat | Does this product accumulate proprietary data that compounds?
10x model test | Does a better model make this more valuable, or redundant?
fast feedback loops | Can outputs be graded against reality in <30 days?
solo founder feasible | Can a solo operator build and run this without a team?
AI providers can't eat it | Do hyperscalers have structural reasons NOT to build this?
Composite median: 9.5 / 15. Graduation threshold: 9.0. IQR across runs: 2.5.
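
As a rough illustration of how a composite median, spread, and graduation check fit together, here is a minimal sketch. The per-run composite totals are made up (the page does not list per-run or per-axis scores), the quartile method is whatever `statistics.quantiles` uses by default, and treating the threshold as inclusive is an assumption.

```python
from statistics import median, quantiles

GRADUATION_THRESHOLD = 9.0  # threshold stated on this page
MAX_COMPOSITE = 15          # five axes, each scored 0-3

# Hypothetical per-run composite totals; these are NOT this hypothesis's scores.
run_composites = [8.0, 10.0, 11.0]

composite_median = median(run_composites)

# Quartiles via the standard library's default ("exclusive") method;
# the method actually used by the pipeline is not stated.
q1, _, q3 = quantiles(run_composites, n=4)
iqr = q3 - q1

# Assumption: a composite equal to the threshold counts as graduating.
graduates = composite_median >= GRADUATION_THRESHOLD

print(f"Composite median: {composite_median} / {MAX_COMPOSITE}")
print(f"IQR across runs: {iqr}")
print(f"Graduates: {graduates}")
```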

Evidence

Signal B — Competitor with documented gap

LinearB turns Jira metrics into developer productivity and resource allocation insights but focuses on aggregate engineering metrics (cycle time, throughput). It does not perform rhetorical-pattern detection on estimate rationale text, does not build per-manager estimate-to-outcome calibration over sprints, and does not produce a longitudinal correlation ledger linking rationale patterns to ship/slip rates.

Signal D — Demand proxy

Found: yes

Summary: Community discussions show active engagement with Jira/Linear estimation tooling among the target persona. A Reddit thread compares Linear vs Jira for small engineering teams, and an Atlassian Community post explicitly asks how to make Jira estimates more accurate with real-time cost tracking.

Sources:
  • https://www.reddit.com/r/ProductManagement/comments/1neyq6j/been_using_linear_for_6_months_vs_jira_heres_my/
  • https://community.atlassian.com/forums/App-Central-articles/How-to-Estimate-in-Jira-Accurate-Predictions-and-Real-Time-Cost/ba-p/2797382

Reason: Two com…

Evaluation history

When | Stage | Phase
2026-05-14 08:49 | evidence_search | ranked
2026-05-14 08:24 | evidence_search | ranked
2026-05-14 07:54 | evidence_search | ranked
2026-05-14 07:24 | evidence_search | ranked
2026-05-14 05:54 | evidence_search | ranked
2026-05-14 05:18 | evidence_search | ranked
2026-05-14 04:54 | evidence_search | ranked
2026-05-14 01:54 | evidence_search | ranked
2026-05-14 01:36 | evidence_search | ranked
2026-05-14 01:12 | evidence_search | ranked
2026-05-13 22:07 | evidence_search | ranked
2026-05-13 21:06 | evidence_search | ranked
2026-05-13 16:06 | evidence_search | ranked
2026-05-13 15:54 | evidence_search | ranked
2026-05-13 15:48 | evidence_search | ranked
2026-05-13 15:42 | evidence_search | ranked
2026-05-13 15:30 | evidence_search | ranked
2026-05-13 15:24 | evidence_search | ranked
2026-05-13 15:18 | evidence_search | ranked
2026-05-13 15:12 | evidence_search | ranked
2026-05-13 15:06 | evidence_search | ranked
2026-05-13 15:01 | evidence_search | ranked
2026-05-13 06:42 | filter_score | scored
2026-05-13 06:36 | filter_score | scored
2026-05-13 06:24 | filter_score | scored
2026-05-13 06:18 | evidence_search | argument
2026-05-13 06:12 | audience_simulation | argument
2026-05-13 06:06 | red_team_kill | argument
2026-05-13 06:00 | steelman | argument
2026-05-13 05:55 | genesis | argument