Add validation_week_test structured field to genesis.js PROPOSER_SYSTEM output schema
council rejected | PROMPT | reversible: simple | 3h | proposed 19 May 2026
What is the proposed change?
In the genesis.js PROPOSER_SYSTEM JSON output schema, add required field `validation_week_test` (max 35 words). Instruction: 'State the single cheapest observable action in the first 7 days that could kill this hypothesis WITHOUT building the product. Name the specific platform, ICP job title, and minimum signal count. Examples: "Cold DM 15 LinkedIn Heads of SaaS Support Ops UK; need 2 replies naming the specific missed-commitment pain" or "Search Support Driven Slack archive for threads about [named pain]; need 5+ unanswered posts from last 60 days." If the minimum viable test requires the product to exist first, write "requires_product_first: [reason in ≤10 words]". Exclusion list: phrases like "reach out to potential customers," "build a landing page," "run an ad," or "conduct market research" are insufficient — rewrite as named, observable, pre-product checks with a specific platform and quantity threshold.'
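The constraints in this instruction could also be checked mechanically after generation. The following is a minimal sketch, not code from genesis.js: `validateWeekTest`, `EXCLUDED_PHRASES`, and the digit-based quantity check are hypothetical names and heuristics chosen for illustration. It assumes the field arrives as a plain string and treats any digit as evidence of a quantity threshold, which is a crude proxy.

```javascript
// Hypothetical post-generation check for the proposed `validation_week_test` field.
// Nothing here is taken from genesis.js; names and heuristics are illustrative.

const MAX_WORDS = 35; // word cap stated in the proposal
const REASON_MAX_WORDS = 10; // cap on the requires_product_first reason
const PRODUCT_FIRST_PREFIX = "requires_product_first:";

// Phrases the proposal lists as insufficient on their own.
const EXCLUDED_PHRASES = [
  "reach out to potential customers",
  "build a landing page",
  "run an ad",
  "conduct market research",
];

function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function validateWeekTest(value) {
  if (typeof value !== "string" || value.trim() === "") {
    return { ok: false, errors: ["validation_week_test is required"] };
  }
  const text = value.trim();
  const errors = [];

  // Escape hatch: the test cannot run before the product exists.
  if (text.startsWith(PRODUCT_FIRST_PREFIX)) {
    const reasonWords = countWords(text.slice(PRODUCT_FIRST_PREFIX.length));
    if (reasonWords === 0) errors.push("requires_product_first needs a reason");
    if (reasonWords > REASON_MAX_WORDS) {
      errors.push(`reason exceeds ${REASON_MAX_WORDS} words`);
    }
    return { ok: errors.length === 0, errors };
  }

  if (countWords(text) > MAX_WORDS) {
    errors.push(`exceeds ${MAX_WORDS} words`);
  }
  for (const phrase of EXCLUDED_PHRASES) {
    // Whole-phrase, case-insensitive match.
    if (new RegExp(`\\b${phrase}\\b`, "i").test(text)) {
      errors.push(`generic phrase: "${phrase}"`);
    }
  }
  // Assumption: a quantity threshold always contains at least one digit.
  if (!/\d/.test(text)) {
    errors.push("no quantity threshold found");
  }
  return { ok: errors.length === 0, errors };
}

console.log(
  validateWeekTest(
    "Cold DM 15 LinkedIn Heads of SaaS Support Ops UK; need 2 replies naming the specific missed-commitment pain"
  )
);
```

A check like this cannot tell whether a named platform is actually specific; that judgment would still sit with the council or a human reviewer.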
Target files
hypothesis_engine/moves/genesis.js
Expected effect
The 5 recent council verdicts (5d7cca, 26fc18, cc72cd, 90778c, c27754) each independently invented ad-hoc week-1 tests. Post-change, genesis outputs carry those tests, so council can evaluate their credibility rather than invent them. 10 consecutive genesis outputs should contain specific named communities or search queries. The 2 non-convergent verdicts (7199a9, 2ca131) likely lacked pre-specified validation paths — structured validation_week_test reduces the information gap that causes non-convergence.
Falsifier — what would prove this wrong?
Review 10 consecutive genesis outputs post-change. The change holds only if ≥8 name at least one specific platform (named subreddit, Slack workspace, LinkedIn search filter, or content community), not just 'relevant communities' or generic outreach, and ≥5 include a specific quantity threshold (e.g. '3 positive replies,' '5 threads found'). Falling short of either bar counts against the proposal. Separately, if ≥5/10 produce only 'requires_product_first' or generic descriptions, the exclusion list needs stronger few-shot examples injected into the prompt.
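The thresholds in this falsifier can be tallied with a few lines of code once a reviewer has labeled each output. This is a sketch under stated assumptions: the record shape (`namesPlatform`, `hasThreshold`, `productFirstOrGeneric`) and the function name `evaluateBatch` are hypothetical, and the boolean labels are assumed to come from manual review rather than automatic detection.

```javascript
// Hypothetical tally of the falsifier thresholds over 10 reviewed genesis outputs.
// Each review is a hand-labeled record; the field names are illustrative only.

const BATCH_SIZE = 10;
const MIN_PLATFORM = 8; // outputs that must name a specific platform
const MIN_THRESHOLD = 5; // outputs that must include a quantity threshold
const WEAK_TRIGGER = 5; // weak outputs that trigger stronger few-shot examples

function evaluateBatch(reviews) {
  if (reviews.length !== BATCH_SIZE) {
    throw new Error(`expected exactly ${BATCH_SIZE} reviewed outputs`);
  }
  // Count records where the given label is strictly true.
  const count = (key) => reviews.filter((r) => r[key] === true).length;

  const platform = count("namesPlatform");
  const threshold = count("hasThreshold");
  const weak = count("productFirstOrGeneric");

  return {
    platform,
    threshold,
    weak,
    platformPass: platform >= MIN_PLATFORM,
    thresholdPass: threshold >= MIN_THRESHOLD,
    needsStrongerExamples: weak >= WEAK_TRIGGER,
  };
}

// Illustrative batch: 8 outputs name a platform, 6 carry a threshold, 2 are weak.
const sampleReviews = Array.from({ length: BATCH_SIZE }, (_, i) => ({
  namesPlatform: i < 8,
  hasThreshold: i < 6,
  productFirstOrGeneric: i >= 8,
}));

console.log(evaluateBatch(sampleReviews));
```

Keeping the three counts in the result, alongside the pass flags, lets a reviewer see how far a batch landed from each bar.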
Evidence that triggered the proposal
- Corpus E, recent council verdicts: 5d7cca 'run weekend transferability test before building', 26fc18 'run Week 1 outbound before building', cc72cd '7-day signal check before commit', 47730e '7-day artifact-upload test must clear' — council reinvents the same gate structure ad hoc in every GATHER verdict
- Corpus D, S157_NBJ_DESCRIBABILITY_TEST.md: 'Q4 (accept/reject criteria for ONE canonical case) — Pass criterion: Buyer can state for this specific input, correct output is X and not Y without invoking model confidence.' The same structural absence (no pre-specified test) exists in genesis output for validation path
- Corpus D, META_ENGINE_S158_RED_TEAM_BRIEF.md: council burns $0.12–0.18/run; GATHER_MORE_SIGNAL verdicts end without machine-readable test specification, requiring Commander to extract the test from prose
Proposer self-score
The proposer scored its own draft on these axes (0-3 each) before submitting.
| Axis | Score |
|---|---|
| specificity | 3 |
| falsifier | 2 |
| solo feasible | 3 |
| blast radius | 3 |
| composability | 3 |
| reversibility | 3 |
Disposition
Rejected at the council verdict. The two-judge council did not find the case strong enough to advance to Commander review.
Evaluation history
| When | Move |
|---|---|
| 2026-05-23 04:32 | meta_council_verdict |
| 2026-05-23 04:17 | meta_argument |
| 2026-05-19 12:24 | red_team_kill |
| 2026-05-19 10:48 | steelman |
| 2026-05-19 10:05 | meta_filter_score |
| 2026-05-19 09:53 | meta_genesis |