Add validation_week_test structured field to genesis.js PROPOSER_SYSTEM output schema

council rejected PROMPT reversible: simple 3h proposed 19 May 2026

What is the proposed change?

In the genesis.js PROPOSER_SYSTEM JSON output schema, add required field `validation_week_test` (max 35 words). Instruction: 'State the single cheapest observable action in the first 7 days that could kill this hypothesis WITHOUT building the product. Name the specific platform, ICP job title, and minimum signal count. Examples: "Cold DM 15 LinkedIn Heads of SaaS Support Ops UK; need 2 replies naming the specific missed-commitment pain" or "Search Support Driven Slack archive for threads about [named pain]; need 5+ unanswered posts from last 60 days." If the minimum viable test requires the product to exist first, write "requires_product_first: [reason in ≤10 words]". Exclusion list: phrases like "reach out to potential customers," "build a landing page," "run an ad," or "conduct market research" are insufficient — rewrite as named, observable, pre-product checks with a specific platform and quantity threshold.'

Target files

hypothesis_engine/moves/genesis.js

Expected effect

The 5 recent council verdicts (5d7cca, 26fc18, cc72cd, 90778c, c27754) each independently invented ad-hoc week-1 tests. Post-change, genesis outputs carry those tests, so council can evaluate their credibility rather than invent them. 10 consecutive genesis outputs should contain specific named communities or search queries. The 2 non-convergent verdicts (7199a9, 2ca131) likely lacked pre-specified validation paths — structured validation_week_test reduces the information gap that causes non-convergence.

Falsifier — what would prove this wrong?

Review 10 consecutive genesis outputs post-change. ≥8 must name at least one specific platform (named subreddit, Slack workspace, LinkedIn search filter, or content community) — not just 'relevant communities' or generic outreach. ≥5 must include a specific quantity threshold (e.g. '3 positive replies,' '5 threads found'). If ≥5/10 produce only 'requires_product_first' or generic descriptions, the exclusion list needs stronger few-shot examples injected into the prompt.

Evidence that triggered the proposal

Corpus E, recent council verdicts: 5d7cca 'run weekend transferability test before building', 26fc18 'run Week 1 outbound before building', cc72cd '7-day signal check before commit', 47730e '7-day artifact-upload test must clear' — council reinvents the same gate structure ad hoc in every GATHER verdict
Corpus D, S157_NBJ_DESCRIBABILITY_TEST.md: 'Q4 (accept/reject criteria for ONE canonical case) — Pass criterion: Buyer can state for this specific input, correct output is X and not Y without invoking model confidence.' The same structural absence (no pre-specified test) exists in genesis output for validation path
Corpus D, META_ENGINE_S158_RED_TEAM_BRIEF.md: council burns $0.12–0.18/run; GATHER_MORE_SIGNAL verdicts end without machine-readable test specification, requiring Commander to extract the test from prose

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

Axis	Score
specificity	3
falsifier	2
solo feasible	3
blast radius	3
composability	3
reversibility	3

Disposition

Rejected at the council verdict. The two-judge council did not find the case strong enough to advance to Commander review.

Evaluation history

When	Move
2026-05-23 04:32	meta_council_verdict
2026-05-23 04:17	meta_argument
2026-05-19 12:24	red_team_kill
2026-05-19 10:48	steelman
2026-05-19 10:05	meta_filter_score
2026-05-19 09:53	meta_genesis