Prompt genesis proposer to reason about exception classes inline

council rejected | PROMPT | reversible: simple 2h | proposed 19 May 2026
What is the proposed change?
In the PROPOSER_SYSTEM prompt of genesis.js, after the existing mandatory-field instructions (structural_veto, resolution_event, manual_entry_path), add a reasoning block that does NOT introduce a new JSON schema field:
'Before writing your hypothesis description, explicitly reason: what classes of cases does this workflow explicitly exclude or route to human review? Name at least 2 distinct exception classes: not failure modes of the tool, but structural boundaries of the workflow (e.g., tier of customer, data-quality condition, type of event that falls outside scope). Embed this reasoning into the description field using phrases such as: The workflow excludes..., Human review is required when..., or This tool is not designed for cases where... A hypothesis description that names zero exception classes signals an idealized workflow rather than a deployable one. Do not add a new JSON field for exceptions; embed the reasoning in the description.'
This is reasoning pressure, not a schema mandate. The distinction is deliberate: it avoids the boilerplate-bait failure of a mandatory exceptions field (P2 was killed for this reason in S158).
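A minimal sketch of how such a block might be appended, assuming PROPOSER_SYSTEM is a plain string. The contents of genesis.js are not shown in this proposal, so the helper name `withExceptionReasoning`, the stand-in `basePrompt`, and the abbreviated block text are all hypothetical.

```javascript
// Hypothetical sketch: the real PROPOSER_SYSTEM text and the way genesis.js
// assembles it are not part of this proposal.

// Abbreviated form of the proposed reasoning block (the full wording is in the proposal text).
const EXCEPTION_REASONING_BLOCK = [
  'Before writing your hypothesis description, explicitly reason: what classes',
  'of cases does this workflow explicitly exclude or route to human review?',
  'Name at least 2 distinct exception classes.',
  'Do not add a new JSON field for exceptions; embed the reasoning in the description.',
].join(' ');

// Appends the block after the existing instructions.
// Idempotent, so applying the patch twice does not duplicate the text.
function withExceptionReasoning(proposerSystem) {
  if (proposerSystem.includes(EXCEPTION_REASONING_BLOCK)) {
    return proposerSystem;
  }
  return `${proposerSystem.trimEnd()}\n\n${EXCEPTION_REASONING_BLOCK}`;
}

// Stand-in for the existing prompt; only the field names come from the proposal.
const basePrompt =
  'Mandatory fields: structural_veto, resolution_event, manual_entry_path.';
const patched = withExceptionReasoning(basePrompt);
console.log(patched);
```

Keeping the block as a separate constant means a rollback is a one-line change, which fits the "reversible: simple" label, and the output schema is untouched because only prompt text changes.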
Target files
hypothesis_engine/moves/genesis.js
Expected effect
After 20 genesis runs post-patch, at least 30% of hypothesis descriptions contain explicit exception-class language ('The workflow excludes...', 'Human review is required when...', or equivalent). Current S157 baseline: 1/43 graduated candidates (2.3%) explicitly named exception classes in their descriptions. The shadow describability gate (B.v2, targeting Q3) becomes more useful as exception-class language starts appearing in genesis output.
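The 30% target could be checked with a simple phrase matcher, sketched below. The regular expressions cover only the three phrases named in the proposal; anything counted as "or equivalent" would still need human or evaluator judgment. The function names and sample descriptions are illustrative assumptions, not engine output.

```javascript
// Phrases taken from the proposal text; matching is case-insensitive.
const EXCEPTION_PATTERNS = [
  /\bthe workflow excludes\b/i,
  /\bhuman review is required when\b/i,
  /\bnot designed for cases where\b/i,
];

// True when a description contains at least one of the named phrases.
function namesExceptionClass(description) {
  return EXCEPTION_PATTERNS.some((re) => re.test(description));
}

// Fraction of descriptions that contain exception-class language.
function exceptionLanguageRate(descriptions) {
  if (descriptions.length === 0) return 0;
  const hits = descriptions.filter(namesExceptionClass).length;
  return hits / descriptions.length;
}

const BASELINE_RATE = 1 / 43; // S157: 1 of 43 graduated candidates (about 2.3%)
const TARGET_RATE = 0.3;      // proposal target after 20 genesis runs

// Made-up sample descriptions, for illustration only.
const sampleDescriptions = [
  'Routes invoices to approvers. The workflow excludes invoices with missing PO numbers.',
  'Summarises support tickets for weekly review.',
  'Flags churn risk. Human review is required when the account is enterprise tier.',
  'Generates onboarding checklists automatically.',
];

const sampleRate = exceptionLanguageRate(sampleDescriptions);
console.log(sampleRate >= TARGET_RATE, sampleRate > BASELINE_RATE);
```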
Falsifier — what would prove this wrong?
After 20 genesis runs, if fewer than 5% of descriptions contain exception-class language — statistically indistinguishable from the 2.3% S157 baseline — the reasoning instruction is being overridden by other PROPOSER_SYSTEM instructions or dropped under token pressure. Counter-falsifier: if 80%+ of descriptions contain the exact same boilerplate exception phrase across different hypotheses (indicating the LLM is pattern-matching to satisfy the instruction without genuine reasoning), P2's boilerplate-bait failure mode has been reproduced in prose form — roll back and move the check to an evaluator-side rubric instead.
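The falsifier and counter-falsifier can be read as a small decision rule, sketched here under stated assumptions. The input format (one extracted exception phrase per description, or null when none is present) and the verdict labels are hypothetical. The proposal does not say how to treat results between 5% and 30%, so the 'inconclusive' label for that band is an assumption.

```javascript
// exceptionPhrases: one entry per description; null when the description
// contains no exception-class language.
function falsifierVerdict(exceptionPhrases) {
  const total = exceptionPhrases.length;
  if (total === 0) return 'no-data';

  const named = exceptionPhrases.filter((p) => p !== null);
  const rate = named.length / total;

  // Falsifier: fewer than 5% of descriptions name an exception class.
  if (rate < 0.05) return 'falsified';

  // Counter-falsifier: 80%+ of all descriptions share the exact same phrase.
  const counts = new Map();
  for (const phrase of named) {
    const key = phrase.trim().toLowerCase();
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const mostCommon = Math.max(...counts.values());
  if (mostCommon / total >= 0.8) return 'boilerplate-rollback';

  // Expected effect: at least 30% of descriptions name an exception class.
  return rate >= 0.3 ? 'target-met' : 'inconclusive';
}

// Illustrative input: 20 runs, 17 of which repeat one phrase verbatim.
const boilerplateRuns = [
  ...Array(17).fill('Human review is required when data is incomplete'),
  null,
  null,
  null,
];
console.log(falsifierVerdict(boilerplateRuns));
```

An exact-match count only catches verbatim boilerplate. The relabeled-duplicate problem described in the S158 red-team synthesis would need a semantic comparison on the evaluator side.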
Evidence that triggered the proposal
  • S157_NBJ_DESCRIBABILITY_TEST.md: 'Q3 (exception classes named) is the systematic engine blindspot. Out of 43 graduated, exactly ONE candidate (ec4507) explicitly names its exception queue in the description. The remaining 40 simply don't address it. The engine doesn't penalise for Q3 because it doesn't ask the question.'
  • S157_NBJ_DESCRIBABILITY_TEST.md: 'The Describability Test Q3: Buyer can name upfront the cases where the workflow should not fire or needs human review. [The workflow] will learn over time which cases are exceptions — FAILS.'
  • META_ENGINE_S158_RED_TEAM_SYNTHESIS.md: 'P2 dies. Structural reason: mandatory schema field becomes evaluator-gaming bait. Two fake-distinct exceptions satisfy P2's falsifier while being the same exception relabeled.' This proposal avoids a schema mandate and instead requires the reasoning to be embedded in the description, where downstream evaluators can assess its quality.

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

Axis            Score
specificity     2
falsifier       3
solo feasible   3
blast radius    3
composability   3
reversibility   3

Disposition
Rejected at the council verdict. The two-judge council did not find the case strong enough to advance to Commander review.

Evaluation history

When               Move
2026-05-23 04:29   meta_council_verdict
2026-05-23 04:14   meta_argument
2026-05-19 14:12   filter_score
2026-05-19 14:08   filter_score
2026-05-19 14:00   filter_score
2026-05-19 13:54   evidence_search
2026-05-19 13:48   evidence_search
2026-05-19 13:42   evidence_search
2026-05-19 13:36   evidence_search
2026-05-19 11:54   audience_simulation
2026-05-19 10:18   red_team_kill
2026-05-19 09:54   meta_filter_score
2026-05-19 09:48   steelman
2026-05-19 09:38   meta_genesis