Replace v1 five-axis scoring in filter_score.js with ratified v2.3 ten-axis rubrics

filter rejected PROMPT reversible: medium 8h proposed 19 May 2026

What is the proposed change?

Replace the five v1 highSystem/lowSystem axis prompt pairs (data_moat, ten_x_model_test, fast_feedback_loops, solo_founder_feasible, ai_providers_cant_eat_it) with the ten v2.3 axis rubric blocks A1–A10 defined in brain/V2_FILTER_DESIGN_v2.3.md.
Each run still scores 0–3 per axis; the per-run maximum total becomes 30 instead of 15.
Store per-axis scores in the v2_a1..v2_a10 columns (already created by the s112 migration, currently unpopulated).
Continue writing the composite to composite_rank_score using the same formula (median_total − 0.5×IQR + 0.3×signal_count), but computed over v2 per-run totals.
For the first 14 days, compute both v1 and v2 composites: graduate on v1 and log v2 as v2_composite_shadow.
After 14 days, inspect the v2 score distribution; if the spread between Commander-KILLED and ROBUST S157 candidates is ≥15 points, flip graduation to v2.
Update the two-consecutive-zero categorical kill rule to reference the new axis names (e.g. solo_founder_feasible → A7 distribution_reachability and A8 commander_non_engine_work_fit).
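The scoring arithmetic described above can be sketched as follows. This is a minimal illustration, not the code in filter_score.js: the helper names (v2RunTotal, compositeRankScore, shadowComposites), the graduation_composite field, and the linear-interpolation quartile method are all assumptions, since the proposal states only the formula and the column names.

```javascript
// Sketch only. Helper names and the quartile method are assumptions,
// not taken from filter_score.js.

// Column names for the ten v2 axes: v2_a1 .. v2_a10.
const V2_AXES = Array.from({ length: 10 }, (_, i) => `v2_a${i + 1}`);

// Sum one run's ten per-axis scores (each an integer 0-3) into a total out of 30.
function v2RunTotal(axisScores) {
  return V2_AXES.reduce((sum, axis) => {
    const score = axisScores[axis];
    if (!Number.isInteger(score) || score < 0 || score > 3) {
      throw new RangeError(`${axis} must be an integer 0-3, got ${score}`);
    }
    return sum + score;
  }, 0);
}

// Quantile by linear interpolation over a sorted copy.
// Assumed method: the proposal does not say how the IQR is computed.
function quantile(values, q) {
  const sorted = [...values].sort((a, b) => a - b);
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// composite = median_total - 0.5 * IQR + 0.3 * signal_count
function compositeRankScore(runTotals, signalCount) {
  if (runTotals.length === 0) {
    throw new Error("at least one run total is required");
  }
  const median = quantile(runTotals, 0.5);
  const iqr = quantile(runTotals, 0.75) - quantile(runTotals, 0.25);
  return median - 0.5 * iqr + 0.3 * signalCount;
}

// Shadow window: the v1 composite still decides graduation,
// while the v2 composite is only logged for later inspection.
function shadowComposites(v1RunTotals, v2RunTotals, signalCount) {
  return {
    graduation_composite: compositeRankScore(v1RunTotals, signalCount), // hypothetical field name
    v2_composite_shadow: compositeRankScore(v2RunTotals, signalCount),
  };
}
```

Because the same formula runs over both sets of totals, the only difference between the v1 and v2 composites in this sketch is the range of the inputs (0–15 versus 0–30), which is worth keeping in mind when comparing the two during the shadow window.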

Target files

hypothesis_engine/moves/filter_score.js

Expected effect

Hypotheses like Commander-KILLED a38d31 (audit product, no warm-contact base) and c89a71 (ClaimGate, relational sales) should score A5=0 (scalable_revenue: pure audit service) and A7=0–1 (distribution_reachability: warm intros required), producing v2 composites below 40% even if the v1 composite was favorable. ROBUST S157 candidates (ec4507, 3656a0) should maintain a v2 composite of ≥55% via high A6/A7/A9 scores. The graduation rate is expected to drop 15–25% as v2 adds friction specifically for audit-shaped and relational-sales-required hypotheses.

Falsifier — what would prove this wrong?

Back-score all 4 Commander-KILLED hypotheses (a38d31, c89a71, 6bf9c5, c89a71-shape) under the v2 composite. All 4 must score at or below 40% (≤12/30 pre-IQR-adjustment). Back-score the 4 S157 ROBUST 5/5 candidates (ec4507, 3656a0, 24a849, and one additional ROBUST). All 4 must score at or above 50% (≥15/30). If the separation between the killed and robust groups is less than 15 composite points, the v2.3 axis prompt implementation does not match the spec and must be revised before flipping graduation.
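The falsifier check could be automated along these lines. This is a hedged sketch: the function name, the input shape ({ medianTotal, composite }), and the definition of separation as the lowest robust composite minus the highest killed composite are assumptions; the proposal gives only the thresholds (≤12/30, ≥15/30, 15 composite points). Reading "pre-IQR-adjustment" as the median per-run total is also an interpretation.

```javascript
// Sketch only. Thresholds come from the falsifier text; names and the
// separation definition are assumptions.
const KILLED_MAX_TOTAL = 12; // 40% of 30
const ROBUST_MIN_TOTAL = 15; // 50% of 30
const MIN_SEPARATION = 15;   // composite points

// Each entry: { medianTotal, composite } for one back-scored hypothesis.
function checkFalsifier(killed, robust) {
  if (killed.length === 0 || robust.length === 0) {
    throw new Error("both groups need at least one back-scored hypothesis");
  }
  // Every killed hypothesis must sit at or below the 40% line.
  const killedOk = killed.every((h) => h.medianTotal <= KILLED_MAX_TOTAL);
  // Every robust candidate must sit at or above the 50% line.
  const robustOk = robust.every((h) => h.medianTotal >= ROBUST_MIN_TOTAL);
  // Assumed definition: worst-case gap between the two groups.
  const separation =
    Math.min(...robust.map((h) => h.composite)) -
    Math.max(...killed.map((h) => h.composite));
  return {
    killedOk,
    robustOk,
    separation,
    pass: killedOk && robustOk && separation >= MIN_SEPARATION,
  };
}
```

Using the worst-case gap is the strictest reading of "separation"; a difference of group medians would be more lenient, so whichever definition is chosen should be fixed before back-scoring.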

Evidence that triggered the proposal

Corpus D, V2_FILTER_DESIGN_v2.3.md: full ratified 10-axis spec with per-axis rubric text, anti-manipulation instructions, and evaluator guidance; the spec went through 4 adversarial red-team rounds
Corpus D, META_ENGINE_S158_RED_TEAM_BRIEF.md: 'filter_score.js currently has 5 v1 axes (legacy, in production)…v2 doctrine is documented in brain/V2_FILTER_DESIGN_v2.3.md but the scoring code still uses v1'
Corpus E, Commander overrides: a38d31 KILL note 'audit product shape rejected, no warm-contact base in target ICP' and c89a71 KILL; both scored favorably under the v1 composite but would fail v2 A5/A7/A8

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

Axis	Score
specificity	3
falsifier	2
solo feasible	2
blast radius	1
composability	2
reversibility	2

Disposition

Rejected by filter_score. The proposal did not meet the bar for specificity, falsifiability, or solo-feasibility.

Evaluation history

When	Move
2026-05-19 12:18	red_team_kill
2026-05-19 10:42	steelman
2026-05-19 09:56	meta_filter_score
2026-05-19 09:53	meta_genesis