Implement D.v2: non-convergence transcript sink in council_verdict.js

council rejected TOOL reversible: simple 4h proposed 19 May 2026

What is the proposed change?

council_verdict.js currently signals non-convergence via verdict_action='ESCALATED' and escalated:true in the output JSON. Add a post-verdict hook: when escalated===true, apply a redaction pass (strip OAuth tokens, Commander-confidential annotations, any string matching known secret patterns via a regex allowlist), then write the full per-round LLM transcripts to meta_engine/data/non_convergent/YYYY-MM-DD-{hypothesis_id}.jsonl. Each file: hypothesis_id, rounds_run, escalationReason (already emitted by council_verdict.js), each round's per-model verdict text + disagreement_type (FACTUAL|WEIGHTING|FRAMING). On deploy: inject 3 synthetic non-convergent transcript stubs via test harness and assert all 3 land in the sink. No automatic deletion; retention until Commander triggers taxonomy review via a new /commander/<token>/non_convergent route (minimal: just a file list with escalationReason per entry).

Target files

hypothesis_engine/moves/council_verdict.js meta_engine/data/non_convergent/

Expected effect

Current non-convergence rate: 22% (2 of 9 recent verdicts = 'council could not converge after 3 rounds'). At 9 verdicts/week, 30 days produces ~8–10 transcripts. The escalationReason field already distinguishes FACTUAL vs WEIGHTING vs FRAMING disagreements. The corpus enables the next meta_engine cycle to identify whether non-convergence clusters on specific disagreement_type values and propose targeted council prompt changes (e.g., if 80% of ESCALATED verdicts have disagreement_type=FACTUAL, the evidence_search move is under-delivering before council rather than models having legitimate value disagreements).

Falsifier — what would prove this wrong?

After 30 days: query hypotheses table for rows where verdict_action='ESCALATED'; count vs files in meta_engine/data/non_convergent/. Capture rate must be ≥95%. Invalidating observation: capture rate ≥95% but all transcripts show disagreement_type=FRAMING (value conflicts, not factual gaps) — this would indicate the council prompt redesign, not evidence_search, is the bottleneck, and the sink is working correctly but pointing at a different root cause.

Evidence that triggered the proposal

Corpus E: Council verdicts last 7d — 2 of 9 (22%) = 'council could not converge after 3 rounds — human decision required' (7199a9, 2ca131); no transcript corpus exists to study the split pattern
Corpus D: brain/META_ENGINE_S158_ROUND2_SYNTHESIS.md — D.v2 COMMIT_WITH_REVISION: capture-rate falsifier, redaction layer, synthetic deploy test, retention policy
Corpus D: brain/red_team_reviews/meta_engine_s158_round2_gemini-3.1-pro.md — 'Generates the empirical corpus required to safely build the split-reason taxonomy, overcoming the exact lack of evidence that killed the previous escalation harness proposal'

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

Axis	Score
specificity	3
falsifier	3
solo feasible	3
blast radius	3
composability	3
reversibility	3

Disposition

Rejected at the council verdict. The two-judge council did not find the case strong enough to advance to Commander review.

Evaluation history

When	Move
2026-05-23 04:35	meta_council_verdict
2026-05-23 04:21	meta_argument
2026-05-19 13:06	red_team_kill
2026-05-19 11:24	steelman
2026-05-19 10:08	meta_filter_score
2026-05-19 10:04	meta_genesis