← all meta proposals

Implement D.v2: non-convergence transcript sink in council_verdict.js

council rejected TOOL reversible: simple 4h proposed 19 May 2026
What is the proposed change?
council_verdict.js currently signals non-convergence via verdict_action='ESCALATED' and escalated:true in the output JSON. Add a post-verdict hook: when escalated===true, apply a redaction pass (strip OAuth tokens, Commander-confidential annotations, any string matching known secret patterns via a regex allowlist), then write the full per-round LLM transcripts to meta_engine/data/non_convergent/YYYY-MM-DD-{hypothesis_id}.jsonl. Each file: hypothesis_id, rounds_run, escalationReason (already emitted by council_verdict.js), each round's per-model verdict text + disagreement_type (FACTUAL|WEIGHTING|FRAMING). On deploy: inject 3 synthetic non-convergent transcript stubs via test harness and assert all 3 land in the sink. No automatic deletion; retention until Commander triggers taxonomy review via a new /commander/<token>/non_convergent route (minimal: just a file list with escalationReason per entry).
Target files
hypothesis_engine/moves/council_verdict.js meta_engine/data/non_convergent/
Expected effect
Current non-convergence rate: 22% (2 of 9 recent verdicts = 'council could not converge after 3 rounds'). At 9 verdicts/week, 30 days produces ~8–10 transcripts. The escalationReason field already distinguishes FACTUAL vs WEIGHTING vs FRAMING disagreements. The corpus enables the next meta_engine cycle to identify whether non-convergence clusters on specific disagreement_type values and propose targeted council prompt changes (e.g., if 80% of ESCALATED verdicts have disagreement_type=FACTUAL, the evidence_search move is under-delivering before council rather than models having legitimate value disagreements).
Falsifier — what would prove this wrong?
After 30 days: query hypotheses table for rows where verdict_action='ESCALATED'; count vs files in meta_engine/data/non_convergent/. Capture rate must be ≥95%. Invalidating observation: capture rate ≥95% but all transcripts show disagreement_type=FRAMING (value conflicts, not factual gaps) — this would indicate the council prompt redesign, not evidence_search, is the bottleneck, and the sink is working correctly but pointing at a different root cause.
Evidence that triggered the proposal
  • Corpus E: Council verdicts last 7d — 2 of 9 (22%) = 'council could not converge after 3 rounds — human decision required' (7199a9, 2ca131); no transcript corpus exists to study the split pattern
  • Corpus D: brain/META_ENGINE_S158_ROUND2_SYNTHESIS.md — D.v2 COMMIT_WITH_REVISION: capture-rate falsifier, redaction layer, synthetic deploy test, retention policy
  • Corpus D: brain/red_team_reviews/meta_engine_s158_round2_gemini-3.1-pro.md — 'Generates the empirical corpus required to safely build the split-reason taxonomy, overcoming the exact lack of evidence that killed the previous escalation harness proposal'

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

AxisScore
specificity3
falsifier3
solo feasible3
blast radius3
composability3
reversibility3
Disposition
Rejected at the council verdict. The two-judge council did not find the case strong enough to advance to Commander review.

Evaluation history

WhenMove
2026-05-23 04:35meta_council_verdict
2026-05-23 04:21meta_argument
2026-05-19 13:06red_team_kill
2026-05-19 11:24steelman
2026-05-19 10:08meta_filter_score
2026-05-19 10:04meta_genesis