Governance Verification Gate
Why this gate exists
REE_assembly governance cycles involve multiple moving parts: experiment manifests on multiple machines, per-machine runner heartbeats and status files, a central runner status file that may lag behind, pending_review.md generated from indexed manifests, and roadmap snapshots that summarise the current state.
These pieces can disagree. Documented failure modes:
- Stale pending_review.md: new manifests land after the review file is generated, leaving experiments invisible to governance.
- Roadmap/pending_review mismatch: the roadmap snapshot and pending_review.md report different pending counts, typically because one was written before a governance walk and the other after it.
- Stale central runner_status.json: after the Phase-2 coordinator cutover, per-machine status files carry live writes while the central file lags. Downstream indexers that depend on the central file see stale counts.
- EXQ drained without manifest: an experiment drains from the queue and its outcome disappears from central indices, leaving its claim evidence status unknown.
- Failed experiment without autopsy: a FAIL result in pending_review.md is closed without a structured /failure-autopsy, making the failure evidence uninterpretable.
- Heartbeat scope bleed: the runner heartbeat git pull --rebase --autostash cycle can silently stash uncommitted governance edits, causing them to be lost or applied out of order.
This gate provides a deterministic, read-only check that surfaces these failure modes before a governance cycle is closed.
Scripts
scripts/verify_governance_cycle.py
Runs all checks, writes a JSON report, and prints a terminal summary.
# From REE_assembly root:
python scripts/verify_governance_cycle.py
# Custom stale threshold:
python scripts/verify_governance_cycle.py --stale-hours 6
# Custom output path:
python scripts/verify_governance_cycle.py --output /tmp/verify.json
# Help:
python scripts/verify_governance_cycle.py --help
Exit codes:
- 0: pass (no block findings)
- 1: fail (one or more block findings)
- 2: usage error
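For callers that invoke the verifier from another Python script, the exit codes above could be interpreted along these lines. This is a minimal sketch: describe_exit and run_gate are hypothetical helper names, not part of the repository, and only the three documented codes are assumed to be meaningful.

```python
import subprocess
import sys

# Exit-code meanings exactly as documented for verify_governance_cycle.py.
EXIT_MEANINGS = {
    0: "pass (no block findings)",
    1: "fail (one or more block findings)",
    2: "usage error",
}


def describe_exit(code: int) -> str:
    """Map a verifier exit code to its documented meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_gate(extra_args=()):
    """Run the verifier (expects the REE_assembly root as the working directory).

    Returns (exit_code, meaning). Hypothetical wrapper for illustration only.
    """
    cmd = [sys.executable, "scripts/verify_governance_cycle.py", *extra_args]
    code = subprocess.run(cmd).returncode
    return code, describe_exit(code)
```

A caller would typically treat anything other than 0 as a reason not to close the cycle, for example `code, meaning = run_gate(["--stale-hours", "6"])`.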
scripts/generate_governance_handoff.py
Reads the verification report and current state, writes a structured handoff.
# From REE_assembly root:
python scripts/generate_governance_handoff.py
Outputs:
- evidence/handoffs/active_governance_handoff.json
- evidence/handoffs/active_governance_handoff.md
Run after verify_governance_cycle.py so the handoff reflects the latest verification status.
Checks and severity
Block vs Warn
A block finding means the governance cycle MUST NOT be marked complete until the finding is resolved. If any block finding is present, the verifier exits with code 1.
A warn finding is a signal worth investigating but does not block cycle closure. It may indicate genuine ambiguity (e.g. Phase-2 coordinator cutover means central status intentionally lags) or a soft inconsistency.
An info finding is purely informational.
Check A: PENDING_REVIEW_STALE_AGAINST_MANIFESTS (block)
Compares the Generated timestamp in pending_review.md against the newest manifest.json mtime in evidence/experiments/. If a newer manifest exists, pending_review.md does not reflect the latest experiment results.
Resolution: Run python scripts/generate_pending_review.py, then re-run the verifier.
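The comparison behind Check A can be sketched roughly as follows. This is an illustration, not the verifier's actual code: the function names are hypothetical, manifests are assumed to sit at any depth under evidence/experiments/, and parsing the Generated line is left to the caller because its exact format is not described here.

```python
from datetime import datetime, timezone
from pathlib import Path


def newest_manifest_mtime(experiments_dir):
    """Return the newest manifest.json mtime (epoch seconds), or None if there are none.

    Assumption: manifests may be nested at any depth under experiments_dir.
    """
    mtimes = [p.stat().st_mtime for p in Path(experiments_dir).rglob("manifest.json")]
    return max(mtimes) if mtimes else None


def pending_review_is_stale(generated_at: datetime, experiments_dir) -> bool:
    """True if any manifest was modified after pending_review.md was generated.

    generated_at should be timezone-aware; a naive datetime would be
    interpreted as local time by .timestamp().
    """
    newest = newest_manifest_mtime(experiments_dir)
    if newest is None:
        return False  # nothing to be stale against
    return newest > generated_at.timestamp()
```

Because the check relies on file mtimes, operations that rewrite mtimes (a fresh clone, some sync tools) can make it fire even when no new results exist.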
Check B: ROADMAP_PENDING_REVIEW_MISMATCH (block)
Extracts the pending_review count from the most recent Status Snapshot in docs/roadmap.md and compares it to the Pending count in pending_review.md. A mismatch means the roadmap and review list describe different states.
Resolution: Re-generate pending_review.md, update the roadmap snapshot to match, or add an explicit note that the snapshot is intentionally stale (and why).
Check C: CENTRAL_RUNNER_STATUS_STALE / CENTRAL_RUNNER_STATUS_ABSENT (warn)
Checks whether evidence/experiments/runner_status.json is present and recent. The default stale threshold is 12 hours. This is a warn (not block) because, after the Phase-2 coordinator cutover, per-machine status files are authoritative and the central file may legitimately lag.
Resolution: Investigate whether the coordinator->central-index merge has wedged. If Phase-2 is intentionally active, document that in the roadmap snapshot and the finding is acceptable.
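A minimal sketch of the presence-and-recency test in Check C is shown below. It assumes that "recent" is judged by the file's mtime, which is not confirmed here (the verifier might instead read a timestamp field inside the JSON). classify_central_status is a hypothetical name.

```python
import time
from pathlib import Path

DEFAULT_STALE_HOURS = 12  # documented default; overridable via --stale-hours


def classify_central_status(path, stale_hours=DEFAULT_STALE_HOURS, now=None):
    """Return a finding code for the central status file, or None if it looks fresh.

    Assumption: staleness is measured from the file's mtime.
    """
    p = Path(path)
    if not p.is_file():
        return "CENTRAL_RUNNER_STATUS_ABSENT"
    now = time.time() if now is None else now
    age_hours = (now - p.stat().st_mtime) / 3600
    if age_hours > stale_hours:
        return "CENTRAL_RUNNER_STATUS_STALE"
    return None
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to exercise against a fixture file.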
Check D: HEARTBEAT_STATUS_DIVERGENCE (warn)
Compares each machine’s runner heartbeat (state, current_exq) against its per-machine runner_status.json (idle, current). If the heartbeat says running but the status says idle, the runner may have stopped mid-experiment. Heartbeats older than 4 hours are skipped (machine may be offline).
Resolution: Check the runner process on the affected machine. If stopped, determine whether the experiment needs to be re-queued.
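The heartbeat/status comparison in Check D might look like the sketch below. The field names (state, current_exq, idle, current) come from the description above, but their value types are assumptions: state is taken to be the string "running" when active and idle to be a boolean. The second branch, which flags a disagreement about which experiment is running, is also an assumption about what the comparison covers.

```python
HEARTBEAT_MAX_AGE_HOURS = 4  # heartbeats older than this are skipped


def heartbeat_divergence(heartbeat: dict, status: dict, heartbeat_age_hours: float):
    """Return a short reason string if heartbeat and status disagree, else None."""
    if heartbeat_age_hours > HEARTBEAT_MAX_AGE_HOURS:
        return None  # machine may be offline; skip rather than warn

    running = heartbeat.get("state") == "running"  # assumed value
    idle = bool(status.get("idle"))                # assumed boolean

    if running and idle:
        return "heartbeat says running but status says idle"
    # Assumption: both records active but naming different experiments also counts.
    if running and not idle and heartbeat.get("current_exq") != status.get("current"):
        return "heartbeat and status name different experiments"
    return None
```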
Check E: EXQ_DRAINED_WITHOUT_MANIFEST_OR_DISPOSITION (block)
Searches the most recent Status Snapshot for explicit mentions of an EXQ being drained or force_rerequed without a manifest. If a matching manifest is not found in evidence/experiments/, the experiment’s outcome is opaque.
Resolution: Check per-machine runner_status files and the coordinator panel.
- If the manifest is on a remote machine: wait for sync.
- If the run was pruned or never ran: add an explicit disposition note to the roadmap.
- If a re-run is needed: re-queue with a new letter suffix.
Check F: FAILED_EXPERIMENT_WITHOUT_DIAGNOSIS_LINK (warn)
For each run ID in the FAIL section of pending_review.md, searches for a corresponding failure_autopsy file in evidence/planning/. If none is found, the failure is undiagnosed.
This is a warn (not block) in the first version. If a governance cycle is explicitly being closed and the failure has no autopsy, it should become block.
Resolution: Run /failure-autopsy for the experiment, or add a non_contributory rationale with an explicit reason to its manifest.
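The autopsy lookup in Check F can be approximated as follows. How a run ID maps to an autopsy filename is not specified here, so this sketch assumes the run ID appears as a substring of a failure_autopsy_* file with a .md or .json suffix. undiagnosed_failures is a hypothetical helper name.

```python
from pathlib import Path

AUTOPSY_SUFFIXES = {".md", ".json"}


def undiagnosed_failures(fail_run_ids, planning_dir):
    """Return the run IDs that have no matching failure_autopsy_* file.

    Assumption: a file matches when the run ID is a substring of its name.
    A short ID that is a prefix of another would match too, so real IDs
    should be unambiguous.
    """
    names = [
        p.name
        for p in Path(planning_dir).glob("failure_autopsy_*")
        if p.suffix in AUTOPSY_SUFFIXES
    ]
    return [rid for rid in fail_run_ids if not any(rid in name for name in names)]
```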
Check G: HEARTBEAT_SCOPE_BLEED (warn)
If the working tree is dirty, checks whether the dirty files include both telemetry paths (runner_heartbeats/, runner_status/) and protected governance paths (docs/claims/claims.yaml, evidence/planning/). This catches cases where the runner heartbeat rebase cycle has touched governance state.
Resolution: Run git diff -- <path> to inspect the change. Revert any telemetry-driven change to protected files and re-apply the governance edit manually.
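One way to read Check G is that it fires when the set of dirty files contains at least one telemetry file and at least one protected governance file. The sketch below implements that reading; it is an interpretation rather than the verifier's confirmed logic. The telemetry directories are matched anywhere in the path because their location relative to the repo root is not stated, and dirty_paths and scope_bleed are hypothetical names.

```python
import subprocess

TELEMETRY_MARKERS = ("runner_heartbeats/", "runner_status/")
PROTECTED_PREFIXES = ("docs/claims/claims.yaml", "evidence/planning/")


def dirty_paths(repo_root="."):
    """List paths reported by `git status --porcelain` (new name for renames).

    Paths that git quotes because of unusual characters are returned as-is.
    """
    out = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_root, capture_output=True, text=True, check=True,
    ).stdout
    # Each line is "XY <path>"; renames look like "R  old -> new".
    return [line[3:].split(" -> ")[-1] for line in out.splitlines()]


def scope_bleed(paths):
    """Return (telemetry, protected) dirty files when both groups are non-empty."""
    telemetry = [p for p in paths if any(m in p for m in TELEMETRY_MARKERS)]
    protected = [p for p in paths if any(p.startswith(x) for x in PROTECTED_PREFIXES)]
    if telemetry and protected:
        return telemetry, protected
    return [], []
```

Calling `scope_bleed(dirty_paths())` from the repository root would then yield the two lists to inspect with git diff.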
Governance cycle integration
The gate is designed to run at two points in the governance workflow:
- Before walking pending experiments: ensures pending_review.md is current.
- Before closing a cycle: ensures no block findings remain.
Wiring into governance.sh
To wire the gate into the end of the governance pipeline:
# Add to governance.sh after the final step:
echo "--- Step N: Governance verification gate ---"
if ! "$PYTHON" scripts/verify_governance_cycle.py; then
    echo "ERROR: Governance verification gate FAILED. Resolve block findings before closing." >&2
    exit 1
fi
Running manually (recommended for now)
Until the gate is wired into governance.sh, run it manually:
cd /path/to/REE_assembly
python scripts/verify_governance_cycle.py
python scripts/generate_governance_handoff.py
Review evidence/verification/governance_verification_latest.json and evidence/handoffs/active_governance_handoff.md.
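When reviewing the JSON report by hand, it can help to pull out only the block findings and their suggested actions. The sketch below assumes the report follows the schema given under "Report schema"; load_report and blocking_actions are hypothetical helpers, not part of the repository.

```python
import json
from pathlib import Path

REPORT_PATH = Path("evidence/verification/governance_verification_latest.json")


def load_report(path=REPORT_PATH):
    """Read the verification report as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def blocking_actions(report: dict):
    """Return (code, suggested_next_action) for every block finding."""
    return [
        (f["code"], f["suggested_next_action"])
        for f in report.get("findings", [])
        if f.get("severity") == "block"
    ]
```

For example, `for code, action in blocking_actions(load_report()): print(code, "->", action)` prints one line per blocker when run from the REE_assembly root.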
Overriding a block finding
The correct way to override a block finding is to resolve the underlying inconsistency with an explicit documented disposition, not to weaken the check.
Examples:
- PENDING_REVIEW_STALE_AGAINST_MANIFESTS: Re-run generate_pending_review.py. If a manifest is deliberately excluded (e.g. a telemetry-only file), document why in the manifest or pending_review generation script.
- ROADMAP_PENDING_REVIEW_MISMATCH: Add a sentence to the roadmap snapshot explaining that the count difference reflects a governance cycle that is currently in progress.
- EXQ_DRAINED_WITHOUT_MANIFEST_OR_DISPOSITION: Add a line to the roadmap snapshot: “V3-EXQ-XXX: pruned from queue without run (reason: …)” or “awaiting sync from machine Y”. This explicit disposition allows the check to pass on the next run after the roadmap snapshot is updated.
Do not silence a block by editing the verifier unless the check is demonstrably producing false positives (e.g. mtime comparison is unreliable in a specific environment). If you must silence, document the reason with a comment and open a work item to revisit.
Relation to failure autopsy and inter-governance handoff
- Failure autopsy (/failure-autopsy skill, evidence/planning/failure_autopsy_*.{md,json}): the structured post-mortem for a specific FAIL or ERROR experiment. The verification gate checks that one exists for each pending FAIL; it does not replace the autopsy.
- Inter-governance workset (evidence/planning/inter_governance_workset.v1.json, generated by scripts/generate_inter_governance_workset.py): the cross-cycle tracking of in-flight experiments and plan items. The handoff generator reads its summary for the workset section.
- Active governance handoff (evidence/handoffs/active_governance_handoff.{json,md}): the machine-readable state packet generated by this system. It summarises verification status, pending items, in-flight experiments, and a single concrete next action. Intended to be read at the start of each governance session as a resumption primitive.
Report schema
{
  "generated_at_utc": "ISO-8601 string",
  "repo_root": "/absolute/path/to/REE_assembly",
  "git_head": "40-char SHA or null",
  "status": "pass | fail",
  "findings": [
    {
      "severity": "block | warn | info",
      "code": "CHECK_CODE_STRING",
      "message": "Human-readable description",
      "evidence_paths": ["list of relevant file paths"],
      "suggested_next_action": "Concrete action to resolve"
    }
  ],
  "summary": {
    "block_count": 0,
    "warn_count": 0,
    "info_count": 0
  }
}
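A consumer of the report may want to confirm that status and summary agree with the findings list. The sketch below derives both from the findings, using the rule stated earlier that a pass means no block findings. It is an illustrative check with hypothetical function names, not something the verifier is known to ship.

```python
from collections import Counter

SEVERITIES = ("block", "warn", "info")


def expected_summary(findings):
    """Count findings per severity, in the shape of the report's summary object."""
    counts = Counter(f["severity"] for f in findings)
    return {f"{sev}_count": counts.get(sev, 0) for sev in SEVERITIES}


def consistency_problems(report):
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    summary = expected_summary(report["findings"])
    if report["summary"] != summary:
        problems.append("summary counts do not match findings")
    expected_status = "fail" if summary["block_count"] else "pass"
    if report["status"] != expected_status:
        problems.append(f"status should be {expected_status!r}")
    return problems
```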