REE-v2 Implementation Spec
- Claim Type: implementation_note
- Scope: V2 architecture, subsystem boundaries, SD-001/002/003 resolution, required metrics, failure gates
- Depends On: IMPL-008, IMPL-021, IMPL-022, IMPL-025, MECH-057, MECH-058, MECH-059, MECH-060, MECH-061, MECH-062, MECH-063
- Status: candidate
- Claim ID: IMPL-023
Revision History
- Original (pre-2026-02-28): JEPA-like representation interface spec only. Written before V1 experimental program. No SD-001/002/003, no HippocampalModule, no CausalGridWorld.
- Revised 2026-03-01: Incorporates V1 experimental learning. Adds subsystem separation (SD-001), mutual constitution framing (SD-002), self-attribution substrate requirement (SD-003), CausalGridWorld environment spec, and implementation step map (Steps 2.0–2.5). Step 2.2 representation-interface content retained unchanged under its own section.
Purpose
Define a concrete, buildable target for REE-v2, incorporating everything learned from the V1 experimental program:
- Resolve substrate debts SD-001, SD-002, SD-003 in V2 architecture
- Implement persistent-causal-footprint environment (prerequisite for SD-003 experiments)
- Lock stable representation-interface contract for sensing adapters and E1/E2 latent prediction
- Expose interface hooks for V3 control completion (hippocampal, control-plane, E3)
V2 is the phase where subsystem separation becomes clean, the representation-interface becomes reliable, and self-attribution becomes experimentally testable — so that V3 can focus on control completion rather than fighting substrate debt.
Boundary note:
- JEPA is external to REE and is used here as an inspiration/comparison reference only.
- REE-v2 does not require JEPA to be part of the REE architecture graph.
V1 Learning Incorporated
V1 ran 14 experiments (EXQ-000 through EXQ-013). Results: 10 PASS, 4 FAIL.
PASSes confirm:
- MECH-059: Control plane precision separation is load-bearing
- MECH-056: Residue trajectory placement is load-bearing
- MECH-061: Commitment boundary token reclassification is load-bearing
- MECH-060/067: Write-locus contamination degrades attribution reliability
- MECH-063: Candidate count is load-bearing (n=32 meaningfully better than n=1)
- MECH-062: Residue routing weight is load-bearing (amnesic baseline degrades)
FAILs are substrate-limited, not architecture failures:
- MECH-058 (×2): E1/E2 timescale separation test confounded by SD-001. E2 was doing hippocampal work, so the E2_FROZEN condition tested “no trajectory proposal” rather than “no transition prediction”
- MECH-057 (×2): Control completion requires multi-step environment with genuine commitment pressure; V1 stateless hazard grid cannot surface this
Substrate debts registered:
- SD-001: E2.generate_candidates_cem() is hippocampal work inside E2. Prevents clean mechanistic isolation of E2 as transition model.
- SD-002: E1 and E2 are mutually constitutive, not orthogonal. E2 scaffolds E1’s associative learning; E1 primes E2’s predictions. MECH-058 framing incomplete.
- SD-003: E2 as self-attribution substrate requires clean isolatability as a pure transition model. V1 conflation (SD-001) and stateless environment prevent this.
V1 saturation: V1 has reached its qualification limit. Further V1 runs would not add mechanistic clarity. V2 build proceeds now.
V2 Subsystem Boundaries
Module map
| Module | Role | V1→V2 change |
|---|---|---|
| LatentStack | Multi-timescale latent (γ/β/θ/δ depths), two-pass encode | Unchanged |
| E1DeepPredictor | Slow LSTM world model; associative learning; long-horizon prediction; E1 prior for E2 | Unchanged; episode-boundary hidden-state semantics documented |
| E2FastPredictor | Pure fast transition model: f(z_t, a_t) → z_{t+1} | CEM removed; forward_counterfactual() added (SD-003) |
| HippocampalModule | New. Trajectory proposal by navigating residue-weighted affective terrain. CEM refinement loop. | Replaces E2.generate_candidates_cem() (SD-001) |
| E3TrajectorySelector | Scores trajectories: J(ζ) = F(ζ) + λ·M(ζ) + ρ·Φ_R(ζ). Selects via precision-gated softmax. | Scoring equation flagged as working hypothesis; requires experimental validation |
| ResidueField | Persistent ethical geometry. Accumulates harm. Weights only increase. | Unchanged; architectural note added: may become multi-module input (sensorium gate, hippocampal terrain, E1 conditioning) in V3+ |
| REEAgent | Wires all modules; two-timescale training loop | Updated to delegate trajectory proposal to HippocampalModule |
| CausalGridWorld | New. Grid environment with persistent agent causal footprint, contamination spread, background hazard drift | Replaces stateless V1 GridWorld (SD-003 prerequisite) |
E2: pure transition model (SD-001 resolution)
E2 is strictly f(z_t, a_t) → z_{t+1}. It does not search, plan, or refine trajectories.
Kept:
- `predict_next_state(z, a)`: single-step transition with residual connection
- `predict_observation(z)`: observation from latent
- `predict_harm(z, a)`: harm level in [0,1]
- `rollout(z, action_sequence)`: multi-step rollout given a fixed action sequence
- `generate_candidates_random()`: random-shooting baseline for HippocampalModule
Removed:
- `generate_candidates_cem()`: this was hippocampal work; moved to HippocampalModule
Added:
- `forward_counterfactual(z, a_cf)`: counterfactual query substrate for SD-003. Identical in implementation to `predict_next_state` but named distinctly to flag its use in attribution experiments:
  - `z_actual = e2.predict_next_state(z, a_actual)`
  - `z_cf = e2.forward_counterfactual(z, a_cf)`
  - `causal_delta = z_actual - z_cf` (the agent's causal signature)
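As a rough illustration of the E2 surface described above, the following sketch shows a toy transition model whose `forward_counterfactual` simply delegates to `predict_next_state`. The class name `E2Sketch`, the `action_gain` parameter, the tanh dynamics, and the helper `causal_delta` are all hypothetical stand-ins and not the actual E2FastPredictor; latents are plain Python lists here to keep the example dependency-free.

```python
import math
from typing import List, Sequence

Vector = List[float]


class E2Sketch:
    """Toy stand-in for a pure transition model f(z_t, a_t) -> z_{t+1}."""

    def __init__(self, action_gain: float = 0.1) -> None:
        # Hypothetical parameter: how strongly an action shifts the latent.
        self.action_gain = action_gain

    def predict_next_state(self, z: Vector, a: Vector) -> Vector:
        # Residual connection: z_{t+1} = z_t + delta(a_t).
        return [zi + self.action_gain * math.tanh(ai) for zi, ai in zip(z, a)]

    def forward_counterfactual(self, z: Vector, a_cf: Vector) -> Vector:
        # Same computation as predict_next_state; the separate name only
        # marks the call site as an attribution query.
        return self.predict_next_state(z, a_cf)

    def rollout(self, z: Vector, action_sequence: Sequence[Vector]) -> List[Vector]:
        # Multi-step rollout over a fixed action sequence; no search or refinement.
        states = []
        for a in action_sequence:
            z = self.predict_next_state(z, a)
            states.append(z)
        return states


def causal_delta(e2: E2Sketch, z: Vector, a_actual: Vector, a_cf: Vector) -> Vector:
    # Element-wise difference between the actual and counterfactual predictions.
    z_actual = e2.predict_next_state(z, a_actual)
    z_cf = e2.forward_counterfactual(z, a_cf)
    return [x - y for x, y in zip(z_actual, z_cf)]
```

When the counterfactual action equals the actual action the delta is zero by construction, which gives a cheap sanity check before running attribution experiments.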
HippocampalModule (SD-001 resolution)
HippocampalModule owns trajectory proposal. It uses E2 only as a forward model.
Architecture:
- `terrain_prior` MLP: `(z_beta, residue_at_z)` → initial action distribution mean. Biases proposals toward low-residue regions of the affective landscape.
- CEM refinement loop:
  - Sample `num_candidates` action sequences from current distribution
  - Roll each through `e2.rollout()` (E2 as pure forward model)
  - Score each: `harm + 0.5 × residue`
  - Refit distribution mean/std to elite fraction (lowest-scoring)
  - Repeat for `num_cem_iterations`
- Returns final candidate set to E3 for evaluation
Key design property: E2 is passed in as a dependency and called only for rollouts. The search strategy, terrain weighting, and elite selection are entirely hippocampal concerns.
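The CEM refinement loop can be sketched as follows, under simplifying assumptions: latents and actions are scalars, the `terrain_prior` output is replaced by a zero initial mean, and `elite_fraction` is an assumed parameter the spec does not name. `propose_trajectories` and `toy_rollout` are hypothetical names; the forward model is injected as a callable so the search logic never depends on E2 internals.

```python
import random
import statistics
from typing import Callable, List, Tuple

ActionSeq = List[float]


def propose_trajectories(
    e2_rollout: Callable[[float, ActionSeq], List[float]],
    harm_fn: Callable[[float], float],
    residue_fn: Callable[[float], float],
    z0: float,
    horizon: int,
    num_candidates: int = 32,
    num_cem_iterations: int = 3,
    elite_fraction: float = 0.25,  # assumption: not specified in the spec
    seed: int = 0,
) -> Tuple[List[ActionSeq], ActionSeq]:
    rng = random.Random(seed)
    mean = [0.0] * horizon  # stand-in for the terrain_prior output
    std = [1.0] * horizon
    n_elite = max(2, int(num_candidates * elite_fraction))
    candidates: List[ActionSeq] = []
    for _ in range(num_cem_iterations):
        # Sample action sequences from the current per-step Gaussian.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, std)]
            for _ in range(num_candidates)
        ]
        scored = []
        for actions in candidates:
            states = e2_rollout(z0, actions)  # E2 used only as a forward model
            cost = sum(harm_fn(z) + 0.5 * residue_fn(z) for z in states)
            scored.append((cost, actions))
        # Lowest cost first; refit mean/std to the elite set.
        scored.sort(key=lambda pair: pair[0])
        elite = [actions for _, actions in scored[:n_elite]]
        mean = [statistics.fmean(column) for column in zip(*elite)]
        std = [max(statistics.pstdev(column), 1e-3) for column in zip(*elite)]
    return candidates, mean


def toy_rollout(z: float, actions: ActionSeq) -> List[float]:
    # Hypothetical one-dimensional dynamics: each action shifts the latent.
    states = []
    for a in actions:
        z = z + a
        states.append(z)
    return states
```

Calling it with a residue function that is lowest around some target latent, for example `lambda z: (z - 2.0) ** 2`, should pull the refit mean toward action sequences that reach that region.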
V3 note: The terrain_prior is a simple MLP. A more principled design would model the hippocampus as learning a map of the affective landscape via place-cell-like representations. This is out of scope for V2.
E3TrajectorySelector: working hypothesis
The scoring equation J(ζ) = F(ζ) + λ·M(ζ) + ρ·Φ_R(ζ) is a working hypothesis, not a settled canonical formulation. The weights (λ, ρ) are placeholder parameters pending calibration experiments. The entire scoring function is expected to be redesigned as the three-gate basal ganglia model and HippocampalModule architecture mature.
Known gaps in V2 E3 implementation:
- F(ζ): implemented as trajectory smoothness + final-state viability score. Actual reality constraint mechanism not yet specified.
- M(ζ): does not distinguish agent-caused from environment-caused harm. This requires SD-003 self-attribution experiments (Step 2.4).
- Φ_R(ζ): treated as scalar cost term. May become multi-module input in V3+.
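A minimal sketch of the working-hypothesis scoring and selection step is given below. It assumes that J is a cost (lower is better) and that "precision-gated" means precision acts as an inverse temperature on a softmax over negative cost; neither assumption is confirmed by this spec. `TrajectoryTerms`, `score`, and `selection_probabilities` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class TrajectoryTerms:
    reality_cost: float  # F(zeta)
    ethical_cost: float  # M(zeta)
    residue_cost: float  # Phi_R(zeta)


def score(t: TrajectoryTerms, lambda_ethical: float, rho_residue: float) -> float:
    # J = F + lambda * M + rho * Phi_R, with placeholder weights.
    return t.reality_cost + lambda_ethical * t.ethical_cost + rho_residue * t.residue_cost


def selection_probabilities(scores: Sequence[float], precision: float) -> List[float]:
    # Assumption: higher precision sharpens the distribution toward low J.
    logits = [-precision * s for s in scores]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

With `precision = 0` every candidate is equally likely, and as precision grows the selection approaches an argmin over J, which makes the gating behaviour easy to check in isolation.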
ResidueField: architectural trajectory note
The ResidueField is currently used as a cost term in E3’s scoring function and as terrain input for HippocampalModule. At scale, the residue field (or residue manifold) may be better understood as an input to multiple modules rather than a penalty:
- As attentional prior for a sensorium gate (harm geometry shapes what the system notices)
- As hippocampal terrain (trajectory proposals navigate it)
- As E1 conditioning signal (associative learning weighted by harm history)
This is not a V2 implementation requirement. It is a design consideration to keep live as the architecture is developed toward V3+.
CausalGridWorld (SD-003 prerequisite)
The V1 GridWorld was stateless — no persistent agent causal footprint. This meant agent-caused and environment-caused transitions were indistinguishable, making self-attribution experiments impossible.
CausalGridWorld provides:
- Contamination grid: agent visits spread contamination with configurable rate. Contaminated cells can become additional hazards. Contamination persists across steps.
- Footprint grid: visit counts per cell, exposing agent’s causal history.
- Background drift: hazards random-walk independently of the agent on a configurable interval, providing environment-caused transitions.
- Transition type signal: `info['transition_type']` on each step:
  - `"agent_caused_hazard"`: contamination-triggered harm attributable to prior agent visits
  - `"env_caused_hazard"`: background drift-triggered harm independent of agent action
  - `"resource"`: reward from resource collection
  - `"none"`: no notable transition
This makes agent-caused and environment-caused harm structurally distinct and experimentally disambiguatable via counterfactual E2 queries.
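To make the labelling logic concrete, here is a heavily simplified sketch of an environment with a footprint grid, persistent contamination, and interval-based hazard drift. `CausalGridSketch`, the action encoding in `MOVES`, and the rule that contamination takes priority over background hazards when both apply are illustrative assumptions, not the actual CausalGridWorld implementation.

```python
import random
from typing import Dict, Iterable, Optional, Set, Tuple

Cell = Tuple[int, int]
# Hypothetical action encoding: four moves plus "stay".
MOVES = {0: (0, 1), 1: (0, -1), 2: (1, 0), 3: (-1, 0), 4: (0, 0)}


class CausalGridSketch:
    def __init__(
        self,
        size: int = 5,
        contamination_spread: float = 1.0,
        drift_interval: int = 3,
        hazards: Optional[Iterable[Cell]] = None,
        resources: Optional[Iterable[Cell]] = None,
        seed: int = 0,
    ) -> None:
        self.size = size
        self.contamination_spread = contamination_spread
        self.drift_interval = drift_interval
        self.rng = random.Random(seed)
        self.agent: Cell = (0, 0)
        self.hazards: Set[Cell] = set(hazards or [])
        self.resources: Set[Cell] = set(resources or [])
        self.contaminated: Set[Cell] = set()   # persists across steps
        self.footprint: Dict[Cell, int] = {}   # visit counts per cell
        self.t = 0

    def _clip(self, v: int) -> int:
        return min(max(v, 0), self.size - 1)

    def step(self, action: int):
        self.t += 1
        dx, dy = MOVES[action]
        self.agent = (self._clip(self.agent[0] + dx), self._clip(self.agent[1] + dy))
        # Background drift: hazards random-walk independently of the agent.
        if self.drift_interval and self.t % self.drift_interval == 0:
            self.hazards = {
                (self._clip(x + self.rng.choice((-1, 0, 1))),
                 self._clip(y + self.rng.choice((-1, 0, 1))))
                for x, y in self.hazards
            }
        cell = self.agent
        if cell in self.contaminated:
            kind = "agent_caused_hazard"
        elif cell in self.hazards:
            kind = "env_caused_hazard"
        elif cell in self.resources:
            self.resources.discard(cell)
            kind = "resource"
        else:
            kind = "none"
        # Record the visit, then possibly contaminate the visited cell.
        self.footprint[cell] = self.footprint.get(cell, 0) + 1
        if self.rng.random() < self.contamination_spread:
            self.contaminated.add(cell)
        return cell, {"transition_type": kind}
```

Setting `drift_interval=0` disables drift and `contamination_spread=0.0` disables contamination, which isolates each harm source when validating the labels.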
SD-002: E1/E2 Mutual Constitution (documentation, V2 scope)
E1 and E2 are not parallel systems with orthogonal outputs. They are co-constitutive:
- E2 scaffolds E1: E2’s transition sequence is the temporal evidence stream from which E1 distills associations. The association “arm-up → throwing” only becomes reliably learnable because E2 provides the ordered kinematic chain.
- E2→E1 autotrain (V3 work): E2’s forward predictions can generate synthetic temporal sequences that E1 learns from, without waiting for ground-truth observations. Not implemented in V2; flagged for V3.
- E1 primes E2: E1’s associative prior conditions E2’s transition predictions. Partially implemented in V2: E1 generates a prior that REEAgent computes but does not yet wire into HippocampalModule. Full bidirectional loop is Step 2.2 work.
Implication for MECH-058 results: The E1_FROZEN and E2_FROZEN conditions in EXQ-002/006 disrupted both directions of the mutual-constitution loop, not just the targeted component. MECH-058 FAILs should be interpreted at system level, not as evidence that E1/E2 timescale separation is non-load-bearing. Retesting on V2 substrate (Step 2.1 parity check) with clean E2 separation may yield different results.
V2 Implementation Steps
These map to roadmap Steps 2.0–2.5 (see docs/roadmap.md).
Step 2.0 — V2 Redesign ← this document
Status: In progress. This spec document is the primary deliverable.
Exit criteria:
- GOVERNANCE_STATE.md SD-003 entry complete
- Roadmap updated with Steps 2.1–2.5 refined
- V2 entry criteria revised in roadmap
- This spec document complete and committed ← current task
Step 2.1 — E2 Separation (SD-001 resolution)
Status: Code complete. Formal close pending parity check.
What was done:
- E2.generate_candidates_cem() removed
- HippocampalModule created as independent class
- E2.forward_counterfactual() added
- REEAgent wired to HippocampalModule
- CausalGridWorld built
- Smoke tests pass
Remaining:
- Parity check: replicate V1 PASS results (MECH-059, 056, 061, 060/067) on V2 substrate
- SD-001 formally closed in GOVERNANCE_STATE.md after parity check passes
Step 2.2 — Representation Interface Contract
Status: Not started. Original spec content (below) defines scope.
Scope: sensor adapters → JEPA-like context/target latent interfaces; IMPL-022; stable output streams; calibration metrics. E1 prior wired into HippocampalModule. Full bidirectional E1↔E2 loop connected.
Step 2.3 — Persistent Causal Environment
Status: Environment built. Baseline validation not run.
Remaining:
- Baseline experiment confirming agent-caused vs env-caused signal is empirically distinguishable (not just structurally present) in CausalGridWorld
Step 2.4 — Self-Attribution Substrate
Status: Not started. Prerequisite: Steps 2.1 and 2.3 closed.
Scope: counterfactual E2 queries exercised experimentally; first genuine experiments isolating agent-caused vs environment-caused harm; SD-003 closed in GOVERNANCE_STATE.md.
Success criterion: at least one genuine PASS on a self-attribution claim using forward_counterfactual() against CausalGridWorld.
Step 2.5 — V2 Qualification
Status: Not started. Prerequisite: all prior steps.
Required Metrics (V2)
Core representation metrics (Step 2.2)
- `latent_prediction_error_mean`
- `latent_prediction_error_p95`
- `latent_rollout_consistency_rate`
- `latent_residual_coverage_rate`
- `precision_input_completeness_rate`
- `latent_uncertainty_calibration_error` (if confidence-channel inputs present)
Mechanism-targeted metrics
- `e1_e2_timescale_separation_ratio` (MECH-058): expected to improve on V2 substrate
- `uncertainty_coverage_rate` (MECH-059)
- `cross_channel_leakage_rate` (MECH-060, provisional hook for V3)
SD-003 self-attribution metrics (Step 2.4)
- `agent_caused_harm_rate`: harm events attributed to agent causal footprint
- `env_caused_harm_rate`: harm events attributed to background environment drift
- `counterfactual_delta_mean`: mean `z_actual - z_cf` across transitions
- `attribution_accuracy`: fraction of `transition_type` labels correctly predicted by counterfactual delta
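One possible way to compute these four metrics from logged transitions is sketched below. Several choices here are assumptions rather than spec: the delta is reduced to a scalar with an L2 norm, rates are taken over all steps, and attribution accuracy is evaluated on harm events only by thresholding the norm with a hypothetical `delta_threshold`.

```python
import math
from typing import Dict, List, Sequence

HARM_TYPES = ("agent_caused_hazard", "env_caused_hazard")


def l2_norm(v: Sequence[float]) -> float:
    return math.sqrt(sum(x * x for x in v))


def attribution_metrics(
    transition_types: Sequence[str],
    deltas: Sequence[Sequence[float]],  # z_actual - z_cf per transition
    delta_threshold: float,             # assumption: simple decision rule
) -> Dict[str, float]:
    n = len(transition_types)
    norms: List[float] = [l2_norm(d) for d in deltas]
    agent = sum(t == "agent_caused_hazard" for t in transition_types)
    env = sum(t == "env_caused_hazard" for t in transition_types)
    harm_idx = [i for i, t in enumerate(transition_types) if t in HARM_TYPES]
    # Predict "agent-caused" when the counterfactual delta is large.
    correct = sum(
        (norms[i] > delta_threshold) == (transition_types[i] == "agent_caused_hazard")
        for i in harm_idx
    )
    return {
        "agent_caused_harm_rate": agent / n,
        "env_caused_harm_rate": env / n,
        "counterfactual_delta_mean": sum(norms) / n,
        "attribution_accuracy": correct / len(harm_idx) if harm_idx else 0.0,
    }
```

A thresholded norm is the simplest decision rule that fits the description; a learned classifier over the delta would be a natural replacement once baseline separability is confirmed.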
HippocampalModule metrics (Step 2.1 parity / Step 2.3+)
- `terrain_bias_score`: degree to which CEM iterates toward lower-residue regions
- `elite_harm_mean`: mean harm of elite trajectories vs random baseline
Failure Gates (V2)
Hard-fail conditions (block V2 completion)
- Adapter contract drift or missing required signal declaration
- Timescale collapse (E1/E2 separation fails repeatedly on V2 substrate)
- Uncertainty stream unavailable or uncalibrated where required
- Residual/precision input completeness below threshold
- E2 counterfactual queries produce no measurable causal delta (Step 2.4)
- CausalGridWorld agent/env distinguishability not confirmed empirically (Step 2.3)
- SD-001 parity check fails: V1 PASS experiments do not replicate on V2 substrate
Readiness thresholds (tunable, but explicit)
- `latent_residual_coverage_rate >= 0.95`
- `precision_input_completeness_rate >= 0.95`
- `e1_e2_timescale_separation_ratio >= 1.5` in qualification lane
- `agent_caused_harm_rate` vs `env_caused_harm_rate` separable under matched seeds
- `counterfactual_delta_mean > 0` in at least one self-attribution experiment
- V1 PASS claims (MECH-059, 056, 061, 060/067, 062, 063) replicate on V2 substrate
- No unresolved adapter-contract failures across two consecutive governance cycles
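The numeric thresholds above lend themselves to a small automated gate. The sketch below covers only the four thresholds that reduce to a scalar comparison; separability under matched seeds, V1 parity, and adapter-contract history need their own checks. `THRESHOLDS` and `failing_thresholds` are hypothetical names, and a missing metric is treated as a failure by assumption.

```python
import operator
from typing import Callable, Dict, List, Mapping, Tuple

# Metric key -> (comparison, bound), taken from the readiness thresholds.
THRESHOLDS: Dict[str, Tuple[Callable[[float, float], bool], float]] = {
    "latent_residual_coverage_rate": (operator.ge, 0.95),
    "precision_input_completeness_rate": (operator.ge, 0.95),
    "e1_e2_timescale_separation_ratio": (operator.ge, 1.5),
    "counterfactual_delta_mean": (operator.gt, 0.0),
}


def failing_thresholds(metrics: Mapping[str, float]) -> List[str]:
    """Return the metric keys that are missing or below their bound."""
    failures = []
    for key, (compare, bound) in THRESHOLDS.items():
        value = metrics.get(key)
        if value is None or not compare(value, bound):
            failures.append(key)
    return failures
```

Keeping the bounds in one table makes the "tunable, but explicit" requirement auditable: changing a threshold is a one-line diff.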
Acceptance Checklist (Go/No-Go for V2 Completion)
- SD-001 closed: E2 parity check passes, HippocampalModule independently testable
- SD-002 documented: e2.md updated with mutual constitution framing
- SD-003 closed: counterfactual E2 querying demonstrated experimentally
- CausalGridWorld: agent/env distinguishability confirmed empirically
- Representation interface stable across qualification and stress lanes (Step 2.2)
- All required metrics emitted in contract-compliant experiment packs
- Core V2 mechanism claims have active evidence with documented conflict handling
- Governance output recommends moving implementation focus to V3 control completion
Step 2.2 Scope: Representation Interface Contract
Content below is from the original V2 spec. Scope: Step 2.2 only.
Terminology Mode (v2)
V2 canonical docs in REE_assembly use REE-first language throughout.
JEPA is used as inspiration only — for the self-supervised latent prediction principle and the conceptual distinction between context encoding (slow) and target prediction (fast). JEPA terminology does not appear in V2 contracts, schemas, or metric keys. There is no JEPA adapter layer in V2; E1, E2, and LatentStack are the native REE representation system.
Step 2.2 Scope: E1/E2 Representation Interface Contract
Note on JEPA (2026-03-01): JEPA is used as inspiration only. REE-v2 implements its own native E1/E2 representation system. No JEPA adapter signals, proxy bank, or EMA anchor path from JEPA are part of the V2 contract. The REE latent stack (LatentStack, E1, E2) is the canonical implementation, not an adapter layer over an external system.
In scope (Step 2.2)
- Native sensor ingress: normalise environment observations to LatentStack input format
- E1/E2 representation interface contract:
  - stable latent output streams (`z_t`, `z_hat`, `pe_latent`, optional `uncertainty_latent`)
  - E1 prior wired into HippocampalModule (currently computed but not connected; Step 2.2 work)
  - E2→E1 autotrain pathway design (V3 scope; interface stub in V2)
- Stable, contract-compliant experiment pack output
- Uncertainty/calibration channels with measurable calibration
Out of scope (Step 2.2)
- Final control-plane arbitration policy (V3)
- Full hippocampal replay/planning controller (V3)
- Full E3 commitment policy and accountability (V3)
V2 must still expose stub hooks for:
- pre-commit simulation error stream (`MECH-061`)
- post-commit realized error stream (`MECH-061`)
- commit-boundary token envelope (`HK-007`)
- trajectory candidate metadata (`HK-008`, `MECH-062`)
These become V3 primary implementation scope.
Step 2.2 Implementation Objectives
- `OBJ-V2-001`: stable latent representation outputs under repeated runs and seed variation
- `OBJ-V2-002`: explicit uncertainty/error channels with measurable calibration
- `OBJ-V2-003`: strict experiment-pack contract compliance (manifest, metrics, summary)
- `OBJ-V2-004`: expose V3-ready hook stubs without embedding V3 policy decisions
- `OBJ-V2-005`: E1 prior wired into HippocampalModule (bidirectional E1↔E2 loop connected)
Step 2.2 E1/E2 Interface Contract
Runtime ingress (per step):
- `obs_t`: current environment observation tensor
- `ctx_window`: ordered context frames/latent states `[z_{t-k}, ..., z_{t-1}]`
- `action_t` (optional): action taken at step t (for E2 conditioning)
- `run_tags`: mode and scenario labels for experiment pack
Runtime egress (per step):
- `z_t`: current latent state `[batch, latent_dim]`, from `LatentStack.get_affordance_latent()`
- `z_hat`: E1 long-horizon predictions `[batch, horizon, latent_dim]`
- `z_next_pred`: E2 fast next-state prediction `[batch, latent_dim]`
- `pe_latent`: latent prediction error from committed trajectory
  - `mean`: mean squared error over latent_dim
  - `p95`: 95th percentile across batch
- `uncertainty_latent` (optional):
  - `dispersion`: std of z across ensemble or rollout samples
  - `calibration_error`: if calibration head present
All production evidence must be exportable into contract-compliant experiment pack keys.
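The `pe_latent` and `dispersion` egress fields could be computed along the following lines. This is a sketch under stated assumptions: tensors are represented as nested Python lists, `mean` averages the per-sample MSE across the batch, `p95` uses a nearest-rank percentile, and `dispersion` averages the per-dimension population standard deviation. The spec does not fix any of these reductions.

```python
import math
import statistics
from typing import Dict, List

Batch = List[List[float]]  # [batch, latent_dim] as nested lists


def pe_latent(z_pred: Batch, z_obs: Batch) -> Dict[str, float]:
    # Per-sample mean squared error over latent_dim.
    per_sample = [
        sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred)
        for pred, obs in zip(z_pred, z_obs)
    ]
    # Nearest-rank 95th percentile across the batch (assumed definition).
    ordered = sorted(per_sample)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {"mean": statistics.fmean(per_sample), "p95": ordered[rank - 1]}


def dispersion(samples: Batch) -> float:
    # Std of z across rollout or ensemble samples, averaged over latent dims.
    per_dim = [statistics.pstdev(column) for column in zip(*samples)]
    return statistics.fmean(per_dim)
```

Whatever reduction is finally chosen, recording it in the manifest alongside the metric values keeps `latent_prediction_error_p95` comparable across runs.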
Step 2.2 Configuration Contract
Required configurable parameters:
- `latent_dim`
- `e1_horizon`: E1 long-horizon prediction steps
- `e2_rollout_horizon`: E2 rollout horizon for trajectory candidates
- `e2_num_candidates`: number of CEM candidate trajectories
- `hippocampal_num_cem_iterations`: CEM iterations in HippocampalModule
- `residue_rbf_centers`: number of RBF harm centers in ResidueField
- `uncertainty_estimator`: `none | dispersion`
- `lambda_ethical`: E3 ethical cost weight (placeholder; requires calibration)
- `rho_residue`: E3 residue cost weight (placeholder; requires calibration)
Required reproducibility metadata in emitted manifests:
- scenario id/name and seed
- config hash
- environment: `env_id`, `env_version`, `grid_size`, `contamination_spread`, `env_drift_prob`
- source repo commit
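A configuration object and manifest builder covering the parameters and reproducibility metadata listed above might look like this sketch. The default values, the manifest key names, and the choice of SHA-256 over sorted JSON for the config hash are all illustrative assumptions; only the parameter names come from this spec.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class V2Config:
    # Default values are placeholders, not recommended settings.
    latent_dim: int = 64
    e1_horizon: int = 20
    e2_rollout_horizon: int = 5
    e2_num_candidates: int = 32
    hippocampal_num_cem_iterations: int = 3
    residue_rbf_centers: int = 16
    uncertainty_estimator: str = "none"
    lambda_ethical: float = 1.0
    rho_residue: float = 0.5

    def __post_init__(self) -> None:
        if self.uncertainty_estimator not in ("none", "dispersion"):
            raise ValueError("uncertainty_estimator must be 'none' or 'dispersion'")


def config_hash(cfg: V2Config) -> str:
    # Sorted keys make the hash independent of field ordering.
    payload = json.dumps(asdict(cfg), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_manifest(
    cfg: V2Config,
    scenario_id: str,
    seed: int,
    environment: Dict[str, Any],  # env_id, env_version, grid_size, ...
    repo_commit: str,
) -> Dict[str, Any]:
    return {
        "scenario_id": scenario_id,
        "seed": seed,
        "config_hash": config_hash(cfg),
        "environment": dict(environment),
        "source_repo_commit": repo_commit,
    }
```

Hashing the full config rather than selected fields means any parameter change, including the placeholder E3 weights, shows up as a different `config_hash` in the manifest.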
Step 2.2 Qualification Profiles
- E1/E2 timescale ablation (MECH-058, retested on clean V2 substrate)
- E1_FROZEN vs E2_FROZEN vs FULL on CausalGridWorld
- Now interpretable cleanly: E2_FROZEN = no transition prediction; E1_FROZEN = no associative prior. SD-001 conflation no longer contaminates results.
- Uncertainty channels (MECH-059)
- Deterministic vs dispersion estimator
- Verify precision-input completeness
- Write-locus validation (MECH-060/067, parity check on V2)
- Confirm V1 PASS replicates on V2 substrate
- Pre-commit sim error vs post-commit realized error separation
- E1 prior integration (new in V2)
- With vs without E1 prior wired into HippocampalModule
- Verify terrain_prior biases trajectory proposals toward lower-residue regions
Step 2.2 Milestones
- M2.2-0: E1 prior wired into HippocampalModule; bidirectional loop complete
- M2.2-1: Stable representation outputs: `z_t`, `z_hat`, `pe_latent` present across runs
- M2.2-2: Uncertainty calibration: dispersion estimator validates against held-out outcomes
- M2.2-3: V3 hook stubs: pre/post commit placeholders exported consistently
- M2.2-4: Step 2.2 stabilization: two clean experiment cycles, no contract churn
Cross-Version Hooks Required in V2
V2 must expose all v2_required hooks listed in docs/architecture/hook_registry.v1.json.
V2 should also emit stubs for v3_planned hooks where feasible:
- pre-commit simulation error stream placeholder
- post-commit realized error stream placeholder
- commitment-context trace IDs for attribution
- rollout-candidate metadata for hippocampal/controller attachment
Bridge coverage expectation in v2_required hook tier:
- commit-boundary token envelope export (`HK-007`, `MECH-061`)
- tri-loop gate arbitration trace export (`HK-008`, `MECH-062`)
- orthogonal control-axis telemetry export (`HK-009`, `MECH-063`)
These are interface commitments, not full behavior commitments for the future layers.
Repository Strategy
- `REE_assembly`: canonical governance, spec, evidence matrix
- `ree-v2`: primary qualification lane for V2 experiments
- `ree-v1-minimal`: legacy baseline/reference harness for parity checks only; not the primary V2 qualification lane
- `ree-experiments-lab`: stress/falsification/adversarial experiments
V2 qualification lane activation: once Step 2.1 parity check passes, ree-v2 becomes the required lane. ree-v1-minimal is retired from primary qualification status.
Open Questions
- What threshold values should gate V2 readiness without overfitting to current harnesses?
- How much of dual pre/post commit channeling should be stubbed vs implemented in late V2?
- What is the right uncertainty estimator for V2 given the native E1/E2 architecture? (The JEPA ensemble/head distinction is no longer relevant; dispersion over E2 rollouts is the natural candidate.)
Related Claims (IDs)
- IMPL-023
- IMPL-008
- IMPL-021
- IMPL-022
- IMPL-025
- MECH-057
- MECH-058
- MECH-059
- MECH-060
- MECH-061
- MECH-062
- MECH-063