Trajectory Selection

Claim Type: implementation_note
Scope: Trajectory selection detail for E3
Depends On: ARC-003, ARC-004, ARC-013
Status: candidate
Claim ID: IMPL-016

Source: docs/processed/legacy_tree/architecture/trajectory_selection.md

Trajectory selection (E3)

E3 evaluates candidate sensorimotor futures and commits to one trajectory.

E3 is not a perceptual system and does not overwrite the shared sensory latent. Its role is to:

select a coherent sensorimotor future from an affordance space,
raise precision on the selected plan (commitment), and
convert hypothetical rollouts into learning-relevant prediction errors after action.

Canonical clarification (2026-02-09): explicit multi-step rollouts are generated by hippocampal systems. E2 supplies short-horizon predictions and affordances that seed hippocampal candidate generation. References below to “E2 rollouts” should be read as implementation approximations unless hippocampal generation is explicitly modeled. Terminology: rollout = explicit hippocampal multi-step sequence; forward prediction = E2 local transition kernel.

Inputs and representational domains

REE distinguishes two primary latents:

Shared sensory latent (z_S): reliability-weighted, fused multimodal evidence (completed perception).
Affordance/action latent (z_A): candidate sensorimotor futures generated by fast forward prediction.

E3 operates on (z_A), which is derived from (z_S) via E2. E3 does not operate directly on raw modality streams.

Candidate generation (Hippocampus seeded by E2 → (z_A))

Generate (N) candidate trajectories (\zeta_i) of horizon (H) via hippocampal rollout from the current shared sensory latent (z_S), seeded by E2 affordances and E1 constraints.

Common options:

Model Predictive Control (MPC) hippocampal rollouts with random shooting.
Cross-Entropy Method (CEM) sampling for hippocampal rollout candidates.
Beam search for discrete action spaces (hippocampal rollout candidates).
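
As a minimal sketch of the first option, random shooting samples (N) action sequences of length (H) and rolls each one forward through a one-step transition model. All names here are hypothetical: forward_predict stands in for the E2 local transition kernel, the latent is a plain list of floats, and E1 constraints are omitted. In line with the canonical clarification above, this is an implementation approximation of hippocampal generation, not a model of it.

```python
import random

def forward_predict(z, action):
    """Hypothetical stand-in for the E2 one-step transition kernel (additive toy dynamics)."""
    return [zi + ai for zi, ai in zip(z, action)]

def random_shooting(z_s, n_candidates, horizon, action_scale=1.0, seed=0):
    """Sample N random action sequences and roll each out for H steps from z_S."""
    rng = random.Random(seed)
    dim = len(z_s)
    candidates = []
    for _ in range(n_candidates):
        z = list(z_s)  # copy so the shared sensory latent is never overwritten
        actions, states = [], []
        for _ in range(horizon):
            a = [rng.uniform(-action_scale, action_scale) for _ in range(dim)]
            z = forward_predict(z, a)
            actions.append(a)
            states.append(z)
        candidates.append({"actions": actions, "states": states})
    return candidates

candidates = random_shooting(z_s=[0.0, 0.0], n_candidates=8, horizon=5)
```

CEM would differ only in how actions are sampled: it refits the sampling distribution to the best-scoring candidates over several rounds instead of drawing once from a fixed range.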

Scoring

Legacy scoring formulation (preserved for traceability; current canonical framing does not require an explicit ethical cost term):

[ J(\zeta)=\mathcal{F}(\zeta)+\lambda\,M(\zeta)+\rho\,\Phi_R(\zeta) ]

Where (implementation-dependent, but conceptually stable in the legacy framing):

(\mathcal{F}(\zeta)): feasibility / coherence / constraint satisfaction (including physical viability and internal consistency across predictive depths).
(M(\zeta)): legacy mission-aligned objective terms (task progress, care objectives, etc.). Retained for evaluation only; not required for E3 selection in the current canonical framing.
(\Phi_R(\zeta)): residue curvature term (persistent ethical consequence; path dependent).
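
The legacy formulation can be sketched as a weighted sum followed by a selection step. The source does not state whether (J) is minimised or maximised; this sketch assumes every term is expressed as a cost and that the lowest (J) wins. The dictionary keys, the function names, and the numbers are illustrative assumptions, not values from the legacy sources.

```python
def score(f_cost, m_cost, residue, lam=1.0, rho=1.0):
    """Legacy score J = F + lambda * M + rho * Phi_R for one candidate."""
    return f_cost + lam * m_cost + rho * residue

def select(candidates, lam=1.0, rho=1.0):
    """Return (index, scores), picking the lowest J (assumed cost convention)."""
    scores = [score(c["F"], c["M"], c["Phi_R"], lam, rho) for c in candidates]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Illustrative per-candidate terms (made-up numbers).
candidates = [
    {"F": 0.2, "M": 0.5, "Phi_R": 0.0},
    {"F": 0.1, "M": 0.3, "Phi_R": 2.0},
    {"F": 0.4, "M": 0.1, "Phi_R": 0.1},
]
best, scores = select(candidates, lam=1.0, rho=0.5)
best_without_residue, _ = select(candidates, lam=1.0, rho=0.0)
```

With these numbers the residue weight changes the outcome: the second candidate has the lowest cost when (\rho = 0), but its large residue term rules it out once (\rho = 0.5).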

Temporal collapse and the constructed present

E3 does not select an instantaneous action. It selects a trajectory through a temporally displaced latent space whose predictions span multiple future offsets (sensory, motor, procedural, affective).

Commitment performs a temporal collapse:

It phase-aligns temporally displaced predictions with motor execution timing
It locks a smeared predictive bundle into an action-relevant frame
It enforces cross-horizon coherence so the trajectory is experienced as a single unfolding “now”

The subjective present therefore corresponds to a phase-stable, committed trajectory, not to a time-slice of perception.

Without commitment, temporally displaced rollouts remain exploratory and do not generate a unitary present-moment experience.

Commitment as precision gating

Commitment is implemented as a precision increase at action/control depth ((\beta) and sometimes (\gamma)):

reduce action entropy for the selected plan
increase weighting of action-level prediction error (stabilises execution)

This corresponds to dopamine-like commitment without requiring a reward-only interpretation.
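
One way to picture the entropy reduction is a softmax over candidate costs whose sharpness is set by a scalar precision. This is a sketch under assumptions: the softmax form, the precision parameter (a hypothetical inverse temperature, not the depth labels (\beta) and (\gamma)), and the cost values are all illustrative rather than taken from the source.

```python
import math

def policy(costs, precision):
    """Softmax over negated costs; higher precision concentrates probability mass."""
    logits = [-precision * c for c in costs]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

costs = [0.7, 1.4, 0.55]                      # illustrative candidate costs
exploratory = policy(costs, precision=1.0)    # before commitment
committed = policy(costs, precision=20.0)     # after commitment
```

Raising the precision leaves the ranking of candidates unchanged but lowers the entropy of the distribution, which is the sense in which commitment here is a gain change rather than a new reward signal.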

E3 control plane (precision, gain, and mode tuning)

In addition to trajectory commitment, E3 operates a control plane that tunes how the rest of the system runs. This control plane exposes a registry of tunable parameters across modules and depths, including:

sensory precision and gain (attention-like modulation over (z_S)),
action and policy precision (commitment strength),
rollout temperature, horizon, and branching factor (hippocampal generator; seeded by E2),
learning-rate rigidity or plasticity in deep models (E1),
replay rate and pattern-completion bias (hippocampal braid), and
operating mode (task-engaged, Default Mode–like, or sleep/offline).

These parameters form a structured tuning surface (\theta_{\text{tune}}), updated by E3 based on context, urgency, residue curvature, and predicted risk or harm.

Neuromodulatory systems in biological cognition can be understood as implementing such a control plane. In REE, these channels are treated functionally: as precision, gain, stability, urgency, and availability controls, rather than as reward signals.

Epistemic consequence of commitment

Before commitment, rollouts are hypotheses. After commitment, the selected trajectory is treated as intended.

This matters because only committed outcomes make errors diagnostic of the model. E3 therefore acts as an epistemic commitment gate:

uncommitted rollouts do not trigger belief update
committed trajectories produce learning-relevant prediction errors after action
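
The gate can be sketched as a small object that emits a prediction error only when an outcome follows a commitment. The class and method names are hypothetical, and the one-shot behaviour (a commitment is consumed by the first observation) is an assumption made to keep the example short.

```python
class CommitmentGate:
    """Toy epistemic gate: errors are produced only for committed predictions."""

    def __init__(self):
        self.committed = None  # predicted outcome of the committed trajectory, if any

    def commit(self, predicted):
        """Mark a predicted outcome as intended."""
        self.committed = list(predicted)

    def observe(self, observed):
        """Return the prediction error if something was committed, else None."""
        if self.committed is None:
            return None  # uncommitted rollout: no belief update is triggered
        error = [o - p for o, p in zip(observed, self.committed)]
        self.committed = None  # assumption: each commitment is consumed once
        return error

gate = CommitmentGate()
ignored = gate.observe([1.0, 2.0])   # nothing committed yet
gate.commit([1.0, 2.0])
error = gate.observe([1.5, 2.0])     # committed, so an error is produced
```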

Error routing after action

After executing the committed plan, prediction errors are routed by domain:

Perceptual error (sensory mismatch in (z_S)) → updates perceptual priors via error-weighted updates (no semantic overwrite).
Affordance / procedural error (predicted trajectory vs achieved) → updates E2 affordance model (what actions tend to do).
Deep model error (systematic mismatch across contexts) → updates E1 (slow, integrative world/self model).
Value mismatch / moral residue (harm/benefit deviation; constraint violations) → updates residue field (\phi(z)) / (\Phi_R).
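
The routing above amounts to a dispatch table from error domain to update target. The sketch below only records which target would receive each error; the domain keys and target names are hypothetical labels for the components named in the list, and the actual update rules are out of scope.

```python
# Hypothetical labels for the four domains and their update targets.
ROUTES = {
    "perceptual": "perceptual_priors",   # sensory mismatch in z_S
    "affordance": "E2",                  # predicted vs achieved trajectory
    "deep_model": "E1",                  # systematic mismatch across contexts
    "value": "residue_field",            # harm/benefit deviation
}

def route_errors(errors):
    """Group post-action errors by the component they should update."""
    updates = {}
    for domain, err in errors.items():
        target = ROUTES[domain]  # raises KeyError for an unknown domain
        updates.setdefault(target, []).append(err)
    return updates

updates = route_errors({"perceptual": 0.2, "affordance": -0.1, "value": 0.05})
```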

Narrative continuity and safe switching

E3 maintains coherence of agency over time by favouring continuity between successive committed trajectories.

Candidate futures generated by E2 typically share partial prefixes corresponding to near-term sensorimotor structure. E3 enforces a safe switching constraint:

trajectory switches are preferentially allowed where candidate futures share a sufficient prefix with the current committed plan,
switches that break near-term coherence incur increasing cost unless overridden by urgency or constraint violation.

This ensures that plan revision occurs at points where futures remain compatible, producing a stable narrative line.

The subjective present (“now”) corresponds to the decision frontier where:

past path memory (hippocampal braid),
current sensory anchoring ((z_S)), and
multiple predicted futures ((z_A))

remain jointly consistent, allowing redirection without loss of self-coherence.

Attention and precision coupling

Attention is implemented as precision (gain) modulation. E3 interacts with attention by:

requesting increased precision on task-relevant sensory channels (to reduce ambiguity during execution), and/or
lowering precision in internal simulation modes where trajectories are exploratory rather than committed.

In REE, attention changes the weight of error signals; it does not inject new sensory content.
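
A minimal sketch of that distinction, assuming a per-channel scalar precision: the observed and predicted values are left untouched, and only the error that reaches the update step is rescaled. The function name and numbers are illustrative.

```python
def weighted_error(observed, predicted, precision):
    """Per-channel prediction error scaled by its precision (gain)."""
    return [w * (o - p) for o, p, w in zip(observed, predicted, precision)]

observed = [1.0, 2.0]
predicted = [0.5, 1.0]

# Raise precision on channel 0 (task-relevant), lower it on channel 1.
attended = weighted_error(observed, predicted, precision=[2.0, 0.5])
neutral = weighted_error(observed, predicted, precision=[1.0, 1.0])
```

The raw mismatch is identical in both calls; attention changes only how strongly each channel's error would drive an update.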

Ethical constraints as architectural consequences

Many practical rules (e.g., avoid harm, prefer benefit, disclose risks/benefits/alternatives to support capacity) can be modelled as constraints and scoring terms that emerge from the architecture:

harm/benefit prediction appears naturally as part of scoring candidate futures
commitment makes outcomes attributable (and therefore ethically and epistemically relevant)
residue makes ethical cost path dependent and persistent

These rules are therefore not treated as the deepest ethical primitives; they are operational constraints that arise from a cognition architecture that must act under uncertainty while preserving corrigibility and responsibility.

Related documents

Shared sensory latent and timescale stack: latent_stack.md
Residue geometry (field vs path): residue_geometry.md
Path memory and replay (hippocampal braid): hippocampal_braid.md
Internal generative mode (Default Mode Network analogue): default_mode.md

Open Questions

None noted in preserved sources.

References / Source Fragments

docs/processed/legacy_tree/architecture/trajectory_selection.md
docs/thoughts/2026-02-09_e2_hpc_interface.md