SD-023: Environmental Gradient Texture (World Extension)

Status: IMPLEMENTED 2026-04-09
Gap ID: SD-023-env-gradient-texture
Depends on: SD-014


Problem

z_world can only encode features that exist in world_obs. CausalGridWorldV2 currently emits two proximity fields (hazard, resource), making z_world approximately a two-scalar encoding regardless of encoder expressiveness. This creates a ceiling on all world-model claims:

  • MECH-216 (E1 predictive wanting): E1 has no leading-indicator signal to predict. At approach positions, z_world already contains the resource proximity signal, so E1 learns “resource_prox is high” rather than “pattern X predicts upcoming resource contact.” EXQ-263a likely produces artifactual salience for this reason, not because MECH-216 is wrong.

  • ARC-017 (typed stream separation) / MECH-096 (multimodal fusion): These claims require z_world to encode distinct features for different world content. With only two undifferentiated proximity channels the claims are structurally untestable.

  • E1 world model quality generally: E1’s LSTM should build a model of temporal dynamics. With two proximity fields as input there are almost no independent dynamics to model.

The root diagnosis note from 2026-04-09 states the SD-023 problem directly: “There are few signals in the world from which to develop models of what is going on too.”


Design

Principle: all objects emit, each type distinctively

Every placed object has its own gradient channel in world_obs. This is biologically grounded: in natural environments all objects have a detectable presence (olfactory, acoustic, visual texture). Making objects emit creates continuous world texture rather than sparse point sources.

Object types and gradient channels

Two new landmark channels extend world_obs alongside the two existing proximity channels:

Channel block            Source                           Dims   Rationale
hazard_field_view        hazard proximity (existing)      25
resource_field_view      resource proximity (existing)    25
landmark_a_field_view    Landmark A proximity             25     Navigation anchor, no harm/benefit
landmark_b_field_view    Landmark B proximity             25     Predictive cue – biased near resources

world_obs_dim: 250 -> 300 (50 new dims: two new 5x5 field channels of 25 values each).

Landmark A (“pillar”): Navigation anchor, placed randomly. Strong short-range Gaussian gradient (sigma=1.5, scale=1.0). No harm, no benefit.

Landmark B (“trace”): Predictive resource cue, placed with bias near resource locations (prob=0.7 within radius 2 of a resource). Weaker medium-range gradient (sigma=2.5, scale=0.6). This creates the predictive co-occurrence structure for MECH-216: E1 can learn that a high Landmark B field predicts upcoming resource contact because Landmark B tends to be near resources.
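The biased placement rule for Landmark B can be sketched as follows. This is a minimal standalone illustration, not the actual _place_biased_near_resources method: the function name, the explicit resources argument, the use of Chebyshev distance for "within radius 2", and the fallback to uniform placement when no nearby cell is free are all assumptions made for the example.

```python
import random


def place_biased_near_resources(n, resources, available,
                                bias_prob=0.7, radius=2, rng=None):
    """Pick up to n distinct cells; each lands near a resource with probability bias_prob."""
    if rng is None:
        rng = random.Random()
    free = set(available)  # cells still open for placement
    placed = []
    for _ in range(n):
        if not free:
            break
        # Free cells within `radius` of any resource (Chebyshev distance: an assumption).
        near = [c for c in free
                if any(max(abs(c[0] - rx), abs(c[1] - ry)) <= radius
                       for rx, ry in resources)]
        # With probability bias_prob draw from the nearby cells, else from all free cells.
        pool = near if (near and rng.random() < bias_prob) else list(free)
        cell = rng.choice(sorted(pool))  # sort so a seeded rng is reproducible
        placed.append(cell)
        free.discard(cell)
    return placed
```

With bias_prob=0.7, roughly 70% of draws come from the near-resource pool; the uniform branch can also land near a resource by chance, so the realized co-occurrence is at least that high.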

Gradient field computation

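def _compute_landmark_field(self, positions, sigma, scale):
    """Sum one Gaussian bump per landmark over the size x size grid (uses numpy as np)."""
    field = np.zeros((self.size, self.size), dtype=np.float32)
    two_sigma2 = 2.0 * sigma * sigma  # Gaussian denominator, 2 * sigma^2
    for lx, ly in positions:
        for x in range(self.size):
            for y in range(self.size):
                d2 = float((x - lx)**2 + (y - ly)**2)  # squared Euclidean distance to the landmark
                field[x, y] += scale * float(np.exp(-d2 / two_sigma2))
    return field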

Fields are static within an episode (landmarks do not move) and are recomputed at episode start. The agent’s 5x5 view window is extracted identically to the existing hazard/resource fields.
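Because the field is computed once per episode, the triple loop is unlikely to be a bottleneck, but the same sum of Gaussians can be written with array operations. The sketch below is a standalone equivalent under the assumption that positions are (x, y) grid indices; the free function compute_landmark_field and its explicit size argument are illustrative and not part of the environment's API.

```python
import numpy as np


def compute_landmark_field(size, positions, sigma, scale):
    """Sum of isotropic Gaussians centred on each landmark, on a size x size grid."""
    field = np.zeros((size, size), dtype=np.float32)
    # Coordinate grids: xs[x, y] == x and ys[x, y] == y.
    xs, ys = np.meshgrid(np.arange(size), np.arange(size), indexing="ij")
    two_sigma2 = 2.0 * sigma * sigma
    for lx, ly in positions:
        d2 = (xs - lx) ** 2 + (ys - ly) ** 2  # squared distance to this landmark
        field += (scale * np.exp(-d2 / two_sigma2)).astype(np.float32)
    return field


# Landmark B parameters from this document: the peak equals `scale` at the
# landmark cell and the value decays smoothly with distance.
field_b = compute_landmark_field(10, [(4, 4)], sigma=2.5, scale=0.6)
```

At distance d from a single landmark the value is scale * exp(-d^2 / (2 * sigma^2)), so Landmark B (sigma=2.5) falls off more slowly than Landmark A (sigma=1.5) despite its lower peak.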

harm_obs stays unchanged

Landmark fields do NOT feed into harm_obs (they are not nociceptive signals). harm_obs_dim stays 51. Only world_obs grows (250 -> 300 when landmarks enabled).


Implementation

File: ree_core/environment/causal_grid_world.py

New __init__ params (all default to 0/disabled for backward compatibility):

n_landmarks_a: int = 0,
n_landmarks_b: int = 0,
landmark_a_sigma: float = 1.5,
landmark_a_scale: float = 1.0,
landmark_b_sigma: float = 2.5,
landmark_b_scale: float = 0.6,
landmark_b_resource_bias: float = 0.7,

New helper methods:

  • _place_random_landmarks(n, available) – random placement from interior cells
  • _place_biased_near_resources(n, bias_prob, radius, available) – biased near resources
  • _compute_landmark_field(positions, sigma, scale) – Gaussian gradient field

reset() additions:

self.landmark_a_positions = self._place_random_landmarks(n_landmarks_a, _interior)
self.landmark_b_positions = self._place_biased_near_resources(
    n_landmarks_b, landmark_b_resource_bias, radius=2, available=_interior)
self._landmark_a_field = self._compute_landmark_field(
    self.landmark_a_positions, landmark_a_sigma, landmark_a_scale)
self._landmark_b_field = self._compute_landmark_field(
    self.landmark_b_positions, landmark_b_sigma, landmark_b_scale)

world_obs_dim property: returns 300 (not 250) when landmarks enabled.

_get_observation_dict(): 5x5 field views extracted and appended to world_parts; obs_dict keys “landmark_a_field_view” [25] and “landmark_b_field_view” [25] added.
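The view extraction and append step might look roughly like the sketch below. The existing hazard/resource extraction code is not shown in this document, so the zero-padding at grid edges and the agent-centred window are assumptions, and the helper names extract_field_view and append_landmark_views are hypothetical.

```python
import numpy as np


def extract_field_view(field, agent_x, agent_y, view=5):
    """Flattened view x view window centred on the agent (zero-padded at edges: assumption)."""
    half = view // 2
    padded = np.pad(field, half, mode="constant", constant_values=0.0)
    # After padding, grid cell (x, y) sits at (x + half, y + half), so the
    # window centred on the agent starts at (agent_x, agent_y) in padded coords.
    window = padded[agent_x:agent_x + view, agent_y:agent_y + view]
    return window.reshape(-1).astype(np.float32)


def append_landmark_views(world_parts, obs_dict, field_a, field_b, agent_x, agent_y):
    """Append both landmark views to world_parts and record them in obs_dict."""
    for key, field in (("landmark_a_field_view", field_a),
                       ("landmark_b_field_view", field_b)):
        view = extract_field_view(field, agent_x, agent_y)
        world_parts.append(view)
        obs_dict[key] = view
    return np.concatenate(world_parts)
```

Starting from 250 existing world_obs values, the two appended 25-value views yield the 300-dimensional vector described above.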

Config

Landmark params are CausalGridWorldV2 constructor params – NOT in REEConfig.from_dims(). Experiments that use landmarks must set world_obs_dim=300 explicitly when constructing REEConfig.
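Since the dimension has to be set by hand, a small helper can keep the environment and the config in agreement. The sketch below only computes the expected value; the helper name is hypothetical, and it assumes both landmark channels are appended whenever either landmark count is nonzero, which this document does not state explicitly.

```python
BASE_WORLD_OBS_DIM = 250  # world_obs size without landmarks (from this document)
VIEW_CELLS = 5 * 5        # one 5x5 field view per channel


def expected_world_obs_dim(n_landmarks_a=0, n_landmarks_b=0):
    """Expected world_obs_dim for the given landmark counts."""
    # Assumption: enabling either landmark type adds both 25-value channels.
    landmarks_enabled = n_landmarks_a > 0 or n_landmarks_b > 0
    return BASE_WORLD_OBS_DIM + (2 * VIEW_CELLS if landmarks_enabled else 0)
```

The result would then be supplied as world_obs_dim when constructing REEConfig; comparing it against the environment's world_obs_dim property at experiment start is a cheap guard against a mismatch.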

No encoder changes

SplitEncoder (z_world pathway) takes world_obs_dim as a parameter. Extending world_obs from 250 to 300 automatically exposes the landmark gradient channels to the z_world encoder. No structural changes to LatentStack, AffectiveHarmEncoder, or E3.

Backward compatibility

n_landmarks_a=0, n_landmarks_b=0 by default. world_obs_dim stays 250. All existing experiments unaffected. Landmarks are gradient-only – no new grid entity type.


Why This Enables MECH-216

Without SD-023:

  • Landmark B does not exist
  • The only world feature correlated with resource proximity is resource proximity itself
  • E1’s schema readout learns a redundant function: reading out “resource_prox is high” is not prediction

With SD-023:

  • Landmark B fields tend to be elevated for ~2-3 cells around resources that have a Landmark B nearby, even outside resource proximity radius
  • E1’s LSTM can learn: “high Landmark B field in recent context predicts upcoming resource contact”
  • schema_salience should rise when Landmark B is nearby, BEFORE resource proximity rises
  • This is genuinely predictive wanting – anticipatory, not reactive

ML/AI Engineering Notes

No ML/AI engineering concerns identified. This is a pure environment extension – no new encoder, no new training target, no new learning component. The z_world encoder (SplitEncoder) automatically receives the new channels via world_obs_dim=300. No phased training required.


Validation

Experiment: EXQ-263b (MECH-216 retest with Landmark B as predictive cue).

Acceptance criteria:

  • LANDMARK_ENABLED shows salience_at_landmark_b > salience in LANDMARK_ABLATED
  • landmark_prox_correlation > resource_prox_correlation in LANDMARK_ENABLED condition
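The two criteria can be expressed as a simple check over per-condition summary metrics. This is a sketch only: the dict layout and the function name are hypothetical, and it assumes the ablated condition reports its comparison salience under the same salience_at_landmark_b key.

```python
def check_exq263b_acceptance(enabled, ablated):
    """Evaluate both acceptance criteria from per-condition metric dicts (hypothetical layout)."""
    # Criterion 1: salience at Landmark B is higher with landmarks enabled than ablated.
    salience_ok = enabled["salience_at_landmark_b"] > ablated["salience_at_landmark_b"]
    # Criterion 2: within LANDMARK_ENABLED, salience tracks landmark proximity
    # more strongly than resource proximity.
    correlation_ok = (enabled["landmark_prox_correlation"]
                      > enabled["resource_prox_correlation"])
    return {
        "salience_ok": salience_ok,
        "correlation_ok": correlation_ok,
        "passed": salience_ok and correlation_ok,
    }
```

Returning the individual flags alongside the overall verdict makes it clear which criterion failed when the experiment does not pass.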

Related claims

  • SD-023: environment.gradient_texture (this SD)
  • MECH-216: e1_predictive_wanting (primary beneficiary)
  • ARC-017: stream tags (untestable without world texture)
  • MECH-096: multimodal exteroceptive fusion (needs distinct world channels)
  • MECH-103: multimodal fusion (landmark channels act as additional modalities)

REE is developed by Daniel Golden (Latent Fields). Apache 2.0.