SD-020: z_harm_a Encodes Affective Surprise (Precision-Weighted PE)

Claim ID: SD-020
Subject: harm_stream.affective_surprise_pe
Status: IMPLEMENTED 2026-04-10
Registered: 2026-04-08
Depends on: SD-011, SD-019, ARC-016
Blocks: SD-021 (descending modulation requires PE-based z_harm_a)

Problem

Current compute_harm_accum_loss() trains z_harm_a on a raw EMA-accumulated harm scalar. This makes z_harm_a an integrator of absolute harm state, not a surprise detector.

Chen (2023, Front Neural Circuits) establishes that the anterior insula cortex (AIC) — the biological analog of z_harm_a — responds to “unsigned intensity prediction errors as a modality-unspecific aversive surprise signal,” not to raw stimulus magnitude.

If z_harm_a encodes raw accumulated harm, E3 urgency responds to sustained-but-expected threat (which the agent may already be handling) rather than unexpected escalation (which genuinely requires immediate behavioural change).

Solution

Replace the compute_harm_accum_loss() training target from raw accumulated_harm to a precision-weighted prediction error signal:

harm_PE = |accumulated_harm - _harm_obs_ema|  # unsigned PE
precision_norm = min(e3.current_precision / 500.0, 3.0)  # ARC-016 scaling
surprise_target = harm_PE * precision_norm    # precision-weighted surprise

The running EMA _harm_obs_ema tracks the agent’s expected harm. harm_PE is the unsigned deviation — how surprising the current threat level is. The precision weight (ARC-016 coupling) scales the surprise signal: when the agent is confident and precise, unexpected harm is more surprising than in a volatile state.

Config params (REEConfig)

Param	Type	Default	Purpose
`harm_surprise_pe_enabled`	bool	`False`	switch (False = legacy EMA target)
`harm_obs_ema_alpha`	float	`0.1`	EMA smoothing for expected-harm tracker

Agent attribute: `_harm_obs_ema`

Initialised to 0.0. Updated inside compute_harm_accum_loss() on each call. Tracks running expected harm magnitude.

Architecture Context

This is a loss function modification — no new modules, no new latent fields. The AffectiveHarmEncoder and z_harm_a are unchanged; only what they are trained to predict changes. The modification is gated by harm_surprise_pe_enabled=False (default), preserving backward compatibility.

harm_history_len > 0 (SD-011 second source) must be enabled for this method to be non-trivial (the aux head must exist). With harm_history_len=0, both legacy and PE-mode return zero (no aux head exists).

What This SD Enables

z_harm_a encodes affective surprise rather than absolute state
E3 urgency responds to unexpected threat escalation (correct biological profile)
SD-021 (descending modulation) can meaningfully attenuate the PE signal during commitment

Biological Grounding

Chen 2023 (Front Neural Circuits): AIC encodes unsigned intensity PEs as modality-nonspecific aversive surprise. Not raw magnitude. Seymour (2019, Neuron): pain as RL signal is precision-weighted. Active inference (Friston 2010): high-level affective representations encode precision-weighted PEs at the level of homeostatic set-points.

ARC-016 (dynamic precision) provides the biological grounding for the precision weighting: the precision of the harm-PE signal is E3’s confidence about the current state — in uncertain environments, the same harm magnitude is less surprising because the variance is already high.

SD-020 (harm_stream.affective_surprise_pe)
SD-011 (harm_stream.dual_nociceptive_streams)
SD-019 (harm_stream.affective_nonredundancy — prerequisite)
ARC-016 (cognitive_modes.control_plane_regimes — precision coupling)
MECH-220 (cingulate-insula hub coordination)
Q-036 (question: PE vs magnitude encoding in z_harm_a)