Waking-Phase Online V_s Invalidation Literature Pull
Created: 2026-04-22
Origin: V3-EXQ-475 result – SD-036 GABA decay + MECH-279 PAG freeze gate are both firing (5-6 freeze releases per seed across eval), but the agent re-commits ~12x per release and stays in freeze for 1000/1000 eval steps. The endogenous driver is the hippocampal proposer pulling trajectories from the original avoid-anchor; SD-036 decay can’t break the cycle without an upstream anchor invalidation signal. MECH-284 (V_s residual schema staleness accumulator) and MECH-269 anchor-reset criteria were registered to fill exactly this gap but are still v3_pending.
Prompt
/lit-pull Online (waking-phase) signals that invalidate, downweight, or replace a schema / cognitive map / hippocampal place-cell anchor when the world departs from what the schema predicts – specifically the question of what tells the brain that an existing anchor is no longer the right anchor. The architectural question is narrow: V_s (regional verisimilitude) currently updates only via offline replay (MECH-285 sleep priority); we need the waking-phase counterpart that lets V_s drop online when the agent is repeatedly surprised by mismatches between the anchored schema’s predictions and observed outcomes. This is distinct from (a) replay content (Pfeiffer & Foster 2013 already established that sequences start at current location and progress to goal – see targeted_review_connectome_mech_269/), (b) episode encoding for later replay, and (c) extinction learning of stimulus-outcome associations. The target is the invalidation signal itself – what computational or neurobiological signal triggers an online schema downweight.
Target claims to inform / extend:
- MECH-284 (V_s residual schema staleness accumulator – runtime reverse readout of V_s; currently registered v3_pending without operational definition of the staleness signal)
- MECH-269 (anchor selection by regional verisimilitude; the architecture has reset criteria sketched in hippocampal_anchor_selection.md, “Anchor reset criteria”, but no biological grounding for the runtime signals)
- MECH-272 (state-gated routing – waking is anchor-dominant; the lit-pull may refine this to “anchor-dominant until invalidation threshold crossed, then probe-dominant for a transient re-anchoring window”)
- Possibly new MECH for the online invalidation signal itself (parallel to MECH-285 for offline replay priority)
Specific neurobiological systems to cover
- Dopamine prediction error as invalidation signal (Schultz lineage)
  - Schultz, Dayan & Montague 1997; Schultz 1998/2016 reviews
  - Specifically: phasic dips on omission as candidate for “anchor invalid here”
  - Recent: dopamine RPE in the dorsomedial striatum vs ventral striatum – model-based vs model-free distinction (Daw & Tobler 2014)
- Lateral habenula / negative reward prediction error
  - Matsumoto & Hikosaka 2007 (LHb encodes negative RPE)
  - Bromberg-Martin & Hikosaka 2011 (LHb in prediction failure / disappointment)
  - Whether LHb signal could function as a broadcast invalidation signal vs being locally restricted to RPE pathways
- OFC as state representation / cognitive map and inferred-state updating
  - Wilson et al 2014 (OFC encodes state in a partially observable task)
  - Stalnaker, Cooch & Schoenbaum 2015 (OFC and Pavlovian outcome inference)
  - Gardner, Schoenbaum & Gershman 2018 (OFC and inferred outcomes; latent state)
  - The candidate: OFC representational change during waking as the substrate for online anchor invalidation, parallel to hippocampal replay during sleep
- Hippocampal pattern separation under high interference
  - Yassa & Stark 2011 (DG/CA3 pattern separation)
  - Reagh & Yassa 2014 (mnemonic similarity / lure discrimination as proxy for schema interference signal)
  - Whether DG remap rate could function as an upstream signal that “the current anchor is being interfered with”
- Latent cause / latent state inference (computational frame)
  - Gershman & Niv 2010, Gershman 2017 (latent state inference; structure learning)
  - Specifically: Gershman, Blei & Niv 2010 (extinction as inference of new latent cause, not erasure of old) – direct architectural model for “invalidate the current schema by inferring a new latent cause”
  - Niv 2019 review on learning task representations
- Extinction and reconsolidation literature, narrowly framed
  - Bouton 2004 (extinction as new learning) – baseline
  - Quirk & Mueller 2008 (vmPFC and extinction); Milad & Quirk 2002 (IL cortex)
  - Reconsolidation literature (Nader et al 2000) only insofar as it speaks to online schema updating under reactivation – not the labile-window mechanics
- Locus coeruleus / norepinephrine as global novelty / surprise signal
  - Aston-Jones & Cohen 2005 adaptive gain theory
  - Sara & Bouret 2012 (LC and orienting/reset response)
  - Whether LC phasic burst on prediction failure could function as an “invalidate current attentional/anchoring frame” broadcast signal – candidate biological substrate for MECH-284 staleness accumulator’s trigger event
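To make the latent-cause framing in the list above concrete for whoever runs the pull, here is a minimal sketch of how “infer a new latent cause” could yield an online invalidation readout. It is a simplified single-observation illustration in the spirit of latent-cause models, not the implementation from any of the cited papers. The function names, the Gaussian likelihood, and every parameter (`alpha`, `obs_sd`, `new_cause_sd`, `new_cause_mean`) are assumptions introduced here. Reading the posterior mass on the new cause as a local invalidation signal is likewise a hypothesis, not an established mapping.

```python
import math


def gaussian_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def latent_cause_posterior(outcome, cause_means, cause_counts,
                           alpha=1.0, obs_sd=0.2,
                           new_cause_mean=0.0, new_cause_sd=1.0):
    """Posterior over existing causes plus one candidate new cause.

    Prior (assumed): Chinese-restaurant-process style weights, i.e. each
    existing cause is weighted by how often it has been used, and a new
    cause is weighted by the concentration parameter alpha.
    Likelihood (assumed): Gaussian around each cause's predicted outcome;
    the new cause gets a broad Gaussian because nothing is known about it.
    The last entry of the returned list is the new-cause probability.
    """
    weights = [n * gaussian_pdf(outcome, m, obs_sd)
               for m, n in zip(cause_means, cause_counts)]
    weights.append(alpha * gaussian_pdf(outcome, new_cause_mean, new_cause_sd))
    total = sum(weights)
    return [w / total for w in weights]


# One well-used cause (the anchored schema) that predicts outcome 1.0.
p_match = latent_cause_posterior(1.0, cause_means=[1.0], cause_counts=[20])
p_omit = latent_cause_posterior(0.0, cause_means=[1.0], cause_counts=[20])
```

Under these assumed settings, an outcome that matches the anchored prediction leaves almost all posterior mass on the old cause, while an omission shifts it to the new cause. Whether the biology computes anything like this posterior is exactly what the pull should test.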
Architectural questions the lit-pull should help answer
- Local vs broadcast. Is the invalidation signal local to the affected schema (e.g. OFC representational change for the specific state) or a broadcast reset signal (e.g. LC norepinephrine burst, LHb negative-RPE)? The architectural reading needs both: a broadcast trigger (MECH-284 event) plus a local accumulator (MECH-284 state). The lit-pull should clarify which biology provides which role.
- Single failure vs accumulation. Does a single prediction failure suffice to invalidate an anchor, or does the biology accumulate failures over a window before triggering reset? Clinical phenomenology (perseveration; the difficulty of changing one’s mind despite repeated counter-evidence) suggests accumulation; but acute orienting responses to single salient violations suggest single-event capability. The architectural reading currently expects accumulation (consistent with MECH-284 as an accumulator) but the lit-pull should test this.
- Proportional vs threshold. Does invalidation produce a graded V_s downweight proportional to the failure magnitude, or does it cross a threshold and trigger discrete re-anchoring? Both have biological precedent; the architectural commitment matters because graded downweight is compatible with continuous mode arbitration while threshold re-anchoring requires a discrete “transition out of anchored state” event (which MECH-272 currently does not specify).
- Coupling to replay priority (MECH-285). When a waking-phase invalidation fires, is the affected schema flagged for that night’s replay priority, or is the runtime invalidation independent of the offline reverse readout? The former predicts a sleep-pressure signal that scales with daytime invalidation load; the latter predicts independence. Clinical observation: high-distress / high-mismatch days produce both intrusive next-day cognition AND elevated REM% (or fragmented sleep), consistent with coupling – but the lit-pull should surface this directly.
- Failure modes. Where in the biology can the invalidation signal fail?
  - LC dysfunction: would predict perseveration / inability to shift away from a stale schema (clinical: ASD restricted-interest patterns? OCD?)
  - LHb dysfunction: would predict failure to register negative outcomes (clinical: depression’s blunted RPE; or alternatively, hyperactive LHb in learned helplessness as over-invalidation of agency-anchors)
  - OFC lesion: would predict inability to update inferred state (Bechara, Damasio gambling task; or anchor-locked behaviour as in catatonia subtype II)
  - Failure in the accumulator integration step: would predict V3-EXQ-475’s phenotype directly – repeated brief releases from freeze (single events register) but no durable invalidation (accumulator never crosses threshold).
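The accumulation, proportional-vs-threshold, and accumulator-failure questions above can be pinned down with a toy model. The sketch below is a hypothetical leaky accumulator, not the MECH-284 definition (which is still v3_pending). The class name, the `leak`, `threshold` and `gain` parameters, and the V_s update rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StalenessAccumulator:
    """Hypothetical per-anchor staleness accumulator (illustrative only)."""
    leak: float = 0.9        # fraction of staleness retained per step (assumed)
    threshold: float = 3.0   # level that triggers discrete re-anchoring (assumed)
    gain: float = 0.1        # graded V_s loss per unit staleness (assumed)
    graded: bool = False     # True: V_s tracks staleness continuously
    staleness: float = 0.0
    v_s: float = 1.0

    def step(self, surprise: float) -> bool:
        """Integrate one surprise sample; return True if re-anchoring fires."""
        self.staleness = self.leak * self.staleness + surprise
        if self.staleness >= self.threshold:
            # Discrete event: mark the old anchor invalid and reset.
            # The caller is assumed to swap in a new anchor afterwards.
            self.v_s = 0.0
            self.staleness = 0.0
            return True
        if self.graded:
            # Proportional reading: V_s falls smoothly with staleness.
            self.v_s = max(0.0, 1.0 - self.gain * self.staleness)
        return False


def run(surprises, **kwargs):
    """Feed a surprise sequence through a fresh accumulator."""
    acc = StalenessAccumulator(**kwargs)
    fired = [acc.step(s) for s in surprises]
    return acc, fired


# Dense mismatches: staleness builds and the threshold is crossed.
dense_acc, dense_fired = run([1.0] * 4)

# Sparse mismatches with a fast leak: each event registers, none accumulates.
sparse = ([1.0] + [0.0] * 9) * 5
sparse_acc, sparse_fired = run(sparse, leak=0.5)
```

With the assumed defaults, four consecutive unit surprises drive staleness through 1.0, 1.9, 2.71 and 3.439, so re-anchoring fires on the fourth step. In the sparse run staleness never rises much above 1.0, so no event fires. That is one way to read the V3-EXQ-475 phenotype: single events register but the accumulator never crosses threshold. Which regime the biology occupies is what the pull should establish.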
Output structure
Standard targeted_review_*/ format. Suggested directory: evidence/literature/targeted_review_waking_v_s_invalidation/
Per-paper records as usual. After the pull, write a short SYNTHESIS.md flagging:
- Which architectural question(s) each paper addresses
- The current best candidate biological substrate for the broadcast invalidation trigger (LC vs LHb vs OFC vs DA dip)
- The current best candidate substrate for the local accumulator (OFC representational change vs hippocampal DG remap vs latent-cause inference in vmPFC)
- Which of the five questions remain underdetermined and need additional pulls
- Whether the evidence supports registering a new MECH for the online invalidation event (separate from MECH-284 the accumulator), and if so, draft proposed claim text
Estimated scope: ~10-15 papers, single session.
Notes for the agent doing the pull
- The user is a consultant psychiatrist; clinical mappings (perseveration, OCD, depression’s blunted RPE, ASD restricted interests, learned helplessness as over-invalidation) are valuable framings.
- Pfeiffer & Foster 2013 and Dragoi & Tonegawa 2011/2013 are already pulled in targeted_review_connectome_mech_269/ – do not re-pull; cite where relevant.
- The exemplar that motivated this pull is V3-EXQ-475 (re-run of EXQ-471 with SD-036 enabled): freeze releases occurred but re-commits dominated, indicating the agent has SD-036 decay and MECH-279 freeze exit but no online anchor invalidation signal driving the hippocampal proposer to remap.
- Connect to existing REE memory entries on regional verisimilitude (MECH-269), V_s bidirectional signal (MECH-283/284/285), and state-gated routing (MECH-272). The hypothesised online invalidation mechanism is a candidate fourth claim in the V_s cluster – a runtime trigger whose accumulator is MECH-284.
- Be alert to evidence that the invalidation signal is not a separable mechanism but is instead implicit in the existing replay machinery (e.g. waking SWR replays serving the same role as sleep SWR replays). If so, MECH-269 anchor-reset becomes a re-statement of MECH-285 in waking-state rather than a new mechanism – this would simplify the architecture.