Engram-Informed Predictor Track
Spec metadata:
- ID:
engram-informed-predictor-track - Status:
planning - Hard depends on:
predictive-memory-scorer - Registry:
docs/specs/INDEX.md
1) Problem
Signet already carries several Engram-like ideas across the scorer and SSM planning docs, but they are spread across multiple specs and not tracked as one execution lane. We need one planning contract that translates Engram patterns into concrete scorer experiments, selection criteria, and handoff into the SSM shadow track.
Without this lane, Engram-inspired work risks becoming piecemeal:
- hash path tweaks land without comparable benchmark conditions
- gating and convolution experiments drift from production constraints
- SSM planning references Engram patterns without a locked translation contract
2) Goals
- Establish a single execution lane for Engram-inspired scorer changes.
- Run reproducible ablations on the current cross-attention scorer before changing SSM shadow routing.
- Define a compatibility contract between selected ablation outcomes and
ssm-temporal-backbone. - Preserve current safety properties: fail-open behavior, deterministic fallback, bounded latency, and agent scoping.
- Keep retrieval substrate boundaries explicit: SQLite/FTS/vector/graph remain the source of recall truth.
3) Proposed capability set
A) Baseline locking and evaluation parity
Lock a reproducible baseline for current scorer behavior and evaluate all variants on identical data slices:
- synthetic canary suite from
packages/predictor/bench/ - real-session exports from predictor training data paths
- identical metric set: HR@K, MRR@K, DCG@K, latency p50/p95/p99
B) Hash-path translation from Engram
Apply and measure hash-path changes inspired by Engram:
- tokenizer normalization (NFKC + lowercase) before hashing
- prime bucket configuration to reduce systematic collisions
- optional multi-head hash embeddings for text-only candidate encoding
- collision-rate diagnostics alongside retrieval quality metrics
C) Gate-path translation from Engram
Test scorer variants that separate similarity and gating signals:
- explicit Engram-style alpha gate path
- separate content/value path
- optional depthwise causal Conv1d post-gating (kernel=4, SiLU)
- strict measurement of added latency and stability
D) Parameter allocation experiments
Test whether current scorer capacity is over-allocated to hash table memory by running budget reallocation sweeps (for example bucket count vs internal/value dims) while keeping inference constraints intact.
E) Handoff contract into SSM track
Codify which Engram-inspired deltas are accepted by ssm-temporal-backbone:
- which input encodings remain canonical
- which gating/conv patterns carry forward to SSM architecture tests
- which ideas are explicitly rejected (with reason) to avoid repeated loops
4) Non-goals
- No insertion of Engram modules into the underlying LLM backbone.
- No replacement of hybrid retrieval substrate.
- No schema-breaking changes to predictor comparison or training tables.
- No production cutover to SSM from this spec alone.
5) Integration contracts
Engram Track <-> Predictive Memory Scorer
- Keeps current sidecar RPC contract intact.
- Candidate feature vector shape remains backward compatible unless explicitly versioned.
- Any scorer variant must preserve fail-open behavior when sidecar is missing.
Engram Track <-> SSM Foundation Evaluation
- Shares benchmark harnesses and reporting format.
- Engram-inspired ablations become first-class rows in the foundation matrix.
- Foundation decision reports must cite this spec for translation rationale.
Engram Track <-> SSM Temporal Backbone
- Temporal shadow deployment consumes selected outputs from this track.
- No SSM routing default changes until Engram track recommendations are recorded and accepted.
- Deterministic fallback and latency budgets remain unchanged.
Engram Track <-> Desire Paths
- Path-scoring and traversal invariants stay authoritative.
- Constraint surfacing cannot be suppressed by any Engram-inspired scorer path.
6) Rollout phases
Phase 1: Baseline and instrumentation
- Freeze baseline scorer config and test slices.
- Add collision and latency diagnostics for hash/gate variants.
- Produce a reproducible baseline report.
Phase 2: Engram-inspired scorer ablations
- Run hash-path and gate-path experiments.
- Run parameter allocation sweeps under fixed latency budgets.
- Publish ablation matrix with reproducible configs.
Phase 3: SSM handoff contract
- Select accepted deltas.
- Update SSM planning contracts with accepted/rejected findings.
- Document follow-on implementation slices for shadow mode.
7) Validation and tests
- Deterministic hashing tests for normalization and bucket variants.
- Feature-dimension and RPC compatibility tests remain green.
- Latency guard tests verify no regression past agreed thresholds.
- Session-end comparison and drift logic still function with variants.
8) Success metrics
- One or more Engram-inspired variants improve quality slices on both synthetic and real-session evaluation sets.
- p95 scoring latency remains within configured budget envelopes.
- A signed decision report maps selected deltas into SSM temporal planning with explicit acceptance and rejection notes.
9) Open decisions
- Should multi-head hashing be retained as a permanent text-path default or used only in SSM experiments?
- Should Engram-style convolution live in the cross-attention scorer or remain SSM-only after translation?
- What is the minimum quality delta required to justify added implementation complexity in production scorer code?