SSM Graph Traversal Model
1) Problem
Graph traversal in Signet (DP-6) produces candidate paths through the
entity dependency graph. Currently, path ranking uses a static composite
score: strength * confidence * association_weight with post-fusion
dampening (DP-16: gravity, hub, resolution filters). This works for
single-hop and well-connected entities but degrades on multi-hop paths
where the quality of intermediate edges matters more than terminal scores.
The core issue is that static path scoring cannot learn from traversal outcomes. When the system discovers that a particular 3-hop path through entities A -> B -> C consistently produces relevant memories, the static scorer has no mechanism to remember that pattern. Each traversal starts from scratch.
SSM-GRAPH-INTERSECTION research identifies DyGMamba (two-level SSM for continuous-time dynamic graphs) and TF-TKG (temporal filtering for knowledge graph reasoning) as directly applicable architectures. The Engram paper further validates that hash-based O(1) lookup for static patterns combined with learned dynamic scoring outperforms purely learned approaches, suggesting that graph structure (static) and traversal dynamics (learned) should remain separate systems.
2) Goals
- Score traversal paths using an SSM that processes the sequence of edges (hops) as a temporal sequence, learning which edge patterns lead to relevant memory retrieval.
- Incorporate per-edge signals (confidence, co-occurrence, relationship type, age, traversal count) as features for the path SSM.
- Operate in shadow mode with deterministic fallback to static path scoring. No production path ranking changes without passing gates.
- Provide per-path explainability: why a path was promoted or demoted, which edges contributed most to the score.
- Preserve all hard invariants: bounded traversal, constraint surfacing, agent scoping, post-fusion dampening.
3) Proposed capability set
A) Path feature encoding
Each traversal path is a sequence of edges. Encode each edge as a feature vector:
| Feature | Description | Encoding |
|---|---|---|
| source_type | Entity type of source node | One-hot (8 types) |
| target_type | Entity type of target node | One-hot (8 types) |
| relationship_type | Dependency type | Categorical embedding |
| confidence | Edge confidence (0-1) | Direct |
| strength | Edge strength (0-1) | Direct |
| association_weight | NPMI co-occurrence | Direct |
| log_edge_age | Days since edge creation | log1p |
| traversal_count | Times this edge was traversed | log1p |
| hop_position | Position in path (0-indexed) | Normalized by max hops |
| is_community_boundary | Does edge cross Louvain communities? | Binary |
| source_density | Structural density of source entity | log1p |
| target_density | Structural density of target entity | log1p |
12-dim per edge. A 3-hop path produces a sequence of 3 feature vectors.
B) Path SSM architecture
Inspired by DyGMamba’s two-level design:
Level 1: Edge-level SSM
Processes the edge feature sequence along a single path:
Edge sequence: [e1, e2, ..., eH] (H = hop count, max 5)
|
v
Input projection: Linear(12, 32)
|
v
Selective SSM (1 layer, state_dim=16, conv_kernel=3)
|
v
Path embedding: last hidden state (32-dim)
Short sequences (max 5 hops from bounded traversal) make the SSM’s sequential scan cheap. The causal convolution (kernel=3) captures local edge-pair patterns (e.g., “uses -> depends_on” is a common productive pattern).
Level 2: Path comparison
Multiple candidate paths compete for ranking:
Candidate paths: [path1, path2, ..., pathN]
|
v
Path embeddings from Level 1: [p1, p2, ..., pN]
|
v
Query conditioning: dot(query_embedding_down, path_embedding)
|
v
Score head: Linear(32, 1)
|
v
ListNet loss over path rankings
Query conditioning uses the session context embedding (downprojected to 32-dim) to bias path scores toward the current session’s focus.
Parameter budget: ~50K parameters. Intentionally small because paths are short sequences (max 5 edges) and the model’s job is narrow: rank paths, not generate them.
C) Training signal: path feedback
DP-9 (path feedback propagation) provides the training signal:
- After a session, paths that led to memories receiving positive agent feedback get positive labels.
- Paths leading to irrelevant memories get negative labels.
- Paths that were not traversed get zero labels (unknown).
This is a sparse signal. Augment with:
- Heuristic labels from DP-6 benchmark: the 50-question LoCoMo set provides ground truth for which paths lead to correct answers.
- Synthetic path data: generate synthetic entity graphs with known productive paths, following SYNTHETIC-DATA-GENERATION patterns.
D) Shadow mode and fallback
Path SSM runs in shadow mode by default:
- Static scorer produces the production path ranking.
- Path SSM produces a shadow ranking on the same candidate paths.
- Both are logged for offline comparison.
- Dashboard shows agreement rate, per-pattern performance, and which paths the SSM would have promoted/demoted.
Deterministic fallback: if the SSM sidecar is unavailable, times out (>10ms), or returns NaN scores, the static scorer’s ranking is used with no degradation. Same fail-open pattern as the production scorer.
E) Explainability payload
Each scored path includes an explanation:
{
"path": ["entity:signet", "entity:cloudflare", "entity:wrangler"],
"score": 0.82,
"static_score": 0.65,
"explanation": {
"dominant_edge": 1,
"dominant_feature": "association_weight",
"community_crossing_penalty": 0.0,
"hop_count_factor": 0.95
}
}
The dominant_edge and dominant_feature are computed by gradient
attribution: which edge and which feature had the largest gradient
magnitude for the final score.
F) Constraint preservation
Hard invariants from INDEX.md:
- Bounded traversal: the SSM scores paths after traversal, it does not control traversal bounds. Max hops, max candidates, and timeout are unchanged.
- Constraint surfacing: paths containing constraint-bearing entities are never suppressed by SSM scoring. Constraints surface regardless of path score (invariant 5).
- Agent scoping: all path data is agent-scoped. No cross-agent path scoring.
- Post-fusion dampening: DP-16 gravity/hub/resolution filters apply after SSM scoring, same as they apply after static scoring.
4) Non-goals
- No unbounded traversal. Max hops and candidate limits are fixed.
- No removal of post-fusion dampening or constraint overrides.
- No modification of the entity dependency graph structure.
- No real-time graph topology learning (that’s ontology-evolution-core).
- No replacement of the temporal backbone (this spec extends it for graph-structured inputs, not replaces it).
5) Integration contracts
SSM Graph Traversal <-> Desire Paths
- Consumes traversal path output from DP-6 (entity-anchored search).
- Path features include DP-2 edge confidence and DP-5 community IDs.
- DP-9 path feedback is the primary training signal.
- DP-16 post-fusion dampening applies after SSM path scoring.
SSM Graph Traversal <-> Knowledge Architecture
- Entity types from KA taxonomy used for source_type/target_type features.
- Structural density from KA used for source/target density features.
- Entity aspects/attributes accessed for explainability but not as direct SSM inputs (keeps feature vector compact).
SSM Graph Traversal <-> SSM Temporal Backbone
- Path SSM is a separate model from the temporal backbone SSM.
- The temporal backbone handles per-memory relevance over time.
- The path SSM handles per-path quality over graph structure.
- Both run as shadow sidecars with independent fallbacks.
- Future: the two models could share a session context embedding, but they score different things (memories vs paths).
SSM Graph Traversal <-> Predictive Memory Scorer
- DP-10 (path scoring) evolves the scorer feature vector to include per-path signals: best_path_ssm_score, path_hop_count, min_edge_confidence, community_crossings.
- These become additional dimensions in the scorer’s feature payload, not a replacement for existing features.
6) Rollout phases
Phase 1: Shadow deployment
- Implement path feature encoder (12-dim per edge).
- Train path SSM on synthetic + LoCoMo benchmark data.
- Deploy as shadow sidecar alongside static path scorer.
- Log predictions, dashboard comparison panel.
- Validate on multi-hop benchmark slices.
Phase 2: DP-9 feedback integration
- Wire path feedback from real sessions into training pipeline.
- Implement explainability payload with gradient attribution.
- Extend shadow logging with per-path explanation.
- Run TSTR protocol: synthetic-trained vs real-trained vs combined.
Phase 3: Production blending
- Blend path SSM scores with static scores using configurable alpha.
- Start at alpha=0.1, increase based on shadow metrics.
- A/B evaluation: alternating sessions with and without path SSM.
- Minimum 50 sessions per arm before increasing alpha.
- Automatic rollback on any regression in multi-hop benchmark.
7) Validation and tests
- Multi-hop retrieval: >10% MRR improvement on 3+ hop paths compared to static path scoring, on the LoCoMo benchmark subset.
- Constraint preservation: zero constraint suppression incidents across all test sessions (must remain at zero).
- Bounded traversal: path SSM only scores paths within configured max hops (traversal bounds are enforced upstream, not by the SSM).
- Latency: path scoring adds <5ms p95 to traversal pipeline (paths are short, model is small).
- Fallback: deterministic fallback engages correctly when SSM sidecar is killed mid-scoring.
- Agent scoping: paths from agent A never influence scoring for agent B.
- Explainability: gradient attribution correctly identifies dominant edge in >80% of synthetic test cases with known productive edges.
8) Success metrics
- Multi-hop precision: fraction of 3+ hop paths leading to relevant memories increases by >15% compared to static scoring.
- Path diversity: SSM-scored retrieval surfaces memories from >20% more distinct entities than static scoring on the same queries.
- Agent feedback correlation: SSM path scores correlate with downstream agent feedback (Spearman rho > 0.3, p < 0.05).
- Convergence speed: path SSM produces useful rankings within 20 sessions of path feedback data.
9) Open decisions
- Whether the path SSM should share parameters with the temporal backbone SSM or be a completely independent model. Independent is simpler and avoids coupling failure modes. Shared saves compute but creates a single point of failure.
- Whether edge features should include entity embedding fragments (downprojected, like the current scorer’s encoding paths) or stay purely structural. Structural keeps the model small; embeddings add semantic signal but increase parameter count.
- How to handle paths with mixed confidence levels: should a single low-confidence edge in a 4-hop path tank the whole path score, or should the SSM learn its own confidence aggregation? The static scorer uses min-confidence, which is conservative. The SSM could learn something more nuanced.