Sprint Brief: Knowledge Architecture KA-3
Traversal Retrieval Path
What You’re Building
The knowledge graph built in KA-1 (schema) and KA-2 (structural assignment) is populated but passive — nothing reads from it at retrieval time. This sprint wires traversal-first retrieval into the session-start and recall paths so that the graph actively influences which memories get surfaced.
The core idea: when an entity is in scope (matched by project path, query terms, or checkpoint), walk its aspects, constraints, and one-hop dependencies to collect structurally relevant memory IDs. These join the candidate pool alongside the existing effective-score and embedding candidates.
The hard invariant: constraints always surface when their entity is in scope, regardless of score rank.
Required Reading
- docs/specs/INDEX.md: Cross-Cutting Invariants (especially #5: constraints always surface)
- docs/specs/complete/knowledge-architecture-schema.md: section 7 (retrieval contract)
- docs/KNOWLEDGE-ARCHITECTURE.md: conceptual model
- docs/specs/SPRINT-BRIEF-KA1.md: schema and helpers this depends on
- docs/specs/SPRINT-BRIEF-KA2.md: structural assignment (populates the tables this sprint reads from)
Prerequisites
KA-1 and KA-2 must be complete:
- entity_aspects, entity_attributes, entity_dependencies, task_meta tables exist and are indexed
- KA-1 CRUD helpers in knowledge-graph.ts are working
- KA-2 structural assignment pipeline is populating rows
- getConstraintsForEntity() returns correct results
Current Retrieval Architecture
Understanding the existing flow is critical. Here’s how retrieval works today — KA-3 adds a new candidate source without replacing anything.
Session-start (handleSessionStart in hooks.ts:753)
- getAllScoredCandidates(project, limit): queries memories ordered by created_at DESC, scores each with effectiveScore(importance, createdAt, pinned) using 5%/day decay, filters effScore > 0.2
- selectWithBudget(candidates, 2000): picks top candidates within a 2000-char budget
- getPredictedContextMemories(project, 10, 600, excludeIds): extracts recurring terms from recent session summaries, then runs an FTS query for supplementary memories within a 600-char budget
- recordSessionCandidates(sessionKey, candidates, injectedIds): records all candidates, plus which ones were injected, into session_memories
- Recovery checkpoint injection: loads the latest checkpoint digest within a 4h window, reserved separately from the main budget
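To make the 5%/day decay and the 0.2 cutoff concrete, here is a minimal sketch. The brief does not give the exact formula inside effectiveScore, so this version assumes compounding decay (importance * 0.95^ageDays) and assumes pinned memories skip decay entirely. The names effectiveScoreSketch and passesScoreFilter are hypothetical.

```typescript
// Hypothetical sketch: the real effectiveScore() in the daemon may differ.
const DAILY_DECAY = 0.05; // "5%/day" from the brief
const MIN_EFFECTIVE_SCORE = 0.2; // filter threshold from the brief
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function effectiveScoreSketch(
  importance: number,
  createdAt: Date,
  pinned: boolean,
  now: Date = new Date(),
): number {
  // Assumption: pinned memories keep their full importance.
  if (pinned) return importance;
  const ageDays = Math.max(0, (now.getTime() - createdAt.getTime()) / MS_PER_DAY);
  // Assumption: decay compounds daily rather than subtracting linearly.
  return importance * Math.pow(1 - DAILY_DECAY, ageDays);
}

function passesScoreFilter(score: number): boolean {
  return score > MIN_EFFECTIVE_SCORE;
}
```

Under these assumptions, an unpinned memory with importance 0.3 drops below the 0.2 cutoff after roughly eight days, which is the kind of candidate that traversal can bring back into the pool.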
Hybrid recall (hybridRecall in memory-search.ts:114)
- BM25 keyword search (FTS5): normalize scores to [0,1]
- Vector search (sqlite-vec): normalize cosine distances to [0,1]
- Score merge: alpha * vec + (1-alpha) * bm25 blend
- Rehearsal boost: log(access_count + 1) * recencyFactor
- Graph boost: getGraphBoostIds() in graph-search.ts, 1-hop through the relations table via memory_entity_mentions
- Optional reranker: cross-encoder re-rank of top-N
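The score merge and rehearsal boost formulas above can be sketched as follows. This is an illustration only: blendScores and rehearsalBoost are hypothetical names, inputs are assumed to be already normalized to [0,1], and treating an ID that is missing from one search as a 0 for that signal is an assumption. How the rehearsal boost is folded into the final score is not stated in this brief, so the sketch only computes the boost value.

```typescript
type ScoreMap = ReadonlyMap<string, number>;

// alpha * vec + (1 - alpha) * bm25, over the union of IDs from both searches.
function blendScores(vec: ScoreMap, bm25: ScoreMap, alpha: number): Map<string, number> {
  const ids = new Set<string>();
  vec.forEach((_score, id) => ids.add(id));
  bm25.forEach((_score, id) => ids.add(id));

  const merged = new Map<string, number>();
  ids.forEach((id) => {
    // Assumption: an ID missing from one search contributes 0 for that signal.
    const v = vec.get(id) ?? 0;
    const b = bm25.get(id) ?? 0;
    merged.set(id, alpha * v + (1 - alpha) * b);
  });
  return merged;
}

// log(access_count + 1) * recencyFactor, exactly as written in the brief.
function rehearsalBoost(accessCount: number, recencyFactor: number): number {
  return Math.log(accessCount + 1) * recencyFactor;
}
```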
Current graph boost (graph-search.ts)
The existing getGraphBoostIds() does:
- Tokenize query, match entities by canonical_name LIKE %token%
- One-hop expansion through the relations table (both directions)
- Collect memory IDs via memory_entity_mentions
- Apply flat boost weight (default 0.15)
This uses the OLD graph structure (relations + memory_entity_mentions).
KA-3 adds a PARALLEL traversal path through the NEW KA tables
(entity_aspects, entity_attributes, entity_dependencies).
Deliverables
1. Traversal query builder
New file: packages/daemon/src/pipeline/graph-traversal.ts
This is the core of KA-3. A single function that takes focal entity IDs and returns a structurally coherent set of memory IDs plus constraint content.
export interface TraversalResult {
  /** Memory IDs collected from entity_attributes.memory_id */
  readonly memoryIds: Set<string>;
  /** Constraint content that must always be surfaced */
  readonly constraints: ReadonlyArray<{
    readonly entityName: string;
    readonly content: string;
    readonly importance: number;
  }>;
  /** Entities traversed (for telemetry) */
  readonly entityCount: number;
  /** Whether traversal hit the timeout */
  readonly timedOut: boolean;
}

export interface TraversalConfig {
  /** Max aspects per entity, ordered by weight DESC (default 10) */
  readonly maxAspectsPerEntity: number;
  /** Max attributes per aspect (default 20) */
  readonly maxAttributesPerAspect: number;
  /** Max one-hop dependency expansions (default 30) */
  readonly maxDependencyHops: number;
  /** Minimum dependency strength to traverse (default 0.3) */
  readonly minDependencyStrength: number;
  /** Timeout in ms (default 500) */
  readonly timeoutMs: number;
}

export function traverseKnowledgeGraph(
  focalEntityIds: ReadonlyArray<string>,
  db: ReadDb,
  agentId: string,
  config: TraversalConfig,
): TraversalResult;
Traversal algorithm (all synchronous, same pattern as getGraphBoostIds):
1. For each focal entity:
   a. Pull all active constraints (entity_attributes where kind='constraint' and status='active') via JOIN through entity_aspects. These go into the constraints output unconditionally.
   b. Pull top aspects by weight DESC, limited to maxAspectsPerEntity.
   c. For each aspect, pull active attributes limited to maxAttributesPerAspect. Collect memory_id values (skip NULL: those are unclassified stubs from pass 1).
2. One-hop dependency expansion:
   a. Query entity_dependencies for focal entities where strength >= minDependencyStrength, limited to maxDependencyHops.
   b. For each dependency target entity, repeat step 1 (constraints + top aspects + attributes). Do NOT recurse further: one hop only.
3. Deduplicate memory IDs across all collected attributes.
4. Check the deadline at each major step (same Date.now() pattern as graph-search.ts).
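The control flow of the algorithm above can be sketched over in-memory data. This is not the real implementation: the actual function queries SQLite through a ReadDb, while the sketch uses hypothetical stand-in types (AttributeRow, AspectRow, EntityNode) and a hypothetical traverseSketch function so it can run on its own. It also assumes that constraint rows contribute their memory_id like any other active attribute, which the brief does not state.

```typescript
// Hypothetical in-memory stand-ins for the KA tables; the real code queries SQLite.
interface AttributeRow { kind: string; status: string; content: string; importance: number; memoryId: string | null }
interface AspectRow { weight: number; attributes: AttributeRow[] }
interface EntityNode { name: string; aspects: AspectRow[]; dependencies: { targetId: string; strength: number }[] }
interface SketchConfig {
  maxAspectsPerEntity: number;
  maxAttributesPerAspect: number;
  maxDependencyHops: number;
  minDependencyStrength: number;
  timeoutMs: number;
}
interface SketchConstraint { entityName: string; content: string; importance: number }

function traverseSketch(
  focalIds: readonly string[],
  graph: Map<string, EntityNode>,
  cfg: SketchConfig,
  now: () => number = Date.now,
) {
  const deadline = now() + cfg.timeoutMs;
  const memoryIds = new Set<string>(); // a Set deduplicates IDs (step 3)
  const constraints: SketchConstraint[] = [];
  const visited = new Set<string>();

  // Step 1 for a single entity: constraints, then top aspects and their attributes.
  const collect = (entityId: string): void => {
    const entity = graph.get(entityId);
    if (entity === undefined || visited.has(entityId)) return;
    visited.add(entityId);
    for (const aspect of entity.aspects) {
      for (const attr of aspect.attributes) {
        // 1a: active constraints surface regardless of the aspect/attribute limits.
        if (attr.kind === 'constraint' && attr.status === 'active') {
          constraints.push({ entityName: entity.name, content: attr.content, importance: attr.importance });
        }
      }
    }
    // 1b: top aspects by weight DESC.
    const topAspects = [...entity.aspects].sort((a, b) => b.weight - a.weight).slice(0, cfg.maxAspectsPerEntity);
    for (const aspect of topAspects) {
      const active = aspect.attributes.filter((a) => a.status === 'active').slice(0, cfg.maxAttributesPerAspect);
      for (const attr of active) {
        if (attr.memoryId !== null) memoryIds.add(attr.memoryId); // 1c: skip NULL stubs
      }
    }
  };

  // 2a: dependency targets of the focal entities, filtered by strength and capped.
  const targets: string[] = [];
  for (const id of focalIds) {
    for (const dep of graph.get(id)?.dependencies ?? []) {
      if (dep.strength >= cfg.minDependencyStrength && targets.length < cfg.maxDependencyHops) {
        targets.push(dep.targetId);
      }
    }
  }

  // Focal entities first, then one hop of targets; targets' own dependencies are never followed.
  for (const id of [...focalIds, ...targets]) {
    if (now() > deadline) {
      // Step 4: return whatever has been collected so far.
      return { memoryIds, constraints, entityCount: visited.size, timedOut: true };
    }
    collect(id);
  }
  return { memoryIds, constraints, entityCount: visited.size, timedOut: false };
}
```

Passing the clock as a parameter is a choice made for the sketch so the timeout path can be exercised deterministically; the brief only requires the Date.now() deadline pattern.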
Key design decisions:
- Constraints from dependency targets are also collected (if entity X depends on entity Y, Y’s constraints matter for X’s context)
- memory_id = NULL rows are skipped (awaiting KA-2 classification)
- The function is pure: it takes a ReadDb and has no side effects
- Timeout protection at each step; returns partial results on timeout
2. Focal entity resolution
New function in graph-traversal.ts:
export interface FocalEntityResult {
  readonly entityIds: string[];
  readonly source: 'project' | 'checkpoint' | 'query' | 'session_key';
}

export function resolveFocalEntities(
  db: ReadDb,
  agentId: string,
  signals: {
    project?: string;
    sessionKey?: string;
    checkpointEntityIds?: string[];
    queryTokens?: string[];
  },
): FocalEntityResult;
Resolution priority:
1. Checkpoint entity IDs: if the recovery checkpoint includes structural snapshot fields (KA-5 future), use those directly
2. Project path: match the project path against entity names/canonical names where entity_type = 'project'
3. Session key lineage: look up the most recent checkpoint for this session key, extract entity mentions from its digest
4. Query tokens: tokenize and match against entity canonical_name (same as the getGraphBoostIds tokenizer)
For now, focus on project path matching (#2) and query token matching (#4). The checkpoint fields (#1) are KA-5 and session key lineage (#3) is a nice-to-have.
Project path matching:
SELECT id FROM entities
WHERE agent_id = ?
AND entity_type = 'project'
AND (canonical_name LIKE ? OR name LIKE ?)
ORDER BY mentions DESC
LIMIT 5
Normalize the project path: extract the last 1-2 directory segments
as search tokens. /home/nicholai/signet/signetai → search for
%signetai% and %signet%.
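The path normalization described above can be sketched as a small helper that turns a project path into the LIKE patterns bound to the query. The function name projectPathToLikePatterns is hypothetical, and the sketch does not escape % or _ characters that may occur inside a directory name.

```typescript
// Hypothetical helper: last 1-2 directory segments become %segment% patterns,
// most specific (last) segment first.
function projectPathToLikePatterns(projectPath: string, maxSegments = 2): string[] {
  const segments = projectPath.split(/[\\\/]+/).filter((s) => s.length > 0);
  const tail = segments.slice(-maxSegments).reverse();
  // Drop duplicates such as /foo/foo while keeping order.
  const unique = Array.from(new Set(tail));
  return unique.map((segment) => `%${segment}%`);
}
```

For /home/nicholai/signet/signetai this yields %signetai% followed by %signet%, matching the example in the text. Each pattern would then be bound to both LIKE placeholders in the query.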
3. Wire traversal into session-start
Where: packages/daemon/src/hooks.ts, inside handleSessionStart
After getAllScoredCandidates() and before selectWithBudget():
- Resolve focal entities from req.project
- Call traverseKnowledgeGraph() with focal entity IDs
- Merge traversal memory IDs into the candidate pool:
  - For each traversal memory ID not already in allCandidates, fetch the memory row and add it with source 'ka_traversal'
  - Traversal candidates get a synthetic effective score based on the attribute’s importance (not the decay-based score)
- Inject constraint content as a dedicated section in the output, AFTER the “Relevant Memories” section but BEFORE recovery context
Constraint injection format:
## Active Constraints
Constraints for entities in scope. These always apply.
- [EntityName] content of constraint
- [EntityName] another constraint
Budget: Constraints get their own reserved budget (default 1000
chars), carved out of maxInjectChars alongside the recovery context
reservation. Constraints are never truncated by the main budget —
they are appended after budget truncation, same pattern as recovery
context.
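A minimal sketch of rendering the constraint section within its reserved budget follows. The function name formatConstraintSection is hypothetical. The brief does not say what happens when constraints exceed the reserved budget, so the behavior here (highest importance first, skip any line that does not fit) is an assumption and should be checked against the "constraints always surface" invariant before being adopted.

```typescript
interface ConstraintEntry {
  readonly entityName: string;
  readonly content: string;
  readonly importance: number;
}

const CONSTRAINT_HEADER =
  '## Active Constraints\nConstraints for entities in scope. These always apply.\n';

// Hypothetical formatter for the dedicated constraint section.
function formatConstraintSection(
  constraints: readonly ConstraintEntry[],
  budgetChars = 1000, // constraintBudgetChars default from the brief
): string {
  if (constraints.length === 0) return ''; // no-op when nothing is in scope
  // Assumption: higher-importance constraints get budget first.
  const ordered = [...constraints].sort((a, b) => b.importance - a.importance);
  const lines: string[] = [];
  let used = CONSTRAINT_HEADER.length;
  for (const c of ordered) {
    const line = `- [${c.entityName}] ${c.content}\n`;
    // Assumption: a line that would exceed the reserved budget is skipped.
    if (used + line.length > budgetChars) continue;
    lines.push(line);
    used += line.length;
  }
  return lines.length === 0 ? '' : CONSTRAINT_HEADER + lines.join('');
}
```

Because the section is built independently of the main candidate selection, it can be appended after budget truncation in the same way as the recovery context.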
Key constraint: If there are no constraints and no traversal memories, this path should be a no-op with zero overhead beyond the focal entity resolution query.
4. Wire traversal into hybrid recall
Where: packages/daemon/src/memory-search.ts, inside
hybridRecall
After the existing graph boost block (line ~270) and before the reranker:
- Resolve focal entities from query tokens (use the same tokenizer as getGraphBoostIds)
- Call traverseKnowledgeGraph() with focal entity IDs
- For each traversal memory ID:
  - If already in scored, apply a boost (same pattern as graph boost: (1 - tw) * score + tw where tw is configurable, default 0.2)
  - If NOT in scored, add it with a base score derived from attribute importance
- Re-sort after boost application
- Constraints from traversal are NOT injected in recall (recall is a search, not context assembly; constraints only apply at session-start)
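The boost step can be sketched as below. The names applyTraversalBoost and ScoredRow are hypothetical, and the base score for IDs that are not yet in scored (tw * importance) is an assumption: the brief only says it is derived from attribute importance.

```typescript
interface ScoredRow {
  readonly id: string;
  readonly score: number;
}

// Hypothetical sketch of the traversal boost inside hybridRecall.
function applyTraversalBoost(
  scored: readonly ScoredRow[],
  traversalImportance: ReadonlyMap<string, number>, // memory ID -> attribute importance
  tw = 0.2, // boostWeight default from the brief
): ScoredRow[] {
  const seen = new Set<string>();
  const out: ScoredRow[] = scored.map((row) => {
    seen.add(row.id);
    // Already scored: pull the score toward 1 by the traversal weight.
    return traversalImportance.has(row.id)
      ? { id: row.id, score: (1 - tw) * row.score + tw }
      : { id: row.id, score: row.score };
  });
  traversalImportance.forEach((importance, id) => {
    // Assumption: new candidates start at tw * importance.
    if (!seen.has(id)) out.push({ id, score: tw * importance });
  });
  // Re-sort after boost application.
  out.sort((a, b) => b.score - a.score);
  return out;
}
```

With tw = 0.2, a row scored 0.5 that is also reached by traversal moves to 0.6, while an unscored traversal memory with importance 0.5 enters at 0.1 under the stated assumption.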
Guard: Only run if cfg.pipelineV2.graph.enabled and the KA
tables exist. Use a try/catch with graceful fallback (same pattern
as existing graph boost).
5. Update candidate pool fusion
Where: packages/daemon/src/hooks.ts and
packages/daemon/src/session-memories.ts
The KA spec defines the new candidate pool as:
traversal pool ∪ effective top-50 ∪ embedding top-50
Currently it’s just effective top-N. After this deliverable:
- Extend the SessionMemoryCandidate.source type: source: 'effective' | 'fts_only' | 'ka_traversal';
- Record traversal candidates in session_memories with source = 'ka_traversal'
- Cap the merged pool at a configurable limit (default 100) before budget selection
This ensures the predictive scorer (KA-4) can see which candidates came from traversal vs effective score vs FTS.
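One way to picture the fusion is the sketch below. The names fuseCandidatePools and PoolCandidate are hypothetical. Two points are assumptions: when the same memory appears in several pools, the first pool passed in keeps its source label (so passing the effective pool first matches the "not already in allCandidates" rule from deliverable 3), and the cap keeps the highest-scoring candidates. The brief does not specify how the cap chooses what to drop.

```typescript
type CandidateSource = 'effective' | 'fts_only' | 'ka_traversal';

interface PoolCandidate {
  readonly id: string;
  readonly effScore: number;
  readonly source: CandidateSource;
}

// Hypothetical union of candidate pools with a size cap.
function fuseCandidatePools(
  pools: ReadonlyArray<ReadonlyArray<PoolCandidate>>,
  cap = 100, // merged pool limit default from the brief
): PoolCandidate[] {
  const byId = new Map<string, PoolCandidate>();
  for (const pool of pools) {
    for (const candidate of pool) {
      // Assumption: the first pool to contribute an ID keeps its source label.
      if (!byId.has(candidate.id)) byId.set(candidate.id, candidate);
    }
  }
  // Assumption: the cap keeps the highest-scoring candidates.
  return Array.from(byId.values())
    .sort((a, b) => b.effScore - a.effScore)
    .slice(0, cap);
}
```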
6. Traversal configuration
Add to PipelineV2Config (in packages/core/src/types.ts):
readonly traversal?: {
  readonly enabled: boolean;                // default true
  readonly maxAspectsPerEntity: number;     // default 10
  readonly maxAttributesPerAspect: number;  // default 20
  readonly maxDependencyHops: number;       // default 30
  readonly minDependencyStrength: number;   // default 0.3
  readonly timeoutMs: number;               // default 500
  readonly boostWeight: number;             // default 0.2
  readonly constraintBudgetChars: number;   // default 1000
};
Wire defaults in packages/daemon/src/memory-config.ts with YAML
parsing, same pattern as structural config from KA-2.
Guard: traversal only runs when traversal.enabled && graph.enabled.
If KA tables don’t exist yet (migration hasn’t run), traversal
silently returns empty results.
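A sketch of how the defaults and the guard might be resolved is shown below. TraversalSettings, TRAVERSAL_DEFAULTS, resolveTraversalSettings, and shouldRunTraversal are hypothetical names; the real wiring in memory-config.ts follows the KA-2 structural config pattern, which is not reproduced in this brief. The sketch does not validate value types or ranges, which parsed YAML input would need.

```typescript
interface TraversalSettings {
  readonly enabled: boolean;
  readonly maxAspectsPerEntity: number;
  readonly maxAttributesPerAspect: number;
  readonly maxDependencyHops: number;
  readonly minDependencyStrength: number;
  readonly timeoutMs: number;
  readonly boostWeight: number;
  readonly constraintBudgetChars: number;
}

// Default values as listed in the brief.
const TRAVERSAL_DEFAULTS: TraversalSettings = {
  enabled: true,
  maxAspectsPerEntity: 10,
  maxAttributesPerAspect: 20,
  maxDependencyHops: 30,
  minDependencyStrength: 0.3,
  timeoutMs: 500,
  boostWeight: 0.2,
  constraintBudgetChars: 1000,
};

// Fill in any field the user config leaves out.
function resolveTraversalSettings(raw?: Partial<TraversalSettings>): TraversalSettings {
  return { ...TRAVERSAL_DEFAULTS, ...(raw ?? {}) };
}

// Guard from the brief: traversal.enabled && graph.enabled.
function shouldRunTraversal(traversal: TraversalSettings, graphEnabled: boolean): boolean {
  return traversal.enabled && graphEnabled;
}
```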
7. Telemetry
Add traversal metrics to the session-start log entry (already logged at hooks.ts:966):
traversalEntities: number; // focal entities resolved
traversalMemories: number; // unique memory IDs from traversal
traversalConstraints: number; // constraints surfaced
traversalTimedOut: boolean; // whether traversal hit timeout
Also add to the /api/pipeline/status endpoint so the dashboard
can show traversal health.
Key Files
- packages/daemon/src/pipeline/graph-traversal.ts: new, core traversal logic
- packages/daemon/src/pipeline/graph-search.ts: existing graph boost (reference, not modified)
- packages/daemon/src/hooks.ts: wire traversal into session-start
- packages/daemon/src/memory-search.ts: wire traversal into recall
- packages/daemon/src/session-memories.ts: extend source type
- packages/daemon/src/knowledge-graph.ts: KA-1 helpers (read, not modified)
- packages/core/src/types.ts: traversal config types
- packages/daemon/src/memory-config.ts: traversal config defaults
What NOT to Build (KA-4+)
- Predictor structural features (KA-4)
- Checkpoint structural snapshots (KA-5)
- Dashboard visualization of graph traversal (KA-5)
- Multi-hop traversal beyond one-hop dependencies (future)
- API endpoints for browsing aspects/attributes (future)
- Automatic task execution from task_meta (out of scope)
Verification
- bun run build: no type errors
- bun test: existing tests pass
- bun run typecheck: clean
- With graph populated (KA-2 has run on some memories):
- Session-start with a known project path resolves focal entities
- Traversal collects memory IDs from entity_attributes
- Constraints appear in the “Active Constraints” section of inject
- Save a constraint fact (e.g., “never push directly to main for signetai”) — verify it appears in session-start inject when project path matches
- Save multiple facts about a project entity — verify traversal pulls them into session-start candidates
- Verify constraint budget is reserved separately (constraints survive main budget truncation)
- Verify traversal is a no-op when no KA data exists (empty tables)
- Verify traversal respects traversal.enabled = false
- Verify traversal timeout works (doesn’t block session-start)
- Verify recall graph boost includes traversal candidates
- Verify session_memories records traversal candidates with source = 'ka_traversal'
- Check telemetry: traversal metrics appear in session-start log