Signet Memory Pipeline v2: Implementation Specification

Status: Draft for implementation

Audience: Core + Daemon maintainers

Scope note: This document is a delivery spec. It intentionally defines behavior, contracts, and validation criteria. It does not include implementation code.

1) Purpose

Signet needs a memory pipeline that is not only searchable, but selective, self-correcting, auditable, and safe under concurrency.

This spec hardens the prior plan into an implementation-ready design with:

explicit data contracts and lifecycle states
durable async processing
lock-safe transaction boundaries
provider and privacy controls
graph-aware retrieval
measurable quality, latency, reliability, and cost targets

2) Product Objectives

Primary goals

Increase recall relevance and consistency over current append-only memory behavior.
Prevent duplicate and contradictory memory growth.
Keep user data safe under provider outages and concurrent writes.
Preserve reversibility through complete memory history and recovery.
Keep local-first defaults, with optional remote provider support.
Enable bounded autonomous maintenance so agents can diagnose and repair common failure states without human intervention.

Non-goals (for this release)

Full multi-hop graph reasoning beyond one-hop expansion.
Autonomous memory pruning by LLM without recoverable audit trail.
External graph databases (Neo4j, etc.).
Mandatory online provider dependency.

3) Success Criteria (Release Gates)

The release is complete only if all of the following are true:

No data loss when extraction/LLM fails (raw memory persists).
Pinned memories cannot be deleted by model decisions.
Duplicate insert race for identical content is blocked at DB level.
/remember remains responsive under provider outage via async fallback and retry queue.
Search quality improves against baseline on offline eval set.
History endpoint shows complete ADD/UPDATE/DELETE lineage.
Soft-deleted memories are recoverable during retention window.
Agents can explicitly modify a memory by ID with audit-safe semantics.
Agents can explicitly forget by ID or query without hard delete.
Autonomous maintenance loops resolve common degradation states within SLO and always leave an auditable trail.

4) Current Constraints and Assumptions

Memory pipeline is migrating from Python subprocesses into daemon TypeScript.
Existing schemas in the wild are mixed (legacy daemon schema, unified core schema, and migration edge cases).
Daemon currently opens multiple SQLite connections per request path; this is acceptable short term, but write contention increases under LLM-stage latency.
sqlite-vec remains the vector backend for this release.
Default model path is local Ollama unless user explicitly chooses remote provider.

4.1 Mem0 comparison findings (implemented behavior)

Review source:

references/mem0/mem0/memory/main.py
references/mem0/mem0-ts/src/oss/src/memory/index.ts
references/mem0/server/main.py

Observed parity requirements:

Mem0 supports both inferred and explicit mutation paths:
- inferred: ADD/UPDATE/DELETE/NONE in add pipeline
- explicit: update(id), delete(id), delete_all(filters), history(id)
Mem0 treats update/delete as first-class operations with history entries for each mutation.
Mem0 exposes API surface for modify/forget (PUT /memories/{id} and DELETE /memories/{id}), not only infer-on-add behavior.

Signet implication: inferred mutation is not enough; agent-directed modify/forget must be a formal API + policy surface.

4.2 OpenClaw integration baseline and gap

Current Signet state in-repo:

@signet/connector-openclaw is install-time wiring that patches OpenClaw config and installs command-hook files under ~/.agents/hooks/agent-memory/.
Installed hook handler supports command actions /remember, /recall, and /context via daemon hook endpoints.
@signetai/adapter-openclaw exists as a runtime plugin entry point and currently exposes lifecycle calls for session start/pre-compaction/compaction complete plus manual remember/recall helpers.
Daemon hook surface already includes richer lifecycle endpoints, including user-prompt-submit and session-end.

Gap vs desired state:

Runtime plugin path is not yet the single canonical path for OpenClaw memory operations.
Legacy command-hook path can overlap with plugin behavior and must be guarded against duplicate recall/capture.
Explicit modify/forget tools are not yet available in the runtime OpenClaw integration path.

5) High-Level Architecture

5.1 Pipeline modes

remember supports three execution modes:

sync: do extraction and update-decision inline within latency budget.
async: persist raw memory immediately and enqueue processing job.
auto (default): attempt sync until timeout budget, then degrade to async without failing write.
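
A non-normative sketch of the auto mode described above, assuming a hypothetical PipelineHooks interface and a caller-supplied budget. It only illustrates the "persist first, race the budget, degrade to the queue" shape and is not the daemon's actual code.

```typescript
// Illustrative only: PipelineHooks, rememberAuto, and delay are hypothetical names.
interface PipelineHooks {
  persistRaw(content: string): string;            // returns the memory id
  processInline(memoryId: string): Promise<void>; // extraction + decision
  enqueue(memoryId: string): string;              // returns the job id
}

interface RememberResult {
  memoryId: string;
  status: "processed" | "queued";
  jobId?: string;
}

function delay(ms: number): { promise: Promise<"timeout">; cancel: () => void } {
  let handle: ReturnType<typeof setTimeout>;
  const promise = new Promise<"timeout">((resolve) => {
    handle = setTimeout(() => resolve("timeout"), ms);
  });
  return { promise, cancel: () => clearTimeout(handle) };
}

async function rememberAuto(
  content: string,
  hooks: PipelineHooks,
  budgetMs: number,
): Promise<RememberResult> {
  // The raw envelope is persisted before any provider call, so later failures cannot lose it.
  const memoryId = hooks.persistRaw(content);
  const budget = delay(budgetMs);
  try {
    const outcome = await Promise.race([
      hooks.processInline(memoryId).then(() => "done" as const),
      budget.promise,
    ]);
    if (outcome === "done") return { memoryId, status: "processed" };
  } catch {
    // Inline processing failed: fall through to the queue instead of failing the write.
  } finally {
    budget.cancel();
  }
  // Timed out or failed. A real implementation would also cancel or fence the
  // inline attempt so it cannot race with the queued job.
  return { memoryId, status: "queued", jobId: hooks.enqueue(memoryId) };
}
```

In this sketch the sync and async modes would correspond to an unbounded wait on processInline and an immediate enqueue, respectively.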

5.2 Lifecycle stages

Ingest and normalize input.
Idempotency and exact dedup check.
Persist raw memory envelope.
Run extraction (facts + entities).
For each fact: retrieve candidates, decide ADD/UPDATE/DELETE/NONE, apply decision.
Persist embeddings and graph links.
Persist history and metrics.

5.3 Key principle

No long-running external call is allowed inside a write-locked transaction.

6) Data Model Specification

6.1 `memories` table

Required fields:

identity: id
content: content, normalized_content, content_hash (SHA-256)
classification: type, category
source: source_type, source_id, who
control: pinned, manual_override, is_deleted, deleted_at
quality/process: confidence, update_count, extraction_status
retrieval: importance, access_count, last_accessed
model provenance: embedding_model, extraction_model
audit stamps: created_at, updated_at, updated_by, version

Required indexes:

unique partial index on content_hash where content_hash is not null
index on is_deleted, pinned, type, created_at
index on source_type, source_id
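
As a non-normative illustration of the content_hash field and its unique partial index, the sketch below hashes a normalized form of the content with SHA-256. The normalization rules (NFKC, trim, whitespace collapse, lowercase), the choice to hash the normalized text rather than the raw text, and the index name are all assumptions; the spec fixes only the hash algorithm and the partial uniqueness.

```typescript
import { createHash } from "node:crypto";

// Assumed normalization; the spec does not define the exact rules.
function normalizeContent(content: string): string {
  return content.normalize("NFKC").trim().replace(/\s+/g, " ").toLowerCase();
}

// SHA-256 hex digest of the normalized content (assumption: the hash covers normalized text).
function contentHash(content: string): string {
  return createHash("sha256").update(normalizeContent(content), "utf8").digest("hex");
}

// Hypothetical index name. The WHERE clause lets legacy rows without a hash
// coexist until the background backfill fills them in.
const CONTENT_HASH_INDEX_SQL = `
CREATE UNIQUE INDEX IF NOT EXISTS idx_memories_content_hash
  ON memories (content_hash)
  WHERE content_hash IS NOT NULL;
`;
```

With an index of this shape, two concurrent inserts of identical content collapse at the database: one insert succeeds and the other fails the uniqueness constraint.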

6.2 `memory_history` table

Purpose: immutable audit trail for all semantic decisions.

Fields:

identity: id, memory_id
event: event in {ADD, UPDATE, DELETE, RECOVER, MERGE, NONE}
payload: old_content, new_content, old_metadata, new_metadata
actor: actor_type, actor_id
model provenance: provider, model, prompt_version
reasoning metadata: decision_confidence, decision_reason
traceability: request_id, session_id, created_at

6.3 `memory_jobs` table (durable queue)

Purpose: retries, crash-safe processing, and backpressure control.

Fields:

identity: id, memory_id
type: job_type in {extract, decision, graph, reembed}
payload: payload_json
state: status in {pending, leased, retry_scheduled, done, dead}
retries: attempt_count, max_attempts, next_attempt_at
lease: lease_owner, lease_expires_at
diagnostics: last_error, last_error_code
timestamps: created_at, updated_at
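
The retry and lease fields above imply a small state machine. The following is a non-normative sketch of the failure transition, assuming exponential backoff with a 1 s base and a 5 minute cap (neither value comes from this spec) and hypothetical camel-cased field names.

```typescript
type JobStatus = "pending" | "leased" | "retry_scheduled" | "done" | "dead";

interface JobRetryState {
  status: JobStatus;
  attemptCount: number;
  maxAttempts: number;
  nextAttemptAt: number | null; // epoch milliseconds
  lastError: string | null;
}

// Assumed policy: exponential backoff, doubling per attempt, capped.
function backoffMs(attempt: number, baseMs: number = 1000, capMs: number = 300000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt - 1));
}

// Transition applied when a leased job fails.
function onJobFailure(job: JobRetryState, error: string, nowMs: number): JobRetryState {
  const attemptCount = job.attemptCount + 1;
  if (attemptCount >= job.maxAttempts) {
    // Retries exhausted: dead-letter the job but keep the diagnostics.
    return { ...job, status: "dead", attemptCount, nextAttemptAt: null, lastError: error };
  }
  return {
    ...job,
    status: "retry_scheduled",
    attemptCount,
    nextAttemptAt: nowMs + backoffMs(attemptCount),
    lastError: error,
  };
}

// A lease whose expiry has passed may be reclaimed by another worker.
function isLeaseExpired(leaseExpiresAt: number | null, nowMs: number): boolean {
  return leaseExpiresAt !== null && leaseExpiresAt <= nowMs;
}
```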

6.4 Graph tables

entities:

id, name, canonical_name, entity_type, mentions, embedding, created_at, updated_at

relations:

id, source_entity_id, target_entity_id, relationship, mentions, confidence, created_at, updated_at

memory_entity_mentions (required link table):

id, memory_id, entity_id, mention_text, confidence, created_at

Without memory_entity_mentions, graph-augmented retrieval cannot map entity expansion back to memory rows reliably.

7) Migration and Rollback Strategy

7.1 Pre-migration safety

Create timestamped SQLite backup.
Verify backup integrity.
Run schema detection and emit migration plan report.

7.2 Migration behavior

Use additive migrations first (new columns/tables/indexes).
Backfill derived fields in batched jobs (content_hash, normalized_content, embedding_model, etc.).
Never block startup on full backfill; process in background queue.

7.3 Rollback

Rollback mechanism is DB restore from pre-migration backup.
Keep migration execution idempotent so re-run is safe.
Store migration audit record with start/end time and outcome.

8) Concurrency and Transaction Model

8.1 Transaction boundaries

Use short transactions only:

Tx A: ingest write (insert raw memory envelope + queue job)
Tx B: apply one semantic decision atomically
Tx C: finalize metadata/access/history updates

No LLM or embedding call may execute while holding write lock.

8.2 Race prevention

DB-level uniqueness on content_hash for exact duplicate collapse.
Compare-and-set update using memory version to prevent stale write.
Re-check candidate state immediately before applying UPDATE/DELETE.
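
A non-normative sketch of the compare-and-set rule, using an in-memory Map as a stand-in for the memories table. In SQL the same guard would be an UPDATE whose WHERE clause matches both id and the expected version; all names here are hypothetical.

```typescript
interface VersionedMemory {
  id: string;
  content: string;
  version: number;
  isDeleted: boolean;
}

type CasResult =
  | { ok: true; version: number }
  | { ok: false; reason: "not_found" | "deleted" | "version_conflict" };

// Stand-in for:
//   UPDATE memories SET content = ?, version = version + 1
//   WHERE id = ? AND version = ? AND is_deleted = 0
function compareAndSetContent(
  store: Map<string, VersionedMemory>,
  id: string,
  expectedVersion: number,
  newContent: string,
): CasResult {
  const row = store.get(id);
  if (row === undefined) return { ok: false, reason: "not_found" };
  // Re-check state immediately before applying the mutation.
  if (row.isDeleted) return { ok: false, reason: "deleted" };
  if (row.version !== expectedVersion) return { ok: false, reason: "version_conflict" };
  const next: VersionedMemory = { ...row, content: newContent, version: row.version + 1 };
  store.set(id, next);
  return { ok: true, version: next.version };
}
```

Two writers that both read version 1 cannot both win: the first bumps the row to version 2, and the second then fails with version_conflict and must re-read before retrying.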

8.3 Connection policy

Single shared DB accessor in daemon process.
Read-only handles for search paths where possible.
Standard pragmas on write connections:
- WAL mode
- busy timeout
- synchronous NORMAL
- memory temp store
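
For reference, the pragmas listed above map onto the following SQLite statements. This is a non-normative sketch: the busy_timeout value is an assumption (the spec names the pragma but not a value), and exec is a hypothetical callback around whichever SQLite driver the daemon uses.

```typescript
const WRITE_CONNECTION_PRAGMAS: readonly string[] = [
  "PRAGMA journal_mode = WAL;",
  "PRAGMA busy_timeout = 5000;", // assumed value, in milliseconds
  "PRAGMA synchronous = NORMAL;",
  "PRAGMA temp_store = MEMORY;",
];

// Applies each pragma through a caller-supplied executor.
function applyWritePragmas(exec: (sql: string) => void): void {
  for (const pragma of WRITE_CONNECTION_PRAGMAS) {
    exec(pragma);
  }
}
```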

9) Provider Abstraction and Policy

9.1 Supported providers

Local Ollama (default)
API providers (OpenAI/Anthropic)
Harness passthrough (Claude Code/OpenCode/OpenClaw)

9.2 Capability contract

Each provider must declare:

extraction support
decision support
optional reranking support
timeout and token constraints
structured output reliability level

9.3 Fallback order

Preferred provider
Secondary provider (optional)
Raw-save + async retry queue

If all provider calls fail, memory remains stored with extraction_status=unprocessed and is retried.
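
A non-normative sketch of the fallback order, assuming a hypothetical ExtractionProvider interface. The "processed" status value is an assumption; the spec names only extraction_status=unprocessed.

```typescript
interface ExtractionProvider {
  name: string;
  extract(text: string): Promise<string[]>; // resolves to extracted facts
}

type ExtractionOutcome =
  | { extractionStatus: "processed"; provider: string; facts: string[] }
  | { extractionStatus: "unprocessed"; errors: string[] };

// Tries providers in the configured order: preferred first, then the optional secondary.
async function extractWithFallback(
  providers: ExtractionProvider[],
  text: string,
): Promise<ExtractionOutcome> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      const facts = await provider.extract(text);
      return { extractionStatus: "processed", provider: provider.name, facts };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(provider.name + ": " + message);
    }
  }
  // Every provider failed: the caller keeps the raw memory and enqueues a retry job.
  return { extractionStatus: "unprocessed", errors };
}
```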

10) Privacy and Security Requirements

10.1 Default posture

Local-first processing is default.
Remote providers are opt-in.

10.2 Data handling controls

Before remote inference:

redact obvious secret patterns (tokens, API keys, private keys)
redact configured sensitive terms
preserve reversible placeholder map locally for audit
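
The redaction step can be sketched as below. This is a non-normative example: the secret patterns, the placeholder format, and the function names are assumptions for illustration, and a real deployment would need a broader, tested pattern set.

```typescript
interface RedactionResult {
  redacted: string;
  placeholders: Map<string, string>; // placeholder -> original; stays local
}

// Assumed example patterns: PEM private keys, "sk-" style API keys, bearer tokens.
const SECRET_PATTERNS: RegExp[] = [
  /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g,
  /\bsk-[A-Za-z0-9_-]{16,}/g,
  /\bBearer\s+[A-Za-z0-9._~+\/-]{16,}/g,
];

function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function redactForRemote(text: string, sensitiveTerms: string[] = []): RedactionResult {
  const placeholders = new Map<string, string>();
  const termPatterns = sensitiveTerms
    .filter((term) => term.length > 0)
    .map((term) => new RegExp(escapeRegExp(term), "gi"));
  let redacted = text;
  let counter = 0;
  for (const pattern of SECRET_PATTERNS.concat(termPatterns)) {
    redacted = redacted.replace(pattern, (match) => {
      counter += 1;
      const placeholder = "[REDACTED_" + counter + "]";
      placeholders.set(placeholder, match);
      return placeholder;
    });
  }
  return { redacted, placeholders };
}

// Local-only reversal for audit; applied in reverse insertion order so that
// a later replacement inside an earlier placeholder is undone first.
function restoreRedacted(redacted: string, placeholders: Map<string, string>): string {
  let out = redacted;
  for (const [placeholder, original] of Array.from(placeholders).reverse()) {
    out = out.split(placeholder).join(original);
  }
  return out;
}
```

Only the redacted string would be sent to a remote provider; the placeholder map never leaves the local machine.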

10.3 Governance controls

provider allowlist in config
explicit local_only=true enforcement mode
outbound inference logs must avoid storing raw redacted content

10.4 Safety invariants

pinned=1 cannot be deleted by model output.
soft-delete only; hard purge only via retention worker.
all mutating decisions require history event insert in same commit.

11) Extraction and Decision Contracts

11.1 Extraction output contract

Required structure:

facts[]: content + memory type + confidence
entities[]: source, relationship, target, confidence
warnings[]: optional parser/quality issues

Validation rules:

reject empty or trivial facts
enforce max fact length
enforce max output count per request
schema validation failure triggers raw-save fallback
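
A non-normative sketch of the validation rules for facts[] (entities[] are omitted for brevity). The numeric limits are assumptions, as is the choice to drop overlong facts rather than truncate them; the spec requires limits but does not fix either detail.

```typescript
interface ExtractedFact {
  content: string;
  type: string;
  confidence: number;
}

interface ExtractionValidation {
  facts: ExtractedFact[];
  warnings: string[];
}

// Assumed limits for illustration.
const MIN_FACT_LENGTH = 8;
const MAX_FACT_LENGTH = 500;
const MAX_FACTS_PER_REQUEST = 20;

// Returns null on schema failure, which the caller treats as raw-save fallback.
function validateFacts(raw: unknown): ExtractionValidation | null {
  if (typeof raw !== "object" || raw === null) return null;
  const candidate = (raw as { facts?: unknown }).facts;
  if (!Array.isArray(candidate)) return null;

  const facts: ExtractedFact[] = [];
  const warnings: string[] = [];
  for (const item of candidate) {
    if (facts.length >= MAX_FACTS_PER_REQUEST) {
      warnings.push("fact count limit reached; remaining facts dropped");
      break;
    }
    if (typeof item !== "object" || item === null) {
      warnings.push("non-object fact dropped");
      continue;
    }
    const { content, type, confidence } = item as Record<string, unknown>;
    if (typeof content !== "string" || typeof type !== "string" || typeof confidence !== "number") {
      warnings.push("fact with missing or mistyped fields dropped");
      continue;
    }
    const trimmed = content.trim();
    if (trimmed.length < MIN_FACT_LENGTH) {
      warnings.push("empty or trivial fact dropped");
      continue;
    }
    if (trimmed.length > MAX_FACT_LENGTH) {
      warnings.push("overlong fact dropped");
      continue;
    }
    facts.push({ content: trimmed, type, confidence });
  }
  return { facts, warnings };
}
```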

11.2 Decision output contract

For each candidate memory, model returns one of:

ADD
UPDATE
DELETE
NONE

Additional fields required:

target temp id (for UPDATE/DELETE/NONE)
confidence
short reason

Invalid target ids or malformed decisions are discarded and recorded as pipeline warnings.
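
A non-normative sketch of decision validation, assuming decisions arrive as parsed JSON objects with hypothetical camel-cased field names. Invalid items are dropped and reported as warnings instead of failing the whole batch.

```typescript
type DecisionAction = "ADD" | "UPDATE" | "DELETE" | "NONE";

interface Decision {
  action: DecisionAction;
  targetTempId?: string; // required for UPDATE/DELETE/NONE
  confidence: number;
  reason: string;
}

const DECISION_ACTIONS: readonly string[] = ["ADD", "UPDATE", "DELETE", "NONE"];

function validateDecisions(
  raw: unknown[],
  knownTempIds: Set<string>,
): { decisions: Decision[]; warnings: string[] } {
  const decisions: Decision[] = [];
  const warnings: string[] = [];
  raw.forEach((item, index) => {
    if (typeof item !== "object" || item === null) {
      warnings.push("decision " + index + ": not an object");
      return;
    }
    const { action, targetTempId, confidence, reason } = item as Record<string, unknown>;
    if (typeof action !== "string" || DECISION_ACTIONS.indexOf(action) === -1) {
      warnings.push("decision " + index + ": unknown action");
      return;
    }
    if (typeof confidence !== "number" || typeof reason !== "string") {
      warnings.push("decision " + index + ": missing confidence or reason");
      return;
    }
    if (action === "ADD") {
      decisions.push({ action: "ADD", confidence, reason });
      return;
    }
    // UPDATE/DELETE/NONE must point at a temp id that was offered to the model.
    if (typeof targetTempId !== "string" || !knownTempIds.has(targetTempId)) {
      warnings.push("decision " + index + ": invalid target id");
      return;
    }
    decisions.push({ action: action as DecisionAction, targetTempId, confidence, reason });
  });
  return { decisions, warnings };
}
```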

11.3 Contradiction handling

If same batch contains opposing claims for same subject:

block automatic destructive decision
store both with contradiction marker
emit review-needed history event

12) Search and Ranking Specification

12.1 Baseline retrieval

hybrid vector + BM25 search remains primary path
rows with is_deleted=1 are excluded from retrieval unless the caller passes an explicit include flag

12.2 Score components

final_score = a*vector + b*bm25 + c*access + d*graph (+ optional rerank)

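Where:

a, b, c, and d are configurable weights
vector and bm25 are the hybrid retrieval scores from 12.1
access is a logarithmic frequency boost
graph is a one-hop connectivity boost
rerank is the optional reranker adjustment; the reranker is opt-in and latency-bounded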
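
A non-normative sketch of the score combination without the optional rerank term. The default weights are assumptions chosen only to keep the graph weight low. The use of log(1 + access_count) for the access boost is likewise an assumption, as is treating vector, bm25, and graph as signals already normalized to [0, 1].

```typescript
interface ScoreWeights {
  vector: number;
  bm25: number;
  access: number;
  graph: number;
}

interface CandidateSignals {
  vector: number;      // assumed normalized similarity in [0, 1]
  bm25: number;        // assumed normalized BM25 score in [0, 1]
  accessCount: number; // raw access_count from the memories table
  graph: number;       // assumed one-hop connectivity signal in [0, 1]
}

// Assumed defaults; the spec only asks for a low graph weight by default.
const DEFAULT_WEIGHTS: ScoreWeights = { vector: 0.6, bm25: 0.3, access: 0.05, graph: 0.05 };

// Logarithmic frequency boost: grows with use but flattens quickly.
function accessBoost(accessCount: number): number {
  return Math.log1p(Math.max(0, accessCount));
}

function finalScore(s: CandidateSignals, w: ScoreWeights = DEFAULT_WEIGHTS): number {
  return (
    w.vector * s.vector +
    w.bm25 * s.bm25 +
    w.access * accessBoost(s.accessCount) +
    w.graph * s.graph
  );
}
```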

12.3 Reranking policy

disabled by default
only applied to top-N candidates
bounded timeout; fallback to pre-rerank order on timeout/failure

13) Graph Memory Specification

13.1 Extraction behavior

Entity and relation extraction run in the same pipeline pass as fact extraction, but their results are persisted independently.

13.2 Merge behavior

entity merge via semantic threshold + canonical name normalization
relation merge by (source, relationship, target) tuple
mention counts increment, not duplicate row insertion

13.3 Query-time usage

extract query entities
resolve nearest stored entities
expand one-hop neighbors
boost linked memories via memory_entity_mentions

If no entities resolve, search behaves exactly like baseline hybrid.
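
The query-time steps above can be sketched as follows, with plain arrays standing in for the relations and memory_entity_mentions tables. This is a non-normative example: treating relations as undirected for expansion and counting linked mentions as the boost signal are both assumptions.

```typescript
interface Relation {
  sourceEntityId: string;
  targetEntityId: string;
}

interface Mention {
  memoryId: string;
  entityId: string;
}

// Seeds plus every entity exactly one relation away from a seed.
function oneHopNeighbors(seedEntityIds: string[], relations: Relation[]): Set<string> {
  const seeds = new Set<string>(seedEntityIds);
  const expanded = new Set<string>(seedEntityIds);
  for (const rel of relations) {
    if (seeds.has(rel.sourceEntityId)) expanded.add(rel.targetEntityId);
    if (seeds.has(rel.targetEntityId)) expanded.add(rel.sourceEntityId);
  }
  return expanded;
}

// Maps memory id -> number of mentions that link it to the expanded entity set.
function graphBoostByMemory(
  seedEntityIds: string[],
  relations: Relation[],
  mentions: Mention[],
): Map<string, number> {
  const boosts = new Map<string, number>();
  // No resolved entities: empty map, so ranking equals baseline hybrid.
  if (seedEntityIds.length === 0) return boosts;
  const entities = oneHopNeighbors(seedEntityIds, relations);
  for (const mention of mentions) {
    if (entities.has(mention.entityId)) {
      boosts.set(mention.memoryId, (boosts.get(mention.memoryId) ?? 0) + 1);
    }
  }
  return boosts;
}
```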

14) API Specification (Delta)

14.1 `POST /api/memory/remember`

New request fields:

mode: auto | sync | async
raw: boolean (bypass extraction)
idempotency_key: optional client key
pipeline_timeout_ms: optional override within safe bounds

Response includes:

memory_id
status: processed | queued | raw_saved
job_id when queued
warnings[]

14.2 `GET /api/memory/jobs/:id`

Returns job state, attempts, next retry, and last error summary.

14.3 `GET /api/memory/:id/history`

Returns ordered event log with model/provider provenance and reasons.

14.4 `POST /api/memory/:id/recover`

Recovers soft-deleted memory if still in retention window.

14.5 `PATCH /api/memory/:id`

Explicit modify endpoint for agent/operator corrections.

Request supports:

content (optional)
type, tags, importance, pinned (optional metadata updates)
reason (required; stored in history)
if_version (optional optimistic concurrency guard)

Behavior:

Fails on version mismatch when if_version is provided.
Updates embedding if content changes.
Writes UPDATE event to memory_history in same commit.

14.6 `DELETE /api/memory/:id`

Explicit forget endpoint (soft delete).

Request/query supports:

reason (required)
force (optional; required for pinned memory)

Behavior:

Default is soft-delete (is_deleted=1, deleted_at=now).
Rejects pinned delete unless force=true and policy permits caller.
Writes DELETE event to memory_history in same commit.

14.7 `POST /api/memory/forget`

Batch forget by query and filters (agent-safe forget flow).

Request supports:

query (semantic or keyword)
optional filters (type, tags, who, source_type, time range)
mode: preview | execute
limit
reason (required for execute)

Behavior:

preview returns candidate memory IDs and scores only.
execute applies soft-delete to selected IDs.
For large deletes above threshold, require explicit confirm token.
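
A non-normative sketch of the preview/execute safeguards. It assumes the candidate IDs were already selected by search and that a confirm token was issued out of band; how the token is generated and verified is not specified here, and all names are hypothetical.

```typescript
interface ForgetRequest {
  mode: "preview" | "execute";
  candidateIds: string[];
  reason?: string;
  confirmToken?: string;
}

type ForgetPlan =
  | { kind: "preview"; ids: string[]; confirmRequired: boolean }
  | { kind: "execute"; ids: string[] }
  | { kind: "rejected"; error: string };

function planForget(
  req: ForgetRequest,
  confirmThreshold: number,
  issuedToken: string,
): ForgetPlan {
  const overThreshold = req.candidateIds.length > confirmThreshold;
  if (req.mode === "preview") {
    // Preview never mutates; it only reports what execute would touch.
    return { kind: "preview", ids: req.candidateIds, confirmRequired: overThreshold };
  }
  if (req.reason === undefined || req.reason.trim() === "") {
    return { kind: "rejected", error: "reason is required for execute" };
  }
  if (overThreshold && req.confirmToken !== issuedToken) {
    return { kind: "rejected", error: "confirm token required above threshold" };
  }
  // The caller then soft-deletes exactly these IDs and writes DELETE history events.
  return { kind: "execute", ids: req.candidateIds };
}
```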

14.8 `POST /api/memory/modify`

Batch modify operation for structured edits.

Request supports list of patches:

{ id, content?, tags?, type?, importance?, reason, if_version? }

Behavior:

Atomic per item, not all-or-nothing across batch.
Per-item result includes success/failure + conflict reason.
Each successful item writes UPDATE history event.

14.9 Compatibility

Legacy aliases remain functional, mapped to the new pipeline behavior.

15) Configuration Specification

15.1 Pipeline config block

Config supports:

provider selection and model per provider
mode defaults (auto/sync/async)
timeout and retry budgets
dedup thresholds
reranker enablement
graph enablement
local_only privacy enforcement
mutation policy (allow/deny delete of pinned, max batch delete size)
forget safeguards (preview required, confirm threshold)

15.2 Safe defaults

provider: local Ollama
mode: auto
reranker: off
graph boost: low weight
remote redaction: on

16) Observability and Operations

16.1 Required metrics

Pipeline throughput and reliability:

remembers total, processed, queued, fallback raw saves
extraction success rate
decision parse failure rate
queue depth, queue age p95, dead-letter count
retry distribution by attempt

Latency:

remember end-to-end p50/p95/p99 by mode
extraction stage latency
decision stage latency
search latency p50/p95/p99

Quality indicators:

duplicate suppression rate
contradiction flag rate
update-vs-add ratio

Cost/usage:

tokens in/out per provider
estimated cost per 1k remembers

Storage/DB:

DB file size growth
sqlite busy/lock error rate

16.2 Logging requirements

Structured logs must include request id, memory id, job id, stage, provider, model, latency, and outcome.

16.3 Alert thresholds

Minimum alerts:

dead-letter rate > 1% over 15m
queue age p95 > 5m
remember p95 > SLO for 30m
sqlite busy errors above baseline threshold

16.4 Autonomous maintenance plane (self-healing)

The system must expose machine-actionable diagnostics and safe repair actions so agents can maintain memory health.

Required health signals (read-only):

queue health (depth, oldest_age, dead_rate, lease anomalies)
storage health (DB size growth, fragmentation, WAL growth)
index health (FTS/vec consistency and freshness)
model/provider health (availability, timeout rate, parse failure rate)
mutation safety health (wrong-target rate, rollback/recover success)

Required repair actions (mutating, policy-gated):

requeue dead/retryable jobs
release stale leases
reindex FTS and run vector linkage consistency checks
reembed subset by model/version drift
run retention/purge jobs in safe order
execute targeted rollback/recover for recent destructive mutations

Control constraints:

all repair actions require reason + actor + correlation id
every repair action writes an audit event
per-action rate limits and cooldown windows
policy-level allowlist of autonomous actions
emergency kill switch to disable all autonomous mutations

17) Rollout Plan

Phase A: Infrastructure hardening

schema additions, queue table, history table, new indexes
transaction boundary refactor
DB connection access unification

Gate: no regression in existing remember/recall behavior.

Phase B: Shadow extraction

run extraction + decision in shadow mode
do not mutate memory semantics yet; only log proposed actions

Gate: acceptable parse reliability and decision quality on real traffic.

Phase C: Controlled writes

enable ADD/NONE decisions
keep UPDATE/DELETE behind feature flag

Gate: low duplicate growth, no data-loss incidents.

Phase D: Full semantic decisions

enable UPDATE/DELETE with safety invariants
enable recover endpoint and retention worker

Gate: pinned protection and recoverability validated.

Phase E: Graph and optional reranking

enable graph extraction + low-weight graph boost
reranker optional per config

Gate: retrieval quality improvement without latency SLO breach.

Phase F: Autonomous maintenance enablement

expose diagnostics endpoints and health score
ship policy-gated repair action endpoints
enable agent maintenance loop in observe-only mode first
graduate to execute mode after guardrail gates pass

Gate: self-healing actions improve system health metrics without increasing accidental mutation incidents.

Phase G: OpenClaw plugin-first runtime migration

define canonical runtime path as @signetai/adapter-openclaw
keep @signet/connector-openclaw as install/bootstrap only
add runtime operations for explicit modify/forget and full lifecycle parity with daemon hook surface
keep legacy command-hook files as compatibility fallback behind config gate
enforce single active runtime path per session (plugin or legacy, not both)

Gate: plugin path reaches functional parity and no duplicate capture/recall occurs when compatibility mode is enabled.

18) Validation and Test Strategy

18.1 Unit validation

normalization, trivial-content filters, parser validation
contradiction detector
decision application safety checks

18.2 Integration validation

remember -> queue -> process -> recall end-to-end
soft-delete and recovery behavior
schema migration on representative legacy DB snapshots

18.3 Concurrency validation

simultaneous identical remembers
simultaneous conflicting updates
worker lease contention and lease expiry recovery

18.4 Fault injection

provider timeout
malformed model output
temporary DB lock contention
process restart during leased job

18.5 Self-healing validation

inject queue stalls and verify autonomous requeue + lease recovery
inject index drift and verify autonomous consistency repair
inject provider outage and verify degrade/recover without data loss
inject accidental forget in canary and verify recover workflow
verify kill switch disables autonomous mutations immediately

18.6 OpenClaw integration validation

plugin lifecycle path validates session-start, user-prompt-submit, session-end, and compaction callbacks
plugin tools validate search/store/get/list/forget/modify contracts
compatibility mode validates legacy command hooks continue to function
mixed-mode tests verify duplicate recall/capture protection
daemon outage tests verify graceful degradation and recovery behavior

19) Success Metrics

19.1 Quality metrics

Recall@5 and Recall@10 on labeled recall set.
nDCG@10 for ranking quality.
Duplicate creation rate per 1,000 remembers.
Decision precision/recall for ADD/UPDATE/DELETE against human labels.
Contradiction handling precision (flagged contradictions that are genuinely conflicting).
Modify precision: percent of explicit edits applied to intended IDs.
Forget precision: percent of forget executions deleting only intended IDs.

Target deltas vs baseline:

Recall@10: +15% relative minimum
nDCG@10: +10% relative minimum
duplicate rate: -60% relative minimum
decision precision: >= 0.90 for UPDATE/DELETE in canary
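
For scoring, Recall@k, nDCG@k, and relative deltas can be computed as in the non-normative sketch below. It assumes binary relevance labels; graded labels would need a gain function in place of the constant 1.

```typescript
// Fraction of relevant ids that appear in the top k results.
function recallAtK(ranked: string[], relevant: Set<string>, k: number): number {
  if (relevant.size === 0) return 0;
  const hits = ranked.slice(0, k).filter((id) => relevant.has(id)).length;
  return hits / relevant.size;
}

// nDCG@k with binary gains: DCG = sum over hits of 1 / log2(rank + 1), rank starting at 1.
function ndcgAtK(ranked: string[], relevant: Set<string>, k: number): number {
  let dcg = 0;
  ranked.slice(0, k).forEach((id, index) => {
    if (relevant.has(id)) dcg += 1 / Math.log2(index + 2);
  });
  let idealDcg = 0;
  for (let i = 0; i < Math.min(k, relevant.size); i++) {
    idealDcg += 1 / Math.log2(i + 2);
  }
  return idealDcg === 0 ? 0 : dcg / idealDcg;
}

// Relative change versus baseline, e.g. 0.15 means +15% relative.
function relativeDelta(baseline: number, candidate: number): number {
  return (candidate - baseline) / baseline;
}
```

As an example of the relative target, a baseline Recall@10 of 0.60 would need to reach at least 0.69 to satisfy the +15% gate.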

19.2 Reliability metrics

End-to-end remember success rate >= 99.9% (raw-save counts as success).
Queue dead-letter rate <= 0.5% daily.
Pinned-delete incidents = 0.
Recovery success for soft-deleted memories >= 99% within retention window.
Accidental deletion incidents = 0 in canary and GA.

19.3 Latency metrics

For local default provider:

remember p95 (auto mode): <= 1.2s
remember p99 (auto mode): <= 2.5s
fallback raw-save p95 under provider outage: <= 200ms
recall/search p95: <= 400ms without reranker
explicit modify p95: <= 300ms (single ID)
explicit forget p95: <= 250ms (single ID)

19.4 Cost and efficiency metrics

Median tokens per remember decision path.
Estimated provider cost per 1,000 remembers.
Queue processing throughput (jobs/minute) at target concurrency.
Storage growth per 10,000 remembers.

19.5 Self-healing effectiveness metrics

Mean time to detect (MTTD) and mean time to recover (MTTR) for queue, provider, and index incidents.
Autonomous remediation success rate by action type.
Percent of incidents resolved without human intervention.
False-remediation rate (action taken but no health improvement).
Safety incident rate attributable to autonomous actions.

Targets:

autonomous resolution >= 80% of sev-3 memory incidents
MTTR improvement >= 50% vs non-autonomous baseline
false-remediation rate <= 5%
autonomous-action safety incidents = 0

19.6 OpenClaw integration metrics

Plugin adoption rate vs legacy command-hook path.
Duplicate recall/capture incident rate.
Plugin callback success rate by lifecycle event.
Tool success rate for explicit modify/forget operations.
OpenClaw-to-daemon roundtrip latency p95 for hook/tool calls.

Targets:

plugin adoption >= 90% before deprecating legacy path
duplicate recall/capture incidents = 0
callback and tool success rates >= 99%

20) Benchmarking Methodology

20.1 Benchmark datasets

Build three datasets:

Real anonymized remember/recall pairs from active usage.
Synthetic gold dataset with known facts, updates, and contradictions.
Adversarial dataset (ambiguous phrasing, negation, noisy formatting, trivial chatter, malformed JSON-like input).

Build one additional mutation dataset:

Modify/forget intent dataset with gold target IDs, including near-duplicate memories and pinned-memory edge cases.

Each dataset must include expected retrieval labels and expected update actions for scoring.

20.2 Offline quality benchmark

Method:

Run baseline pipeline and v2 pipeline on identical dataset snapshots.
Freeze embedding model and extraction model versions for run comparability.
Score Recall@k, nDCG, duplicate rate, decision precision/recall.
Score modify precision and forget precision on mutation dataset.

Output:

per-dataset report
aggregate weighted score
regression diff report by metric and error category

20.3 Online canary benchmark

Method:

Route small traffic slice to v2 with feature flags.
Keep baseline as control.
Compare latency, queue stability, and user-visible recall quality.
Compare modify/forget safety metrics (false delete, wrong-target edit).

Guardrails:

automatic rollback if latency or error thresholds are breached for a fixed window.

20.4 Load and stress benchmark

Scenarios:

sustained remember traffic
burst remember traffic
mixed remember+recall traffic
provider degraded/unavailable periods

Measurements:

throughput (RPS/jobs-per-minute)
queue lag
sqlite busy rates
p95/p99 latency
wrong-target modify rate
wrong-target forget rate

20.5 Resilience benchmark

Inject failures:

daemon restart during leased jobs
provider timeout spikes
malformed model responses
temporary file lock contention

Pass criteria:

no memory loss
eventual job completion or dead-letter with audit trail
recovery path functional
no unrecoverable delete from explicit forget endpoints

20.6 Cost benchmark

Run controlled 10k remember workload per provider mode and collect:

tokens consumed
estimated spend
median and p95 processing latency
quality score deltas vs local provider

20.7 Self-healing benchmark

Method:

Run deterministic failure scenarios (queue stall, lease leak, provider outage, index drift, rollback requirement).
Compare control (alerts only) vs autonomous maintenance mode.
Measure MTTD, MTTR, successful repairs, and safety outcomes.

Pass criteria:

autonomous mode meets MTTR and safety targets from section 19.5
no autonomous action bypasses policy controls
complete audit chain exists for every autonomous action

20.8 OpenClaw runtime benchmark

Method:

Run the same scripted conversations through:
- plugin-first runtime path (@signetai/adapter-openclaw)
- legacy command-hook compatibility path
Measure recall quality parity, capture quality parity, and operation latency.
Validate explicit modify/forget behavior and audit events.

Pass criteria:

plugin path is non-inferior on recall/capture quality
plugin path meets latency and reliability SLOs
no duplicate actions in compatibility or mixed-mode tests

21) Risks and Mitigations

Over-locking from naive transaction scope
- Mitigation: strict short transactions + compare-and-set writes.
Model hallucinated updates/deletes
- Mitigation: pinned hard-block, confidence thresholds, recoverability, history audit.
Queue backlog under provider degradation
- Mitigation: mode fallback, retry backoff, queue alerts, dead-letter.
Schema drift across installations
- Mitigation: robust schema detection + additive migration + backup.
Privacy leakage to remote providers
- Mitigation: local default, redaction, provider allowlist, local-only enforcement.

22) Failure Modes and Guardrails

Each guardrail below is required and has a release gate.

22.1 Unauthorized modify/forget

Required controls:
- caller identity must be resolved for every mutate request
- role-based policy for remember, modify, forget, recover
- actor identity must be written to history for every mutation
Release gate:
- policy tests prove unauthorized requests are denied and authorized requests succeed with correct audit actor

22.2 Mass forget blast radius

Required controls:
- mandatory preview mode for query forget
- max-delete threshold per request
- confirm token required above threshold
- optional dry-run-only policy mode for canary
Release gate:
- simulated broad forget cannot execute without preview + confirmation, and threshold policy blocks oversized requests

22.3 Wrong-target modify/forget

Required controls:
- strict ID-based execution after preview selection
- optimistic concurrency via if_version
- per-item result reporting for batch operations
Release gate:
- mutation benchmark meets wrong-target thresholds for edit/delete

22.4 Human edit override erosion

Required controls:
- manual_override lock with configurable TTL
- inferred LLM updates blocked during lock window unless forced by explicit operator intent
Release gate:
- tests verify inferred updates do not overwrite locked memories
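
A non-normative sketch of the lock-window check for 22.4. The overrideSetAtMs timestamp is an assumption: section 6.1 defines manual_override but no dedicated lock timestamp, so a real implementation would have to derive it from an existing or new column.

```typescript
interface OverrideState {
  manualOverride: boolean;
  overrideSetAtMs: number | null; // hypothetical: when the human edit locked the memory
}

// Returns true when an inferred (LLM) update may be applied to the memory.
function isInferredUpdateAllowed(
  state: OverrideState,
  nowMs: number,
  lockTtlMs: number,
  forcedByOperator: boolean = false,
): boolean {
  if (forcedByOperator) return true; // explicit operator intent bypasses the lock
  if (!state.manualOverride || state.overrideSetAtMs === null) return true;
  // Locked memories reopen to inferred updates once the TTL has elapsed.
  return nowMs - state.overrideSetAtMs >= lockTtlMs;
}
```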

22.5 Cross-tenant or cross-scope memory leakage

Required controls:
- mandatory filter scoping (user_id/agent_id/run_id) on all search/mutation paths
- reject unscoped destructive operations unless explicit admin policy
Release gate:
- tenant isolation tests show no cross-scope read/write/delete leakage

22.6 Prompt injection via stored memory content

Required controls:
- memory text treated as untrusted input in prompts
- prompt templates isolate instructions from content blocks
- strip or neutralize high-risk control tokens in model context
Release gate:
- adversarial prompt-injection test suite passes with no policy bypass

22.7 Retry duplication and non-idempotent mutations

Required controls:
- idempotency keys for explicit mutate endpoints
- compare-and-set updates by version
- dedup key for queued jobs to prevent duplicate processing
Release gate:
- retry/replay tests show no duplicate UPDATE/DELETE side effects

22.8 Graph drift after modify/forget

Required controls:
- modify/delete updates memory_entity_mentions and relation counts
- periodic graph consistency checker job
Release gate:
- consistency checks show no orphan links after mutation workloads

22.9 Tombstone/history retention gaps

Required controls:
- separate retention policies for active memory, tombstones, history, and jobs
- purge worker must run in safe order (links -> tombstones -> history)
Release gate:
- retention tests prove recoverability within SLA and clean purge after expiry

22.10 Low-quality eval labels

Required controls:
- human-labeled gold set for modify/forget with adjudication
- inter-rater agreement threshold before benchmark acceptance
Release gate:
- benchmark report includes label quality metrics and passes minimum agreement requirement

22.11 Operational recovery failure

Required controls:
- tested backup/restore runbook
- tested queue replay from crash state
- tested rollback trigger and procedure
Release gate:
- game-day drill passes with documented timings and no data loss

22.12 Abuse and anomaly spikes

Required controls:
- anomaly alerts for delete spikes and wrong-target mutation spikes
- per-caller rate limits for forget/modify
- emergency mutation freeze switch
Release gate:
- staged abuse simulation triggers alerts and freeze switch behaves as expected

22.13 Runaway or unsafe autonomous remediation

Required controls:
- bounded action budgets per hour/day and per incident
- mandatory health re-check after each repair step
- automatic halt after repeated ineffective repairs
- human escalation path with full state snapshot
Release gate:
- chaos test proves loop halts safely on non-improving conditions and escalates correctly

22.14 OpenClaw dual-path execution conflicts

Required controls:
- runtime arbitration key per session that selects plugin or legacy path
- idempotency keys on capture actions across both paths
- clear precedence rules in config and docs
Release gate:
- mixed-mode tests prove no duplicate remember/recall/capture actions

23) Implementation Deliverables

Updated schema and migration set with rollback runbook.
Durable memory job queue + worker loop semantics.
Extraction/decision pipeline integrated with remember endpoint.
Explicit modify/forget API + history and recovery endpoints.
Graph extraction storage and graph-boosted retrieval.
Metrics, dashboards, and alert rules.
Benchmark harness + baseline and canary report templates.
Agent maintenance plane (diagnostics + repair actions + policy engine).
Autonomous maintenance benchmark suite and runbooks.
Operator docs for config, rollout, failure handling, and escalation.
OpenClaw plugin-first runtime integration with legacy fallback mode.
OpenClaw migration and deprecation runbook for command-hook path.

24) Final Acceptance Checklist

All must pass:

schema migration dry run and live run tested
queue recovery tested across restart
pinned deletion prevention verified
soft-delete retention + recover verified
explicit modify endpoint concurrency guard tested
explicit forget preview/execute safeguards tested
offline benchmark meets quality targets
canary metrics meet latency/reliability targets
autonomous maintenance gates meet section 19.5 targets
kill switch and escalation workflows tested end-to-end
OpenClaw plugin path passes parity and no-duplication gates
rollout and rollback procedures documented and exercised

25) Immediate Next Steps

Approve this spec as implementation contract.
Create phase-level tickets (A through G) with explicit owners.
Capture baseline benchmark snapshot before first code change.
Start Phase A behind feature flags, then proceed by gates.
Define autonomous maintenance action policy and escalation matrix.
Finalize OpenClaw plugin-vs-legacy arbitration policy and migration timeline.

26) OpenClaw Plugin-First Integration Specification

26.1 Scope and intent

This section defines how OpenClaw wiring fits into the memory pipeline implementation so runtime behavior, safety, and observability remain consistent with the rest of this spec.

26.2 Component responsibilities

@signet/connector-openclaw (install/bootstrap):
- patch OpenClaw config entries
- install compatibility hook files
- no long-term ownership of runtime memory policy
@signetai/adapter-openclaw (runtime integration):
- lifecycle callbacks to daemon hook endpoints
- runtime tool surface for memory operations
- primary path for remember/recall/modify/forget in OpenClaw
Signet daemon:
- single source of truth for memory semantics, policy, and audit
- endpoint contract owner for hooks and memory APIs

26.3 Runtime path selection

One path must be active per OpenClaw session:

plugin (preferred)
legacy-hook (compatibility)

Selection rules:

If plugin capability is present and healthy, use plugin.
If plugin capability is absent, fall back to legacy-hook.
If both are configured, arbitration enforces one active path and logs the decision.
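
Non-normative sketch of the selection rules above. The names (`PathSignals`, `selectRuntimePath`) are hypothetical. The spec does not say what happens when the plugin is present but unhealthy; this sketch assumes a fallback to legacy-hook.

```typescript
type RuntimePath = "plugin" | "legacy-hook";

// Hypothetical inputs to arbitration; field names are illustrative only.
interface PathSignals {
  pluginPresent: boolean;
  pluginHealthy: boolean;
  legacyConfigured: boolean;
}

interface PathDecision {
  path: RuntimePath;
  reason: string; // recorded so the arbitration decision can be logged
}

function selectRuntimePath(s: PathSignals): PathDecision {
  if (s.pluginPresent && s.pluginHealthy) {
    const reason = s.legacyConfigured
      ? "both configured; plugin preferred"
      : "plugin present and healthy";
    return { path: "plugin", reason };
  }
  if (!s.pluginPresent) {
    return { path: "legacy-hook", reason: "plugin capability absent" };
  }
  // Assumption: an unhealthy plugin falls back to the compatibility path.
  return { path: "legacy-hook", reason: "plugin unhealthy (assumed fallback)" };
}
```

Exactly one path is returned per call, which matches the one-active-path-per-session requirement as long as the decision is computed once per session and cached.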

26.4 Required OpenClaw runtime capabilities

Lifecycle callbacks:
- session start
- user prompt submit
- session end
- pre-compaction
- compaction complete
Tool operations:
- memory_search, memory_store, memory_get, memory_list
- memory_modify, memory_forget
Error handling:
- graceful daemon timeout behavior
- explicit user-visible error summaries
- retry only when idempotency is guaranteed

26.5 Safety and consistency requirements

No duplicate capture/recall per event.
Explicit modify/forget must route through the same policy checks and history writes as non-OpenClaw callers.
Plugin and legacy compatibility paths must emit equivalent audit fields (actor, session, request, path).

26.6 Deprecation policy for legacy command-hook path

Keep compatibility path until plugin adoption and reliability targets are met (section 19.6).
Announce deprecation window with migration instructions.
Remove legacy path only after canary and GA windows complete without duplicate-action incidents.

27) Locked Implementation Decisions

These decisions are approved defaults for implementation unless explicitly superseded by a future revision.

27.1 OpenClaw legacy fallback sunset

Keep legacy command-hook fallback for at least 90 days after plugin-first GA.
Legacy removal is blocked until section 19.6 targets are met:
- plugin adoption >= 90%
- duplicate recall/capture incidents = 0
- callback and tool success rates >= 99%
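
Non-normative sketch of the removal gate as a single check. The metric field names are hypothetical; the thresholds are the ones listed above. Rates are assumed to be fractions between 0 and 1.

```typescript
// Hypothetical metrics snapshot; only the thresholds come from this section.
interface LegacySunsetMetrics {
  daysSincePluginGa: number;
  pluginAdoption: number;      // fraction of sessions on the plugin path
  duplicateIncidents: number;  // duplicate recall/capture incidents
  callbackSuccessRate: number; // fraction
  toolSuccessRate: number;     // fraction
}

// Returns the list of unmet conditions; an empty list means removal is unblocked.
function legacyRemovalBlockers(m: LegacySunsetMetrics): string[] {
  const blockers: string[] = [];
  if (m.daysSincePluginGa < 90) blockers.push("fewer than 90 days since plugin-first GA");
  if (m.pluginAdoption < 0.9) blockers.push("plugin adoption below 90%");
  if (m.duplicateIncidents !== 0) blockers.push("duplicate recall/capture incidents present");
  if (m.callbackSuccessRate < 0.99) blockers.push("callback success rate below 99%");
  if (m.toolSuccessRate < 0.99) blockers.push("tool success rate below 99%");
  return blockers;
}
```

Returning the unmet conditions rather than a boolean keeps the gate explainable in release reports.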

27.2 Force-delete authority for pinned memories

Force-delete of pinned memories is operator-only by default.
Autonomous agents are denied force-delete unless a future policy explicitly enables it.
Force-delete requires reason, preview/confirmation flow, and audit event with actor identity.

27.3 Privacy default

New installs default to local_only = true for memory pipeline LLM processing.
Remote providers remain opt-in.
Existing installs preserve current behavior unless users opt into stricter privacy mode.

27.4 Retention defaults

Default retention windows:

Soft-deleted memories (tombstones): 30 days.
Memory history events: 180 days.
Completed jobs: 14 days.
Dead-letter jobs: 30 days.

These values are configurable and may be tightened by policy.
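
Non-normative sketch of the defaults above and a recovery-window check. The key names and the boundary rule (a tombstone is recoverable while its age is strictly less than the window) are assumptions, not part of the spec.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Defaults from this section, in days. Key names are hypothetical.
const retentionDays = {
  tombstones: 30,
  history: 180,
  completedJobs: 14,
  deadLetterJobs: 30,
};

// Assumption: recoverable while age is strictly less than the window.
function isRecoverable(
  deletedAt: Date,
  now: Date,
  windowDays: number = retentionDays.tombstones
): boolean {
  return now.getTime() - deletedAt.getTime() < windowDays * DAY_MS;
}
```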

Privacy-sensitive install policy:

History retention must be explicitly chosen during setup (no implicit default).
Recommended choices are 90 or 180 days based on operator risk posture.

27.5 Autonomous maintenance bounds (GA)

Allowed unattended actions:

Requeue retryable jobs and release stale leases.
Run index/consistency checks and non-destructive repairs.
Run bounded re-embed jobs for model/version drift.

Human-approval-required actions:

Bulk forget operations.
Force-delete operations and deletions of pinned memories.
Retention purge overrides and destructive rollback operations.

27.6 OpenClaw migration benchmark bar

Plugin-first migration passes only when all are true:

Plugin path is non-inferior to legacy path on recall/capture quality.
Plugin path is equal to or better than the legacy path on reliability and latency SLOs.
Duplicate actions remain zero in mixed-mode and canary windows.

28) Competitive Gap Mapping (Supermemory Comparison)

This section maps current observed product gaps (vs references/supermemory/) to this implementation plan so prioritization stays explicit.

28.1 Current gap categories (as of this draft)

Memory lifecycle completeness:
- missing shipped explicit modify/forget/history/recover endpoints
- limited mutation safety tooling in GA path today
Ingestion breadth:
- limited native ingest surface compared to URL/file/media/connectors
Ecosystem integrations:
- thinner framework middleware and SDK ergonomics
MCP distribution posture:
- less standardized hosted OAuth/API-key style MCP path
Product analytics:
- less complete usage/error analytics for operator workflows

28.2 What v2 phases close directly

Phases A-D close most memory lifecycle and safety gaps:
- durable queue + retries
- explicit modify/forget + history + recover
- pinned safety invariants and mutation policy controls
Phase E closes graph-aware retrieval quality gaps.
Phase F closes self-healing and operational rigor gaps.
Phase G closes OpenClaw runtime parity and duplicate-path safety gaps.

28.3 What remains after v2 (out of scope for A-G)

Broad connector catalog and sync orchestration.
Framework-native middleware/tooling parity across major app stacks.
Hosted multi-tenant analytics and scoped auth experience.

29) Post-v2 Expansion Phases (H-K)

These phases begin only after section 24 acceptance criteria pass.

Phase H: Ingestion and Connector Foundation (local-first sources first)

Add first-class document ingest contracts for text/URL/file with asynchronous processing status (queued/extracting/chunking/embedding/indexing/done/failed).
Introduce connector runtime interface and provider contract: auth, incremental sync, replay, idempotency, provenance.
Ship initial connector set focused on highest leverage: GitHub docs, local filesystem/project docs, and Google Drive.
Gate: connector imports are idempotent, auditable, and do not violate local-first defaults.

Phase I: SDK and Agent Integration Surface

Expand @signet/sdk to typed memory lifecycle APIs aligned with v2 (remember/recall/modify/forget/history/recover/jobs).
Add framework adapters:
- Vercel AI SDK middleware/tool package
- OpenAI SDK helper package (TypeScript first, Python optional)
Standardize examples and reference templates for memory injection prompts and safe tool calling.
Gate: integration examples pass conformance tests and benchmark parity with direct daemon API usage.

Phase J: Auth Scope and Deployment Modes

Keep localhost/no-auth as default path.
Add optional authenticated mode for remote/team deployments: scoped API tokens, role policy, and operation-level authorization.
Add project/agent/user scoping guarantees for read/write/mutate operations.
Gate: tenant/scope isolation tests pass with zero cross-scope leakage.

Phase K: Analytics and Operator UX

Add memory pipeline analytics endpoints and dashboard cards: throughput, error taxonomy, queue health, latency, mutation safety.
Include exportable audit and incident timelines.
Add benchmark report publishing workflow for canary/GA decisions.
Gate: operators can diagnose top memory failures without ad-hoc log spelunking.

30) Prioritization Guardrails for H-K

Do not expand connector breadth before v2 mutation safety gates are green.
Favor depth over breadth: one production-grade connector beats five partial connectors.
Preserve local-first defaults in every new surface.
Any destructive operation must remain previewable, reversible, and fully audited.
New integration layers must reuse daemon contracts rather than invent parallel semantics.

31) Immediate Planning Actions (next cycle)

Convert Phases A-G into owner-assigned milestones with explicit acceptance tests per gate.
Freeze and capture baseline benchmark snapshot before A starts.
Draft Phase H RFC in parallel (contracts only, no implementation) so post-v2 execution can start without discovery lag.
Scope @signet/sdk v2 API surface in parallel with D/E so Phase I is not blocked on SDK design.
Define deployment mode policy matrix now (localhost default, authenticated optional) to de-risk Phase J.

32) Deep Implementation Plan (A-G)

This section converts phases into concrete implementation tracks, component ownership, and sequencing constraints.

32.1 Package ownership map

@signet/core owns:
- schema and migrations API surface
- memory domain types and validators
- extraction/decision contracts
- retrieval scoring and graph boost logic
@signet/daemon owns:
- queue worker runtime and job leasing
- API handlers and authorization policy checks
- observability endpoints and health diagnostics
@signet/sdk owns:
- typed client surface for all memory lifecycle APIs
- integration-safe wrappers and transport defaults
@signetai/adapter-openclaw owns runtime plugin behavior.
@signet/connector-openclaw owns install/bootstrap and fallback hooks.

32.2 Phase A implementation tracks (infrastructure hardening)

Track A1: Schema and migration runtime

Add migrations for:
- memories new columns (content_hash, normalized_content, is_deleted, deleted_at, pinned, importance, extraction_status, provenance fields)
- memory_history
- memory_jobs
- graph tables (entities, relations, memory_entity_mentions)
Add migration audit table:
- schema_migrations_audit(id, started_at, ended_at, outcome, details)
Add preflight command/API:
- detect schema state
- produce additive migration plan report
- verify backup existence + checksum

Track A2: DB access and transaction boundaries

Introduce single daemon DB accessor factory with:
- one write handle
- pooled read handles when beneficial
Implement explicit transaction wrappers:
- txIngestEnvelope()
- txApplyDecision()
- txFinalizeAccessAndHistory()
Enforce no provider call inside write transaction via lintable guard:
- central helper requiring pure DB closures
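
Non-normative sketch of the central helper idea: a write-transaction wrapper that only accepts synchronous closures and rolls back if the closure returns a promise, which would indicate awaited work such as a provider call. `WriteDb`, `withWriteTx`, and the recording fake are hypothetical names; the real accessor would wrap the daemon's write handle.

```typescript
// Hypothetical minimal write-handle surface.
interface WriteDb {
  begin(): void;
  commit(): void;
  rollback(): void;
}

function isThenable(v: unknown): boolean {
  return typeof v === "object" && v !== null &&
    typeof (v as { then?: unknown }).then === "function";
}

// Runs `body` inside a write transaction. The body must be a pure,
// synchronous DB closure; returning a promise aborts the transaction.
function withWriteTx<T>(db: WriteDb, body: () => T): T {
  db.begin();
  try {
    const result = body();
    if (isThenable(result)) {
      throw new Error("write transaction body must be synchronous");
    }
    db.commit();
    return result;
  } catch (err) {
    db.rollback();
    throw err;
  }
}

// In-memory fake that records calls, for illustration and tests only.
function makeRecordingDb(): WriteDb & { log: string[] } {
  const log: string[] = [];
  return {
    log,
    begin: () => { log.push("begin"); },
    commit: () => { log.push("commit"); },
    rollback: () => { log.push("rollback"); },
  };
}
```

A wrapper such as txApplyDecision() could be built on this helper, so a lint rule only needs to forbid direct use of the write handle outside it.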

Track A3: Feature flags and kill switches

Add config keys:
- memory.pipelineV2.enabled
- memory.pipelineV2.shadowMode
- memory.pipelineV2.allowUpdateDelete
- memory.pipelineV2.graph.enabled
- memory.pipelineV2.autonomous.enabled
Add emergency freeze controls:
- memory.pipelineV2.mutationsFrozen
- memory.pipelineV2.autonomousFrozen

Track A4: Baseline compatibility

Keep legacy remember/recall behavior default-on until gates pass.
Add dual-read diagnostics mode:
- compare baseline vs v2 candidate retrieval silently
- log divergence metrics only

Exit criteria for Phase A (must all pass)

Additive migration succeeds on representative legacy DB snapshots.
Daemon restarts safely during migration/backfill.
No regression in existing /remember + /recall user path.

32.3 Phase B implementation tracks (shadow extraction)

Track B1: Contract-first extractors

Implement extractor interface:
- extractFactsAndEntities(input): ExtractionResult
Add validator with hard caps:
- max facts
- max entities/relations
- max fact length
- reject trivial or malformed outputs
Persist warnings even when extraction fails.

Track B2: Shadow decision engine

Implement candidate retrieval for each fact (top-K hybrid).
Implement decision contract parser for ADD/UPDATE/DELETE/NONE.
Persist proposed decisions to history with event=NONE + decision_reason, without mutating memories.

Track B3: Job worker + retry behavior

Worker loop with leasing:
- lease timeout
- stale lease reaper
- exponential backoff with jitter
Job dedup keys:
- one active extract job per memory/version
Dead-letter path with machine-readable error codes.
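
Non-normative sketch of the retry delay and stale-lease checks named above. It uses the "full jitter" variant of exponential backoff (delay drawn uniformly between 0 and a capped exponential ceiling). The spec does not pick a jitter variant; the function and parameter names are hypothetical.

```typescript
// attempt is 0-based. rand is injectable so tests can be deterministic.
function retryDelayMs(
  attempt: number,
  baseMs: number,
  capMs: number,
  rand: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(rand() * ceiling);
}

// A lease is stale once it has been held for at least the timeout;
// the reaper may then release it so another worker can take the job.
function isLeaseStale(leasedAtMs: number, nowMs: number, leaseTimeoutMs: number): boolean {
  return nowMs - leasedAtMs >= leaseTimeoutMs;
}
```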

Exit criteria for Phase B

Parse reliability reaches threshold on real traffic sample.
Shadow decisions produce explainable audit records.
Queue remains stable under provider timeouts/outages.

32.4 Phase C implementation tracks (controlled writes: ADD/NONE)

Track C1: ADD write path

Enable ADD application with idempotency checks:
- exact hash duplicate collapse at DB layer
Write mandatory history event in same commit as memory mutation.
Update embedding linkage for newly added memory.

Track C2: Contradiction and duplicate suppression

Run contradiction detector before destructive recommendations.
When contradiction risk present:
- block destructive write
- emit review-needed marker

Track C3: Read-path confidence controls

Exclude low-confidence extracted facts from write path by threshold.
Keep raw envelope available regardless of quality outcome.

Exit criteria for Phase C

Duplicate growth is materially reduced from baseline.
No data-loss incidents under repeated retries/restarts.

32.5 Phase D implementation tracks (full semantic decisions)

Track D1: Explicit mutation APIs

Implement endpoints from sections 14.5-14.8.
Require reason for all mutate operations.
Add if_version optimistic concurrency guards.
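
Non-normative sketch of the if_version guard combined with the mandatory reason. The in-memory `Map` stands in for the memories table, and the error codes are hypothetical.

```typescript
interface MemoryRow {
  id: string;
  content: string;
  version: number;
}

// Error codes are illustrative, not the spec's.
type ModifyResult =
  | { ok: true; row: MemoryRow }
  | { ok: false; error: "reason_required" | "not_found" | "version_conflict" };

function modifyMemory(
  store: Map<string, MemoryRow>,
  id: string,
  content: string,
  reason: string,
  ifVersion: number
): ModifyResult {
  if (reason.trim() === "") return { ok: false, error: "reason_required" };
  const current = store.get(id);
  if (current === undefined) return { ok: false, error: "not_found" };
  // Reject the write if another writer has bumped the version since the caller read it.
  if (current.version !== ifVersion) return { ok: false, error: "version_conflict" };
  const next: MemoryRow = { ...current, content, version: current.version + 1 };
  store.set(id, next);
  return { ok: true, row: next };
}
```

In a real store the version comparison and the write would happen in the same transaction, so two concurrent callers holding the same if_version cannot both succeed.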

Track D2: Soft-delete and recoverability

Implement soft-delete semantics in all forget flows.
Implement recover endpoint with retention-window checks.
Retention worker enforces purge order: links -> tombstones -> history.

Track D3: Pinned and policy guardrails

Reject pinned delete unless force + policy-authorized actor.
Deny autonomous force delete by default (locked decision 27.2).
Record actor identity + request/session correlation for all mutations.
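
Non-normative sketch of the pinned-delete guard. Type and field names are hypothetical. The preview/confirmation flow and audit write required by decision 27.2 are out of scope for this fragment.

```typescript
type Actor = { kind: "operator" | "agent"; id: string };

interface ForgetRequest {
  pinned: boolean;
  force: boolean;
  reason: string;
  actor: Actor;
}

// Hypothetical policy switch; the default denies autonomous force-delete.
interface ForgetPolicy {
  agentsMayForceDelete: boolean;
}

type Verdict = { allowed: true } | { allowed: false; why: string };

function authorizeForget(
  req: ForgetRequest,
  policy: ForgetPolicy = { agentsMayForceDelete: false }
): Verdict {
  if (req.reason.trim() === "") return { allowed: false, why: "reason required" };
  if (!req.pinned) return { allowed: true };
  if (!req.force) return { allowed: false, why: "pinned memory requires force" };
  if (req.actor.kind === "operator") return { allowed: true };
  return policy.agentsMayForceDelete
    ? { allowed: true }
    : { allowed: false, why: "autonomous force-delete denied by policy" };
}
```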

Exit criteria for Phase D

Wrong-target edit/delete metrics meet section 22 release gates.
Recover path passes SLA and audit-chain checks.

32.6 Phase E implementation tracks (graph + optional rerank)

Track E1: Graph extraction persistence

Persist entities/relations with merge policies.
Maintain memory_entity_mentions as mandatory link table.
On modify/delete, update mention counts and orphan cleanup safely.

Track E2: Query-time graph boost

Query entity extraction and nearest-entity resolution.
One-hop expansion and score boost integration: final = a*vector + b*bm25 + c*access + d*graph.
Graceful fallback to baseline hybrid if entity resolution fails.
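
Non-normative sketch of the score blend and its fallback. The weights shown in the usage note are placeholders, not tuned values. When entity resolution fails, the graph signal is absent and the score is assumed to reduce to the baseline hybrid terms without renormalizing the weights.

```typescript
interface Weights { a: number; b: number; c: number; d: number }

interface Signals {
  vector: number;
  bm25: number;
  access: number;
  graph?: number; // undefined when entity resolution fails
}

// final = a*vector + b*bm25 + c*access + d*graph, with the graph term
// dropped when no graph signal is available.
function finalScore(s: Signals, w: Weights): number {
  const baseline = w.a * s.vector + w.b * s.bm25 + w.c * s.access;
  return s.graph === undefined ? baseline : baseline + w.d * s.graph;
}
```

For example, with placeholder weights a=0.5, b=0.25, c=0.125, d=0.125 and all signals equal to 1, the score is 1 with the graph term and 0.875 without it.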

Track E3: Optional reranker

Add reranker hook for top-N only.
Enforce strict timeout with fallback to pre-rerank ordering.
Instrument latency overhead separately.

Exit criteria for Phase E

Recall@10 and nDCG targets met without latency SLO breach.
Graph consistency checks show no orphan-link drift.

32.7 Phase F implementation tracks (autonomous maintenance)

Track F1: Diagnostics plane

Add read-only health endpoints for: queue, storage, index, provider, mutation safety.
Compute health score with sub-scores per domain.

Track F2: Repair action plane

Implement policy-gated actions: requeue, lease release, reindex checks, bounded reembed.
Require reason + actor + correlation id on every action.
Enforce per-action cooldown and hourly/daily action budgets.
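
Non-normative sketch of a cooldown and budget check for one repair action type. The limit names and the sliding-window interpretation of "hourly" and "daily" are assumptions.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const DAY_WINDOW_MS = 24 * HOUR_MS;

// Hypothetical per-action limits.
interface ActionLimits {
  cooldownMs: number;
  maxPerHour: number;
  maxPerDay: number;
}

// pastRunsMs holds timestamps of earlier runs of the same action.
function mayRunAction(pastRunsMs: number[], nowMs: number, limits: ActionLimits): boolean {
  const last = pastRunsMs.length > 0 ? Math.max(...pastRunsMs) : -Infinity;
  if (nowMs - last < limits.cooldownMs) return false;
  const runsWithin = (windowMs: number): number =>
    pastRunsMs.filter((t) => nowMs - t < windowMs).length;
  return runsWithin(HOUR_MS) < limits.maxPerHour &&
    runsWithin(DAY_WINDOW_MS) < limits.maxPerDay;
}
```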

Track F3: Safety automation loop

Observe-only mode first (recommendations, no mutation).
Promote to execute mode only after canary gates.
Auto-halt on repeated non-improving remediations.

Exit criteria for Phase F

MTTR improves while maintaining zero autonomous safety incidents.
Kill switch is proven to take effect immediately.

32.8 Phase G implementation tracks (OpenClaw plugin-first)

Track G1: Runtime path arbitration

Session-level arbitration key chooses plugin vs legacy path.
Emit audit field runtime_path for every memory operation.
Enforce one active path per session.

Track G2: Tool and lifecycle parity

Ensure plugin supports full lifecycle callbacks.
Ensure plugin exposes memory tool parity: search/store/get/list/modify/forget.
Route plugin mutations through same daemon policy engine as all callers.

Track G3: Compatibility de-risking

Keep legacy hooks gated and measurable.
Add duplicate-action guard keys across both paths.
Publish migration + deprecation runbook.

Exit criteria for Phase G

Plugin path shows non-inferior quality and equal or better reliability and latency.
Duplicate actions remain zero in mixed-mode canary.

32.9 Dependency-critical sequencing

A must complete before any B-F mutating behavior.
B shadow data should run long enough to calibrate thresholds before C.
D cannot ship before mutation policy and retention worker are complete.
E graph boost should be enabled only after D mutation correctness gates.
F execute mode cannot ship before D+E safety baselines are stable.
G deprecation timeline starts only after F operational maturity.

33) Deep Implementation Plan (H-K)

These phases focus on product surface parity while preserving local-first and mutation safety guarantees from v2.

33.1 Phase H: Ingestion and connector foundation

H1: Document ingest API and lifecycle

Introduce document-level model:
- documents table with status machine and provenance
- mapping from document -> derived memories
Proposed ingest endpoints:
- POST /api/documents (text/url payload)
- POST /api/documents/file (file upload)
- GET /api/documents/:id (status + diagnostics)
- GET /api/documents/:id/chunks (debug/inspection)
Status machine: queued -> extracting -> chunking -> embedding -> indexing -> done|failed.
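
Non-normative sketch of the status machine as a transition table. It assumes any non-terminal stage may move to failed and that done and failed are terminal; the spec line above does not state either point explicitly.

```typescript
type DocStatus =
  | "queued" | "extracting" | "chunking"
  | "embedding" | "indexing" | "done" | "failed";

// Allowed next states per status (assumption: failure is reachable
// from every non-terminal stage; terminal states have no exits).
const NEXT_STATUS: Record<DocStatus, DocStatus[]> = {
  queued: ["extracting", "failed"],
  extracting: ["chunking", "failed"],
  chunking: ["embedding", "failed"],
  embedding: ["indexing", "failed"],
  indexing: ["done", "failed"],
  done: [],
  failed: [],
};

function canTransition(from: DocStatus, to: DocStatus): boolean {
  return NEXT_STATUS[from].includes(to);
}
```

Rejecting any transition not in the table keeps the status reported by GET /api/documents/:id monotonic and easy to debug.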

H2: Connector runtime contract

Define connector interface:
- authorize()
- listResources()
- syncIncremental(cursor)
- syncFull()
- replay(resourceId, fromVersion)
Required connector metadata:
- provider, tenant scope, cursor, last sync, rate-limit state, error state, provenance tags.
Shared connector safeguards:
- idempotency keys by source document hash + provider version
- per-connector backoff and dead-letter queue
- replay-safe mutation behavior

H3: Initial connector set (depth-first)

GitHub docs connector (text docs only first pass).
Local filesystem docs connector (watch + snapshot modes).
Google Drive connector as first OAuth source.

H4: Connector operations

Connector-specific health diagnostics and forced resync API.
Manual sync + preview impact report before destructive sync actions.
Connector runbook for quota/permission/replay failures.

Exit criteria for H

At least one connector is production-grade with replay + idempotency.
Document ingest lifecycle is visible and debuggable end-to-end.

33.2 Phase I: SDK and framework integrations

I1: @signet/sdk v2 client

Add typed methods for: remember, recall, modify, forget, history, recover, jobs, documents.
Add strong request/response schemas and error classes.
Add transport policies: retries only for idempotent operations by default.

I2: Framework adapters

Vercel AI SDK adapter:
- context injection middleware
- memory tool bindings
- safety defaults for mutation operations
OpenAI SDK adapter (TS first):
- helper for tool definitions + execution wrappers
- conversation-scoped memory context helpers
Optional Python adapter follows after TS parity and adoption signal.

I3: Conformance + examples

Golden integration tests verify adapters match daemon API semantics.
Reference examples for chat agents and coding agents.
Include “safe defaults” templates: preview-first forget, version-guarded modify.

Exit criteria for I

Adapter flows show non-inferior recall quality vs direct API calls.
Mutation safety invariants preserved through all adapter layers.

33.3 Phase J: Auth scope and deployment modes

J1: Deployment mode matrix

local-default mode:
- localhost only
- no mandatory auth
team-auth mode (optional):
- token auth required
- scoped policy enforcement
hybrid mode:
- local commands remain frictionless
- remote endpoints require auth and scope

J2: Token and policy model

Token classes:
- personal full-scope token
- scoped token (project/agent-restricted)
- short-lived session token (optional)
v1 scope granularity includes project + agent + user dimensions.
Permission matrix by operation: remember/recall/modify/forget/recover/admin.
Mandatory scope filters in all query and mutation paths.

J3: Security and abuse controls

Rate limits for destructive operations.
Threshold confirmation for bulk forget and force delete.
Full audit provenance for caller identity and role.

Exit criteria for J

Isolation tests prove no cross-scope leakage.
Unauthorized mutation tests pass under all deployment modes.

33.4 Phase K: Analytics and operator UX

K1: Analytics data model

Usage counters by endpoint, actor, provider, connector.
Error taxonomy with stage-level codes.
Latency histograms for remember/recall/mutate/jobs.

K2: API endpoints and dashboard

Proposed endpoints:
- GET /api/analytics/usage
- GET /api/analytics/errors
- GET /api/analytics/logs
- GET /api/analytics/memory-safety
Dashboard cards: queue health, dead-letter trend, mutation safety, provider status, connector sync health.

K3: Operator incident workflow

One-click export of timeline for a request/session/memory id.
Include policy decisions and automated actions in timeline.
Add benchmark report publication to release decision checklist.

Exit criteria for K

Top incident classes diagnosable from dashboard + APIs alone.
Canary/GA decisions use standard published benchmark artifacts.

34) Detailed Milestone Graph and Suggested Sprinting

Assumes 2-week iterations and parallel work where safe.

34.1 Milestone groups

M1 (A1-A4): Schema + DB access + flags + baseline compatibility
M2 (B1-B3): Shadow extraction + queue stabilization
M3 (C1-C3): Controlled ADD writes + contradiction controls
M4 (D1-D3): Explicit mutation APIs + recovery + policy hardening
M5 (E1-E3): Graph persistence + graph boost + optional rerank
M6 (F1-F3): Diagnostics + repair actions + autonomous observe mode
M7 (G1-G3): OpenClaw plugin-first arbitration and parity
M8 (H1-H4): Document ingest + first production-grade connector
M9 (I1-I3): SDK v2 + framework adapters + conformance
M10 (J1-J3 + K1-K3): deployment auth modes + analytics UX

34.2 Parallelism guidance

M1 and benchmark harness prep can run together.
M2 shadow mode and API contract stubs for D can run together.
M5 graph extraction implementation can start during late M4, but feature-flagged off until D gates pass.
M8 RFC and interface contracts can start during M6/M7.
M9 SDK surface design should start before M8 completes.

34.3 Suggested de-risking order

Prioritize mutation correctness before connector breadth.
Prioritize policy correctness before authenticated deployment.
Prioritize observability before autonomous execute mode.

35) Engineering Findings and Recommendations

The strongest competitive leverage is not connector count first; it is trusted memory correctness under mutation and recovery.
Shipping explicit modify/forget/history/recover (D) is the highest product-perception unlock for parity.
Graph quality work (E) should stay conservative until mutation safety error rates are well-characterized.
OpenClaw plugin-first migration (G) is strategic for ecosystem coherence; treat duplicate-action prevention as a hard invariant.
Post-v2 connector work (H) should be depth-first with one excellent connector before expansion.
SDK/integration surfaces (I) need conformance tests so wrappers never drift from daemon policy semantics.
Optional auth modes (J) should not compromise local-default UX.
Analytics (K) should be considered a reliability feature, not just UX.

36) Operator Interview Outcomes (resolved)

The following implementation decisions were confirmed by operator input and now act as planning defaults.

Phase H OAuth connector priority:
- Google Drive first.
Phase J auth scope model for v1:
- include project + agent + user scoping in first authenticated mode.
Phase I adapter scope:
- defer Python adapter from wave 1; prioritize TypeScript adapter quality and adoption signal.
Privacy-focused retention policy:
- history retention must be config-required at setup.
Analytics redaction default:
- configurable at setup with safe presets (not one hard global mode).
Release-blocking metric priority:
- wrong-target mutation precision takes precedence over latency p95.

36.1 Plan implications from interview

Phase J complexity increases; allocate extra validation budget for scope-composition and isolation tests.
Setup UX must include retention + analytics redaction decisions.
Phase I timeline should assume TS adapter is productionized before Python starts.
Canary gates should halt rollout on mutation precision regressions even when latency remains within SLO.
