Connectors
Connectors let Signet pull content from external sources — local filesystems, documentation repos, cloud drives — and ingest that content into the memory store as searchable Documents. Each connector follows a consistent register-sync-health lifecycle managed through the Daemon‘s HTTP API.
The connector framework lives in packages/daemon/src/connectors/.
Type definitions are in packages/core/src/connector-types.ts.
Overview
When a connector is registered, the daemon persists a row in the
connectors SQLite table and assigns it a UUID. From that point on,
sync operations can be triggered on demand. Each sync walk produces
documents rows, which are picked up by the document ingest pipeline
for chunking, embedding, and indexing into memory.
Connectors do not run on a schedule by default. You trigger syncs explicitly via the API, or wire them into your own automation.
The document pipeline stages after ingest: queued → extracting →
chunking → embedding → indexing → done. Failures land at
failed with an error field on the document row.
Connector Status States
Every connector has a status field that tracks its current state.
The possible values (from CONNECTOR_STATUSES in
packages/core/src/connector-types.ts):
| Status | Description |
|---|---|
idle | Default state. The connector is registered and ready to sync. |
syncing | A sync operation is in progress. |
error | The last sync threw an unhandled exception. Check last_error. |
Status transitions:
idle → syncing → idle (successful sync)
idle → syncing → error (sync threw an exception)
error → syncing → idle (next sync succeeds)
The status field is updated via updateConnectorStatus() in
packages/daemon/src/connectors/registry.ts.
Connector Lifecycle
Register. POST a provider type and settings to create a new
connector. The daemon validates the provider name and stores a config
row. The connector starts in idle status.
Sync (incremental). POST to /:id/sync to process only resources
that have changed since the last successful sync. The stored
sync_cursor (a JSON blob with lastSyncAt) determines the cutoff.
This is the normal sync path — fast, low I/O.
Sync (full). POST to /:id/sync/full?confirm=true to reprocess
every matching resource regardless of the cursor. Requires the
?confirm=true query parameter as a guard against accidental
re-ingestion. Use this when you want to force re-embedding of existing
content.
Replay. Force-reprocess a single named resource without touching
the cursor or other documents. Useful for debugging a specific file.
(WIP — the replay() method exists on ConnectorRuntime but is not
yet exposed via an HTTP endpoint.)
Health. GET /:id/health returns current status, last sync
timestamp, any last error, and a live document count sourced from the
documents table.
Unregister. DELETE /:id removes the connector row. Pass
?cascade=true to also delete associated document rows.
API Endpoints
All write endpoints (POST, DELETE) require the admin permission.
GET endpoints are publicly accessible on the local daemon.
List connectors
GET /api/connectors
Response:
{
"connectors": [
{
"id": "b3a2...",
"provider": "filesystem",
"display_name": "My Docs",
"status": "idle",
"last_sync_at": "2026-02-20T14:00:00.000Z",
"last_error": null,
"cursor_json": "{\"lastSyncAt\":\"2026-02-20T14:00:00.000Z\"}",
"created_at": "2026-02-01T10:00:00.000Z",
"updated_at": "2026-02-20T14:00:01.000Z"
}
],
"count": 1
}
Register a connector
POST /api/connectors
Content-Type: application/json
Request body:
{
"provider": "filesystem",
"displayName": "My Docs",
"settings": {
"rootPath": "/home/user/docs",
"patterns": ["**/*.md", "**/*.txt"],
"maxFileSize": 1048576
}
}
provider must be one of: filesystem, github-docs, gdrive.
settings is passed through to the connector implementation as-is.
Response (201 Created):
{ "id": "b3a2c4d5-..." }
Get connector details
GET /api/connectors/:id
Returns the full ConnectorRow object.
Trigger incremental sync
POST /api/connectors/:id/sync
Returns immediately. The sync runs in the background.
Response:
{ "status": "syncing" }
If the connector is already syncing, returns 200 with the same body
rather than starting a duplicate run.
Trigger full resync
POST /api/connectors/:id/sync/full?confirm=true
The ?confirm=true parameter is required. Without it the endpoint
returns 400. The sync runs in the background; poll the health
endpoint to track completion.
Connector health
GET /api/connectors/:id/health
Response:
{
"id": "b3a2...",
"status": "idle",
"lastSyncAt": "2026-02-20T14:00:00.000Z",
"lastError": null,
"documentCount": 42
}
documentCount is a live count of documents whose source_url begins
with the connector’s rootPath. It reflects the current state of the
database, not the last sync result.
Delete a connector
DELETE /api/connectors/:id
DELETE /api/connectors/:id?cascade=true
Without cascade, only the connector row is removed. With
cascade=true, associated document rows are also deleted.
Response:
{ "deleted": true }
Filesystem Connector
The filesystem connector is the only built-in provider. It walks a local directory tree using glob patterns and ingests matching files as documents.
Configuration
| Field | Type | Default | Description |
|---|---|---|---|
rootPath | string | required | Absolute path to scan |
patterns | string[] | ["**/*.md", "**/*.txt"] | Glob patterns to match |
ignorePatterns | string[] | [".git", "node_modules", ".DS_Store"] | Paths to exclude |
maxFileSize | number | 1048576 (1 MB) | Files larger than this are skipped |
Example registration:
curl -s -X POST http://localhost:3850/api/connectors \
-H "Content-Type: application/json" \
-d '{
"provider": "filesystem",
"displayName": "Obsidian Vault",
"settings": {
"rootPath": "/home/user/obsidian-vault",
"patterns": ["**/*.md"],
"maxFileSize": 524288
}
}'
How it works
authorize checks that rootPath exists and is readable. If the
path is missing or permission is denied, registration still succeeds
but the first sync will fail with the error captured on the connector
row.
listResources runs the glob patterns against rootPath and returns
a flat list of matching files. Each resource has an id (relative
path), a name (basename), and an updatedAt (file mtime). The
listing is not paginated — all matching files are returned at once.
syncIncremental filters the discovered files to those whose mtime
is newer than cursor.lastSyncAt. Only changed files are processed.
This makes routine syncs fast even over large directory trees.
syncFull processes every matching file unconditionally, setting
forceUpdate: true. Existing document rows are reset to queued and
re-enqueued for the ingest pipeline.
replay reprocesses a single file identified by its relative path
(as returned by listResources). Useful for manually re-ingesting a
specific document without touching anything else.
For each file processed, the connector either inserts a new documents
row or updates the existing one (matched by source_url, which is the
absolute file path). After writing, it enqueues a document_ingest
job. The ingest pipeline handles chunking, embedding, and indexing
from that point on.
Files that cannot be read, or that exceed maxFileSize, produce a
SyncError entry in the sync result rather than halting the run.
Cursor-Based Incremental Sync
After each successful sync, the connector’s cursor_json column is
updated with a new SyncCursor:
interface SyncCursor {
lastSyncAt: string; // ISO timestamp
checkpoint?: string; // optional opaque continuation token
version?: number; // optional schema version
}
For the filesystem connector, only lastSyncAt is used. The
incremental sync compares each file’s mtime against this value and
skips anything older. A full sync ignores the cursor entirely but still
writes a fresh lastSyncAt when it completes.
The cursor is stored as JSON in the connectors table and updated
atomically alongside the connector’s last_sync_at timestamp in a
single write transaction (via updateCursor() in registry.ts).
On first sync (no cursor present), lastSyncAt defaults to the Unix
epoch (1970-01-01T00:00:00.000Z), which causes all files to be
treated as new.
Error Handling
Sync errors are non-fatal by design. If a single file fails to read,
the error is captured in the SyncResult.errors array and the sync
continues with remaining files. The connector’s status field stays
"syncing" until the entire run completes, then transitions to either
"idle" (success) or "error" (if the sync itself throws).
A per-resource SyncError has this shape:
interface SyncError {
resourceId: string; // relative path or resource identifier
message: string; // human-readable reason
retryable: boolean; // whether replaying would likely succeed
}
The last_error column on the connector row captures the message from
any unhandled exception that aborts a sync run. Per-resource errors
within a sync are surfaced in the sync result response but do not set
last_error.
Polling GET /api/connectors/:id/health after triggering a sync is
the intended way to check completion and surface errors.
Building Custom Connectors
To add a new provider, implement the ConnectorRuntime interface from
@signet/core. This section walks through the full process.
Step 1: Implement ConnectorRuntime
Create a new file in packages/daemon/src/connectors/:
import type {
ConnectorRuntime,
ConnectorConfig,
ConnectorResource,
SyncCursor,
SyncResult,
SyncError,
} from "@signet/core";
class MyConnector implements ConnectorRuntime {
readonly id: string;
readonly provider = "my-provider" as const;
private readonly settings: MySettings;
constructor(config: ConnectorConfig) {
this.id = config.id;
// Parse and validate config.settings
this.settings = parseMySettings(config.settings);
}
async authorize(): Promise<{ readonly ok: boolean; readonly error?: string }> {
// Validate credentials, connectivity, or access permissions.
// Return { ok: false, error: "reason" } if authorization fails.
// Registration still succeeds even if authorize() fails — the
// error surfaces on the first sync attempt.
return { ok: true };
}
async listResources(cursor?: string): Promise<{
readonly resources: readonly ConnectorResource[];
readonly nextCursor?: string;
}> {
// Return a list of resources available for syncing.
// Each resource needs: id (unique string), name, updatedAt (ISO).
// Use cursor/nextCursor for paginated listings if needed.
return { resources: [] };
}
async syncIncremental(cursor: SyncCursor): Promise<SyncResult> {
// Fetch and process only resources changed since cursor.lastSyncAt.
// For each resource:
// - Insert or update a documents row
// - Enqueue a document_ingest job
// - Track added/updated counts
// - Capture per-resource errors in the errors array
return {
documentsAdded: 0,
documentsUpdated: 0,
documentsRemoved: 0,
errors: [],
cursor: { lastSyncAt: new Date().toISOString() },
};
}
async syncFull(): Promise<SyncResult> {
// Fetch and process ALL resources regardless of cursor.
// Same logic as syncIncremental but with forceUpdate: true.
return {
documentsAdded: 0,
documentsUpdated: 0,
documentsRemoved: 0,
errors: [],
cursor: { lastSyncAt: new Date().toISOString() },
};
}
async replay(resourceId: string): Promise<SyncResult> {
// Reprocess a single resource by its id.
// Does not update the cursor.
return {
documentsAdded: 0,
documentsUpdated: 0,
documentsRemoved: 0,
errors: [],
cursor: { lastSyncAt: new Date().toISOString() },
};
}
}
Step 2: Add the provider to the type system
In packages/core/src/connector-types.ts, add your provider string to
the CONNECTOR_PROVIDERS tuple:
export const CONNECTOR_PROVIDERS = [
"filesystem",
"github-docs",
"gdrive",
"my-provider", // ← add here
] as const;
This ensures the API validates the provider name at registration time.
Step 3: Wire the factory into the daemon
In packages/daemon/src/daemon.ts, find where createFilesystemConnector
is called in the sync route handlers. Add a branch for your provider:
function createConnectorRuntime(
config: ConnectorConfig,
accessor: DbAccessor,
): ConnectorRuntime {
switch (config.provider) {
case "filesystem":
return createFilesystemConnector(config, accessor);
case "my-provider":
return createMyConnector(config, accessor);
default:
throw new Error(`Unknown provider: ${config.provider}`);
}
}
Step 4: Export a factory function
export function createMyConnector(
config: ConnectorConfig,
accessor: DbAccessor,
): ConnectorRuntime {
return new MyConnector(config);
}
Key implementation notes
-
Document rows: Each synced resource should map to a
documentstable row with a uniquesource_url. Use the existinginsertDocument()/updateDocument()helpers fromfilesystem.tsas a reference, or write your own usingDbAccessor. -
Ingest pipeline: After inserting or updating a document row, call
enqueueDocumentIngestJob(accessor, docId)to kick off chunking and embedding. -
Error resilience: Individual resource failures should be captured in the
SyncResult.errorsarray, not thrown. Only throw for unrecoverable errors that should abort the entire sync. -
Cursor management: The daemon handles cursor persistence — your connector just returns the new cursor in the
SyncResult. TheupdateCursor()function inregistry.tswrites it atomically. -
Status updates: The daemon’s sync route handlers call
updateConnectorStatus()before and after your sync methods. You don’t need to manage status yourself.
Health Lifecycle
Connector health is tracked through the status, last_sync_at, and
last_error fields on the connector row.
The daemon manages these fields around sync operations:
- Before sync: Status is set to
"syncing",last_erroris cleared - After successful sync: Status returns to
"idle", cursor is updated - After failed sync: Status is set to
"error",last_errorcaptures the exception message
The health endpoint (GET /api/connectors/:id/health) returns a
snapshot of these fields plus a live documentCount computed from the
documents table.
Health data is useful for monitoring dashboards or automation that triggers syncs and needs to know when they complete.
Registered Providers
The CONNECTOR_PROVIDERS tuple in packages/core/src/connector-types.ts
defines the valid provider names:
| Provider | Status | Description |
|---|---|---|
filesystem | Implemented | Local directory tree ingestion |
github-docs | Placeholder | GitHub repository documentation (not yet implemented) |
gdrive | Placeholder | Google Drive documents (not yet implemented) |
The register endpoint returns 400 if the provider field doesn’t
match a value in this tuple.