Post-Merge Canary Suite
Problem
PR-level tests validate individual packages but miss integration-level regressions: daemon fails to start after a migration, search quality degrades, remember/recall cycle breaks. These are caught only after release when users report issues. The full test suite is too slow for post-merge gating.
Goals
- Run a lightweight integration canary after every merge to main.
- Catch daemon startup, memory pipeline, and search regressions within minutes of merge.
- Complete in under 3 minutes (not a full test suite).
Non-goals
- Replacing PR-level unit and integration tests.
- Performance benchmarking (canary checks correctness, not speed).
- Blocking the merge itself (canary runs post-merge).
Proposed approach
Canary script (scripts/canary.ts): A self-contained script
that exercises the critical integration path:
- Build: Run
bun run buildto verify the build succeeds from a clean state. - Daemon startup: Start the daemon on a test port (3851) with a
temporary workspace directory. Verify
/healthreturns 200 within 10 seconds. - Migration integrity: Verify the daemon applied all migrations by checking the migration count via the API or direct SQLite query.
- Remember/recall cycle: POST a test memory to
/api/remember. Search for it via/memory/search. Verify the memory is returned with a relevance score above a threshold. - Cleanup: Stop the daemon, remove the temporary workspace.
CI workflow (.github/workflows/canary.yml): Triggered on push
to main. Runs the canary script. On failure, posts an alert (GitHub
issue labeled canary-failure, or Discord notification).
Failure handling: Canary failures do not revert the merge (too disruptive). Instead, they create a high-priority issue assigned to the last committer. The release train workflow checks for open canary failure issues and blocks release if any exist.
Phases
Phase 1 — Core canary checks
- Implement
scripts/canary.tswith steps 1-5. - Add canary workflow triggered on push to main.
- Add failure alerting via GitHub issue creation.
Phase 2 — Release train integration
- Block release workflow if open canary-failure issues exist.
- Add search quality baseline check (compare recall scores against a known-good threshold).
- Add canary run time tracking to detect performance regressions in the canary itself.
Validation criteria
- A merge that breaks daemon startup triggers a canary failure alert within 5 minutes.
- A successful merge with no regressions completes canary in under 3 minutes.
- The remember/recall cycle test catches search regressions (e.g., missing FTS index).
- Canary failure creates a GitHub issue with the failing step and commit SHA.
Open decisions
- Should the canary run on a self-hosted runner (for consistent performance) or GitHub-hosted runners?