| field | value | date |
|---|---|---|
| author | Craig Jennings <c@cjennings.net> | 2026-06-10 15:22:35 -0500 |
| committer | Craig Jennings <c@cjennings.net> | 2026-06-10 15:22:35 -0500 |
| commit | e0364b862332112b10eafe80cbba8ad079990095 | |
| tree | 140eeb4b8979e9dffed27bd947a40379f6364695 (path: docs/agent-knowledge-base-spec.org) | |
| parent | c6bd31f1650330d911de35e120f707eae8ca2baa | |
docs: finalize agent knowledge-base spec as ready with caveats
I ratified all seven decisions: the org-roam KB is the shared agent substrate, the write boundary is read-shared write-scoped (work never writes), nodes are per-fact, agent writes land freely in the KB only, and harness memory stays as the ephemeral capture layer. The spec moves to docs/agent-knowledge-base-spec.org in spec-create format, superseding the 2026-06-05 draft.
A work-root denylist classifier routes writes: personal projects write, work and unknown projects refuse and report the redacted fact. Implementation is broken into three phases and waits on confirming the denylist contents.
Diffstat (limited to 'docs/agent-knowledge-base-spec.org')
| -rw-r--r-- | docs/agent-knowledge-base-spec.org | 236 |
1 files changed, 236 insertions, 0 deletions
diff --git a/docs/agent-knowledge-base-spec.org b/docs/agent-knowledge-base-spec.org
new file mode 100644
index 0000000..c59c33b
--- /dev/null
+++ b/docs/agent-knowledge-base-spec.org
@@ -0,0 +1,236 @@
+#+TITLE: Agent Knowledge Base on Org-roam — Spec
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-06-10
+
+* Metadata
+| Status | ready with caveats — Codex review incorporated, D7 ratified keep (Craig, 2026-06-10); caveat: confirm work-root denylist contents; implementation awaiting Craig's go |
+| Owner | Craig Jennings |
+| Reviewer | Craig Jennings; Codex (2026-06-10) |
+| Related | [[file:../todo.org][todo.org — "Check that memories are sync'd across machines via git"]] |
+
+This spec supersedes the 2026-06-05 draft (formerly docs/design/2026-06-05-org-roam-knowledge-base-spec.org, removed; content in git history), folding in Craig's 2026-06-10 ratification answers and restructuring to the spec-create format.
+
+* Summary
+
+Agents adopt Craig's existing org-roam knowledge base (=~/sync/org/roam/=, ~490 org files, Syncthing-synced since 2023) as the shared, cross-project store for durable knowledge. Per-project harness memory stays as a fast capture layer; durable facts get promoted into the KB, which already syncs across machines. This replaces two abandoned designs (a dedicated git repo, a two-tier rules split) with a substrate that already exists.
+
+* Problem / Context
+
+Per-project agent memory lives at =~/.claude/projects/<encoded-cwd>/memory/=, a harness-owned path that is unmanaged and unsynced, so it doesn't survive a new-machine setup or dotfiles restore. Anything durable an agent learns is at-risk by default.
+
+Two fixes were built or designed and dropped. A dedicated =claude-memory.git= repo was built then reversed because it pooled work-confidential and personal memory into one store. A two-tier split (general lessons to rules, project memory to each project's =.ai/=) left public-remote code projects' memory at-risk by design.
+
+The simplification: cross-machine sync is already solved for anything living in =~/sync/org/roam/=. The task stops being "build a memory-sync system" and becomes "point agents at the knowledge base that already syncs."
+
+* Goals and Non-Goals
+
+** Goals
+- Durable, cross-machine agent knowledge with no new sync mechanism.
+- Agents query the KB before relying on remembered project facts, prior decisions, or reference material.
+- Agent-written notes are distinguishable from Craig's, and index cleanly in his org-roam.
+- The work/personal confidentiality boundary is explicit and enforced on the write side.
+
+** Non-Goals
+- The rules layer (=claude-rules/=, =CLAUDE.md=) is untouched. The KB replaces the memory tier, not the rules tier.
+- No Emacs/org-roam package integration; agents never touch the SQLite cache.
+- No autonomy expansion. Free agent writes apply to the KB only — email, Linear comments, PRs, and every other public or external channel still require Craig's review and consent (D6).
+
+** Scope tiers
+- v1: the pointer rule, the write schema, the boundary, one verified seed node.
+- Out of scope: migrating historical harness memory wholesale into the KB; auto-promotion.
+- vNext: promotion tooling (a wrap-up prompt or =/promote= pass), KB hygiene reports (orphan =:agent:= nodes).
+
+* Design
+
+The KB is a directory of plain org files with =#+title=, =#+filetags=, =:ID:= property drawers, and =[[id:UUID]]= links. That is the entire interface.
+
+For a *reader* (any agent): query with ripgrep over content and tags; follow a link by grepping for its target =:ID:=. The backlink graph and database are Craig's Emacs conveniences — the files are the agent's interface. Before relying on a remembered project fact, a prior decision, or reference material, search the KB first.
+
+Conflict-file exclusion is part of the command contract, not a prose reminder — the KB carries dozens of Syncthing =*.sync-conflict-*= files (63 at last count) whose contents are stale or duplicated. The canonical commands:
+
+#+begin_src sh
+# content/tag search
+rg --glob '*.org' --glob '!*sync-conflict*' '<query>' ~/sync/org/roam/
+# follow an [[id:UUID]] link to its node
+rg --glob '*.org' --glob '!*sync-conflict*' ':ID:[[:space:]]+<uuid>' ~/sync/org/roam/
+#+end_src
+
+For a *writer* (personal-project agents only, per D5): create one node per fact, following roam conventions so the next =org-roam-db-sync= indexes it:
+
+#+begin_example
+:PROPERTIES:
+:ID: <generated uuid>
+:END:
+#+title: <concise title>
+#+filetags: :agent:<scope>:
+
+<the fact, with [[id:...]] links to related nodes>
+#+end_example
+
+Filename follows roam's timestamp-prefix convention (=YYYYMMDDHHMMSS-slug.org=). The =:agent:= filetag makes =rg '#\+filetags:.*:agent:'= a clean inventory of everything agents wrote, so Craig can review or prune at will.
+
+** Project classification and write routing (v1)
+
+D5's boundary needs an executable answer to "is this project allowed to write?" — inference from cwd names, remotes, or task content is too much discretion for a confidentiality boundary. The v1 source of truth is an explicit *work-root denylist* carried in =knowledge-base.md= (initially =~/projects/work=; contents confirmed with Craig before the rule ships). Classification:
+
+- *Work* — the project root is, or sits under, a denylisted work root. No KB write, ever. The agent records durable facts per that project's own conventions (work already keeps its knowledge in its project tree); v1 adds no new work-side store.
+- *Personal* — the project root sits under a known project parent (=~/code/=, =~/projects/=, =~/.emacs.d=) and is not denylisted. KB writes allowed per D6.
+- *Unknown* — anything else. No KB write. Refuse and report.
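Editorial sketch (not part of the commit): the three-way classification above could be expressed roughly as follows. This is a minimal illustration under stated assumptions, not the rule the spec ships. The names `classify`, `may_write_kb`, `WORK_ROOTS`, and `PERSONAL_PARENTS` are hypothetical; the paths are the ones the spec names, and the denylist contents are still unconfirmed. Treating a project parent itself (for example `~/.emacs.d`) as personal is also an assumption.

```python
from pathlib import Path

HOME = Path.home()

# Paths named in the spec; the real denylist is still to be confirmed with Craig.
WORK_ROOTS = (HOME / "projects" / "work",)
PERSONAL_PARENTS = (HOME / "code", HOME / "projects", HOME / ".emacs.d")


def _at_or_under(path, root):
    """True when path is root itself or sits anywhere below it."""
    return path == root or root in path.parents


def classify(project_root, work_roots=WORK_ROOTS, personal_parents=PERSONAL_PARENTS):
    """Return 'work', 'personal', or 'unknown' for a project root."""
    root = Path(project_root).expanduser().resolve()
    # The denylist is checked first, so ~/projects/work wins over ~/projects.
    if any(_at_or_under(root, Path(w).resolve()) for w in work_roots):
        return "work"
    if any(_at_or_under(root, Path(p).resolve()) for p in personal_parents):
        return "personal"
    # Safe default: anything outside the known parents refuses the write.
    return "unknown"


def may_write_kb(project_root):
    """Only personal projects may write to the KB (D5)."""
    return classify(project_root) == "personal"
```

Comparing whole path components rather than string prefixes keeps a sibling such as `~/projects/workshop` from matching the `~/projects/work` denylist entry.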
+
+The refusal message contract (work and unknown alike): state the classification, name the durable fact in a one-line redacted summary, and say where it was or wasn't written — so Craig can re-route it deliberately instead of losing it silently.
+
+** Harness memory: capture, then promote
+
+Harness memory keeps its current role but is redefined as an ephemeral working set: fast, automatic, per-project, relevance-recalled, and allowed to be at-risk because nothing durable depends on it surviving. Durable or cross-machine-valuable facts get promoted into the KB as a deliberate step (wrap-up, a task audit, or an explicit prompt) — the same capture-on-landing / promote-on-review cadence the pattern catalog uses.
+
+A new =claude-rules/knowledge-base.md= rule (auto-installs via the Makefile RULES glob, like =patterns.md= — no Makefile change expected) is the bridge: it carries the KB path, the query commands, the write schema, the classification denylist, and the D5/D6 boundary with its refusal contract.
+
+* Alternatives Considered
+
+** Dedicated private git repo (=claude-memory.git=)
+- Good, because git gives history, review, and a deliberate sync step.
+- Bad, because it pooled work-confidential and personal memory into one all-machines store — the reason it was built and then reversed (2026-05-23/24).
+- Bad, because it added a new clone + symlink mechanism every machine must maintain.
+
+** Two-tier split (rules file + per-project =.ai/memory/=)
+- Good, because general lessons would load natively into every session via the rules layer.
+- Bad, because project memory in gitignored-=.ai/= projects stays at-risk by design — it solved sync only where =.ai/= was tracked.
+- Neutral, because the promote-general-lessons instinct survives in this design as KB promotion.
+
+** Org-roam KB (chosen)
+- Good, because sync already works (Syncthing, since 2023) — zero new infrastructure.
+- Good, because the KB is already Craig's curated knowledge home; agent knowledge lands where he actually looks.
+- Bad, because Syncthing has no review gate (accepted in D6) and no history — a bad write propagates immediately.
+- Neutral, because agents read it as plain files; no org-roam tooling required or used.
+
+* Decisions
+
+** D1 — The KB is a queried substrate, accessed as files
+- State: accepted (Craig, 2026-06-10)
+- Context: an agent is a harness process, not an Emacs session; it cannot call org-roam's Elisp API or read its SQLite cache.
+- Decision: We will treat =~/sync/org/roam/= as a directory of plain org files — ripgrep for search, grep-for-=:ID:= to follow links.
+- Consequences: easier — works in every runtime, no Emacs dependency; harder — no backlink graph or db-backed queries for agents (acceptable: ~490 tagged, linked text files grep well).
+
+** D2 — Capture in harness memory, promote into the KB
+- State: accepted (Craig, 2026-06-10)
+- Context: harness memory is fast and auto-recalled but unsynced; the KB is durable but query-only.
+- Decision: We will keep both with distinct roles — harness memory captures, the KB holds what's promoted. Promotion is deliberate (wrap-up, task audit, or explicit prompt), never automatic.
+- Consequences: easier — the at-risk problem dissolves (what stays unsynced is by definition the regenerable hot set); harder — promotion is a discipline that has to actually happen, or value silts up in the capture layer. D7 (resolved: keep) confirms the capture layer stays, so this decision stands as written.
+
+** D3 — Surfacing via a pointer rule
+- State: accepted (Craig, 2026-06-10)
+- Context: agents need to know the KB exists, where it lives, and how to use it — in every project, on every machine.
+- Decision: We will ship =claude-rules/knowledge-base.md= carrying path, query method, write schema, and boundary. It auto-installs via the existing Makefile RULES glob.
+- Consequences: easier — one rule, machine-wide, same mechanism as =patterns.md=; harder — nothing material.
+
+** D4 — Write schema: roam-valid, =:agent:=-tagged, one node per fact
+- State: accepted (Craig, 2026-06-10: "per fact")
+- Context: agent writes must index cleanly in Craig's org-roam and stay distinguishable from his hand-authored notes. Granularity was open: per-fact nodes vs a per-project appended notes file.
+- Decision: We will write one roam-valid node per fact (=:ID:= drawer, =#+title=, =#+filetags: :agent:<scope>:=, timestamp-prefixed filename), linking related nodes by =[[id:]]=.
+- Consequences: easier — roam-native, linkable, =rg :agent:= inventories everything agents wrote; harder — more files (accepted; that's what roam is).
+
+** D5 — Write boundary: read-shared, write-scoped (option C)
+- State: accepted (Craig, 2026-06-10: "Your recommendation C is the right one.")
+- Context: the KB is personal and replicates to every Syncthing machine — Craig confirmed that includes a work machine. The leak risk is asymmetric and lives on the write side: a work agent writing confidential facts would pool them into the personal store.
+- Decision: We will let any project read the shared KB; only personal projects write to it. Work agents write to work's own project tree, never the shared KB. Craig confirmed C handles the work-machine replication acceptably.
+- Consequences: easier — reading value lands everywhere, confidential work data stays physically out of the KB; harder — work knowledge has no shared home (status quo, unchanged), and the inverse risk (personal facts surfacing in work artifacts) remains governed by the existing content-scope rules in =commits.md=.
+
+** D6 — Agent writes land freely in the KB, and only there
+- State: accepted (Craig, 2026-06-10)
+- Context: Syncthing has no git-style review gate; the alternative was a staging tag (=:agent:inbox:=) Craig promotes from.
+- Decision: We will let agent writes land freely in the KB without a review gate. This autonomy is scoped to the KB alone — it is not permission to send email, comment on Linear tickets, or post to any public or external channel; those still require Craig's review and consent.
+- Consequences: easier — no promotion queue to tend, knowledge lands immediately; harder — a bad write syncs everywhere before anyone reviews it (mitigated by the =:agent:= inventory and Craig's normal roam curation).
+
+** D7 — Harness memory stays as the capture layer
+- State: accepted (Craig, 2026-06-10: "keep")
+- Context: with the KB live, harness memory (=~/.claude/projects/<enc>/memory/= — the per-project store the harness auto-loads into context at session start) could either stay or retire. *Keep* preserves automatic relevance recall at zero query cost, at the price of two stores plus the promotion habit. *Retire* would mean one store and no promotion step, but recall stops being automatic and session start gets heavier.
+- Decision: We will keep harness memory as the ephemeral capture layer. D2 stands as written, and Phase 3's promotion cadence is required, not optional — it's what keeps the capture layer from silting up.
+- Consequences: easier — automatic recall keeps working, no harness behavior changes; harder — two stores and a promotion discipline (mitigated by Phase 3's mechanical wrap-up trigger).
+
+* Implementation phases
+
+Not started — Craig has explicitly held implementation pending his go-ahead.
+
+** Phase 1 — Pointer rule
+Confirm the work-root denylist contents with Craig, then write =claude-rules/knowledge-base.md=: path, the canonical query commands (conflict-file exclusion included), the D4 schema, the classification + write-routing rules, the refusal contract, and the D5/D6 boundary. =make install= links it machine-wide via the existing RULES glob — no Makefile change. Tree stays working throughout (pure addition).
+
+** Phase 2 — Seed node + index verification
+Craig supplies or approves the durable fact; the implementer writes exactly one node under =~/sync/org/roam/= per the schema (a genuine durable fact, not a test stub). Craig runs =org-roam-db-sync= and confirms it indexes and displays cleanly. Rollback if the schema fails: delete that one timestamped =:agent:= file. This validates the schema end-to-end before agents write at volume.
+
+** Phase 3 — Promotion cadence
+Wire the promotion prompt into the wrap-up workflow (a "anything worth promoting to the KB?" check), and note the cadence in =knowledge-base.md=. Resolves D2's discipline risk with a mechanical trigger.
+
+* Acceptance criteria
+
+- [ ] =claude-rules/knowledge-base.md= exists with path, query method, write schema, and the ratified D5/D6 boundary.
+- [ ] An agent in a personal project can find a relevant prior note by querying the KB.
+- [ ] An agent-written node indexes cleanly in Craig's org-roam on the next =org-roam-db-sync= and is identifiable via the =:agent:= filetag.
+- [ ] The capture/promote split and its trigger are documented.
+- [ ] A work-project agent, asked to store a durable fact, writes it to work's own tree, not the KB.
+- [ ] An unknown-classification project, asked to store a durable fact, refuses the KB write and reports the redacted fact per the refusal contract rather than guessing.
+- [ ] The documented query commands find a known note and exclude =*.sync-conflict-*= files.
+
+* Readiness dimensions
+
+- Data model & ownership: KB nodes are user-curated (Craig) or agent-authored (=:agent:= tag); agents never edit Craig's hand-authored nodes, only link to them. Harness memory stays agent-owned and ephemeral.
+- Errors, empty states & failure: a missing =~/sync/org/roam/= (machine without Syncthing) means the rule's query step finds nothing — agents proceed without the KB and say so rather than fabricate recall. No write occurs to a nonexistent path.
+- Security & privacy: D5/D6 are the whole story — work never writes; KB writes never extend to public channels. No credentials live in the KB.
+- Observability: =rg '#\+filetags:.*:agent:'= inventories all agent writes; roam's UI shows them tagged in Craig's normal browsing.
+- Performance & scale: 484 files today; ripgrep over a few thousand org files is milliseconds. N/A as a concern.
+- Reuse & lost opportunities: maximal — the entire design is reusing an existing synced, curated store instead of building one.
+- Architecture fit & weak points: mirrors the patterns.md pointer-rule shape; the existing Makefile RULES glob installs the new rule with no Makefile change. Weak point is Syncthing conflict files (=*.sync-conflict-*=, 63 today) — excluded by the canonical query commands, not left to prose.
+- Config surface: one path constant in =knowledge-base.md=. No knobs.
+- Documentation plan: =knowledge-base.md= is the documentation; the spec records the why.
+- Dev tooling: N/A because the interface is ripgrep and Write — no build, no tests beyond Phase 2's manual index check.
+- Rollout, compatibility & rollback: pure addition; rollback is deleting the rule file and (optionally) the =:agent:=-tagged nodes, which the tag makes a one-command sweep.
+- External APIs & deps: none — plain files. Verified: =~/sync/org/roam/= exists with ~490 org files plus 63 conflict files, Syncthing-synced (2026-06-05 ground-truth check; recount in the 2026-06-10 Codex review).
+
+* Risks, Rabbit Holes, and Drawbacks
+
+- Un-reviewed writes propagate instantly (D6 accepted this). Dodge: the =:agent:= inventory keeps cleanup cheap.
+- Promotion discipline may not stick (D2). Dodge: Phase 3 makes it a mechanical wrap-up step rather than a memory burden.
+- Syncthing conflict files could confuse queries. Dodge: exclusion is baked into the canonical commands.
+- An incomplete work-root denylist would let a work project classify as personal. Dodge: Phase 1 starts by confirming the denylist with Craig, and the classification's safe default (unknown → refuse) covers anything outside the known parents.
+
+* Testing / Verification
+
+From the 2026-06-10 review, the verification surface for v1:
+
+- =make install= links =knowledge-base.md= into =~/.claude/rules/=.
+- In a personal repo, the documented =rg= command finds a known note.
+- In a work repo, a durable-storage request produces no write under =~/sync/org/roam/= and the refusal report names the fact.
+- In an unknown project, the agent refuses or asks rather than guessing.
+- One approved seed node indexes via =org-roam-db-sync= and appears in the =rg '#\+filetags:.*:agent:'= inventory.
+- A =*.sync-conflict-*= file containing a unique token is excluded by the documented query.
+
+The first, second, and last checks are agent-runnable; the org-roam display check and the work/unknown behavioral checks are Craig's manual validation (tracked in todo.org).
+
+* Review dispositions
+
+Modified recommendations from the 2026-06-10 Codex review, with reasons. Everything else was accepted as written.
+
+- *Drop-in =[#B]= implementation tasks as standalone top-level TODOs* — modified: the phase tasks hang as children under the existing parent task ("Check that memories are sync'd across machines via git"), per the one-parent-owns-the-effort convention in the response workflow. Content carried over intact.
+- *Update the fact count to the exact recursive number* — modified: the spec now says ~490 (Codex's own alternative), so routine KB growth doesn't churn the spec.
+- *Define the exact work-side write destination* — modified within the review's own options: v1 adds no new work-side store. Work projects keep their existing project-tree conventions, and the KB rule's only work-side behavior is the refusal + report.
+
+* Review and iteration history
+
+** 2026-06-05 Fri @ 05:57:35 -0500 — Claude (rulesets session) — author
+- What: initial one-page draft (five decisions, mechanics recommended), after Craig redirected the memory-sync task onto the existing org-roam KB.
+- Why: the dedicated-repo and two-tier designs both failed on the work/personal boundary or left memory at-risk; the KB already syncs.
+- Artifacts: original draft at docs/design/2026-06-05-org-roam-knowledge-base-spec.org (superseded by this file; content in git history).
+
+** 2026-06-10 Wed @ 14:29:20 -0500 — Craig Jennings (cj annotations) + Claude — author revision
+- What: folded in Craig's ratification answers (D5 = option C; Syncthing does replicate to a work machine and C stands; per-fact node granularity; free KB-only writes with the explicit no-public-channels boundary) and rewrote into the spec-create format. D7 (harness memory's fate) held open with the fuller explanation Craig requested.
+- Why: all but one decision ratified; the 2026-06-05 draft predated the spec-create template.
+- Artifacts: this file; implementation explicitly deferred pending Craig's go-ahead.
+
+** 2026-06-10 Wed @ 14:35:40 -0500 — Codex — reviewer
+- What changed or was recommended: reviewed implementation readiness and wrote a blocking review. The main blockers are unresolved D7 and the missing executable personal/work/unknown write-boundary classifier; medium notes cover concrete =rg= commands for conflict-file exclusion and seed-node approval/rollback mechanics.
+- Why: implementation would otherwise force the agent to invent memory architecture and confidentiality-boundary behavior at write time.
+- Artifacts: docs/agent-knowledge-base-spec-review.org (deleted on disposition completion per the response workflow; content summarized here and in Review dispositions).
+
+** 2026-06-10 Wed @ 14:39:41 -0500 — Claude Code (rulesets) — responder
+- What: processed the Codex review with Craig's D7 ratification ("keep") as a pre-agreed input. Both blockers cleared: D7 accepted (harness memory stays the capture layer, Phase 3 mandatory) and a new "Project classification and write routing" design subsection (work-root denylist as source of truth, unknown → refuse, refusal message contract, no new work-side store). Mediums accepted: canonical =rg= commands with conflict-file exclusion baked in, Phase 2 approval/rollback mechanics, Makefile no-change note, ~490 fact count, Testing/Verification section. Three recommendations modified (see Review dispositions); none rejected.
+- Why: converge to implementation-ready. Rubric: ready with caveats — the one caveat is confirming the work-root denylist contents with Craig before Phase 1 ships the rule.
+- Artifacts: this file; implementation-task breakdown under the parent task in todo.org; review file deleted.
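Editorial sketch (not part of the commit): the D4 write schema in the spec's Design section (one roam-valid node per fact, with an `:ID:` drawer, `#+title`, an `:agent:<scope>:` filetag, and a timestamp-prefixed filename) could be rendered along these lines. The helper names `slugify` and `render_node` are hypothetical, and the slug rule (lowercase, runs of non-alphanumerics collapsed to hyphens) is an assumption; the spec only fixes the `YYYYMMDDHHMMSS-slug.org` shape. The function only builds the filename and text. Whether anything gets written under `~/sync/org/roam/` is a separate decision governed by the classification and D5/D6.

```python
import re
import uuid
from datetime import datetime


def slugify(title):
    """Lowercase the title and collapse non-alphanumeric runs to hyphens (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def render_node(title, scope, body, now=None, node_id=None):
    """Return (filename, org_text) for one agent-authored node; writes nothing."""
    now = now or datetime.now()
    node_id = node_id or str(uuid.uuid4())
    filename = f"{now:%Y%m%d%H%M%S}-{slugify(title)}.org"
    text = (
        ":PROPERTIES:\n"
        f":ID: {node_id}\n"
        ":END:\n"
        f"#+title: {title}\n"
        f"#+filetags: :agent:{scope}:\n"
        "\n"
        f"{body}\n"
    )
    return filename, text
```

Passing `now` and `node_id` explicitly keeps the output deterministic, which would make a Phase 2 style check of the seed node easier to repeat.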
