Diffstat (limited to 'docs')
-rw-r--r-- docs/design/2026-06-05-org-roam-knowledge-base-spec.org | 84
1 files changed, 84 insertions, 0 deletions
diff --git a/docs/design/2026-06-05-org-roam-knowledge-base-spec.org b/docs/design/2026-06-05-org-roam-knowledge-base-spec.org
new file mode 100644
index 0000000..5799fc1
--- /dev/null
+++ b/docs/design/2026-06-05-org-roam-knowledge-base-spec.org
@@ -0,0 +1,84 @@
+#+TITLE: Org-roam as the shared agent knowledge substrate — Spec
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-06-05
+
+One-page spec for adopting the existing org-roam knowledge base as the shared, cross-project knowledge store agents read from and write to. Drafted for Craig's review. The five decisions below each carry a recommended call, marked DECISION. DECISION 5 (the work/personal write boundary) is the one that needs Craig's ratification rather than a default; the others are mechanics.
+
+* Problem
+
+Per-project agent memory lives at =~/.claude/projects/<encoded-cwd>/memory/=, a harness-owned path that is unmanaged and unsynced, so it doesn't survive a new-machine setup. Two earlier approaches were explored and dropped. The first was a dedicated =claude-memory.git= repo, built and then reversed because it pooled work-confidential and personal memory into one repo. The second was a two-tier split (general lessons to rules, project memory to each project's =.ai/=), which by design left memory at risk for code projects with public remotes.
+
+The simplification: =~/sync/org/roam/= already exists, holding 484 org files curated since 2023 and synced across machines by Syncthing. Cross-machine sync is already solved for anything living there. So the task stops being "build a memory-sync system" and becomes "point agents at the knowledge base that already syncs."
+
+* Current state
+
+Three distinct layers, today:
+
+- *Rules* (=claude-rules/*.md= symlinked into =~/.claude/rules/=, plus per-project =CLAUDE.md=). Always-on instructions, loaded every session. Synced via the rulesets clone.
+- *Harness memory* (=~/.claude/projects/<enc>/memory/=). Small, per-project, auto-recalled by the harness into system reminders. Unsynced, so at risk.
+- *Org-roam KB* (=~/sync/org/roam/=). Large, human-curated, Syncthing-synced. Not currently agent-readable or agent-writable by convention.
+
+This spec leaves the rules layer untouched. It connects the KB to agents and redefines harness memory's role.
+
+* Design
+
+** DECISION 1 — The KB is a queried substrate, accessed as files, not via the org-roam package
+
+An agent is a harness process, not an Emacs session. It cannot call org-roam's Elisp API, and it should not depend on the SQLite cache that org-roam rebuilds from the files. It treats =~/sync/org/roam/= as what it physically is: a directory of plain org files with =#+title=, =#+filetags=, an =:ID:= property drawer, and =[[id:UUID][desc]]= links. The agent searches with ripgrep over content and tags, and follows a link by grepping for its target =:ID:=. A set of 484 tagged, linked text files is a strong agent substrate. The backlink graph and db are Craig's Emacs convenience; the files are the agent's interface.
+
+** DECISION 2 — Capture in harness memory, promote into the KB
+
+Harness memory and the KB are both kept, with distinct roles that mirror the pattern-catalog's capture-on-landing / promote-on-review cadence:
+
+- *Harness memory = capture.* Fast, automatic, per-project, relevance-recalled. Treated as an ephemeral working set: regenerable, allowed to be at risk, because nothing durable depends on it surviving.
+- *Org-roam KB = promote.* Durable or cross-machine-valuable facts get written into the KB, where Syncthing carries them to every machine and Craig curates them.
+
+This resolves the original at-risk problem without a new sync mechanism: the valuable knowledge lives in the synced KB; what stays in unsynced harness memory is by definition the low-value, regenerable hot set. Promotion is a deliberate step (a wrap-up or task-audit pass, or an explicit prompt), not automatic.
+
+** DECISION 3 — Surfacing: a pointer rule
+
+A new =claude-rules/knowledge-base.md= rule (auto-installs via the Makefile RULES glob, same as =patterns.md=) tells the agent: the KB lives at =~/sync/org/roam/=; query it with ripgrep before relying on a remembered project fact, a prior decision, or reference material; follow =[[id:]]= links by grepping the ID; and write durable facts back per the schema in DECISION 4, honoring the scope rule in DECISION 5. The rule is the bridge; it carries the path, the query method, the write schema, and the boundary.
+
+** DECISION 4 — Write schema, so the KB stays trustworthy
+
+Agent-written nodes follow org-roam conventions so Craig's Emacs indexes them on the next =org-roam-db-sync=, and carry a marker so agent notes stay distinguishable from Craig's hand-authored ones:
+
+#+begin_example
+:PROPERTIES:
+:ID: <generated uuid>
+:END:
+#+title: <concise title>
+#+filetags: :agent:<scope>:
+
+<the fact, with [[id:...]] links to related nodes>
+#+end_example
+
+Filename follows roam's timestamp-prefix convention (=YYYYMMDDHHMMSS-slug.org=). The =:agent:= filetag makes =rg '#\+filetags:.*:agent:'= a clean inventory of what agents wrote, so Craig can review or prune. Node granularity (one node per fact vs a per-project agent-notes node appended to) is an open question below.
+
+** DECISION 5 — The work/personal write boundary (needs Craig's ratification)
+
+This is the decision that sank the dedicated-repo design and it returns here. =~/sync/org/roam/= is Craig's personal KB on his personal machines. The risk is asymmetric and lives on the write side: if a work (DeepSat) agent writes into it, confidential work facts pool into a personal all-machines store. Reading is lower-risk and already governed by the content-scope rules in =commits.md= (personal facts must not surface in team artifacts).
+
+Three options:
+
+- *A — Work walled off.* The shared KB is personal-only. Work agents neither read nor write it; work keeps its knowledge in its own project tree. Zero leak risk, but work gains nothing from the KB.
+- *B — One KB, tag-scoped.* Everything in the KB, every node tagged =:work:= / =:personal:= / =:general:=. Agents recall only current-scope plus =:general:=, and a rule forbids cross-scope surfacing. Maximal sharing, but confidential work data sits one tag-mistake away from the wrong machine or artifact.
+- *C — Read-shared, write-scoped (recommended default).* Any project may read the shared KB. Personal projects write to it; work writes only to its own store, never the shared KB. Captures the reading value everywhere while keeping confidential work data physically out of the personal Syncthing KB. The inverse risk (personal facts in work output) stays governed by existing content-scope rules.
+
+Recommended: C, pending Craig's read on two facts only he has — how sensitive the work memory is, and whether Syncthing even replicates =~/sync/= to any work machine (if it doesn't, a work-machine agent can't read the KB regardless, which pushes toward A for work). If work confidentiality is strict, fall back to A.
+
+* Acceptance
+
+- =claude-rules/knowledge-base.md= exists: KB path, when-to-query, how-to-query (ripgrep + ID-follow), the write schema, and the ratified scope rule from DECISION 5.
+- The write schema is documented and produces nodes Craig's =org-roam-db-sync= indexes cleanly.
+- An agent in a personal project can find a relevant prior note by querying the KB, and add a new node that shows up in Craig's org-roam after the next =org-roam-db-sync=.
+- The harness-memory-as-capture / KB-as-promote split is documented, with the promotion trigger named.
+- The work/personal boundary is decided and encoded in the rule.
+
+* Open questions for Craig
+
+- *DECISION 5 itself* — A, B, or C. The central call.
+- *Syncthing topology* — does =~/sync/= replicate to any work machine? Bounds DECISION 5.
+- *Node granularity* — one node per fact (roam-native, linkable, more files) vs a per-project agent-notes node appended to (less sprawl, less linkable). Spec leans per-fact nodes tagged =:agent:=.
+- *Harness memory's fate* — keep the thin auto-recalled hot set (spec's lean, DECISION 2), or retire it entirely and have the agent query the KB at session start instead.
+- *Write review* — Syncthing has no git-style review gate. Do agent writes land freely, or do they get a lightweight review (e.g. new nodes tagged =:agent:inbox:= until Craig promotes them)? Relates to trust in DECISION 4.