| author | Craig Jennings <c@cjennings.net> | 2026-05-31 14:49:39 -0500 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2026-05-31 14:49:39 -0500 |
| commit | 93da00860bbc5a61663845d2785be20756983341 (patch) | |
| tree | 72b2f22d2024bafd71d960b5d0598f13a1c72587 /docs | |
| parent | 8d8a9b8ec79ec2252b098713283884aeae80038e (diff) | |
docs(design): file org-roam shared-KB brainstorm
Diffstat (limited to 'docs')
| -rw-r--r-- | docs/design/ai-kb-shared-roam-brainstorm.org | 420 |
1 files changed, 420 insertions, 0 deletions
diff --git a/docs/design/ai-kb-shared-roam-brainstorm.org b/docs/design/ai-kb-shared-roam-brainstorm.org
new file mode 100644
index 00000000..e42e2b00
--- /dev/null
+++ b/docs/design/ai-kb-shared-roam-brainstorm.org
@@ -0,0 +1,420 @@
+#+TITLE: Brainstorm: org-roam as shared human + agent knowledge base
+#+DATE: 2026-05-28
+#+SOURCE: rulesets discussion with Codex
+#+TARGET: .emacs.d / ai-kb / org-roam agent tooling
+
+* Prompt
+
+If the rulesets agents had full access to org-roam, how would they best be able
+to use it? Would it be a good idea? Consider not only project-related
+documents, but also using org-roam as a shared knowledge base for Craig and the
+agents.
+
+* Bottom line
+
+Yes, this is a good idea, but only if org-roam is treated as a curated shared
+memory system rather than a dumping ground, a transcript archive, or a second
+task tracker.
+
+The best shape is:
+
+- org-roam holds durable linked knowledge;
+- =.ai/= holds operational/session workflow state;
+- =todo.org= holds current work;
+- =docs/= holds formal project documentation;
+- agents access org-roam through a small structured tool layer;
+- human and agent writes use the same safety boundary;
+- curation is periodic and human-gated.
+
+This is close to the current =ai-kb= direction in =.emacs.d=: a private
+git-backed org-roam store with a shared write protocol, summaries, provenance,
+lint, index generation, and Emacs browsing.
+
+* What org-roam should be for agents
+
+** Durable cross-project memory
+
+Org-roam is strongest as the place for knowledge that should follow the agent
+across projects:
+
+- Craig's preferences;
+- recurring procedures;
+- engineering lessons;
+- architecture principles;
+- known gotchas;
+- durable decisions;
+- reusable workflows;
+- project relationships;
+- tool conventions;
+- "we tried X, rejected it because Y";
+- notes that later agents should discover by relationship, not just by keyword.
+
+Why this helps:
+
+- Agents stop rediscovering context.
+- Preferences become durable and linkable.
+- Lessons learned in one project can influence another project.
+- The graph structure lets an agent discover consequences and related decisions
+  rather than reading a flat memory file top to bottom.
+
+** Project memory with links
+
+For project-related documents, org-roam should not replace the current project
+surfaces. It should complement them.
+
+Suggested split:
+
+| Surface | Purpose |
+|---------+---------|
+| =.ai/session-context.org= | current session facts, live recovery, wrap-up archive |
+| =todo.org= | current tasks and commitments |
+| =docs/= | formal project docs, specs, architecture notes |
+| org-roam / ai-kb | durable concepts, decisions, procedures, lessons, relationships |
+
+Example:
+
+- A session discovers "startup must pull rulesets before project repos." That
+  might be logged in =.ai/session-context.org= today.
+- If it becomes a durable rule or reusable lesson, the agent writes an org-roam
+  node linking to related startup, sync, and failure-recovery nodes.
+- A future project can point to that node by ID instead of duplicating the rule.
+
+** Shared human + agent knowledge base
+
+The best version is not "agent memory" separate from Craig's tools. It is a
+shared knowledge base:
+
+- Craig can browse and edit with org-roam, backlinks, graph, node-find, and
+  normal Emacs affordances.
+- Agents can query, show, link, remember, and curate through structured tools.
+- Both sides see the same source of truth.
+- Every write has provenance so later readers know whether a fact was
+  user-stated, observed, inferred, or externally sourced.
+
+This is the main value over a flat =MEMORY.md=.
+
+* How agents should use it
+
+** Query before acting
+
+Agents should query org-roam before:
+
+- choosing a convention;
+- making an architectural recommendation;
+- writing a new workflow;
+- changing rulesets behavior;
+- touching personal preferences;
+- solving a problem that looks familiar;
+- giving advice where Craig's past decisions may matter;
+- contradicting an existing decision;
+- starting a multi-step procedure that may already exist.
+
+They should not load the whole graph. The retrieval path should be:
+
+1. read the tiny adapter rule;
+2. query the generated index / CLI;
+3. inspect summaries;
+4. open only the relevant nodes;
+5. follow backlinks only when they look useful.
+
+Why this helps:
+
+- Keeps token usage low.
+- Gives agents the right memory at the right time.
+- Avoids start-of-session context floods.
+
+** Read backlinks as context
+
+Backlinks are where org-roam gives agents more than a search index.
+
+When an agent reads a node, it should be able to ask:
+
+- What decisions depend on this?
+- What procedures reference this?
+- Which projects are affected?
+- Is this preference superseded?
+- What gotchas are related?
+- What unresolved contradictions exist?
+
+This supports reasoning by graph neighborhood rather than by raw text search.
+
+** Write only durable, general knowledge
+
+Agents should write unprompted only when the knowledge is:
+
+- durable: useful beyond this session;
+- general enough: useful across projects or likely to recur;
+- not re-derivable cheaply from code/git/docs;
+- not a secret;
+- not routine status.
+
+Good writes:
+
+- "Craig prefers no popup choice menus; present numbered options inline."
+- "When moving an org subtree to roam, write and verify the target before
+  cutting the source."
+- "Rulesets install artifacts are symlinks globally but copied per-project for
+  language bundles."
+- "Use ID-first pointers to ai-kb nodes because titles and filenames can
+  change."
+
+Bad writes:
+
+- today's status;
+- every session summary;
+- task lists;
+- raw chat transcripts;
+- temporary debugging observations;
+- guesses with no provenance;
+- secrets or credentials.
+
+** Use contradiction handling, not silent overwrite
+
+If a new observation conflicts with an existing node, agents should not silently
+replace the old node.
+
+Better flow:
+
+1. mark the new claim and old claim as contested, or create a contested note;
+2. explain the contradiction;
+3. ask Craig whether to update, scope as an exception, supersede, or reject.
+
+Example:
+
+Existing memory says "no popup menus." A new workflow proposes popup choice
+menus. The agent should surface the conflict and ask whether this is an
+exception or a rejected design.
+
+Why this helps:
+
+- Prevents model drift from rewriting preferences.
+- Makes changing a durable preference explicit.
+- Keeps the KB trustworthy.
+
+* Tool layer agents should get
+
+Agents should not use raw unrestricted file access as their primary interface.
+They should get a compact API over the org-roam store.
+
+Suggested tools / CLI commands:
+
+- =ai-kb query <context>=: ranked search over index, titles, tags, summaries,
+  properties, and body.
+- =ai-kb show <id-or-title>=: resolve ID first and print/open the node.
+- =ai-kb backlinks <id>=: list nodes linking to a node, excluding generated
+  index and raw captures.
+- =ai-kb remember=: write using the full protocol.
+- =ai-kb lint=: structural and semantic validation.
+- =ai-kb index=: regenerate the index.
+- =ai-kb status=: fast state for dashboard/startup.
+- =ai-kb doctor=: deeper health check.
+- =ai-kb curate --dry-run=: report duplicates, orphans, contested nodes, stale
+  nodes, raw bloat.
+
+Why this helps:
+
+- Agents compose predictable operations.
+- Humans can test behavior.
+- Token usage drops because agents can request structured summaries instead of
+  reading many org files.
+- Safety gates live in one place.
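The =show= and =backlinks= commands listed above could behave roughly as follows. This is a minimal sketch only, assuming an in-memory list of nodes stands in for the parsed org files; the `Node` type, its `kind` values, and both function names are hypothetical and are not taken from any existing ai-kb code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical in-memory view of one KB node."""
    id: str
    title: str
    kind: str = "node"  # assumed values: "node", "index", "raw"
    links: List[str] = field(default_factory=list)  # IDs this node links to


def show(nodes: List[Node], id_or_title: str) -> Optional[Node]:
    """Resolve by ID first; fall back to a case-insensitive title match."""
    for node in nodes:
        if node.id == id_or_title:
            return node
    wanted = id_or_title.casefold()
    for node in nodes:
        if node.title.casefold() == wanted:
            return node
    return None


def backlinks(nodes: List[Node], target_id: str) -> List[Node]:
    """Nodes linking to target_id, skipping generated index and raw captures."""
    return [n for n in nodes if target_id in n.links and n.kind == "node"]
```

Checking IDs before titles means a renamed node still resolves, which is the reason the brainstorm gives for ID-first pointers.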
+
+* Required node shape
+
+Every shared KB node should have enough structure for retrieval, trust, and
+maintenance.
+
+Suggested required properties:
+
+#+begin_src org
+:PROPERTIES:
+:ID: <uuid>
+:PROJECTS: :general: :rulesets:
+:CREATED: 2026-05-28
+:UPDATED: 2026-05-28
+:CREATED_BY: codex
+:CONFIDENCE: user-stated
+:VISIBILITY: personal
+:SOURCE: chat 2026-05-28
+:STATUS: current
+:SUMMARY: One sentence written for retrieval and index display.
+:END:
+#+title: Concise node title
+#+filetags: :principle:preference:
+#+end_src
+
+Important conventions:
+
+- =:ID:= is the durable identity. Titles and filenames may change.
+- =:SUMMARY:= is required because query/index should not infer it.
+- =:CREATED_BY:= and =:CONFIDENCE:= separate user-stated knowledge from model
+  inference.
+- =:STATUS:= supports =current=, =contested=, =superseded=.
+- =:VISIBILITY:= keeps privacy boundaries visible.
+- relation labels in body links can express =SUPERSEDES=, =CONTRADICTS=,
+  =RELATES_TO=, =IMPLEMENTS=, =DERIVED_FROM=.
+
+* Human and agent writes need one safety boundary
+
+If both Craig and agents edit the KB, there should be exactly one write path.
+
+For agents:
+
+1. fetch / fast-forward if safe;
+2. write;
+3. regenerate index;
+4. run full lint;
+5. scan for secrets;
+6. commit locally;
+7. push later or via timer;
+8. surface push failures.
+
+For human Emacs edits:
+
+- an =ai-kb= minor mode should run the same post-save sequence;
+- save should not be blocked or made read-only;
+- lint failure should leave the buffer editable, avoid committing, and surface
+  findings in a buffer/modeline/dashboard;
+- a clean re-save commits.
+
+Why this helps:
+
+- Human edits cannot bypass the integrity model.
+- Agent writes cannot introduce malformed nodes silently.
+- The git history becomes the recovery layer.
+
+* Personal roam boundary
+
+There are two different ideas that should not be collapsed accidentally:
+
+1. =ai-kb=: shared human/agent operational knowledge.
+2. Craig's personal org-roam: personal notes, journals, recipes, dailies,
+   knowledge graph.
+
+Recommended default:
+
+- keep =ai-kb= as a separate private org-roam repo;
+- give agents rich access to =ai-kb=;
+- give agents narrower, permissioned access to personal roam;
+- bridge personal roam explicitly only when desired.
+
+Why:
+
+- Personal journals and private notes should not become agent scratch space.
+- Agent writes can pollute a personal graph if not isolated.
+- A separate repo makes sync, recovery, curation, and privacy easier.
+
+Still valuable personal-roam tools:
+
+- resolve topic to node;
+- return node body plus backlinks;
+- list nodes by tag;
+- surface dailies for a date range;
+- create notes via org-capture templates.
+
+But these should be structured affordances, not freeform agent mutation.
+
+* Curation workflow
+
+Any agent-written KB will rot unless curated.
+
+Add a periodic curation workflow that reports:
+
+- duplicate nodes;
+- orphan nodes;
+- stale nodes;
+- contested nodes;
+- superseded nodes still referenced;
+- over-broad nodes to split;
+- raw captures with no compiled node;
+- raw files that are too large;
+- external pointers that need repointing after a merge.
+
+Rules:
+
+- agents can propose merges/splits/deletions;
+- Craig confirms destructive changes;
+- merges must repoint inbound =[[id:]]= and external =ai-kb: Title (UUID)=
+  pointers;
+- curation stamps =:LAST_CURATED:= or equivalent.
+
+Why this helps:
+
+- Keeps the graph useful.
+- Prevents "AI memory" from turning into sediment.
+- Makes trust a maintained property, not a one-time design claim.
+
+* What to avoid
+
+- Do not let every session summary become a roam node.
+- Do not store secrets, credentials, tokens, or private keys.
+- Do not treat org-roam as the task system.
+- Do not load the whole graph at startup.
+- Do not let agents rewrite/delete/merge nodes without human confirmation.
+- Do not mix personal journals and agent KB by default.
+- Do not rely on org-roam's SQLite database as the agent source of truth.
+  Files and IDs should be canonical; SQLite is a browsing cache.
+- Do not let generated index files create semantic backlinks.
+- Do not let raw external captures become primary query results.
+
+* Best architecture
+
+Use three layers:
+
+** Operational layer
+
+=.ai/=, workflows, session logs, =todo.org=.
+
+This layer answers:
+
+- What are we doing now?
+- What happened this session?
+- What workflow should run?
+- What tasks are open?
+
+** Knowledge layer
+
+=ai-kb= as a private git-backed org-roam repo.
+
+This layer answers:
+
+- What should agents remember long-term?
+- What principles, procedures, and preferences apply?
+- What related decisions exist?
+- What has been superseded or contested?
+
+** Adapter layer
+
+Thin rules/tools for Claude, Codex, local models, and Emacs.
+
+This layer answers:
+
+- How does this runtime query memory?
+- How does this runtime write safely?
+- How does this runtime respect the same contract?
+
+* Best ideas to carry forward
+
+1. Treat org-roam as shared long-term semantic memory, not transcript storage.
+2. Keep =ai-kb= separate from personal roam by default.
+3. Give agents structured tools: query, show, backlinks, remember, lint, status,
+   curate.
+4. Require summaries and provenance on every node.
+5. Use ID-first links and pointers.
+6. Query before acting when durable preferences or prior decisions may matter.
+7. Use backlinks for graph-neighborhood context discovery.
+8. Make contradiction handling explicit.
+9. Run human and agent writes through the same lint/index/commit path.
+10. Make curation periodic and human-gated.
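Items 4 and 5 in the list above (summaries, provenance, ID-first identity) lend themselves to a mechanical check. The following is a minimal lint sketch, assuming each node is plain org text with one leading property drawer. `REQUIRED`, `ALLOWED_STATUS`, `parse_properties`, and `lint_node` are hypothetical names; the property set simply mirrors the suggested drawer in the brainstorm and does not describe a shipped =ai-kb lint=.

```python
import re
from typing import Dict, List

# Assumption: every property in the brainstorm's suggested drawer is mandatory.
REQUIRED = ("ID", "PROJECTS", "CREATED", "UPDATED", "CREATED_BY",
            "CONFIDENCE", "VISIBILITY", "SOURCE", "STATUS", "SUMMARY")
ALLOWED_STATUS = {"current", "contested", "superseded"}

PROPERTY_LINE = re.compile(r"^:([A-Z_]+):[ \t]*(.*)$")


def parse_properties(org_text: str) -> Dict[str, str]:
    """Collect key/value pairs from the first :PROPERTIES: drawer."""
    props: Dict[str, str] = {}
    in_drawer = False
    for line in org_text.splitlines():
        stripped = line.strip()
        if not in_drawer:
            in_drawer = stripped == ":PROPERTIES:"
            continue
        if stripped == ":END:":
            break
        match = PROPERTY_LINE.match(stripped)
        if match:
            props[match.group(1)] = match.group(2).strip()
    return props


def lint_node(org_text: str) -> List[str]:
    """Return findings; an empty list means the drawer passes this check."""
    props = parse_properties(org_text)
    findings = [f"missing :{key}:" for key in REQUIRED if not props.get(key)]
    status = props.get("STATUS")
    if status and status not in ALLOWED_STATUS:
        findings.append(f"unknown :STATUS: {status}")
    return findings
```

Because the check reads the org file directly, it stays consistent with the rule that files and IDs are canonical and the SQLite database is only a browsing cache.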
+
+* Possible next task
+
+Convert this brainstorm into a concrete design delta for the existing
+=docs/design/ai-kb.org= and the open =Implement ai-kb= task:
+
+- add agent query triggers;
+- specify personal-roam access boundaries;
+- define the structured tool interface for personal roam vs =ai-kb=;
+- add contradiction handling to the agent contract;
+- add curation acceptance criteria;
+- decide whether any subset of personal roam should be readable by default.
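For the contradiction-handling item in the list above, here is a minimal sketch of the "mark contested, explain, ask" flow from the brainstorm. `Claim`, `Resolution`, and `flag_contradiction` are hypothetical names; the four choices mirror the options the brainstorm says to put to Craig, and what happens after a choice is made is deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum


class Resolution(Enum):
    """Choices offered to the human, per the brainstorm's 'Better flow'."""
    UPDATE = "update"
    EXCEPTION = "exception"
    SUPERSEDE = "supersede"
    REJECT = "reject"


@dataclass
class Claim:
    node_id: str
    text: str
    status: str = "current"


def flag_contradiction(old: Claim, new: Claim, reason: str) -> str:
    """Mark both claims contested and build the question for the human.

    Nothing is overwritten here; the old claim keeps its text.
    """
    old.status = "contested"
    new.status = "contested"
    options = ", ".join(r.value for r in Resolution)
    return (f"Contested: '{new.text}' conflicts with '{old.text}' "
            f"({reason}). Choose one: {options}.")
```

Keeping the function free of any write to the old claim's text reflects the rule that agents surface conflicts instead of silently replacing a node.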
