author: Craig Jennings <c@cjennings.net>, 2026-05-24 03:08:48 -0500
committer: Craig Jennings <c@cjennings.net>, 2026-05-24 03:08:48 -0500
commit: c87631f0c200556d99b2ccbcd838cdf6877c7014
tree: 8f1269e13c6392209f784c4d0cc3c179c8c5eb09 /docs
parent: 1faa6e7538d458a9e65c6e97fbf566363686e6c8
docs(design): fold ai-kb reviews 3-4 into the spec
Reviews 3 (Codex, via Nexus/GraphRAG/Letta research) and 4 pushed on the write loop and the access layer rather than scope. I folded both in.

The write path is now a real protocol: fetch and fast-forward before writing, org-lint the node, regenerate the index, commit locally always, and treat the push as best-effort and non-blocking so a failed push never errors or hangs the agent. That's the exact gpg-agent failure we hit earlier today. The index is regenerated from node properties by a script rather than hand-maintained, so it can't drift from the nodes.

The access layer became an agent-neutral contract that lives in the repo, fronted by a minimal ai-kb CLI (doctor, query, remember, lint, curate, sync) with destructive operations human-only. That earns its place on Claude-only grounds: it's the clean home for the safe-write protocol and the lint and index steps. Cross-agent use is not a near-term goal, so Codex and Ollama adapters are deferred to vNext. The contract stays neutral in shape, so they're additive later.

Added provenance fields, the T1/T2/T3 tier names, and the review dispositions. The spec is now Ready.
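
For readers who want the write path in concrete form, the following is a minimal Python sketch of two pieces of it: the pre-write decision (fast-forward when behind and clean, abort when diverged or dirty) and the "commit locally always, push best-effort" step. It is an illustration only, not the repository's `ai-kb remember` implementation. The names `pre_write_action`, `commit_then_push`, and `git_runner`, and the 30-second timeout, are assumptions made for this example. The org-lint, index regeneration, `flock`, and push debouncing steps are omitted.

```python
import subprocess
from typing import Callable, List

# A runner takes git arguments and returns the exit code.
Runner = Callable[[List[str]], int]


def git_runner(repo: str) -> Runner:
    """Build a runner that executes git inside `repo` (hypothetical helper)."""
    def run(args: List[str]) -> int:
        # The timeout keeps a stuck credential prompt from hanging the caller.
        proc = subprocess.run(["git", "-C", repo, *args],
                              capture_output=True, timeout=30)
        return proc.returncode
    return run


def pre_write_action(behind: int, ahead: int, dirty: bool) -> str:
    """Decide what to do before writing, given the state after `git fetch`."""
    if dirty or (behind > 0 and ahead > 0):
        return "abort"         # diverged or dirty: surface it, never auto-merge
    if behind > 0:
        return "fast-forward"  # safe to run `git pull --ff-only`
    return "proceed"


def commit_then_push(run: Runner, message: str, log: List[str]) -> bool:
    """Commit locally (must succeed), then push without ever raising.

    Returns True if the push succeeded, False if it was logged and skipped.
    """
    # The local commit is the durable record, so a failure here is a real error.
    # Note: `git commit` also exits non-zero when there is nothing to commit.
    if run(["add", "-A"]) != 0 or run(["commit", "-m", message]) != 0:
        raise RuntimeError("local commit failed")
    try:
        if run(["push"]) == 0:
            return True
        log.append("push failed; local commit kept for a later push")
    except Exception as exc:  # e.g. subprocess.TimeoutExpired
        log.append(f"push errored ({exc}); local commit kept")
    return False
```

In real use the runner would come from something like `git_runner("/path/to/ai-kb")`. Passing the runner in as a parameter keeps the decision logic testable without a repository or a network.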
Diffstat (limited to 'docs')
-rw-r--r-- docs/design/ai-kb.org | 266
1 files changed, 151 insertions, 115 deletions
diff --git a/docs/design/ai-kb.org b/docs/design/ai-kb.org
index a4b7790a..b16bcad1 100644
--- a/docs/design/ai-kb.org
+++ b/docs/design/ai-kb.org
@@ -5,107 +5,115 @@
* Status
-Ready with caveats. Two reviews incorporated (=ai-kb-review.org= human+Claude, =ai-kb-review2.org= Codex; both 2026-05-24). Scope decided (memory store v1, LLM-Wiki deferred — see below). Storage decided (dedicated private git repo at an XDG path; Syncthing dropped). Findings that blocked readiness (version control, switch-state safety, startup surface, project-awareness) now have decisions. Remaining open items are small and named in [[*Open decisions][Open decisions]].
+Ready. Four reviews incorporated (=ai-kb-review.org=, =-review2.org=, =-review3.org=, =-review4.org=; all 2026-05-24). The four original blockers (version control + recovery, switch-state safety, startup surface, project-awareness) and the two write-loop caveats from review 4 (push-failure contract, index regeneration) have decisions. Review 3's operational shape (a repo-resident agent-neutral contract, a minimal CLI, maintenance commands, multi-agent provenance) is adopted. Cross-agent is *not a near-term goal* (Craig, 2026-05-24): v1 ships the Claude adapter over the neutral contract, and other-agent adapters (Codex/Ollama, MCP) are deferred to [[*vNext][vNext]]. Remaining open items are small — see [[*Open decisions][Open decisions]].
-In scope: Step 1 (store + global rule + provisioning) and Step 2 (Emacs browsing layer). Step 3 (migrating =.ai/sessions= and workflows in) and the full LLM-Wiki layer are *deferred to their own specs* — see [[*vNext][vNext]].
+In scope: Step 1 (store + contract/CLI + global rule + provisioning) and Step 2 (Emacs browsing layer). Step 3 (migrating =.ai/sessions= and workflows in) and the full LLM-Wiki layer are *deferred to their own specs* — see [[*vNext][vNext]].
* Scope decision: memory store, not (yet) an LLM Wiki
-ai-kb v1 is a *global, durable, cross-project memory store* for Claude Code: hand- or agent-authored org-roam nodes holding lessons, principles, Craig's preferences, reusable procedures, and durable observations. It is the concrete first slice of the broader "org-roam as agent memory" vision in [[file:agentic-knowledgebase.org][agentic-knowledgebase.org]].
+ai-kb v1 is a *global, durable, cross-project memory store* for AI coding agents (Claude Code today; agent-neutral by contract): org-roam nodes holding lessons, principles, Craig's preferences, reusable procedures, and durable observations. It is the concrete first slice of the broader "org-roam as agent memory" vision in [[file:agentic-knowledgebase.org][agentic-knowledgebase.org]].
-It is *not* a Karpathy-style LLM Wiki in v1. That pattern — immutable =raw/= sources, compiled =wiki/= synthesis pages, =schema.org=, source hashes, and full ingest/query/lint pipelines — is a larger product whose value is *grounding compiled knowledge in re-checkable sources*. v1 adopts the one piece of that idea that pays off immediately — a =raw/= capture for *external* sources, so a node compiled from an article/doc/transcript stays re-checkable (see [[*Grounding external sources][Grounding external sources]]) — but not the full compiled-=wiki/= layer, =schema.org=, source hashes, or ingest pipeline. The LLM-Wiki layer is the documented evolution path (see [[*vNext][vNext]]), and v1's structure is chosen so it can grow that way without a rewrite.
+It is *not* a Karpathy-style LLM Wiki in v1. That pattern — immutable =raw/= sources, compiled =wiki/= synthesis pages, =schema.org=, source hashes, and full ingest/query/lint pipelines — is a larger product whose value is *grounding compiled knowledge in re-checkable sources*. v1 adopts the one piece that pays off immediately: a =raw/= capture for *external* sources (see [[*Grounding external sources][Grounding external sources]]). The rest of that machinery is the documented evolution path (see [[*vNext][vNext]]); v1's structure is chosen so it can grow that way without a rewrite.
* Problem
-Claude Code starts every session cold. Continuity today is per-project and flat: file memory at =~/.claude/projects/<encoded-cwd>/memory/=, plus =.ai/notes.org= and =.ai/sessions/=. There is no home for durable, *general* knowledge that should follow the agent into every repo — engineering lessons, Craig's cross-project preferences, reusable procedures (e.g. "move a local repo to git.cjennings.net with a mirror-to-GitHub hook") — and no link structure relating one piece of knowledge to another.
+AI coding agents start every session cold. Continuity today is per-project and flat. There is no home for durable, *general* knowledge that should follow the agent into every repo — engineering lessons, Craig's cross-project preferences, reusable procedures (e.g. "move a local repo to git.cjennings.net with a mirror-to-GitHub hook") — and no link structure relating one piece of knowledge to another. A flat shared lessons file would solve "knowledge that follows the agent" alone; org-roam is chosen over it for *link structure*, *first-class browsing* (node-find, backlink buffer, graph), and a substrate that grows toward the agentic-KB vision. The complexity is earned by those three.
-A single shared lessons file in the rules layer would solve "general knowledge that follows the agent" on its own. org-roam is chosen over that flat file because it buys *link structure* (backlinks/forward-links the agent can traverse and the curation pass can exploit), *first-class browsing* (node-find, backlink buffer, graph) for Craig, and a substrate that grows toward the agentic-KB vision. The added complexity is earned by those three; a flat file gives none of them.
+* Memory tiers
-* Concept: two layers
+Naming the tiers (after Nexus's vocabulary) so every agent routes consistently:
-** Layer 1 — the store
+- *T1 — session scratch:* the current chat, spawned-agent handoffs. Ephemeral.
+- *T2 — project memory:* per-project =~/.claude/projects/<encoded-cwd>/memory/=, =.ai/notes.org=, active project decisions. Minor or project-specific.
+- *T3 — ai-kb:* global, durable, cross-project. Significant and general: lessons, principles, preferences, reusable procedures, durable observations.
-A git repository of org files (location in [[*Storage, version control, and recovery][Storage]]). Each note is a valid org-roam node. The agent reads and writes these files directly with its normal file tools and never touches the SQLite database. The files are the source of truth.
+T2's =MEMORY.md= shrinks toward an index: for significant items it points at the T3 (ai-kb) node rather than holding the content. ai-kb is *not* a dump for T1/T2 breadcrumbs — the proactive-write bar (below) keeps it T3-only.
-** Layer 2 — the Emacs/org-roam integration
+* Concept: two layers
-So Craig can browse with backlinks and the graph. org-roam keys off a single global =org-roam-directory= + =org-roam-db-location= per session, so the second database cannot be live alongside the personal roam. The integration is a *switch*: a command rebinds those variables to the ai-kb repo + its own database file, runs =org-roam-db-sync=, and now node-find and the backlink buffer operate on ai-kb. A companion command switches back. The switch carries a guard contract (see [[*The Emacs switch: guard contract][guard contract]]) because those globals have live side effects.
+- *Store* — a git repository of org files (each a valid org-roam node). The agent reads/writes these directly and never touches the SQLite database; the files are the source of truth.
+- *Emacs/org-roam integration* — so Craig can browse with backlinks and the graph. org-roam keys off one global =org-roam-directory= + =org-roam-db-location= per session, so ai-kb cannot be live alongside the personal roam; the integration is a *switch* with a guard contract (see [[*The Emacs switch: guard contract][guard contract]]).
* Storage, version control, and recovery
-ai-kb is its *own git repository* — not in =~/sync/org= (Syncthing has proven too unreliable for backup/restore: no history, silent =.sync-conflict= files on concurrent writes) and *not* in =~/.emacs.d= (that repo is publicly mirrored to GitHub, and ai-kb holds personal/work-private knowledge — it would leak).
+ai-kb is its *own git repository* — not in =~/sync/org= (Syncthing has proven unreliable for backup/restore: no history, silent =.sync-conflict= files) and *not* in =~/.emacs.d= (publicly mirrored to GitHub; ai-kb holds personal/work-private knowledge that would leak).
-- *Location:* =~/.local/share/ai-kb= (XDG =$XDG_DATA_HOME/ai-kb=). Simpler alternative if preferred: =~/.ai-kb=. (Confirm in [[*Open decisions][Open decisions]].)
-- *Origin:* a bare repo on =git.cjennings.net= (=git@cjennings.net:ai-kb.git=), *private — no public GitHub mirror*, unlike the other repos. This is the recovery layer: full history, clone-to-restore on any machine.
-- *No Syncthing.* git is the sole sync and backup. Multi-machine concurrency surfaces as ordinary git merges (recoverable), not silent conflict files.
-- *Validate, then auto-commit on write.* The write path validates the node with =org-lint= (see [[*Node validity (org-lint)][Node validity]]) and only on a clean pass appends =git -C <ai-kb> add -A && git commit -m "<one-line>" && git push=, so every change is captured and pushed and malformed org never reaches the index. Low-risk (single-user, recoverable), and it keeps the store and its history in lockstep without a manual step.
-- *Store layout (v1):* compiled nodes live at the repo root; a =raw/= subdirectory holds captured external sources (see [[*Grounding external sources][Grounding external sources]]). =org-roam-directory= points at the repo root with =raw/= *excluded* from the scan (=org-roam-file-exclude-regexp= matching =/raw/=), so raw captures never become noisy roam nodes. The LLM-Wiki vNext would add a compiled =wiki/= layer + =schema.org=; v1 keeps compiled nodes flat at root.
-
-* Why a separate database
+- *Location:* =~/.local/share/ai-kb= (XDG =$XDG_DATA_HOME/ai-kb=). Alternative: =~/.ai-kb= (see [[*Open decisions][Open decisions]]).
+- *Origin:* a bare repo on =git.cjennings.net= (=git@cjennings.net:ai-kb.git=), *private — no public GitHub mirror*. This is the recovery layer: full history, clone-to-restore.
+- *No Syncthing.* git is the sole sync and backup; multi-machine concurrency surfaces as ordinary git merges, not silent conflict files.
+- *org-roam scope:* =org-roam-directory= points at the repo root; =raw/= is *excluded* from the scan (=org-roam-file-exclude-regexp= matching =/raw/=) so raw captures never become noisy roam nodes. The LLM-Wiki vNext would add a compiled =wiki/= layer; v1 keeps compiled nodes flat at root.
-org-roam supports one active =org-roam-directory= / =org-roam-db-location= at a time. ai-kb gets its own directory (the repo above) and its own database file (=~/.emacs.d/org-roam-ai.db= — a regenerable cache, fine to keep in emacs home). The personal roam (=~/sync/org/roam/= + =~/.emacs.d/org-roam.db=, recipes etc.) is never scanned or modified. Switching moves between them.
+* Write protocol and synchronization
-* The sync model
+The agent writes nodes from the shell, possibly from several machines or concurrent processes, so the write path is a defined protocol, not a bare =git push=. Encapsulated in the =ai-kb remember= operation (see [[*The agent contract and operations][operations]]):
-org-roam keeps the =.org= *files* (truth) and a SQLite *database* (a cache indexing every node and =[[id:...]]= link) that powers Emacs's backlink buffer, node-find, and graph. Editing inside Emacs updates the cache on save via =org-roam-db-autosync-mode=. Agent shell writes don't fire an Emacs save, so the cache goes stale until =org-roam-db-sync= re-scans.
+1. *Before write:* =git fetch=; if behind and clean, =git pull --ff-only=; if diverged or the tree is dirty with unrelated changes, *abort and surface* — don't auto-merge.
+2. *Validate:* =org-lint= the node; reject on *error*-level problems (not warnings — see [[*Node validity (org-lint)][Node validity]]).
+3. *Regenerate the index* from node properties (see [[*Startup surface and retrieval contract][Startup surface]]).
+4. *Commit locally — always.* The local commit is the durable record.
+5. *Push — best-effort, non-blocking, never fatal.* A failed push (offline, network blip, gpg-agent SSH key not loaded — a real, observed failure mode) is *logged and ignored*, never errors or hangs the agent. The next successful write or a session-end flush carries the backlog. Push is debounced (commit per write; push on a timer or at session end) — local commits are already durable, so a round-trip per memory buys nothing and produces a long tail of one-line commits.
+6. *On push rejection* (remote moved): do *not* blind-retry. Fetch, report the divergence, leave the local commit intact for resolution.
+7. *Same-machine concurrency:* =flock= around =remember= serializes concurrent agents (Claude + Codex + an Emacs save) so they don't race. A v1 file lock; not a daemon.
-The key consequence: *the agent does not need the database to check links* — links live in the files. Forward links are the =[[id:UUID]]= entries in a node's file; backlinks are every file containing =[[id:<thisID>]]=. The agent computes both by grepping, always current regardless of sync. *Craig's Emacs browsing* needs the cache current, so the switch-to-ai-kb command runs =org-roam-db-sync= on entry. The agent may also fire =emacsclient -e '(cj/ai-kb-db-sync)'= after a write for immediacy, but that is convenience, never a correctness requirement — agent correctness never depends on Emacs running.
+* Why a separate database
-* Memory routing (tiering)
+org-roam supports one active =org-roam-directory= / =org-roam-db-location= at a time. ai-kb gets its own directory (the repo above) and its own database (=~/.emacs.d/org-roam-ai.db= — a regenerable cache). The personal roam (=~/sync/org/roam/= + =~/.emacs.d/org-roam.db=, recipes etc.) is never scanned or modified.
-ai-kb shrinks the per-project memory files toward an *index*:
+* The sync model
-- *ai-kb* ← anything significant and *general*: engineering lessons and principles, Craig's cross-project preferences, reusable procedures, durable observations worth recall in any future session, in any repo. A "general but Emacs-flavored" lesson lives here tagged =:emacs:=, not forced into a project's memory.
-- *Per-project claude memory files* ← minor or project-specific facts and session breadcrumbs. For significant items, =MEMORY.md= points at the ai-kb node (by title/id) rather than holding the content.
+The =.org= files are truth; the SQLite db is a cache indexing nodes and =[[id:...]]= links that powers Emacs's backlink buffer, node-find, and graph. Editing in Emacs updates the cache on save (=org-roam-db-autosync-mode=); agent shell writes don't, so =org-roam-db-sync= re-scans. The key consequence: *the agent never needs the db to check links* — they live in the files and are grepped (always current). *Craig's Emacs browsing* needs the cache current, so the switch-to-ai-kb command syncs on entry; the agent may also fire =emacsclient -e '(cj/ai-kb-db-sync)'= for immediacy, but agent correctness never depends on Emacs running.
* Proactive-write rule
-The agent writes a node *unprompted* when something is **durable** (true beyond this session) *and* **general** (not tied to the current repo; project-specific knowledge goes to the per-project memory file). The bar, to keep out noise: it must be genuinely worth recalling or linking later — a principle, a reusable procedure, a preference, a non-obvious lesson — not routine status or anything re-derivable from code or git. New nodes link to related existing ones (grep candidates by title/tag first), and the agent updates the index node (see [[*Startup surface and retrieval contract][Startup surface]]).
+The agent writes a node *unprompted* when something is **durable** (true beyond this session) *and* **general** (T3, not tied to the current repo; project-specific knowledge goes to T2). The bar, to keep out noise: genuinely worth recalling or linking later — a principle, a reusable procedure, a preference, a non-obvious lesson — not routine status or anything re-derivable from code or git. New nodes link to related existing ones (grep candidates by title/tag first) and trigger an index regeneration.
-*Contradiction guard:* if a write would contradict an existing node that affects agent behavior or a stated preference, the agent does *not* silently overwrite. It marks both as =:STATUS: contested=, records the conflict, and asks Craig before changing the canonical node.
+*Contradiction guard:* if a write would contradict an existing node that affects agent behavior or a stated preference, the agent does *not* silently overwrite. It marks both =:STATUS: contested=, records the conflict, and asks Craig before changing the canonical node.
* Node format and conventions
#+begin_src org
:PROPERTIES:
-:ID: <uuid, generated with `uuidgen`>
-:PROJECTS: :general: ; or :deepsat: :emacs: ... (relevant project slugs)
-:CREATED: 2026-05-24
-:UPDATED: 2026-05-24
-:SOURCE: chat 2026-05-24 ; free-form: chat, a session file, a spec path, a URL
-:STATUS: current ; current | contested | superseded
+:ID: <uuid, generated with `uuidgen`>
+:PROJECTS: :general: ; or :deepsat: :emacs: ... (see slug rule below)
+:CREATED: 2026-05-24
+:UPDATED: 2026-05-24
+:CREATED_BY: claude-code ; claude-code | codex | ollama | human
+:CONFIDENCE: user-stated ; user-stated | observed | inferred | external
+:VISIBILITY: personal ; personal | work-private
+:SOURCE: chat 2026-05-24 ; free-form, or a raw/ path for external sources
+:STATUS: current ; current | contested | superseded
:END:
#+title: Concise node title
#+filetags: :principle:emacs:
-Body. Link related nodes with [[id:OTHER-UUID][Their title]].
+Body. Link related nodes with [[id:OTHER-UUID][Their title]], optionally prefixed
+with a relation label: SUPERSEDES, CONTRADICTS, RELATES_TO, IMPLEMENTS, DERIVED_FROM.
#+end_src
-- *Filename:* org-roam convention — =YYYYMMDDHHMMSS-slug.org= (or =slug.org= for stable, frequently-linked nodes).
-- *ID:* a real UUID (=uuidgen=) — org-roam won't index a node without a valid =:ID:=.
-- *Type tags* (=#+filetags:=): =:principle:=, =:preference:=, =:procedure:=, =:observation:=, =:reference:=.
-- *Project provenance:* =:PROJECTS:= property lists relevant project slugs; =:general:= marks truly cross-cutting nodes. Drives project-filtered startup surfacing.
-- *Provenance-lite:* =:CREATED:/:UPDATED:/:SOURCE:/:STATUS:=. (Source *hashes* and confidence levels are LLM-Wiki grounding machinery — deferred to vNext.)
+- *Filename:* org-roam convention — =YYYYMMDDHHMMSS-slug.org= (or =slug.org= for stable, frequently-linked nodes; prefer stable slugs for nodes that =MEMORY.md= will point at, so curation merges don't dangle the pointer).
+- *ID:* a real UUID (=uuidgen=) — org-roam won't index a node without one.
+- *Type tags* (=#+filetags:=): =:principle:= =:preference:= =:procedure:= =:observation:= =:reference:=.
+- *Project slugs* (=:PROJECTS:=): derived from the project directory basename (so =~/.emacs.d= → =:emacs:=, the DeepSat repo → =:deepsat:=), with =:general:= for cross-cutting nodes. The derivation rule lives in the contract so every agent produces the same slug; new slugs are recorded in the index's project list.
+- *Provenance:* =:CREATED_BY:= and =:CONFIDENCE:= let later curation and trust policy distinguish "Craig stated this" from "a model inferred it." =:CONFIDENCE:= here is *provenance* (how the claim was obtained), not a numeric grounding score — the latter is vNext. =:VISIBILITY:= is two-valued in v1 (the full =public|work-private|secret= taxonomy is vNext); secrets are never stored at all (see [[*Security and privacy][Security]]).
+- *Relation labels:* a small fixed vocabulary used in link context now; full typed-link catalog storage is vNext.
* Grounding external sources
-The one piece of the LLM-Wiki pattern adopted in v1, because its payoff is immediate: keep compiled knowledge *re-checkable against its source* wherever a source exists.
-
-- *Node authored from an external source* — a web article, a fetched doc, a transcript, an API result — captures the source under =raw/=: the fetched text/file, or for a URL a small =raw/<slug>.org= stub with the URL, retrieval date, and the relevant excerpt. The node's =:SOURCE:= points at that raw path. A later agent can then re-ground a suspicious node against the original instead of trusting its own prior summary — the failure mode where a wiki starts quoting itself as evidence.
-- *Node authored from the conversation or direct observation* — a lesson, a preference, an observation about a codebase — needs only the free-form =:SOURCE:= pointer (the chat, the session file, the repo). No raw capture: the source is not an external artifact, so there is nothing to preserve.
-- =raw/= is append-only in spirit (sources are not edited after capture) and is excluded from org-roam's scan, so it never clutters the graph.
+The one LLM-Wiki piece adopted in v1: keep compiled knowledge re-checkable where an external source exists.
-This is deliberately *selective*: a blanket =raw/= layer for every node would be overhead, since most agent memories have no external source. The full compiled-=wiki/= layer, source hashes, and confidence scoring — the rest of the grounding machinery — wait for vNext, when external ingestion is a real workflow rather than an occasional capture.
+- *Node authored from an external source* (web article, fetched doc, transcript, API result): capture under =raw/= and point =:SOURCE:= at that path. *By default store the URL, retrieval date, and the relevant excerpt* — store full external text only when it is user-owned, licensed for the use, or operationally necessary (this is a private KB, but copyright still applies). A later agent can re-ground a suspicious node against the source instead of trusting its own prior summary.
+- *Node authored from the conversation or direct observation*: only the free-form =:SOURCE:= pointer; no raw capture (the source is not an external artifact).
+- =raw/= is append-only in spirit and excluded from org-roam's scan.
* Startup surface and retrieval contract
-Passive grep-on-demand gets under-used — a memory not surfaced at startup behaves like no memory. But loading the whole KB into every session wastes context. The contract is two-tier (reconciling both reviews):
+Passive grep-on-demand gets under-used; loading the whole KB wastes context. Two tiers:
-- *L1 — always loaded:* the global rule =claude-rules/ai-kb.md= (tiny). It carries the path, the routing rule, the link-grep recipes, and the instruction: *when a task may involve durable preferences, known procedures, prior decisions, or cross-project knowledge, read the index first.*
-- *L2 — on demand:* =index.org= at the ai-kb root — a compact, generated navigation map (title, id, one-line, type, project, updated, status), optionally project-filtered. Read at session start only when L1's condition applies.
-- *Full nodes* are read only when the index points at them or Craig asks.
+- *L1 — always loaded:* the global rule adapter (=claude-rules/ai-kb.md= for Claude), tiny. It carries the path, routing rule, link-grep recipes, and: *when a task may involve durable knowledge, read the index first.* Include concrete example triggers so it actually fires — e.g. "before choosing a formatter/test/lint convention," "before a multi-step procedure you've likely done before (repo setup, release, deploy)," "when Craig states a preference," "when you hit a non-obvious gotcha worth keeping."
+- *L2 — on demand:* =index.org= at the repo root, read at session start only when L1's condition applies.
+- *Full nodes* read only when the index points at them or Craig asks.
-=index.org= shape (sections by type/project; a "Contested / needs review" section; a size budget — when it outgrows the budget, split into =index-procedures.org= etc. rather than bloating one file):
+=index.org= is *generated output*, never hand-maintained — that is what keeps it from drifting from the nodes. A regeneration script greps node properties (=#+title:=, =:ID:=, type tag, =:PROJECTS:=, =:UPDATED:=, =:STATUS:=) and rebuilds the file with a "generated, do not edit" marker. It runs in provisioning, in the curation pass, on demand, and as step 3 of every =remember=. =lint --index= checks: every listed id resolves, every =current= node is listed, contested/superseded sections are accurate, the size budget holds (split into =index-procedures.org= etc. when exceeded).
#+begin_src org
* Procedures
@@ -118,67 +126,83 @@ Passive grep-on-demand gets under-used — a memory not surfaced at startup beha
* Checking links (agent recipes)
-No database needed; grep the files.
+No database needed; grep the files (excluding =raw/=):
+- *Backlinks to a node* — =rg -l "id:<UUID>" ~/.local/share/ai-kb --glob '*.org' --glob '!raw/**'=.
- *Forward links from a node* — grep that node's file for =id:= links.
-- *Backlinks to a node* — =grep -rl "id:<UUID>" ~/.local/share/ai-kb/=.
- *Find a node to link to* — grep titles/tags.
* Node validity (org-lint)
-Because the agent writes nodes as raw org from the shell — bypassing Emacs's structural editing — a malformed drawer, a bad property line, or a broken timestamp can slip in. =org-roam-db-sync= would then choke on or silently mis-index that node, and it would render wrong in Emacs. This is a *syntactic validity* check, distinct from the link-grep and credential scans above (which check *content*); both run, on different things.
+The agent writes raw org from the shell, bypassing Emacs's structural editing, so malformed org (broken drawer, bad property, broken timestamp) can slip in and make =org-roam-db-sync= choke or mis-index. Distinct from the semantic link/credential checks; both run.
-- *On write (the corruption guard):* after writing or editing a node, validate it with =org-lint= via =emacs --batch=, reusing/extending the project's existing =scripts/lint-org.el=. A node that *fails org-lint is not committed* — malformed org never enters the store or the index. This is part of the write path, alongside the auto-commit.
-- *In curation:* an =org-lint= sweep over all nodes catches anything that drifted or was hand-edited badly in Emacs after the fact.
+- *On write:* =org-lint= via =emacs --batch=, gating on *error*-level (structural) problems only — a benign style warning must not reject a good node. A node that fails is not committed.
+- *In curation:* an =org-lint= sweep over all nodes catches drift or bad Emacs-side hand-edits.
-This is cheap (a sub-second batch call on a single small file) and is the safety net that makes "the agent writes raw org files" trustworthy.
+Cheap (sub-second batch on one small file); the safety net that makes "the agent writes raw org" trustworthy. Reuses/extends the project's =scripts/lint-org.el=.
-* The Emacs switch: guard contract
+* The agent contract and operations
-The switch is not clean variable-rebinding. =org-roam-db-autosync-mode= is on, and a global =org-after-todo-state-change-hook= (=cj/org-roam-copy-todo-to-today=) copies completed tasks into the *active* roam's daily. Naive rebinding means completing a task or capturing while switched writes into ai-kb, and a forgotten switch-back silently misroutes personal captures. So:
+The access layer is an *agent-neutral contract*, not a Claude-only prompt snippet. Cross-agent use is not a near-term goal (deferred to vNext), but making the contract repo-resident and neutral *in shape* costs nothing now, future-proofs that path, and — more importantly for v1 — the CLI earns its place on Claude-only grounds: it is the clean, atomic, testable home for the write protocol, index regeneration, and lint, far better than scattering them across prose rules.
-- *On entry* (=cj/org-roam-switch-to-ai-kb=): rebind =org-roam-directory= + =org-roam-db-location= to ai-kb; *rescope or disable* the completed-task→daily hook so personal-task completions never land in ai-kb (and vice-versa); run =org-roam-db-sync=; surface the active KB in the modeline/echo so a half-switched state is visible.
-- *On exit* (=cj/org-roam-switch-to-personal=): restore both variables to the personal values *exactly*, and restore the hook.
-- The commands state these guarantees; tests assert the completed-task hook does not fire into ai-kb while switched.
+- *Canonical contract:* lives *in the repo* (=~/.local/share/ai-kb/AGENT_CONTRACT.org=) — the source of truth for the node format, routing rule, write protocol, and operations. It travels with the store.
+- *Adapters* point at it: =claude-rules/ai-kb.md= (symlinked into =~/.claude/rules/= by rulesets =make install=) is the Claude adapter. Other agents get their own thin adapter when wanted (deferred — see [[*Open decisions][Open decisions]]).
+- *Operations* — a small =ai-kb= CLI (shell, calling =emacs --batch= for org-lint/index work) is the canonical surface, so humans and every agent share one contract:
+ - =ai-kb doctor= — repo present, remote reachable + private, branch state, org-roam db buildable, required tools installed, adapter linked, no obvious secrets.
+ - =ai-kb index= — regenerate =index.org= from node properties.
+ - =ai-kb query <context>= — read the index, return relevant node ids/summaries + raw paths.
+ - =ai-kb remember= — the write protocol above (fetch/ff, validate, regenerate index, commit, best-effort push, under =flock=).
+ - =ai-kb lint= — org-lint, duplicate ids, broken id-links, missing required properties, bad project slugs, stale index, credential scan.
+ - =ai-kb curate --dry-run= — report duplicates, orphans, contested/superseded nodes, raw captures with no compiled node, nodes untouched past a horizon.
+ - =ai-kb sync= — =org-roam-db-sync= against ai-kb (Emacs-side helper).
+- *Admin split:* destructive operations — merge nodes, delete a node or raw capture, rewrite backlinks, mark superseded — are *human-confirmed only*, never automatic.
+- *Capability levels* (named so adapters know their lane): =file-only= (read/grep/template-write), =cli= (call =ai-kb=), =mcp= and =semantic= are vNext. Claude v1 uses =cli= with the rule adapter; until the CLI exists, =file-only= following the contract template is the bootstrap path.
-* Curation
+* The Emacs switch: guard contract
-The proactive-write bar controls intake; nothing controls rot. Over months the KB accrues near-duplicates, superseded nodes, and orphans. A human-gated curation pass (a "task-review for memory"), periodic or node-count-triggered, surfaces four buckets — duplicates to merge, stale/superseded nodes, orphans (no back- or forward-links), over-broad nodes to split. Craig decides; the agent executes, repointing =[[id:]]= backlinks on merges (grep + rewrite). A =:LAST_CURATED:= stamp rotates the pass through least-recently-touched nodes. org-roam's backlinks/tags/graph make it a better curation substrate than a flat file. (The full workflow is a Step-1.5 follow-up; v1 ships the convention and the stamp.)
+=org-roam-db-autosync-mode= is on, and a global =org-after-todo-state-change-hook= (=cj/org-roam-copy-todo-to-today=) copies completed tasks into the *active* roam's daily. Naive rebinding means a task completion or capture while switched lands in the wrong roam.
-* Security and privacy
+- *On entry* (=cj/org-roam-switch-to-ai-kb=): rebind =org-roam-directory= + =org-roam-db-location=; *rescope or disable* the completed-task→daily hook; =org-roam-db-sync=; surface the active KB in the modeline/echo.
+- *On exit* (=cj/org-roam-switch-to-personal=): restore both variables *exactly* and restore the hook.
+- *Abnormal exit:* if Emacs is killed while switched, on-exit never runs. The config re-asserts personal-roam state at startup (or detects a stale switched state), so a crash can't leave the completed-task hook rescoped into ai-kb.
+- Tests assert the completed-task hook does not fire into ai-kb while switched.
-ai-kb lives in a *private* repo (cjennings.net only, no public mirror), which removes the main leak surface. v1 rule: *ai-kb is private but not a secret store* — no credentials, tokens, or keys in nodes; the curation/lint pass scans =raw/= (none in v1) and =wiki/= for common credential patterns before commit. The full source-classification taxonomy (=:VISIBILITY: public|personal|work-private|secret=) is deferred to vNext, when sharing/publishing or a public/private split is actually on the table.
+* Maintenance and curation
-* What Claude needs to leverage it
+The proactive-write bar controls intake, but nothing controls rot, and the system creates memories unprompted. A minimal maintenance loop is therefore part of v1 (read-only commands; destructive execution human-gated):
-The load-bearing requirement: because ai-kb is used from *every* project, the agent spec lives in the *global rules layer* (=~/code/rulesets/claude-rules/ai-kb.md=), installed by the rulesets =make install= as a symlink into =~/.claude/rules/= and loaded into every session. A note only in this repo's =CLAUDE.md= would not reach the agent in another repo. Step 1 is not complete until that rule is written *and* =make install= has linked it.
+- =ai-kb doctor= / =ai-kb lint= — health and validity (above).
+- =ai-kb curate --dry-run= surfaces four buckets — duplicates to merge, stale/superseded nodes, orphans (no back- or forward-links), over-broad nodes to split. Craig decides; the agent executes *human-confirmed* merges/splits, repointing =[[id:]]= backlinks (grep + rewrite) and re-linting. A =:LAST_CURATED:= stamp rotates the pass through least-recently-touched nodes.
+- *Pointer integrity:* before deleting or merging a node, grep for inbound references (other nodes' =[[id:]]= and per-project =MEMORY.md= pointers) and repoint them; prefer stable =slug.org= names for pointer targets.
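The pointer-integrity check above could be approximated with two small grep helpers. This is a sketch under stated assumptions: the function names are hypothetical, plain =grep= stands in for =rg=, and per-project =MEMORY.md= pointers are assumed to mention the target by its =slug.org= file name.

```shell
#!/usr/bin/env bash
# Sketch only; both function names are hypothetical.

# List .org files under the given directories that link to node ID via
# [[id:ID]] or [[id:ID][description]]. raw/ captures and .git are skipped.
ai_kb_inbound_id_refs() {
  local id="$1"; shift
  { grep -rlF --include='*.org' --exclude-dir=raw --exclude-dir=.git \
      -- "[[id:${id}]" "$@" || true; } | sort
}

# List MEMORY.md files under the given directories that mention SLUG.org.
# Assumes pointers are written as file names, which the spec only recommends.
ai_kb_inbound_slug_refs() {
  local slug="$1"; shift
  { grep -rlF --include='MEMORY.md' --exclude-dir=.git \
      -- "${slug}.org" "$@" || true; } | sort
}
```

Matching the fixed string =[[id:ID]= (with the closing bracket) avoids false hits on longer ids that merely share a prefix. An empty result from both helpers suggests the node has no inbound pointers left to repoint.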
-ai-kb is *intentionally global* and crosses the per-project =.ai/= scope boundary by design — the agent's own knowledge base, not any single project's scope. This is the one sanctioned exception to =cross-project.md=.
+* Security and privacy
-* Provisioning
+ai-kb lives in a *private* repo (cjennings.net only, no public mirror), removing the main leak surface. v1 rule: *private but not a secret store*: no credentials/tokens/keys in nodes or =raw/=; =ai-kb lint= scans both for common credential patterns before commit and *fails* on a hit (secrets move to a secure reference, not ai-kb). =:VISIBILITY:= is two-valued (=personal= / =work-private=) in v1; the full four-level =public|personal|work-private|secret= taxonomy and a public/private split are vNext, for when sharing or publishing a subset is actually on the table.
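A fail-on-hit credential scan of the kind described above might be sketched as follows. The function name and the pattern list are illustrative assumptions, not the spec's actual rules; a real =ai-kb lint= would likely carry a broader, tuned set.

```shell
#!/usr/bin/env bash
# Sketch only; ai_kb_secret_scan and its three patterns are hypothetical.
# Patterns: PEM private-key headers, AWS access key IDs, classic GitHub tokens.
ai_kb_secret_scan() {
  local dir="${1:-.}"
  if grep -rnE --exclude-dir=.git \
      -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' \
      -e 'AKIA[0-9A-Z]{16}' \
      -e 'ghp_[A-Za-z0-9]{36}' \
      -- "$dir"; then
    echo "ai-kb lint: possible credential found" >&2
    return 1          # a hit fails the lint, so the write path can refuse to commit
  fi
  return 0
}
```

Because the function prints the matching file and line before returning non-zero, a caller can show the hit to the human and stop before =git commit=.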
-The pieces span three homes; name and order them. =make ai-kb-init= (wrapping =scripts/setup-ai-kb.sh=) is idempotent:
+* Provisioning
-1. Clone or init the ai-kb git repo at =~/.local/share/ai-kb= (bare origin =git@cjennings.net:ai-kb.git=).
-2. Seed =index.org= and a README/index node with a generated =:ID:=, =#+title:=, =#+filetags:= if absent.
-3. Best-effort initial sync: if an Emacs server is running, =emacsclient -e '(cj/ai-kb-db-sync)'= to build =org-roam-ai.db=; skip silently otherwise.
-4. Ensure the global rule is active: =cd ~/code/rulesets && make install= (symlinks =claude-rules/ai-kb.md= into =~/.claude/rules/=).
+The pieces span the rulesets repo, this repo, and the ai-kb repo. =make ai-kb-init= (wrapping =scripts/setup-ai-kb.sh=) is idempotent. *One-time server bootstrap* (distinct from the per-machine clone, and not doable by the local script): =sudo git init --bare /var/git/ai-kb.git && chown= on cjennings.net, plus the github-mirror hook left *off* for this repo.
-Fresh-machine order: (a) ai-kb repo cloned, (b) =make ai-kb-init= seeds + builds the db, (c) rulesets =make install= so the global rule is linked.
+Per-machine, ordered:
+1. Clone =git@cjennings.net:ai-kb.git= to =~/.local/share/ai-kb= (or =git init= + add remote on the very first machine).
+2. =make ai-kb-init=: seed =index.org= + a README/index node with a generated =:ID:=; install the =ai-kb= CLI; =ai-kb index=; best-effort =ai-kb sync= if an Emacs server is up.
+3. =cd ~/code/rulesets && make install= — symlinks the =claude-rules/ai-kb.md= adapter into =~/.claude/rules/=.
+4. =ai-kb doctor= to confirm the machine is wired correctly.
* Build plan
-** Step 1 — store + global rule + provisioning (immediate value)
+** Step 1 — store + contract/CLI + global rule + provisioning
-- The =ai-kb= git repo (bare on cjennings.net + clone at the XDG path) with seed =index.org=.
-- =~/code/rulesets/claude-rules/ai-kb.md= — the global L1 rule (path, node format incl. provenance + project tags, routing rule, proactive + contradiction rules, external-source raw-capture, link-grep recipes, "read the index first", validate-with-org-lint-then-auto-commit-on-write, no-secrets rule).
-- =scripts/setup-ai-kb.sh= + =make ai-kb-init=; confirm =make install= links the rule.
+- The =ai-kb= git repo (bare on cjennings.net + clone at the XDG path), seed =index.org=, =AGENT_CONTRACT.org=.
+- The minimal =ai-kb= CLI (=doctor/index/query/remember/lint/curate/sync=) implementing the write protocol, index regeneration, org-lint gating, credential scan, =flock=.
+- =claude-rules/ai-kb.md= adapter (points at the contract; routing + proactive + contradiction rules + concrete L1 triggers + "use =ai-kb remember=, never bypass =ai-kb lint="); =make install= links it.
+- =scripts/setup-ai-kb.sh= + =make ai-kb-init=; the one-time server bootstrap documented.
-After Step 1 the agent can write nodes, check links, and auto-commit immediately, before the Emacs layer exists.
+After Step 1 the agent can remember, query, lint, and curate-report immediately, before the Emacs layer exists.
** Step 2 — Emacs browsing layer
-In =org-roam-config.el=: ai-kb directory constant + =org-roam-ai.db=; =cj/org-roam-switch-to-ai-kb= / =cj/org-roam-switch-to-personal= with the guard contract above; =cj/ai-kb-db-sync= helper; keybindings under =C-c n= (e.g. =C-c n a= ai-kb / =C-c n A= back, avoiding the dense existing =l/f/p/r/t/i/w/I/d=); which-key labels; ERT tests + =/review-code=.
+In =org-roam-config.el=: ai-kb dir constant + =org-roam-ai.db=; =cj/org-roam-switch-to-ai-kb= / =…-to-personal= with the guard contract (incl. abnormal-exit re-assert); =cj/ai-kb-db-sync=; =C-c n= keybindings (e.g. =C-c n a= / =C-c n A=, avoiding the dense existing set); which-key labels; ERT tests + =/review-code=.
** Step 3 and the LLM-Wiki layer — deferred
@@ -186,59 +210,71 @@ Separate specs. See [[*vNext][vNext]].
* Test strategy
-- *Step 2 ERT* (=tests/test-<module>.el=, =make test-unit=): switch sets the ai-kb dir + db; switch-back restores personal values exactly; the completed-task hook does *not* fire into ai-kb while switched; the sync helper is callable.
-- *Provisioning* (bats/shell): =setup-ai-kb.sh= idempotent; seeds a node with a valid =:ID:=; initializes/validates git.
-- *Link recipes* (fixture KB): backlink-by-grep and forward-link-by-grep return correct sets.
-- *Node validity:* a well-formed node passes =org-lint=; a deliberately malformed node (broken drawer / bad property) fails, and the write path refuses to commit it.
+- *CLI / write path:* a node write with the remote unreachable still commits locally and does *not* error the agent (push deferred); =flock= serializes concurrent =remember=; =org-lint= error-level rejects a malformed node, a style warning does not.
+- *Index:* regeneration from a fixture KB produces the expected entries; a node added out-of-band appears only after regeneration (proves no drift); =lint --index= flags a missing/stale entry.
+- *Link recipes* (fixture KB): backlink-by-grep (excluding =raw/=) and forward-link-by-grep return correct sets.
+- *Step 2 ERT:* switch sets the ai-kb dir+db; switch-back restores personal exactly; the completed-task hook does not fire into ai-kb while switched; startup re-asserts personal state.
+- *Provisioning* (bats): =setup-ai-kb.sh= idempotent; seeds a node with a valid =:ID:=; =doctor= passes on a freshly-provisioned repo.
* Scaling path (planned, not built)
- v1: =rg= over org files + a generated =index.org=.
-- v1.5: a scripted =ai-kb-search= over title/tags/properties/body.
-- vNext: a local BM25/vector tool (e.g. =qmd=) over the nodes, preserving links; no embeddings in v1.
+- v1.5: =ai-kb query= grows richer ranking over title/tags/properties/body.
+- vNext: a local BM25/vector tool (=qmd= or similar) over the nodes, preserving links; no embeddings in v1.
* Review dispositions
-Everything not listed here was accepted as written and woven into the body above. Listed: the modified and rejected recommendations, with reasons.
-
-- *Review 2 core reframe → MODIFIED (scope).* v1 is the org-roam memory store, not a full Karpathy LLM Wiki. Per Review 2's own off-ramp; matches Craig's stated intent (durable memory, not raw-source compilation). The LLM-Wiki layer is the documented vNext.
-- *Review 2 #1 (raw/wiki/schema separation) → PARTIALLY ADOPTED.* v1 adds a =raw/= capture for *external* sources only (see Grounding external sources), because that is where re-checkability pays off immediately. The compiled =wiki/= layer and =schema.org= stay vNext — most agent memories have no external source, so a blanket raw/wiki split would be overhead.
-- *Review 2 #2 (full ingest/query/lint operations) → MODIFIED.* Query = index + grep; semantic lint folds into the curation pass; =org-lint= syntactic validation is now an explicit write-time guard (see Node validity); the heavy ingest pipeline (source registration/compilation) → vNext.
-- *Review 2 #3 (full provenance: SOURCES + hashes + confidence) → MODIFIED to provenance-lite.* Adopted =:CREATED:/:UPDATED:/:SOURCE:/:STATUS:=; dropped source hashes + confidence (they serve raw-source grounding, deferred).
-- *Review 2 #8 (exclude =raw/= from org-roam's scan) → ADOPTED.* Now that v1 has a =raw/=, =org-roam-file-exclude-regexp= keeps raw captures out of the graph so they don't become noisy nodes.
-- *Review 2 #10 (full =:VISIBILITY:= taxonomy + credential lint) → MODIFIED to a v1 no-secrets rule + lint scan.* Private-repo location handles the main concern; the four-level taxonomy → vNext when publishing/sharing is real.
-- *Review 1 #5 (curation workflow) → ACCEPTED, partially deferred.* v1 ships the convention + =:LAST_CURATED:= stamp; the full human-gated workflow is a Step-1.5 follow-up.
-- *Storage location → Option 1 (emacs home) REJECTED* (public GitHub mirror would leak personal/work knowledge); *Option 3-XDG ACCEPTED* (dedicated private repo at =~/.local/share/ai-kb=); Syncthing dropped per Craig.
+Everything not listed was accepted as written and woven in. Listed: modified, rejected, or owner-deferred recommendations, with reasons.
+
+- *Review 2 core reframe → MODIFIED (scope).* v1 is the memory store, not a full LLM Wiki; per Review 2's own off-ramp and Craig's stated intent.
+- *Review 2 #1 (raw/wiki/schema) → PARTIALLY ADOPTED.* =raw/= for external sources only; compiled =wiki/= + =schema.org= stay vNext (most memories have no external source).
+- *Review 2 #2 (ingest/query/lint) → MODIFIED.* query = index + =rg= (now =ai-kb query=); lint = =ai-kb lint= + curation; org-lint is the write-time validity gate; the heavy ingest pipeline → vNext.
+- *Review 2 #3 (provenance + hashes + numeric confidence) → MODIFIED to provenance-lite,* reconciled with Review 3: adopted =:CREATED_BY:/:CONFIDENCE:(provenance)/:VISIBILITY:/:SOURCE:/:STATUS:=; dropped source *hashes* and numeric confidence scoring (raw-corpus grounding machinery → vNext).
+- *Review 2 #8 (exclude =raw/= from scan) → ADOPTED.*
+- *Review 2 #10 (full visibility taxonomy + lint) → MODIFIED:* v1 = private repo + no-secrets rule + credential lint; four-level taxonomy → vNext.
+- *Review 3 #1 (agent-neutral contract + CLI) → ADOPTED (contract + CLI); cross-agent ADAPTERS deferred to vNext (Craig, 2026-05-24).* The contract lives in the repo and a minimal CLI is the operation surface — justified on Claude-only correctness (atomic safe writes), with neutrality as cheap future-proofing. Codex/Ollama adapters + MCP wait until cross-agent is actually adopted.
+- *Review 3 capability levels =mcp=/=semantic= → DEFERRED.* vNext.
+- *Review 4 #1 (push-failure contract) → ADOPTED,* and strengthened to debounced best-effort push (commit always; push never blocks/fails the agent) — directly informed by the gpg-agent SSH failure observed this session.
+- *Review 4 #2 (index regeneration) → ADOPTED:* generated by =ai-kb index=, never hand-maintained.
+- *Storage location → Option 1 (emacs home) REJECTED* (public mirror leaks); *XDG dedicated private repo ADOPTED;* Syncthing dropped.
+- *Curation full workflow → kept v1-minimal:* read-only =curate --dry-run= ships v1; the interactive merge/split flow is human-gated and its cadence is an Open decision.
* Agreed decisions
- Building from the rulesets session is sanctioned cross-project work (Craig, 2026-05-24).
-- ai-kb is intentionally global and the one sanctioned exception to =cross-project.md=.
+- ai-kb is intentionally global; the one sanctioned exception to =cross-project.md=.
- Scope: memory store v1; LLM Wiki deferred.
-- Storage: dedicated private git repo, XDG path, no Syncthing, auto-commit-on-write.
+- Storage: dedicated private git repo, XDG path, no Syncthing.
+- Write path: commit always, push best-effort/non-blocking/debounced; safe-fetch before; =flock=.
+- Operations are an agent-neutral contract fronted by a minimal =ai-kb= CLI; destructive ops are human-confirmed only.
+- Cross-agent is not a near-term goal (Craig, 2026-05-24): v1 ships the Claude adapter; other-agent adapters + MCP are deferred to vNext. The contract stays neutral in shape so they are additive later.
* Open decisions
-- [ ] Store path: =~/.local/share/ai-kb= (XDG, recommended) vs =~/.ai-kb= (dotdir). Taste call; everything else is identical.
-- [ ] Curation cadence/trigger (calendar vs node-count) and where the Step-1.5 workflow lives (rulesets =.ai/workflows/=).
+- [ ] *Store path:* =~/.local/share/ai-kb= (XDG, recommended) vs =~/.ai-kb=.
+- [ ] *CLI language:* shell wrapper calling =emacs --batch= (recommended) vs Elisp-first.
+- [ ] *Push debounce:* timer interval vs flush-at-session-end only.
+- [ ] *Curation cadence/trigger* (calendar vs node-count) and where the workflow lives (rulesets =.ai/workflows/=).
* vNext
-Each idea below is valuable but out of v1 scope. The v1 bar is *token-efficient and fully recoverable*; an idea earns v1 inclusion only if it improves recall or grounding *now*, not at hypothetical scale. The reason for deferring or declining each is stated so a future reader (or reviewer) need not re-litigate it.
+Each is valuable but out of v1 scope; the v1 bar is token-efficient, safe, and fully recoverable. Reasons are given so a future reader need not re-litigate them.
-- *Step 3 — migrate =.ai/sessions/= and =.ai/workflows/= into ai-kb* (sessions as dated log nodes, workflows as procedure nodes, linkable); its own spec. *Why not v1:* moving existing, working systems is a migration with its own tradeoffs. Build ai-kb, live with it, then decide whether the move earns its disruption.
-- *Compiled =wiki/= layer + =schema.org=* (synthesis pages held distinct from their sources). *Why not v1:* v1 already captures external sources under =raw/= and authors compiled nodes at the root. A formal source-vs-synthesis split only pays off once external ingestion is frequent enough that re-compiling synthesis across many sources is routine — until then it is structure without a workload.
-- *Source hashes + =:CONFIDENCE:= scoring.* *Why not v1:* hash-based drift detection only has value against a substantial =raw/= corpus to check against. With occasional captures, the =:SOURCE:= path + =:STATUS:= already let a future agent re-ground by hand. It adds bookkeeping with no current payoff.
-- *Formal ingest / query / lint operations.* *Why not v1:* "query" is already index-first + =rg=; "lint" already folds into the curation pass (broken links, orphans, duplicates, credential scan). Only the heavy "ingest pipeline" — register a source, compile across many pages, update index and log atomically — is genuinely new, and that is the external-corpus workflow that triggers the =wiki/= layer above. Premature without that workload.
-- *Semantic / embedding retrieval and =qmd=-style local search.* *Why not v1:* find-by-meaning is a real recall gain, but only above roughly hundreds of nodes. Below that, =rg= + the generated index builds faster, has no index-staleness, and adds no dependency. See [[*Scaling path (planned, not built)][Scaling path]] — adopt when the index stops fitting comfortably, not before. No embeddings in v1.
-- *Append-only =log.org= (chronological operation log).* *Why not v1:* a navigation/debugging aid, not a capability gain; git history already records every write via auto-commit. Cheap to add later if the git log proves too coarse.
-- *Source-classification taxonomy* (=:VISIBILITY: public|personal|work-private|secret=) and a public/private split. *Why not v1:* the dedicated *private* repo already removes the main leak surface, and the v1 no-secrets rule + credential lint cover the floor. The four-level taxonomy earns its place only when sharing or publishing a subset is actually on the table.
-- *Full agentic-knowledgebase vision* — project-hub nodes; person/decision/thread/meeting/problem/runbook node types; the =cj/agent-*= command set. *Why not v1:* a much larger product (see [[file:agentic-knowledgebase.org][agentic-knowledgebase.org]]); ai-kb is its first concrete slice and proves the substrate first.
-- *Live dual-roam browsing* — personal roam + ai-kb visible at once, no switch. *Why not v1:* org-roam supports one active database per session, so the switch is the only option today. Revisit if org-roam gains multi-db support, or via a second Emacs instance.
+- *Step 3 — migrate =.ai/sessions/= + workflows into ai-kb.* Its own spec. *Why not v1:* moving working systems is a migration with its own tradeoffs; prove ai-kb first.
+- *Other-agent adapters (Codex, Ollama) + MCP server.* *Why not v1:* cross-agent is not a near-term goal (Craig, 2026-05-24). The contract is already repo-resident and neutral in shape, so adapters are purely additive when a second agent is actually adopted. Ollama specifically will need a host wrapper (run =ai-kb query= before the turn; =remember --confirm= human-gated) since a model runtime won't curate on its own.
+- *Compiled =wiki/= layer + =schema.org=, source hashes, numeric confidence.* *Why not v1:* pay off only with a substantial external-source corpus to compile and drift-check; v1 captures sources selectively under =raw/= already.
+- *Formal ingest pipeline.* *Why not v1:* the external-corpus workflow that triggers the wiki layer; premature without it.
+- *Semantic / embedding retrieval, =qmd=.* *Why not v1:* find-by-meaning pays off above ~hundreds of nodes; =rg= + index is faster and dependency-free below that.
+- *Event-sourced JSONL catalog + SQLite projection; typed-link graph traversal + content-addressed spans.* *Why not v1:* Nexus-scale infrastructure; org-roam's db + grep cover v1, and clean links/properties now let this be built later without rewriting nodes.
+- *Plan library / operator-DAG execution* (AgenticScholar, Plan*RAG). *Why not v1:* the near-term lesson is only "don't hide retrieval procedure in prose" — met by the =ai-kb query= contract; multi-hop planning waits.
+- *=log.org= op-log.* *Why not v1:* git history already records every write; add later if the git log is too coarse.
+- *Full =:VISIBILITY:= taxonomy + public/private split.* *Why not v1:* the private repo + no-secrets rule cover the floor.
+- *Full agentic-knowledgebase vision* (project hubs; person/decision/thread/meeting/problem/runbook types; =cj/agent-*= commands). *Why not v1:* a much larger product; ai-kb is its first slice.
+- *Live dual-roam browsing* (no switch). *Why not v1:* org-roam supports one active db per session today.
* Relationship to existing mechanisms
-- *Per-project claude memory* — stays the session-recall layer; shrinks to an index pointing into ai-kb for significant items.
-- *.ai/notes.org and .ai/sessions/* — unchanged in v1 (migration is the deferred Step 3).
+- *Per-project claude memory (T2)* — stays the session-recall layer; shrinks to an index pointing into ai-kb (T3) for significant items.
+- *.ai/notes.org and .ai/sessions/* — unchanged in v1 (migration is deferred Step 3).
- *Personal org-roam (recipes, etc.)* — never touched; reached by switching.
- *agentic-knowledgebase.org* — the broader vision; ai-kb is its first concrete slice.