docs(design): MCP-into-gptel + gh-as-gptel-tool specs + MCP phases

Two new design docs in docs/design/ covering the next two GPTel work items, plus matching task scaffolding in todo.org. mcp-el-gptel-integration.org wires mcp.el into the config so GPTel gets access to the nine MCP servers Claude Code already uses (linear, notion, figma, slack-deepsat, drawio, google-calendar, google-docs-personal, google-docs-work, google-keep). The design covers async startup, the write-confirmation policy, a server-enablement defcustom, a doctor with live-auth-check, the audit buffer, and the mcp.el compatibility layer. The spec is at revision 3 after two code-review passes flagged a critical confirmation gap (gptel-confirm-tool-calls nil at ai-config.el:386 silently ignored per-tool :confirm slots) and several incorrect mcp.el API assumptions. Both are addressed. gptel-gh-tool.org wraps the gh CLI as a hybrid surface: 14 typed read wrappers plus one general write tool gated by :confirm t. Host/repo resolution is command-aware: --repo HOST/OWNER/REPO for repo commands, --hostname only for api and auth status. The runner enforces an irreversible-command blocklist, a 64KB in-flight output cap, and a debug-record plus last-error-buffer story. The spec is at revision 2 after a code-review pass corrected gh flag assumptions and reframed the safety story around per-tool confirm. todo.org gains a link to the MCP spec under the parent task plus nine TODO sub-tasks (one per implementation phase), and a new gh-tool TODO with the same spec-link shape.
author: Craig Jennings <c@cjennings.net> 2026-05-17 01:16:03 -0500
committer: Craig Jennings <c@cjennings.net> 2026-05-17 01:16:03 -0500
commit: 2b61aa82afa39a0ff1b165fa9ff09d55d21bfabf (patch)
tree: 0ff4916e842caff081bfa2a70d8753dc6fa21044 /docs
parent: 1fef9aed248aaf7586950475aa22849e413f1c04 (diff)
download: dotemacs-2b61aa82afa39a0ff1b165fa9ff09d55d21bfabf.tar.gz
dotemacs-2b61aa82afa39a0ff1b165fa9ff09d55d21bfabf.zip
2 files changed, 2495 insertions, 0 deletions
diff --git a/docs/design/gptel-gh-tool.org b/docs/design/gptel-gh-tool.org
new file mode 100644
index 00000000..a9ba22bb
--- /dev/null
+++ b/docs/design/gptel-gh-tool.org
@@ -0,0 +1,1061 @@
+#+TITLE: Design: Wrap the gh CLI as a GPTel tool
+#+AUTHOR: Craig Jennings
+#+DATE: 2026-05-16
+#+OPTIONS: toc:nil num:nil
+
+* Status
+
+Draft (revision 2).  Pre-implementation; no code shipped yet.  =gh=
+v2.92.0 installed and authenticated against both =github.com= (account
+=cjennings=) and =deepsat.ghe.com= (account =craig-jennings=).
+
+Revision 2 incorporates a code-review pass that caught several
+incorrect =gh= CLI assumptions and a critical safety gap: with
+=gptel-confirm-tool-calls= set to =nil= in =modules/ai-config.el:386=,
+the "irreversible-only blocklist" V1 from revision 1 would let the
+agent run unconfirmed writes (=pr merge=, =run cancel=, =api -f
+body=...=, =release create=, etc.).  Revision 2 keeps Craig's intent
+of a general-capability tool but applies =:confirm t= to every
+classified write, drops the "scan command string for =--hostname=
+and =-H=" approach in favor of argv-list builders, and expands the
+wrapper set so most agent workflows don't need to fall back to the
+general tool.
+
+* Problem
+
+GPTel agents can read git history locally (=git_log=, =git_diff=,
+=git_status=) but have no access to the GitHub side of the workflow:
+PRs, issues, reviews, CI runs, releases, gists, repo metadata.  The
+=gh= CLI does all of this and is already authenticated against both
+hosts the user works with.  Wrapping it as a GPTel tool gives the
+agent the same GitHub surface the user has.
+
+The wrapper has to handle four complications the local git tools
+don't:
+
+1. *Two authenticated hosts* -- personal (=github.com=) and work
+   GHE (=deepsat.ghe.com=) -- with different policies and different
+   blast radii for destructive operations.
+2. *A vast subcommand surface* (~30 top-level subcommands, hundreds
+   of leaves) that doesn't fit cleanly into one typed schema per
+   leaf.
+3. *Inconsistent flag semantics across subcommands.*  =--hostname=
+   exists only on =gh api= and =gh auth status=; the rest use
+   =--repo [HOST/]OWNER/REPO=.  =-H= means =--head= on =gh pr list=
+   but =--header= on =gh api=.  Naive string-scanning to detect
+   user intent is wrong.
+4. *Global GPTel setting* (=gptel-confirm-tool-calls nil=) means
+   the agent can call any registered tool without user confirmation
+   unless that specific tool is registered with =:confirm t=.  V1
+   safety must come from per-tool flags, not from a session-level
+   gate.
+
+* Goals
+
+1. The agent can invoke the non-interactive subset of =gh= against
+   either host, gated by a safety policy.
+2. High-frequency reads have typed wrappers with sensible defaults
+   (host/repo resolution, JSON-by-default with minimal fields,
+   output cap, timeout).  Wrappers auto-execute.
+3. The active host is resolved from a single context object that
+   considers (in order): explicit =hostname=/=repo= args, branch
+   upstream remote, =origin= remote, =cj/gh-default-host=.  Every
+   tool response prefixes the resolved host/repo so cross-host
+   mistakes are visible.
+4. Writes via the general tool execute only after user
+   confirmation (=:confirm t= on the gptel tool registration).
+   Unknown classifications fail closed (=:confirm t=).
+   Irreversible commands hard-block.
+5. Interactive commands (=auth login=, =codespace ssh=, =--web=,
+   =browse=, =pr checkout=, =repo clone=, =config set=,
+   =alias set=, =extension exec=) are hard-blocked in V1.
+6. File exfiltration paths (=api --input=, =api -F key=@file=,
+   =release upload=, =gist create=) are hard-blocked in V1.
+7. Output is capped *during capture*, not after, so a runaway
+   =--paginate= or =--log= can't fill memory or block Emacs.
+   Long-output commands have higher timeouts but the same byte
+   cap.
+8. Every call produces a structured debug record (host, repo, cwd,
+   sanitized argv, classification, policy decision, exit code,
+   duration, bytes captured, truncation flag, error kind) inspectable
+   via =cj/gh-tool-last-error=.
+9. =cj/gh-doctor= diagnoses missing prerequisites (=gh=
+   executable, version floor, auth per host, cwd repo detection,
+   environment overrides) before they fail at runtime.
+10. V2 adds true per-host policy (currently uniform across both
+    hosts), profile-based tool subsets, and bridge helpers between
+    local git tools and the gh wrappers.
+
+* Non-Goals
+
+- *Interactive subcommands.*  =gh auth login=, =gh codespace ssh=,
+  =gh pr checkout= (modifies working tree), =gh repo clone=
+  (writes outside controlled paths) are not in scope.  The user
+  runs those in a real terminal.
+- *Replicating gh extensions.*  Core gh only.  Extensions can be
+  invoked from a regular terminal if needed.
+- *Streaming long-running output.*  =gh run watch=, =gh
+  --paginate= for unbounded result sets, =gh repo clone= for large
+  repos.  V1 blocks =--paginate= and =watch= verbs; future
+  versions may add streaming support.
+- *Per-host write policies* (work-read-only / personal-full-write).
+  V1 applies the same write-confirmation policy to both hosts.
+  V2 makes the policy host-keyed.
+- *Conversation-context injection* and *local/remote bridge
+  helpers* (current branch → PR, changed files → PR comments).
+  Future enhancements; tracked as a follow-up.
+- *Artifact download* (=gh run download=).  Writes files; needs
+  its own confirmation flow.  Deferred to V2.
+
+* Verified gh CLI Contracts
+
+These were checked against =gh --version 2.92.0= help output before
+the spec was revised.  Implementation can rely on them.
+
+** Host / repo selection per subcommand
+
+- =gh api= and =gh auth status=: use =--hostname HOST= (long form
+  only; no short form).
+- =gh pr {view,list,diff,checks}=, =gh issue {view,list}=,
+  =gh run {view,list}=, =gh release *=, =gh search *=: use
+  =--repo [HOST/]OWNER/REPO= (short =-R=).
+- =gh repo view= and similar root-level commands: infer from cwd
+  remote; no =--hostname= or =--repo= flag.
+- Environment variables: =GH_HOST= and =GH_REPO= work for commands
+  that lack explicit flags.
+
+** Flags with overloaded short forms
+
+- =gh pr list -H FOO= → =--head FOO= (branch).
+- =gh search prs -H FOO= → =--head FOO= (branch).
+- =gh api -H KEY:VAL= → =--header KEY:VAL=.
+
+Substring scanning =-H= as "hostname" is wrong on every command
+that uses it.
+
+** =gh api= method classification
+
+The help text explicitly says: "adding request parameters will
+automatically switch the method to POST".  This means:
+
+- =gh api repos/x/y/issues= → GET (read).
+- =gh api repos/x/y/issues -f title=Foo= → POST (write).
+- =gh api repos/x/y/issues -F body=@file= → POST (write +
+  exfiltration).
+- =gh api --method GET repos/x/y/issues -f q=x= → GET (read,
+  explicit override).
+
+Classifier must inspect for =-f=, =-F=, =--field=, =--raw-field=,
+=--input=, and =--method= before defaulting to GET.
+
+** Wrappers without =--json= support
+
+- =gh pr diff= produces patch output; no =--json= flag.  Wrapper
+  is text-only.  Structured file metadata available via
+  =gh pr view --json files= (new wrapper =gh_pr_files=).
+
+** Environment variables for non-interactive use
+
+All confirmed via =gh help environment=:
+
+- =GH_PROMPT_DISABLED= -- disable interactive prompting.
+- =GH_PAGER= / =PAGER= -- set to =cat= so no pager fires.
+- =GH_NO_UPDATE_NOTIFIER= -- silence the "new version" banner.
+- =GH_NO_EXTENSION_UPDATE_NOTIFIER= -- same for extensions.
+- =GH_SPINNER_DISABLED= -- silence the progress spinner.
+- =NO_COLOR= -- disable ANSI color (cleaner LLM context).
+
+V1 subprocess env always includes all of the above.
+
+** Authentication
+
+#+begin_example
+✓ Logged in to github.com account cjennings (keyring)
+✓ Logged in to deepsat.ghe.com account craig-jennings (keyring)
+#+end_example
+
+Both stored in the system keyring.  =gh= reads the keyring per
+call; the wrapper does not handle tokens.  Expired auth surfaces
+as exit code 4 with a clear stderr message; recovery is interactive
+(=gh auth login --hostname HOST= in a terminal).
+
+* Current State
+
+** =modules/ai-config.el=
+
+- =gptel-confirm-tool-calls nil= at line 386.  Per-tool
+  =:confirm t= is the only confirmation mechanism in V1.
+- =cj/gptel-load-local-tools= (lines 71-96) loads tools from
+  =gptel-tools/=.  Add new gh feature names to
+  =cj/gptel-local-tool-features=.
+
+** =gptel-tools/=
+
+Ten tools today.  =git_log.el= is the closest analogue: validates
+input, runs subprocess with =process-file=, caps output, signals
+clear errors.  The gh wrappers follow the same shape with shared
+helpers in =gh-common.el=.
+
+* Design
+
+** File layout
+
+Two files plus per-wrapper registrations.  Wrappers are
+deliberately small (argv builders + JSON-field defaults) so they
+all live in one file:
+
+- =gptel-tools/gh-common.el= -- shared helpers: host/repo context,
+  argv runner, subprocess env, classifier, blocklist, redaction,
+  debug record, last-error buffer.  No tool registrations.
+- =gptel-tools/gh.el= -- the general tool + every wrapper
+  registration.  Pulls in =gh-common= via =require=.
+
+This matches Craig's preference for low loader surface (per the
+MCP revision discussion).  If =gh.el= grows past ~600 lines we
+split per-resource later (=gh-pr.el=, =gh-issue.el=, etc.).
+
+** Host / repo resolution
+
+A single helper returns a structured context object used by every
+tool entry point:
+
+#+begin_src emacs-lisp
+(defun cj/gh--resolve-context (cwd hostname repo)
+  "Return a context plist for a gh invocation.
+Resolution order:
+1. Explicit REPO argument (`[HOST/]OWNER/REPO').  If it contains
+   `HOST/', split off the host.
+2. Explicit HOSTNAME argument.
+3. CWD's branch upstream remote URL (`git rev-parse --abbrev-ref
+   @{u}' then `git remote get-url REMOTE').
+4. CWD's `origin' remote URL.
+5. `cj/gh-default-host'.
+
+Returns:
+  (:host HOST :owner OWNER :repo REPO :source SYM)
+where SOURCE is one of: explicit-repo, explicit-host, upstream,
+origin, default."
+  ...)
+#+end_src
+
+URL parsing handles all three forms produced by git:
+
+- =git@HOST:OWNER/REPO.git=
+- =ssh://git@HOST/OWNER/REPO.git=
+- =https://HOST/OWNER/REPO.git=
+
+Returns =nil host= if the URL doesn't match a known host; the
+caller uses =cj/gh-default-host= in that case.
+
+#+begin_src emacs-lisp
+(defcustom cj/gh-default-host "github.com"
+  "Host used when no other resolution path yields one."
+  :type 'string
+  :group 'cj)
+
+(defcustom cj/gh-known-hosts '("github.com" "deepsat.ghe.com")
+  "Hosts the gh CLI is authenticated against."
+  :type '(repeat string)
+  :group 'cj)
+#+end_src
+
+Context appears in every tool response as a one-line prefix:
+
+#+begin_example
+[gh github.com/cjennings/dotemacs read ok 12.4KB] ...
+[gh deepsat.ghe.com/org/repo write confirmed 0.8KB] ...
+[gh github.com api unknown blocked] ...
+#+end_example
+
+This makes cross-host accidents visible.
+
+** Argv-list builders, never command strings
+
+Every tool builds an explicit argv list and hands it to a shared
+runner.  Wrappers expose typed parameters that map directly into
+argv positions:
+
+#+begin_src emacs-lisp
+(defun cj/gh-pr-view--build-argv (context number include-comments fields)
+  "Return an argv list for `gh pr view NUMBER ...'."
+  (let ((repo-arg (format "%s/%s/%s"
+                          (plist-get context :host)
+                          (plist-get context :owner)
+                          (plist-get context :repo))))
+    (append (list "pr" "view" (number-to-string number)
+                  "--repo" repo-arg)
+            (when include-comments '("--comments"))
+            (when fields (list "--json" fields)))))
+#+end_src
+
+The general tool takes an =args= parameter typed as array-of-string
+(not a shell string) so the agent constructs argv directly:
+
+#+begin_src emacs-lisp
+;; agent passes: ["pr" "view" "171" "--repo" "deepsat.ghe.com/org/repo"]
+;; runner runs: gh pr view 171 --repo deepsat.ghe.com/org/repo
+#+end_src
+
+If the agent accidentally includes leading =gh= in args[0], the
+runner strips it and logs the normalization.
+
+** Subprocess contract
+
+One runner for everything:
+
+#+begin_src emacs-lisp
+(defun cj/gh--run (argv &key timeout context)
+  "Run gh with ARGV (a list of strings).
+Returns a plist:
+  (:exit-code N :stdout STR :stderr STR :truncated BOOL :duration-ms N
+   :argv ARGV :context CONTEXT)
+Enforces TIMEOUT (default `cj/gh--default-timeout').
+Captures output in-flight, killing the process at
+`cj/gh--max-bytes' to prevent runaway memory use."
+  ...)
+#+end_src
+
+Contract:
+
+- *Env vars set every call:* =GH_PROMPT_DISABLED=1=, =GH_PAGER=cat=,
+  =PAGER=cat=, =NO_COLOR=1=, =GH_NO_UPDATE_NOTIFIER=1=,
+  =GH_NO_EXTENSION_UPDATE_NOTIFIER=1=, =GH_SPINNER_DISABLED=1=.
+- *Timeout:* default 20 s for reads, 60 s for diffs/logs/search.
+  Per-tool override allowed but capped at 120 s.  Timeout kills
+  the process and returns =:error-kind 'timeout=.
+- *Output cap during capture:* a process filter accumulates bytes
+  up to =cj/gh--max-bytes= (64 KB).  At the cap, the filter sets
+  =truncated=, ignores further output, and sends SIGTERM after a
+  short grace.  Truncation marker appended to returned stdout.
+- *TRAMP rejection:* if =(file-remote-p default-directory)= is
+  non-nil, return =:error-kind 'remote-cwd= without invoking gh.
+- *Executable check:* if =(executable-find cj/gh--executable)= is
+  nil, return =:error-kind 'no-executable= with install
+  instructions.
+- *Version check:* the first call per session verifies
+  =gh --version= meets =cj/gh--min-version= (default "2.50.0");
+  cached for the session.
+
+** Read / write / destructive classifier
+
+Used to apply =:confirm t= and the irreversible blocklist:
+
+#+begin_src emacs-lisp
+(defconst cj/gh--read-verbs
+  '("view" "list" "status" "search" "diff" "checks" "describe"
+    "show" "logs" "auth-status"))
+
+(defconst cj/gh--write-verbs
+  '("create" "edit" "merge" "close" "reopen" "comment" "review"
+    "upload" "set" "add" "remove" "rerun" "cancel" "delete"
+    "fork" "archive" "unarchive" "lock" "unlock" "pin" "unpin"
+    "ready" "draft" "rename" "transfer" "approve" "label" "assign"))
+
+(defconst cj/gh--blocked-verbs
+  '(;; interactive / opens UI
+    "login" "logout" "checkout" "clone" "ssh" "code" "edit-prompt"
+    ;; modifies user config
+    "alias" "config" "extension"
+    ;; opens browser
+    "browse"))
+
+(defconst cj/gh--blocked-flags
+  '("--web"          ; many commands
+    "--paginate"     ; can produce unbounded output
+    "--input"        ; gh api file upload
+    "--editor"))     ; opens editor
+
+(defconst cj/gh--irreversible-patterns
+  '("\\`repo delete\\b"
+    "\\`release delete\\b"
+    "\\`secret delete\\b"
+    "\\`ssh-key delete\\b"
+    "\\`gpg-key delete\\b"
+    "\\`org delete\\b"
+    "\\`project delete\\b"
+    "\\`variable delete\\b"
+    "\\`ruleset delete\\b"
+    "\\`label delete\\b.*--yes"))
+#+end_src
+
+Classifier:
+
+#+begin_src emacs-lisp
+(defun cj/gh--classify (argv)
+  "Return one of: read, write, destructive, blocked, unknown."
+  (let* ((stripped (cj/gh--strip-flags argv))
+         (resource (car stripped))
+         (verb (cadr stripped)))
+    (cond
+     ;; Hard blocks first.
+     ((cj/gh--has-blocked-verb-p argv) 'blocked)
+     ((cj/gh--has-blocked-flag-p argv) 'blocked)
+     ((cj/gh--has-file-arg-p argv) 'blocked)  ; -F key=@file
+     ((cj/gh--matches-irreversible-p argv) 'destructive)
+     ;; gh api is special.
+     ((string= resource "api") (cj/gh--classify-api argv))
+     ;; Verb match.
+     ((member verb cj/gh--read-verbs) 'read)
+     ((member verb cj/gh--write-verbs) 'write)
+     (t 'unknown))))
+
+(defun cj/gh--classify-api (argv)
+  "Classify a `gh api ...' invocation.
+Reads: explicit method GET/HEAD with no -f/-F/--input.
+Writes: any -f/-F/--field/--raw-field/--input, OR explicit
+non-GET/HEAD method.
+Default (no method, no field): read (matches gh's GET default)."
+  (let* ((explicit-method (or (cadr (member "-X" argv))
+                              (cadr (member "--method" argv))))
+         (has-field (cl-some
+                     (lambda (f) (cl-some (lambda (a) (string-prefix-p f a))
+                                          argv))
+                     '("-f" "--raw-field" "-F" "--field")))
+         (has-input (member "--input" argv)))
+    (cond
+     (has-input 'blocked)                ; file exfiltration
+     ((and explicit-method
+           (not (member (upcase explicit-method) '("GET" "HEAD"))))
+      'write)
+     (has-field 'write)                   ; -f/-F auto-promotes to POST
+     ((and explicit-method
+           (member (upcase explicit-method) '("GET" "HEAD")))
+      'read)
+     (t 'read))))
+#+end_src
+
+Classifier-driven policy table:
+
+| Classification | Policy |
+|----------------+--------|
+| =read= | auto-execute |
+| =write= | =:confirm t= on registration; agent's call shows in confirm prompt |
+| =destructive= | hard-block; return =:error-kind 'irreversible-blocked= |
+| =blocked= | hard-block; return =:error-kind 'policy-blocked= with reason |
+| =unknown= | =:confirm t= (fail closed) |
+
+** Safety policy
+
+V1 uniform policy applied to both hosts:
+
+#+begin_src emacs-lisp
+(defcustom cj/gh-policy
+  '((read . auto)
+    (write . confirm)
+    (destructive . block)
+    (blocked . block)
+    (unknown . confirm))
+  "Per-classification policy.  V1 applies uniformly to both hosts."
+  :type '(alist :key-type symbol :value-type symbol)
+  :group 'cj)
+#+end_src
+
+Since GPTel's =:confirm t= flag is per-tool-registration (not
+per-call), the general tool is registered with =:confirm t=
+always.  The wrappers are registered per their classification:
+
+| Tool | Registered with |
+|------+-----------------|
+| Read wrappers (=gh_pr_view=, etc.) | =:confirm nil= |
+| =gh= general tool | =:confirm t= |
+
+The general tool then applies the policy table at invocation
+time: reads execute without further prompting (already past
+GPTel's confirm because :confirm t fires once); writes show
+detail before invocation; destructive/blocked never reach gh.
+
+** General tool: =gh=
+
+Single tool registered with =:confirm t= covering everything the
+wrappers don't.  Schema:
+
+| Arg | Type | Required | Purpose |
+|-----+------+----------+---------|
+| =args= | array of string | yes | argv list, e.g. =["pr", "view", "171", "--repo", "deepsat.ghe.com/org/repo"]= |
+| =hostname= | string | no | Override host for commands that accept =--hostname= (=api=, =auth status=); ignored otherwise |
+| =repo= | string | no | =[HOST/]OWNER/REPO= for repo-scoped commands; if HOST is present in repo, hostname arg is overridden |
+| =cwd= | string | no | Working directory; defaults to current buffer; must be under =$HOME=, must not be TRAMP |
+| =timeout= | integer | no | Seconds before kill; default 20, max 120 |
+
+Description (registered) explicitly says:
+
+#+begin_example
+Use this only when no task-specific gh_* tool fits.  Prefer
+gh_pr_view, gh_pr_list, gh_pr_checks, gh_issue_view, gh_issue_list,
+gh_run_view, gh_run_list, gh_run_logs_failed, gh_repo_view,
+gh_search_prs, gh_search_issues, gh_api_get.
+
+Writes (create/edit/merge/etc.) require user confirmation.
+Destructive (repo/release/secret delete) are hard-blocked.
+Interactive commands (auth login, codespace ssh, --web, browse)
+are hard-blocked.  File uploads (api --input, -F @file, release
+upload, gist create) are hard-blocked.
+#+end_example
+
+** Wrapper inventory
+
+Twelve wrappers grouped by resource.  Each has typed args, JSON
+field defaults, output truncation, timeout override, and a
+description that names its scope.
+
+| Tool | gh command | Defaults | Timeout |
+|------+------------+----------+---------|
+| =gh_repo_view= | =repo view --json= | JSON fields: =name,nameWithOwner,description,defaultBranchRef,url,visibility= | 20s |
+| =gh_pr_view= | =pr view N --json= | JSON fields: =number,title,state,author,createdAt,url,body= (body truncated) | 20s |
+| =gh_pr_list= | =pr list --json= | JSON fields: =number,title,state,author,createdAt,headRefName=; default =--limit 30= (capped at 100) | 20s |
+| =gh_pr_diff= | =pr diff N --color never= | text only; capped at 64 KB | 60s |
+| =gh_pr_checks= | =pr checks N --json= | JSON fields: =name,status,conclusion,startedAt,completedAt,link= | 20s |
+| =gh_pr_files= | =pr view N --json files= | JSON fields: =files= (path, additions, deletions, mode) | 20s |
+| =gh_pr_current= | =pr view --json= (no number — auto-detect) | Same as =gh_pr_view= | 20s |
+| =gh_issue_view= | =issue view N --json= | JSON fields: =number,title,state,author,createdAt,url,body= (body truncated) | 20s |
+| =gh_issue_list= | =issue list --json= | JSON fields: =number,title,state,author,createdAt,labels=; default =--limit 30= | 20s |
+| =gh_run_view= | =run view RUN-ID --json= | JSON fields: =databaseId,name,status,conclusion,startedAt,headBranch,event,url= | 20s |
+| =gh_run_list= | =run list --json= | JSON fields: =databaseId,name,status,conclusion,startedAt,headBranch=; default =--limit 20= | 20s |
+| =gh_run_logs_failed= | =run view RUN-ID --log-failed= | text only; capped at 64 KB | 60s |
+| =gh_search_prs= | =search prs --json= | JSON fields: =number,title,state,author,repository,url=; default =--limit 30= (capped at 100) | 30s |
+| =gh_search_issues= | =search issues --json= | Same as PR search | 30s |
+| =gh_api_get= | =api ENDPOINT --method GET= | text/JSON pass-through; rejects fields/input args | 30s |
+
+Common args for every wrapper (unless noted):
+
+| Arg | Type | Required | Purpose |
+|-----+------+----------+---------|
+| =repo= | string | no | =[HOST/]OWNER/REPO=; resolved from context otherwise |
+| =hostname= | string | no | Override host for context resolution |
+| =limit= | integer | no | =--limit= for list/search wrappers; clamped to per-wrapper max |
+
+The =gh_api_get= wrapper *explicitly* rejects =-f=, =-F=,
+=--field=, =--raw-field=, =--input=, =--method=, and any method
+override.  It only accepts =ENDPOINT= and optional =-H= headers
+that don't carry secrets (the runner redacts =Authorization:= and
+similar regardless).  Writes via API go through the general tool
+with confirmation.
+
+** JSON field defaults
+
+Per-wrapper defaults are minimal -- enough for the agent to decide
+whether to drill in, not so much that one call fills the context.
+
+For =gh_pr_view= specifically:
+
+- *Default* (no =fields= override): the small list above (number,
+  title, state, author, createdAt, url, body-truncated-to-2KB).
+- *Override*: agent passes =fields= as a comma-separated string;
+  wrapper validates against a per-resource allowlist (so the agent
+  can't request =reviews,comments,files= in one call to bypass the
+  cap).
+- *Include flags*: =include-body t/nil=, =include-comments t/nil=,
+  =include-reviews t/nil= as boolean args.  Each adds the
+  corresponding JSON field; agent opts in only when needed.
+
+** Output truncation
+
+Process filter pattern, not post-hoc cap:
+
+#+begin_src emacs-lisp
+(defun cj/gh--make-filter (state-var)
+  "Return a process filter that accumulates into STATE-VAR's :stdout,
+stops collecting after `cj/gh--max-bytes', sets :truncated, and
+sends SIGTERM to the process."
+  (lambda (proc output)
+    (let* ((state (symbol-value state-var))
+           (current (plist-get state :stdout))
+           (current-len (length current))
+           (remaining (- cj/gh--max-bytes current-len)))
+      (cond
+       ((<= remaining 0) nil)             ; already at cap
+       ((<= (length output) remaining)
+        (plist-put state :stdout (concat current output)))
+       (t
+        (plist-put state :stdout
+                   (concat current (substring output 0 remaining)))
+        (plist-put state :truncated t)
+        (ignore-errors (delete-process proc)))))))
+#+end_src
+
+Truncation marker appended before return:
+
+#+begin_example
+[truncated at 64KB; use --limit, narrower fields, or a specific
+wrapper to reduce output]
+#+end_example
+
+** Error classification + debug record
+
+Every call returns (and =cj/gh-tool-last-error= caches) a debug
+record:
+
+#+begin_src emacs-lisp
+(defun cj/gh--debug-record (argv context exit-code stdout stderr
+                            duration-ms truncated)
+  (list :host (plist-get context :host)
+        :repo (cj/gh--repo-arg context)
+        :cwd default-directory
+        :argv (cj/gh--redact-argv argv)
+        :classification (cj/gh--classify argv)
+        :policy (cj/gh--policy-decision argv)
+        :exit-code exit-code
+        :duration-ms duration-ms
+        :bytes-captured (length stdout)
+        :truncated truncated
+        :error-kind (cj/gh--error-kind exit-code stderr)))
+#+end_src
+
+=:error-kind= mapping:
+
+| Condition | =:error-kind= | Returned message |
+|-----------+---------------+------------------|
+| Exit 4 + "authentication required" | =auth= | "Run =gh auth login --hostname HOST= in a terminal." |
+| Process killed by timeout timer | =timeout= | "Command exceeded N seconds; narrow the query or use a more specific wrapper." |
+| Policy block | =policy-blocked= | "Blocked by V1 policy: REASON." |
+| Irreversible match | =irreversible-blocked= | "Hard-blocked irreversible command: COMMAND." |
+| Truncated output | =truncated= | "Output truncated at 64KB; reduce scope." |
+| TRAMP cwd | =remote-cwd= | "Cannot run gh from remote directory: CWD." |
+| Missing executable | =no-executable= | "gh not found at CJ/GH--EXECUTABLE; install via 'pacman -S github-cli' (or equivalent)." |
+| Other non-zero exit | =gh-exit= | (raw stderr, redacted) |
+
+Each error includes a sanitized reproduce line:
+
+#+begin_example
+Reproduce: GH_PROMPT_DISABLED=1 GH_PAGER=cat gh pr view 171 --repo HOST/OWNER/REPO
+#+end_example
+
+(Secrets, body text, file paths redacted via =cj/gh--redact-argv=.)
+
+** Audit log (V1, opt-out)
+
+Every call appends one line to
+=~/.emacs.d/data/gh-tool-log/YYYY-MM-DD.log=:
+
+#+begin_example
+2026-05-16T14:23:45-0500 host=github.com repo=cjennings/dotemacs class=read policy=auto exit=0 duration=128ms bytes=4321
+2026-05-16T14:24:02-0500 host=deepsat.ghe.com repo=org/repo class=write policy=confirm exit=0 duration=412ms bytes=88
+#+end_example
+
+Metadata only, not output bodies.  Defcustom
+=cj/gh-tool-audit-log-enabled= (default =t=).  Daily rotation
+implicit (one file per day).  Cleanup manual.
+
+** Secrets redaction
+
+=cj/gh--redact-argv= masks:
+
+- Anything after =--token=, =--secret=, =--password= flags.
+- Authorization headers (=-H "Authorization: ..."=).
+- =--figma-api-key=KEY= (in case general tool spawns figma-mcp
+  somehow).
+- Bearer tokens in URLs (=?token=...=).
+- Values for =-f=/=-F= keys named like =body=, =text=,
+  =description= (private content; metadata still logged).
+
+Applied to:
+- All stderr returned to the agent.
+- All audit-log lines.
+- All debug records.
+- The reproduce line on error.
+
+* Commands & UX
+
+** =cj/gh-doctor=
+
+Diagnostic command.  No side effects.  Checks:
+
+- =gh= executable found at =cj/gh--executable=.
+- =gh --version= meets =cj/gh--min-version=.
+- =gh auth status= for each host in =cj/gh-known-hosts=.
+- Current buffer's cwd repo detection: resolves to which host/repo?
+- Environment overrides effective (=GH_PROMPT_DISABLED= etc. would
+  be set by the runner).
+- Active account per host.
+- Warnings if any block-list-relevant env is set externally
+  (e.g. user already has =GH_PAGER= set to something that pages).
+
+Output: a buffer with PASS / FAIL / WARN per check + recovery
+actions for failures.
+
+** =cj/gh-tool-last-error=
+
+Opens a buffer showing the last call's debug record:
+
+#+begin_example
+Host:           deepsat.ghe.com
+Repo:           org/repo
+Source:         upstream
+CWD:            ~/projects/work/foo
+Argv:           ("pr" "view" "171" "--repo" "deepsat.ghe.com/org/repo")
+Classification: read
+Policy:         auto
+Exit code:      0
+Duration:       412 ms
+Bytes captured: 4321
+Truncated:      no
+Error kind:     none
+
+Reproduce:
+  GH_PROMPT_DISABLED=1 GH_PAGER=cat NO_COLOR=1 \
+    gh pr view 171 --repo deepsat.ghe.com/org/repo
+#+end_example
+
+** Tool response header
+
+Every tool result begins with a one-line header so cross-host /
+policy decisions are visible:
+
+#+begin_example
+[gh github.com/cjennings/dotemacs read ok 4.3KB]
+{ ... }
+[gh deepsat.ghe.com/org/repo write confirmed 0.2KB]
+{ ... }
+[gh github.com/cjennings/dotemacs api blocked policy-blocked]
+Error: Hard-blocked file-upload path: --input file.
+Reproduce: gh api repos/cjennings/dotemacs/contents/foo --input file
+#+end_example
+
+* Implementation Plan
+
+Eight phases.  Each ends with green ERT tests + manual smoke
+before the next.
+
+** Phase 1 -- Common helpers + context resolver
+
+=gh-common.el=: =cj/gh--executable=, =cj/gh--available-p=,
+=cj/gh--version= (cached), =cj/gh--validate-cwd= (HOME + non-TRAMP),
+=cj/gh--parse-remote-url=, =cj/gh--resolve-context=,
+=cj/gh--redact-argv=.  No subprocess execution yet.
+
+Tests cover all helpers against fixture remote URLs and synthetic
+git directories.  No real gh calls.
+
+** Phase 2 -- Runner with subprocess env + timeout + in-flight cap
+
+=cj/gh--run=, the process filter, the timer kill path, env-var
+setup.  Tests stub =make-process= to simulate output / exit / hang
+/ truncation paths.
+
+Acceptance: with stub configured to produce 100 KB of output,
+returned stdout is exactly 64 KB plus the truncation marker, and
+the process gets SIGTERM.
+
+** Phase 3 -- Classifier + blocklist + policy
+
+=cj/gh--classify=, =cj/gh--classify-api=, blocklist constants,
+=cj/gh--policy-decision=.  Tests for every blocklist pattern
+(verbs + flags + file-arg paths), API edge cases (=-f= promotes
+to POST, =--method GET -f x=y= stays GET, etc.), and the
+unknown-fails-closed contract.
+
+** Phase 4 -- Read wrappers (5 first)
+
+=gh_repo_view=, =gh_pr_view=, =gh_pr_list=, =gh_pr_diff=,
+=gh_issue_view=.  Each is a thin schema + argv builder + delegate
+to =cj/gh--run=.  Tests verify argv shape for typical args.
+
+Manual smoke against both hosts.  First real gh calls.
+
+** Phase 5 -- Remaining wrappers + JSON defaults
+
+Eight more wrappers (=gh_pr_checks=, =gh_pr_files=,
+=gh_pr_current=, =gh_issue_list=, =gh_run_view=, =gh_run_list=,
+=gh_run_logs_failed=, =gh_search_prs=, =gh_search_issues=,
+=gh_api_get=).  JSON-field defaults per wrapper.  Tests for the
+=gh_api_get= flag-rejection contract.
+
+** Phase 6 -- General tool
+
+=gh.el= general tool registration with =:confirm t=, blocklist
+enforcement, policy decision applied before invocation.  Tests
+verify confirmation gate (stub gptel's confirm flow), blocked
+commands never reach =cj/gh--run=, destructive commands return
+=:irreversible-blocked= without prompting.
+
+** Phase 7 -- UX: doctor, last-error, response header
+
+=cj/gh-doctor=, =cj/gh-tool-last-error=, response header
+formatting.  Audit log writer.  Defcustom for log enable.
+
+** Phase 8 -- Loader wiring + integration
+
+Add the 16 feature names to =cj/gptel-local-tool-features= (one
+per wrapper + the general tool + the helpers feature).  Verify
+they land in =gptel-tools=.
+
+* Test Plan
+
+Target: 55-70 ERT tests across four files.  No real subprocesses,
+no real network, no real =~/.claude.json= (gh tools don't use it,
+but the no-real-process rule applies uniformly).
+
+** =tests/test-gh-common.el= -- pure helpers (~25 tests)
+
+- =cj/gh--parse-remote-url=: ssh-scp, ssh-url, https, with/without
+  =.git=, with/without trailing slash, unknown host returns =:host
+  nil=, =github.com.evil.example= does NOT match =github.com=.
+- =cj/gh--resolve-context=: explicit repo wins; explicit hostname
+  wins over remote; upstream beats origin; origin beats default;
+  default fires when no git.
+- =cj/gh--validate-cwd=: HOME-rooted ok; outside HOME errors;
+  TRAMP errors; non-directory errors.
+- =cj/gh--redact-argv=: =--token=, =-H "Authorization: ..."=,
+  =--figma-api-key=, =-f body=...= → body redacted; sentinel
+  =REDACTED_TEST_SECRET= never appears in any output of any
+  helper.
+- =cj/gh--available-p=: nil when =executable-find= fails; t
+  otherwise.
+- =cj/gh--version=: caches per session; floor check rejects
+  too-old.
+
+** =tests/test-gh-runner.el= -- runner contract (~15 tests)
+
+Stub =make-process=:
+- Normal exit 0 with short output: returned verbatim, no
+  truncation flag.
+- Long output (100 KB stub): truncated at 64 KB exactly,
+  truncation marker present, =:truncated t=.
+- Process hangs past timeout: timer fires, SIGTERM sent, returns
+  =:error-kind 'timeout=.
+- Exit 4 + auth stderr: returns =:error-kind 'auth= with recovery
+  message.
+- TRAMP cwd: never invokes =make-process=, returns =:error-kind
+  'remote-cwd=.
+- Missing executable: returns =:error-kind 'no-executable=.
+- Environment: =process-environment= includes all six required
+  vars before =make-process= call.
+- Argv with leading "gh" stripped + logged.
+
+** =tests/test-gh-classifier.el= -- policy logic (~20 tests)
+
+- Every read verb classifies as read.
+- Every write verb classifies as write.
+- Every destructive pattern matches.
+- Every blocked verb (=login=, =browse=, =clone=, etc.) classifies
+  as blocked.
+- Every blocked flag (=--web=, =--paginate=, =--input=) classifies
+  as blocked.
+- File-upload (=-F key=@/path=) classifies as blocked.
+- =gh api= GET (no fields): read.
+- =gh api= GET (=-f x=y=): write (auto-POST per gh's rules).
+- =gh api --method GET -f q=x=: read (explicit override).
+- =gh api -X DELETE=: destructive.
+- =gh api -X PATCH=: write.
+- =gh api --input file=: blocked.
+- Unknown verb (=gh frobnicate=): unknown → confirm.
+- =-H= as branch (=gh pr list -H feature=) doesn't trigger host
+  treatment.
+- =-H= as header (=gh api -H Accept:json=) doesn't trigger host
+  treatment.
+- Policy decision: read → auto; write → confirm; destructive →
+  block; blocked → block; unknown → confirm.
+
+** =tests/test-gh-wrappers.el= -- per-wrapper builders + schemas (~15 tests)
+
+- Every wrapper's schema is valid (correct =:name=, =:type=,
+  =:description=, =:args= shape).
+- =gh_pr_view= argv with number 171 and repo "host/o/r" produces
+  =("pr" "view" "171" "--repo" "host/o/r" "--json" "DEFAULTS")=.
+- =gh_pr_diff= rejects =format='json=.
+- =gh_api_get= rejects =-f=, =-F=, =--field=, =--raw-field=,
+  =--input=, =--method= other than GET.
+- =gh_pr_current= invokes without a number arg (uses cwd).
+- =limit= clamped to per-wrapper max.
+- =fields= validated against per-resource allowlist.
+
+** Manual smoke (every phase)
+
+| Phase | Smoke |
+|-------+-------|
+| 4 | =gh_pr_view N= against both hosts |
+| 4 | =gh_pr_list= in =~/.emacs.d= → uses github.com/cjennings/dotemacs |
+| 5 | =gh_pr_checks= shows CI status without full logs |
+| 5 | =gh_run_logs_failed= cap kicks in on a long-failed run |
+| 5 | =gh_api_get= rejects a =-f= arg with clear error |
+| 6 | =gh= general tool: agent asked to merge a PR triggers GPTel confirm prompt |
+| 6 | Agent asked to =gh repo delete= gets irreversible-blocked |
+| 6 | Agent asked to =gh --web=... gets policy-blocked |
+| 7 | =cj/gh-doctor= correctly identifies an unauthenticated host |
+| 7 | =cj/gh-tool-last-error= shows debug record after a failing call |
+
+** Opt-in integration suite
+
+A small set of real-gh tests (in =tests/test-gh-integration.el=
+marked =:tag :integration=, default skipped):
+
+- =gh auth status --hostname github.com= ok.
+- =gh auth status --hostname deepsat.ghe.com= ok.
+- =gh repo view cjennings/dotemacs --json name=
+  returns parseable JSON.
+- =gh pr list --repo cjennings/dotemacs --limit 1= returns ≤ 1
+  PR.
+
+Run manually via =make test-name TEST=gh-integration=.
+
+* Acceptance Criteria
+
+1. *Argv contract.* No tool produces a command string for execution;
+   every call goes through the argv-list runner.
+2. *No silent writes.* Every classified write either prompts
+   GPTel's confirm or hard-blocks.  Verified by an end-to-end
+   test where the agent attempts =pr merge= and the test fails if
+   =make-process= is invoked before confirm.
+3. *In-flight cap.* A stubbed process emitting 1 MB returns
+   exactly 64 KB; the runner never holds more than 65 KB in
+   memory.
+4. *Host visibility.* Every successful tool response begins with
+   =[gh HOST/REPO ...]=.  Verified by a test that greps the
+   response text.
+5. *Doctor coverage.* =cj/gh-doctor= correctly identifies (a) no
+   gh executable, (b) too-old gh, (c) unauthenticated host, (d)
+   non-git cwd, (e) git cwd whose remote points to an unknown
+   host.
+6. *No secret leakage.* Test fixtures containing
+   =REDACTED_TEST_SECRET= in every secret-bearing slot
+   (=--token=, =-H Authorization=, =-f body=, etc.) produce zero
+   matches when grepping audit log, debug record, and any
+   user-facing message.
+
+* Risks
+
+** R1 -- gh CLI evolves and verbs drift
+
+New =gh= versions may add subcommands the blocklist doesn't
+cover.  Or rename verbs.
+
+*Mitigation:* the blocklist works on verbs (not full subcommand
+paths) so most additions are caught.  Doctor includes a
+=gh --version= floor.  Periodic review when gh bumps a major
+version.
+
+** R2 -- The "all reads auto-execute" default may still be too broad
+
+Some reads expose private content (issue bodies, PR descriptions
+from private repos).  An agent surfacing a confidential issue
+body into a saved conversation has data-leak implications.
+
+*Mitigation:* response header makes the host/repo visible in
+every result, so the saved conversation makes the privacy
+boundary auditable.  Wrappers truncate body/comments by default;
+agent must explicitly opt-in to include them.  Documented in
+commentary.
+
+** R3 -- The general tool's =:confirm t= prompt may become click-fatigue
+
+If the agent uses the general tool heavily during a write-heavy
+workflow (PR creation, label management), confirming every call
+becomes tedious.
+
+*Mitigation:* the expanded wrapper set covers most reads, so the
+general tool fires mainly for writes -- where confirmation is
+exactly the right behavior.  If usage shows confirm fatigue,
+V2's per-host policy can add =:auto= for explicit
+write-confirmed contexts.
+
+** R4 -- =--paginate= block conflicts with legitimate large queries
+
+Blocking =--paginate= globally means the agent can't get
+historical CI runs (which may need pagination).
+
+*Mitigation:* =gh_run_list= and =gh_search_*= accept a clamped
+=--limit= which usually substitutes.  If a use case needs more,
+the agent can request multiple non-paginated pages explicitly.
+
+** R5 -- Token expiry surfaces as cryptic exit 4
+
+When a host's keyring entry expires, every call returns exit 4
+with "authentication required" on stderr.  The agent sees the
+error but may not realize the fix is interactive.
+
+*Mitigation:* the runner's =:error-kind 'auth= mapping prepends
+the recovery message before returning to the agent.  =cj/gh-doctor=
+proactively checks auth status.
+
+** R6 -- TRAMP cwd silently runs gh remotely
+
+Without the explicit TRAMP rejection, =process-file= would try to
+spawn =gh= on the remote host (where it may not exist, or may
+authenticate against the wrong keyring).
+
+*Mitigation:* runner checks =(file-remote-p default-directory)=
+first and returns =:error-kind 'remote-cwd= with a clear message.
+
+** R7 -- =gh search= behaves differently on GHE
+
+GHE may not support every advanced search operator
+=github.com= does.  Search wrappers may return inconsistent
+results across hosts.
+
+*Mitigation:* documented in =gh_search_*= wrapper descriptions.
+Result header makes the host visible so the agent can adjust.
+
+** R8 -- Audit log grows unbounded
+
+One file per day, but no automatic cleanup.
+
+*Mitigation:* metadata-only entries are tiny (~150 bytes); a
+year of heavy use is a few MB.  Manual cleanup acceptable.
+Defcustom to disable for users who don't want it.
+
+* Open Questions
+
+** Q1 -- Should the general tool's confirmation prompt include the classification?
+
+When GPTel asks "Run gh tool? (y/n)" the prompt shows the argv but
+not the classification.  Showing "WRITE: gh pr merge 171" gives the
+user more context.  Need to investigate gptel's confirm-prompt
+extensibility.
+
+** Q2 -- Should =gh_pr_diff= cap differently from text wrappers?
+
+A PR diff can legitimately be 100KB+ for a large refactor.  The
+64KB cap is the same as everywhere else.  If diffs need a higher
+cap (256KB?), that's per-wrapper config.
+
+** Q3 -- Should wrappers expose =include-body=, =include-comments=, etc., as separate args, or as a comma-separated list?
+
+The spec proposes separate boolean args (=:include-body t=,
+=:include-comments t=).  Alternative: one =:include= comma-list
+arg.  Separate args are more discoverable; comma-list is more
+compact.  Decide during Phase 4.
+
+** Q4 -- Should =cj/gh-tool-audit-log= grow into a query interface?
+
+V1 writes one line per call.  Future: a command to query the
+log (=cj/gh-audit-search REGEX=) for surfacing "what did the agent
+do to this PR last week?".
+
+* V2 Roadmap
+
+Items intentionally deferred:
+
+- *Per-host policy.*  =cj/gh-host-policy= alist keyed by hostname
+  (mirror of the MCP spec's structure) so work GHE can be
+  read-only while personal allows writes-with-confirm.
+- *Conversation context injection.*  After a PR view, the wrapper
+  inserts a "GitHub context: HOST/REPO PR #N at URL" line into the
+  GPTel buffer so saved conversations stay traceable without
+  bundling full output.
+- *Local/remote bridge helpers.*  current-branch → PR-number,
+  changed-files → matching PR file comments, etc.
+- *Artifact download.*  =gh_run_artifacts=, =gh_release_download=
+  with explicit confirm and write to a controlled directory.
+- *Async general tool.*  =make-process= + sentinel for the cases
+  where 60s timeout isn't enough (rare, but real for some
+  =--paginate= scenarios).
+- *Audit log query interface.*  =cj/gh-audit-search=,
+  =cj/gh-audit-by-host=.
+- *Profile-based tool subsets.*  e.g. read-only profile
+  vs. write-capable profile per buffer.
+
+* References
+
+- [[file:../../gptel-tools/git_log.el][gptel-tools/git_log.el]] -- pattern reference for new tool files.
+- [[file:../../modules/ai-config.el][modules/ai-config.el]] -- =gptel-confirm-tool-calls nil= at
+  line 386; loader at lines 71-96.
+- [[file:gptel-agentic-tool-ideas.org][gptel-agentic-tool-ideas.org]] -- broader agentic-tool design;
+  =gh= sits alongside the MCP integration as the
+  collaboration tier.
+- [[file:mcp-el-gptel-integration.org][mcp-el-gptel-integration.org]] -- sibling design; same
+  confirm-on-write pattern for safety.
+- [[https://cli.github.com/manual/][gh CLI manual]] -- subcommand reference.
+- =gh --version 2.92.0= help output -- verified flag semantics
+  per subcommand.
+- =gh help environment= -- verified env-var names for non-interactive
+  mode.
diff --git a/docs/design/mcp-el-gptel-integration.org b/docs/design/mcp-el-gptel-integration.org
new file mode 100644
index 00000000..6bac77c5
--- /dev/null
+++ b/docs/design/mcp-el-gptel-integration.org
@@ -0,0 +1,1434 @@
+#+TITLE: Design: Wire mcp.el into GPTel for MCP server access
+#+AUTHOR: Craig Jennings
+#+DATE: 2026-05-16
+#+OPTIONS: toc:nil num:nil
+
+* Status
+
+Draft (revision 3).  Pre-implementation; no code shipped yet.  The
+mcp.el package is cloned at =~/code/mcp.el/= (fork of
+[[https://github.com/lizqwerscott/mcp.el][lizqwerscott/mcp.el]]) but not wired into the config.
+
+Revision 3 tightens seven contracts the revision-2 review flagged:
+
+1. *GPTel confirmation* -- =gptel-confirm-tool-calls= is =nil= at
+   =ai-config.el:386=, which short-circuits every per-tool
+   =:confirm= slot.  The integration flips it to =auto= as a hard
+   precondition.
+2. *Async timeout mechanics* -- replaced =with-timeout= (which
+   only supervises dynamic extent) with an explicit timer/callback
+   race for async tool calls.
+3. *Startup completion semantics* -- the hub's completion callback
+   is opportunistic, not authoritative; the stall timer + polling
+   =mcp-server-connections= is the source of truth.
+4. *Server identity at registration* -- walk
+   =mcp-server-connections= directly instead of parsing
+   =:category mcp-SERVER= out of =mcp-hub-get-all-tool=.
+5. *Server enablement* -- =cj/mcp-enabled-servers= defcustom lets
+   users disable a server without writing code.  Profiles still
+   deferred.
+6. *Keymap pinned* -- =C-; a C= (Connect) is the MCP subprefix.
+   =M= (=gptel-menu=) and =m= (=cj/gptel-change-model=) stay
+   where they are.
+7. *mcp.el private-API isolation* -- a compat layer wraps every
+   =mcp--*= call so version drift surfaces in one place.
+
+Plus several smaller changes: every MCP tool registers async,
+description normalization adds a server-name prefix and a write
+risk note, =cj/mcp-start-on-entry-points= defcustom scopes
+startup triggers (default: full chat only), TRAMP processes
+local-only, doctor gains live-auth-check, =cj/mcp-wait-until-ready=
+command added, audit buffer surfaces failed servers prominently.
+
+* Problem
+
+GPTel exposes ten local tools today (=read_buffer=, =read_text_file=,
+=write_text_file=, =update_text_file=, =list_directory_files=,
+=move_to_trash=, =git_status=, =git_log=, =git_diff=, =web_fetch= --
+see =gptel-tools/=).  Claude Code, by contrast, has access to nine
+external MCP servers (linear, notion, figma, slack-deepsat,
+google-calendar, google-docs-personal, google-docs-work, drawio,
+google-keep), each exposing 10-70 additional tools.
+
+The asymmetry means agentic work done in GPTel can't touch the same
+external systems Claude Code can.  Wiring [[https://github.com/lizqwerscott/mcp.el][mcp.el]] into the config
+closes the gap: GPTel gains access to every MCP server Claude Code
+uses, modulo three claude.ai-hosted servers whose OAuth is bound to
+the Claude.ai session (see Non-Goals).
+
+* Goals
+
+1. GPTel sees every tool from the enabled subset of nine reusable
+   MCP servers in =gptel-menu=, grouped by server via the tool's
+   =:category= field.
+2. Servers spawn *asynchronously*.  Opening GPTel never blocks on
+   MCP startup; tools arrive incrementally and =gptel-tools= updates
+   as each server reports its inventory.  =cj/toggle-gptel= must
+   return without waiting for any MCP subprocess.
+3. Write/destructive MCP tools are gated by a confirmation prompt
+   the user actually sees.  Two preconditions:
+   =gptel-confirm-tool-calls= is set to =auto= (so the per-tool
+   =:confirm= slot is honored), and write/destructive tools are
+   registered with =:confirm t=.  Read-only tools execute without
+   confirmation.
+4. Secrets stay in =~/.claude.json= (single source of truth, shared
+   with Claude Code).  The Emacs config reads env vars from there
+   at server-spawn time, with an mtime-aware cache.  Secrets are
+   never echoed to status, errors, hub buffers, or tests.
+5. A per-server status alist tracks each server's lifecycle (idle /
+   starting / ready / failed / stopped) and is inspectable via
+   =cj/mcp-status= and a =cj/mcp-list-tools= audit buffer.
+6. Server-management commands live under a =C-; a C= (Connect)
+   subprefix so existing GPTel keys (=C-; a M=, =m=) aren't
+   disturbed.
+7. A failed server (network down, OAuth token expired, npx package
+   404) is surfaced clearly via the OAuth-recovery pattern matcher
+   and does not block GPTel itself.  Successful servers' tools are
+   available immediately; failed servers' tools are absent (not
+   stale).
+8. The config can swap between MELPA mcp.el and the local
+   =~/code/mcp.el/= checkout with a one-line uncomment, gated by a
+   capability check that asserts required API functions exist.
+9. A first-run =cj/mcp-doctor= command diagnoses missing
+   prerequisites (=npx=, =uvx=, =~/.claude.json=, per-server
+   commands, known local endpoints) and optionally runs a
+   live-auth probe before they fail at runtime.
+
+* Non-Goals
+
+- The three claude.ai-hosted MCP servers (Gmail / Drive / Calendar
+  served from =*.googleapis.com/mcp/v1=).  Their OAuth is issued
+  by the Claude.ai session and is not transferable to GPTel.
+- *MCP resources and prompts.*  v1 registers tools only.
+  Resource browsing and prompt invocation are tracked as
+  follow-ups; the local checkout has the API surface ready.
+- *Per-conversation tool profiles.*  v1 ships
+  =cj/mcp-enabled-servers= for whole-server enable/disable;
+  profiles (different tool subsets per chat) wait for v1.5 once
+  usage shows whether they're needed.
+- *Auth-source migration.*  Deferred until the OAuth re-auth flow
+  for expiring tokens is understood.  Tracked in Open Questions
+  §Q3.
+- *Automated OAuth re-auth when tokens expire.*  Out of scope; the
+  user re-authenticates via Claude Code, and the next GPTel
+  invocation picks up the refreshed values from =~/.claude.json=.
+- *Modifying mcp.el itself in this repo.*  Upstream patches and
+  tests live in =~/code/mcp.el/= and ship via PRs to lizqwerscott's
+  master.
+
+* Verified API Contracts
+
+These were checked against the actual source before each revision.
+Behavior summarized here so implementation can rely on it.
+
+** GPTel confirmation gating (=gptel.el:2244=)
+
+The confirmation check is:
+
+#+begin_src emacs-lisp
+(if (and gptel-confirm-tool-calls
+         (or (eq gptel-confirm-tool-calls t)
+             (gptel-tool-confirm tool-spec)))
+    ;; ask user
+    ...)
+#+end_src
+
+When =gptel-confirm-tool-calls= is =nil=, the =(and ...)=
+short-circuits and the tool's =:confirm= slot is ignored.
+
+The defcustom default is ='auto=, which "seeks confirmation only
+when the corresponding tool spec has a non-nil :confirm slot"
+(=gptel.el:1601-1603=).  =ai-config.el:386= currently sets it to
+=nil=.
+
+*Implementation consequence:* =ai-mcp.el= must =setq
+gptel-confirm-tool-calls 'auto= as part of its setup (and
+=ai-config.el= drops the explicit =nil= setting).  Without this,
+write-gated tools register =:confirm t= and gptel ignores it.
+
+** mcp-hub callback ownership (=~/code/mcp.el/mcp-hub.el:53-90=)
+
+=mcp-hub--start-server= unconditionally appends its own six
+callbacks (=:initial-callback=, =:tools-callback=,
+=:prompts-callback=, =:resources-callback=,
+=:resources-templates-callback=, =:error-callback=) to whatever
+the caller passes.  Per-server custom callbacks in the alist
+result in duplicate keyword arguments to =mcp-connect-server= --
+behavior implementation-defined.
+
+*Implementation consequence:* the integration does not slip custom
+callbacks through =mcp-hub-servers=.  It uses
+=mcp-hub-start-all-server='s top-level completion callback as an
+opportunistic signal, walks =mcp-server-connections= directly for
+authoritative state, and uses a stall timer as the deadline.
+
+** mcp-hub-start-all-server completion semantics
+
+The hub's completion callback (=mcp-hub-start-all-server='s
+=CALLBACK= argument) fires when its internal counter reaches the
+total server count.  The counter increments:
+
+- On immediate Elisp errors from =mcp-hub--start-server=.
+- When the =:inited-callback= passed to =mcp-hub--start-server=
+  fires, which happens inside the hub's =:tools-callback=.
+
+Async error paths flow through =:error-callback= -- which the hub
+also installs but does *not* obviously chain into the inited
+callback.  Servers without tools may not pass through the tools
+callback in the same way.
+
+*Implementation consequence:* the callback is treated as an
+opportunistic readiness signal, *not* as "all initialized or
+failed".  The authoritative state comes from polling
+=mcp-server-connections= (each entry has =mcp--status= of
+=connected= / =error= / =starting=) and from the stall timer
+deadline.
+
+** gptel-make-tool registration semantics (=gptel.el:1729-1820=)
+
+=gptel-make-tool= registers the tool into =gptel--known-tools=
+keyed by category + name.  It does *not* add the tool to
+=gptel-tools= (the per-buffer active list).  The existing local
+tools (=gptel-tools/git_log.el:97=) explicitly do:
+
+#+begin_src emacs-lisp
+(gptel-make-tool ...)
+(add-to-list 'gptel-tools (gptel-get-tool '("category" "name")))
+#+end_src
+
+*Implementation consequence:* the registration pipeline does
+both calls per tool, tracks tool names per server in a hash, and
+deregisters cleanly on restart/stop without disturbing local
+tools.
+
+** MELPA vs local checkout
+
+The local checkout (=~/code/mcp.el/=, tip =f10768e=) has HTTP
+transport (=mcp-http-process-connection=) and recent UX
+improvements (resource reading, imenu support, detail mode).
+MELPA parity not yet verified.
+
+*Implementation consequence:* =cj/mcp--assert-capabilities=
+checks for required functions at load time and signals a clear
+=user-error= if missing.  Use-package block defaults to MELPA;
+the local-checkout =:load-path= line stays commented until the
+capability check tells us MELPA is missing something.
+
+* GPTel Confirmation Contract
+
+The single most consequential precondition for the safety story:
+
+** Current state
+
+=modules/ai-config.el:386= sets =gptel-confirm-tool-calls= to
+=nil=.  This was a deliberate "allow tool access by default"
+choice when only the ten local tools existed -- all of which are
+either read-only (git_log, git_status, list_directory_files, etc.)
+or already wrap their own confirm prompts (web_fetch uses
+=:confirm t= but is ignored under the current setting; this was
+acceptable because the only "real" tool there is on a user-typed
+URL).
+
+** Required state
+
+The MCP integration cannot ship without flipping this to =auto=.
+Specifically, =modules/ai-mcp.el= must:
+
+1. =setq gptel-confirm-tool-calls 'auto= during its load.
+2. Audit the existing local tools and add =:confirm t= to any
+   that should be gated.  =web_fetch= is the obvious candidate;
+   =write_text_file=, =update_text_file=, =move_to_trash= may
+   also warrant it depending on Craig's preference.
+
+The existing =ai-config.el:386= line is removed.  A comment
+points readers at =ai-mcp.el= for the new value.
+
+** Verification test
+
+A test in =tests/test-ai-mcp-confirm-contract.el= asserts:
+
+- After =ai-mcp= loads, =gptel-confirm-tool-calls= is ='auto=.
+- A write-classified MCP tool registered with =:confirm t= takes
+  the confirmation branch in =gptel-send='s tool-dispatch code
+  (verified by stubbing gptel's confirm-prompt and checking it
+  fires).
+- A read-classified MCP tool registered with =:confirm nil= does
+  not take the confirmation branch.
+- Local =git_log= (=:confirm nil=) still runs without prompting.
+
+* Current State
+
+** =modules/ai-config.el=
+
+- =use-package gptel= block at lines 363-414, defer-loaded on the
+  =gptel= / =gptel-send= / =gptel-menu= commands.
+- =gptel-confirm-tool-calls nil= at line 386.  Removed by this
+  integration; see § GPTel Confirmation Contract.
+- =cj/gptel-load-local-tools= (lines 71-96) loads the ten local
+  tools from =gptel-tools/=.
+- =cj/toggle-gptel= (lines 418-441) is the primary entry point
+  (=C-; a t=).  Other entry points: =cj/gptel-quick-ask=
+  (=C-; a q=), =gptel-magit-commit-generate= (=g= in magit),
+  =cj/gptel-rewrite-with-directive= (=C-; a r=), =gptel-send=
+  (=C-RET= in gptel buffer).
+- =cj/ai-keymap= (lines 510-528) currently uses keys A B M d . f
+  b l m p q r R c s t x.  =C= (uppercase) is free and becomes the
+  MCP subprefix.
+
+** =gptel-tools/=
+
+Ten =.el= files.  =git_log.el= is the closest analogue;
+=web_fetch.el= demonstrates the =:confirm t= pattern.
+
+** =~/.claude.json=
+
+Mode 0600, ~75 KB.  Top-level =mcpServers= key holds the nine
+servers we want.  Env-var names per server (values redacted):
+
+| Server               | Env vars                                              |
+|----------------------+-------------------------------------------------------|
+| google-calendar      | =GOOGLE_OAUTH_CREDENTIALS=                            |
+| google-docs-personal | =GOOGLE_CLIENT_ID=, =GOOGLE_CLIENT_SECRET=, =GOOGLE_MCP_PROFILE= |
+| google-docs-work     | Same three vars (different values)                    |
+| google-keep          | =GOOGLE_EMAIL=, =GOOGLE_MASTER_TOKEN=                 |
+
+* Design
+
+** Module split
+
+Implementation lives in =modules/ai-mcp.el=.  =modules/ai-config.el=
+gains only autoload declarations and the =C-; a C= subprefix
+wiring.
+
+** Code organization outline (=ai-mcp.el=)
+
+The file is organized in seven sections so it stays readable as
+features land:
+
+1. *Constants and defcustoms* -- =cj/mcp-server-specs=,
+   =cj/mcp-claude-config=, =cj/mcp-enabled-servers=,
+   =cj/mcp-start-on-entry-points=, =cj/mcp-startup-timeout=,
+   =cj/mcp-tool-timeout=, =cj/mcp-tool-confirm-overrides=,
+   audit-log defcustoms.
+2. *Public commands* -- =cj/mcp-ensure-started=, =cj/mcp-hub=,
+   =cj/mcp-status=, =cj/mcp-list-tools=,
+   =cj/mcp-restart-failed=, =cj/mcp-restart-server=,
+   =cj/mcp-stop-all=, =cj/mcp-doctor=,
+   =cj/mcp-wait-until-ready=.
+3. *Pure helpers* -- Claude config reader, =cj/mcp--build-server-alist=,
+   =cj/mcp--redact=, =cj/mcp--confirm-p=,
+   =cj/mcp--normalize-description=.
+4. *mcp.el compatibility layer* -- 3-5 wrappers around private
+   API (=mcp--status=, =mcp--tools=, etc.).  Single source of
+   version-drift risk.
+5. *Registration pipeline* -- =cj/mcp--register-tool=,
+   =cj/mcp--register-server-tools=,
+   =cj/mcp--deregister-server-tools=,
+   =cj/mcp--registered-tools= hash.
+6. *Async state machine* -- =cj/mcp--state=,
+   =cj/mcp--server-status=, =cj/mcp--on-all-started=,
+   =cj/mcp--stall-timer=, =cj/mcp--poll-status=.
+7. *UI* -- audit-buffer mode, doctor buffer, recovery-pattern
+   matcher, response prefixes.
+
+This explicit outline doubles as the file's table of contents in
+its commentary block.
+
+** Server inventory: data first
+
+The nine servers are described as a defconst of plists, with no
+secrets baked in:
+
+#+begin_src emacs-lisp
+(defconst cj/mcp-server-specs
+  '((:name "linear"
+     :transport http
+     :url "https://mcp.linear.app/mcp"
+     :auth in-protocol
+     :risk write-capable)
+    (:name "notion"
+     :transport http
+     :url "https://mcp.notion.com/mcp"
+     :auth in-protocol
+     :risk write-capable)
+    (:name "figma"
+     :transport stdio
+     :command "npx"
+     :args ("-y" "figma-developer-mcp" "--stdio")
+     :secret-args ("--figma-api-key" :figma-api-key)
+     :auth args-token
+     :risk arg-leak)
+    (:name "slack-deepsat"
+     :transport sse
+     :url "http://127.0.0.1:13080/sse"
+     :auth local
+     :risk write-capable)
+    (:name "drawio"
+     :transport stdio
+     :command "npx"
+     :args ("-y" "@drawio/mcp")
+     :auth none
+     :risk none)
+    (:name "google-calendar"
+     :transport stdio
+     :command "npx"
+     :args ("-y" "@cocal/google-calendar-mcp")
+     :env (:GOOGLE_OAUTH_CREDENTIALS t)
+     :auth oauth
+     :risk write-capable)
+    (:name "google-docs-personal"
+     :transport stdio
+     :command "npx"
+     :args ("-y" "@a-bonus/google-docs-mcp")
+     :env (:GOOGLE_CLIENT_ID t :GOOGLE_CLIENT_SECRET t :GOOGLE_MCP_PROFILE t)
+     :auth oauth
+     :risk write-capable)
+    (:name "google-docs-work"
+     :transport stdio
+     :command "npx"
+     :args ("-y" "@a-bonus/google-docs-mcp")
+     :env (:GOOGLE_CLIENT_ID t :GOOGLE_CLIENT_SECRET t :GOOGLE_MCP_PROFILE t)
+     :auth oauth
+     :risk write-capable)
+    (:name "google-keep"
+     :transport stdio
+     :command "uvx"
+     :args ("--from" "keep-mcp" "python" "-m" "server.cli")
+     :env (:GOOGLE_EMAIL t :GOOGLE_MASTER_TOKEN t)
+     :auth token
+     :risk write-capable)))
+#+end_src
+
+The same data drives the doctor check list, status labels, and
+recovery messages -- a single source of truth keeps them from
+drifting.
+
+** Server enablement
+
+#+begin_src emacs-lisp
+(defcustom cj/mcp-enabled-servers
+  (mapcar (lambda (s) (plist-get s :name)) cj/mcp-server-specs)
+  "List of MCP server names to start.
+Defaults to every server in `cj/mcp-server-specs'.  Set to a
+shorter list to disable specific servers without editing the
+spec.  Changes take effect on next `cj/mcp-restart-failed' or
+Emacs restart."
+  :type '(repeat string)
+  :group 'cj)
+#+end_src
+
+=cj/mcp--build-server-alist= filters by this list before
+returning.  A user who wants only =linear= and =drawio= sets:
+
+#+begin_src emacs-lisp
+(setq cj/mcp-enabled-servers '("linear" "drawio"))
+#+end_src
+
+This is the answer to "100+ tools is overwhelming" without
+needing per-conversation profiles.
+
+** Entry-point policy
+
+Not every GPTel entry point should trigger MCP startup.  Quick
+ask, rewrite, and magit commit-message generation are
+lightweight; spinning up nine subprocesses for a 50-word commit
+message is surprising overhead.
+
+#+begin_src emacs-lisp
+(defcustom cj/mcp-start-on-entry-points
+  '(toggle-gptel)
+  "GPTel entry points that trigger MCP startup.
+Symbols correspond to commands: `toggle-gptel', `gptel-send',
+`gptel-quick-ask', `gptel-rewrite-with-directive',
+`gptel-magit-generate-message'.  Default: only full chat
+(`toggle-gptel')."
+  :type '(repeat symbol)
+  :group 'cj)
+#+end_src
+
+Each entry-point command checks membership before calling
+=cj/mcp-ensure-started=:
+
+#+begin_src emacs-lisp
+(defun cj/toggle-gptel ()
+  ...
+  (when (memq 'toggle-gptel cj/mcp-start-on-entry-points)
+    (cj/mcp-ensure-started))
+  ...)
+#+end_src
+
+** Claude config reader (mtime-cached, structured returns)
+
+#+begin_src emacs-lisp
+(defcustom cj/mcp-claude-config
+  (expand-file-name "~/.claude.json")
+  "Path to the Claude Code config that holds MCP server env vars."
+  :type 'file
+  :group 'cj)
+
+(defvar cj/mcp--config-cache nil
+  "Cons of (MTIME . PARSED) for `cj/mcp-claude-config'.")
+
+(defun cj/mcp--read-claude-config ()
+  "Return a structured result describing the Claude config state.
+Result shape:
+  (:ok t :data PLIST)
+  (:ok nil :reason missing-file)
+  (:ok nil :reason unreadable)
+  (:ok nil :reason malformed-json :message STR)
+Cached by mtime; subsequent calls reparse only on change."
+  ...)
+#+end_src
+
+** mcp.el compatibility layer
+
+All private-API access lives in 3-5 helpers documented with the
+upstream commit they target.  This is the only file that touches
+=mcp--*= names; everything else calls these wrappers.
+
+#+begin_src emacs-lisp
+;; ai-mcp-compat -- isolates private mcp.el API.
+;; Verified against upstream commit f10768e (2026-05-16).
+
+(defun cj/mcp--server-status (connection)
+  "Return CONNECTION's lifecycle status: connected, error, starting."
+  (mcp--status connection))
+
+(defun cj/mcp--server-tools (connection)
+  "Return CONNECTION's discovered tool list (plists)."
+  (mcp--tools connection))
+
+(defun cj/mcp--server-name (connection)
+  "Return CONNECTION's logical server name."
+  (jsonrpc-name connection))
+
+(defun cj/mcp--assert-capabilities ()
+  "Signal `user-error' if any required mcp.el function is missing."
+  (dolist (fn '(mcp-connect-server mcp-make-text-tool
+                mcp-hub mcp-hub-start-all-server
+                mcp-hub-get-all-tool mcp-server-connections))
+    (unless (fboundp fn)
+      (user-error "mcp.el too old; missing %s.  Upgrade or switch \
+to local checkout in `ai-mcp.el' use-package block" fn))))
+#+end_src
+
+If mcp.el renames a slot or changes a return shape, only these
+helpers break.  Tests cover each helper against stub objects.
+
+** Startup model: async + state machine + polling
+
+Three state structures capture lifecycle:
+
+#+begin_src emacs-lisp
+(defvar cj/mcp--state 'idle
+  "Overall MCP integration state: idle, starting, partial, ready, failed.")
+
+(defvar cj/mcp--server-status nil
+  "Alist mapping server name to status plist:
+  (:state STATE :tool-count N :tools (NAME ...) :last-error STR
+   :started-at TIME :ready-at TIME)")
+
+(defvar cj/mcp--stall-timer nil
+  "Timer guarding against servers that never call back.")
+#+end_src
+
+=cj/mcp-ensure-started= is the only entry point consumers call:
+
+#+begin_src emacs-lisp
+(defun cj/mcp-ensure-started ()
+  "Schedule MCP startup if it hasn't run yet this session.
+Returns immediately.  Servers spawn asynchronously."
+  (when (eq cj/mcp--state 'idle)
+    (setq cj/mcp--state 'starting)
+    (cj/mcp--assert-capabilities)
+    (cj/mcp--build-status-from-specs)
+    (setq mcp-hub-servers (cj/mcp--build-server-alist))
+    (message "MCP: starting %d server(s) in background..."
+             (length mcp-hub-servers))
+    ;; The hub callback is opportunistic.  We poll status on each
+    ;; tick and the stall timer is the authoritative deadline.
+    (mcp-hub-start-all-server #'cj/mcp--on-hub-callback nil nil)
+    (cj/mcp--start-stall-timer)))
+#+end_src
+
+State transitions and authority:
+
+- *Hub completion callback (opportunistic).*  Triggers an
+  immediate poll + registration pass for whatever's ready.  Does
+  not signal completion by itself.
+- *Stall timer (authoritative deadline).*  After
+  =cj/mcp-startup-timeout= (default 30 s), marks every server
+  still in =starting= state as =failed= with reason =timeout=,
+  registers tools from servers that did become ready, transitions
+  =cj/mcp--state= to its final value (=ready=, =partial=, or
+  =failed=).
+- *Polling (authoritative state).*  =cj/mcp--poll-status= walks
+  =mcp-server-connections= and maps each entry's =mcp--status= to
+  =cj/mcp--server-status=.  Called from the hub callback and from
+  the stall timer.  Servers that transitioned through
+  =:error-callback= (which the hub doesn't chain into the inited
+  callback) show up here.
+
+Properties:
+
+- =cj/mcp-ensure-started= returns in <100 ms regardless of
+  subprocess state.
+- =mcp-hub-start-all-server= itself is async (third =SYNCP= arg
+  is =nil=).
+- Servers transition independently; tools land in =gptel-tools=
+  as each server reports inventory.
+- A server that never connects, never errors, and never reports
+  tools is caught by the stall timer.
+
+** Tool registration pipeline
+
+Walks =mcp-server-connections= directly, per server, after each
+status poll.  This gives clean per-server bookkeeping without
+parsing the =mcp-SERVER= category prefix:
+
+#+begin_src emacs-lisp
+(defun cj/mcp--register-server-tools (server-name)
+  "Register every tool from the connected server SERVER-NAME.
+Idempotent: re-registration replaces the function pointer
+without duplicating menu entries."
+  (let ((connection (gethash server-name mcp-server-connections)))
+    (when (and connection
+               (eq (cj/mcp--server-status connection) 'connected))
+      ;; First, deregister any existing tools for this server.
+      (cj/mcp--deregister-server-tools server-name)
+      (dolist (raw-tool (cj/mcp--server-tools connection))
+        (cj/mcp--register-tool server-name raw-tool))
+      (cj/mcp--update-server-status server-name :state 'ready))))
+
+(defun cj/mcp--register-tool (server-name raw-tool)
+  "Register one tool from SERVER-NAME.
+RAW-TOOL is the plist from `mcp--tools' (untransformed)."
+  (let* ((remote-name (plist-get raw-tool :name))
+         (gptel-name (format "mcp__%s__%s" server-name remote-name))
+         (description (cj/mcp--normalize-description
+                       server-name raw-tool))
+         (confirm-p (cj/mcp--confirm-p gptel-name remote-name))
+         ;; mcp-make-text-tool builds the closure; we async by default.
+         (mcp-plist (mcp-make-text-tool server-name remote-name t))
+         ;; Rewrite name + description + confirm after mcp.el builds the closure.
+         (gptel-plist (cj/mcp--rewrite-plist
+                       mcp-plist
+                       :name gptel-name
+                       :description description
+                       :confirm confirm-p
+                       :async t
+                       :category (format "mcp-%s" server-name))))
+    (apply #'gptel-make-tool gptel-plist)
+    (add-to-list 'gptel-tools
+                 (gptel-get-tool
+                  (list (format "mcp-%s" server-name) gptel-name)))
+    (push gptel-name
+          (gethash server-name cj/mcp--registered-tools))))
+#+end_src
+
+Key properties:
+
+- *Async by default.*  All MCP tools register with =:async t=.
+  This avoids any sync MCP tool call blocking Emacs during
+  =gptel-send='s tool dispatch.  Per-call timeout uses the
+  timer-race pattern (next subsection).
+- *Closure preserves remote name.*  =mcp-make-text-tool= built
+  the function before we rewrote =:name=, so the closure calls
+  =mcp-call-tool SERVER REMOTE-NAME=, not the prefixed
+  =mcp__SERVER__TOOL=.
+- *Idempotent.*  Each registration deregisters first, so
+  callbacks firing multiple times or restarts don't accumulate
+  duplicate entries.
+- *Description normalization.*  See next subsection.
+
+** Per-call timeout: explicit timer/callback race
+
+=with-timeout= only supervises the dynamic extent of the form it
+wraps.  For async tools where the function returns immediately
+and the callback fires later, it does nothing.  The correct
+pattern is an explicit timer:
+
+#+begin_src emacs-lisp
+(defun cj/mcp--wrap-async-with-timeout (server-name remote-name
+                                        mcp-callback)
+  "Return a gptel-compatible async wrapper for an MCP tool.
+GPTel calls the returned function with (CALLBACK . ARGS).
+The wrapper races MCP's response against `cj/mcp-tool-timeout';
+whichever fires first wins, and late callbacks are ignored."
+  (lambda (gptel-callback &rest args)
+    (let* ((done nil)
+           (timer (run-at-time cj/mcp-tool-timeout nil
+                               (lambda ()
+                                 (unless done
+                                   (setq done t)
+                                   (funcall gptel-callback
+                                            (format "MCP tool %s/%s \
+timed out after %ds"
+                                                    server-name
+                                                    remote-name
+                                                    cj/mcp-tool-timeout)))))))
+      (mcp-async-call-tool
+       (gethash server-name mcp-server-connections)
+       remote-name
+       (cj/mcp--args-to-plist args)
+       (lambda (result)
+         (cancel-timer timer)
+         (unless done
+           (setq done t)
+           (funcall gptel-callback
+                    (mcp--parse-tool-call-result result))))
+       (lambda (code message)
+         (cancel-timer timer)
+         (unless done
+           (setq done t)
+           (funcall gptel-callback
+                    (format "MCP error %s: %s" code
+                            (cj/mcp--redact message)))))))))
+#+end_src
+
+Both branches set =done= before invoking gptel's callback so a
+late response (e.g., MCP responds after the timer fired but
+before its sentinel cancels) doesn't deliver twice.  Timer is
+canceled on the success and error paths.
+
+The closure that =mcp-make-text-tool= built gets replaced with
+this wrapper during registration (the =:function= slot of the
+rewritten plist).
+
+** Confirmation / safety policy
+
+All enabled tools are registered (per the goal of full
+visibility), but write/destructive tools get =:confirm t=.
+Classification is name-based:
+
+#+begin_src emacs-lisp
+(defconst cj/mcp--write-name-patterns
+  '("\\`create\\b" "\\`update\\b" "\\`delete\\b" "\\`remove\\b"
+    "\\`send\\b" "\\`post\\b" "\\`add\\b" "\\`move\\b"
+    "\\`invite\\b" "\\`share\\b" "\\`upload\\b" "\\`set\\b"
+    "\\`patch\\b" "\\`import\\b" "\\`sync\\b" "\\`merge\\b"
+    "\\`close\\b" "\\`reopen\\b" "\\`archive\\b" "\\`unarchive\\b"
+    "\\`approve\\b" "\\`reject\\b" "\\`label\\b" "\\`assign\\b"
+    "\\`reply\\b" "\\`comment\\b" "\\`trash\\b" "\\`restore\\b"
+    "\\`pin\\b" "\\`unpin\\b" "\\`copy\\b" "\\`rename\\b"))
+
+(defconst cj/mcp--read-name-patterns
+  '("\\`get\\b" "\\`list\\b" "\\`read\\b" "\\`search\\b"
+    "\\`find\\b" "\\`fetch\\b" "\\`view\\b" "\\`query\\b"
+    "\\`describe\\b" "\\`show\\b" "\\`check\\b"))
+
+(defcustom cj/mcp-tool-confirm-overrides nil
+  "Per-tool confirmation overrides.
+Alist mapping fully qualified MCP tool name (e.g.,
+\"mcp__linear__create_issue\") to t or nil.  Wins over the
+pattern-based classifier."
+  :type '(alist :key-type string :value-type boolean)
+  :group 'cj)
+
+(defun cj/mcp--confirm-p (gptel-name remote-name)
+  "Return non-nil if the tool should register with `:confirm t'."
+  (let ((override (assoc gptel-name cj/mcp-tool-confirm-overrides)))
+    (cond
+     (override (cdr override))
+     ((seq-some (lambda (p) (string-match-p p remote-name))
+                cj/mcp--write-name-patterns) t)
+     ((seq-some (lambda (p) (string-match-p p remote-name))
+                cj/mcp--read-name-patterns) nil)
+     (t t))))  ; unknown → confirm
+#+end_src
+
+This is the second half of the safety story; the first half
+(=gptel-confirm-tool-calls 'auto=) is enforced by ai-mcp.el at
+load time.
+
+** Description normalization
+
+mcp-side descriptions are written for an arbitrary client and
+vary in quality.  The registration pipeline prefixes a stable
+"server / write-risk" header so the agent and the user have
+consistent context:
+
+#+begin_src emacs-lisp
+(defun cj/mcp--normalize-description (server-name raw-tool)
+  "Return a normalized description string for RAW-TOOL.
+Prefix: [SERVER] for reads, [SERVER WRITE] for writes,
+[SERVER ?] for unknown classification.  Then the upstream
+description, unchanged."
+  (let* ((remote-name (plist-get raw-tool :name))
+         (upstream (or (plist-get raw-tool :description)
+                       "(no description provided by server)"))
+         (cls (cond
+               ((seq-some (lambda (p) (string-match-p p remote-name))
+                          cj/mcp--write-name-patterns) "WRITE")
+               ((seq-some (lambda (p) (string-match-p p remote-name))
+                          cj/mcp--read-name-patterns) "")
+               (t "?"))))
+    (format "[%s%s] %s" server-name
+            (if (string-empty-p cls) "" (concat " " cls))
+            upstream)))
+#+end_src
+
+Tools in =gptel-menu= show as:
+
+#+begin_example
+[linear WRITE] Create a new Linear issue in a team.
+[linear] List issues in a Linear team.
+[google-keep ?] Frobnicate a note (unknown classification).
+#+end_example
+
+** Tool deregistration
+
+Three triggers remove tools from =gptel-tools=:
+
+- =cj/mcp-stop-all= -- removes every MCP-registered tool, clears
+  =cj/mcp--registered-tools=, calls =mcp-stop-server= per server.
+  Local tools untouched.
+- =cj/mcp-restart-server= -- removes tools for the named server
+  before re-registering.
+- =cj/mcp-restart-failed= -- deregister + re-register each
+  =failed= server.
+
+#+begin_src emacs-lisp
+(defun cj/mcp--deregister-server-tools (server-name)
+  "Remove every GPTel tool this integration registered for SERVER-NAME.
+Local tools (those not in `cj/mcp--registered-tools') are
+preserved."
+  (let ((mcp-tools (gethash server-name cj/mcp--registered-tools)))
+    (dolist (tool-name mcp-tools)
+      (setq gptel-tools
+            (cl-remove-if (lambda (tool)
+                            (string= (gptel-tool-name tool) tool-name))
+                          gptel-tools)))
+    (remhash server-name cj/mcp--registered-tools)))
+#+end_src
+
+** Secrets redaction
+
+=cj/mcp--redact= masks every secret-bearing field before any
+string surfaces in messages, errors, status buffers, hub
+displays, or test fixtures.  Used by status formatting, audit
+buffer, failure surfacing, and error wrappers.  Tests assert
+sentinel =REDACTED_TEST_SECRET= never appears in any user-facing
+output.
+
+** Process cleanup
+
+=kill-emacs-hook= gets one entry: =cj/mcp-stop-all=.  Stdio
+process sentinels record abnormal exits into the per-server
+status.
+
+*Local-only constraint.*  MCP server processes are always spawned
+under the local Emacs's =default-directory='s root.  TRAMP /
+remote buffers do not relocate the spawn; MCP processes are
+local-only.  This is enforced implicitly (mcp-hub-start-all-server
+uses =make-process= which is local) and documented in the
+commentary.
+
+** Privacy: external tool output in saved conversations
+
+MCP tool results land in the GPTel buffer.  GPTel's autosave (when
+enabled) persists those results to
+=~/.emacs.d/ai-conversations/=.  Concrete implications:
+
+- A Slack channel excerpt the agent fetched is now on disk.
+- A Google Docs snippet the agent quoted is now on disk.
+- A Linear issue body the agent read is now on disk.
+
+This is normal external-tool behavior, but users may not realize
+it.  Two mitigations:
+
+- =ai-mcp.el='s commentary explicitly documents the autosave
+  privacy implication.
+- The audit buffer (=cj/mcp-list-tools=) includes a header note:
+  "Tool results land in =gptel-tools= responses; saved
+  conversations persist them.  Use =cj/gptel-autosave-toggle= per
+  buffer to opt out."
+
+A future enhancement (V1.5+) could mark MCP-sourced tool output
+with a visible delimiter in the GPTel buffer so the user sees
+"this came from an external server" during the chat.  Not v1.
+
+** Per-server auth matrix
+
+| =:auth= value | Servers | How it works | Recovery if failed |
+|---------------+---------+--------------+--------------------|
+| =in-protocol= | linear, notion | mcp.el HTTP transport handles OAuth handshake | Open URL surfaced in =cj/mcp-status= |
+| =local= | slack-deepsat | Local SSE; no auth but proxy must run | Start the local proxy |
+| =none= | drawio | No auth | n/a |
+| =args-token= | figma | API key in process args | Update Claude config, restart |
+| =oauth= | google-calendar, google-docs-* | OAuth token in env; refresh out-of-band | Re-auth via Claude Code, restart |
+| =token= | google-keep | Long-lived token in env | Regenerate, update Claude config, restart |
+
+Recovery surfaces via pattern matching on errors:
+
+#+begin_src emacs-lisp
+(defconst cj/mcp--recovery-patterns
+  '(("\\(401\\|unauthorized\\|token expired\\)"
+     . "Token expired -- re-auth via Claude Code, then C-; a C r SERVER")
+    ("\\(connection refused\\|ECONNREFUSED\\)"
+     . "Local endpoint unreachable -- check the upstream service is running")
+    ("\\(ENOENT\\|command not found\\)"
+     . "Missing dependency -- run `cj/mcp-doctor' to diagnose")))
+#+end_src
+
+** Timeouts
+
+| Timeout | Variable | Default | Behavior |
+|---------+----------+---------+----------|
+| Startup / tool discovery | =cj/mcp-startup-timeout= | 30 s | Server marked =failed/timeout=; integration continues with ready servers |
+| Per-call tool execution | =cj/mcp-tool-timeout= | 60 s | Tool call returns timeout string to agent via the timer-race wrapper; server stays connected |
+
+Both via defcustoms.  Per-tool override possible via
+=cj/mcp-tool-timeout-overrides= alist.
+
+* Status UX
+
+** Echo-area summary: =cj/mcp-status=
+
+Single-line summary keyed off =cj/mcp--state=:
+
+| State | Echo |
+|-------+------|
+| idle | =MCP: not started.  (C-; a t triggers it.)= |
+| starting | =MCP: starting (3/9 ready)...= |
+| partial | =MCP: 7/9 ready (failed: google-calendar, slack-deepsat).= |
+| ready | =MCP: 9/9 ready, 87 tools registered.= |
+| failed | =MCP: all 9 servers failed.  C-; a C d to diagnose.= |
+
+** Wait until ready: =cj/mcp-wait-until-ready=
+
+For the case where the agent asks for a Calendar tool immediately
+after =cj/toggle-gptel=:
+
+#+begin_src emacs-lisp
+(defun cj/mcp-wait-until-ready (&optional timeout)
+  "Block until MCP startup completes or TIMEOUT seconds pass.
+TIMEOUT defaults to `cj/mcp-startup-timeout'.  Returns the final
+state symbol (`ready', `partial', `failed', `starting')."
+  (interactive)
+  ...)
+#+end_src
+
+Bound to =C-; a C w=.  Reports progress every second via
+=message= so the user sees countdown.
+
+** Audit buffer: =cj/mcp-list-tools=
+
+Tabulated-list buffer.  Failed servers appear at the top with a
+red face so they're visually obvious.  Each row:
+
+#+begin_example
+Server               State    Tools  Confirm  Description
+-------------------- -------- -----  -------  ------------------------------
+google-calendar      FAILED   0      -        Token expired; see status
+slack-deepsat        FAILED   0      -        Local proxy unreachable
+-------------------- -------- -----  -------  ------------------------------
+linear/list_issues   ready    -      no       List issues in a Linear team
+linear/create_issue  ready    -      YES      Create a new Linear issue
+linear/...           ready    44 total
+...
+#+end_example
+
+Sort: failed servers first, then by server name, then by tool
+name.  Keys: =g= refresh, =RET= jump to tool's category in
+gptel-menu, =r= restart server under point, =c= toggle confirm
+override for tool under point.
+
+* Commands & Keymap
+
+=C-; a C= becomes the MCP (Connect) subprefix.  Existing keys are
+preserved: =M= keeps =gptel-menu=, =m= keeps
+=cj/gptel-change-model=.
+
+| Key | Command | Purpose |
+|-----+---------+---------|
+| =C-; a C h= | =cj/mcp-hub= | Open server-management buffer |
+| =C-; a C s= | =cj/mcp-status= | Echo state summary |
+| =C-; a C l= | =cj/mcp-list-tools= | Open audit buffer |
+| =C-; a C r= | =cj/mcp-restart-failed= | Restart failed servers |
+| =C-; a C R= | =cj/mcp-restart-server= | Restart a named server |
+| =C-; a C S= | =cj/mcp-stop-all= | Stop everything |
+| =C-; a C d= | =cj/mcp-doctor= | Diagnose prerequisites |
+| =C-; a C w= | =cj/mcp-wait-until-ready= | Block until ready |
+
+which-key labels mirror the table.
+
+** cj/mcp-doctor
+
+Diagnostic command.  Two modes:
+
+- *Static* (default) -- no side effects, no network: capability
+  check, =npx=/=uvx= on PATH, Claude config parseability,
+  per-server env-var presence, local endpoint reachability.
+- *Live* (=C-u C-; a C d=) -- opt-in: invokes a single safe read
+  per auth class to verify OAuth tokens haven't silently expired.
+  For example, =gh_search= against linear is one tool call; same
+  for notion, google-calendar, google-docs, google-keep.  Static
+  checks first; live probe only fires if static passes.
+
+Output buffer keys:
+
+- =c= copy diagnostic summary to kill ring (for pasting into
+  bug reports / notes).
+- =r= rerun all failed checks.
+- =q= quit.
+
+Each check row formats as: =PASS / FAIL / WARN  CHECK  RECOVERY=.
+
+* Implementation Plan
+
+Eight phases (was seven in rev 2; added Phase 1.5 for the
+confirmation-contract setup).  Each ends with green ERT tests +
+a manual smoke test before the next.
+
+** Phase 1 -- Module + pure helpers
+
+Create =modules/ai-mcp.el=.  Implement: =cj/mcp-server-specs=,
+=cj/mcp-enabled-servers=, =cj/mcp-start-on-entry-points=,
+=cj/mcp--read-claude-config=, =cj/mcp--get-env=,
+=cj/mcp--build-server-alist= (pure transformer; filters by
+=cj/mcp-enabled-servers=), =cj/mcp--redact=,
+=cj/mcp--confirm-p=, =cj/mcp--normalize-description=.
+
+Tests cover all of the above against fixtures.
+
+** Phase 1.5 -- Confirmation contract
+
+Flip =gptel-confirm-tool-calls= to ='auto= in =ai-mcp.el='s
+setup.  Remove the =(setq gptel-confirm-tool-calls nil)= line
+from =ai-config.el=.  Audit existing local tools and add
+=:confirm t= to any that should be gated (=web_fetch=
+guaranteed; =write_text_file=, =update_text_file=,
+=move_to_trash= per Craig's decision).
+
+Verification test in =tests/test-ai-mcp-confirm-contract.el=
+asserts the setting, the local-tool gating behavior, and that
+=git_log= (=:confirm nil=) still runs without prompting.
+
+** Phase 2 -- Compat layer + tool registration with fake inventory
+
+Add =ai-mcp-compat= helpers.  Build the registration pipeline
+against a stubbed =mcp-server-connections=.  Verify:
+- Tool name rewriting (=remote-name= stays in closure;
+  =gptel-name= is =mcp__SERVER__TOOL=).
+- =gptel-make-tool= + explicit =add-to-list 'gptel-tools=.
+- =:confirm= application per policy + overrides.
+- Description normalization adds expected prefix.
+- =cj/mcp--registered-tools= bookkeeping.
+- Deregister removes from =gptel-tools= without disturbing local
+  tools (test pre-populates =gptel-tools= with a local tool and
+  asserts it survives).
+- Re-register after deregister doesn't duplicate.
+- All tools register with =:async t=.
+
+** Phase 3 -- Async state machine + timeout wrapper
+
+Implement =cj/mcp-ensure-started=, =cj/mcp--on-hub-callback=,
+=cj/mcp--state= transitions, stall timer, polling.  Implement
+=cj/mcp--wrap-async-with-timeout=.  Stub
+=mcp-hub-start-all-server= with synthetic delayed callbacks and
+synthetic async errors.
+
+Verify:
+- =cj/mcp-ensure-started= returns in <100 ms regardless of stubs.
+- Hub callback triggers status poll + registration.
+- Stall timer fires for stuck servers.
+- Async error path (=:error-callback= without inited callback)
+  reaches =cj/mcp--server-status= via polling.
+- =cj/mcp--wrap-async-with-timeout=: timer-first ignores late MCP
+  response; MCP-first cancels timer; both branches deliver
+  exactly once.
+
+** Phase 4 -- First real connection (no auth)
+
+Wire one =drawio= or =slack-deepsat= server.  Verify the stubbed
+Phase 3 behavior matches real subprocesses.
+
+** Phase 5 -- Status UX + commands + doctor (static)
+
+Implement =cj/mcp-status=, =cj/mcp-list-tools= (with failed
+servers at top, red face), =cj/mcp-doctor= (static mode only),
+=cj/mcp-wait-until-ready=, restart commands, keymap entries,
+which-key labels.
+
+Investigate =gptel-menu= refresh behavior: if the transient is
+already open when a new tool registers, does it pick it up on
+next invocation?  Document and add an acceptance test:
+"register new tool while gptel-menu is open; close and reopen;
+new tool appears."
+
+** Phase 6 -- HTTP servers
+
+Add =linear= and =notion=.  Test in-protocol OAuth handshake.
+Add live-auth-check mode to doctor.
+
+** Phase 7 -- Env-dependent stdio servers
+
+Add =figma=, =google-calendar=, =google-docs-personal=,
+=google-docs-work=, =google-keep=.
+
+** Phase 8 -- Privacy + audit polish
+
+Add audit buffer's privacy header, autosave commentary, audit-log
+hygiene (=cj/mcp-tool-audit-log-enabled= defcustom).  Update
+=ai-mcp.el= commentary with the code-organization outline as a
+TOC.
+
+* Test Plan
+
+** Confirmation contract (Phase 1.5)
+
+- =gptel-confirm-tool-calls= is ='auto= after =ai-mcp= loads.
+- Write-classified MCP tool with =:confirm t= triggers confirm
+  prompt (stub gptel's prompt and assert it fires).
+- Read-classified MCP tool with =:confirm nil= does not trigger
+  prompt.
+- Pre-existing =git_log= (=:confirm nil=) does not trigger
+  prompt.
+- =web_fetch= (newly gated with =:confirm t=) triggers prompt.
+
+** Pure helpers (Phase 1-2)
+
+- =cj/mcp--read-claude-config=: good fixture, missing, unreadable,
+  malformed JSON, missing =:mcpServers=, missing server, empty
+  env, non-string env values.  Cache invalidation on mtime
+  change.
+- =cj/mcp--build-server-alist=: each transport, each auth class,
+  env merge, args splicing for figma, no mutation of
+  =cj/mcp-server-specs=, filter by =cj/mcp-enabled-servers=.
+- =cj/mcp--redact=: bearer tokens, OAuth credentials,
+  TOKEN/KEY/SECRET/CREDENTIALS-suffixed env vars, URL query
+  tokens, figma args slot.  Sentinel never leaks.
+- =cj/mcp--confirm-p=: read patterns, write patterns, unknown →
+  t, override map wins.
+- =cj/mcp--normalize-description=: prefix shape per
+  classification.
+
+** Registration pipeline (Phase 2)
+
+- Single tool registers into all three structures.
+- Two tools with same =:name= from different servers don't
+  collide.
+- Re-registration replaces function pointer without duplicating.
+- Deregister removes from =gptel-tools= without touching a
+  pre-populated local tool.
+- All tools register with =:async t=.
+- Confirm overrides win over patterns.
+
+** Compat layer (Phase 2)
+
+- Each =cj/mcp--*-server-*= helper returns expected value against
+  stub object.
+- =cj/mcp--assert-capabilities= signals when a required function
+  is missing.
+
+** State machine + timeout (Phase 3)
+
+- =cj/mcp-ensure-started= from idle returns in <100 ms with
+  delayed-callback stub.
+- Second call from =starting= is no-op.
+- Hub callback triggers status poll.
+- Stall timer marks slow servers =failed/timeout=.
+- Async error path: server emits error callback only (no inited
+  callback); polling catches it and marks =failed/error= within
+  stall window.
+- =cj/mcp--wrap-async-with-timeout=:
+  - MCP responds first, before timer: gptel callback fires once
+    with result; timer canceled.
+  - Timer fires first: gptel callback fires once with timeout
+    message; late MCP response is ignored (done flag).
+  - Error callback: cancels timer, fires once with redacted
+    error.
+
+** Async / freeze (Phase 3)
+
+- Stub =mcp-hub-start-all-server= to delay callbacks 5 s.
+  =cj/toggle-gptel= returns within 250 ms; buffer is
+  interactive.
+- Stub a server that never responds.  After
+  =cj/mcp-startup-timeout=, server is =failed=,
+  =cj/mcp--state= is =partial= or =failed=.
+
+** Entry-point policy (Phase 5)
+
+- With =cj/mcp-start-on-entry-points '(toggle-gptel)=, calling
+  =cj/toggle-gptel= triggers startup; calling
+  =cj/gptel-quick-ask= does not.
+- Adding =gptel-quick-ask= to the list makes quick-ask trigger.
+
+** Local-tool preservation (Phase 2)
+
+- =cj/mcp-stop-all= removes only MCP-registered tools from
+  =gptel-tools=; local tools like =git_log= remain.
+- =cj/mcp-restart-server= removes only that server's tools;
+  other MCP servers' tools and local tools both remain.
+
+** Process / cleanup (Phase 4+)
+
+- =kill-emacs-hook= calls =cj/mcp-stop-all=; subprocesses exit.
+- =cj/mcp-stop-all= clears =gptel-tools= MCP entries and
+  =cj/mcp--registered-tools=.
+- Restart doesn't leak process buffers or duplicate process
+  objects.
+- Process sentinel records abnormal exits into status.
+
+** Partial availability (Phase 4+)
+
+- 8 of 9 servers ready, 1 failed: ready tools available;
+  failed-server tools absent.
+- Restart-failed only retries the failed one.
+
+** Saved-conversation behavior (Phase 7+)
+
+- After a successful MCP tool call, GPTel autosave (when on)
+  persists the tool result.  Test asserts the saved file
+  contains the result text and the audit buffer's privacy
+  header is updated.
+- =cj/gptel-autosave-toggle= off → result not saved.
+
+** Keymap (Phase 5)
+
+- =C-; a C h/s/l/r/R/S/d/w= all bound after =ai-mcp= loads.
+- Existing =C-; a M=, =C-; a m= still bound to
+  =gptel-menu=, =cj/gptel-change-model=.
+- which-key labels present for every new binding.
+- No duplicate labels.
+
+** =gptel-menu= refresh (Phase 5)
+
+- Register new tool while a previously-opened gptel-menu is
+  closed; reopen; new tool appears.  Document whether mid-open
+  refresh works.
+
+** No-real-process rule
+
+All tests in =tests/test-ai-mcp*= use stubs for =process-file=,
+=make-process=, =mcp-hub-start-all-server=,
+=mcp-server-connections=, =mcp--tools=, =mcp--status=.  No real
+=npx=, no network, no real =~/.claude.json=.
+
+** Manual test matrix
+
+| Scenario | Expected |
+|----------+----------|
+| No =~/.claude.json= | Doctor warns; env-free servers still start |
+| Malformed Claude config | Status shows =malformed-json=; integration =failed= cleanly |
+| Network offline | HTTP servers fail; stdio servers start; status =partial= |
+| =npx= not on PATH | Doctor flags it; stdio servers fail with clear message |
+| One stdio server exits immediately | Sentinel records failure; others continue |
+| slack-deepsat endpoint down | Server =failed=; recovery message points at local proxy |
+| Google token expired | Server starts; tool calls fail; live-auth check (=C-u doctor=) surfaces it |
+| All servers available | =MCP: 9/9 ready, ~N tools registered= |
+| =cj/mcp-restart-failed= after fix | Only retried servers transition |
+| =cj/mcp-stop-all= then call a tool | Tool absent from =gptel-tools= |
+| Disable a server via defcustom | Doctor and status reflect the absence |
+| TRAMP buffer open + =cj/toggle-gptel= | MCP starts locally; no remote spawn |
+
+* Acceptance Criteria
+
+1. *No freeze.*  =cj/toggle-gptel= returns in <250 ms with mcp.el
+   wired and nine real servers starting in background.
+2. *Incremental registration.*  As each server reports tools,
+   =gptel-tools= updates; in-flight =gptel-send= calls see
+   newly-added tools on the next request.
+3. *No MCP failure breaks ordinary GPTel chat.*  With every MCP
+   server failing, =cj/toggle-gptel= still opens a usable chat
+   buffer; non-tool prompts work normally; local tools (git_log
+   etc.) still callable.
+4. *Confirm gate works.*  After =ai-mcp= loads, a write-classified
+   MCP tool actually prompts before invocation.  Verified by a
+   test that fails if =mcp-async-call-tool= is invoked before
+   gptel's confirm-prompt stub fires.
+5. *Local-tool preservation.*  =cj/mcp-stop-all= and per-server
+   restart remove only MCP-owned tools.
+6. *Partial availability.*  With one failed server, status is
+   =partial=, ready servers' tools work, failed server's tools
+   absent.
+7. *Idempotent restart.*  Calling =cj/mcp-restart-failed= twice
+   with no intervening change produces identical state.
+8. *No secret leakage.*  Grep every user-facing output for
+   sentinel fixture secrets; zero matches.
+9. *Doctor coverage (static).*  Identifies each diagnosable
+   failure in the manual test matrix.
+10. *Server enablement.*  Setting =cj/mcp-enabled-servers= to a
+    subset starts only those servers; doctor reports the disabled
+    ones as expected-absent.
+
+* Risks
+
+** R1 -- mcp.el API drift behind compat layer
+
+Even with the compat layer, an upstream rename could break us if
+the capability check misses it.
+
+*Mitigation:* compat helpers document the upstream commit and
+file location.  Tests cover each helper against stub objects;
+when mcp.el bumps, run those tests first.
+
+** R2 -- Cold-start latency for nine subprocesses
+
+Nine =npx -y= invocations cold-start over several seconds.  Time
+to =ready= state may be 10-30 seconds on a cold machine.
+
+*Mitigation:* async model means user doesn't wait.  Tools arrive
+incrementally; status indicator shows progress.
+=cj/mcp-wait-until-ready= for the rare case where the agent needs
+a specific tool immediately.
+
+** R3 -- OAuth token expiry surfaces silently
+
+A Google server starts cleanly but every tool call fails with
+auth errors.
+
+*Mitigation:* the OAuth recovery pattern matcher inspects every
+tool-call error.  Live-auth-check mode in doctor proactively
+calls one safe read per auth class.
+
+** R4 -- Tool count balloons gptel-menu
+
+Up to 100+ tools.  Even with category grouping, the transient
+menu is large.
+
+*Mitigation:* =cj/mcp-enabled-servers= lets users disable
+servers they don't need.  Audit buffer is the alternate browser.
+Per-conversation profiles deferred to v1.5.
+
+** R5 -- =~/.claude.json= schema change
+
+Parser breaks if Anthropic restructures the file.
+
+*Mitigation:* =cj/mcp--read-claude-config= returns structured
+errors that surface in status and doctor.  Integration degrades
+to "env-free servers work, env-dependent servers fail" rather
+than crashing.
+
+** R6 -- Process argument leakage (figma)
+
+figma's API key is in process args -- visible via =ps=,
+=/proc/PID/cmdline=, =list-system-processes=.
+
+*Mitigation:* accepted risk (the figma package only supports
+args-token).  =cj/mcp--redact= ensures the key never appears in
+Emacs-side output.  Commentary flags this.
+
+** R7 -- Confirmation fatigue from unknown-classification tools
+
+Default for unknown is =:confirm t=.  A server with many tools
+matching neither read nor write pattern produces many prompts.
+
+*Mitigation:* audit buffer surfaces unknown classifications with
+their confirm state.  =cj/mcp-tool-confirm-overrides= alist lets
+the user pin specific tools to =nil= once vetted.  A "review
+unknowns" doctor pass could enumerate them on demand (v1.5).
+
+** R8 -- Subprocess accumulation across sessions
+
+If =kill-emacs-hook= is bypassed (kill -9, crash), subprocesses
+persist.
+
+*Mitigation:* =cj/mcp-doctor= can detect orphaned mcp processes
+via =list-system-processes= (v1.5 enhancement).
+
+* Open Questions
+
+** Q1 -- Should =gptel-menu= refresh after mid-call tool registration?
+
+Investigation during Phase 5.  If gptel-menu's transient caches
+=gptel-tools= at open time, mid-call additions won't appear
+until close+reopen.  Document the behavior; if it's a real
+pain, file a gptel upstream issue.
+
+** Q2 -- Should write-confirmation overrides be per-host?
+
+A v1.5 question: when per-conversation profiles land, the
+override alist could also be scoped (e.g., "in /work/ folder,
+auto-confirm google-docs-work writes").  Out of v1 scope.
+
+** Q3 -- Auth-source migration path for OAuth tokens
+
+Three candidate paths (Elisp OAuth client / Claude Code refresh
+script / timer-based refresh) all have meaningful complexity.
+Until one is viable, =~/.claude.json= stays the source.
+
+** Q4 -- Live-auth check cadence
+
+Doctor's live-auth mode is opt-in.  Should there be a periodic
+auto-check (every N hours via timer) that catches expiry between
+explicit doctor runs?  Adds complexity; defer until usage shows
+need.
+
+* Considered Alternatives
+
+** =gptel-mcp.el= (declined; cited as prior art)
+
+[[https://github.com/lizqwerscott/gptel-mcp.el][lizqwerscott/gptel-mcp.el]] is a 96-line wrapper from the same
+author as mcp.el that bridges mcp.el's tool inventory into gptel.
+It exposes five functions:
+
+- =gptel-mcp-register-tool= -- walks =mcp-hub-get-all-tool= and
+  calls =gptel-make-tool= on each plist.
+- =gptel-mcp-activate-all-tool= -- pushes tools into
+  =gptel-tools=.
+- =gptel-mcp-deactivate-all-tool= -- removes them by category +
+  name lookup.
+- =gptel-mcp-start-all-server-and-register= -- chains
+  =mcp-hub-start-all-server= with the register callback.
+- =gptel-mcp-dispatch= -- a transient menu with three keys (=s=
+  start, =A= activate, =C= deactivate).
+
+*Why this matters for the spec.*  The package independently
+validates the integration shape this spec converged on:
+=mcp-hub-start-all-server= → walk connections → =gptel-make-tool=
+→ =add-to-list gptel-tools= is the canonical path.
+
+*Why we are not adopting it.*  The package solves the trivial
+"wire tools through" problem but skips every concern this spec
+exists to address:
+
+| Concern | Spec | gptel-mcp.el |
+|---------+------+--------------|
+| GPTel confirmation contract (=gptel-confirm-tool-calls 'auto=) | Required precondition | Not addressed |
+| Tool-name collisions | Rewrites to =mcp__SERVER__TOOL= | Silent overwrite |
+| Confirm-on-write policy | Per-tool =:confirm t= for writes | All tools register with default |
+| Async startup contract | =cj/mcp-ensure-started= returns in <100 ms | Synchronous-feel start |
+| Async per-call timeout | Explicit timer/callback race | None |
+| State machine | =cj/mcp--state= + per-server status | None |
+| Server identity strategy | Walks =mcp-server-connections= directly | Uses =mcp-hub-get-all-tool= + parses =mcp-SERVER= |
+| Secrets handling | Reads env from =~/.claude.json= with mtime cache | None |
+| Deregistration tracking | =cj/mcp--registered-tools= hash, preserves local tools | Removes by category, no local-tool guarantee |
+| Server enablement | =cj/mcp-enabled-servers= defcustom | None |
+| Entry-point scoping | =cj/mcp-start-on-entry-points= defcustom | Manual via dispatch menu |
+| Status UX | =cj/mcp-status=, =cj/mcp-list-tools= audit buffer | Just dispatch menu |
+| OAuth recovery | Pattern matcher with per-auth-class recovery | None |
+| Secrets redaction | =cj/mcp--redact= applied everywhere | None |
+| mcp.el compat layer | Isolated wrappers around private API | Direct =mcp--*= access scattered |
+| Tests | 10 acceptance criteria + manual matrix + no-real-process | None |
+| Doctor / live-auth check | Static + live-probe diagnostic | None |
+
+Adopting it would force shipping safely or wrapping with
+everything specified above (two layers, no code reduction).
+
+*What we are taking from it.*  Confidence the API path is right.
+The transient-dispatch UX was considered for the keymap (rev 2),
+but the keymap is now pinned to discrete commands under =C-; a
+C= so existing GPTel keys aren't disturbed (rev 3).
+
+* References
+
+- [[file:../../modules/ai-config.el][modules/ai-config.el]] -- =gptel-confirm-tool-calls nil= at
+  line 386 (removed by this integration); loader at lines 71-96.
+- [[file:gptel-tools-shortlist.org][gptel-tools-shortlist.org]] -- local-tools shortlist; MCP servers
+  slot in as the "external" tier.
+- [[file:gptel-agentic-tool-ideas.org][gptel-agentic-tool-ideas.org]] -- broader agentic-tool design.
+- [[file:gptel-gh-tool.org][gptel-gh-tool.org]] -- sibling design; same confirm-on-write
+  pattern.
+- [[https://github.com/lizqwerscott/mcp.el][lizqwerscott/mcp.el]] -- upstream.
+- [[https://github.com/lizqwerscott/gptel-mcp.el][lizqwerscott/gptel-mcp.el]] -- considered and declined; see §
+  Considered Alternatives.
+- =~/code/mcp.el/mcp-hub.el:53-90,131-160= -- verified callback
+  ownership and start-all helper.
+- =elpa/gptel-0.9.8.5/gptel.el:1595-1607,2244-2245= -- verified
+  =gptel-confirm-tool-calls= semantics and tool-confirm gate.
+- =elpa/gptel-0.9.8.5/gptel.el:1729-1820= -- verified
+  =gptel-make-tool= registration.
+- =~/.claude.json= -- Claude Code config.
author	Craig Jennings <c@cjennings.net>	2026-05-17 01:16:03 -0500
committer	Craig Jennings <c@cjennings.net>	2026-05-17 01:16:03 -0500
commit	2b61aa82afa39a0ff1b165fa9ff09d55d21bfabf (patch)
tree	0ff4916e842caff081bfa2a70d8753dc6fa21044 /docs
parent	1fef9aed248aaf7586950475aa22849e413f1c04 (diff)
download	dotemacs-2b61aa82afa39a0ff1b165fa9ff09d55d21bfabf.tar.gz dotemacs-2b61aa82afa39a0ff1b165fa9ff09d55d21bfabf.zip