diff options
Diffstat (limited to 'docs/specs/google-keep-emacs-integration-spec.org')
| -rw-r--r-- | docs/specs/google-keep-emacs-integration-spec.org | 220 |
1 files changed, 220 insertions, 0 deletions
diff --git a/docs/specs/google-keep-emacs-integration-spec.org b/docs/specs/google-keep-emacs-integration-spec.org new file mode 100644 index 000000000..376522ab4 --- /dev/null +++ b/docs/specs/google-keep-emacs-integration-spec.org @@ -0,0 +1,220 @@ +#+TITLE: Google Keep <-> Emacs integration — Spec +#+AUTHOR: Craig Jennings & Claude +#+DATE: 2026-06-24 +#+TODO: TODO | DONE SUPERSEDED CANCELLED + +* Metadata +| Status | v1 implemented (Phases 1-3); live setup pending; v2 next | +|----------+------------------------------------------------------------| +| Owner | Craig | +|----------+------------------------------------------------------------| +| Reviewer | Codex (spec-review) | +|----------+------------------------------------------------------------| +| Related | [[file:../../todo.org][todo.org: google-keep in-editor integration]] | + +* Problem / Context + +Craig keeps quick notes in Google Keep but works almost entirely in Emacs. Today, reading or acting on a Keep note means leaving Emacs for the phone or the web app — a context switch for content that wants to live next to his org files. He wants Keep notes native to Emacs: browsable, searchable, greppable, and editable, with a path to publish the result as a standalone package. + +Two hard constraints shape every choice: + +1. *No official API.* Google Keep has no public API. Every live client (gkeepapi and the tools built on it) reverse-engineers the private mobile endpoint. That layer is fragile: it breaks when Google changes auth, and it needs a Google master token, not a password. +2. *The existing MCP is agent-only.* Craig already has a google-keep MCP with full read/write (create/update/find/labels/archive/list-items). But MCP tools are invoked by the *agent* (Claude), not from elisp — there is no elisp MCP client. So the in-editor integration cannot reuse the MCP as its data path; it needs its own. + +Researched 2026-06-24: there is no in-editor Emacs Google Keep package on MELPA or GitHub — only KeepToOrg, a one-shot Takeout-HTML-to-org importer (unmaintained). So this is a build, not an adopt. + +The closest prior art in this config is =calendar-sync.el=: an existing engine that fetches external data and writes generated org files under =data/=. This spec follows its conventions (generated file under =data/=, atomic temp-then-rename writes, async =make-process= + sentinel, =auth-source= via the house helper) rather than reinventing them. + +* Goals and Non-Goals +** Goals +- Keep notes visible and usable inside Emacs without leaving the editor. +- An org-native representation, so notes are searchable/greppable and reuse org machinery. +- A structure that starts as glue in =.emacs.d= and can be extracted to a publishable package (the VAMP / pearl module-to-package pattern). +- Read-write (create/edit notes from Emacs) as the immediate v2 increment — v1 ships read-only first, but the read path is built to carry write, so v2 is additive. +** Non-Goals +- Full bidirectional offline sync, conflict resolution, or real-time updates. +- Faithful round-tripping of every Keep feature (list checkboxes, collaborators, drawings, images). +- Reusing the MCP from elisp (infeasible — agent-only). +** Scope tiers +- v1: read-only. Fetch Keep notes through the gkeepapi bridge and render them as an org page (each note an org header). A manual refresh command. Auth via auth-source. Graceful degradation when the bridge or credentials are missing. The read path establishes a round-trip-ready data model and a stable per-note identity, because v2 builds on them. +- v2 (immediate follow-on, not deferred): read-write — create a note from a region or capture, and edit a note back to Keep. Reuses v1's bridge, auth, data model, and note-identity; adds the inverse direction and a staleness check. +- Out of scope: list/checkbox fidelity, collaborators, drawings/images, and any background or real-time sync. +- Later: list/checkbox rendering, and extracting the core to a standalone package. + +* Design + +** For the user + +A command (=cj/keep-refresh=) pulls the current Keep notes and writes them into one generated org file (=data/keep.org= under =user-emacs-directory=, a constant in =user-constants.el= alongside =gcal-file= — a generated file, not a hand-authored one in =~/org/=). Each note becomes a top-level org heading: the title (or a derived title) as the heading text, the note body as the entry, and Keep metadata as properties — labels as org tags, plus =:KEEP_ID:=, =:PINNED:=, =:COLOR:=, =:ARCHIVED:=, =:UPDATED:= in a drawer. Pinned notes sort first; each header carries an org link back to the source note (=https://keep.google.com/#NOTE/<id>=). The file is plain org, so it is searchable with the agenda, greppable, and linkable. Opening it is just visiting the file; a keybinding and a dashboard entry make it one keystroke. v1 is read-only: editing the org file does not push back to Keep (a header line says so), so there is no accidental-mutation risk while the integration is young. The =:KEEP_ID:= and =:UPDATED:= on each header are what v2 later uses to target an update and detect a stale local copy. + +** For the implementer + +Three layers, cleanly separable so the core can later be a package: + +1. *Data bridge (Python).* A small script using gkeepapi: authenticate with a stored master token, fetch notes, emit JSON on stdout per the schema below. =id= is the gkeepapi server id (immutable across title/content edits — the safe v2 targeting key, never derived from content), and =updated= is the freshness anchor v2's staleness guard reads. This is the one place the unofficial API lives, isolated so a break is contained and swappable; the same bridge gains a write subcommand in v2. On failure it exits non-zero and prints a single reason token on stderr (=no-gkeepapi=, =no-token=, =auth-failed=, =network=) so the elisp sentinel can surface which piece broke. +2. *Org renderer (elisp core).* Runs the bridge with =make-process= + a sentinel (async, so a slow or hung auth never blocks the interactive thread — the calendar-sync pattern), parses its JSON, and writes the org page via a temp file + atomic rename (never a partial =keep.org= under the user's eyes). Records =:KEEP_ID:=/=:UPDATED:= per note, labels as tags, pinned-first, archived notes tagged =:archived:=. =cj/keep-refresh= is the entry point. The master token is read with =cj/auth-source-secret-value= (system-lib.el, the house helper calendar-sync/slack/transcription use) against a documented =:host= entry. The JSON-to-org transform is the round-trip contract v2 inverts. This core takes its file path, auth host, and keymap injected from the glue layer — it never reaches into =user-constants= or binds keys itself, so it lifts out cleanly as a package. +3. *Glue (=modules/google-keep-config.el=).* Supplies the core its =data/keep.org= path, the =:host=, a Keep keybinding prefix, the dashboard entry, and the =(require 'google-keep-config)= in =init.el=. A load-time =cj/executable-find-or-warn= for =python3= warns early when the interpreter is absent (the gkeepapi-missing and token-missing cases surface at refresh time via the bridge's stderr reason token). + +** Bridge JSON schema (the load-bearing seam) + +Both the v1 transform and the v2 inverse are written against this, so it is pinned here. On success the bridge prints a JSON array of note objects: + +#+begin_src js +[ { "id": "<gkeepapi server id, string>", + "title": "<string, may be empty>", + "text": "<string; newlines are real \n inside the JSON string>", + "labels": ["<label>", ...], // [] when none + "pinned": true|false, + "archived": true|false, + "color": "<keep color name, e.g. \"WHITE\">", + "updated": "<ISO8601 UTC, e.g. 2026-06-25T04:12:00Z>" } ] +#+end_src + +- =updated= is ISO8601 UTC — sortable, parseable by =parse-iso8601-time-string=, and readable in the drawer. v2's staleness compare depends on this exact, comparable form. +- An empty Keep returns =[]= (a valid empty page, not an error). +- =labels= is always an array (empty, never null). =text= newlines are real =\n= inside the JSON string (standard JSON string encoding). +- Failure is not expressed in JSON: the bridge exits non-zero with one stderr reason token (above). The sentinel maps the token to a =display-warning=. + +* Alternatives Considered + +** A — Takeout import (one-shot HTML -> org) +- Good, because no auth, fully offline, dead simple, and zero ongoing breakage risk. +- Bad, because it is not live — Craig must manually export a Takeout archive, so notes are stale the moment they are imported, and it cannot write, so it is useless for the v2 write path. +- Neutral, because it is the right tool for an archival snapshot, the wrong one for daily use. Deferred — not built in v1; a possible later no-auth archive importer if ever wanted. + +** B (chosen for the data path) — gkeepapi via a Python subprocess bridge +- Good, because it is the only path that gives live notes and (in v2) write-back from inside Emacs, with the full note model. +- Bad, because gkeepapi reverse-engineers a private API: it breaks on Google auth changes, needs Python plus a stored master token, and the bridge is glue Craig owns and maintains. +- Neutral, because the fragility is isolated to one script; when it breaks, the renderer degrades to a warning. + +** C — Reuse the google-keep MCP from elisp +- Good, because the MCP already has full read/write and is maintained outside the config. +- Bad, because MCP tools are invoked by the agent, not elisp — there is no elisp MCP client, so this is infeasible for an in-editor feature. +- Neutral, because the MCP stays the right tool for agent-driven Keep access; it just can't power an in-editor integration. + +** D — A local HTTP server wrapping gkeepapi +- Good, because elisp would talk clean HTTP instead of spawning a subprocess each refresh. +- Bad, because a long-running personal server is more infrastructure than a single-user note view warrants. +- Neutral, because it is a heavier variant of B; revisit only if subprocess latency ever bites. + +* Decisions [5/5] + +** DONE Presentation shape: org page of headers +- Decision: We will render notes as one *org page*, each note a top-level header (title + body + metadata properties, labels as tags). Confirmed by Craig 2026-06-25. +- Consequences: easier — reuses org search/agenda/grep/links, nothing bespoke to render, and read-only is safe. Harder — a large Keep collection makes a long file (mitigated by pinned-first sort and org folding). + +** DONE Direction: read-only v1, read-write v2 (immediate) +- Decision: We will ship v1 read-only (fetch + render + refresh), then move directly into v2 read-write (create/edit back to Keep). v2 is the immediate next increment, not a deferred someday. The read path must establish a round-trip-ready data model and a stable per-note identity up front, so v2 is additive rather than a rewrite. Confirmed by Craig 2026-06-25. +- Consequences: easier — v1 ships the visible value fast on a path write reuses wholesale, and a parse bug can't corrupt real Keep data while the model is being proven. Harder — the read path carries design weight it wouldn't if write were truly far off (note-identity and the freshness field have to be right in v1), and the write work — note targeting, staleness/overwrite handling — still has to be built right after. + +** DONE Data path: gkeepapi subprocess bridge (Takeout deferred) +- Decision: We will use a small Python gkeepapi bridge that emits JSON as the sole data path; it powers read in v1 and gains a write subcommand in v2. A failure degrades to a clear warning. The Takeout-import path is deferred (read-only and stale, so it can't serve v2). The MCP is not in the data path. Confirmed by Craig 2026-06-25. +- Consequences: easier — one live path that serves both read and write, fragility isolated to one swappable script. Harder — a Python + gkeepapi dependency and a maintained bridge; the unofficial API can break and needs a master token, with no second read path as a safety net (the warning is the degradation). + +** DONE Auth: master token in authinfo.gpg via auth-source +- Decision: We will store the master token in =authinfo.gpg= and read it via =auth-source= (with =cj/auth-source-secret-value=, the house helper), and document the one-time master-token retrieval. Confirmed by Craig 2026-06-25. +- Consequences: easier — consistent with existing credential handling, no plaintext secret, the daemon's auth-source cache applies. Harder — the one-time token retrieval is a manual setup step, and a revoked/expired token surfaces as an auth failure the renderer must report cleanly. + +** DONE Structure: google-keep-config.el glue + extractable core +- Decision: We will build the integration as =modules/google-keep-config.el= (the =.emacs.d= glue: paths, keys, dashboard, auth wiring) plus a self-contained core (the bridge runner + org renderer) written so the core can later move to a standalone =keep.el=-style package, mirroring the VAMP / pearl migration. The core takes paths/keys/host injected from the glue — it never reaches into =user-constants= or binds keys itself. Confirmed by Craig 2026-06-25. +- Consequences: easier — usable immediately in =.emacs.d=, with a clean seam for later extraction. Harder — the discipline of keeping the core free of =.emacs.d=-specific assumptions from the start. + +* Review findings [8/8] +** DONE Generated-file path + atomic write (round 1, non-blocking) +- Finding: the draft put the file at =~/org/keep.org= (hand-authored org area) with no atomic write, diverging from calendar-sync (generated org under =data/=, temp-then-rename). Risk: a partial write leaves a truncated file the user is viewing, and a generated file in =~/org/= invites edits that get overwritten. +- Resolution: path is =data/keep.org= as a =user-constants.el= constant; the renderer writes via temp file + atomic rename. + +** DONE Async subprocess via make-process + sentinel (round 1, non-blocking) +- Finding: the draft said "run as a subprocess" without committing to async; gkeepapi can hang on auth churn, and a synchronous =call-process= would freeze Emacs. +- Resolution: specified =make-process= + sentinel (the calendar-sync pattern); the warning-on-failure degradation lives in the sentinel. + +** DONE Name the auth-source house helper (round 1, non-blocking) +- Finding: the draft read the token via "auth-source" generically; the house helper is =cj/auth-source-secret-value= (system-lib.el). +- Resolution: spec now calls it out with a documented =:host= entry. + +** DONE Name the missing-tool helper (round 1, non-blocking) +- Finding: the draft promised a display-warning for a missing tool but didn't cite =cj/executable-find-or-warn=, and didn't separate the gkeepapi-missing vs token-missing failure modes. +- Resolution: glue uses =cj/executable-find-or-warn= for =python3=; the bridge emits distinct stderr reason tokens the sentinel surfaces. + +** DONE Pin the bridge JSON schema (round 1, BLOCKING — cleared) +- Finding: Phase 1 listed JSON fields but left the =updated= format, label/newline serialization, empty-result, and error envelope unstated — the round-trip contract both v1 and v2's staleness guard depend on. +- Resolution: added the "Bridge JSON schema" subsection — ISO8601-UTC =updated=, always-array =labels=, standard JSON newline encoding, =[]= for empty, and exit-nonzero + stderr-token for failure. + +** DONE v2 staleness guard re-fetches before write (round 1, non-blocking) +- Finding: v2 must not trust the possibly-stale =:UPDATED:= in the generated file; a phone edit between refresh and edit-back would be clobbered. +- Resolution: Phase 4 states the write path re-fetches the target note's current =updated= from Keep and compares before overwriting, prompting on mismatch. + +** DONE Note KEEP_ID is the immutable server id (round 1, non-blocking) +- Finding: "stable" id asserted but not anchored. +- Resolution: spec states =:KEEP_ID:= is the gkeepapi server id, immutable across edits, never derived from content. + +** DONE Small enhancements: source link + archived handling (round 1, non-blocking, optional) +- Finding: a back-link to the Keep web note and archived-note handling were free given the data already carried. +- Resolution: each header links to =keep.google.com/#NOTE/<id>=; archived notes are tagged =:archived:= (the JSON carries =archived=). + +* Implementation phases + +** Phase 1 — Data bridge +A Python script (gkeepapi) that authenticates with the stored master token and prints the JSON array per the Bridge JSON schema on stdout (=id= = immutable server id, =updated= = ISO8601 UTC). On failure: exit non-zero with one stderr reason token. Standalone and testable from the shell with a fixture; no Emacs yet. Tree stays working (new files only). + +** Phase 2 — Org renderer + refresh +The elisp core + =modules/google-keep-config.el=: run the bridge with =make-process= + a sentinel, parse the JSON, and write =data/keep.org= via temp file + atomic rename (heading + body + property drawer per note, recording =:KEEP_ID:= and =:UPDATED:=, labels as tags, archived tagged =:archived:=, pinned-first, a source back-link per header). =cj/keep-refresh= regenerates it; auth via =cj/auth-source-secret-value=. A header line marks the file a read-only view. The sentinel maps a bridge failure (stderr token) to a =display-warning= naming the missing piece — never errors at load. + +** Phase 3 — Access UX + un-orphan +Keybindings (a Keep prefix), a dashboard entry, a load-time =cj/executable-find-or-warn= for =python3=, and the =(require 'google-keep-config)= in =init.el=. Optional: a dedicated read-only major mode for the buffer. + +** Phase 4 (v2) — read-write +The immediate next increment after v1 lands. A write subcommand on the bridge (gkeepapi create/update), and an elisp path that targets a note by =:KEEP_ID:= and, before overwriting, re-fetches that note's current =updated= from Keep and compares it against the drawer's =:UPDATED:= (a staleness guard — abort/prompt on mismatch so a phone edit since the last refresh isn't clobbered). Entry points to create a note from a region/capture and edit one back. Specced in detail once v1's read model is proven on real notes; listed here so Phases 1-3 don't paint into a read-only corner. + +(Later, not specced here, logged to todo.org: list/checkbox rendering; extract the core to a package.) + +* Acceptance criteria +- [ ] =cj/keep-refresh= fetches the current Keep notes and writes =data/keep.org= via temp + atomic rename, one header per note with title/body/labels/metadata and a source back-link. +- [ ] Pinned notes sort to the top; labels render as org tags; archived notes are tagged =:archived:=; Keep id/color/pinned/archived/updated land in a property drawer. +- [ ] Each note header carries an immutable =:KEEP_ID:= (gkeepapi server id) and an ISO8601 =:UPDATED:= (the v2 targeting + staleness anchors). +- [ ] The bridge runs async (=make-process= + sentinel); a slow/hung auth does not block Emacs. +- [ ] The master token is read via =cj/auth-source-secret-value= from =authinfo.gpg=; no secret is hardcoded. +- [ ] A missing python3 / gkeepapi / token (distinct bridge stderr tokens) produces a clear =display-warning=, not a load error or a crash; an empty Keep yields an empty page, not an error. +- [ ] =make validate-modules= + launch smoke clean with =google-keep-config= required. + +* Readiness dimensions +- Data model & ownership: Keep is the source of truth; =data/keep.org= is a generated read-only view (v1). Each note maps to one org header keyed by the immutable =:KEEP_ID:=; the bridge JSON (pinned schema) is the contract between Python and elisp, and the JSON-to-org transform is the round-trip contract v2 inverts. +- Errors, empty states & failure: auth failure, a broken gkeepapi, missing Python/token, or zero notes each degrade to a warning (named by the bridge's stderr token) or an empty page, never a crash. The unofficial API breaking is expected, not exceptional. +- Security & privacy: the master token lives in =authinfo.gpg= (gpg-encrypted), read via =cj/auth-source-secret-value=; note content lands in a local generated org file under =data/=. No secret in the repo. The token grants broad Google access — documented as a risk. +- Observability: the warning path names which piece is missing (python3 / gkeepapi / token / auth) from the bridge's stderr token. The generated page's header shows the last refresh time. +- Performance & scale: one async subprocess per manual refresh over N notes (tens to low hundreds); trivial. No background polling in v1. +- Reuse & lost opportunities: follows calendar-sync conventions (generated org under =data/=, atomic temp+rename, =make-process= + sentinel) and reuses =cj/auth-source-secret-value=, =cj/executable-find-or-warn=, org rendering, and the dashboard. gkeepapi supplies the API client, so no endpoint code is written here. +- Architecture fit & weak points: three layers (Python bridge / elisp core / glue) with the fragile API isolated in layer 1 and the core kept free of =.emacs.d= specifics for extraction. Weak point: gkeepapi maintenance and Google auth churn — mitigated by isolation, graceful degradation, and a swappable bridge. +- Config surface: the =data/keep.org= path constant, the auth-source =:host= entry, and a keybinding prefix. No tuning knobs in v1. +- Documentation plan: a setup note (one-time master-token retrieval, =pip install gkeepapi=, the authinfo entry) and the module commentary. No user-migration doc (personal config). +- Dev tooling: the bridge is shell-testable with a JSON fixture; the core gets ERT over the JSON-to-org transform (against the pinned schema); =make validate-modules= + launch smoke for the module. +- Rollout, compatibility & rollback: additive — a new module + a require. Rollback = drop the require and delete =data/keep.org=. No existing behavior changes. +- External APIs & deps: gkeepapi (PyPI) and the unofficial Google Keep mobile endpoint it wraps — the single load-bearing external dependency and the central risk. Python 3 on PATH. A Google master token. + +* Risks, Rabbit Holes, and Drawbacks +- The central risk is gkeepapi breaking when Google changes auth or the private endpoint. It has a history of auth churn. With Takeout deferred there is no second read path, so the mitigations are: isolate gkeepapi in the bridge, degrade to a clear warning (named by stderr token), keep the bridge swappable, and never block Emacs load on it. +- Credential risk: the master token grants broad account access. Keep it in =authinfo.gpg=, never the repo; document revocation. +- v2 staleness: because write follows immediately, the edit-back path must re-fetch the note's current =updated= from Keep before writing, not trust the generated file's =:UPDATED:= (which is only as fresh as the last refresh). Carrying =:UPDATED:= correctly in v1 is what makes that guard possible. +- Scope creep: the pull toward full bidirectional sync, list fidelity, and real-time. v1 is a read-only org view and v2 is targeted read-write; richness and sync stay later, gated behind those landing. + +* Review and iteration history +** 2026-06-24 Wed @ 22:40:00 -0400 — Claude — author +- What: initial draft. +- Why: Craig asked to spec the google-keep in-editor integration before building. It spans a fragile external API, an auth-source credential, a Python/elisp bridge, and a module-to-package trajectory, with real trade-offs on shape, direction, and data path — worth pinning before code. +- Artifacts: docs/specs/google-keep-emacs-integration-spec.org; the todo.org google-keep task. +** 2026-06-25 Thu @ 00:40:00 -0400 — Claude — author +- What: resolved all five decisions with Craig (one by one) and folded them in. Shape = org page (popup dropped, a separate later discussion). Direction = read-only v1 then read-write v2 immediately, so the read path now carries a round-trip-ready model + stable note-identity. Data path = gkeepapi sole bridge, Takeout deferred. Auth = authinfo.gpg via auth-source. Structure = extractable core + glue. Added Phase 4 (v2) and the v2-staleness risk. +- Why: decisions resolved, so the spec could go to spec-review. +- Artifacts: this spec. +** 2026-06-25 Thu @ 00:55:00 -0400 — spec-review (round 1) + author response +- What: independent review against the readiness gate and Phase 4 dimensions, reading calendar-sync.el as the analog. Eight findings (one blocking: the bridge JSON schema; seven non-blocking: data/ path + atomic write, async make-process, the two house helpers, v2 re-fetch, KEEP_ID server id, source-link + archived). Craig accepted all; folded in here. Added the Bridge JSON schema subsection, the calendar-sync-convention reuse, and the Review findings section. +- Why: clears the blocker and the readiness gaps; rubric moves to Ready. +- Artifacts: this spec (Review findings [8/8], Bridge JSON schema). +** 2026-06-25 Thu @ 01:05:00 -0400 — spec-review (round 2) — reviewer +- What: fresh readiness pass on the revised spec. Verified the round-1 resolutions are coherent (data/keep.org consistent throughout, the JSON schema is the single referenced contract, the async/sentinel + stderr-token story lines up across Design/Phase 2/acceptance). Confirmed cj/auth-source-secret-value (host &optional user) exists and is buildable as written, and that the data/keep.org constant matches user-constants.el's gcal-file form. No new findings. +- Why: confirms the spec is implementation-ready after two rounds. +- Artifacts: this spec. Verdict: Ready. +** 2026-06-25 Thu @ 01:30:00 -0400 — Claude — implementer +- What: built v1 (Phases 1-3) in a no-approvals speedrun. Phase 1: the gkeepapi bridge (scripts/google-keep/keep-bridge.py) emitting the pinned JSON schema, degrading with a stderr reason token; 12 Python tests. Phase 2: the elisp pure core (parse/tag/heading/render, extractable) + cj/keep-refresh (async make-process + sentinel, atomic temp-then-rename to keep-file, stderr-token to display-warning); 15 ERT tests. Phase 3: keep-file constant in user-constants.el, C-c k prefix (refresh/open), executable warning, required in init.el. validate-modules clean; no-token path degrades without error. +- Why: spec was Ready and Craig green-lit a no-approvals build of v1. +- Artifacts: scripts/google-keep/keep-bridge.py + test; modules/google-keep-config.el + tests/test-google-keep-config.el; user-constants.el; init.el. Remaining: the one-time gkeepapi + master-token setup (a VERIFY under todo.org Manual testing), then v2. |
