aboutsummaryrefslogtreecommitdiff
path: root/docs/specs/google-keep-emacs-integration-spec.org
blob: bd12779eb8644ecabe3f58686393dc0714108e5f (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
#+TITLE: Google Keep <-> Emacs integration — Spec
#+AUTHOR: Craig Jennings & Claude
#+DATE: 2026-06-24
#+TODO: TODO | DONE SUPERSEDED CANCELLED

* Metadata
| Status   | Ready (round 1 review folded in)                           |
|----------+------------------------------------------------------------|
| Owner    | Craig                                                      |
|----------+------------------------------------------------------------|
| Reviewer | Codex (spec-review)                                        |
|----------+------------------------------------------------------------|
| Related  | [[file:../../todo.org][todo.org: google-keep in-editor integration]]               |

* Problem / Context

Craig keeps quick notes in Google Keep but works almost entirely in Emacs. Today, reading or acting on a Keep note means leaving Emacs for the phone or the web app — a context switch for content that wants to live next to his org files. He wants Keep notes native to Emacs: browsable, searchable, greppable, and editable, with a path to publish the result as a standalone package.

Two hard constraints shape every choice:

1. *No official API.* Google Keep has no public API. Every live client (gkeepapi and the tools built on it) reverse-engineers the private mobile endpoint. That layer is fragile: it breaks when Google changes auth, and it needs a Google master token, not a password.
2. *The existing MCP is agent-only.* Craig already has a google-keep MCP with full read/write (create/update/find/labels/archive/list-items). But MCP tools are invoked by the *agent* (Claude), not from elisp — there is no elisp MCP client. So the in-editor integration cannot reuse the MCP as its data path; it needs its own.

Researched 2026-06-24: there is no in-editor Emacs Google Keep package on MELPA or GitHub — only KeepToOrg, a one-shot Takeout-HTML-to-org importer (unmaintained). So this is a build, not an adopt.

The closest prior art in this config is =calendar-sync.el=: an existing engine that fetches external data and writes generated org files under =data/=. This spec follows its conventions (generated file under =data/=, atomic temp-then-rename writes, async =make-process= + sentinel, =auth-source= via the house helper) rather than reinventing them.

* Goals and Non-Goals
** Goals
- Keep notes visible and usable inside Emacs without leaving the editor.
- An org-native representation, so notes are searchable/greppable and reuse org machinery.
- A structure that starts as glue in =.emacs.d= and can be extracted to a publishable package (the VAMP / pearl module-to-package pattern).
- Read-write (create/edit notes from Emacs) as the immediate v2 increment — v1 ships read-only first, but the read path is built to carry write, so v2 is additive.
** Non-Goals
- Full bidirectional offline sync, conflict resolution, or real-time updates.
- Faithful round-tripping of every Keep feature (list checkboxes, collaborators, drawings, images).
- Reusing the MCP from elisp (infeasible — agent-only).
** Scope tiers
- v1: read-only. Fetch Keep notes through the gkeepapi bridge and render them as an org page (each note an org header). A manual refresh command. Auth via auth-source. Graceful degradation when the bridge or credentials are missing. The read path establishes a round-trip-ready data model and a stable per-note identity, because v2 builds on them.
- v2 (immediate follow-on, not deferred): read-write — create a note from a region or capture, and edit a note back to Keep. Reuses v1's bridge, auth, data model, and note-identity; adds the inverse direction and a staleness check.
- Out of scope: list/checkbox fidelity, collaborators, drawings/images, and any background or real-time sync.
- Later: list/checkbox rendering, and extracting the core to a standalone package.

* Design

** For the user

A command (=cj/keep-refresh=) pulls the current Keep notes and writes them into one generated org file (=data/keep.org= under =user-emacs-directory=, a constant in =user-constants.el= alongside =gcal-file= — a generated file, not a hand-authored one in =~/org/=). Each note becomes a top-level org heading: the title (or a derived title) as the heading text, the note body as the entry, and Keep metadata as properties — labels as org tags, plus =:KEEP_ID:=, =:PINNED:=, =:COLOR:=, =:ARCHIVED:=, =:UPDATED:= in a drawer. Pinned notes sort first; each header carries an org link back to the source note (=https://keep.google.com/#NOTE/<id>=). The file is plain org, so it is searchable with the agenda, greppable, and linkable. Opening it is just visiting the file; a keybinding and a dashboard entry make it one keystroke. v1 is read-only: editing the org file does not push back to Keep (a header line says so), so there is no accidental-mutation risk while the integration is young. The =:KEEP_ID:= and =:UPDATED:= on each header are what v2 later uses to target an update and detect a stale local copy.

** For the implementer

Three layers, cleanly separable so the core can later be a package:

1. *Data bridge (Python).* A small script using gkeepapi: authenticate with a stored master token, fetch notes, emit JSON on stdout per the schema below. =id= is the gkeepapi server id (immutable across title/content edits — the safe v2 targeting key, never derived from content), and =updated= is the freshness anchor v2's staleness guard reads. This is the one place the unofficial API lives, isolated so a break is contained and swappable; the same bridge gains a write subcommand in v2. On failure it exits non-zero and prints a single reason token on stderr (=no-gkeepapi=, =no-token=, =auth-failed=, =network=) so the elisp sentinel can surface which piece broke.
2. *Org renderer (elisp core).* Runs the bridge with =make-process= + a sentinel (async, so a slow or hung auth never blocks the interactive thread — the calendar-sync pattern), parses its JSON, and writes the org page via a temp file + atomic rename (never a partial =keep.org= under the user's eyes). Records =:KEEP_ID:=/=:UPDATED:= per note, labels as tags, pinned-first, archived notes tagged =:archived:=. =cj/keep-refresh= is the entry point. The master token is read with =cj/auth-source-secret-value= (system-lib.el, the house helper calendar-sync/slack/transcription use) against a documented =:host= entry. The JSON-to-org transform is the round-trip contract v2 inverts. This core takes its file path, auth host, and keymap injected from the glue layer — it never reaches into =user-constants= or binds keys itself, so it lifts out cleanly as a package.
3. *Glue (=modules/google-keep-config.el=).* Supplies the core its =data/keep.org= path, the =:host=, a Keep keybinding prefix, the dashboard entry, and the =(require 'google-keep-config)= in =init.el=. A load-time =cj/executable-find-or-warn= for =python3= warns early when the interpreter is absent (the gkeepapi-missing and token-missing cases surface at refresh time via the bridge's stderr reason token).

** Bridge JSON schema (the load-bearing seam)

Both the v1 transform and the v2 inverse are written against this, so it is pinned here. On success the bridge prints a JSON array of note objects:

#+begin_src js
[ { "id":       "<gkeepapi server id, string>",
    "title":    "<string, may be empty>",
    "text":     "<string; newlines are real \n inside the JSON string>",
    "labels":   ["<label>", ...],        // [] when none
    "pinned":   true|false,
    "archived": true|false,
    "color":    "<keep color name, e.g. \"WHITE\">",
    "updated":  "<ISO8601 UTC, e.g. 2026-06-25T04:12:00Z>" } ]
#+end_src

- =updated= is ISO8601 UTC — sortable, parseable by =parse-iso8601-time-string=, and readable in the drawer. v2's staleness compare depends on this exact, comparable form.
- An empty Keep returns =[]= (a valid empty page, not an error).
- =labels= is always an array (empty, never null). =text= newlines are real =\n= inside the JSON string (standard JSON string encoding).
- Failure is not expressed in JSON: the bridge exits non-zero with one stderr reason token (above). The sentinel maps the token to a =display-warning=.

* Alternatives Considered

** A — Takeout import (one-shot HTML -> org)
- Good, because no auth, fully offline, dead simple, and zero ongoing breakage risk.
- Bad, because it is not live — Craig must manually export a Takeout archive, so notes are stale the moment they are imported, and it cannot write, so it is useless for the v2 write path.
- Neutral, because it is the right tool for an archival snapshot, the wrong one for daily use. Deferred — not built in v1; a possible later no-auth archive importer if ever wanted.

** B (chosen for the data path) — gkeepapi via a Python subprocess bridge
- Good, because it is the only path that gives live notes and (in v2) write-back from inside Emacs, with the full note model.
- Bad, because gkeepapi reverse-engineers a private API: it breaks on Google auth changes, needs Python plus a stored master token, and the bridge is glue Craig owns and maintains.
- Neutral, because the fragility is isolated to one script; when it breaks, the renderer degrades to a warning.

** C — Reuse the google-keep MCP from elisp
- Good, because the MCP already has full read/write and is maintained outside the config.
- Bad, because MCP tools are invoked by the agent, not elisp — there is no elisp MCP client, so this is infeasible for an in-editor feature.
- Neutral, because the MCP stays the right tool for agent-driven Keep access; it just can't power an in-editor integration.

** D — A local HTTP server wrapping gkeepapi
- Good, because elisp would talk clean HTTP instead of spawning a subprocess each refresh.
- Bad, because a long-running personal server is more infrastructure than a single-user note view warrants.
- Neutral, because it is a heavier variant of B; revisit only if subprocess latency ever bites.

* Decisions [5/5]

** DONE Presentation shape: org page of headers
- Decision: We will render notes as one *org page*, each note a top-level header (title + body + metadata properties, labels as tags). Confirmed by Craig 2026-06-25.
- Consequences: easier — reuses org search/agenda/grep/links, nothing bespoke to render, and read-only is safe. Harder — a large Keep collection makes a long file (mitigated by pinned-first sort and org folding).

** DONE Direction: read-only v1, read-write v2 (immediate)
- Decision: We will ship v1 read-only (fetch + render + refresh), then move directly into v2 read-write (create/edit back to Keep). v2 is the immediate next increment, not a deferred someday. The read path must establish a round-trip-ready data model and a stable per-note identity up front, so v2 is additive rather than a rewrite. Confirmed by Craig 2026-06-25.
- Consequences: easier — v1 ships the visible value fast on a path write reuses wholesale, and a parse bug can't corrupt real Keep data while the model is being proven. Harder — the read path carries design weight it wouldn't if write were truly far off (note-identity and the freshness field have to be right in v1), and the write work — note targeting, staleness/overwrite handling — still has to be built right after.

** DONE Data path: gkeepapi subprocess bridge (Takeout deferred)
- Decision: We will use a small Python gkeepapi bridge that emits JSON as the sole data path; it powers read in v1 and gains a write subcommand in v2. A failure degrades to a clear warning. The Takeout-import path is deferred (read-only and stale, so it can't serve v2). The MCP is not in the data path. Confirmed by Craig 2026-06-25.
- Consequences: easier — one live path that serves both read and write, fragility isolated to one swappable script. Harder — a Python + gkeepapi dependency and a maintained bridge; the unofficial API can break and needs a master token, with no second read path as a safety net (the warning is the degradation).

** DONE Auth: master token in authinfo.gpg via auth-source
- Decision: We will store the master token in =authinfo.gpg= and read it via =auth-source= (with =cj/auth-source-secret-value=, the house helper), and document the one-time master-token retrieval. Confirmed by Craig 2026-06-25.
- Consequences: easier — consistent with existing credential handling, no plaintext secret, the daemon's auth-source cache applies. Harder — the one-time token retrieval is a manual setup step, and a revoked/expired token surfaces as an auth failure the renderer must report cleanly.

** DONE Structure: google-keep-config.el glue + extractable core
- Decision: We will build the integration as =modules/google-keep-config.el= (the =.emacs.d= glue: paths, keys, dashboard, auth wiring) plus a self-contained core (the bridge runner + org renderer) written so the core can later move to a standalone =keep.el=-style package, mirroring the VAMP / pearl migration. The core takes paths/keys/host injected from the glue — it never reaches into =user-constants= or binds keys itself. Confirmed by Craig 2026-06-25.
- Consequences: easier — usable immediately in =.emacs.d=, with a clean seam for later extraction. Harder — the discipline of keeping the core free of =.emacs.d=-specific assumptions from the start.

* Review findings [8/8]
** DONE Generated-file path + atomic write (round 1, non-blocking)
- Finding: the draft put the file at =~/org/keep.org= (hand-authored org area) with no atomic write, diverging from calendar-sync (generated org under =data/=, temp-then-rename). Risk: a partial write leaves a truncated file the user is viewing, and a generated file in =~/org/= invites edits that get overwritten.
- Resolution: path is =data/keep.org= as a =user-constants.el= constant; the renderer writes via temp file + atomic rename.

** DONE Async subprocess via make-process + sentinel (round 1, non-blocking)
- Finding: the draft said "run as a subprocess" without committing to async; gkeepapi can hang on auth churn, and a synchronous =call-process= would freeze Emacs.
- Resolution: specified =make-process= + sentinel (the calendar-sync pattern); the warning-on-failure degradation lives in the sentinel.

** DONE Name the auth-source house helper (round 1, non-blocking)
- Finding: the draft read the token via "auth-source" generically; the house helper is =cj/auth-source-secret-value= (system-lib.el).
- Resolution: spec now calls it out with a documented =:host= entry.

** DONE Name the missing-tool helper (round 1, non-blocking)
- Finding: the draft promised a display-warning for a missing tool but didn't cite =cj/executable-find-or-warn=, and didn't separate the gkeepapi-missing vs token-missing failure modes.
- Resolution: glue uses =cj/executable-find-or-warn= for =python3=; the bridge emits distinct stderr reason tokens the sentinel surfaces.

** DONE Pin the bridge JSON schema (round 1, BLOCKING — cleared)
- Finding: Phase 1 listed JSON fields but left the =updated= format, label/newline serialization, empty-result, and error envelope unstated — the round-trip contract both v1 and v2's staleness guard depend on.
- Resolution: added the "Bridge JSON schema" subsection — ISO8601-UTC =updated=, always-array =labels=, standard JSON newline encoding, =[]= for empty, and exit-nonzero + stderr-token for failure.

** DONE v2 staleness guard re-fetches before write (round 1, non-blocking)
- Finding: v2 must not trust the possibly-stale =:UPDATED:= in the generated file; a phone edit between refresh and edit-back would be clobbered.
- Resolution: Phase 4 states the write path re-fetches the target note's current =updated= from Keep and compares before overwriting, prompting on mismatch.

** DONE Note KEEP_ID is the immutable server id (round 1, non-blocking)
- Finding: "stable" id asserted but not anchored.
- Resolution: spec states =:KEEP_ID:= is the gkeepapi server id, immutable across edits, never derived from content.

** DONE Small enhancements: source link + archived handling (round 1, non-blocking, optional)
- Finding: a back-link to the Keep web note and archived-note handling were free given the data already carried.
- Resolution: each header links to =keep.google.com/#NOTE/<id>=; archived notes are tagged =:archived:= (the JSON carries =archived=).

* Implementation phases

** Phase 1 — Data bridge
A Python script (gkeepapi) that authenticates with the stored master token and prints the JSON array per the Bridge JSON schema on stdout (=id= = immutable server id, =updated= = ISO8601 UTC). On failure: exit non-zero with one stderr reason token. Standalone and testable from the shell with a fixture; no Emacs yet. Tree stays working (new files only).

** Phase 2 — Org renderer + refresh
The elisp core + =modules/google-keep-config.el=: run the bridge with =make-process= + a sentinel, parse the JSON, and write =data/keep.org= via temp file + atomic rename (heading + body + property drawer per note, recording =:KEEP_ID:= and =:UPDATED:=, labels as tags, archived tagged =:archived:=, pinned-first, a source back-link per header). =cj/keep-refresh= regenerates it; auth via =cj/auth-source-secret-value=. A header line marks the file a read-only view. The sentinel maps a bridge failure (stderr token) to a =display-warning= naming the missing piece — never errors at load.

** Phase 3 — Access UX + un-orphan
Keybindings (a Keep prefix), a dashboard entry, a load-time =cj/executable-find-or-warn= for =python3=, and the =(require 'google-keep-config)= in =init.el=. Optional: a dedicated read-only major mode for the buffer.

** Phase 4 (v2) — read-write
The immediate next increment after v1 lands. A write subcommand on the bridge (gkeepapi create/update), and an elisp path that targets a note by =:KEEP_ID:= and, before overwriting, re-fetches that note's current =updated= from Keep and compares it against the drawer's =:UPDATED:= (a staleness guard — abort/prompt on mismatch so a phone edit since the last refresh isn't clobbered). Entry points to create a note from a region/capture and edit one back. Specced in detail once v1's read model is proven on real notes; listed here so Phases 1-3 don't paint into a read-only corner.

(Later, not specced here, logged to todo.org: list/checkbox rendering; extract the core to a package.)

* Acceptance criteria
- [ ] =cj/keep-refresh= fetches the current Keep notes and writes =data/keep.org= via temp + atomic rename, one header per note with title/body/labels/metadata and a source back-link.
- [ ] Pinned notes sort to the top; labels render as org tags; archived notes are tagged =:archived:=; Keep id/color/pinned/archived/updated land in a property drawer.
- [ ] Each note header carries an immutable =:KEEP_ID:= (gkeepapi server id) and an ISO8601 =:UPDATED:= (the v2 targeting + staleness anchors).
- [ ] The bridge runs async (=make-process= + sentinel); a slow/hung auth does not block Emacs.
- [ ] The master token is read via =cj/auth-source-secret-value= from =authinfo.gpg=; no secret is hardcoded.
- [ ] A missing python3 / gkeepapi / token (distinct bridge stderr tokens) produces a clear =display-warning=, not a load error or a crash; an empty Keep yields an empty page, not an error.
- [ ] =make validate-modules= + launch smoke clean with =google-keep-config= required.

* Readiness dimensions
- Data model & ownership: Keep is the source of truth; =data/keep.org= is a generated read-only view (v1). Each note maps to one org header keyed by the immutable =:KEEP_ID:=; the bridge JSON (pinned schema) is the contract between Python and elisp, and the JSON-to-org transform is the round-trip contract v2 inverts.
- Errors, empty states & failure: auth failure, a broken gkeepapi, missing Python/token, or zero notes each degrade to a warning (named by the bridge's stderr token) or an empty page, never a crash. The unofficial API breaking is expected, not exceptional.
- Security & privacy: the master token lives in =authinfo.gpg= (gpg-encrypted), read via =cj/auth-source-secret-value=; note content lands in a local generated org file under =data/=. No secret in the repo. The token grants broad Google access — documented as a risk.
- Observability: the warning path names which piece is missing (python3 / gkeepapi / token / auth) from the bridge's stderr token. The generated page's header shows the last refresh time.
- Performance & scale: one async subprocess per manual refresh over N notes (tens to low hundreds); trivial. No background polling in v1.
- Reuse & lost opportunities: follows calendar-sync conventions (generated org under =data/=, atomic temp+rename, =make-process= + sentinel) and reuses =cj/auth-source-secret-value=, =cj/executable-find-or-warn=, org rendering, and the dashboard. gkeepapi supplies the API client, so no endpoint code is written here.
- Architecture fit & weak points: three layers (Python bridge / elisp core / glue) with the fragile API isolated in layer 1 and the core kept free of =.emacs.d= specifics for extraction. Weak point: gkeepapi maintenance and Google auth churn — mitigated by isolation, graceful degradation, and a swappable bridge.
- Config surface: the =data/keep.org= path constant, the auth-source =:host= entry, and a keybinding prefix. No tuning knobs in v1.
- Documentation plan: a setup note (one-time master-token retrieval, =pip install gkeepapi=, the authinfo entry) and the module commentary. No user-migration doc (personal config).
- Dev tooling: the bridge is shell-testable with a JSON fixture; the core gets ERT over the JSON-to-org transform (against the pinned schema); =make validate-modules= + launch smoke for the module.
- Rollout, compatibility & rollback: additive — a new module + a require. Rollback = drop the require and delete =data/keep.org=. No existing behavior changes.
- External APIs & deps: gkeepapi (PyPI) and the unofficial Google Keep mobile endpoint it wraps — the single load-bearing external dependency and the central risk. Python 3 on PATH. A Google master token.

* Risks, Rabbit Holes, and Drawbacks
- The central risk is gkeepapi breaking when Google changes auth or the private endpoint. It has a history of auth churn. With Takeout deferred there is no second read path, so the mitigations are: isolate gkeepapi in the bridge, degrade to a clear warning (named by stderr token), keep the bridge swappable, and never block Emacs load on it.
- Credential risk: the master token grants broad account access. Keep it in =authinfo.gpg=, never the repo; document revocation.
- v2 staleness: because write follows immediately, the edit-back path must re-fetch the note's current =updated= from Keep before writing, not trust the generated file's =:UPDATED:= (which is only as fresh as the last refresh). Carrying =:UPDATED:= correctly in v1 is what makes that guard possible.
- Scope creep: the pull toward full bidirectional sync, list fidelity, and real-time. v1 is a read-only org view and v2 is targeted read-write; richness and sync stay later, gated behind those landing.

* Review and iteration history
** 2026-06-24 Wed @ 22:40:00 -0400 — Claude — author
- What: initial draft.
- Why: Craig asked to spec the google-keep in-editor integration before building. It spans a fragile external API, an auth-source credential, a Python/elisp bridge, and a module-to-package trajectory, with real trade-offs on shape, direction, and data path — worth pinning before code.
- Artifacts: docs/specs/google-keep-emacs-integration-spec.org; the todo.org google-keep task.
** 2026-06-25 Thu @ 00:40:00 -0400 — Claude — author
- What: resolved all five decisions with Craig (one by one) and folded them in. Shape = org page (popup dropped, a separate later discussion). Direction = read-only v1 then read-write v2 immediately, so the read path now carries a round-trip-ready model + stable note-identity. Data path = gkeepapi sole bridge, Takeout deferred. Auth = authinfo.gpg via auth-source. Structure = extractable core + glue. Added Phase 4 (v2) and the v2-staleness risk.
- Why: decisions resolved, so the spec could go to spec-review.
- Artifacts: this spec.
** 2026-06-25 Thu @ 00:55:00 -0400 — spec-review (round 1) + author response
- What: independent review against the readiness gate and Phase 4 dimensions, reading calendar-sync.el as the analog. Eight findings (one blocking: the bridge JSON schema; seven non-blocking: data/ path + atomic write, async make-process, the two house helpers, v2 re-fetch, KEEP_ID server id, source-link + archived). Craig accepted all; folded in here. Added the Bridge JSON schema subsection, the calendar-sync-convention reuse, and the Review findings section.
- Why: clears the blocker and the readiness gaps; rubric moves to Ready.
- Artifacts: this spec (Review findings [8/8], Bridge JSON schema).