diff options
| author | Craig Jennings <c@cjennings.net> | 2026-06-30 08:11:14 -0400 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2026-06-30 08:11:14 -0400 |
| commit | da5bff1deb1ee8428fe5f14f8c795cf1559f104a (patch) | |
| tree | 39c2ffbba102238e143cc9c25f3976d74b914d89 | |
| parent | e88391f252ba038d9f824308ce2b873d7b29ab79 (diff) | |
| download | dotemacs-da5bff1deb1ee8428fe5f14f8c795cf1559f104a.tar.gz dotemacs-da5bff1deb1ee8428fe5f14f8c795cf1559f104a.zip | |
docs: spec for migrating tests off mocking primitives
Pull the long-term subr-mock migration out of the task list into a discussion spec: the native-comp origin, the stopgap already shipped (variadic sweep plus the arity meta-test), the real scope (the UI-prompt mocks, not the boundary mocks the test rules say to keep), and the open decisions to settle before any code moves.
| -rw-r--r-- | docs/subr-mock-migration-spec.org | 158 |
1 files changed, 158 insertions, 0 deletions
diff --git a/docs/subr-mock-migration-spec.org b/docs/subr-mock-migration-spec.org new file mode 100644 index 000000000..26f1dd576 --- /dev/null +++ b/docs/subr-mock-migration-spec.org @@ -0,0 +1,158 @@ +#+TITLE: Spec: Migrate Tests Off Mocking C Primitives +#+AUTHOR: Craig Jennings +#+DATE: 2026-06-30 +#+STATUS: Draft — for discussion + +* Status + +Draft. Pulled out of =todo.org= (=** TODO [#C] Migrate tests off mocking +primitives (native-comp robustness) :test:refactor:solo:=) so the scope and +approach can be settled before any code moves. Execution is deferred; this +document is the discussion vehicle. + +Companion reference: [[file:native-comp-subr-mocking.org][native-comp-subr-mocking.org]] holds the full mechanism, the +upstream research, and the 2026-06-21 decision. This spec does not restate the +mechanism; it plans the remaining work that decision deferred. + +* Background — how we hit this + +We re-enabled native compilation config-wide (early-init.el, commit 3fd28987, +2026-06-20). Tests that had been green for months immediately started failing +with no change to their source — the first 8 were window-primitive mocks in +=test-dirvish-config-wrappers.el= and =test-calibredb-epub-config.el= +(=window-body-width=, =window-margins=, =current-window-configuration=, +=get-buffer-window=), throwing =wrong-number-of-arguments= for a zero-arg mock +lambda called with one argument. + +** What we struggled with (the consequences) + +- *Intermittent, non-deterministic failure.* The same test passed, then + crashed, then passed again within a session. Native-comp generates a + per-primitive trampoline =.eln= lazily and caches it on disk; whether a mock + "works" depends on whether that trampoline has been built yet. The + non-determinism was the tell, and it made the failures hard to trust or + reproduce. +- *Three distinct failure modes from one cause* (full detail in the companion + doc): (1) trampoline generation failure under =--batch=; (2) silent bypass — + natively-compiled callers ignore the mock and run the real primitive, so a + test passes for the wrong reason; (3) arity mismatch — the trampoline calls + the mock with the primitive's *maximum* arity, so a fixed-arity mock narrower + than the primitive throws. Mode 3 is the one that bit us; modes 1 and 2 sit + latent. +- *The tempting quick fix is the dangerous one.* Disabling subr trampolines + (=native-comp-enable-subr-trampolines nil=) is the most-cited workaround, but + in our native-comp-heavy setup it produces mode 2 — tests that pass while + asserting against the real primitive. A quiet false pass is worse than a loud + crash. +- *Scale of the latent surface.* The suite mocks subrs in hundreds of places. + The variadic sweep touched 188 arity-narrow mocks. + +** What is already done (the stopgap, 2026-06-21) + +Not currently broken — two things shipped (commits 571da499, b62c3c88): + +1. *Variadic sweep.* Every arity-narrow subr mock got =&rest _= appended, which + tolerates the trampoline's full-arity call. Deterministic, keeps trampolines + on, so no silent bypass. Fixes mode 3. +2. *Meta-test gate.* =tests/test-meta-subr-mock-arity.el= statically scans every + test file for =symbol-function= / =fset= / =setf= subr redefinitions and + fails =make test= if any mock can't accept the primitive's maximum arity. A + new arity-narrow mock can't merge silently. + +The stopgap fixes the mode we actually suffered. It leaves modes 1 and 2 latent. +The durable fix the ecosystem (and our own =elisp-testing.md=) points to is to +*not redefine primitives at all*. That is the work this spec scopes. + +* The real scope — most mocks should NOT move + +The raw inventory is large, but the headline number is misleading. =testing.md= +says to mock external boundaries; converting those to "drive real state" would +mean running real shells and touching the real filesystem in unit tests — the +opposite of what we want. So the actual migration target is narrow. + +Current subr-mock sites across 261 test files (=cl-letf= / =fset= / =advice-add= +on the named primitive): + +| Primitive | Sites | Classification | Disposition | +|-------------------------+-------+-------------------------+-----------------------------------| +| shell-command-to-string | 62 | external boundary | keep mocked (variadic) | +|-------------------------+-------+-------------------------+-----------------------------------| +| executable-find | 60 | external boundary | keep mocked (variadic) | +|-------------------------+-------+-------------------------+-----------------------------------| +| shell-command | 29 | external boundary | keep mocked (variadic) | +|-------------------------+-------+-------------------------+-----------------------------------| +| call-process | 17 | external boundary | keep mocked (variadic) | +|-------------------------+-------+-------------------------+-----------------------------------| +| current-time | 11 | time boundary | keep mocked (variadic) | +|-------------------------+-------+-------------------------+-----------------------------------| +| save-buffer | 10 | file I/O boundary | keep, or real temp-file fixture | +|-------------------------+-------+-------------------------+-----------------------------------| +| write-region | 4 | file I/O boundary | keep, or real temp-file fixture | +|-------------------------+-------+-------------------------+-----------------------------------| +| message | 69 | output-silencing | keep mocked (variadic) | +|-------------------------+-------+-------------------------+-----------------------------------| +| completing-read | 25 | UI prompt | MIGRATE — extract pure internal | +|-------------------------+-------+-------------------------+-----------------------------------| +| read-string | 16 | UI prompt | MIGRATE — extract pure internal | +|-------------------------+-------+-------------------------+-----------------------------------| +| yes-or-no-p | 14 | UI prompt | MIGRATE — extract pure internal | +|-------------------------+-------+-------------------------+-----------------------------------| +| read-from-minibuffer | 6 | UI prompt | MIGRATE — extract pure internal | +|-------------------------+-------+-------------------------+-----------------------------------| + +The genuine migration target is the UI-prompt bucket: ~61 sites. Per +=elisp-testing.md='s Interactive-vs-Internal rule, the fix is to extract a pure +internal that takes the value as an argument and test that directly, leaving the +interactive wrapper a thin un-tested (or smoke-tested) shell. That removes the +prompt mock entirely — immune to all three failure modes — and improves the +production code's testability as a side effect. + +Boundary mocks (shell, file I/O, time, =executable-find=, =call-process=) stay +mocked: that is correct unit-test practice, and the variadic form already +handles native-comp. =message= is output-silencing, not logic — keep it. + +* Proposed approach + +Not a single sweep. The migration touches production code (each extraction is a +small design change), so it is incremental and reviewable, not mechanical. + +1. *Scoped exemplar pass first.* Pick one module with a few prompt-mocks, do the + extract-internal conversion there, set the pattern, and measure the per-case + effort. This calibrates the rest. +2. *Batch by module afterward.* Convert remaining UI-prompt sites a module at a + time, each its own commit, with the suite green between. +3. *Leave boundaries alone.* No conversion of shell / file / time / process + mocks. The meta-test keeps them arity-safe. + +* Open decisions (resolve in discussion) + +** TODO Confirm the scope: UI-prompt mocks only, boundaries stay +Is the migration scoped to the ~61 UI-prompt sites (completing-read, read-string, +yes-or-no-p, read-from-minibuffer), with all boundary mocks explicitly out of +scope? Or is there an appetite to also convert the file-I/O mocks +(=save-buffer=, =write-region=) to real temp-file fixtures where it reads +cleaner? + +** TODO Reframe the todo.org task title to match the real scope +The current title — "Migrate tests off mocking primitives" — reads as all 300+ +sites. If we agree on UI-prompt-only, retitle to something like "Extract pure +internals for UI-prompt-mocked tests" so a future session does not re-scope it +as a wholesale sweep. + +** TODO Pick the exemplar module for the first pass +Which module gets the calibrating conversion? A small one with 2-4 prompt-mocks +is ideal. Candidate selection needs a per-module breakdown of the ~61 sites +(not yet collected). + +** TODO Decide priority and timing +Currently =[#C]=, =:solo:=. The suite is not broken (stopgap holds), so this is +test-quality debt, not urgent. Confirm it stays low and gets done in batches +between other work, rather than as a dedicated push. + +* Non-goals + +- Re-deriving or re-documenting the native-comp trampoline mechanism (see the + companion doc). +- Converting boundary mocks (shell, file I/O, time, process, executable-find). +- Removing the variadic-mock convention or the meta-test gate — both stay; they + are the standing protection for every mock that legitimately remains. |
