#+TITLE: Design: Coverage Reporting
#+AUTHOR: Craig Jennings
#+DATE: 2026-04-22

* Status

Draft. Not yet implemented.

* Problem

Before committing or opening a PR, there's no quick way to answer "are the lines I just changed actually covered by tests?" Line-level coverage for the *whole* project is also missing, and there's no artifact to track coverage over time.

The primary user-facing need is the first one: point-in-time feedback on in-flight changes, triggered from Emacs. The other two (whole-project report, long-term artifact) fall out naturally once the primary path exists.

The tooling should be pluggable so the same workflow covers Elisp today and Python, TypeScript, and Go later — without rebuilding the UI for each language.

* Non-Goals

- Continuous in-buffer overlays (fringe marks, line highlights). Parked over performance concerns.
- Mutation testing or any signal other than line coverage.
- CI integration beyond emitting an LCOV artifact. No coveralls, no GitHub Actions wiring.
- Shadowing or replacing existing test-running commands (=make test=, =make test-file=, etc.).

* Approaches Considered

** Recommended: diff-aware report with pluggable backends

Core engine reads an LCOV file, shells to ~git diff~ at a selectable scope, intersects, and displays the result in a compilation-mode-derived buffer. Language-specific "backends" each produce LCOV in their own way and register themselves with the core.

*Pros:* Directly serves the primary use case. LCOV is a universal format, so new languages plug in without touching the core. Compilation-mode inheritance gives free =next-error= / =previous-error= navigation.

*Cons:* More code than a "just run coverage and read the output" approach. Backend registry adds one layer of indirection (small — ~30 lines).

** Rejected: non-interactive pre-commit hook

Would run coverage on every commit and report uncovered-changed-lines to stderr. Literal fit for the use case but adds a long delay to every commit and offers no way to inspect non-staged scopes.

** Rejected: coverage as a =review-code= skill criterion

Would fold coverage into the existing pre-commit review skill. Clean in principle, but couples =review-code= to Emacs-specific tooling and makes ad-hoc inspection (outside a review) awkward.

** Rejected: mutation testing instead of line coverage

Stronger signal than coverage but minutes-to-hours runtime on the current 265-file suite, and no polished Elisp tool exists. Different conversation.

* Design

** Architecture

Three files:

- =modules/coverage-core.el= — engine + backend registry + user-facing command. Language-agnostic.
- =modules/coverage-elisp.el= — the initial backend. Registers itself on load.
- (Future) =modules/coverage-python.el=, =coverage-typescript.el=, =coverage-go.el= — each ~30 lines, self-registering.

=init.el= requires the core and the active backends.

*** Backend protocol

Each backend is a plist registered into =cj/coverage-backends=:

#+begin_src emacs-lisp
(:name       'elisp
 :detect     (lambda () ...)   ; non-nil if current project matches
 :run        (lambda (cb) ...) ; kick off coverage build; invoke CB with LCOV path
 :lcov-path  (lambda () ...))  ; where the LCOV lives (for re-reading without running)
#+end_src

Detection precedence: =.dir-locals.el= override (=cj/coverage-backend= set to a backend name), then project-root fingerprints (=go.mod=, =pyproject.toml=, =package.json=, =.el= files + Makefile, etc.). First =:detect= that matches wins. No silent fallback — if nothing matches, the command errors with guidance.
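
A minimal sketch of how the registry and the detection precedence above might fit together. Only =cj/coverage-backends=, =cj/coverage-backend=, and =cj/coverage-backend-for-project= are named in this design; =cj/coverage-register-backend= and the sample =go= backend are hypothetical, and the sketch assumes =:detect= is called with =default-directory= bound to the project root.

#+begin_src emacs-lisp
(require 'cl-lib)

(defvar cj/coverage-backends nil
  "Registered backend plists, in registration order.")

(defvar cj/coverage-backend nil
  "Backend name forced for a project, normally set via .dir-locals.el.")

(defun cj/coverage-register-backend (backend)
  "Register BACKEND, a plist.  Hypothetical helper, not part of the design.
Re-registering an existing :name replaces it in place, so the original
registration order (and therefore detection priority) is preserved."
  (let* ((name (plist-get backend :name))
         (existing (cl-find name cj/coverage-backends
                            :key (lambda (b) (plist-get b :name)))))
    (setq cj/coverage-backends
          (if existing
              (cl-substitute backend existing cj/coverage-backends :test #'eq)
            (append cj/coverage-backends (list backend))))))

(defun cj/coverage-backend-for-project (root)
  "Return the backend plist for the project at ROOT, or signal `user-error'."
  (let ((default-directory (file-name-as-directory root)))
    (cond
     ;; 1. Explicit override wins, but must name a registered backend.
     (cj/coverage-backend
      (or (cl-find cj/coverage-backend cj/coverage-backends
                   :key (lambda (b) (plist-get b :name)))
          (user-error "Unknown coverage backend %s; registered: %s"
                      cj/coverage-backend
                      (mapcar (lambda (b) (plist-get b :name))
                              cj/coverage-backends))))
     ;; 2. Otherwise the first backend whose :detect returns non-nil.
     ((cl-find-if (lambda (b) (funcall (plist-get b :detect)))
                  cj/coverage-backends))
     ;; 3. No silent fallback.
     (t (user-error "No coverage backend matches %s; register one or set cj/coverage-backend in .dir-locals.el"
                    root)))))

;; Illustrative registration; the fingerprint and LCOV path are assumptions.
(cj/coverage-register-backend
 (list :name 'go
       :detect (lambda () (file-exists-p "go.mod"))
       :run #'ignore
       :lcov-path (lambda () (expand-file-name ".coverage/lcov.info"))))
#+end_src

Keeping the override check ahead of the fingerprint loop means a misconfigured =.dir-locals.el= fails loudly instead of falling through to auto-detection.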

*** Pure helpers

- =cj/--coverage-parse-lcov FILE= → hash-table ={file → covered-line-set}=.
- =cj/--coverage-changed-lines SCOPE BASE= → hash-table ={file → changed-line-set}= by shelling a =git diff --unified=0= for the selected scope and parsing hunk headers.
- =cj/--coverage-intersect COVERED CHANGED= → per-file records with three buckets: covered, uncovered, not-tracked.

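All three will be fully ERT-tested. The parser and the intersection are pure; the changed-lines helper shells out to =git=, so its tests stub the shell call.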
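
As an illustration of the first helper, here is one possible shape for the LCOV parser. It is a sketch under assumptions: line sets are represented as ascending lists of integers (the design does not fix a representation), only =SF:=, =DA:=, and =end_of_record= records are read, and a file that appears in LCOV with no hits is kept with an empty set so it still counts as tracked.

#+begin_src emacs-lisp
(defun cj/--coverage-parse-lcov (file)
  "Parse LCOV FILE into a hash table mapping source path to covered lines.
Each value is an ascending list of line numbers with a hit count above 0.
Unrecognised or malformed lines are skipped rather than signalled."
  (let ((table (make-hash-table :test #'equal))
        (current nil))
    (with-temp-buffer
      (insert-file-contents file)
      (while (not (eobp))
        (let ((line (buffer-substring-no-properties
                     (line-beginning-position) (line-end-position))))
          (cond
           ;; SF:<path> opens a record; make sure the key exists.
           ((string-match "\\`SF:\\(.+\\)\\'" line)
            (setq current (match-string 1 line))
            (puthash current (gethash current table) table))
           ;; DA:<line>,<hits> inside a record.
           ((and current
                 (string-match "\\`DA:\\([0-9]+\\),\\([0-9]+\\)" line))
            (when (> (string-to-number (match-string 2 line)) 0)
              (push (string-to-number (match-string 1 line))
                    (gethash current table))))
           ((string= line "end_of_record")
            (setq current nil))))
        (forward-line 1)))
    ;; Normalise each line set: unique and ascending.
    (maphash (lambda (k v) (puthash k (sort (delete-dups v) #'<) table))
             table)
    table))
#+end_src

With this representation, =(gethash FILE table 'missing)= distinguishes "tracked, nothing covered" (nil) from "not in LCOV at all" (='missing=), which the three-bucket classification depends on.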

** Data Flow

1. User invokes =cj/coverage-report= (bound to =F7=).
2. Core resolves the backend for the current project.
3. =completing-read= prompts for scope:
   - "Working tree — all uncommitted changes"
   - "Staged — about to commit"
   - "Branch vs parent" (uses =cj/coverage-base-branch= → =@{upstream}= → =main= in order)
   - "Branch vs main" (explicit)
4. Freshness check: if =lcov.info= is missing, or older than the newest changed file, prompt "Run coverage now?" Yes runs the backend's =:run= asynchronously via =compile=; no reads the stale file anyway.
5. Parse LCOV, compute changed lines, intersect.
6. Display a report buffer in a mode derived from =compilation-mode=.
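
Steps 3 and 4 can be sketched as two small functions. Both names (=cj/--coverage-diff-args=, =cj/--coverage-stale-p=) and the scope symbols are hypothetical. The mapping from scope to =git diff= arguments is one reasonable reading of the scope labels, not a settled decision; in particular, the branch scopes here compare the merge base against =HEAD= and so ignore uncommitted changes.

#+begin_src emacs-lisp
(require 'cl-lib)

(defun cj/--coverage-diff-args (scope base)
  "Return the `git diff' argument list for SCOPE, a symbol.
BASE is the branch to compare against for the branch scopes; it
defaults to \"main\"."
  (append '("diff" "--unified=0" "--no-color")
          (pcase scope
            ;; Staged plus unstaged changes relative to the last commit.
            ('working-tree '("HEAD"))
            ;; Only what is in the index.
            ('staged '("--cached"))
            ;; A...B diffs from the merge base of A and B up to B.
            ((or 'branch-vs-parent 'branch-vs-main)
             (list (concat (or base "main") "...HEAD")))
            (_ (error "Unknown coverage scope: %S" scope)))))

(defun cj/--coverage-stale-p (lcov changed-files)
  "Return non-nil if LCOV is missing or older than any of CHANGED-FILES.
Changed files that no longer exist (deletions) never count as newer."
  (or (not (file-exists-p lcov))
      (cl-some (lambda (f) (file-newer-than-file-p f lcov))
               changed-files)))
#+end_src

For example, =(cj/--coverage-diff-args 'staged nil)= returns =("diff" "--unified=0" "--no-color" "--cached")=. Returning an argument list rather than a command string leaves the caller free to use =process-file= and avoid shell quoting.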

** Persistence

- =.coverage/lcov.info= at the project root, gitignored. Overwritten on each run.
- No long-term storage. Historical tracking is explicitly out of scope for v1.

** Error Handling

*Pre-flight:*
- No backend matches → =user-error= with instructions to register a backend or set =.dir-locals.el=.
- =.dir-locals.el= names an unknown backend → error listing registered backends.
- Not in a git repository → error; don't swallow git's stderr.
- "Branch vs main" scope on a repo with no common ancestor (orphan branch, shallow clone missing the fork point) → "no merge base with main" error, suggest "Working tree" or "Staged" scope.

*During the coverage run:*
- Backend =:run= fails (test failure, Make error) → keep the =compile= buffer visible, do *not* proceed to display a report. Partial data is worse than no data.
- Run completes but no LCOV produced → error naming the expected path.
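
The two run-time rules above could be enforced in a finish handler along these lines. This is a sketch, not the planned implementation: both function names are hypothetical, and it relies on =compilation-finish-functions= passing a status string that starts with "finished" on a zero exit. It also assumes the =make coverage= target mentioned under Testing writes =.coverage/lcov.info= relative to =default-directory=.

#+begin_src emacs-lisp
;; Assumes lexical-binding: t, so the lambda in the second function
;; captures LCOV and CALLBACK until the run ends.
(require 'compile)

(defun cj/--coverage-handle-finish (status lcov-path callback)
  "Decide what happens after a coverage run ends with STATUS.
STATUS is the string `compilation-finish-functions' passes as its second
argument.  Return non-nil only when CALLBACK was invoked with LCOV-PATH."
  (cond
   ;; Test failure or Make error: leave the compile buffer as the
   ;; visible result and do not build a report from partial data.
   ((not (string-prefix-p "finished" status)) nil)
   ;; Clean exit but nothing written: name the expected path.
   ((not (file-readable-p lcov-path))
    (user-error "Coverage run finished but produced no LCOV at %s"
                lcov-path))
   (t (funcall callback lcov-path) t)))

(defun cj/--coverage-elisp-run (callback)
  "Run `make coverage' via `compile'; call CALLBACK with the LCOV path."
  (let ((lcov (expand-file-name ".coverage/lcov.info"))
        (buffer (compile "make coverage")))
    (with-current-buffer buffer
      ;; Buffer-local hook, so other compilations are unaffected.
      (add-hook 'compilation-finish-functions
                (lambda (_buffer status)
                  (cj/--coverage-handle-finish status lcov callback))
                nil t))))
#+end_src

Splitting the decision out of the hook keeps the part worth testing free of any process or buffer setup.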

*Post-flight classification:* three buckets, not two.
- *Covered* — changed line in LCOV's covered-line set.
- *Uncovered* — changed line in a tracked file but not covered.
- *Not tracked* — changed file isn't in LCOV at all (test files, READMEs, config). Reported separately — don't conflate "coverage didn't look here" with "tests didn't exercise this code."
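
The three-bucket rule can be sketched as follows. It assumes both inputs are hash tables mapping a file name to a list of line numbers, that a tracked file with no hits is present in COVERED with an empty list, and that each result record is a plist; none of these representations is fixed by this design.

#+begin_src emacs-lisp
(require 'cl-lib)

(defun cj/--coverage-intersect (covered changed)
  "Classify the lines in CHANGED against COVERED.
Both are hash tables mapping file → list of line numbers (or nil).
Return one plist per changed file, sorted by file name, with the keys
:file, :covered, :uncovered and :not-tracked."
  (let ((records nil))
    (when changed
      (maphash
       (lambda (file lines)
         ;; :untracked marks a file that LCOV never mentions.
         (let ((hits (if covered
                         (gethash file covered :untracked)
                       :untracked)))
           (push (if (eq hits :untracked)
                     (list :file file :covered nil :uncovered nil
                           :not-tracked lines)
                   (list :file file
                         :covered (cl-remove-if-not
                                   (lambda (n) (memql n hits)) lines)
                         :uncovered (cl-remove-if
                                     (lambda (n) (memql n hits)) lines)
                         :not-tracked nil))
                 records)))
       changed))
    (sort records
          (lambda (a b)
            (string< (plist-get a :file) (plist-get b :file))))))
#+end_src

Note that under this rule a changed comment or blank line in a tracked file lands in the uncovered bucket, because LCOV has no hit for it; whether to filter such lines is left open here.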

*Happy-path degenerates:*
- Zero changed lines in scope → "No changes in this scope; nothing to report."
- All changed lines covered → "N of N changed lines covered."

** Keybindings

*Global:*
- =F7= → =cj/coverage-report= (prompts scope, shows report).
- =C-u F7= → force re-run regardless of LCOV freshness.

*In the report buffer* (compilation-mode derived, most inherited for free):
- =RET= → jump to source under point.
- =n= / =p= → next / previous uncovered line.
- =g= → refresh (re-run + redisplay).
- =q= → bury buffer.

*Globally available via compilation-mode integration:*
- =M-g n= / =M-g p= → =next-error= / =previous-error= on the last compilation buffer.
- =C-x `= → visit next uncovered line without leaving the current buffer.

The =F4=–=F7= developer block (compile+run, debug, test, coverage) gets its full rework in a separate todo ticket. The coverage work binds =F7= now because that is its final position.
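
A sketch of how little the report mode may need to define itself. The mode name =cj/coverage-report-mode=, the refresh command, and the =FILE:LINE: uncovered= report line format are all assumptions made for illustration. In recent Emacs versions =RET=, =n=, =p=, and =q= should already come from =compilation-mode-map= and its parent =special-mode-map=, so only =g= is rebound.

#+begin_src emacs-lisp
(require 'compile)

(defun cj/coverage-report-refresh ()
  "Placeholder for the re-run + redisplay command (hypothetical name)."
  (interactive)
  (message "Coverage refresh is not implemented in this sketch"))

(defvar cj/coverage-report-mode-map
  (let ((map (make-sparse-keymap)))
    ;; Override the inherited `recompile' binding on g.
    (define-key map (kbd "g") #'cj/coverage-report-refresh)
    map)
  "Keymap for `cj/coverage-report-mode'.
`define-derived-mode' makes `compilation-mode-map' its parent.")

(define-derived-mode cj/coverage-report-mode compilation-mode "Coverage"
  "Major mode for the diff-coverage report buffer."
  ;; Assumed report line format: FILE:LINE: uncovered
  (setq-local compilation-error-regexp-alist
              '(("^\\([^:\n]+\\):\\([0-9]+\\): uncovered" 1 2))))

;; Global entry point; `cj/coverage-report' itself is defined elsewhere.
(global-set-key (kbd "<f7>") #'cj/coverage-report)
#+end_src

Because uncovered lines are matched as compilation "errors", =next-error= and =previous-error= work on the report without further code.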

** Testing

*Pure helpers, fully tested* (Normal / Boundary / Error for each):
- =cj/--coverage-parse-lcov= — handcrafted LCOV fragments in temp files; empty, headers-only, spaces/unicode in filenames, malformed lines, missing =end_of_record=.
- =cj/--coverage-changed-lines= — =cl-letf= over =shell-command-to-string= to return canned =git diff= output; single hunk, new-file hunk, deletion-only hunk, binary marker, no-diff case.
- =cj/--coverage-intersect= — pure table-in / table-out; covered ⊇ changed, unknown files, nil/empty inputs.
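
To make the stubbing approach concrete, here is a sketch of a changed-lines helper and one ERT case for it. The split into a pure =cj/--coverage-parse-diff= (a hypothetical name) plus a thin shell wrapper is an assumption, and the wrapper is deliberately simplified: it handles only the staged and working-tree scopes and ignores BASE. File names that git quotes or escapes are not handled.

#+begin_src emacs-lisp
(require 'cl-lib)
(require 'ert)

(defun cj/--coverage-parse-diff (text)
  "Parse `git diff --unified=0' TEXT into a hash table file → changed lines."
  (let ((table (make-hash-table :test #'equal))
        (file nil))
    (dolist (line (split-string text "\n"))
      (cond
       ((string-prefix-p "diff --git " line) (setq file nil))
       ((string-match "\\`\\+\\+\\+ b/\\(.+\\)\\'" line)
        (setq file (match-string 1 line)))
       ;; Any other +++ header, e.g. /dev/null for a deleted file.
       ((string-prefix-p "+++ " line) (setq file nil))
       ;; Hunk header: @@ -old[,n] +start[,count] @@; count defaults to 1.
       ((and file
             (string-match
              "\\`@@ -[0-9]+\\(?:,[0-9]+\\)? \\+\\([0-9]+\\)\\(?:,\\([0-9]+\\)\\)? @@"
              line))
        (let ((start (string-to-number (match-string 1 line)))
              (count (if (match-string 2 line)
                         (string-to-number (match-string 2 line))
                       1)))
          ;; A count of 0 is a deletion-only hunk and adds no lines.
          (dotimes (i count)
            (push (+ start i) (gethash file table)))))))
    (maphash (lambda (k v) (puthash k (nreverse v) table)) table)
    table))

(defun cj/--coverage-changed-lines (scope _base)
  "Return changed lines for SCOPE.  Simplified: _BASE is ignored."
  (cj/--coverage-parse-diff
   (shell-command-to-string
    (concat "git diff --unified=0 --no-color "
            (if (eq scope 'staged) "--cached" "HEAD")))))

(ert-deftest cj/--coverage-changed-lines-normal ()
  "A replacement hunk, a deletion-only hunk, and a binary marker."
  (cl-letf (((symbol-function 'shell-command-to-string)
             (lambda (_cmd)
               (concat "diff --git a/foo.el b/foo.el\n"
                       "--- a/foo.el\n"
                       "+++ b/foo.el\n"
                       "@@ -3 +3,2 @@\n"
                       "-old\n+new\n+newer\n"
                       "@@ -10,2 +10,0 @@\n"
                       "-gone\n-gone too\n"
                       "diff --git a/logo.png b/logo.png\n"
                       "Binary files a/logo.png and b/logo.png differ\n"))))
    (let ((table (cj/--coverage-changed-lines 'staged nil)))
      (should (equal (gethash "foo.el" table) '(3 4)))
      (should (= 1 (hash-table-count table))))))
#+end_src

Stubbing =shell-command-to-string= keeps the test independent of any real repository, at the cost of not exercising the command line itself.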

*Backend registry, structurally tested:*
- =cj/coverage-backend-for-project ROOT= — synthetic temp project roots with marker files; assert correct backend. Registration-order test: two backends match, first-registered wins.

*Minimally tested or skipped:*
- =cj/coverage-report= interactive command — one smoke test with a prepared LCOV and a stubbed git-diff. No tests for the prompt UI or the compilation-buffer display.
- The elisp backend's =:run= function — shells to =make coverage=; integration-test-shaped, low value, slow. Skipped by design.

* Open Questions

- [ ] Which tests should a coverage run actually execute? All of them (simple, slow for 265 files), or only the test files whose target modules changed (fast, but dependent-test discovery in Elisp is non-trivial)? Deferred until implementation.
- [ ] Default behavior when LCOV is stale but not missing: prompt, or auto-rerun? Current design prompts. Revisit after first use.
- [ ] Whether =cj/coverage-base-branch= should be a single value or a list of candidates (useful if you routinely stack PRs more than one level deep). Single value for v1.

* Next Steps

1. Replace the existing =[#C] Integrate undercover.el for test coverage= entry in =todo.org= with a sharper implementation ticket referencing this design.
2. Begin implementation, starting with the pure helpers (TDD) and the elisp backend, then the =cj/coverage-report= command, then the =make coverage= Makefile target.
3. Open questions above → individual =arch-decide= ADRs if they turn out to be load-bearing; otherwise resolve inline during implementation.