.ai/workflows/readability-audit.org


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242

#+TITLE: Readability Audit Workflow
#+AUTHOR: Craig Jennings & Claude
#+DATE: 2026-06-28

* Overview

A pass over one file, a set of modules, or the whole tree that makes the code
*readable to a future maintainer*. It checks four things and fixes the cheap
ones in place: the file-top commentary, the inline comments, the names, and the
physical organization of the code. Structural changes that need a real refactor
(splitting a module, renaming a public symbol) are not done here — they are
filed as =:refactor:= tasks so they get their own design and test pass.

This is language-agnostic. Where a step names a language-specific tool or
convention, it's stated as "the project's <X>, if it has one" — read the
project's =CLAUDE.md= / =notes.org= and the language bundle to resolve the
concrete tool.

* Where it sits among the code-quality tools

These tools are a pipeline, not duplicates. Knowing which to reach for:

- *readability-audit* (this workflow) — prose and human-reader clarity:
  comments, file headers, names, and physical organization. Judgment-driven
  (does this comment lie? does this name reveal intent? can a newcomer place
  this file in a minute?).
- =/refactor= — structure on measurable metrics: complexity, duplication,
  dead-code, the =simplification= lens (behavior-preserving logic/size
  reduction), and =rename= (executes a codebase-wide symbol rename).
- =/simplify= — behavior-preserving cleanup of the current diff, applied
  directly.

The link that keeps them from overlapping: when this audit finds a structural
problem too big for a comment/name fix — a module to split, a *public* symbol to
rename across call sites — it *files* a =:refactor:= task rather than doing it
here. =/refactor= (rename, simplification) or =/start-work= then executes that
filed task with a proper design and test plan. Readability finds and files;
=/refactor= transforms.

* Problem We're Solving

Source files drift toward two opposite failure modes, and both hurt the next
person to open the file:

- *Documentation rot and noise.* Headers carry stale user-manual content
  (quick-starts, full option matrices, setup walkthroughs) that belongs in user
  docs; comments restate what the next line already says; comments go out of
  date and start lying; placeholder =TODO=/=FIXME= stubs and conversational
  asides accumulate. A blank summary or a missing file-top description leaves a
  reader with no map.
- *Structural fog.* Names that don't reveal intent force the reader to decode
  them; related functions scatter; a public entry point sits far from the
  private helpers it calls; a file grows to hold several unrelated
  responsibilities.

Left alone, opening a file costs more every month. The fix is a repeatable audit
with a clear, checkable standard, run on demand or as files are touched.

* Exit Criteria

For the audited scope:

1. *Every file has an accurate top section* that states what the file does and
   how it fits the rest of the codebase — terse, no user-manual content, and
   carrying the project's file-header convention where it has one.
2. *Every surviving comment earns its place* — it explains a *why* the code
   can't (a constraint, a workaround and its reason, an ordering dependency, a
   warning), it is accurate against the current code, and it is terse. Obvious
   "describe the next line" comments are gone.
3. *Names reveal intent* — no cryptic abbreviations; the project's
   public/private visibility convention is applied consistently.
4. *Related code is co-located* — a public function's private helpers sit right
   after it; the file reads top-to-bottom by descending abstraction; sections
   group what belongs together.
5. *Structural problems too big to fix in a comment pass are filed* as
   =:refactor:= tasks, not left as a vague note and not half-done inline.
6. *Nothing broke* — the build is clean and the test suite is green
   (comment/name edits are behavior-preserving, so this should always hold; it
   is the proof, not a hope). See "Graceful degradation" for projects without a
   suite.

* When to Use This Workflow

- "Let's run the readability-audit workflow."
- "Audit the comments and commentary in <file/area>."
- "Clean up the structure/organization of <module>."
- After landing a feature, on the files it touched, before moving on.
- On a single file you just found hard to read.
- As a tree-wide sweep: inventory all the source files, audit each, batch the
  fixes.

Do NOT use this to *perform* the structural refactors themselves (use
=/refactor= or =/start-work= against a filed task) or to hunt for bugs /
complexity / duplication (that is =/refactor=, not a readability pass).

* Approach: How We Work Together

** Phase 1 — Scope and inventory

Pick the target: one file, a named module set, or the whole tree. For a sweep,
list the source files (honor =.aiignore=) and decide coverage. Lean on the
language's own doc linters as a first filter where they exist — many flag a
missing or blank file summary and malformed headers; run the project's lint
target first.

** Phase 2 — Audit each file against the four dimensions

Record findings as =file:line — issue — proposed fix=. The four dimensions:

*** A. File-top commentary (the map)

- Present, and *accurate* against what the file now does.
- States purpose, the file's role/architecture, and key entry points —
  *tersely*. A reader should learn what this is and how it connects in a few
  lines.
- Carries the project's file-header convention where it has one (a metadata
  block, a module docstring, a standard header comment). If the project has no
  header convention, skip this sub-check — don't invent one.
- Does *not* carry user-manual content — quick-starts, full option matrices,
  step-by-step setup. That belongs in user docs; move it, don't keep it in the
  source header.
- Mechanics are correct for the language: a filled summary line (not blank), the
  expected section markers, the expected footer.

*** B. Inline comments (why, not what)

- Explains a *why* the code cannot: a workaround *and its reason*, an ordering
  or load dependency, business-logic rationale, a real warning ("do not reorder
  these — deadlock").
- Is *accurate* — matches the current code. A wrong comment is worse than none;
  fix or delete on sight.
- Is *terse and useful*. Delete the obvious "describe the next line" comment
  unless it names a non-obvious constraint. Replace a stale placeholder or a
  rambling aside with the real one-line reason, or remove it.
- Convert a comment that's only restating the code into a better *name* instead
  (see C).

*** C. Names (carry the what/how so comments don't have to)

- Intention-revealing variable and function names; no cryptic single letters or
  abbreviations outside tight local scopes.
- The project's public/private convention is applied consistently and correctly:
  a helper only called within the file is private; a user-facing or
  intentionally-reusable symbol is public. (Resolve the concrete convention from
  the language and the project — a naming prefix, an export list, an
  access modifier.)
- When a comment exists only to explain a name, rename instead.

*** D. Organization (co-location and ordering)

- Related functions sit together. A public function's private helpers come
  *right after* it (stepdown / proximity / "reads like a newspaper").
- The file reads top-to-bottom by descending abstraction.
- Sections group what belongs together.
- *Cohesion check:* if the file holds several unrelated responsibilities, or has
  grown large enough that the top no longer describes one coherent thing, flag a
  split into layered owners — but see Phase 4: that's a filed refactor, not an
  inline fix.

** Phase 3 — Apply the cheap, safe fixes inline

Dimensions A, B, and C are *comment- and name-only* and *solo* (no design or
preference call): apply them directly. After each file (or a batch), verify with
the project's gates: parse/syntax check, a clean build (no new warnings), and a
green test suite. Comment/name edits can't change behavior, so green is the proof
the edit was clean, not a behavior check.

For a tree-wide sweep, drive the uniform rewrites mechanically and verify the
whole batch at once: a *mechanical applier with a boundary assertion* that
replaces a well-defined header span is reliable and fast, then one suite run
covers the batch. Keep the varied cases (header-line fixes, summary fixes that
must preserve surrounding metadata, inline-comment surgery, generated-file
headers) as careful per-file edits. (The boundary markers are language-specific;
the principle — mechanical applier + assert + one suite run for uniform
rewrites, per-file judgment for varied cases — is not.)

** Phase 4 — File the structural refactors, don't do them here

Dimension D's bigger findings — split a module, rename a *public* symbol across
call sites, move a function to a different file — are real refactors with their
own risk and test surface. Do *not* slip them into a readability pass. File each
as a =:refactor:= task in =todo.org= with the specific finding, so it gets
=/refactor= or =/start-work= with a proper design and test plan. This is the
line between the cheap clarity win and the structural change; keeping it sharp is
what lets the audit stay safe and fast.

** Phase 5 — Verify and commit in logical batches

Full suite green, build clean. Commit the doc/comment changes as =docs:= (or
=refactor:= where a header/structure normalized) in cohesive batches — one
commit per coherent slice (a set of condensed commentaries, the
generated-file-header fixes, the obvious-comment prune), not one mega-commit and
not one-per-file. Generated files are fixed *in their generator* and then
regenerated, so the next regen stays compliant.

* Graceful degradation

The audit adapts to what the project provides:

- *No file-header convention* → skip dimension A's metadata sub-check; still
  check the summary/description for accuracy and terseness.
- *No test suite* → the green-suite proof in Phases 3 and 5 is unavailable. Fall
  back to the strongest gate the project has (compile/byte-compile, parse check,
  linters) and *flag the weaker proof as a known limit* — a behavior-preserving
  edit is lower-risk, but say plainly that there's no suite to confirm it.
- *No doc linter* → do the Phase 1 first-filter by reading instead; the audit
  still runs, just without the cheap pre-pass.

* Principles to Follow

- *Comments explain why; code explains what.* If a comment restates the code,
  delete it or turn it into a better name.
- *Accuracy beats completeness.* A wrong or stale comment is worse than no
  comment. When in doubt, delete.
- *Terse and useful.* Every comment and every header line earns its place. The
  source header is not the user manual — move manuals to user docs.
- *Readable means the next person, fast.* The test of the top-section and the
  organization is whether a maintainer who has never seen the file can place it
  and navigate it in under a minute.
- *Keep the cheap pass cheap.* Comment/name fixes are solo and land inline.
  Structural splits and public renames are not — they get filed, designed, and
  tested separately.
- *Preserve legal and attribution headers verbatim.* Vendored / GPL / copyright
  notices are never condensed away by a readability pass.
- *Manual validation is still Craig's.* Solo means no input is needed to *do*
  the work; visual/behavior confirmation afterward is expected where relevant.

* Living Document

Update this with what real runs teach. Lessons worth keeping as the standard
sharpens:

- *Interpretation default for "fix blank summary":* when a rewrite shows only a
  header + summary and omits a metadata block the file already has, keep the
  existing metadata and replace only the header line and the summary. Its
  absence from the rewrite means "leave it," not "delete it."
- *Generated files:* fix the *generator*, then regenerate. Editing the generated
  file directly is reverted on the next regen.
- *Vendored files:* preserve the copyright/attribution; do not auto-condense a
  licensed header.
- *Mechanical applier + assert + one suite run* is the safe way to do a
  many-file uniform rewrite; per-file judgment is for the varied cases.