.ai/workflows/spec-response.org


#+TITLE: Spec-Response Workflow
#+AUTHOR: Craig Jennings
#+DATE: 2026-05-23
#+STARTUP: showall

* Overview

The spec-response workflow processes external reviews of a design spec and folds them into the spec until it is implementation-ready. A reviewer (human or another agent) leaves a review file next to the spec — typically produced by its counterpart, the *spec-review* workflow, which writes =<spec-basename>-review.org= and assigns a readiness rubric. Claude works through every recommendation, deciding accept / modify / reject for each, updates the spec for the accepted ones, documents the modified and rejected ones with reasons, then deletes the review file. Repeat for each spec under review.

The output is a spec a reader could implement from, plus a durable record — inside the spec — of why any recommendation was changed or declined, so the reviewer can find the reasoning later without re-litigating.

This workflow was first run on 2026-05-23 against the linear-emacs =issue-query-spec.org= and =issue-representation-spec.org= reviews, and written up from that run.

* Problem We're Solving

A review is only useful if every point in it gets a decision. Without a defined process:

- *Recommendations get dropped silently.* A skim picks up the obvious ones and loses the rest; the reviewer can't tell whether a missing change was rejected or forgotten.
- *Rejections leave no trace.* When a recommendation is declined for a good reason, that reason lives only in chat and evaporates. The next reviewer re-raises it.
- *Reviews get rubber-stamped.* Accepting everything uncritically is as bad as ignoring it — the point of the review is judgment, including the judgment to say no.
- *Cross-spec conflicts go unreconciled.* When two related specs are reviewed and the reviews pull in opposite directions, nobody resolves the tension and it surfaces during implementation.
- *The spec never converges.* Without an explicit "implementation-ready" bar, review cycles drift.

*Impact:* slow convergence, lost rationale, re-litigated decisions, and specs that look reviewed but aren't decided.

* Exit Criteria

A spec's review is fully processed when:

1. *Every recommendation has an explicit disposition* — accepted, modified, or rejected. None dropped.
2. *Accepted recommendations are woven into the spec body* — the spec reads as if they were always there, not appended as a changelog.
3. *Modified and rejected recommendations are documented in a bottom section* (e.g. "Review dispositions") with a one-paragraph reason each, so the reviewer can find the reasoning.
4. *Review/response provenance is documented in the spec* — iteration count/date, contributor, role, what changed, and why.
5. *Pre-agreed decisions are flipped to =DONE=* — each settled decision's =TODO= becomes =DONE=, and the =[/]= cookie on the spec's =* Decisions= heading reflects the tally.
6. *Cross-spec tensions are reconciled in writing* when related specs were reviewed together.
7. *The review file is deleted* once 1-6 hold.
8. *Tracking is updated* — the spec's VERIFY/task body notes "review incorporated" and whether it's implementation-ready.
9. *Implementation tasks exist* — once the author confirms the spec is Ready, the project's =todo.org= carries the full implementation-task breakdown (Phase 6), reviewed for completeness, with =:solo:= marked and a Manual-testing task for everything else.

The whole run is done when no =*-review.org= files remain and each spec is judged implementation-ready (or its remaining blockers are named).

*Measurable validation:* a reader scanning the review against the revised spec can find, for every review point, either the change in the body or its disposition at the bottom. Nothing is unaccounted for.
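
The validation above can be sketched as a small accounting check. This is a minimal illustration in Python, not part of the workflow itself; the =Decision= record, the =unaccounted= function, and the finding ids (=R1=, =R2=, ...) are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    ACCEPTED = "accepted"
    MODIFIED = "modified"
    REJECTED = "rejected"


@dataclass
class Decision:
    finding_id: str           # hypothetical label for one review point
    disposition: Disposition
    reason: str = ""          # expected for modified and rejected points


def unaccounted(finding_ids: list[str], decisions: list[Decision]) -> list[str]:
    """Return review points that were dropped or that lack a written reason."""
    by_id = {d.finding_id: d for d in decisions}
    problems = []
    for fid in finding_ids:
        decision = by_id.get(fid)
        if decision is None:
            problems.append(f"{fid}: no disposition")
        elif (decision.disposition is not Disposition.ACCEPTED
              and not decision.reason.strip()):
            problems.append(f"{fid}: {decision.disposition.value} without a reason")
    return problems
```

An empty result corresponds to "nothing is unaccounted for"; any entry names a point that still needs a decision or a reason.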

* When to Use This Workflow

Trigger when:

- A reviewer drops a review file alongside a spec — convention: same basename with a =-review.org= suffix (=foo.org= → =foo-review.org=).
- Craig says "respond to the review" / "let's run the spec-response workflow" / "process the spec reviews."
- Any time a spec needs to absorb structured external feedback and converge to implementation-ready.

Not for: drafting a spec from scratch (that's a design conversation), or informal inline comments (handle those conversationally / via the cj-comment flow).

* Precondition: spec filename

Before Phase 0, verify the spec being responded to ends with =-spec.org=. Every design, decision, or planning document under a project's =docs/= directory carries that suffix as its identifier. The =.org= extension alone is not enough because =docs/= holds non-spec org files too.

If the spec does not end with =-spec.org=, stop immediately and surface the mismatch:

#+begin_example
The spec <path> does not end with -spec.org. Either the wrong file was named, or
the file should be renamed first. Spec workflows require the -spec.org suffix as a
guard against pointing the workflow at tutorial, inventory, or setup docs.
#+end_example

The review file the response consumes follows the convention =<spec-basename>-review.org=, so a misnamed spec produces a mis-pointed review file too. Fix the spec name first.

The user resolves the mismatch and re-invokes the workflow. Do not proceed with the response against a misnamed spec.
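
A minimal sketch of this guard in Python, assuming the spec is passed as a file path. The function names =require_spec_suffix= and =review_path_for= are hypothetical; the workflow does not prescribe an implementation.

```python
from pathlib import Path

SPEC_SUFFIX = "-spec.org"


def require_spec_suffix(spec_path: str) -> Path:
    """Return the path if it names a spec; otherwise stop with the mismatch."""
    path = Path(spec_path)
    if not path.name.endswith(SPEC_SUFFIX):
        raise ValueError(
            f"The spec {path} does not end with {SPEC_SUFFIX}. Either the wrong "
            "file was named, or the file should be renamed first."
        )
    return path


def review_path_for(spec_path: str) -> Path:
    """Derive <spec-basename>-review.org, located next to the spec."""
    path = require_spec_suffix(spec_path)
    basename = path.name[: -len(".org")]  # strip only the extension
    return path.with_name(f"{basename}-review.org")
```

Deriving the review path only after the suffix check passes mirrors the rule above: a misnamed spec is fixed first, so the review file is never looked up under the wrong name.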

* Approach: How We Work Together

** Phase 0: Orient

1. List the review files (=ls docs/*-review.org= or wherever they live). Process them one at a time; the order does not matter unless the user names one.
2. Re-read the *current* spec, not your memory of it — it may have changed since you wrote it (the user or a linter may have edited it, and pre-agreed decisions may already be encoded in the tracking file).
3. Note any *pre-agreed decisions* the reviewer or user has already settled — in the review's own "Agreed decisions" section, or in the spec's tracking task. These are settled inputs. Don't reopen them; bake them in.
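
Step 1 can be sketched in Python as follows. This assumes the review files sit in a single directory (here =docs=, as in the =ls= example); =spec_for_review= and =pending_reviews= are hypothetical helper names.

```python
from pathlib import Path

REVIEW_SUFFIX = "-review.org"


def spec_for_review(review_name: str) -> str:
    """Map foo-review.org back to foo.org, following the naming convention."""
    if not review_name.endswith(REVIEW_SUFFIX):
        raise ValueError(f"{review_name} is not a review file")
    return review_name[: -len(REVIEW_SUFFIX)] + ".org"


def pending_reviews(docs_dir: str = "docs") -> list[tuple[Path, Path]]:
    """Return (review_file, spec_file) pairs in docs_dir, sorted by name."""
    reviews = sorted(Path(docs_dir).glob("*" + REVIEW_SUFFIX))
    return [(r, r.with_name(spec_for_review(r.name))) for r in reviews]
```

The sort order is only a default for the sketch; a user-supplied order would take precedence.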

** Phase 1: Read the whole review before touching the spec

Read the entire review first. Recommendations interact — an early "medium" finding may be subsumed by a "high" one, or two findings may point at the same edit. Decide dispositions with the whole picture in view, not finding-by-finding as you scroll.

** Phase 2: Decide a disposition for every recommendation

For each recommendation, choose one:

- *Accept* — the recommendation is right as written. Plan the edit.
- *Modify* — the recommendation is right in spirit but wrong in detail or scope. Adjust it, and record what you changed and why.
- *Reject* — the recommendation doesn't fit. Record why.

*Engage critically.* Rubber-stamping is a failure mode. On a strong review most points are accepts, but actively look for the genuine modify/reject cases — they are where your judgment earns its place. Examples from the first run:

- *Modify:* the review proposed an automatic cache-TTL defcustom. Accepted the goal (fresh data) but deferred TTL to vNext because for a single-user tool an explicit force-refresh + clear-cache command covers it without invalidation complexity.
- *Reject:* the review floated a separate =default-issue-filter= defcustom. Rejected as redundant — a fixed default command plus a default-view preference already covered it.
- *Modify (scope):* "split representation from network into modules" — adopted the logical boundaries but kept a single file, because the package is a single-file MELPA target and a file split is a larger restructuring.

*Honor pre-agreed decisions.* If the reviewer marked a decision "agreed" (or the user pre-encoded it), treat it as accepted and settled — flip its decision task to =DONE= (adding the decision to the spec's =* Decisions= section first if it only existed as an open question).

** Phase 3: Reconcile cross-spec tensions

When related specs were reviewed together, two reviews can recommend opposite things. Don't pick one and silently drop the other — find the framing in which both hold. Example from the first run: one review said "switching views *replaces* the active file"; the other said "refresh by *merge*, not wholesale rewrite." Reconciliation: switching *source* replaces; refreshing the *same source* merges by stable ID with a per-item conflict check. Write the reconciliation into the spec.
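
The reconciled rule from that example can be sketched in Python. This is an illustration under stated assumptions, not the linear-emacs implementation: the =Item= and =ActiveFile= types, the =locally_edited= flag, and the choice to keep the local version on conflict are all hypothetical, since the text only says "a per-item conflict check."

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    id: str                        # stable ID used as the merge key
    title: str
    locally_edited: bool = False   # assumption: marks unsynced local edits


@dataclass
class ActiveFile:
    source: str
    items: dict[str, Item] = field(default_factory=dict)


def load_view(active: ActiveFile, source: str,
              fetched: list[Item]) -> tuple[ActiveFile, list[str]]:
    """Return the new active file plus the ids that hit a conflict."""
    if source != active.source:
        # Switching source: replace the active file wholesale.
        return ActiveFile(source, {i.id: i for i in fetched}), []

    # Same source: merge by stable ID, checking each item for conflicts.
    merged = dict(active.items)
    conflicts = []
    for incoming in fetched:
        current = merged.get(incoming.id)
        if (current is not None and current.locally_edited
                and current.title != incoming.title):
            conflicts.append(incoming.id)  # keep local copy, report the conflict
            continue
        merged[incoming.id] = incoming
    return ActiveFile(source, merged), conflicts
```

Both review statements hold in this framing: the first branch is "replace," the second is "merge, not wholesale rewrite."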

** Phase 4: Update the spec

1. *Weave accepted recommendations into the body.* The spec should read naturally — a new "Selector semantics" section, a revised phase plan, an added test-strategy section — not a list of "review said X so I did Y." The body reflects the decisions; it doesn't narrate them.
2. *Add a bottom "Review dispositions" section* listing only the *modified* and *rejected* recommendations, each with a short reason. Close it with a one-line "everything else accepted as written" so the reader knows the omissions from this section are accepts, not gaps.
3. *Update or add a bottom "Review and iteration history" section.* Every response pass gets an entry, even when all findings are accepted. Each entry is an org subheading with a compound id followed by three body fields:

   Heading format: =YYYY-MM-DD Day @ HH:MM:SS -ZZZZ — Contributor — Role=

   Timestamp, contributor, and role concatenate into an opaque id; nothing more should be read into it. The contributor carries a short parenthetical project context where the session matters (e.g. =Claude Code (rulesets)=, =Codex=, =Craig Jennings=). The role can be compound (e.g. =reviewer + responder=) when one pass fused multiple roles.

   Body fields:

   - *What changed:* compact summary of accepted, modified, and rejected work.
   - *Why:* the rationale or decision pressure behind the changes.
   - *Artifacts:* review filename, disposition section, task IDs, source checks, or commits when useful.
4. *Flip settled decisions to =DONE=.* Each decision the decision-maker has agreed flips its =TODO= to =DONE=; the =[/]= cookie on the =* Decisions= heading tracks the tally. A contested decision stays =TODO= with the back-and-forth under its =*** Discussion= child header. Decisions still =TODO= should be only what genuinely still blocks, each with an owner and a by-when.
5. *Raise the spec to implementation-ready:* consolidate decisions up front, add any implementation prerequisites the review surfaced (e.g. a schema-verification checklist), a consolidated test strategy, and a phased plan ordered so dependencies (like an output model everything depends on) come early. *Gate:* the spec Status cannot move past =draft= to implementation-ready while any decision is still =TODO= — the =[/]= cookie must read complete, or the author consciously accepts and records the risk of building with one open.
6. *Update the status line* to note "review incorporated (<reviewer>, <date>)."
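
The gate in step 5 can be sketched as a tally over the spec text. This Python sketch assumes each decision is a level-2 heading (=** TODO ...= or =** DONE ...=) directly under a level-1 =* Decisions= heading, which matches the =*** Discussion= child described above; the function names are hypothetical.

```python
import re

HEADING = re.compile(r"^(\*+)\s+(.*)$")
TASK = re.compile(r"^(TODO|DONE)\b")


def decisions_tally(org_text: str) -> tuple[int, int]:
    """Count (done, total) decision tasks under the '* Decisions' heading."""
    done = total = 0
    inside = False
    for line in org_text.splitlines():
        match = HEADING.match(line)
        if not match:
            continue
        stars, title = match.group(1), match.group(2)
        if len(stars) == 1:
            # A new top-level heading opens or closes the Decisions section.
            inside = title.startswith("Decisions")
            continue
        if inside and len(stars) == 2:
            task = TASK.match(title)
            if task:
                total += 1
                if task.group(1) == "DONE":
                    done += 1
    return done, total


def may_leave_draft(org_text: str, risk_accepted: bool = False) -> bool:
    """True when every decision is DONE, or the author recorded the risk."""
    done, total = decisions_tally(org_text)
    return done == total or risk_accepted
```

In practice Org maintains the =[/]= cookie itself; the tally here only shows what "the cookie must read complete" means as a check.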

** Phase 5: Close out and iterate

1. *Delete the review file* — only after every recommendation has a disposition. Its deletion is the signal the review is fully processed.
2. *Update tracking* — the spec's VERIFY/task body gets a line noting review incorporated, what changed at a high level, which recommendations were modified (pointing at Review dispositions), and whether it's now implementation-ready pending final go.
3. *Update the session log* (state changed this turn).
4. *Move to the next review file.* Repeat Phases 1-5 until none remain.
5. *Report* what was accepted-wholesale, what was modified/rejected and why, any cross-spec reconciliations, and the implementation-ready verdict per spec.

** Phase 6: On Ready, build the implementation-task breakdown

This is the *last* step of the workflow, and it runs *only after the author confirms the spec is Ready* — never during review iterations. A Ready spec nobody can act on is unfinished; this phase turns it into tracked work. It applies to every project type (library, application, service, docs set).

1. *Decide where the tasks live.* If the work is spinning off into its own project/repo, move the parent task into that project's =todo.org= (and relocate the spec with it); otherwise use the current project's =todo.org=. One parent task owns the effort; the phase tasks hang under it.

2. *Create one task per implementation phase* from the spec's =Implementation phases=, in dependency order, so the task set as a whole describes the *full* milestone (e.g. v1) with no gaps. Each task body names the deliverable, its tests, and how it is verified. Carry over deferred/vNext work and any publish/release steps as their own tasks.

3. *Turn a critical eye on completeness.* Re-read the spec — every phase, every acceptance criterion, every named deliverable, every data-safety/principle rule — and confirm each has a home in a task. The work is not done when the tasks merely exist; it is done when nothing in the spec is left untracked. This completeness pass is mandatory regardless of project type.

4. *Mark =:solo:=* on every task the agent can both build *and* verify end-to-end with no involvement from the user — including self-verification of visuals via a headless/screenshot harness. Anything that needs the user's hands, eyes, real hardware, real hosts/credentials, or sign-off is *not* =:solo:=.

5. *Add a "Manual testing and validation" task* (per =verification.md=) collecting everything not =:solo:=. As you write the phase tasks and discover items that need the user, add a child test under it with: a one-line "What we're verifying", the exact steps (one action per line), and an explicit "Expected" result — written so someone familiar with the project's domain but *not* this feature can run it from the text alone.

6. *Add publish/release tasks* — remote creation, initial push, mirroring/release — and *defer the outward-facing steps* (creating a public remote, enabling a mirror, tagging a release) until the user explicitly confirms. Surface anything that needs the user's infrastructure (a server-side repo, credentials) rather than guessing it.

The workflow is complete when these tasks exist, the completeness pass confirms nothing in the spec is untracked, and the user can pick up implementation from the task list alone.
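
The completeness pass and the =:solo:= split can be sketched as set operations. This is a minimal Python illustration; the =Task= record, its fields, and the item labels in the usage note are hypothetical, and judging what an agent can verify remains a human or agent decision that the sketch only records.

```python
from dataclasses import dataclass


@dataclass
class Task:
    title: str
    covers: set[str]                 # spec items this task is the home for
    agent_can_build: bool = True
    agent_can_verify: bool = True    # end to end, with no user involvement

    @property
    def solo(self) -> bool:
        """A task is :solo: only if the agent can both build and verify it."""
        return self.agent_can_build and self.agent_can_verify


def untracked(spec_items: set[str], tasks: list[Task]) -> set[str]:
    """Spec items (phases, criteria, deliverables, rules) with no task."""
    covered = set().union(*(t.covers for t in tasks))
    return spec_items - covered


def manual_testing_children(tasks: list[Task]) -> list[str]:
    """Titles that belong under the 'Manual testing and validation' task."""
    return [t.title for t in tasks if not t.solo]
```

For example, with one task covering ={"phase-1", "ac-1"}= and another covering ={"phase-2"}=, a spec that also lists =ac-2= leaves =ac-2= untracked, so the breakdown is not yet complete.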

* Principles to Follow

** Every recommendation gets a decision
Accept, modify, or reject — but never silently drop. The reviewer must be able to account for every point.

** Document the no's, not the yes's
Accepted recommendations live in the spec body (the change *is* the record). Modified and rejected ones need an explicit written reason at the bottom, because the change is invisible and the reasoning would otherwise be lost. The asymmetry is deliberate.

** Critique, don't rubber-stamp
A review you accept entirely without finding a single thing to push on probably wasn't read critically. Your judgment — including a well-reasoned no — is the value you add.

** Pre-agreed decisions are settled inputs
Don't re-open what the reviewer or user already agreed. Bake it in and move on.

** Reconcile, don't choose-and-drop
When reviews conflict, find the framing where both are right. Silently honoring one and ignoring the other reintroduces the conflict at implementation time.

** A reject goes back to the reviewer, not just into the file
Recording a reasoned reject is the floor, not the close. Communicate the rejection and its reason to the reviewer — a reject is a two-party event, not a unilateral call. If the reviewer disagrees, that's a discussion: weigh the counter, and if you still can't agree, escalate to whoever owns the decision rather than letting the author's "no" stand by default. "I'm not doing that" with no reason the reviewer can engage is the failure mode. (For a tight solo author-reviewer loop this is lightweight; for a team it's the difference between a review and a rubber-stamp-in-reverse.)

** The spec reads forward, the dispositions read backward
The body is written for the implementer (no review archaeology). The dispositions section is written for the reviewer (the reasoning trail). Keep the two audiences separate.

** The history explains provenance, not implementation behavior
The spec body should still be the implementation contract. The bottom =Review and iteration history= section is for provenance: number of iterations, dates, contributors (including agents), roles, what each pass contributed, and why. Keep it short enough that future readers can understand how decisions evolved without rereading chats, deleted review files, or session logs.
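
The entry heading defined in Phase 4 can be generated mechanically. A minimal Python sketch follows, assuming entries are level-2 headings as in this file's own history section; =history_heading= is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
SEP = " \u2014 "  # the dash separator used in the heading format


def history_heading(when: datetime, contributor: str, role: str) -> str:
    """Build 'YYYY-MM-DD Day @ HH:MM:SS -ZZZZ', contributor, role as a heading."""
    if when.tzinfo is None:
        raise ValueError("timestamp needs a UTC offset")
    day = DAYS[when.weekday()]  # fixed names avoid locale-dependent %a
    stamp = when.strftime(f"%Y-%m-%d {day} @ %H:%M:%S %z")
    return "** " + SEP.join([stamp, contributor, role])


# Usage: reproduces the second entry heading in this file's history.
example = history_heading(
    datetime(2026, 5, 23, 12, 59, 31, tzinfo=timezone(timedelta(hours=-5))),
    "Claude Code (rulesets)",
    "reviewer + responder",
)
```

Requiring a timezone-aware timestamp keeps the =-ZZZZ= offset from silently going missing.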

** Re-read before editing
The spec may have changed since you last saw it. Edit the current file and reconcile against the latest tracking state.

** Converge to implementation-ready
The bar is "a reader could implement from this." Decisions consolidated, prerequisites named, tests strategized, phases ordered by dependency. The loop terminates when the spec-review rubric reaches =Ready= (or =Ready with caveats= the author accepts) and every blocking, high-priority finding has a disposition — not when you run out of energy. Iterate toward that bar, not in circles.
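
The termination condition can be written as a predicate. This Python sketch assumes the rubric is reported as a string and that "blocking, high-priority" is recorded as a single flag per finding; the =Finding= record and =response_loop_done= are hypothetical names, and rubric values other than the two named above are treated as not ready.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    summary: str
    blocking: bool                      # blocking, high-priority finding
    disposition: Optional[str] = None   # "accepted", "modified", "rejected"


def response_loop_done(rubric: str, findings: list[Finding],
                       caveats_accepted: bool = False) -> bool:
    """True when the rubric is Ready and every blocker has a disposition."""
    rubric_ok = rubric == "Ready" or (
        rubric == "Ready with caveats" and caveats_accepted
    )
    blockers_decided = all(
        f.disposition is not None for f in findings if f.blocking
    )
    return rubric_ok and blockers_decided
```

Both conditions must hold: a =Ready= rubric with an undecided blocker does not end the loop, and neither does a full set of dispositions under a rubric that is not yet =Ready=.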

* Living Document

Update this workflow as we learn what works. Capture new disposition patterns, better reconciliation examples, and refinements to the implementation-ready bar.

** Updates and Learnings

*** 2026-05-23: First run — two coupled specs, one reviewer
*Learning:* The reviews were high quality and mostly accepts, which made the genuine modify/reject cases the high-value work — and the cross-spec reconciliation (replace-on-switch vs merge-on-refresh) was the single most useful synthesis, something neither review could do alone. The "Review dispositions" section closing with "everything else accepted as written" proved important: without it, a reader can't tell whether an unlisted recommendation was accepted or missed.

*Open question for next run:* whether to keep dispositions inside each spec (current approach) or in a shared response doc when many specs are reviewed at once. Inside-the-spec kept the reasoning next to the affected text, which read well for two specs.

* Review and iteration history

** 2026-05-23 Sat @ 00:00:00 -0500 — Claude Code (linear-emacs) — author
- *What:* Initial draft of the workflow file. Validated by the same 2026-05-23 linear-emacs run that exercised spec-review.org.
- *Why:* Craig wanted to use each contributor for their value — his time goes into user scenarios and direction-setting, agents' time goes into structured technical review (accept/modify/reject discipline, disposition tracking) where agents are more consistent than humans. spec-response formalized the responder-side discipline so review findings get processed completely rather than skimmed.
- *Artifacts:* Delivered to =rulesets/inbox/= on 2026-05-23 in the same 4-file handoff as spec-review.org. Heading timestamp is approximate per the opaque-id convention; no exact time was recorded.

** 2026-05-23 Sat @ 12:59:31 -0500 — Claude Code (rulesets) — reviewer + responder
- *What:* Three response-workflow edits: Overview now names spec-review as the review-file producer (closing the cross-link the handoff had asked for); added "A reject goes back to the reviewer" principle (Google eng-practices, IETF RFC 7282); added explicit response-loop termination condition under "Converge to implementation-ready." Paired review-workflow edits landed in the same commit and are recorded in spec-review.org's history.
- *Why:* Pre-install review found the missing cross-link to spec-review. Disposition-tracking research flagged unilateral rejection without reviewer escalation as a documented anti-pattern.
- *Artifacts:* Same commit =7f2aea1= (the workflow pair shipped together).

** 2026-05-28 Thu @ 03:22:25 -0500 — Craig Jennings — author
- *What:* Added the paired Review-and-iteration-history requirement on the response side. New goal item #4 with downstream renumbering; new Phase 4 sub-step with the entry shape (heading-as-compound-id + What/Why/Artifacts body); new "history explains provenance, not implementation behavior" principle.
- *Why:* Same as spec-review.org's matching entry: Craig had no way to scan a spec under review and tell which iterations had been processed, which were new, and which were pending; file mtime was the only signal, and it is error-prone. The responder side needs the same visibility surface so the response cycle can leave its provenance alongside the review side's.
- *Artifacts:* Commit =55adf6e=.

** 2026-05-28 Thu @ 03:23:01 -0500 — Claude Code (rulesets) — author
- *What:* Backfilled this section with entries for the workflow file's prior iterations (Draft 1, Review-and-Fold, Requirement addition). Used commit author dates where exact; Draft 1's timestamp is an opaque-id placeholder since no exact time was recorded.
- *Why:* The iteration-history requirement landed in =55adf6e= without retroactive entries for the workflow file's own prior history. This pass closes that gap so the section starts with full provenance rather than empty.
- *Artifacts:* Working draft sources at =working/spec-workflows-iteration-history-backfill/draft-entries.org= (deleted on this commit). Spec-response cycle (two Codex reviews and three rulesets responses on the draft) validated the entries and the entry-shape before splicing.

** 2026-06-12 Fri @ 19:09:00 -0500 — Claude Code (rulesets) — responder
- *What:* Reconciled this workflow to spec-create's new Decisions convention (each decision is an org =TODO= task that flips to =DONE= on agreement, with a =[/]= cookie on the =* Decisions= heading and a =*** Discussion= child for disputes). Exit Criterion 5, Phase 2's pre-agreed-decisions step, and Phase 4 steps 4-5 now speak in flip-to-=DONE= terms, and the implementation-ready step gates on the all-=DONE= cookie.
- *Why:* The convention change landed in spec-create.org via an .emacs.d handoff (originated in its keymap-consolidation spec); this workflow still described the retired =State: proposed | accepted | superseded= model.
- *Artifacts:* Handoff =inbox/2026-06-12-1906-from-.emacs.d-spec-create-decisions-todo-note.org=. Paired spec-create.org and spec-review.org edits in the same commit.