.ai/workflows/task-audit.org


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112

#+TITLE: Task Audit Workflow
#+AUTHOR: Craig Jennings & Claude
#+DATE: 2026-05-22

* Overview

A *task audit* reconciles each open task's recorded content against the actual state of the world, then fixes what's stale and surfaces what needs the user. It answers "is the information in this task still *true*?" — not "is this task still worth doing?"

That distinction is the whole point, and it separates this workflow from its two siblings:

- =open-tasks.org= — *displays* open tasks (list mode) or picks the next one. Read-only.
- =task-review.org= — *relevance/priority hygiene*: walk the oldest-unreviewed tasks, re-grade / keep / kill / mark DOING, stamp =:LAST_REVIEWED:=. Decides whether a task still belongs and at what priority.
- =task-audit.org= (this one) — *content reconciliation*: take the surviving tasks and check their recorded facts against reality (sessions, email, chat, ticketing, calendar, meeting recordings), update the stale facts, mark statuses that moved, consolidate duplicates, and flag the judgment calls only the user can answer.

They compose: =task-review= keeps the list lean; =task-audit= keeps the survivors honest. Run a task audit when a task's *content* looks stale (a "waiting on X" that already happened, a date that passed, a status that moved), not just when its priority is in question.

* When to Use This Workflow

Trigger phrases:
- "task audit" / "audit my tasks" / "audit the open tasks"
- "are my tasks still accurate" / "reconcile my tasks" / "update the stale tasks"
- "go through the open tasks and update them"

Typical timing:
- After a gap where the world moved but the tasks didn't (travel, an offsite, a conference, a sprint).
- When the user notices a specific task is stale and wants the rest checked too.
- Periodically alongside (but distinct from) the lighter =task-review= habit.

Do *not* use this for picking what to work on (=open-tasks.org=) or for keep/kill/priority grooming (=task-review.org=).

* The Core Split: Autonomous vs. Flag

Every task lands in one of three buckets. The split is the discipline of the workflow.

1. *CURRENT* — the recorded content still matches reality. Leave it (optionally re-stamp =:LAST_REVIEWED:= if it was genuinely re-verified; don't bulk-stamp mechanically).
2. *STALE → update autonomously* — a fact the sources verify has changed: a status that moved, a date that passed, a "waiting on X" that resolved, a dead/renamed link, a duplicate to consolidate, a task the evidence clearly shows is done. Apply the factual update and bump =:LAST_REVIEWED:= on the tasks you edit.
3. *NEEDS-USER → flag, don't guess* — a decision (priority, send/don't-send, keep/kill, push-or-shelve), or a fact that lives only in the user's head (did the meeting happen, what's your read, who referred this). Surface it; never fabricate it.

The line: *factual reconciliation is autonomous; judgment calls and user-only facts get flagged.* Guessing on bucket 3 is the failure mode this workflow exists to prevent.

* The Workflow

** Phase A — Enumerate the open tasks

List every top-level task in the project's open-work section (e.g. =* Work Open Work= in =todo.org=, the headings between it and the next =*= section):

#+begin_src bash
awk 'NR>=<start> && /^\* <next-section>/{exit} NR>=<start> && /^\*\* /{print NR": "$0}' todo.org
#+end_src

Record each task's keyword, priority, tags, and current =:LAST_REVIEWED:=. This is the audit checklist.

** Phase B — Reconcile each task against the sources (read-only)

For each open task, read its body and cross-check its claims against the actual state of the world. The sources (use whichever the project has):

- *Recent session summaries* — =.ai/sessions/*.org= (read the =* Summary= sections; they capture what shipped and what moved).
- *Email* — the project's mail accounts (work + personal). Search by the people/threads the task names.
- *Chat* — Slack/Teams DMs, mentions, and the threads the task references.
- *Ticketing* — Linear/Jira/GitHub. Check the linked ticket's status, comments, and whether a PR merged.
- *Calendar* — did a scheduled event happen; is a SCHEDULED/DEADLINE date now past.
- *Meeting recordings* — if a task hinges on "did this conversation happen / what was said," check the recording queue (e.g. =~/sync/recordings/=) and transcribe via =process-meeting-transcript.org= if the answer lives in an un-transcribed recording. (This is exactly how a "did the interview happen?" task gets resolved instead of guessed.)

Assign each task a bucket (CURRENT / STALE / NEEDS-USER) and, for STALE, the specific factual update.

*Scale tactic.* For a large open-task set, dispatch read-only investigation sub-agents over batches of tasks (parallel-safe per =subagents.md= — independent read-only domains). Each returns a per-task bucket + suggested update. *Never* let sub-agents write to =todo.org= concurrently — apply all edits serially in the main thread (concurrent writes to one file race and lose work).

** Phase C — Apply the factual updates (autonomous)

For every STALE task, edit it in the main thread:

- Add a dated reconciliation note (=*** YYYY-MM-DD Day @ HH:MM:SS -ZZZZ <what changed>=) or update the offending line in place.
- Mark statuses that moved; rewrite "waiting on X" lines whose X resolved.
- Fix dead/renamed =file:= links.
- *Consolidate duplicates* — when several tasks track the same thing, fold them into one home and delete the duplicates (per the user's call on which is canonical).
- *Ensure priority is set per the project scheme.* The top of the project's =todo.org= should carry the priority legend (=[#A]= through =[#D]=). Every task should carry an explicit priority cookie. If a cookie is missing, or no longer matches the reconciled facts, assign the right level per the legend. If the level is unambiguous from the body, do it autonomously; if it's a judgment call (especially the [#A] / [#B] line for important-but-not-urgent work), flag NEEDS-USER. Also enforce the [#A]-discipline rule from the legend — an [#A] task without a =SCHEDULED:= or =DEADLINE:= line is mis-graded and is either down-graded to [#B] (when reconciled facts say "important but not urgent") or surfaced as NEEDS-USER for the user to date.
- *Ensure a type tag is set.* Every task carries one type tag from the project's tag legend (typically =:feature:= / =:chore:= / =:spec:= / =:bug:=). If missing or wrong, assign or correct it from the body when the type is unambiguous. If two tags fit (a refactor that also fixes a bug; a spec that's also a chore), flag NEEDS-USER rather than picking one silently.
- *Re-assess the =:quick:= and =:solo:= tags* — reconciliation can change a task's effort or autonomy: a resolved dependency may make a stuck task =:solo:=, a scope cut may make it =:quick:=, and new complexity surfaced by the sources can invalidate either. Add or remove the tags per the definitions in the project's tag legend (and [[file:task-review.org][task-review.org]]) when the reconciled facts make the call clear. When they don't — an effort estimate you can't pin down, a =:solo:= gate you can't confirm — it's a NEEDS-USER flag, not a guess.
- Bump =:LAST_REVIEWED:= on each edited task.

Follow =todo-format.md= for completion mechanics (depth-based DONE vs dated-rewrite) and the working-files / link-hygiene rules when moving artifacts.

** Phase D — Flag the judgment calls (interactive)

Present the NEEDS-USER bucket as a short, scannable list — one line per task, naming the decision or the fact required. Adjudicate with the user one item at a time (inline numbered options per =interaction.md=, no popup). Apply the user's calls as they come (which may itself produce more autonomous updates, or new tasks).

When the trail on a task has gone cold and only the user knows the outcome, *say so and ask* — do not invent a status. ("The screen was set for 5/15; nothing in sessions, email, or chat records what came of it. Did it happen, and what's your read?")

** Phase E — Close out

After Phase D's judgment calls are adjudicated and applied, stamp the audit run by updating =.ai/notes.org='s *Workflow State* section:

#+begin_example
:LAST_AUDIT: YYYY-MM-DD
#+end_example

Use today's =date= output (=date "+%Y-%m-%d"=). Overwrite the previous value — the marker reflects the most recent run, not a history.

If the *Workflow State* section doesn't exist yet, create it (placement: after *Active Reminders*). Other workflows (notably =open-tasks.org= Next Mode's audit-warranted check) read this marker to decide whether to offer running another audit.

* Common Mistakes

1. *Guessing a judgment call instead of flagging it.* If the answer isn't in the sources, it's bucket 3. Ask.
2. *Fabricating an outcome for a cold-trail task.* Surface the gap; don't fill it from imagination.
3. *Bulk re-stamping =:LAST_REVIEWED:= without actually re-verifying.* Stamp only what you checked.
4. *Concurrent sub-agent writes to =todo.org=.* Investigation fans out; edits stay serial in the main thread.
5. *Reconciling against memory instead of the live sources.* Email/chat/ticketing/recordings move independently of the task text — check them.
6. *Doing relevance grooming here.* Keep/kill/priority is =task-review='s job; this workflow is about factual accuracy.

* Validation

Created and validated 2026-05-22 against the DeepSat work =todo.org=: reconciled the open-work tasks one at a time, applied factual updates (a contract task's gating, a hiring task whose interview outcome was recovered by transcribing an un-processed meeting recording, a partnership task's stale comms-hold + consolidation of two duplicate trackers, plus several "waiting on X resolved" updates), and surfaced the judgment-call subset (shelve-vs-push, awaiting-team-feedback, who-initiated-this) for the user to adjudicate.