| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
| |
Renaming an .ai artifact by hand is the kind of mechanical job that gets done incompletely: the canonical copy moves but the mirror doesn't, a reference in the INDEX is missed, a trigger phrase points at the old name. I'd also assumed a rename was costly because references scatter, when the index update is trivial and the drift check already guards it. So I built the discipline into a script instead of re-deriving it each time.
scripts/rename-ai-artifact.sh takes old and new basenames, moves the file in both the canonical and mirror trees, and rewrites every reference repo-wide on a token boundary so renaming "foo" can't corrupt "foobar" or "foo-bar". It rewrites the underscore module-name variant too (a hyphenated script imported as foo_bar via importlib), leaves the archived session records under sessions/ alone because they're history, and runs workflow-integrity + sync-check at the end to prove no drift. rename-artifact.org documents it and indexes the triggers.
Then I used the tool to do the rename that prompted it: the org-drill deck workflow and its helpers are now flashcard-named, since "flashcard" is the word you'd actually search for. The renamed set is flashcard-review.org plus flashcard-stats.py, flashcard-sync, flashcard-to-anki.py, and flashcard-diff-ids.py, with their tests, every reference, and the INDEX entry updated. The deck is still an org-drill deck under the hood, so the ":drill:" tag handling and the "drill deck" trigger phrases stay. I added "review/update the flashcards" alongside them.
Tests: 9 bats for the rename tool (including the prefix-collision and history-preservation edges), and the renamed script suites all pass under make test.
|
| |
|
|
|
|
|
|
|
|
| |
I split each into lanes so a reader can stop at the level that answers the question: Summary for "what does this do and what does it produce", Execution for the steps to follow, Reference for examples and edge cases, History for old decisions. Both files are large enough that an agent loading them at routing time pays for context it doesn't need yet.
startup.org keeps Summary, Execution, and Reference (workflow discovery and common mistakes moved under Reference). triage-intake.org gets all four, including a History lane for its design notes. Every instruction is preserved. The triage reorder ran through a content-preservation check that compared the multiset of content lines before and after, so only heading depth and lane grouping moved. Nothing was dropped or reworded.
workflow-integrity.py now counts "Summary" as a valid orientation heading, since that's the new top section both files lead with.
This is the pilot from the codex backlog, scoped to the two largest workflows. Whether the lanes actually cut session token use gets evaluated before any wider rollout.
|
| |
|
|
|
|
| |
daily-prep's Phase 3 re-implemented email/Slack/Linear/PR scanning inline (sub-steps 3b-3g, ~280 lines): the same fan-out, classify, and reactive-task work the triage-intake engine has owned since its 2026-05-26 plugin refactor. I collapsed those to four steps: 3b runs the engine, 3c surfaces today's reactive items as Day's Priorities thin links, 3d re-sorts by urgency, 3e writes the audit footer from the engine's per-source coverage.
Source coverage carries because the engine's Phase 0 globs both .ai/workflows/ and .ai/project-workflows/ plugins, so the work account's Gmail/Slack/Linear/GHE plugins are still scanned, and a source change now lives in one plugin instead of being duplicated here. I adapted the downstream references (the Prep-Doc-Structure rule, the Heads-up FYI source, the Recommended Approach Pattern reframed as engine-applied), dropped the orphaned Linear-digest note, and added a Living Document entry. The file goes 825 to 576 lines, and the prep-doc contract (Day's Priorities, Heads-up, Sources-checked footer) is unchanged.
|
| |
|
|
|
|
| |
Agents (and any future inventory tool) doing a naive recursive read of a project pick up node_modules, __pycache__, build output, and token artifacts even when those are gitignored, because a recursive read sees the disk, not git. I added a gitignore-syntax .aiignore at the repo root with the default skip list, and a protocols.org "Recursive Reads" subsection documenting the convention, the defaults to assume absent a file, and the lockfile policy (skip on agent reads, independent of git-tracking).
I did not wire the walking scripts (audit.sh, diff-lang.sh, sync-language-bundle.sh): they do targeted finds over .ai/.claude/bundle dirs, never whole-tree walks, so honoring .aiignore there would be dead code. That belongs in a future catalog tool.
|
| |
|
|
| |
From a pearl handoff: Phase 6 logged deferred and v1 work only in passing, so the implementer handoff was a re-read of the spec rather than a paste. I added a step that lifts the spec's Implementation phases section into a drop-in todo.org block: one [#B] TODO per phase plus a test-surface entry mirroring the Acceptance criteria. A spec with no phase decomposition fails the step, surfacing the shape problem as a finding before Ready rather than inventing phases. Added Exit Criterion 6 and a review-history entry.
|
| |
|
|
|
|
| |
Handoffs that arrive mid-session used to sit unseen until the next startup or a manual check. Today's burst of cross-project handoffs made that gap obvious. I added monitor-inbox.org, the cadence-and-decision layer over process-inbox: check the inbox at every task boundary, decide act-now (just do it) versus file (ask, with filing as option 1), and reply to the sender. An opt-in background-monitor /loop recipe covers unattended watching.
inbox-status (with bats tests) is the cheap check the cadence calls. It lists unprocessed handoffs and exits nonzero when any are pending, using the same artifact exclusions as the wrap-up sanity check. protocols.org gets a short cadence note so the habit fires every session, and INDEX.org lists the new workflow. The act-vs-file rule (act-now is silent, filing asks with file as option 1, ambiguity asks) is the decision protocol we settled today.
|
| |
|
|
| |
An org-drill session asked to send a follow-up email first claimed it couldn't, then hand-built MIME through msmtp, because nothing told it cmail send exists. I added a "Sending Email" subsection to protocols.org (read every session): cmail (c@cjennings.net) is the default for personal mail, dmail for work, and cmail-action send is the tool, with one-liner examples for body-file, attachments, Cc/Bcc, and threaded replies. I also rewrote send-email.org Step 4, replacing the inline-Python heredoc that taught the hard way with the cmail-action send call.
|
| |
|
|
| |
cmail-action send couldn't do a proper reply (no Cc/Bcc, no In-Reply-To/References), so an org-drill session that needed to reply to an upstream maintainer hand-rolled a raw MIME message through msmtp instead. I extended build_message (the pure function) with cc, bcc, in_reply_to, and references, wired the matching --cc/--bcc (repeatable), --in-reply-to, and --references flags through cmd_send, and wrote the tests first. send_message derives recipients from the To/Cc/Bcc headers and strips Bcc, so no manual recipient list is needed.
|
| |
|
|
|
|
| |
A single .ai/session-context.org races when two agents share a project: each agent's writes clobber the other's session log. I added .ai/scripts/session-context-path, which resolves the active path from AI_AGENT_ID: unset gives the legacy .ai/session-context.org singleton (so every existing one-agent session is unchanged), set gives .ai/session-context.d/<id>.org with the id sanitized to filename-safe characters. This is Codex's Phase 1 slice from the runtime-neutral spec: the race fix on its own, no broader refactor.
startup.org's existence check and wrap-it-up.org's rename now resolve through the helper, each with a singleton fallback so older checkouts that haven't synced the script still work. Wrap folds the agent id into the archive name so two agents wrapping in the same minute don't collide. protocols.org documents the rule. Verified with 5 bats cases and a two-agent simulation showing distinct paths per id.
|
| |
|
|
|
|
| |
From jr-estate's handoff: Phase A's =rsync -a --delete= copies the rulesets working tree by disk presence, so a downstream session that starts while rulesets has in-flight WIP pulls that WIP into its own =.ai/workflows/= and =.ai/scripts/=, where it reads as drift the user never authored. I guarded the three rsyncs behind a =git status --porcelain= check on the synced source paths (=claude-templates/.ai/{protocols.org,workflows/,scripts/}=). It syncs when those are clean and skips with a message when dirty, catching up on the next clean session. The check is scoped to those paths, so unrelated rulesets dirt (a stray session-context.org, scratch files) doesn't block the sync.
The handoff's secondary anomaly (two workflow files that didn't reach jr-estate) was a timeline artifact, not a Phase A bug. Both were added in 664bf01 on 2026-05-29, after jr-estate's rsync had already run, so they correctly didn't exist to copy yet.
|
| |
|
|
|
|
| |
org-lint reads an =** Foo= verbatim span in body prose as a possible misplaced heading, but verbatim markup is never a real heading. lint-org kept surfacing these as judgment items, so they recurred in lint-followups.org on every wrap and could never be acted on, since the todo.org content was already correct.
I added lo--verbatim-asterisk-at-line-p, which mirrors the markdown-bold detector: it checks the reported line and the one before it, since org-lint marks the blank line after the offender. A match is now suppressed silently, the same way the cj-comment false positives already are. I flipped the two tests that pinned the old judgment behavior, and confirmed todo.org lints clean (judgment=0). This resolves the checker-bug report I filed in the inbox earlier, which I removed.
|
| |
|
|
|
|
|
|
| |
Health ran the new leakage check on a 43-card deck and hit two false-positive classes. The check read the whole card body, so a =Source: <label> — <url>= citation line inflated the front/back overlap whenever the URL slug repeated the question's words. Range/category cards ("What are the HbA1c ranges across normal, prediabetes, and diabetes?") tripped it too, because the question's categories echo in the answer even though the recalled content is the numbers.
drill-deck-stats.py now routes leakage through an is_leaky helper. It strips =Source:= and created-date lines before computing overlap, and exempts a card when the answer carries a numeric range or threshold the question lacks. leakage_ratio itself is unchanged, so the genuine-restatement case still flags.
Two body conventions now hold: a =Source:= citation goes at the end of a card after two blank lines, and no created/added date goes on a card. drill-to-anki.py now strips =Created:= / =:CREATED:= lines from the back as a backstop, and the workflow's Phase C removes them from the source during the rewrite. I added tests for the source-strip, the numeric carve-out, and the created-line strip, and documented all of it in drill-deck-review.org.
|
| |
|
|
|
|
| |
From health's handoff: the startup =.ai/scripts/= sync was dragging pytest build artifacts into every consuming project. =rsync -a= copies by disk presence, not git status, so the =__pycache__/= and =.pytest_cache/= that rulesets' own pytest leaves in =claude-templates/.ai/scripts/tests/= rode along to each project's tree even though the root =.gitignore= already keeps them out of rulesets' commits. Phase A's scripts rsync now excludes =__pycache__=, =.pytest_cache=, and =*.pyc=. A project that already received the cache has to remove it once by hand, since =--delete= leaves excluded paths in place. I noted that in the startup doc.
Health also flagged that =inbox-send.py= kept needing a manual chmod. The cause wasn't rsync dropping the bit. Four shebang scripts (=inbox-send.py=, =cj-scan.py=, =cj-remove-block.py=, =eml-view-and-extract-attachments.py=) were committed mode 100644, so rsync faithfully copied the wrong mode. I set the exec bit on all four so the synced copies are runnable.
|
| |
|
|
|
|
|
|
| |
I researched spaced-repetition best practices (Wozniak's twenty rules, Matuschak's prompt-writing guide, Nielsen, the Anki and FSRS docs) and folded the findings into the drill-deck pipeline.
drill-deck-stats.py now checks authoring quality on top of structure. Two checks block: answer leakage (a question that echoes >= 80% of its own answer's content words tests recognition, not recall) and duplicate / near-duplicate fronts (confusable cards interfere). Three checks warn without blocking, surfacing rewrite candidates without failing the gate: overloaded backs, list-shaped backs, and binary yes/no prompts. The fuzzy thresholds live in constants at the top of the script, so a real deck that trips false positives can be tuned. I pulled the card-parsing into a parse_cards helper that captures each card's body, and added focused tests for every new helper plus CLI coverage of the leaky, duplicate, and notes-only cases.
drill-deck-review.org gains a Card Authoring Principles section (the why behind the canonical shapes, with sources), a person-card splitting path bounded by the :ID:-preservation rule, a Phase B cost-benefit-removal and leech-reformulation disposition, and a scheduling-is-Anki-side note so a future editor doesn't try to encode FSRS retention in the org source. I left out cloze cards (would need a second note type), per-card tractability targeting and retention encoding (Anki-side telemetry that never reaches the source), and on-face source-stamping (the converter strips those drawers by design). Each is noted with its reason.
|
| |
|
|
|
|
|
|
| |
I backfilled the gaps left after the flashcard work landed. drill-to-anki.py had tests only for its two default helpers. I added coverage for the core parser and its pieces: parse (section-to-tag mapping, drawer-only body, blank trimming, multiline join, no-card input), strip_org_metadata (drawer and planning-line stripping, unclosed drawer), section_to_tag, escape_html, and the deterministic stable_id. I also filled the remaining drill-deck-stats / drill-deck-diff-ids branches (missing-title and PROPERTIES-mismatch warnings, the appeared-IDs note path).
I added test_cross_project_broadcast.py for the two scripts that had none here: is_broadcastable / discover (SEARCH_ROOTS pointed at a tmp tree) / sender_project / inbox_send_path, plus an ERT suite for daily-prep-agenda.el (dp-iso-date, dp-bucket with the clock pinned, dp-format-entry, and dp-collect end to end on a temp org file).
daily-prep-agenda.el needed one change to be loadable under ERT: its batch entrypoint fired on any load. I gated it behind dp--cli-invocation-p, the same readable-files check lint-org.el already uses, so requiring the file for tests no longer runs the extractor. A real invocation with a file argument still fires. A no-argument run now no-ops instead of printing an empty header.
|
| |
|
|
|
|
| |
I incorporated the flashcard-tooling bundle from the work project's deck-review workflow, validated there against a 93-card deck. Three scripts now live under .ai/scripts/: drill-deck-stats.py (pre-rewrite inventory plus a gate that warns on stray *** Answer headers, missing :ID:, non-prompt headings, and #+TITLE jargon like "org-drill"), drill-deck-diff-ids.py (SRS-state preservation check that flags any :ID: lost across a rewrite), and drill-deck-sync (bash wrapper chaining stats, optional diff-ids, then drill-to-anki, writing to ~/sync/phone/anki/ only when the gates pass).
The drill-deck-review.org workflow gains a Helper Scripts section and references the scripts from its phases. I reconciled its output-path prose with the drill-to-anki default that just moved to ~/sync/phone/anki/, so it no longer claims the script still defaults to ~/sync/org/drill/. I added tests for both Python scripts (pure logic plus CLI gate behavior) and a bats suite for the wrapper's guard paths. The clean end-to-end sync path stays uncovered since it needs uv-resolved genanki.
|
| |
|
|
|
|
| |
Two default-behavior tweaks from real use. Output now defaults to ~/sync/phone/anki/ instead of ~/sync/org/drill/. The .apkg is a mobile-Anki artifact the phone picks up from its sync dir, while the org source stays in the project. Deck name now defaults to the raw input basename, case preserved. That drops the #+TITLE preference, which leaked tool-name jargon like "DeepSat org-drill flashcards" into the Anki deck list.
The --deck and --output flags still override both. I dropped the now-unused title_from_org helper and added a test covering the two defaults. genanki is stubbed in the test since uv resolves it only at runtime.
|
| |
|
|
|
|
|
|
| |
I added a drill-deck-review workflow that walks an org-drill deck end-to-end: question-form audit on every heading (so the heading is the prompt, not the answer), content-accuracy audit via subagent against project source-of-truth, source rewrite preserving SRS state, regenerate to ~/sync/phone/anki/. The workflow covers three card families (acronym/concept, person, talking-point) and codifies the person-card pattern as "Who is X? Tell me about their Y." where X is a role descriptor that doesn't name the person and Y is the topical anchor from the answer body.
The drill-to-anki.py script picks up a new strip_org_metadata helper that drops :PROPERTIES: drawers and SCHEDULED/DEADLINE/CLOSED planning lines from the rendered Anki back. Org-drill needs both in the source for SRS state and review scheduling; Anki cards shouldn't show them.
INDEX entry under "On-demand utilities" wires the trigger phrases.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The inbox sanity check I added in 8424e8f counted lint-followups.org
as unprocessed, which surfaced as a false alarm during this session's
wrap-up. lint-org writes its judgment items into inbox/lint-followups.org
earlier in the same wrap-up workflow by design. The file is a
pipeline artifact for the next morning's daily-prep, not a handoff
that needs the value gate.
The find filter now excludes .gitkeep, lint-followups.org, and
PROCESSED-* prefixes. The Validation Checklist line names the same
expected-artifacts list explicitly.
Smoke test: post-fix count is 0 in the rulesets project (where
lint-org just wrote 5 judgment items into inbox/lint-followups.org).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds one bullet to When NOT to Use and a new Cadence Guideline
section that names the per-commit-broadcast anti-pattern explicitly.
The new section lays out the five reasons broadcasts stay
capability-and-rule-level rather than commit-level: cost per
broadcast across the fleet, signal-to-noise on project-internal
commits, the rsync-already-does-the-work observation, aggregation
winning over per-commit pings, and the train-projects-to-ignore-inbox
risk.
The end-of-session bundling guidance lands too. If a session ships
several broadcastable changes, bundle them into one broadcast at
session end instead of firing one per commit.
Source: session-end conversation 2026-05-29 surveying today's
cross-project changes. Today's session shipped 13 cross-project
items (4 new workflows, 6 workflow updates, 2 new scripts, 1 new
bin tool). A per-commit broadcast cadence would have fired 13 inbox
files across 23 targets, or 299 total inbox files. One consolidated
broadcast (or no broadcast at all, since Craig already coordinated
manually) covered the same ground.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Wrap-up never knew about the inbox. After process-inbox.org landed
today as the formal workflow for handoff intake, wrap-it-up.org
needs to surface inbox/ when it's non-empty so the wrap doesn't
silently defer handoffs to next session.
This commit adds a new Inbox sanity check sub-step under Step 3
(project hygiene). The check runs find inbox -maxdepth 1 -type f
filtered to skip .gitkeep and PROCESSED-* prefixes. A non-zero
count surfaces with the file list and a recommendation to run
process-inbox.org or explicitly defer each item. A zero count or no
inbox/ directory makes the check a silent no-op.
It also adds one line to the Validation Checklist: inbox is empty
(excluding .gitkeep and PROCESSED-* prefixes), OR each remaining
item has an explicit deferral logged in the valediction.
This is the only integration gap surfaced today. The other gaps
considered (LAST_INBOX_PROCESS marker stamp, sync-check, make
status) didn't justify wrap-up changes. The marker is
process-inbox's own responsibility. sync-check is project-specific
to rulesets. make status doesn't generalize across projects.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generalizes the health-drill-to-anki.py converter into a template
script under claude-templates/.ai/scripts/. Every project's startup
rsync now picks it up at .ai/scripts/drill-to-anki.py.
The converter walks an org-drill file. Top-level * Section headings
become Anki tags. Each ** Card name :drill: entry becomes a card
with the heading as Front and the body as Back (newlines converted
to <br>).
Parameterized from the original health-specific version:
- Input file is a positional argument (was hardcoded
health-drill.org).
- Deck name defaults to the org file's #+TITLE if present, else the
input basename in Title Case (was hardcoded "Health Drill").
- Output path defaults to ~/sync/org/drill/<input-basename>.apkg
(was hardcoded health-drill.apkg in the same dir).
- Deck and model IDs are derived from the deck name via SHA-256, so
re-running against the same source produces stable IDs. Anki
imports update existing cards in place rather than duplicating.
Dependencies via PEP 723 inline script metadata. The shebang is
#!/usr/bin/env -S uv run --script. First run resolves and caches
genanki>=0.13. Subsequent runs are instant. There is no venv to
maintain and no PEP 668 friction.
Smoke tested against ~/projects/health/health-drill.org. The script
wrote 43 cards into the Health Drill deck, matching the original
script's output.
Inbox source: 2026-05-29-1114-from-health-todo-b-org-drill-anki-export-updated.org.
Craig confirmed all projects will likely have org-drill files,
justifying template-tier promotion.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both workflows now check that the file under review (or the spec
being responded to) ends with -spec.org before proceeding. If it does
not, the workflow stops and surfaces the mismatch with the rename
suggestion.
The suffix is the identifier per Craig's spec-naming convention:
every design, decision, or planning document under a project's docs/
directory ends with -spec.org. The .org extension alone is not enough
because docs/ holds non-spec org files too (tutorials, frozen
inventories, reference material).
spec-review.org gained a top-level Precondition section between When
to Use and Approach. spec-response.org gained the same Precondition
section in the parallel position, with a note that the review-file
convention <spec-basename>-review.org means a misnamed spec produces
a mis-pointed review file too.
Inbox source: 2026-05-28-0858-from-home-spec-naming-convention-apply-to-spec.org.
Home renamed two docs to the new convention and asked rulesets to
update the template workflows so the next startup rsync in every
project picks up the guard.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
helper
Three coupled additions ship together.
claude-templates/bin/page-signal is a bash wrapper around signal-cli
send. It defaults to --note-to-self for safety. The wrapper supports
--file for attachments, --to <+number> for outbound (explicit per
call, no defaults, no batch), --quiet, and --json. Exit codes: 0
sent, 1 signal-cli failure, 2 usage error, 3 signal-cli not
installed.
claude-templates/.ai/workflows/page-signal.org carries the
discrimination rules and safety rails. When desktop notify covers it,
don't reach for Signal. Long-running task completion is the canonical
case. Outbound to other contacts requires explicit Craig instruction
per send. A known-limitation note covers the current notification
gap. signal-cli registered on Craig's primary number means messages
don't fire notifications until the pending Google Voice registration
lands.
claude-templates/.ai/workflows/cross-project-broadcast.org and its
helper cross-project-broadcast.py fan out a single message file to
every AI project's inbox in one operation. Discovery is
fingerprint-based: any directory under ~/code, ~/projects, ~/.emacs.d
with both .ai/protocols.org and a top-level inbox/ is broadcastable.
Senders are auto-excluded. Verified discovery against 23
broadcastable targets.
Makefile's install target gains a general bin/ loop. The previous
version hardcoded bin/ai. The new version iterates over every
executable under claude-templates/bin/ and symlinks each into
~/.local/bin/. install-hooks (existing Claude hook installer) is
unchanged. install-githooks (sync-check pre-commit hook setup, added
earlier today) is unchanged. The bin/ loop now picks up bin/page-signal
automatically.
INDEX entries for both new workflows landed under Tools and meta.
No bats tests on the new scripts. page-signal was smoke-tested with a
live send. The send succeeded. The notification gap is covered by the
workflow's known-limitation note. cross-project-broadcast.py was
smoke-tested via --list against the live project set. Tests can be
added when the broadcast pattern proves out across multiple use cases.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pre-commit caught a false-positive on its first real-world run. The
.ai/scripts/__pycache__ directory created by make test (pytest writes
.pyc and .pytest_cache, emacs writes .elc) was flagged as drift
against the canonical's clean tree.
Added matching --exclude patterns to the diff -rq check and the --fix
rsync calls. Patterns: __pycache__, *.pyc, *.pyo, .pytest_cache, *.elc.
The --fix excludes prevent rsync's --delete from wiping the mirror's
__pycache__ when the canonical has none.
Four new bats tests cover the exclusions. All 12 pass.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
open-tasks.org Phase A now runs `emacs --batch -q -l
.ai/scripts/todo-cleanup.el --archive-done todo.org` as a pre-step
before the parallel read batch. A level-2 task that completed during
the session sits as ** DONE under Open Work until something archives
it, which means Next Mode would surface it as a "what's next" candidate
between cleanups. The pre-step ensures Phase A's read of todo.org
reflects current state.
The pre-step skips only in an explicit read-only or dry-run context.
Default is always to sweep.
Common Mistake #13 records "recommending a freshly-DONE task" with the
pointer back to the pre-step.
The proposal came from a pearl handoff on 2026-05-28.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generic process-inbox workflow at claude-templates/.ai/workflows/ and
mirror. Owns inbox discipline across every project: items are ideas to
evaluate, not orders to execute, and earn a place in todo.org or git
history only when they pass a three-question value gate.
The gate (Phase B):
- Advances an existing TODO (look up by topic).
- Improves how the project works (architecture, workflows, tooling,
rule hygiene).
- Serves the project's stated mission (read from notes.org
Project-Specific Context).
One yes accepts. Three nos reject.
Per-source rejection flow (Phase D):
- From Craig: state in chat, wait for override.
- From another project: write a response via inbox-send naming which
gate question failed plus any reconsideration condition. Silent
rejection on a handoff is worse than no reply.
- From a script or automated system: just delete.
Phase B.1 gates filing on priority-scheme presence. If todo.org has a
scheme at the top, file with cookie + mandatory type tag + optional
effort/autonomy tags. If not, surface a one-sentence nudge to adopt
one (or to skip grading and flag the gap in the commit).
Phase D within accepts splits implement-now / fold-into-existing /
file-as-TODO so the inbox doesn't default to inflating todo.org for
every item that passes the gate.
Phase E stamps :LAST_INBOX_PROCESS: in notes.org Workflow State if the
section exists.
startup.org Phase C step 2 now delegates here instead of inlining the
inbox-processing language. INDEX entry under Tasks and planning lists
the full set of trigger phrases.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pearl independently built its own no-approvals workflow (handoff in this
morning's inbox) and asked rulesets to take the best from both. That's
cross-project signal that the workflow earns a place at the template tier.
Promoted from =.ai/project-workflows/= to =claude-templates/.ai/workflows/=
with a mirror under =.ai/workflows/=. INDEX entry added under "Tools and
meta" listing the full set of trigger phrases.
Merged from pearl's draft: the prominent "What's Suspended" / "What Stays
On" split (same content as the prior "Contract" section, easier to scan),
wider trigger phrases (the project-only version had fewer), the
mode-resets-when-Craig-switches-topics guard, an explicit destructive-action
carve-out as its own bullet rather than buried in the "real question"
section, and a subagent-review-gate callout.
Kept from the rulesets version: the Session Log update emphasis between
items, the real-question include/exclude lists, and the don't-auto-wrap-up
guard.
A History section at the bottom records the merge for future archaeology.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two changes land together because each is broken without the other.
open-tasks.org gains a new Phase A.1, evaluated only in Next Mode. The
phase reads :LAST_AUDIT: from notes.org and walks five state signals
(reminder/task mismatch, passed scheduled date, "waiting on X" matches a
shipped X, dead file: link, sub-task >75% DONE coverage). If the temporal
threshold of 14 days trips, or any signal fires, Next Mode offers a
task-audit run before producing the recommendation. Item 1 in the offer
is "run task-audit first" per the recommendation-at-item-1 convention.
task-audit.org gains two pieces. Phase C now enforces priority and
type-tag presence per the project's legend, applies the [#A] dating rule
from that legend, and re-assesses :quick: and :solo: from reconciled
facts. Unambiguous calls land autonomously. Ambiguous ones flag
NEEDS-USER instead of being guessed. A new Phase E stamps :LAST_AUDIT:
on completion.
|
| |
|
|
|
|
|
|
|
|
| |
Bundles end-of-session state across three concerns.
claude-templates/.ai/workflows/open-tasks.org: in-flight restructure of Next Mode into a two-question output (cascade + friction filter), +62 lines. Separate from the iteration-history work that landed in 55adf6e and 684b273.
.ai/session-context.org: live session record covering startup through the iteration-history backfill. Captures the no-approvals mode and the full spec-review/spec-response cycle on the working draft.
inbox/: five new arrivals from this session: four pearl notes (pattern catalog, prompts-ux, spec-review implementation, prompt collapse) plus one .emacs.d note (whats-next + task-audit). A sixth file, PROCESSED-prefixed, represents mid-handling state on the open-tasks friction-cascade suggestion from earlier in the session.
|
| |
|
|
|
|
| |
Closes the org-drill backfill TODO. Both workflow files now carry a bottom "Review and iteration history" section with four entries: Draft 1 (linear-emacs origin, 2026-05-23, placeholder timestamp), Review-and-Fold (commit 7f2aea1, 2026-05-23), Requirement addition (commit 55adf6e, 2026-05-28), and this backfill commit. Canonical and mirror updated together. Working directory working/spec-workflows-iteration-history-backfill/ removed on this commit.
The spec-response cycle on the working draft (two Codex reviews, three rulesets responses) validated the entry-shape and the content before splice. Codex caught file-history conflation that would have made per-file provenance inaccurate. Craig's direct rationale on the Iteration 1 and Iteration 3 Why lines replaced the original INFERRED markers.
|
| |
|
|
|
|
| |
Specs reviewed under either workflow now carry a bottom =Review and iteration history= section. Each entry is an org subheading with a compound id (timestamp, contributor, role) plus What/Why/Artifacts body fields. The id is opaque. Timestamp, contributor, and role concatenate without implying any decision ordering.
spec-review.org adds the gate item, the entry-shape spec, and a "Preserve iteration provenance" principle. spec-response.org adds the matching Phase 4 step and a "history explains provenance" principle. Canonical and mirror updated together.
|
| |
|
|
|
|
|
|
|
|
| |
credential
A session false-alarmed on a leak risk when restoring a credentials doc into a tracked .ai/ file. The reasoning was wrong: a tracked secret is only a public-leak risk where the repo can reach a public remote, which means code projects on public GitHub, the ones that already gitignore .ai/. Personal and documentation projects push to a private single-user repo on cjennings.net, so tracked credentials in their .ai/ files are fine and expected.
I added the rule next to the existing "should .ai/ be committed?" decision in protocols.org, since it's a direct corollary of the same code-vs-personal split. The "is this a leak?" question now resolves on which kind of project and remote it is, not on the mere presence of a credential in a tracked file.
Origin: an elibrary session raised the false alarm and Craig corrected it.
|
| |
|
|
|
|
|
|
| |
A project script dropped into .ai/scripts/ gets wiped on the next startup, because that dir syncs from the template with rsync --delete. There was no documented home for a project's own scripts, the script-side counterpart to .ai/project-workflows/.
I added .ai/project-scripts/ to the Directory Architecture table and noted in startup.org that it sits outside the synced set, like project-workflows/. A script a workflow imports lives there. Naming: a Python module imported via sys.path needs an importable name (underscores), while a CLI-invoked script can stay kebab-case like the template tooling.
No mechanism change. Startup Phase A only rsyncs protocols.org, workflows/, and scripts/, so project-scripts/ is already sync-safe. This just documents it.
|
| |
|
|
|
|
| |
I default page notifications to --persist so a page that fires while I'm away from the desk waits for me instead of auto-dismissing after a few seconds.
page-me and status-check already persisted every page. I added --persist to the rest: the alarm, reminder, and meeting-alert examples in protocols.org, the long-running-process completion ping, and the cross-agent-watch message notification. I documented --persist as the default for any page meant to get attention, with a low-value informational ping as the only exception.
|
| |
|
|
|
|
|
|
| |
The triage-intake workflow had every source baked into one file, so adding or changing a source meant editing the workflow itself. I replaced it with a source-agnostic engine plus per-source plugins named triage-intake.<source>.org. The engine carries the anchor/sentinel logic, the four-bucket model, the Phase A-D orchestration, the todo.org persistence convention, and the exit criteria. Each source's scan, classify, render, and action knowledge moved into its own plugin.
Four general plugins ship in the template: personal-gmail, personal-calendar, cmail, and github-prs. Project-specific sources live in the project's .ai/project-workflows/ and are never synced. Phase 0 globs both directories so a project source can't silently drop out of the sweep.
I taught INDEX.org and the startup workflow-discovery drift check the namespace. A file matching <engine>.*.org is a plugin of that engine, not an orphan, and gets no trigger entry of its own. A "run the triage-intake workflow" request routes to the engine, never to a plugin.
|
| |
|
|
|
|
| |
send_file ran filenames through slugify(), which flattens dots to hyphens. That corrupts the engine.plugin.org plugin-namespace convention: triage-intake.personal-gmail.org arrived as triage-intake-personal-gmail.org, which breaks the engine's triage-intake.*.org glob and the routing that depends on the first dot.
I added slugify_filename() for filename stems. It keeps dots, hyphens, underscores, and case, collapses only whitespace runs to hyphens, and truncates on a separator boundary. The prose --text path still uses slugify().
|
| |
|
|
|
|
|
|
| |
These two workflows form a reviewer side and an author side for taking a design spec to implementation-ready. spec-review judges a spec against a readiness gate, reads the code before critiquing, evaluates across dimensions, assigns a rubric (Ready / Ready-with-caveats / Not-ready / Needs-research), and writes a <spec>-review.org file when it isn't ready. spec-response consumes that file: it decides accept / modify / reject for every recommendation, weaves the accepts into the spec body, records the modifies and rejects with reasons in a "Review dispositions" section, and reconciles tensions when coupled specs get reviewed together. The review file is the contract between them.
Both were validated by a real run on 2026-05-23 before landing here. I then reviewed them against established practice and tightened five things. The readiness gate now covers security/privacy and observability, since a spec shouldn't pass with those undefined. Phase 1 is a fast triage and Phase 3 is the authoritative gate after the code read. Finding severity maps to blocking power. A rejection goes back to the reviewer instead of standing as a unilateral call. And the response loop has an explicit termination condition.
I added both to INDEX.org under a new "Specs and design" section with trigger phrases, cross-linked as a pair, and kept the canonical and mirror copies identical.
|
| |
|
|
|
|
| |
The task-review and task-audit workflows already assess :quick:, an effort estimate. I added :solo: alongside it: a task Claude can finish end-to-end without Craig's input, gated on bounded scope, no design or preference call, and local verifiability.
In task-review.org I added a tagging subsection paralleling :quick:, a mention in the close-out summary, and a common-mistake entry against over-tagging. task-audit.org now re-assesses both tags during content reconciliation, since a resolved dependency can make a stuck task newly solo-able. It points at task-review.org for the definitions rather than restating them.
|
| |
|
|
| |
The mark-read step extracted message paths with mu --fields='p', but 'p' is the priority field (returns "normal"), not the path. Every path came back as "normal" and the flag manager errored on all of them. 'l' is the file-location field. Caught during a live triage on 2026-05-22.
|
| |
|
|
| |
A task audit reconciles each open task's recorded content against reality (sessions, email, chat, ticketing, calendar, recordings) and fixes the stale facts. That's distinct from task-review, which grooms relevance and priority. The two compose: review keeps the list lean, and audit keeps the survivors factually honest. Registered it in the workflow INDEX with its trigger phrases.
|
| |
|
|
|
|
|
|
| |
screenshot.py
Extends screenshot.py with --launch CMD, which runs a command on a transient headless Hyprland output, captures it, and tears the output down, so a UI can be verified without touching the visible workspace. --layout (tiled/monocle/floating) and --size control placement: output resolution for tiled/monocle, window size plus centering for floating.
Refactors the testable logic (size parsing, geometry strings, window matching, the exec-rule body, centering) into pure helpers and adds test_screenshot.py covering them across normal, boundary, and error cases. The grim/hyprctl wrappers and the capture orchestration stay thin and are verified functionally.
|
| |
|
|
| |
Adds a grim + hyprctl wrapper so a session can capture the screen or a single window and read the resulting PNG, turning "does this look right?" into an inspectable artifact. Modes: --full (all outputs), --active (focused window), --window REGEX (matched against class or title), and --list to enumerate open windows. Output goes to a chosen path (default a timestamped file in /tmp) and the saved path is printed on stdout so the caller can read it back; the parent directory is created if it does not exist. Syncs into every project's .ai/scripts/ via the startup rsync.
|
| |
|
|
|
|
| |
Step 3.5's Linear ticket-state sweep uses gh to find the merged PR for a Dev-Review ticket, which assumes a GitHub-family host. That holds today because DeepSat, the only Linear-using project, lives on GHE where gh talks to the API.
I added a note flagging the assumption rather than rewriting the step to be provider-agnostic. A future Linear project on GitLab, Gitea, or Bitbucket would need a different PR lookup, but none exists yet, so documenting the boundary beats building for a host we don't have.
|
| |
|
|
|
|
|
|
|
|
| |
Startup synced the .ai/ templates into the current project every session but never checked the language bundle (elisp, python) installed in .claude/. Bundle drift went unnoticed until someone re-ran make install-lang by hand: a generic rule added to claude-rules/ after the last install, or a changed validator hook.
scripts/sync-language-bundle.sh closes that gap. It fingerprints which bundle a project has by the presence of the language's own rule files (elisp.md, python-testing.md), then reconciles against the canonical source: auto-fix for rulesets-owned files (.claude/rules/*.md, .claude/hooks/*, githooks/*), surface-only for settings.json, which a project may have customized. CLAUDE.md is left alone. It's seed-only in install-lang and project-owned afterward, the same reason diff-lang skips it.
Startup Phase A step 12 calls it for the current project, guarded so older checkouts that lack the script still boot. It writes only under .claude/ and githooks/, disjoint from the .ai/ rsync paths, so the parallel batch stays safe. A script rather than a make target keeps the Makefile-parse layer off the boot path. The absolute rulesets path it depends on is the same one the rsyncs already carry.
Tested: 11 bats cases (no-bundle, clean, drifted rule/hook auto-fixed, surfaced settings.json asserted unmodified, absent CLAUDE.md not flagged, python detection, $PWD default, bad path). A smoke run against a copy of a real elisp project's .claude/ caught a perpetual "CLAUDE.md missing" alarm, which is what drove dropping CLAUDE.md from the surface set.
|
| |
|
|
|
|
| |
I added a per-task effort check to the task-review walk. If a task looks like 30 minutes or less and isn't already tagged, mark it :quick: on the heading. When the heading and body don't make the effort clear, ask instead of guessing. A mislabeled :quick: defeats the point, since the tag exists so Craig can grab a genuinely small task in a spare moment.
I edited the canonical claude-templates copy and the synced project mirror together, so the next startup rsync won't revert them.
|
| |
|
|
|
|
|
|
|
|
| |
The new task-review.org workflow is the daily habit that retires the old date-coverage scan. It surfaces the oldest-unreviewed top-level tasks, walks them one at a time, and records each outcome — keep, re-grade, kill, mark DOING, or edit — stamping :LAST_REVIEWED: as it goes. It's a pure Claude workflow, no elisp. open-tasks.org displays the list; this one changes it.
task-review-staleness.sh gains a --list mode that emits the N oldest-unreviewed tasks (line, review date, heading), oldest first, so the workflow walks a deterministic batch instead of eyeballing todo.org. Never-reviewed and unparseable-date tasks sort oldest. Seven new bats cases cover ordering, the count limit, exclusions, and output format; count mode is unchanged.
startup.org gains the matching nudge. Phase A counts tasks unreviewed for >7 days and Phase C surfaces one line when that count is non-zero, pointing at the workflow. It lives in the template startup.org rather than the project-only startup-extras layer, so every project picks it up the same way it picks up the wrap-up health check.
The INDEX entry is added with the "task review" triggers the rename freed up.
|
| |
|
|
|
|
|
|
| |
The list-and-pick-next workflow was named task-review.org, but "task review" better describes a list-hygiene habit that re-grades and prunes tasks, not one that just displays them. I'm freeing the task-review.org name (and the "task review" trigger) for that habit, which lands next.
This workflow goes back to open-tasks.org — the name it carried before it merged with whats-next.org. Its content and INDEX entry drop the "task review" trigger and point at task-review.org for the hygiene habit. Behavior is unchanged; only the name and the routing phrases move.
The rename touches both the canonical workflow and the project mirror.
|
| |
|
|
|
|
|
|
| |
The date-coverage scan flagged every open [#A]/[#B] task with no DEADLINE or SCHEDULED, on the assumption that high-priority work needs a date. That assumption is wrong (research and watch-list tasks are legitimately dateless), so the scan generated dismiss-or-paper-over cleanup at every wrap-up.
The replacement watches the daily task-review habit instead. It calls task-review-staleness.sh todo.org 30 and, when the count is non-zero, writes one summary line to the follow-ups file: N top-level tasks unreviewed for >30 days. There's no per-task dump, because the per-task walk is the review habit's job. Staleness, not datelessness, is the signal worth surfacing.
The change lands in both the canonical workflow and the project mirror.
|
| |
|
|
|
|
|
|
| |
First component of the daily task-review habit from docs/design/task-review.org. The staleness count is the shared primitive both the wrap-up health check (threshold 30) and the startup reminder (threshold 7) call, so it lives in one tested script rather than being reimplemented in each workflow.
The script counts top-level todo.org tasks whose review has gone stale: depth-2 headings with a TODO/DOING/VERIFY keyword and an [#A]/[#B]/[#C] cookie, where LAST_REVIEWED is missing, unparseable, or older than the threshold. Age uses a strict greater-than, so a task reviewed exactly N days ago is still fresh. Today normalizes to local midnight before the diff, and the day count rounds to the nearest day, so a DST hour can't push a boundary task across the line.
Twelve bats cases cover the normal, boundary, and error categories. Dates are generated relative to the current date rather than hardcoded. The script path resolves as the sibling-of-parent of the test file, so the suite runs identically from the canonical claude-templates tree and the rsync'd project mirror. Makefile test target now globs .ai/scripts/tests for bats alongside scripts/tests.
|