| author | Craig Jennings <c@cjennings.net> | 2026-06-20 23:04:19 -0400 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2026-06-20 23:04:19 -0400 |
| commit | d9d3be90921d57577115fc778542219bfebe72ab (patch) | |
| tree | eb17aaf620291a87d66f0e71679df7c75fa8526b /docs/design | |
| parent | f5609ec3aa591e93938992e329c4732332288517 (diff) | |
docs: capture ntfy, flashcard, and triage phone-push proposals
Three contributions came in from the home and work projects: a proposal to promote the self-hosted ntfy phone channel into a general two-way agent-comms tool, a flashcard multi-tag tooling change to reconcile into the canonical scripts, and a triage-intake auto-mode phone-push delivery step. I preserved each as a design bundle under docs/design and queued the build/reconcile work, since all three change shared assets and need review before landing.
Diffstat (limited to 'docs/design')
| -rw-r--r-- | docs/design/2026-06-17-flashcard-multitag-note.md | 28 |
| -rwxr-xr-x | docs/design/2026-06-17-flashcard-multitag-stats.py | 332 |
| -rwxr-xr-x | docs/design/2026-06-17-flashcard-multitag-to-anki.py | 294 |
| -rw-r--r-- | docs/design/2026-06-17-ntfy-agent-comms-proposal.org | 89 |
| -rw-r--r-- | docs/design/2026-06-18-triage-intake-phone-push-note.org | 11 |
| -rw-r--r-- | docs/design/2026-06-18-triage-intake-phone-push-workflow.org | 427 |
6 files changed, 1181 insertions, 0 deletions
diff --git a/docs/design/2026-06-17-flashcard-multitag-note.md b/docs/design/2026-06-17-flashcard-multitag-note.md new file mode 100644 index 0000000..cfe960d --- /dev/null +++ b/docs/design/2026-06-17-flashcard-multitag-note.md @@ -0,0 +1,28 @@ +# Flashcard tooling: multi-tag headings + --tag-filter (for curated subset decks) + +Source: work project, 2026-06-17. Two synced scripts edited locally as a stopgap; please reconcile into the rulesets canonical so the next sync doesn't revert them. + +## What changed and why + +Craig wanted a curated "DeepSat Fundamentals" Anki deck: the 100 most fundamental cards out of the 465-card deepsat.org org-drill deck, marked with a second org tag (`:fundamental:`) so they stay findable/grep-able in the source. + +The blocker: both `flashcard-to-anki.py` and `flashcard-stats.py` keyed cards on a heading ending in exactly ` :drill:` (`CARD_RE = ^\*\*\s+(.+?)\s+:drill:\s*$`). Adding any second org tag turns the heading into `... :fundamental:drill:`, which that regex does not match — so the 100 tagged cards would silently drop from the full-deck apkg and be undercounted by stats. The "passing gate skips your file" failure mode. + +## flashcard-to-anki.py + +- `CARD_RE` now matches a trailing org tag block (`^\*\*\s+(.*?)\s+(:[A-Za-z0-9_@#%:]+:)\s*$`); a heading is a card when `drill` is among its tags. Other org tags ride along as Anki tags next to the section tag. +- Card body is now bounded by any L1/L2 heading (`HEADING_RE = ^\*{1,2}\s`) instead of only `* ` or a drill heading. +- New `--tag-filter <tag>`: emit only cards carrying that org tag (e.g. `--tag-filter fundamental` → the 100-card subset). +- New `--guid-salt <s>`: salt note GUIDs so a derived subset deck gets its own GUID space. Without it, the subset's notes share fronts with the full deck, Anki dedupes on GUID, and the subset deck imports empty. Default (no salt) is unchanged — `guid_for(front)` — so the existing deepsat deck's GUIDs and SRS state are untouched. 
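The regex change described in the bullets above can be checked in isolation. The following is a minimal sketch, not the shipped script: `OLD_CARD_RE` and `CARD_RE` are copied from the patterns quoted in this note and in the script, while the helper name `card_tags` is hypothetical and introduced only for illustration.

```python
from __future__ import annotations

import re

# Old pattern: the heading must end in exactly " :drill:".
OLD_CARD_RE = re.compile(r"^\*\*\s+(.+?)\s+:drill:\s*$")
# Broadened pattern: capture the whole trailing org tag block.
CARD_RE = re.compile(r"^\*\*\s+(.+?)\s+(:[A-Za-z0-9_@#%:]+:)\s*$")


def card_tags(line: str) -> list[str] | None:
    """Return the org tags of a drill card heading, or None if it is not a card.

    Hypothetical helper: a heading counts as a card only when `drill`
    is among the tags in its trailing tag block.
    """
    m = CARD_RE.match(line)
    if not m:
        return None
    tags = [t for t in m.group(2).split(":") if t]
    return tags if "drill" in tags else None
```

Under these assumptions, a `:fundamental:drill:` heading that the old pattern rejects is recognized as a card by the new one, and the non-`drill` tags stay available for `--tag-filter`.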
+ +Generation used: `flashcard-to-anki.py deepsat.org --tag-filter fundamental --deck "DeepSat Fundamentals" --guid-salt fundamentals`. + +## flashcard-stats.py + +- Same `CARD_RE` broadening + `HEADING_RE` body bound, and the card guard now checks `drill` membership in the tag block. Verified: full deck still counts 465 after 100 cards were multi-tagged. + +## Companion files to reconcile + +- Both rulesets copies: `~/code/rulesets/.ai/scripts/` and `~/code/rulesets/claude-templates/.ai/scripts/` (the synced source). +- `claude-templates/.ai/scripts/tests/flashcard-sync.bats` — worth adding a multi-tag case (a `:foo:drill:` heading still parses; `--tag-filter foo` returns only those) so this doesn't regress. +- Regression checked locally: full deck parses to 465 with and without the change; `--tag-filter fundamental` returns exactly 100. diff --git a/docs/design/2026-06-17-flashcard-multitag-stats.py b/docs/design/2026-06-17-flashcard-multitag-stats.py new file mode 100755 index 0000000..3c984e7 --- /dev/null +++ b/docs/design/2026-06-17-flashcard-multitag-stats.py @@ -0,0 +1,332 @@ +#!/usr/bin/env python3 +"""Inventory + authoring-quality checks for an org-drill deck source file. + +Reports counts and flags two tiers of issue. 
+ +Blocking WARNs (exit 1): +- PROPERTIES drawer count not matching card count +- Cards missing :ID: (risks SRS-state loss across rewrites) +- `*** Answer` sub-headers (should be 0 per flashcard-review.org) +- Non-prompt headings (topic-as-heading not yet rewritten) +- #+TITLE missing, or carrying source-tool jargon ("org-drill") +- Answer leakage: a card whose question echoes most of its own answer + (Source: citation lines and created-date lines are excluded from the + overlap, and range/category cards that recall numbers are exempted) +- Duplicate / near-duplicate fronts (interference between confusable cards) + +Non-blocking NOTEs (exit unaffected): +- Overloaded backs (long answer — candidate to split into atomic cards) +- List-shaped backs (enumeration — candidate to split or use overlapping cloze) +- Binary yes/no prompts (low retrieval effort — candidate to reformulate) + +Exits 0 when no blocking warnings are present, 1 otherwise, 2 on bad usage. +Use as a gate before regenerating the Anki deck or running flashcard-sync. + +The fuzzy checks (leakage, duplicate, overloaded) are tuned by the LEAKAGE_* +and BACK_WORD_LIMIT constants below; loosen them if a real deck trips false +positives. + +Usage: + flashcard-stats.py <file.org> +""" +from __future__ import annotations + +import re +import sys +from pathlib import Path + +# A level-2 card heading carries a trailing org tag block that includes +# `drill`. The block may hold more than one tag (e.g. ":fundamental:drill:"), +# so match the whole block and check membership rather than pinning :drill: +# as the literal last tag. 
+CARD_RE = re.compile(r"^\*\*\s+(.+?)\s+(:[A-Za-z0-9_@#%:]+:)\s*$") +HEADING_RE = re.compile(r"^\*{1,2}\s") +ANSWER_RE = re.compile(r"^\*\*\*\s+Answer\b") +PROP_START_RE = re.compile(r"^\s*:PROPERTIES:\s*$") +PROP_END_RE = re.compile(r"^\s*:END:\s*$") +ID_RE = re.compile(r"^\s*:ID:\s+(\S+)\s*$") +TITLE_RE = re.compile(r"^#\+TITLE:\s*(.+?)\s*$", re.IGNORECASE) +SOURCE_TOOL_RE = re.compile(r"\borg[-\s]?drill\b", re.IGNORECASE) +PLANNING_RE = re.compile(r"^\s*(SCHEDULED|DEADLINE|CLOSED):\s") +SOURCE_LINE_RE = re.compile(r"^\s*source:\s", re.IGNORECASE) +CREATED_LINE_RE = re.compile(r"^\s*:?created:?\s", re.IGNORECASE) +RANGE_RE = re.compile(r"\d[^\n]*[-–—]\s*\d") +THRESHOLD_RE = re.compile(r"[<>≤≥]\s*\d") +BULLET_RE = re.compile(r"^\s*([-+*]|\d+[.)])\s+") +BINARY_LEAD_RE = re.compile( + r"^\s*(is|are|was|were|does|do|did|can|could|should|would|will|has|have|had)\b", + re.IGNORECASE, +) + +# A heading qualifies as "prompt form" if it contains `?` or starts with one of +# these imperative verbs (directive prompts like "Spell these out" and +# "Introduce yourself" are valid even without `?`). +IMPERATIVE_VERBS = frozenset({ + "spell", "describe", "explain", "name", "list", "give", + "show", "tell", "define", "compare", "identify", "outline", + "introduce", "walk", "state", "recite", "recall", "summarize", +}) + +# Function words ignored when comparing a question against its answer. +STOPWORDS = frozenset({ + "the", "a", "an", "is", "are", "was", "were", "of", "to", "in", "on", + "for", "and", "or", "with", "what", "who", "whom", "when", "where", "why", + "how", "which", "does", "do", "did", "tell", "me", "about", "their", "this", + "that", "it", "as", "at", "by", "be", "your", "you", "they", "them", +}) + +# Tuning knobs for the fuzzy checks. 
+LEAKAGE_RATIO = 0.8 # share of a question's content words echoed in its answer +LEAKAGE_MIN_WORDS = 3 # ignore very short questions, where overlap is noise +BACK_WORD_LIMIT = 60 # words on a card back before it's flagged as overloaded + + +def is_prompt_form(heading: str) -> bool: + """True if the heading reads as a question or imperative prompt.""" + if "?" in heading: + return True + first_word = heading.split(None, 1)[0].lower().rstrip(":,;") + return first_word in IMPERATIVE_VERBS + + +def content_words(text: str) -> set[str]: + """Lowercased alphanumeric tokens of length >= 3, minus stopwords.""" + return {w for w in re.findall(r"[a-z0-9]+", text.lower()) + if len(w) >= 3 and w not in STOPWORDS} + + +def leakage_ratio(heading: str, body: str) -> float: + """Fraction of the question's content words that reappear in the answer. + + A high ratio means the answer is largely restated in the question, so the + card can be answered by recognition rather than recall. Returns 0.0 for a + question with fewer than LEAKAGE_MIN_WORDS content words, where overlap is + just noise. + """ + hw = content_words(heading) + if len(hw) < LEAKAGE_MIN_WORDS: + return 0.0 + return len(hw & content_words(body)) / len(hw) + + +def prose_body(body: str) -> str: + """Body with Source: citation and created-date lines removed. + + Those lines are metadata, not the answer. A Source line's URL slug often + repeats the question's words, and a created date is bookkeeping — neither + should count toward answer-leakage overlap. + """ + return "\n".join( + ln for ln in body.splitlines() + if not SOURCE_LINE_RE.match(ln) and not CREATED_LINE_RE.match(ln) + ) + + +def has_distinct_numeric_recall(heading: str, body: str) -> bool: + """True if the answer carries numeric ranges/thresholds the question lacks. 
+ + A range/category card ("What are the HbA1c ranges across normal, + prediabetes, and diabetes?") echoes its categories in the answer, but the + recalled content is the numbers, which the question doesn't give away — so + high word overlap isn't leakage. + """ + body_nums = bool(RANGE_RE.search(body) or THRESHOLD_RE.search(body)) + head_nums = bool(RANGE_RE.search(heading) or THRESHOLD_RE.search(heading)) + return body_nums and not head_nums + + +def is_leaky(heading: str, body: str) -> bool: + """True if a card leaks its answer, after excluding citation lines and + numeric-recall (range/category) cards.""" + prose = prose_body(body) + if leakage_ratio(heading, prose) < LEAKAGE_RATIO: + return False + return not has_distinct_numeric_recall(heading, prose) + + +def normalize_heading(heading: str) -> str: + """Collapse a heading to a comparison key (lowercase, alnum + single spaces).""" + return re.sub(r"\s+", " ", re.sub(r"[^a-z0-9 ]", " ", heading.lower())).strip() + + +def is_binary_prompt(heading: str) -> bool: + """True for yes/no or 'A or B' prompts, which need little retrieval effort.""" + if BINARY_LEAD_RE.match(heading): + return True + return bool(re.search(r"\bor\b", heading, re.IGNORECASE)) and heading.rstrip().endswith("?") + + +def back_word_count(body: str) -> int: + return len(body.split()) + + +def is_list_back(body: str) -> bool: + """True if the answer body is mostly an org list (an enumeration card).""" + lines = [ln for ln in body.splitlines() if ln.strip()] + if len(lines) < 2: + return False + bullets = sum(1 for ln in lines if BULLET_RE.match(ln)) + return bullets >= 2 and bullets * 2 >= len(lines) + + +def parse_cards(lines: list[str]) -> tuple[list[dict], int]: + """Parse :drill: cards from org lines. + + Returns (cards, prop_count). Each card is a dict with heading, has_id, + has_answer, and body (the answer text with PROPERTIES drawers, planning + lines, and `*** Answer` headers removed, approximating the rendered back). 
+ """ + cards: list[dict] = [] + prop_count = 0 + i = 0 + n = len(lines) + while i < n: + m = CARD_RE.match(lines[i]) + if not m or "drill" not in [t for t in m.group(2).split(":") if t]: + i += 1 + continue + heading = m.group(1).strip() + i += 1 + has_id = False + has_answer = False + in_drawer = False + body_lines: list[str] = [] + while i < n: + line = lines[i] + if HEADING_RE.match(line): + break + if PROP_START_RE.match(line): + prop_count += 1 + in_drawer = True + elif in_drawer and PROP_END_RE.match(line): + in_drawer = False + elif in_drawer: + if ID_RE.match(line): + has_id = True + elif ANSWER_RE.match(line): + has_answer = True + elif PLANNING_RE.match(line): + pass + else: + body_lines.append(line) + i += 1 + cards.append({ + "heading": heading, + "has_id": has_id, + "has_answer": has_answer, + "body": "\n".join(body_lines).strip(), + }) + return cards, prop_count + + +def find_duplicate_fronts(cards: list[dict]) -> list[tuple[str, str]]: + """Return (first, dup) heading pairs that normalize to the same key.""" + seen: dict[str, str] = {} + dups: list[tuple[str, str]] = [] + for c in cards: + key = normalize_heading(c["heading"]) + if not key: + continue + if key in seen: + dups.append((seen[key], c["heading"])) + else: + seen[key] = c["heading"] + return dups + + +def main() -> int: + if len(sys.argv) != 2: + print(f"usage: {sys.argv[0]} <file.org>", file=sys.stderr) + return 2 + + path = Path(sys.argv[1]).expanduser().resolve() + if not path.is_file(): + print(f"error: {path} not found", file=sys.stderr) + return 2 + + lines = path.read_text(encoding="utf-8").splitlines() + + title: str | None = None + for line in lines[:20]: + m = TITLE_RE.match(line) + if m: + title = m.group(1).strip() + break + + cards, prop_count = parse_cards(lines) + + no_id = [c["heading"] for c in cards if not c["has_id"]] + not_prompt = [c["heading"] for c in cards if not is_prompt_form(c["heading"])] + answer_count = sum(1 for c in cards if c["has_answer"]) + leaky = 
[c["heading"] for c in cards if is_leaky(c["heading"], c["body"])] + dups = find_duplicate_fronts(cards) + overloaded = [c["heading"] for c in cards if back_word_count(c["body"]) > BACK_WORD_LIMIT] + listy = [c["heading"] for c in cards if is_list_back(c["body"])] + binary = [c["heading"] for c in cards if is_binary_prompt(c["heading"])] + + print(f"{path.name} — drill deck stats") + print() + print(f"Deck title: {title if title else '(no #+TITLE)'}") + print(f"Cards: {len(cards)}") + drawer_status = "match" if prop_count == len(cards) else f"mismatch (expected {len(cards)})" + print(f"PROPERTIES drawers: {prop_count} ({drawer_status})") + print(f"*** Answer sub-headers: {answer_count} ({'clean' if answer_count == 0 else 'workflow violation'})") + print(f"Cards missing :ID:: {len(no_id)}") + print(f"Cards with non-prompt heading: {len(not_prompt)}") + print(f"Cards with possible answer leakage: {len(leaky)}") + print(f"Duplicate / near-duplicate fronts: {len(dups)}") + print() + + warnings = 0 + + def emit_list(items: list[str]) -> None: + for h in items[:5]: + print(f" - {h}") + if len(items) > 5: + print(f" - ... 
and {len(items) - 5} more") + + def warn(msg: str, items: list[str] | None = None) -> None: + nonlocal warnings + warnings += 1 + print(f"WARN: {msg}") + if items: + emit_list(items) + + def note(msg: str, items: list[str] | None = None) -> None: + print(f"NOTE: {msg}") + if items: + emit_list(items) + + if title is None: + warn("no #+TITLE: line found; deck name will fall back to the file basename") + elif SOURCE_TOOL_RE.search(title): + warn(f"#+TITLE contains source-tool jargon ('{title}'); the deck name shows in Anki — drop 'Org-Drill' for a name that reads well on the consumption side") + if answer_count: + warn(f"{answer_count} cards have *** Answer sub-headers (drop per flashcard-review.org)") + if prop_count != len(cards): + warn(f"PROPERTIES count {prop_count} does not match card count {len(cards)}") + if no_id: + warn(f"{len(no_id)} cards missing :ID:; losing identity risks SRS-state loss across rewrites", no_id) + if not_prompt: + warn(f"{len(not_prompt)} cards have non-prompt headings (no '?' 
and no imperative-verb start); likely topic-as-heading not yet rewritten", not_prompt) + if leaky: + warn(f"{len(leaky)} cards may leak their answer (question echoes >= {int(LEAKAGE_RATIO * 100)}% of its own answer's key words); reformulate so the answer is recalled, not recognized", leaky) + if dups: + warn(f"{len(dups)} duplicate / near-duplicate fronts (interference between confusable cards); disambiguate or merge", + [f"{a} == {b}" for a, b in dups]) + + if overloaded: + note(f"{len(overloaded)} cards have a long answer (> {BACK_WORD_LIMIT} words); candidates to split into atomic cards", overloaded) + if listy: + note(f"{len(listy)} cards have a list-shaped answer; enumeration cards recall poorly — candidates to split or use overlapping cloze", listy) + if binary: + note(f"{len(binary)} cards are binary (yes/no or 'A or B'); low retrieval effort — candidates to reformulate open-ended", binary) + + if warnings == 0: + print("clean (with non-blocking notes above)" if (overloaded or listy or binary) else "clean") + return 0 + return 1 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/docs/design/2026-06-17-flashcard-multitag-to-anki.py b/docs/design/2026-06-17-flashcard-multitag-to-anki.py new file mode 100755 index 0000000..3764acf --- /dev/null +++ b/docs/design/2026-06-17-flashcard-multitag-to-anki.py @@ -0,0 +1,294 @@ +#!/usr/bin/env -S uv run --script +# /// script +# requires-python = ">=3.11" +# dependencies = [ +# "genanki>=0.13", +# ] +# /// +"""Convert an org-drill file into an Anki .apkg deck. + +Parses org-drill structure: + - Top-level "* Section" headings become tags on every card under them. + - Each "** Card name :drill:" entry becomes a card. Front = heading + text (sans the org tag block). Back = entry body with newlines + converted to <br>. + +A card heading may carry more than one org tag (e.g. +"** Question :fundamental:drill:"). 
Any heading whose trailing tag block +includes `drill` is a card; the other org tags ride along as Anki tags +next to the section tag. Pass --tag-filter <tag> to emit only the cards +carrying that org tag (e.g. a curated "fundamentals" subset). + +Deck name defaults to the input basename, case preserved. Deck and model +IDs are derived from the deck name via stable hash so re-importing the +same deck updates existing cards instead of duplicating them. + +Note GUIDs default to a hash of the card front, so re-running against the +same source preserves SRS state. A derived subset deck (one built with +--tag-filter) should pass --guid-salt so its notes get a distinct GUID +space and Anki treats it as a separate deck rather than merging its cards +into a full deck that shares the same fronts. + +Output defaults to ~/sync/phone/anki/<input-basename>.apkg. The .apkg is +a mobile-Anki artifact the phone picks up from its sync dir, so it lands +there rather than next to the org source. + +Usage: + flashcard-to-anki.py <input.org> + flashcard-to-anki.py <input.org> --deck "My Deck Name" + flashcard-to-anki.py <input.org> --output /path/to/deck.apkg + flashcard-to-anki.py <input.org> --tag-filter fundamental \ + --deck "DeepSat Fundamentals" --guid-salt fundamentals + +Requires genanki, which uv resolves automatically via the PEP 723 +script metadata above. No venv or system install needed. +""" +from __future__ import annotations + +import argparse +import hashlib +import re +import sys +from pathlib import Path + +import genanki + +# 32-bit integer space genanki accepts. Start above the conventional +# "user model" floor so collisions with hand-written decks stay +# unlikely. +ID_BASE = 1_500_000_000 +ID_RANGE = 500_000_000 + + +def stable_id(name: str, salt: str) -> int: + """Derive a deterministic 32-bit id from `name` and a `salt`. 
+ + Same (name, salt) pair always returns the same id, so re-running + against the same source produces a stable deck/model id pair and + Anki imports update existing cards in place rather than duplicating. + """ + h = hashlib.sha256(f"{salt}:{name}".encode()).hexdigest() + return ID_BASE + (int(h[:8], 16) % ID_RANGE) + + +def make_model(deck_name: str) -> genanki.Model: + return genanki.Model( + stable_id(deck_name, "model"), + f"{deck_name} (Craig)", + fields=[{"name": "Front"}, {"name": "Back"}], + templates=[ + { + "name": "Card 1", + "qfmt": "{{Front}}", + "afmt": '{{FrontSide}}<hr id="answer">{{Back}}', + } + ], + css=( + ".card { font-family: sans-serif; font-size: 18px; " + "color: #222; background: #fafafa; line-height: 1.45; }\n" + "hr#answer { margin: 14px 0; }\n" + ), + ) + + +def section_to_tag(title: str) -> str: + return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") + + +def escape_html(s: str) -> str: + return ( + s.replace("&", "&amp;") + .replace("<", "&lt;") + .replace(">", "&gt;") + ) + + +def strip_org_metadata(body_lines: list[str]) -> list[str]: + """Drop :PROPERTIES: drawers, planning lines, and created-date lines. + + Org-drill needs these in the source file (SRS state lives in the + PROPERTIES drawer; SCHEDULED carries the next-review date), but they + are noise on the back of an Anki card. A created/added date never + belongs on a card, so a stray "Created:" or ":CREATED:" body line is + dropped too. 
+ """ + cleaned: list[str] = [] + in_drawer = False + planning_re = re.compile(r"^\s*(SCHEDULED|DEADLINE|CLOSED):\s") + created_re = re.compile(r"^\s*:?created:?\s", re.IGNORECASE) + drawer_start_re = re.compile(r"^\s*:PROPERTIES:\s*$") + drawer_end_re = re.compile(r"^\s*:END:\s*$") + for line in body_lines: + if in_drawer: + if drawer_end_re.match(line): + in_drawer = False + continue + if drawer_start_re.match(line): + in_drawer = True + continue + if planning_re.match(line) or created_re.match(line): + continue + cleaned.append(line) + return cleaned + + +# A level-2 heading carrying a trailing org tag block. Group 1 is the +# front text, group 2 the colon-delimited tag block (e.g. ":fundamental:drill:"). +CARD_RE = re.compile(r"^\*\*\s+(.+?)\s+(:[A-Za-z0-9_@#%:]+:)\s*$") +# Any level-1 or level-2 heading — used to bound a card's body. +HEADING_RE = re.compile(r"^\*{1,2}\s") +SECTION_RE = re.compile(r"^\*\s+(.+?)\s*$") + + +def parse( + org_text: str, tag_filter: str | None = None +) -> list[tuple[str, str, list[str]]]: + """Return [(front, back_html, anki_tags), ...] for every :drill: card. + + A card is any level-2 heading whose trailing org tag block includes + `drill`. Additional org tags become Anki tags alongside the section + tag. When `tag_filter` is set, only cards carrying that org tag are + returned. 
+ """ + cards: list[tuple[str, str, list[str]]] = [] + current_section: str | None = None + + lines = org_text.splitlines() + i = 0 + while i < len(lines): + line = lines[i] + + sec = SECTION_RE.match(line) + if sec: + current_section = sec.group(1).strip() + i += 1 + continue + + m = CARD_RE.match(line) + tags = [t for t in m.group(2).split(":") if t] if m else [] + if m and "drill" in tags: + front = m.group(1).strip() + body_lines: list[str] = [] + i += 1 + while i < len(lines): + nxt = lines[i] + if HEADING_RE.match(nxt): + break + body_lines.append(nxt) + i += 1 + body_lines = strip_org_metadata(body_lines) + while body_lines and not body_lines[0].strip(): + body_lines.pop(0) + while body_lines and not body_lines[-1].strip(): + body_lines.pop() + back_html = "<br>".join(escape_html(ln) for ln in body_lines) + + org_tags = [t for t in tags if t != "drill"] + if tag_filter and tag_filter not in org_tags: + continue + anki_tags: list[str] = [] + if current_section: + anki_tags.append(section_to_tag(current_section)) + anki_tags.extend(org_tags) + if not anki_tags: + anki_tags = ["drill"] + cards.append((front, back_html, anki_tags)) + continue + + i += 1 + + return cards + + +def build( + cards: list[tuple[str, str, list[str]]], + deck_name: str, + guid_salt: str | None = None, +) -> genanki.Deck: + deck = genanki.Deck(stable_id(deck_name, "deck"), deck_name) + model = make_model(deck_name) + for front, back, tags in cards: + guid = ( + genanki.guid_for(guid_salt, front) + if guid_salt + else genanki.guid_for(front) + ) + note = genanki.Note( + model=model, + fields=[front, back], + tags=tags, + guid=guid, + ) + deck.add_note(note) + return deck + + +def default_deck_name(input_path: Path) -> str: + return input_path.stem + + +def default_output_path(input_path: Path) -> Path: + anki_dir = Path.home() / "sync" / "phone" / "anki" + return anki_dir / f"{input_path.stem}.apkg" + + +def main() -> int: + parser = argparse.ArgumentParser( + description="Convert an 
org-drill file into an Anki .apkg deck.", + ) + parser.add_argument( + "input", + type=Path, + help="Path to the org-drill source file.", + ) + parser.add_argument( + "--deck", + help="Deck name. Defaults to the input basename.", + ) + parser.add_argument( + "--output", + type=Path, + help="Output .apkg path. Defaults to " + "~/sync/phone/anki/<input-basename>.apkg.", + ) + parser.add_argument( + "--tag-filter", + help="Emit only cards carrying this org tag (e.g. 'fundamental').", + ) + parser.add_argument( + "--guid-salt", + help="Salt note GUIDs with this string so a derived subset deck " + "gets its own GUID space and Anki keeps it separate from a " + "full deck sharing the same card fronts.", + ) + args = parser.parse_args() + + input_path: Path = args.input.expanduser().resolve() + if not input_path.is_file(): + print(f"error: {input_path} not found", file=sys.stderr) + return 1 + + org_text = input_path.read_text(encoding="utf-8") + deck_name = args.deck or default_deck_name(input_path) + output_path: Path = (args.output or default_output_path(input_path)).expanduser().resolve() + output_path.parent.mkdir(parents=True, exist_ok=True) + + cards = parse(org_text, tag_filter=args.tag_filter) + if not cards: + if args.tag_filter: + print( + f"error: no :drill: cards tagged :{args.tag_filter}: in {input_path}", + file=sys.stderr, + ) + else: + print(f"error: no :drill: cards found in {input_path}", file=sys.stderr) + return 1 + + deck = build(cards, deck_name, guid_salt=args.guid_salt) + genanki.Package(deck).write_to_file(str(output_path)) + print(f"wrote {output_path} ({len(cards)} cards, deck '{deck_name}')") + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/docs/design/2026-06-17-ntfy-agent-comms-proposal.org b/docs/design/2026-06-17-ntfy-agent-comms-proposal.org new file mode 100644 index 0000000..ce17138 --- /dev/null +++ b/docs/design/2026-06-17-ntfy-agent-comms-proposal.org @@ -0,0 +1,89 @@ +#+TITLE: Proposal — Promote the 
ntfy phone channel into a general agent-comms tool +#+AUTHOR: Craig Jennings & Claude (home project) +#+DATE: 2026-06-17 + +* Why this is in rulesets' inbox + +The home project built a working, private phone-notification channel for Craig on 2026-06-17 (self-hosted ntfy over Tailscale). Craig wants rulesets to consider promoting it from a one-way notification system into a *general two-way communication tool* between him and his agents — and, critically, to move it off pure polling toward event-driven delivery (an inbound message can trigger an action or notify an agent, not just sit in a queue waiting to be polled). + +This is a proposal, not a change to anything rulesets owns. It documents exactly what exists, what ntfy makes possible, and the open design decisions rulesets would own. It also relates directly to the cross-agent-comms scripts that were retired from the templates in this same session — ntfy may be the transport layer that effort was missing. + +* Part 1 — What exists now (as-built, verified) + +- *Server:* ntfy in Docker on =ratio= at =~/docker/ntfy/= (=compose.yml= + =server.yml=, =data/= volume, =restart=unless-stopped=, healthcheck on =/v1/health=). Listens container :80, published to =127.0.0.1:2586=. +- *Tailnet exposure:* =tailscale serve --bg --http=80 http://127.0.0.1:2586= → reachable at =http://ratio.tailf3bb8c.ts.net= (tailnet only, no public exposure). Disable with =tailscale serve --http=80 off=. +- *Transport security:* plain HTTP, but every byte rides inside the WireGuard mesh (Tailscale), so it is encrypted end to end. The Tailscale account does not support TLS certs, and TLS would be redundant on the tailnet anyway. If ever exposed publicly, TLS + stronger auth become mandatory. +- *Auth:* =auth-default-access: deny-all=. User =cj= has read-write on topics =claude= and =infra=. Anonymous is denied — verified 403 on both publish AND subscribe without a token. Token =tk_…= never expires. App login is username =cj= + a short password. 
+- *Phone:* Pixel 6, ntfy F-Droid build (no Firebase / Google Play Services), WebSocket instant delivery. Already on the tailnet. +- *Publisher wrapper:* =~/.local/bin/phone-notify= (on ratio only). Reads =~/.config/phone-notify/config= (chmod 600: URL, token, topic). Supports =-t/--title=, =-p/--priority=, =-T/--tags=, =--topic=, =--click=, =--url=. +- *Verified two-way:* agent → phone push lands instantly; phone → publish to the topic lands on the server and is readable by the agent (Craig sent "It did, in fact, land." from the app and the agent polled and saw it). + +* Part 2 — The ntfy building blocks rulesets can use + +** Publish (agent → phone), already wired +- =curl -H "Authorization: Bearer <token>" -d "msg" <url>/<topic>= or =phone-notify=. +- Rich features available, unused so far: =Priority= (1-5), =Tags= (emoji/keywords), =Click= (URL opened on tap), =Actions= (tappable buttons — =view= a URL, =http= fire a request, =broadcast= an Android intent), =Attach= (files/images), =Markdown=, scheduled/delayed delivery (=At:= / =Delay:= header), and email/call forwarding. + +** Read (agent ← phone) +- One-shot poll, all cached: =GET /<topic>/json?poll=1= (needs the token). +- Only-new since a point: =?since=<id|timestamp|duration>= (e.g. =?since=5m= or =?since=<last-seen-id>=). This is the basis of a =phone-recv= helper that prints only messages newer than the last one seen. +- Cache window is 12h (=cache-duration= in server.yml), so on-demand polling never misses a recent message. + +** Subscribe with side effects (the event-driven primitive) +- =ntfy subscribe <topic> '<command>'= holds a persistent connection (WebSocket / JSON stream) and runs =<command>= for *every* inbound message, with fields exposed as environment variables (=$message=, =$title=, =$topic=, =$tags=, =$priority=, etc.). +- This is the answer to "not all polling": a long-running subscriber reacts the instant a message arrives. 
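The read path in Part 2 could be sketched roughly as follows. This is an illustrative outline of a possible =phone-recv= helper, not an existing script: the function names `poll_url`, `new_messages`, and `fetch` are hypothetical, and it assumes the poll endpoint returns one JSON object per line with an `event` field, filtering defensively for `"message"` events.

```python
from __future__ import annotations

import json
import urllib.request
from urllib.parse import urlencode


def poll_url(base_url: str, topic: str, since: str | None = None) -> str:
    """Build the one-shot poll URL; `since` may be an id, timestamp, or duration."""
    params = {"poll": "1"}
    if since:
        params["since"] = since
    return f"{base_url.rstrip('/')}/{topic}/json?{urlencode(params)}"


def new_messages(ndjson: str) -> list[dict]:
    """Parse a newline-delimited JSON body, keeping only real messages."""
    out: list[dict] = []
    for line in ndjson.splitlines():
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("event") == "message":
            out.append(event)
    return out


def fetch(base_url: str, topic: str, token: str, since: str | None = None) -> list[dict]:
    """Poll the topic once with a bearer token (network call, not exercised here)."""
    req = urllib.request.Request(
        poll_url(base_url, topic, since),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return new_messages(resp.read().decode("utf-8"))
```

A caller would persist the `id` of the last message returned and pass it as `since` on the next run, which is one way to get the "only messages newer than the last one seen" behavior described above.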
+ +* Part 3 — Making it event-driven (Craig's core ask) + +Three tiers, increasing capability and difficulty: + +** Tier A — Subscriber daemon routes inbound (clean, doable now) +A systemd *user* service on ratio (always-on): +#+begin_src +ntfy subscribe --since=<last> claude /usr/local/bin/ntfy-inbound-handler +#+end_src +=ntfy-inbound-handler= classifies the message and routes it: +- Append to a watched queue (a project =inbox/= or a dedicated comms file) → the next agent session picks it up at a task boundary (already in protocols: inbox check at task boundaries). +- Fire desktop =notify= so a human at a screen sees it immediately. +- Tag-based dispatch: =#task= → file as a TODO; =#infra= → infra queue; etc. + +This gets us instant reaction with zero polling, and it degrades gracefully — if nothing is listening, the message still sits in the queue. + +** Tier B — Inbound spawns a *new* agent session +The handler invokes the =ai= launcher (or a scheduled/cron Claude run) to process the message autonomously. An inbound phone text becomes an agent action — "remind me to X" from anywhere, "what's the status of Y", "approve the pending commit". This is where it stops being a notifier and becomes a remote control for the agent fleet. Ties into the harness cron/schedule features and the retired cross-agent-comms intent. + +** Tier C — Notify / interrupt a *live* agent session (hardest, harness-dependent) +A turn-based session has no native external interrupt. Honest options to explore: +- The session runs a background subscriber/poll loop; the harness re-invokes the agent when backgrounded work emits or completes (the background-Bash + Monitor + ScheduleWakeup / =/loop= dynamic-pacing mechanisms). +- A =/loop= that polls the topic every N seconds (still polling, but bounded and cheap). +- Whatever the harness exposes for inbound push into a live session (e.g. a RemoteTrigger / inbound-PushNotification path) — needs experimentation. 
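The Tier A tag-based dispatch could look something like the sketch below. It is a hedged illustration of what =ntfy-inbound-handler= might do, not a reference implementation: the route table, the destination names, and the function names are all hypothetical, and it assumes the subscriber exposes the message fields as the environment variables =message= and =tags=, with tags as a comma-separated list.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Hypothetical routing convention: tag -> destination queue name.
ROUTES = {"task": "todo", "infra": "infra-queue"}
DEFAULT_ROUTE = "inbox"  # fallback when no known tag is present


def route_for(tags: str) -> str:
    """Return the destination for the first recognized tag, else the default."""
    for raw in tags.split(","):
        tag = raw.strip().lstrip("#").lower()
        if tag in ROUTES:
            return ROUTES[tag]
    return DEFAULT_ROUTE


def handle(env: Mapping[str, str] = os.environ) -> str:
    """Classify one inbound message from the subscriber's environment."""
    message = env.get("message", "")
    dest = route_for(env.get("tags", ""))
    # A real handler would append to a queue file or fire a desktop
    # notification here; this sketch only reports the decision.
    print(f"[{dest}] {message}")
    return dest
```

Falling back to a default queue keeps the graceful-degradation property described for Tier A: a message with no recognized tag is still filed where the next agent session will find it.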
+ +Recommendation: ship Tier A first (high value, low risk), prototype Tier B, treat Tier C as research. + +* Part 4 — The general-comms vision (beyond notifications) + +- *Channels as topics:* =claude= (agent ↔ Craig), =infra= (server/health/backup alerts — the DEGRADED-pool class), per-project topics, a cross-agent topic. +- *Bidirectional chat:* Craig texts his agent from anywhere over Tailscale; the agent replies. Effectively private, self-hosted "SMS with your agent." +- *Approval buttons:* the publish =Actions= feature can render Approve / Reject buttons on the phone. For the commits.md approval gates (commit message, PR body, PR review) when Craig is away from the desk, a tapped button fires a webhook the handler turns into "proceed." This is a concrete, high-value use. +- *Attachments:* agent sends a generated screenshot/report to the phone; Craig sends a photo to the agent. + +* Part 5 — What rulesets would own / decide + +1. *Canonical tooling:* promote =phone-notify= (send) and add =phone-recv= (check-since) as rulesets bin scripts, synced to all machines via dotfiles/templates. Today =phone-notify= lives only on ratio. +2. *Config + secret convention:* where the server URL + token live per machine (=~/.config/phone-notify/config= chmod 600 today), and whether the token should be a rulesets-managed GPG-encrypted secret distributed via dotfiles. +3. *The subscriber daemon:* a reference =ntfy-inbound-handler= + a systemd user-unit template, plus the routing convention (tags → destinations). +4. *Protocol conventions:* topic naming, a message format/tag vocabulary for routing, and how inbound maps to the existing =inbox/= and (retired) cross-agent-comms protocols. +5. *Harness integration:* how — if at all — to wake or notify a live/new agent session on inbound. The Tier C research. +6. 
*Relationship to cross-agent-comms:* decide whether ntfy is the transport that replaces the just-retired scripts, and whether agent↔agent messaging rides the same server (a dedicated topic) or stays separate. + +* Part 6 — Open questions + +- Multi-machine token distribution (per-machine config vs encrypted-in-dotfiles). +- Daemon placement: one always-on subscriber on ratio vs per-machine subscribers. +- Inbound integration with the existing inbox + the retired cross-agent protocols. +- Live-session interrupt feasibility (entirely harness-dependent — needs a spike). +- Whether agent↔agent comms and agent↔Craig comms share a server or are isolated. + +* Companion artifact + +The full as-built runbook (concrete values, server.yml, the verification checklist, the security model) lives in the home project at =working/phone-notifications/spec.org=. This proposal is the forward-looking half; that file is the operational record of what was deployed. diff --git a/docs/design/2026-06-18-triage-intake-phone-push-note.org b/docs/design/2026-06-18-triage-intake-phone-push-note.org new file mode 100644 index 0000000..2f6502b --- /dev/null +++ b/docs/design/2026-06-18-triage-intake-phone-push-note.org @@ -0,0 +1,11 @@ +#+TITLE: WORKFLOW UPDATE — triage-intake.org auto mode gains a phone +#+SOURCE: from work +#+DATE: 2026-06-18 15:15:47 -0500 + +WORKFLOW UPDATE — triage-intake.org auto mode gains a phone (ntfy) delivery step. Supersedes/consolidates the earlier 2026-06-18-1512 send. + +WHAT we did: added a new subsection 'Push each sweep to Craig's phone (ntfy) — the primary delivery' under 'Trigger and delivery' in the auto-mode section. 
It makes phone-notify (the self-hosted ntfy channel over Tailscale) the primary delivery for every auto-mode sweep, pushing the fuller End-of-sweep output (per-source deltas + open-PR/Linear state + the awaiting-ack list + a one-line verdict + a timestamp; SCAN FAILED banner if any source failed), and polling phone-recv each sweep for Craig's replies. Falls back to inline if phone-notify is absent. + +WHY: auto mode exists precisely for when Craig is away from the desk (out of office for a while, on vacation — which is the case right now, traveling 6/17-24). An inline-only report is useless if he is not at the screen; the whole point is reaching his phone. We ran it live this way all day and it worked, so we codified the delivery rather than re-deriving it each time. Craig confirmed this is his standard pattern: he starts auto-triage before leaving the office / on vacation. + +Companion: the reference_phone_notify_ntfy_channel harness memory documents the channel + the high-bar caveat. A separate task is producing standalone ntfy setup instructions (install + start/stop the service) so the channel can be brought up on other machines. diff --git a/docs/design/2026-06-18-triage-intake-phone-push-workflow.org b/docs/design/2026-06-18-triage-intake-phone-push-workflow.org new file mode 100644 index 0000000..cd830fb --- /dev/null +++ b/docs/design/2026-06-18-triage-intake-phone-push-workflow.org @@ -0,0 +1,427 @@ +#+TITLE: Triage Intake Workflow (Engine) +#+AUTHOR: Craig Jennings & Claude +#+DATE: 2026-05-01 + +* Summary + + +Lightweight, between-meetings sweep across whatever sources are plugged in — email, calendar, chat, open PRs, ticketing. Classifies what came in since the last check (Action / FYI / Noise-keep / Noise-trash), produces a single synthesized summary, and offers to execute the routine actions (trash, mark-read, star, respond, merge, attachment fetch). 
+ +Think of it as the ER intake queue: every new message, invite, and PR notification is a "patient" walking through the door. This workflow is the triage nurse looking at the queue and telling Craig what needs attention now, what's just FYI, and what can be cleared. + +*This file is the engine.* It carries no sources of its own. Every source it scans comes from a *source plugin* — a =triage-intake.<source>.org= file the engine loads at Phase 0. The engine is source-agnostic and project-agnostic; the project- and account-specific knowledge lives entirely in the plugins. To add a source, drop a plugin file. To change one, edit its plugin. Never wire a source into this file. + +Distinct from =daily-prep.org=: +- *daily-prep* — heavier, once daily, builds the day's plan + standup brief + meeting prep + time blocks. +- *triage-intake* — fast, repeatable, just answers "what's new since last check?" + + +Quick contract — what it does: fans out across source plugins, classifies every item into Action / FYI / Noise-keep / Noise-trash, synthesizes one deduped summary, writes each Action item to =todo.org= as a =:quick:reactive:= task, and executes star/mark-read/trash on confirmation. + +** When to Use This Workflow + +Trigger phrases: + +- "Run a triage-intake" +- "Triage intake" +- "What's new" / "What's new since I last checked" +- "Do a sweep" / "Do a triage sweep" +- "Check email, calendar, and PRs" + +Typical timing: + +- Between meetings (1-2 minute glance) +- After a long focused-work block +- Before context-switching to a new task +- When ambient anxiety about "did I miss something?" creeps in + +Do *not* use when running daily-prep — daily-prep already does this as Phase 3. + + +* Execution + +** Phase 0 — Load source plugins (MANDATORY — do not skip) + +The engine has no sources baked in. 
It discovers them by globbing *two* directories, and you MUST glob *both*:
+
+#+begin_src bash
+ls .ai/workflows/triage-intake.*.org .ai/project-workflows/triage-intake.*.org 2>/dev/null
+#+end_src
+
+- =.ai/workflows/triage-intake.*.org= — *general* source plugins, template-synced (personal Gmail, personal calendar, cmail/Proton, Telegram, personal GitHub PRs).
+- =.ai/project-workflows/triage-intake.*.org= — *PROJECT-SPECIFIC* source plugins, never synced, owned by this project (e.g. a work project's Linear, work Gmail, work Slack, enterprise-GitHub PRs).
+
+⚠ *THE #1 FAILURE MODE — read this twice.* Globbing only =.ai/workflows/= and silently missing every project plugin. If you skip =.ai/project-workflows/=, the sweep runs with *half its sources* and Craig never learns what it dropped — the omission is invisible, because a missing source looks identical to a quiet source in the output. There is no error, no empty block, no warning. The sweep just lies by omission. *Glob both directories. Always.*
+
+The engine file is excluded automatically: =triage-intake.*.org= matches the plugins but not this engine file (=triage-intake.org= has no second dot-segment), so the engine never loads itself.
+
+After globbing, for each plugin file:
+1. Read it.
+2. Evaluate its =ENABLED= precondition. If false, *announce the skip with its reason* ("skipping linear — mcp__linear not present") and move on.
+3. The surviving set is the source list for Phases A-D.
+
+*Announce the loaded set before scanning* so the omission can't hide:
+
+#+begin_example
+Loaded 5 source plugins:
+  general: personal-gmail, personal-calendar, cmail, github-prs
+  project: deepsat-gmail
+  skipped: linear (mcp__linear not present)
+#+end_example
+
+If the project directory glob returns nothing, say so explicitly ("no project plugins in .ai/project-workflows/") rather than staying silent — silence is indistinguishable from forgetting to look.
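The Phase 0 discovery rule can be sketched in shell as follows. The directory paths and the =triage-intake.*.org= pattern come from the text above. The function names are illustrative, and the sketch only covers discovery and announcement, not evaluation of each plugin's =ENABLED= precondition.

```shell
#!/usr/bin/env bash
# Sketch of Phase 0 discovery: glob both plugin directories and announce each
# result, including an explicit line when a directory has no plugins.

# List the source ids found in one directory (no output when nothing matches).
plugin_ids() {
    local dir="$1" f base
    for f in "$dir"/triage-intake.*.org; do
        [ -e "$f" ] || continue          # an unmatched glob stays literal; skip it
        base="${f##*/}"                  # triage-intake.<source>.org
        base="${base#triage-intake.}"    # <source>.org
        printf '%s\n' "${base%.org}"     # <source>
    done
}

# Announce one directory's plugins, saying so explicitly when there are none.
announce() {
    local label="$1" dir="$2" ids
    ids="$(plugin_ids "$dir" | paste -sd, -)"
    if [ -n "$ids" ]; then
        printf '  %s: %s\n' "$label" "${ids//,/, }"
    else
        printf '  %s: no plugins in %s/\n' "$label" "$dir"
    fi
}

# Always announce both directories, general first, then project.
announce_loaded() {
    announce general "${1:-.ai/workflows}"
    announce project "${2:-.ai/project-workflows}"
}
```

Calling =announce_loaded= with no arguments checks the two default directories relative to the project root. The engine file =triage-intake.org= never appears in the output because the pattern requires a second dot-segment.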
+ + +** Approach: Phases A → D + +*** Phase A: Fan-out (one parallel batch) + +Issue every enabled source's =Scan= command in a single message, with the anchor substituted in each source's declared format. They have no dependencies and benefit from running concurrently. + +Per-source subagent escalation: if a source's scan is expected to return more than its =SUBAGENT_OVER= count (e.g. personal Gmail after a multi-day gap), dispatch a subagent for that source. The subagent applies Phase B classification and returns the synthesized buckets, not the raw item list. + +*** Phase B: Classify per source (shared four-bucket model) + +Every item lands in one bucket. Plugins refine these with source-specific bias and noise patterns in their =Classify= section; they do not redefine the buckets. + +- *Action* — needs Craig to do something: an explicit ask, a decision needed, blocked-on-Craig, a mergeable PR, an invite needing a response, a deadline inside 48h. +- *FYI* — substantive context worth seeing, but no action owed. +- *Noise-keep* — low value but worth retaining (audit trail, receipts). +- *Noise-trash* — safe to discard: newsletters, marketing, social digests, bot pings, redundant aggregator digests, wrong-recipient mail, past-event artifacts. + +Per-source bias (a work email account leans keep for audit value; a personal account leans trash on high noise volume) lives in each plugin's =Classify= section. Read it from there; don't re-derive it here. + +*** Phase C: Synthesize a single summary + +One markdown summary surfaced inline to Craig. Order: + +0. *Scan failures — first, loud, always.* Any loaded source whose scan failed, hung, was killed, or was skipped for an operational reason renders at the very top of the summary, before Top signals: + + #+begin_example + ⚠ SCAN FAILED: <source> — <reason, one line> — <what's now unknown> + #+end_example + + A failed scan is never folded into "quiet." 
Quiet means the scan ran and found nothing; a failure means the sweep is blind on that channel, and the reader must know which. The same applies to a precondition skip the user hasn't standing-approved (e.g. a messaging client that needs a temporary server spin-up): run the lifecycle or report the failure — don't silently narrow the sweep. + +1. *Top signals to act on* — bullet list of 3-7 items, ordered by urgency, *Action only*. Each bullet links to the source (permalink, thread URL, PR number). +2. *Per-source breakdown* — one short section per loaded source *that has changes*, in =ORDER=, using that plugin's =Render= shape: Action items detailed, FYI items as a short list, Noise as a tally only ("Noise: 12 trash candidates, 4 keep, 0 starred"). +3. *Suggested actions* — explicit list of state changes Craig could take this run (trash these N messages, mark-read these M, star this Action item, respond to this invite, merge PRs #X and #Y, etc.). This line stays whenever there are queued actions, regardless of how quiet the sweep was. + +*Deltas only.* The summary reports what *changed* since the anchor: a new invite, a new/moved/cancelled calendar event, a new message needing attention. A source with no changes gets no block — no "Calendar — quiet", no "PRs — nothing new" roll-call. A sweep where nothing changed anywhere renders as a single line: + +#+begin_example +17:39 sweep: no changes +#+end_example + +(Craig, 2026-06-11: "we only need to report if anything's changed when we do triage intake. did someone send me a new invite? did christine throw something on my calendar that wasn't there earlier? did someone cancel a meeting?") + +Scan failures are the standing exception: a failed or skipped scan always renders loudly per point 0 above and is never folded into the no-change line — "no changes" is a claim about channels the sweep could actually see. + +Format target: scannable in 30 seconds, full read in 2 minutes. Don't pad. 
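The top-level Phase C rendering rule (failures first, deltas only, one line when nothing changed) can be sketched as below. The function name and the argument shape are assumptions: the caller is presumed to have already prepared the failure lines and the per-source change blocks as newline-separated strings.

```shell
#!/usr/bin/env bash
# Sketch of the Phase C top-level rendering rule.
# $1 = failure lines, one per failed source (source, reason, what is now unknown)
# $2 = already-rendered per-source change blocks (empty when nothing changed)
render_summary() {
    local failures="$1" changes="$2" line
    # Scan failures always lead, one banner per failed source.
    while IFS= read -r line; do
        if [ -n "$line" ]; then
            printf '⚠ SCAN FAILED: %s\n' "$line"
        fi
    done <<< "$failures"
    if [ -n "$changes" ]; then
        printf '%s\n' "$changes"
    else
        # Nothing changed anywhere: the summary collapses to a single line.
        printf '%s sweep: no changes\n' "$(date '+%H:%M')"
    fi
}
```

A failed scan and a quiet sweep can appear together: the banner still prints above the no-changes line, so the reader knows which channels that line does not cover.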
+
+**** Sub-step: write each Action item into =todo.org= as its own =:quick:= task
+
+After surfacing the summary inline, append every Action item — regardless of source — to =todo.org= as its own top-level =** TODO= heading carrying the =:quick:= tag plus =:reactive:= and any relevant person/entity tag.
+
+Each Action item is one task. Don't group items by source under =** Email Response=, =** PR Review=, etc. sub-headings. Each item is its own filterable task so Craig can re-prioritize it, set =SCHEDULED:= / =DEADLINE:=, or tag it individually.
+
+Format:
+
+#+begin_example
+*** TODO [#B] Merge PR #42 on archsetup (approved, CI green) — [[https://github.com/<user>/archsetup/pull/42][PR #42]] :quick:reactive:
+*** TODO [#B] Respond to the 2pm reschedule invite from Dana :quick:reactive:
+*** TODO [#B] Reply to the contract-terms email thread :quick:reactive:
+#+end_example
+
+Rules:
+
+- Heading is plain prose. Lead with the verb (Read / Re-review / Reply / Respond / Address / Merge / Schedule).
+- Priority: default =[#B]= for fresh reactive items. Bump to =[#A]= only if blocking someone or a deadline lands inside 7 days.
+- Tags: always =:quick:= + =:reactive:=. Add person/entity tags when the dependency is sharp.
+- Link the source in the heading when it has a URL (GitHub PR, mail thread, chat permalink). Use org's =[[url][label]]= form so the heading stays clickable in Emacs.
+- *Record the source locator in the task body* so a reply can be routed back to where the request came from — the channel + thread id for chat, the repo + PR number, the message id for mail. The general rule: a reply goes back to the *origin* of the request, not a fixed notification channel. (Project plugins may add stricter routing rules in their own files.)
+- Placement: append at end of =* Work Open Work= (just before =* Work Incubate=) unless the project's =todo.org= has a designated triage section near the top (=* Triage= or =* Inbox=).
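The priority and heading rules above can be sketched as two small functions. The function names and argument order are assumptions. "Inside 7 days" is read here as seven days or fewer, which is one possible interpretation. The link is appended after the description with a plain space, which is a simplification of the example format.

```shell
#!/usr/bin/env bash
# Sketch: pick a priority and format one Action item as an org TODO heading.

# $1 = "yes" when the item blocks someone, $2 = optional days until the deadline.
priority_for() {
    local blocking="$1" days_left="${2:-}"
    if [ "$blocking" = "yes" ]; then echo A; return; fi
    if [ -n "$days_left" ] && [ "$days_left" -le 7 ]; then echo A; return; fi
    echo B
}

# $1 = priority letter, $2 = verb-led description, $3 = optional URL, $4 = optional link label.
todo_heading() {
    local prio="${1:-B}" text="$2" url="${3:-}" label="${4:-}" link=""
    if [ -n "$url" ]; then
        link=" [[${url}][${label:-$url}]]"
    fi
    printf '*** TODO [#%s] %s%s :quick:reactive:\n' "$prio" "$text" "$link"
}
```

For example, =todo_heading "$(priority_for no 30)" "Reply to the contract-terms email thread"= produces a =[#B]= heading that could be appended to the triage section of =todo.org=. Person/entity tags and the source locator in the task body are not handled by this sketch.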
+ +This sub-step makes triage-intake's findings *persist* in =todo.org= instead of evaporating after the inline summary. + +*** Phase D: Execute actions on confirmation + +Wait for Craig's go-ahead before running any state changes. Default to single-confirmation for the whole batch ("yes" → run everything proposed). Craig may also pick a subset ("trash personal but hold the work account") or hand back a different plan ("trash all but star the expense thread and queue PR merges for after lunch"). + +Each action dispatches to the owning source plugin's =Actions= verb (trash, mark-read, star, respond, merge, comment, attachment-fetch). The engine doesn't hardcode action commands — it reads them from the loaded plugins. Read each plugin's =Actions= section for the exact command. + +After actions complete, write the Phase A capture into the sentinel's *content* (see "Capture the Phase A timestamp"): =echo "$PHASE_A_TS $(date -d "@$PHASE_A_TS" '+%Y-%m-%d %H:%M:%S %z')" > .ai/last-triage-intake=. Do not use plain =touch= (writes mtime to /now/ and strands items posted between Phase A and end of run) and do not use =touch -d "@$PHASE_A_TS"= (correct timestamp but mtime is per-machine — won't survive a fresh clone or cross-machine sync). + +*Do not close the workflow yet.* See Exit Criteria below. + +*** Exit Criteria + +The workflow stays open until Craig has *explicitly* either: + +1. *Confirmed* that the executed actions are sufficient and nothing more is needed this round, or +2. *Handed back a different plan* (e.g., "actually hold the PR merges, address #131 first"). + +A successful Phase D run is *not* an exit signal. After the action batch returns, surface what shipped and wait. Don't volunteer "done" or "all set" — those are exit-claim phrases that pre-empt Craig's call. Use a status report ("17 actions succeeded, sentinel written at 12:19") and stop. 
+ +If Craig has been silent for a while after Phase D and the surface looks closed-out, *ask*: "Anything else on this triage, or are we good to close out?" Don't auto-terminate. + +This rule prevents the failure mode where the workflow self-declares done and the next exchange has to relitigate what state things are in. + + +* Auto mode (unattended monitoring) + +Auto mode is a self-running variant of the engine for when Craig is away from the desk but wants tight awareness — a loop that runs the standard sweep on a short interval, *accumulates* findings rather than mutating state, and hands Craig a gated checkpoint to commit the batch. It composes two things: the *delivery* (a =/loop= in the live session) and the *behavior* (accumulate-don't-mutate sweeps with a checkpoint). The one-shot run above is unchanged; auto mode is an additional way to run the same Phase 0 / A-D engine. + +** Trigger and delivery + +- "auto triage" / "auto triage-intake" / "watch the desk" / "monitor the triage" — start auto mode. +- Default interval *20 minutes*; Craig sets it. + +Auto mode runs as a =/loop= in the *live session*, not a detached cron job: + +#+begin_src +/loop 20m run an auto-mode triage-intake sweep per triage-intake.org +#+end_src + +Running in the live session means MCP auth (Slack, Gmail, Linear) is inherited from the session — the headless-auth wall that blocks a detached cron run does not apply. A durable cross-session schedule is out of scope here; that belongs to the morning-ops orchestrator, which can later invoke auto mode's accumulate behavior as its triage limb. The close/stop commands below require a live session by design. + +*** Push each sweep to Craig's phone (ntfy) — the primary delivery + +Auto mode exists for when Craig is away from the desk (out of office for a while, on vacation), so the report's primary delivery is a push to his phone, not an inline message he won't be looking at. 
After every sweep, send the End-of-sweep output to his phone via =phone-notify= (the self-hosted ntfy channel over Tailscale; see the phone-notify reference memory for usage and the high-bar caveat). Push on *every* sweep, including a quiet "no changes" one — the timestamp line is the proof the loop ran. + +The pushed summary is the *fuller* shape, not a terse one-liner: per-source deltas (Slack / work-email / Linear / PRs / calendar / Telegram, noise tallies included), the current open-PR + Linear state, the awaiting-acknowledgment list, a one-line verdict on whether anything needs Craig, and the timestamp. Lead with a ⚠ SCAN FAILED banner if any source failed. + +Poll =phone-recv= at the top of each sweep for anything Craig sends back (delivery is not pushed to the agent); act on his requests and reply via =phone-notify=. Note that =phone-recv= echoes the agent's own outgoing pushes back, so only treat a message as inbound from Craig when it is not one of the sweep summaries. + +If =phone-notify= isn't installed on the host (it lives on ratio for now), fall back to inline delivery and say so once. + +** Preconditions and Close-out + +Auto mode borrows the inbox-monitor gates (=monitor-inbox.org=): do not start on a dirty worktree or a red test suite — a close's batch commit would otherwise sweep up unrelated changes — and leave the tree clean and green when the loop stops. Surface a blocker with inline numbered options per =interaction.md= and wait. + +** A sweep: accumulate, don't mutate + +Each sweep runs Phase 0 (load *both* plugin dirs — the loud requirement still holds) and Phases A-D's scan / classify / synthesize, but performs *none* of the normal run's mutations: + +- Does NOT advance the sentinel. The scan window grows from the last *close* until the next close: every sweep scans from the existing sentinel up to now, so nothing between sweeps is dropped. +- Does NOT write =todo.org= Action tasks — accumulates them for the close. 
+- Does NOT take mail actions (trash / mark-read / star). +- Does NOT commit. +- DOES update an active daily-prep in Update mode and re-open it on change (per =daily-prep.org=). +- DOES report, deltas-only, with loud scan-failure banners (Phase C rules unchanged). + +** End-of-sweep output — three sections + +1. *Deltas* — what changed since the *previous sweep* (the standard Phase C summary scoped to the inter-sweep delta; one line if nothing: "HH:MM sweep: no changes"). +2. *Responses awaiting your acknowledgment* — every Slack reply, email, or message directed at Craig that he hasn't acknowledged or had the agent answer. A *running list carried forward across sweeps* until Craig acks each item or closes the triage. An away user's first need is "who's waiting to hear back from me," which a delta-only sweep loses the moment it scrolls past. +3. *Timestamp* — the current date, time, and timezone on the sweep's own final line, so an away reader sees how fresh the summary is without computing it. Print it on *every* sweep, including a quiet "no changes" one — on a quiet sweep the stamp is the proof the loop ran. Generate it with: + + #+begin_src bash + date "+%A %Y-%m-%d %H:%M:%S %Z (%z)" + #+end_src + +** The unacked list — durable state + +The awaiting-acknowledgment list lives in =.ai/triage-intake-unacked.org=, so it survives a session crash, a =/clear=, or a restart — the away-from-desk case auto mode exists for. It's project-local state, tracked the same way as the sentinel (=.ai/last-triage-intake=), created on first need. + +Shape — one =** = heading per awaiting item: + +#+begin_example +#+TITLE: Triage Intake — Responses Awaiting Acknowledgment +# Maintained by triage-intake auto mode. One heading per item; acked items are removed. + +** Dana — 2pm reschedule invite +:PROPERTIES: +:SOURCE: personal-calendar +:LOCATOR: <event id or thread url — the dedupe key> +:SINCE: 2026-06-15 10:42 +:END: +She's waiting on a yes/no to the move. 
+#+end_example + +- *Add* — a sweep appends any new directed-at-Craig response not already listed, deduped on =LOCATOR=. +- *Carry forward* — every sweep re-renders the full list in its second section, whether or not it changed this sweep. +- *Ack* — "ack <item>" (e.g. "ack the Dana thread") removes that heading; "ack all" clears the list. +- *Close* — a close empties the list as part of processing (each item is either actioned or filed). + +** Close and stop — the checkpoint + +The mutations are gated behind two commands: + +- *"close the triage"* — run the full close: take the accumulated mail actions, add the accumulated Action items to =todo.org= as =:quick:reactive:= tasks (asking Craig the questions a normal Phase C/D would), empty the unacked list, then *advance the sentinel* — capture the close run's Phase A timestamp, do the mutations, write that timestamp to =.ai/last-triage-intake= exactly as a normal run does (per "Capture the Phase A timestamp") — and commit + push the batch. Then *keep looping* (next sweep on the normal interval). This is the "flush the batch and carry on" checkpoint. +- *"stop the triage"* — the same close processing, then *stop the loop* and revert to manual (on-demand) triage. + +A close is the only point auto mode advances the sentinel or commits. Between closes the engine state is untouched — that is what makes a 20-minute sweep cheap and non-destructive, and it preserves the engine invariant: the sentinel still means "everything before this timestamp has been scanned," it just advances once per close instead of once per run. + +** Why a separate mode + +The standard engine is one-shot and mutating — right for an at-the-desk "what's new?" glance, wrong for unattended polling: run every 20 minutes it would advance the sentinel past unprocessed items, spray reactive todos, take mail actions, and commit noise without review. 
Auto mode separates the cheap, frequent *watching* from the deliberate, gated *committing*, and adds the away-user's missing primitive — the running unacked-responses list. + + +* Reference + +** Source Plugin Contract + +A source plugin is a file named =triage-intake.<source>.org=. The first dot after =triage-intake= is the engine/plugin boundary; the segment after it is the source id. Hyphens stay *inside* a segment (=triage-intake.personal-gmail.org= is engine =triage-intake=, source =personal-gmail=). Deeper dots (=triage-intake.<source>.<sub>.org=) are reserved for sub-adapters — YAGNI for now, but the namespace accommodates them at no cost. + +A plugin file declares exactly one source through a fixed shape: + +*Property drawer* on the top-level =* Source:= heading: +- =ORDER= — integer. Output ordering in the per-source breakdown (lower = earlier). +- =ENABLED= — the precondition the engine evaluates before loading the source. The source is skipped — *with an announced reason* — when it's false. Forms: =always=, a shell test (=command -v gh && gh auth status=), or =mcp <server> present=. +- =ANCHOR= — the cutoff format this source consumes: =epoch=, =iso8601=, =day=, or =none= (state-based source with no since-window — e.g. live IMAP unread, or open-PR state). The engine computes the anchor once and substitutes it in the requested format. +- =SUBAGENT_OVER= — integer. If the scan is expected to return more than this many items, dispatch a subagent for the source so its raw output stays out of main context. The subagent applies Phase B and returns buckets only, not the raw list. + +*Body sections:* +- =** Scan= — the command(s) that fetch new/unread items since =<anchor>=, emitting raw items. +- =** Classify= — the source's per-bucket bias and noise patterns. *Deltas* from the engine's shared four-bucket model below, not a re-derivation. +- =** Render= — the source's block in the Phase C summary. "Omit if empty." 
+- =** Actions= — the executable state-changes, one verb per line: =verb :: command template (parameterized by item id)=. + +Template: + +#+begin_example + +** Source: <id> +:PROPERTIES: +:ORDER: <n> +:ENABLED: <precondition> +:ANCHOR: epoch | iso8601 | day | none +:SUBAGENT_OVER: <n> +:END: + +*** Scan +<command(s) that fetch new/unread items since <anchor>> + +*** Classify +<bias + noise patterns; deltas from the shared four-bucket model> + +*** Render +"<Source label> — N <unit>" block; omit if empty. + +*** Actions +- <verb> :: <command, parameterized by <id>> +#+end_example + + +** Anchor: Since When? + +The workflow needs a "scan since" timestamp. Resolution order: + +1. *Sentinel file content:* first whitespace-delimited token in =.ai/last-triage-intake= is the Phase A scan-kickoff epoch from the most recent successful run (see "Capture the Phase A timestamp" below). Most accurate. +2. *Sentinel file mtime* (back-compat): if the file exists but is empty, read its mtime — that's the older mtime-based convention that pre-dates the content-based change. Still accurate on the machine that wrote it. +3. *Most recent prep doc:* if no sentinel content or readable mtime, anchor on the latest =daily-prep/YYYY-MM-DD-daily-prep.org= mtime. +4. *Most recent session file:* if none of the above, anchor on the most recent =.ai/sessions/= file's mtime. +5. *Session start:* fall back to the current session's start time. Last resort. + +The engine computes the anchor *once* and exposes it in every format a plugin might request (=epoch=, =iso8601=, =day=). Each plugin's =ANCHOR= field says which it consumes; the engine substitutes that form into the plugin's =<anchor>= placeholder. Sources with =ANCHOR: none= are state-based (live unread, open-PR state) and get no cutoff substituted — they report current state, and Phase B uses the anchor only to flag what's *new since* last check. 
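Computing the anchor once and exposing it in each requested format can be sketched as below. This assumes GNU =date= (the workflow's own snippets already rely on =date -d=). The exact =iso8601= and =day= shapes, the use of UTC, and the function name are assumptions, since the text names the formats without spelling them out.

```shell
#!/usr/bin/env bash
# Sketch: derive the plugin-facing anchor formats from one epoch value.
# Uses UTC so the output does not depend on the machine's timezone (an assumption).

# $1 = anchor epoch seconds, $2 = the plugin's ANCHOR field.
anchor_as() {
    local epoch="$1" fmt="$2"
    case "$fmt" in
        epoch)   printf '%s\n' "$epoch" ;;
        iso8601) date -u -d "@$epoch" '+%Y-%m-%dT%H:%M:%SZ' ;;
        day)     date -u -d "@$epoch" '+%Y-%m-%d' ;;
        none)    : ;;   # state-based source: nothing is substituted
        *)       echo "unknown anchor format: $fmt" >&2; return 1 ;;
    esac
}
```

With the sample sentinel value from this document, =anchor_as 1778683109 iso8601= gives =2026-05-13T14:38:29Z=, which is the same instant as the =09:38:29 -0500= shown in the sentinel example.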
+ +*** Capture the Phase A timestamp + +Just before issuing the Phase A batch, capture the current epoch seconds: + +#+begin_src bash +PHASE_A_TS=$(date +%s) +#+end_src + +Hold this value through Phases B, C, and D. At end of run, *write* the captured timestamp into the sentinel's content (not its mtime): + +#+begin_src bash +echo "$PHASE_A_TS $(date -d "@$PHASE_A_TS" '+%Y-%m-%d %H:%M:%S %z')" > .ai/last-triage-intake +#+end_src + +The file ends up with a single line like =1778683109 2026-05-13 09:38:29 -0500= — epoch first (machine-readable, parsed by reading the first token), human-readable timestamp second. + +*Why content, not mtime:* the sentinel is checked into git. Git tracks content, not mtime, so an mtime-based sentinel is per-machine: one machine's anchor stays on that machine; a fresh clone gets the file but the mtime is whenever the clone happened, not the actual triage time. Writing the epoch as content means the anchor travels with the repo and stays accurate after a fetch + pull on any machine. + +*Why Phase A and not end-of-run:* Phase A runs at one moment, but Phases B-D may take 5-30 minutes. Items posted to any source /during/ Phases B-D land between the Phase A scan time and the eventual end-of-run time. If the sentinel were set to the end-of-run time, those items would silently fall through the cracks: the next triage's Phase A would skip the gap window and never see them. Anchoring the sentinel to Phase A's scan time guarantees the next run's window starts where this run's window ended, with zero gap. + +*** Reading the sentinel + +When the workflow needs the anchor at the start of a new run: + +#+begin_src bash +# Content-first, mtime-fallback. 
+ANCHOR_EPOCH=$(awk 'NR==1 {print $1; exit}' .ai/last-triage-intake 2>/dev/null) +if [ -z "$ANCHOR_EPOCH" ] && [ -f .ai/last-triage-intake ]; then + ANCHOR_EPOCH=$(stat -c %Y .ai/last-triage-intake) +fi +#+end_src + +If both fail, fall through to the resolution order above (prep doc → session file → session start). + + +** Output Template + +The summary follows this shape (deltas only: a source with no changes gets no block; when *nothing* changed anywhere, the whole summary collapses to the one-line form below — plus any scan-failure banners and the suggested-actions line if actions are queued): + +#+begin_example +17:39 sweep: no changes +#+end_example + +When there are changes, render one block per changed source in =ORDER=, using each plugin's =Render= shape: + +#+begin_example +**Anchor:** <previous run timestamp> → now (<elapsed> elapsed) +**Loaded:** <general plugins> + <project plugins> (skipped: <disabled, with reason>) + +**Top signals to act on:** +1. <terse Action description with link> +2. ... + +<one block per loaded source, in ORDER — see each plugin's Render> + +**Suggested actions:** +- Trash N noise items +- Mark-read M keep items +- Respond to <invite> +- Merge PRs #X and #Y +- ... +#+end_example + +Order matters: top-signals first because that's what Craig reads in 30 seconds between meetings. Per-source detail second. Suggested actions last because they require a decision. + + +** Common Mistakes + +1. *Globbing only =.ai/workflows/= and missing the project plugins.* The single most damaging failure mode — the sweep runs with half its sources and the omission is invisible (a missing source looks identical to a quiet one). Phase 0 globs *both* =.ai/workflows/triage-intake.*.org= and =.ai/project-workflows/triage-intake.*.org=, every run, and announces the loaded set. +2. *Running Phase A sequentially.* Send every enabled source's scan in one message — the whole point is parallelism. +3. 
*Wiring a source into the engine.* Sources live in plugin files, never here. If you find yourself editing this file to add an account, repo, or channel, stop — write or edit a =triage-intake.<source>.org= plugin instead. +4. *Executing actions without explicit confirmation.* Phase D runs only after Craig says "yes" or picks a subset. +5. *Forgetting to set the sentinel at the end.* Without it, the next run re-scans the same window. +6. *Using mtime instead of content for the sentinel.* Plain =touch= writes /now/ to mtime, stranding items posted between Phase A and end of run. =touch -d "@$PHASE_A_TS"= fixes the time but mtime is per-machine — git tracks content, not metadata, so the anchor doesn't survive a clone or cross-machine sync. Always write the epoch into the file's *content*. +7. *Running this alongside daily-prep.* Daily-prep already does this as Phase 3 — don't duplicate. +8. *Mixing Action and FYI in the top-signals list.* Top signals = Action only. FYI lives in the per-source detail. +9. *Reporting a failed or skipped scan as a quiet source.* A hung receive, a dead daemon, or a skipped spin-up looks identical to "no new messages" in the output unless it's flagged. The 2026-06-10 sweep shipped with Signal silently missing because the scan hung on an account lock. Failures lead the summary, in their own banner line. +10. *Rendering a per-source quiet roll-call.* "Calendar — quiet" / "PRs — nothing new" lines on every silent source bury the one change that matters and pad a no-change sweep into a report. Deltas only: changed sources get blocks, unchanged sources get nothing, and an all-quiet sweep is one line (Craig's 2026-06-11 ruling in Phase C). + + +* History / Design Notes + +** Living Document + +Update the engine as the orchestration pattern evolves; update a plugin as its source evolves. Source-specific learnings belong in the plugin's own file, not here. 
+ +*** Updates and Learnings + +**** 2026-06-15: Auto mode (unattended monitoring) +Added a self-running mode for when Craig is away but wants tight awareness — a =/loop= in the live session running accumulate-don't-mutate sweeps with "close the triage" / "stop the triage" as the gated checkpoint. Born the morning Craig cleared his day for a family emergency and wanted the desk watched while in and out. Design decisions (work-project proposal, ratified by Craig 2026-06-15): the unacked-responses list is durable in =.ai/triage-intake-unacked.org= (survives a crash/clear, the away-from-desk case it exists for); the sentinel advances only at close, preserving the scanned-before invariant; delivery is an in-session loop so MCP auth is inherited (a detached cron schedule belongs to the morning-ops orchestrator, not here, because of the headless-auth wall); it stays a mode of this engine, distinct from but reusable by that orchestrator. Same-day addendum (work, 2026-06-15): each sweep ends with a date/time/timezone stamp on its own final line (printed on quiet sweeps too, as proof the loop ran) so an away reader gauges freshness at a glance. + +**** 2026-05-01: Initial creation +Extracted from daily-prep's Phase 3 pattern as a standalone, lightweight, between-meetings sweep. + +**** 2026-05-07: Anchor the sentinel to Phase A scan time, not run-end time +Gap-window bug: a run had Phase A fire at 13:35 and the sentinel set at 15:04, so an item posted at 14:20 would be skipped by the next run (the sentinel claimed everything before 15:04 was scanned when Phase A only reached 13:35). Fix: capture =PHASE_A_TS= just before Phase A, hold it through B-D, write it to the sentinel at end of run. The sentinel means "everything before this timestamp has been scanned," the only invariant that prevents items falling through the cracks. 
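The gap-window bug in the 2026-05-07 note can be replayed with the three clock times it quotes. This is a minimal sketch, not the workflow's own code: =next_run_sees= is a hypothetical helper, times are reduced to minutes since midnight, and the strict "later than the anchor" comparison is an assumption about how the next run bounds its window.

```shell
#!/bin/sh
# Hypothetical helper (not part of the workflow): reports whether the NEXT
# run, whose window starts at anchor $1, would pick up an item posted at $2.
# Assumption: the next run scans items strictly later than the anchor.
next_run_sees() {
  if [ "$2" -gt "$1" ]; then echo seen; else echo missed; fi
}

# Times from the 2026-05-07 note, as minutes since midnight.
PHASE_A=$((13 * 60 + 35))   # Phase A scan fired at 13:35
RUN_END=$((15 * 60 + 4))    # sentinel was written at 15:04
ITEM=$((14 * 60 + 20))      # item posted at 14:20, after the scan

# Anchoring to run end claims 13:35-15:04 was scanned, which it was not.
END_ANCHORED=$(next_run_sees "$RUN_END" "$ITEM")
# Anchoring to Phase A starts the next window where this scan stopped.
PHASE_A_ANCHORED=$(next_run_sees "$PHASE_A" "$ITEM")

echo "run-end anchor: $END_ANCHORED"
echo "phase-A anchor: $PHASE_A_ANCHORED"
```

With the run-end anchor the 14:20 item falls before the next window and is lost; with the Phase A anchor it falls inside the window, which is the invariant the note describes.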
+ +**** 2026-05-13: Move the sentinel from mtime to content (cross-machine survivability) +The sentinel is checked into git, but git tracks content, not mtime — so an mtime anchor is per-machine. Fix: write the captured epoch into the sentinel's content (=EPOCH ISO-8601=), read with =awk 'NR==1 {print $1}'=, mtime as back-compat fallback. + +**** 2026-06-11: Deltas-only reporting (Phase C + Output Template + Common Mistake 10) +Craig, via the work project's same-day handoff: "we only need to report if anything's changed when we do triage intake." Sweep summaries report deltas only — a new invite, a new/moved/cancelled event, a new message needing attention. Unchanged sources get no block (the "Calendar — quiet" roll-call is retired), and an all-quiet sweep renders as a single "HH:MM sweep: no changes" line. Failures keep their loud banner (never folded into the no-change line) and the suggested-actions line stays when actions are queued. Same ruling: the telegram plugin's dev-community group traffic is dropped from reports entirely unless Craig asks (see that plugin's 2026-06-11 note). + +**** 2026-06-10: Loud failure surfacing (Phase C item 0 + Common Mistake 9) +Craig: "highlight any failures in daily triage loudly. I get important communication from all these channels." Trigger: the 2026-06-10 sweep shipped with Signal silently missing — a standalone receive hung on the account lock while the signel daemon owned it, and the failure looked identical to a quiet source. Failures now lead the summary in a ⚠ SCAN FAILED banner; the telegram plugin's failure path points at this rule. + +**** 2026-05-26: Refactor into engine + source plugins +Split the monolithic workflow into a source-agnostic engine (this file) and per-source plugins named =triage-intake.<source>.org=. The engine carries the anchor/sentinel logic, the four-bucket model, the Phase A-D orchestration, the todo.org persistence convention, and the exit criteria. 
Each source's scan/classify/render/action knowledge moved to its own plugin. General plugins (personal-gmail, personal-calendar, cmail, github-prs) live in =.ai/workflows/= and are template-synced; project-specific plugins (a work project's Linear, work Gmail, work Slack, enterprise PRs) live in the project's =.ai/project-workflows/= and are never synced. Phase 0 globs *both* directories — the loud requirement, because missing the project dir silently halves the sweep. Naming convention: first dot is the engine/plugin boundary, deeper dots reserved for sub-adapters. This removed all DeepSat/Linear specifics from the engine; they become work-project plugins. + |
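The Phase 0 rule from the 2026-05-26 note (glob both plugin directories, then announce the loaded set) might look roughly like the following. It is a sketch under stated assumptions: it builds a throwaway tree in a temp directory so it can run anywhere, the =linear= file name is hypothetical (the note names the source but not its file), and the variable names are illustrative rather than taken from the engine.

```shell
#!/bin/sh
# Throwaway demo tree; in a real project these directories already exist.
ROOT=$(mktemp -d)
mkdir -p "$ROOT/.ai/workflows" "$ROOT/.ai/project-workflows"
touch "$ROOT/.ai/workflows/triage-intake.personal-gmail.org" \
      "$ROOT/.ai/workflows/triage-intake.github-prs.org" \
      "$ROOT/.ai/project-workflows/triage-intake.linear.org"  # hypothetical name

LOADED=""
# Glob BOTH directories: skipping the project dir silently halves the sweep.
for dir in "$ROOT/.ai/workflows" "$ROOT/.ai/project-workflows"; do
  for f in "$dir"/triage-intake.*.org; do
    [ -e "$f" ] || continue        # an unmatched glob stays literal; skip it
    name=${f##*/}                  # drop the directory part
    name=${name#triage-intake.}    # first dot is the engine/plugin boundary
    name=${name%.org}              # drop the extension, keep any deeper dots
    LOADED="${LOADED:+$LOADED }$name"
  done
done

# Announce the loaded set so a missing source is visible, not just quiet.
echo "Loaded: $LOADED"
rm -rf "$ROOT"
```

Printing the loaded set on every run is the point of the sketch: a plugin that failed to load then shows up as an absence in the announcement rather than as a source that merely had nothing to report.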
