diff options
| author | Craig Jennings <c@cjennings.net> | 2026-06-29 18:30:40 -0400 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2026-06-29 18:30:40 -0400 |
| commit | ce9af72ac4a5a6a30443ccdc4d8d785f0fe28c1b (patch) | |
| tree | 4ddf395ca6be035e858b251b799aadfc50aaa69e /docs/design/2026-06-29-waybar-network-module-spec.org | |
| parent | b248644109b79d91c3f2a8603cbaefaca05ced81 (diff) | |
| download | archsetup-ce9af72ac4a5a6a30443ccdc4d8d785f0fe28c1b.tar.gz archsetup-ce9af72ac4a5a6a30443ccdc4d8d785f0fe28c1b.zip | |
docs: finalize waybar network module spec (reviews incorporated)
Incorporated the review feedback and my inline comments into the network-module spec. It's now implementation-ready, every finding resolved.
The reviews reshaped the design in a few ways. Secrets stay in NetworkManager's own store instead of a separate GPG file, dropping that dependency. A net doctor mode plus Makefile targets make recovery work from a bare TTY when the GUI is down. The doctor classifies failures and stops at the right terminal state (needs-user-action, upstream-not-local, deferred-vpn) instead of looping destructive repairs. The module absorbs the airplane indicator, and enterprise WiFi add/edit is vNext (activate-only in v1, since the saved history has no enterprise networks). Added a failure-mode coverage table, exact user-facing strings, the test harness and coverage gate, and the panel UX flow.
Also corrected the spec's test framework from pytest to unittest, which is what the repo uses.
Diffstat (limited to 'docs/design/2026-06-29-waybar-network-module-spec.org')
| -rw-r--r-- | docs/design/2026-06-29-waybar-network-module-spec.org | 1557 |
1 files changed, 1422 insertions, 135 deletions
diff --git a/docs/design/2026-06-29-waybar-network-module-spec.org b/docs/design/2026-06-29-waybar-network-module-spec.org index bf195b8..37b87b0 100644 --- a/docs/design/2026-06-29-waybar-network-module-spec.org +++ b/docs/design/2026-06-29-waybar-network-module-spec.org @@ -2,12 +2,26 @@ #+AUTHOR: Craig Jennings & Claude #+DATE: 2026-06-29 +* Status + +Ready for Phase 1; Ready-with-caveats overall. Three Codex review rounds + Craig's +cj comments are all incorporated — every finding has a disposition and the findings +cookie reads complete ([31/31]), with no open decisions (enterprise scope settled: +open + WPA-PSK in v1, 802.1X add/edit vNext, activate-only). The cj comments +reshaped several decisions (no separate credential store — use NM's own; =net +doctor= + Makefile console-recovery in v1; rfkill + full-stack-bounce repair; +airplane module absorbed; VPN a later Phase 5). The only remaining caveats are +Phase-2/3 build unknowns named under Open items (gtk4-layer-shell anchoring, the +=captive= =--json= refactor) — not Phase-1 blockers. Phase 1 (indicator + console +recovery) is ready to build. + * Goal One waybar network component that does the whole job: shows connection state (including the missing "associated but no internet / captive portal" state), -manages connections from a dropdown (nmcli-backed, optional GPG-encrypted secret -store), and runs the network diagnostics and remediation off the same place +manages connections from a dropdown (nmcli-backed; secrets stay in +NetworkManager's own store, no separate credential file), and runs the network +diagnostics and remediation off the same place (captive-portal detection + forcing, bounce/reset, gateway/DNS checks, speed test). @@ -15,8 +29,9 @@ It unifies three todo tasks that are really one feature: - =[#C]= "archsetup Waybar Wi-Fi module should show no-internet state" — the indicator state plus the 2026-06-22 roam expansion (bounce, diagnostics, speed test off the component). -- =[#B]= "Network-manager dropdown, nmcli-backed with GPG-stored secrets" — the - management dropdown. +- =[#B]= "Network-manager dropdown, nmcli-backed" — the management dropdown. (The + todo task's original "GPG-stored secrets" framing is superseded: secrets stay in + NM's own store, decision 5.) - The network diagnostics already shipped in =captive= (the hotel/captive-portal tool, formerly =login-page=) become this module's diagnostics engine rather than a standalone CLI. @@ -26,31 +41,61 @@ It unifies three todo tasks that are really one feature: ** In - *Indicator* — wifi/ethernet icon + signal + SSID, plus an internet sub-state: online / captive / no-internet / connecting / disconnected / airplane. +- *Absorbs the airplane module* — the airplane state + toggle move into + =custom/net= (airplane is a network concern). Once this ships, the standalone + =custom/airplane= module, the =waybar-airplane= + =airplane-mode= scripts, their + =tests/=, and the css are deleted (listed under Files touched). The + desktop-settings panel (sibling =[#B]=) no longer needs an airplane row. - *Interface-correct* — targets the wifi (or chosen) device, not the default-route interface, so an active USB tether or wired link can't mask wifi state. (Same lesson =captive= fixed; the current =custom/netspeed= keys off the default route and has the bug.) - *Connection management (panel)* — list saved connections most-recently-used - first, live signal for in-range wifi, click to switch; add / edit / remove; - ethernet↔wifi and wifi↔wifi switching even when a link appears mid-session. -- *Diagnostics (panel)* — captive probe (204-vs-portal) with the extracted - portal URL and an Open button; bounce/reset (fresh MAC); gateway ping; DNS - config + temporary 1.1.1.1 override test. + first, live signal for in-range wifi, click to switch; add / edit / remove for + open + WPA-PSK; activate any existing saved profile (including enterprise ones + NM already stores); ethernet↔wifi and wifi↔wifi switching even when a link + appears mid-session. +- *Diagnostics (panel)* — read-only Diagnose (captive probe 204-vs-portal with + the extracted portal URL, gateway ping, DNS config) separated from mutating + Repair. Repair has tiers, lightest first: rfkill-unblock, per-connection reset + (fresh MAC), full-stack bounce (=nmcli networking off/on=, then restart + NetworkManager if that fails), and the temporary 1.1.1.1 override test. Each + Repair action confirms and verifies cleanup. - *Speed test (panel)* — down/up/ping with a progress indicator and last-result - shown. -- *Credential store* — optional GPG-encrypted connection+secret file under - =~/.config=, opt-in (default unencrypted), passphrase cadence via gpg-agent - TTL. Supplements NetworkManager, does not replace it. + shown, via the already-installed =speedtest-go --json=. +- *Connection secrets* — none of our own. Settings and passwords live where NM + already keeps them: =/etc/NetworkManager/system-connections/*.nmconnection= + (root-only =0600=, the PSK/EAP secret stored inline). We read/write them through + nmcli, which handles the privilege. No separate file, no GPG, no gpg-agent — one + fewer dependency, and NM's store is already the secure-at-rest source of truth. - *Persistence* — connectivity probe result cached in the runtime dir so the bar reads it cheaply between probes. +- *Observability* — a redacted JSONL event log so a post-failure session can + diagnose without re-running destructive actions. ** Out (v1, note for later) - No replacement of NetworkManager's connection engine. NM stays the thing that connects; we drive it via nmcli. -- No VPN / wireguard management (separate tooling already exists). -- No per-connection captive-portal auto-login automation beyond opening the - portal page. +- No add/edit *form* for WPA-Enterprise / 802.1X in v1. The reason is effort vs + payoff: 802.1X has many interdependent fields (CA cert, client cert, identity, + anonymous identity, phase-2 auth) where a wrong entry silently fails auth, so a + trustworthy form is a lot of UI for connections Craig rarely adds (open + + WPA-PSK covers home, hotels, and phone hotspots). v1 still *activates* existing + saved enterprise profiles and points editing at =nmtui=/=nmcli=. Settled + (Craig, 2026-06-29): enterprise add/edit is vNext — 24 saved profiles on velox, + 0 enterprise, so the form would be unused UI; if one ever appears nmtui adds it + once and the module activates it thereafter. +- No per-connection captive-portal *auto-login* in v1. (That would mean storing a + portal's login form answers — room number, surname, a checkbox — and replaying + them automatically when a known portal is detected, so the page never appears. + Out for v1 because every portal's form differs and it means storing per-venue + answers; v1 just opens the portal for you.) - No graphing/history of speed-test results beyond the last run. +- No static-IP / proxy / metered / MAC-randomization editing in v1 (activate + existing, edit elsewhere). +- No VPN / WireGuard management in v1, but it's a planned later phase (Phase 5), + not a permanent exclusion — it folds the existing archsetup wireguard tooling + into the same panel/CLI. - The desktop-settings dropdown (sibling =[#B]=) is a separate module, but it shares the GTK4 layer-shell panel shell built here. @@ -58,11 +103,12 @@ It unifies three todo tasks that are really one feature: Three layers. Keep the bar cheap, the panel rich, the logic in one tested place. -1. *Engine* — a =net= Python package (src-layout, pytest), exposing a CLI. Wraps +1. *Engine* — a =net= Python package (src-layout, unittest), exposing a CLI. Wraps every nmcli op and owns the diagnostics. Emits JSON. This is the testable - core (fake =nmcli= / =curl= on PATH, like the existing =waybar-netspeed= and - =waybar-sysmon= test harnesses). Precedent: pocketbook is Python in the - dotfiles repo; =wtimer= is Python for the same testability reason. + core (fake =nmcli= / =curl= / =speedtest-go= on PATH, like the existing + =waybar-netspeed= and =waybar-sysmon= test harnesses). Precedent: pocketbook is + Python in the dotfiles repo; =wtimer= is Python for the same testability + reason. 2. *Indicator* — a thin =waybar-net= script that calls =net status --json= and renders icon + signal + state + tooltip. Replaces =custom/netspeed= (throughput folds into the tooltip). @@ -75,15 +121,48 @@ How the existing pieces map in: interactive portal-force flow (sudo reset, DNS override, browser launch). Its cheap portal-detection logic is mirrored natively in the engine for the fast status path so the bar never blocks on a subprocess. =captive= stays a usable - standalone CLI. + standalone CLI. The refactor (below) extracts its probe + reset into functions + the engine can call non-interactively. - =waybar-netspeed= (sh, shipped) — retired; its throughput sampling moves into the engine's status output and renders in the indicator tooltip only. - =nmcli= — the connection backend for every op. Language note: the engine is Python; the indicator is a thin Python or sh -wrapper over =net status --json=. The bar path must stay fast (sub-100ms for the -cheap poll), so the indicator does no network I/O itself — it reads link state -and the cached connectivity result. +wrapper over =net status --json=. The bar path must stay fast (see Performance +budgets), so the indicator does no network I/O itself — it reads link state and +the cached connectivity result. + +* Repository + dependencies + +- *Code lives in the dotfiles repo* (=~/.dotfiles=), not archsetup. The =net= + package sits in-tree like pocketbook (src-layout, unittest, Makefile target); + =waybar-net= and the =net= CLI entry live in the hyprland tier + (=hyprland/.local/bin/=). Tests under =tests/net/= and =tests/waybar-net/=. + archsetup owns only the *dependency install*, not the code. +- *archsetup installs the deps* in its Hyprland step: =gtk4-layer-shell=, + =python-gobject=, plus =nmcli=/=curl=/=resolvectl=/=rfkill= (already present via + NetworkManager/curl/systemd/util-linux). Speed test uses =speedtest-go= (AUR + =speedtest-go-bin=, already installed on velox); archsetup adds it to the AUR + list. librespeed-cli is the documented fallback if a self-hosted LibreSpeed + server is ever wanted. No =gpg= dependency (secrets live in NM's own store). +- *Daily-drivers*: a stowed-script + AUR-dep feature, so ratio needs the same + =git pull= + stow + the archsetup-added deps. Note the manual dep step in the + rollout. + +** Makefile targets (console recovery is a first-class path) +=net doctor= and the diagnostics are reachable from a bare TTY when waybar and +the GUI are down — that's the case where you most need them. The dotfiles +Makefile carries targets that wrap the =net= CLI so "get back online" is one make +command from the console: +- =make online= — =net doctor --fix= (diagnose, then apply the lightest repair: + rfkill-unblock → reset → bounce → open portal). The headline recovery target. +- =make net-doctor= — =net doctor= (read-only diagnose + recommendation). +- =make net-status= / =make net-diagnose= / =make net-portal= / =make net-reset= + / =make net-bounce= — the individual ops. +- =make test= — already runs =tests/*=; the =net= package's unittest suites are + collected the same way. +These intentionally need only nmcli/curl/rfkill (no GUI, no waybar, no Python +GTK), so they work from a TTY on a broken graphical session. * Connectivity model — split cadence @@ -91,208 +170,1416 @@ The indicator polls every ~2s, but a real internet/captive probe every 2s wastes battery and can re-trigger a captive portal. So split it: - *Fast path (every poll, cheap, no network)* — interface, type, SSID, signal, - IPv4 presence, throughput sample. From nmcli / sysfs only. + IPv4 presence, throughput sample. From nmcli / sysfs only. No network I/O. - *Slow path (cached, TTL ~45s)* — the actual internet/captive probe (the 204 check + meta-refresh portal extraction). Result cached at =$XDG_RUNTIME_DIR/waybar/net-connectivity.json= with a timestamp. The indicator reads the cache each poll. When the cache is older than the TTL, -=net status= kicks =net probe= in the background (non-blocking) and renders the -last cached sub-state meanwhile. A user-triggered diagnose/reconnect refreshes -the cache immediately. This keeps the bar responsive and the portal un-poked. +=net status= kicks =net probe= in the background (spawn + detach, never awaited) +and renders the last cached sub-state meanwhile. A user-triggered +diagnose/reconnect refreshes the cache immediately. This keeps the bar +responsive and the portal un-poked. + +** Concurrency, atomicity, staleness +- *Single-flight* — =net probe= takes a lock file at + =$XDG_RUNTIME_DIR/waybar/net-probe.lock= (flock, non-blocking). A second probe + while one runs is a no-op, so a flapping 2s poll can't pile up overlapping + probes. +- *Atomic writes* — the cache is written to a temp file + =os.replace= (atomic + rename), so a reader never sees a half-written cache. Same pattern as =wtimer=. +- *Max probe runtime* — the probe has a hard timeout (≤ 6s total: curl + =--max-time 5= + slack). On timeout it writes an =unknown= result, never hangs. +- *Stale classes* the indicator distinguishes: fresh (< TTL), stale (TTL..3×TTL, + shown with a subdued/aging hint), expired (> 3×TTL → treat as unknown), + unknown (no cache / probe failed). The bar never shows a confident "online" + past the expired threshold. +- *Invalidation* — the cache records the iface + SSID + active-connection UUID it + was taken under; a change in any of them invalidates it immediately (a + reconnect must not show the old network's verdict). +- *Crash cleanup* — a stale lock older than the max runtime is ignored/reclaimed. + +* Performance budgets (hot path) + +The bar exec path (=waybar-net= → =net status=) must stay responsive: +- *Budget*: =net status= returns in < 100ms typical, < 250ms worst case. +- *No sleeping in the bar path.* Throughput is sampled from two reads of + =/sys/class/net/<iface>/statistics/{rx,tx}_bytes= across the *waybar poll + interval itself* (delta since the last cached sample + timestamp), not via an + in-process =sleep= like the old =waybar-netspeed=. The cache holds the prior + counters. +- *Subprocess cap*: at most one =nmcli= invocation on the hot path (a single + =nmcli -t -f ...= multi-field query), plus sysfs reads. Never a per-field + nmcli call. +- *Every subprocess has a timeout* (=nmcli --wait 2=, =subprocess timeout=). On + timeout or error the indicator emits a degraded JSON state (class + =net-degraded=, a neutral glyph) rather than blocking or crashing waybar. +- *Benchmark test*: a fake slow =nmcli= asserts =net status= still returns within + budget by falling back to the degraded state. * Engine — =net= CLI surface All subcommands take =--json= where a machine reads them. Pure formatting/state -functions under the CLI; IO (nmcli, curl, file) at the edges. +functions under the CLI; IO (nmcli, curl, file) at the edges. Every subcommand +exits non-zero with a JSON error envelope (see JSON schemas) on failure. - =net status [--json] [--iface IF]= — fast link state + cached connectivity - sub-state + throughput. The indicator's source. + sub-state + throughput. The indicator's source. Never does network I/O. - =net probe [--iface IF]= — run the connectivity/captive probe now, update the - cache, print online | captive (+ portal URL) | no-internet. Mirrors =captive='s - cheap detection natively. + cache (single-flight, atomic), print online | captive (+ portal URL) | + no-internet | unknown. Mirrors =captive='s cheap detection natively. - =net list [--json]= — saved connections, MRU order, active flag, plus in-range wifi with signal. -- =net up <id|ssid>= / =net down [--iface IF]= — switch / disconnect. -- =net add= / =net edit <id>= / =net remove <id>= — manage connections; sync the - GPG store (below). +- =net up <uuid>= / =net down [--iface IF]= — switch / disconnect. Operates on + UUID, not name (see nmcli contract). +- =net add= / =net edit <uuid>= / =net remove <uuid>= — manage connections + (open + WPA-PSK) through nmcli; the secret lands in NM's own + =.nmconnection=. Enterprise profiles are activate-only. - =net rescan [--iface IF]= — wifi rescan. -- =net diagnose [--json]= — full report: gateway ping, DNS config, DNS 1.1.1.1 - override test, captive probe. Shells to =captive= for the interactive/sudo - parts; native for the read-only parts. +- =net diagnose [--json]= — read-only report: gateway ping, DNS config, captive + probe. The structured contract below. Doubles as the post-failure snapshot. +- =net repair <action> [--json]= — mutating remediation, lightest first: + =rfkill= (unblock + radio on), =reset= (fresh MAC), =bounce= (full-stack: + =nmcli networking off/on=, escalating to =systemctl restart NetworkManager=), + =dns-test= (temporary 1.1.1.1 override, auto-reverted). Each confirms via the + caller and verifies cleanup. +- =net doctor [--json] [--fix]= — one-shot "get me online" mode for the console: + runs the full diagnose, then applies the lightest repair that fits (unblock + rfkill, reset, bounce, open portal) — read-only without =--fix=, acting with + it. The TTY recovery path when waybar/the GUI is down (see the Makefile + targets). - =net portal= — run =captive='s portal-force flow (reset if needed, extract + open the portal page). -- =net reset [--hardware-mac]= — fresh-MAC reconnect (=captive='s =fresh_mac=). -- =net speedtest [--json]= — librespeed run; down/up/ping. +- =net speedtest [--json]= — =speedtest-go --json= run; down/up/ping. + +* nmcli contract + +The command wrapper is the reliability boundary; SSIDs and connection names +contain spaces, colons, duplicates, hidden names, and non-ASCII. Rules: + +- *Terse, field-selected output*: =nmcli -t -f <fields> --escape yes ...= and + =nmcli -g <fields> ...= (get-values) for single-value reads. Parse with the + documented escaping (=\:= and =\\=); never naive =cut -d:=. +- *UUID is the handle.* Every saved-profile op (=up=, =down=, =modify=, =delete=) + uses the connection UUID, never the display name — names duplicate and contain + separators. =net list= surfaces UUIDs; the panel maps row → UUID. +- *Wait budgets*: activation/deactivation use =nmcli --wait <n>= with an explicit + budget (hot-path reads =--wait 2=; activation =--wait 30=). No unbounded waits. +- *Connectivity*: NM's own =nmcli networking connectivity= can return + =none/portal/limited/full/unknown=. Use it as a *cheap hint* on the fast path + when present, but the authoritative captive verdict is still our own probe + (NM's portal detection is coarser and config-dependent). +- *Parser tests* (fake nmcli fixtures): escaped colons and backslashes in SSIDs, + embedded newlines, duplicate connection names, hidden SSID (empty name), + non-ASCII SSID, the wired-appears-mid-session case, and the multi-active case + (wifi + tether both up). + +* JSON schemas + +Versioned (="v": 1=) envelopes so tests lock the contract. Sketches (fields +nullable unless noted): + +- =status=: ={v, iface, type: wifi|ethernet|none, ssid, signal, ipv4, + gateway, throughput: {rx_bps, tx_bps}, connectivity: online|captive|no-internet|unknown, + connectivity_age_s, connectivity_class: fresh|stale|expired|unknown, state: + online|captive|no-internet|connecting|disconnected|airplane|wired|degraded}=. +- =probe=: ={v, result: online|captive|no-internet|unknown, portal_url, http_code, + redirect_host, elapsed_ms, ts}=. +- =list=: ={v, connections: [{uuid, name, type, active, last_used, signal, + in_range, security}]}=. +- =diagnose=: ={v, steps: [<diagnostic step, see contract>], overall: + ok|warn|fail}=. +- =speedtest=: ={v, down_mbps, up_mbps, ping_ms, server, elapsed_ms, ts}=. +- error envelope (any command): ={v, error: {code, message, detail, partial: + bool}}= with a non-zero exit. + +* Diagnostics contract + +=net diagnose --json= returns an ordered list of steps. Each step is the unit the +panel renders and the log records: + +- =id= — stable identifier (e.g. =link=, =dhcp=, =gateway-ping=, =dns-config=, + =dns-resolve=, =http-probe=, =portal=). +- =status= — =pending | running | pass | warn | fail | skipped=. +- =title= — short human label. +- =evidence= — redacted detail (the value seen), per the redaction rules. +- =elapsed_ms=. +- =safety= — =read-only= or =mutating= (diagnose steps are all read-only). +- =next_action= — what the user/agent should do on warn/fail (e.g. "open portal", + "reset connection", "switch network"). + +Repair actions (=net repair=) carry the same shape but =safety: mutating=, plus a +=cleanup_verified: bool= field (e.g. the DNS override was reverted) and a +terminal =cleanup-unverified= status when revert can't be confirmed. + +** Diagnose vs Repair (read-only vs mutating) +The panel separates them visually and behaviorally: +- *Diagnose* — probe, gateway ping, DNS config read, captive check. No state + change, no sudo, runnable freely. +- *Repair* — reset (fresh MAC, deletes+recreates the NM profile), DNS override + test (mutates resolver, auto-reverts), portal force. Each needs an explicit + confirm, shows that it's privacy/state-changing, and verifies cleanup. A + Repair whose cleanup can't be verified ends in a visible =cleanup-unverified= + state, never a silent success. + +* Failure states, messages, recovery + +Each row below gives the *exact, final* user-facing string (not a template) with +=<placeholders>= for redacted evidence, plus the evidence field included and the +next action. The string is canonical: every surface renders the same text, so +there's one source of truth. + +Per-surface rendering of the canonical string: +- *Indicator* — the matching glyph + CSS class; the string is the tooltip + (untruncated). +- *Notification* (=notify=) — title = the failure label, body = the string. +- *CLI* — the string on stderr; =--json= puts it in =error.message= with the + evidence in =error.detail= and a stable =error.code=. +- *Panel* — the string as the section banner, with the diagnostic step's evidence + shown beneath. +Evidence is always redacted per the redaction rules (SSID/host shown; PSK/EAP/ +portal tokens never). + +- *associated, no DHCP* — "Connected to <SSID>, no IP (DHCP failed)" → + evidence: SSID, iface → reset / reconnect. +- *no-internet* — "On <SSID>, no internet (gateway reachable, no route out)" → + diagnose / switch network. +- *captive* — "Captive portal at <host> — login required" → Open portal. +- *DNS hijack* — "DNS is being redirected (portal)" → Open portal. +- *DNS broken* — "DNS not resolving (hotel DNS down); 1.1.1.1 works" → use + override / report. +- *HTTP intercepted* — "Traffic is being intercepted before it leaves" → Open + portal. +- *sudo declined* — "Reset needs admin; it was declined — nothing changed" → + retry with auth. +- *command timed out* — "<op> timed out; the system was left unchanged" → retry. +- *partial mutation* — "<op> partially applied: <what>; rolled back to <state>" + → review. +- *missing speedtest-go* — "speedtest-go not installed" → install hint. +- *no wifi hardware* (desktop) — wifi rows hidden; ethernet-only view. +- *wifi rfkill-blocked* — "WiFi is blocked (rfkill)" → unblock. The indicator + detects a soft-blocked radio (=rfkill list= shows the radio off though hardware + is present) and shows this distinct from disconnected. =net repair rfkill= (and + =net doctor --fix= as its first step) runs =rfkill unblock wifi= + =nmcli radio + wifi on= and reconnects. This is the framework-laptop case: an out-of-power + shutdown sometimes leaves wifi soft-blocked at next boot, and yes — the module + recovers it (the rfkill state is the indicator; the rfkill repair / doctor is + the one-step fix). A *hard* block (physical switch) is reported as + not-recoverable-in-software with that message. +- *wifi rfkill hard-blocked* — "WiFi is blocked by the hardware switch" → + evidence: rfkill hard state → flip the physical switch. +- *wrong password / missing secret* — "Saved password for <SSID> was rejected" → + evidence: SSID, NM auth-failure reason → re-enter the password. +- *enterprise auth/cert failure* — "Enterprise login failed for <SSID> (802.1X)" + → evidence: SSID, EAP failure reason → edit the profile in nmtui/nmcli. +- *upstream / AP / provider* — "On <SSID>, link is fine but the network has no + uplink" → evidence: gateway reachable, no route out, not a portal → switch + network or contact the venue. +- *VPN-routed* — "Connected; internet is routed through a VPN (<dev>)" → + evidence: default route on a tun/wg device or non-NM DNS owner → check the VPN, + not WiFi. +- *HTTP interception, no parseable portal URL* — "A portal is intercepting + traffic but didn't give a login link" → evidence: HTTP code, redirect host → + opens neverssl + the gateway page to log in manually. +- *DNS override cleanup unverified* — "Couldn't confirm DNS was restored after the + test" → evidence: iface, attempted revert → revert DNS manually + (=resolvectl revert <iface>=). + +Each message names whether the system was left unchanged, partially changed (with +what), or fully changed, so the user knows the residue. + +* Doctor: escalation, classification, terminal states + +=net doctor= diagnoses, classifies the failure, then (with =--fix=) applies the +*lightest* repair that fits and re-checks — it never loops destructive repairs +against a failure they can't fix. Each failure resolves to one of four outcomes, +and the doctor stops at any terminal one: + +- =fixable= — a local repair should help. Escalate lightest-first: rfkill-unblock + → reset (fresh MAC) → bounce (full stack) → portal, re-probing after each, and + stop as soon as the probe returns online. +- =needs-user-action= (terminal) — no reset/bounce will help; doctor stops and + names the exact next step. Covers: wrong WPA password / missing NM secret + (enter the password), locked keyring or polkit denial (retry with auth), + enterprise 802.1X cert/identity failure (edit the profile in =nmtui=/=nmcli=), + captive portal login-required (open the portal + accept terms). Doctor must not + delete/recreate the profile against these — that loses the saved password and + makes things worse. +- =upstream-not-local= (terminal) — the local link is up but the problem is past + it: AP has no uplink, gateway down/dropping traffic, DHCP server broken, ISP + outage, portal backend failing. =diagnose= proves it (link up + IP + gateway + reachable, but no route out and not a captive redirect), and =doctor --fix= + stops after local repairs are exhausted with "local repairs tried; likely + upstream/AP/provider" + the evidence. Next action: switch networks or contact + the venue. +- =deferred/vpn= (terminal for v1) — an active VPN / policy route / non-NM + resolver owns the default route or DNS, so "no internet" may be the VPN's fault, + not WiFi's. v1 *detects* this (default route on a =tun/wg= device, or DNS owned + by something other than the NM link) and classifies it separately — "link is + fine; internet is VPN-routed" — rather than misclassifying it as a WiFi failure. + v1 does not repair it (VPN management is Phase 5); it names the VPN as the likely + owner and stops. + +** DNS handling in doctor (explicit per class) +- *Captive DNS hijack* — open the portal (the hijack clears on login). No DNS + mutation. +- *Broken resolver, 1.1.1.1 works* — doctor offers an explicit *temporary* 1.1.1.1 + override as a repair with cleanup verification (auto-revert, =cleanup_verified=); + without =--fix= it only recommends the command. It does not leave a permanent + resolver change. +- *Port-53 / egress blocked* (even 1.1.1.1 fails) — terminal =upstream-not-local=; + doctor stops, since it's not locally fixable. + +* Failure-mode coverage + +For each common field failure: does =net diagnose= detect it, can =net doctor +--fix= repair it, and what terminal user action remains when it can't. (The +=needs-user-action= / =upstream-not-local= / =deferred/vpn= outcomes are defined +above.) + +| Failure mode | diagnose detects | doctor --fix | terminal user action | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| rfkill soft block | yes | yes (unblock) | none | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| rfkill hard block | yes | no | flip the physical switch | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| no wifi hardware | yes | n/a | use ethernet | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| associated, no DHCP | yes | yes (reset/bounce) | none, else switch network | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| gateway unreachable | yes | yes (bounce) | switch network if it | +| | | | persists | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| captive DNS hijack | yes | opens portal | log in at the portal | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| broken DNS, 1.1.1.1 works | yes | yes (temp override, | report the venue's DNS | +| | | auto-reverted) | | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| HTTP captive portal | yes | opens portal | log in at the portal | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| HTTP interception, no | yes | opens neverssl + gateway | log in manually | +| parseable URL | | | | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| upstream / AP outage | yes (link up, no route out) | no (stops after local) | switch network / contact | +| | | | venue | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| wrong WPA password / | yes | no | enter the password | +| missing secret | | | | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| enterprise auth / cert | yes | no | edit the profile in | +| failure | | | nmtui/nmcli | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| duplicate SSID / | yes (UUID-keyed) | yes (activate by UUID) | none | +| connection-name | | | | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| hidden SSID | yes | yes (connect by name) | enter SSID + password | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| multiple active links | yes | n/a | pick the interface | +| (wifi+tether) | | | | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| wedged NetworkManager | yes | yes (bounce → restart NM) | none, else reboot | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| slow / hung command | yes (degraded) | retries within budget | retry | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| stale / corrupt cache | yes | self-heals (atomic + | none | +| | | invalidation) | | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| DNS cleanup failure | yes | flags cleanup-unverified | revert DNS manually | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| missing speedtest backend | yes | n/a | install speedtest-go | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| +| VPN / policy-routing | yes (route/DNS ownership) | no (deferred to Phase 5) | check the VPN | +| interference | | | | +|----------------------------+-----------------------------+-----------------------------+-----------------------------| + +* Observability — logging + redaction + +- *Event log*: JSONL at =$XDG_STATE_HOME/net/events.jsonl= (fallback + =~/.local/state/net/events.jsonl=), size-rotated (e.g. 1 MB × 3). Every + mutating op and probe appends an event: =ts, op, argv (redacted), exit_code, + stderr_tail, elapsed_ms, iface, nm_uuid, probe_url_class, http_code, + redirect_host, cache_event=. +- *Redaction (always on)*: PSKs, EAP identities/passwords, NM secrets, and + portal query tokens are never logged. MAC addresses, full IPs, and SSID are + redacted when configured (=redact_mac=, =redact_ip=, =redact_ssid= in config). +- *Post-failure diagnosis*: =net doctor --json= is the snapshot + recommendation + (diagnose plus the suggested repair), =net diagnose --json= the raw report, and + the event log the history. =net doctor= is the console-recoverable entry point + (reachable as =make online= / =make net-doctor=). +- *Secret-leak tests*: assert no PSK/EAP/portal-token ever appears in any JSON + output, log line, or error message. * Indicator (task #C — Phase 1, the fast win) ** States (internet sub-state on top of link state) - online — associated and the probe returned 204. Normal icon. - captive — associated, probe hit a portal. Distinct glyph + warning CSS class; - tooltip names the portal host; left-click opens the panel's diagnostics with - the portal ready to open. + tooltip names the portal host; left-click opens diagnostics with the portal + ready to open (Phase 2+; see interactions for the Phase-1 interim). - no-internet — associated, probe failed (no portal, no 204). Distinct glyph + warning class. +- degraded — =net status= couldn't read link state within budget (slow/failed + nmcli). Neutral glyph, =net-degraded= class. Never blocks the bar. +- rfkill-blocked — the radio is soft-blocked (=rfkill=), distinct from + disconnected. Distinct glyph; the fix is =net repair rfkill= / =net doctor=. - connecting / disconnected / airplane / wired — as today, plus wired shown - correctly even when it appears after session start. + correctly even when it appears after session start. (airplane is now this + module's state, absorbed from the retired airplane module.) ** Glyphs Nerd-font codepoints, final values verified live before merge (same discipline as wtimer). Reuse the signal-strength ramp already in =waybar-netspeed=; add a -captive/no-internet overlay glyph. +captive / no-internet / degraded overlay glyph. ** Tooltip SSID + signal + IPv4 + gateway + the throughput readout (absorbed from -netspeed) + the last probe result and age. +netspeed) + the last probe result and its age (stale/expired hinted). -** Interactions (no keyboard-modifier clicks — waybar can't qualify clicks by -modifier; the panel hosts the rich actions) -- left-click — open the panel. -- right-click — quick reconnect / bounce (=net reset=, no panel). -- middle-click — run =net portal= (force the captive page). -- scroll — cycle nothing in v1 (reserved; could cycle saved connections later). +** Interactions (phase-aware; no keyboard-modifier clicks — waybar can't qualify +clicks by modifier, so the rich actions live in the panel, not ctrl/super-click) +Clicks never block the bar: each dispatches a detached background job and reports +via =notify=, single-flight per action. +- *Phase 1 (no panel yet)*: left-click runs =net probe= + notify (refreshes the + state on demand) and keeps the existing =pypr toggle network= scratchpad as the + interim manager; right-click runs =net repair reset= in the background + + notify; middle-click runs =net portal=. +- *Phase 2+ (panel exists)*: left-click opens the panel (focused on the relevant + section — diagnostics when captive); right/middle keep the background + reset/portal shortcuts. * Panel (tasks #B + #C diagnostics — Phases 2-3) -GTK4 + gtk4-layer-shell, pocketbook scaffold (src-layout package, pytest, +GTK4 + gtk4-layer-shell, pocketbook scaffold (src-layout package, unittest, Makefile, gtk4-layer-shell anchored dropdown under the bar). One panel shell, reused by the future desktop-settings panel. Sections: 1. *Connections* — list, MRU-first, active marked, live signal bars for in-range wifi; row click switches; buttons for add / edit / remove; a rescan control. -2. *Diagnostics* — buttons: Probe (204/captive, shows portal URL + Open), - Bounce/Reset, Gateway ping, DNS override test. Streaming output area. -3. *Speed test* — Run button, progress, down/up/ping result + last-run line. +2. *Diagnose* (read-only) — Probe (204/captive, shows portal URL + Open), Gateway + ping, DNS config. Streaming step output (the diagnostics contract). +3. *Repair* (mutating, confirmed) — tiered lightest-first: Unblock rfkill, Reset + (fresh MAC), Bounce (full stack), DNS override test, Force portal. A "Get me + online" button runs =net doctor --fix= (the auto-escalating sequence). +4. *Speed test* — Run button, progress, down/up/ping result + last-run line. + +** Panel state, cancellation, permissions +State machines for: connection-list loading, rescan-in-progress, +activation-in-progress, diagnose-running, repair-running, speedtest-running. Plus +the real terminal states on this two-machine fleet: no-wifi-hardware (desktop → +ethernet-only view) and missing speedtest-go. (No GPG-key state — there's no +credential store; secrets live in NM.) ("No NetworkManager" is not a modeled +state — NM is always present +on these machines; if nmcli is somehow absent the panel shows a single hard-error +and exits.) Long operations show elapsed time and are cancellable where the +underlying op allows (rescan, speedtest, probe); clearly non-cancellable ones +(an in-flight activation) show elapsed + a disabled control. Permission-denied +(sudo/polkit declined) is a first-class outcome with the "nothing changed" +message, never a silent failure. Interaction-pattern catalog (=~/code/rulesets/patterns/=) principles that apply: - transient-state-buttons — all the network levers in one place, reachable by one chord (the bar click), state visible. - default-most-common-friction-proportional — connections MRU-ordered so the - common pick is first; destructive ops (remove) get a confirm, switching does - not. + common pick is first; destructive ops (remove) and privacy-changing ones + (reset, override) get a confirm, switching does not. - one-prompt-picker-typed-prefix — if the connection picker ever goes keyboard-driven, kind (wifi/eth/saved/in-range) + name in one typed picker. +** Panel UX flow (settle before Phase 2) +The concrete interaction defaults, so the GTK build isn't inventing them: +- *Default focus*: the Connections section, current connection's row selected. If + the indicator opened the panel because of a captive/no-internet state, focus + Diagnose instead with the relevant action highlighted. +- *Row content*: glyph (signal bars / wired / active check) + name + a secondary + line (security type, "active"/last-used). The active row is visually pinned at + top of its group. +- *Buttons*: one *primary* per section (Connections: Connect to the selected row; + Diagnose: Run diagnose; Repair: "Get me online"; Speed test: Run). Secondary + actions (add / edit / remove / rescan; individual repair tiers) are smaller and + grouped. +- *Disabled rules*: Connect disabled on the already-active row; Repair tiers + disabled while one runs; Speed test disabled while running; add/edit disabled + for enterprise (with the "edit in nmtui/nmcli" hint). +- *Confirmations* (exact wording): Reset → "Reset <SSID>? This drops the + connection and reconnects with a new MAC."; Bounce → "Restart networking? All + links drop briefly."; DNS override → "Temporarily set DNS to 1.1.1.1 for the + test? It reverts automatically."; Remove → "Forget <SSID>? The saved password is + deleted." +- *"Get me online" reporting*: shows each escalation step live (Unblock rfkill → + Reset → Bounce → Portal) with per-step pass/fail and stops at the first that + restores internet or at a terminal state, naming the next action. +- *After close*: the bar reflects the new state immediately (signal/refresh on + next poll); a running speedtest/diagnose keeps running and notifies on finish + (panel close doesn't cancel it). +- *Keyboard*: Esc closes; Tab moves between sections; arrows move rows; Enter + fires the section primary; the connection list is type-to-filter. + * Connection management (nmcli) -- Every op via nmcli: =device status=, =connection show=, =con up/down=, - =con add/modify/delete=, =dev wifi rescan/list=. +- Every op via nmcli per the nmcli contract above (terse, escaped, UUID-keyed, + bounded =--wait=). - MRU ordering from NM's =connection.timestamp= (last activated), descending. - Ethernet appears in the list whenever a wired device is present, selectable at any time; switching just brings the chosen connection up. -- TDD with a fake =nmcli= on PATH returning canned output, asserting the exact - nmcli command sequence (behavior, not implementation) — the established - pattern in =tests/waybar-netspeed= and =tests/layout-navigate=. - -* Credential storage (GPG) - -- Store: =~/.config/net/connections.<ext>= — connection definitions + secrets - (PSK/EAP), one record per connection, with =last_used=. -- *Default unencrypted* (=connections.json=). Encryption is opt-in: when enabled, - =connections.json.gpg= encrypted to Craig's private key (=c@cjennings.net=). -- Passphrase cadence via gpg-agent: once per session (long cache TTL or - decrypt-and-hold), once per hour (=default-cache-ttl=), or never (plaintext). - Configured in =~/.config/net/config=. -- *Supplements NM, does not replace it.* NM's own store - (=/etc/NetworkManager/system-connections=, root-only) stays the source of - truth that actually connects. The GPG store is a portable, user-owned export + - secret vault. =net add/edit/remove= writes both (nmcli + store). =net import= - rebuilds NM connections from the store on a fresh machine. They sync on every - mutating op; NM wins on conflict for the live connection. - -* Diagnostics + speed test - -- Diagnostics reuse =captive= verbatim for the interactive flow (=net portal=, - =net reset=, the 1.1.1.1 DNS override test) and mirror its cheap probe natively - for =net status= / =net probe=. No logic is duplicated by hand beyond the small - portal-URL parser, which is already unit-tested in =tests/captive=. -- Speed test: *librespeed-cli* (no account, self-hostable, AUR), chosen over - Ookla speedtest-cli. =net speedtest --json= parses its JSON; the panel shows a - progress indicator and the down/up/ping result. +- *Mutation safety + rollback*: switching keeps the current connection up until + the new one activates successfully (=nmcli --wait 30=); on failure it does not + tear down the working link, surfaces the failure, and leaves the prior + connection active. =net down= notes that NM may auto-reactivate a profile and + reports the post-op active connection so the user isn't surprised. A switch that + needs a password it doesn't have prompts (or fails with "password required"), + never silently strands. The exact NM command sequence (preflight active-state + read → activate target → verify default route → on failure, confirm prior + still up) is pinned in the engine and tested against fake nmcli. +- *Add/edit scope*: open + WPA-PSK only in v1. Existing saved profiles of any + type (including enterprise) can be *activated*; editing an enterprise profile + shows "edit via nmtui/nmcli" rather than a broken partial form. + +* Connection secrets (no separate store) + +Per Craig's call: don't build a parallel credential store. Settings and secrets +live where NetworkManager already keeps them, so there's one source of truth and +no extra dependency (no GPG, no gpg-agent, no =~/.config/net/connections=). + +- *Where secrets live*: =/etc/NetworkManager/system-connections/<name>.nmconnection=, + root-owned =0600=, with the PSK/EAP secret stored inline (the default + =secret-flags=0= "owned by NM"). That's already secure-at-rest (root-only) and + is what =nmcli= reads/writes. +- *How we touch them*: every add/edit/remove goes through =nmcli= (=connection add + / modify / delete=), which writes the =.nmconnection= with the right ownership + and perms. We never read or write =system-connections= files directly (root) and + never copy a secret out of them. +- *No export / import / sync* — there's nothing to sync. A new machine gets its + connections the way it always has (the user joins, or restores NM profiles), + not from a tool-specific vault. +- *config file*: =~/.config/net/config= still exists, but only for non-secret + preferences (speedtest server, redaction flags, probe TTL). It holds no + credentials. +- *No secret leakage*: PSK/EAP never appear in =net=' =--json= output, the event + log, or error text (tested) — even though NM is the store, our surfaces must not + echo a secret =nmcli= happens to return. + +* Speed test + +- Backend: *=speedtest-go=* (=--json=, =--server=, =--no-download/--no-upload=), + already installed on velox (AUR =speedtest-go-bin=). No new dependency for v1. + librespeed-cli is the documented fallback for a self-hosted LibreSpeed server. +- =net speedtest --json= parses speedtest-go's JSON into the =speedtest= schema. +- *Server policy*: auto-select nearest by default; allow a pinned server id in + =~/.config/net/config=. +- *Timeout + cancellation*: a hard run timeout (e.g. 60s); the panel run is + cancellable (kills the child). Offline / rate-limited / no-server errors map to + the failure-message table. +- *Tests*: fixture JSON (success) and fixture stderr (offline, no server, + malformed output) drive =net speedtest= parsing without touching the network. + +* Help + documentation + +In-app help has three layers, each reachable in the situation it's needed: + +- *CLI help (works from a dead-GUI TTY)*: =net --help= lists the subcommands in + one screen; =net <cmd> --help= documents each (flags, what it mutates, the + console-recovery targets). The Makefile targets are self-describing (=make help= + lists =online= / =net-doctor= / etc. with one-line descriptions). This is the + layer that matters most when you're at a console with no network. +- *Panel help (in the GUI)*: a small =?= affordance in the panel header opens an + inline help pane — what each section does, which Repair actions mutate state, + what the indicator glyphs/colors mean. Per-control tooltips on the less-obvious + buttons (rfkill, bounce, DNS override). No external help browser. +- *User guide (the durable doc)*: a README / docs page covering every command, + the indicator states + glyphs, the panel sections, the config file keys, the + recovery make targets, troubleshooting (the failure-message table), and + rollback. Written so a future session — or Craig six months out — can operate + and recover the module from the doc alone. + +The failure-message table above is the single source of truth for the +troubleshooting text; the guide and the panel help both render from it rather +than restating it. + +* Enhancement radar + +Low-cost adjacent affordances, each dispositioned so cheap wins aren't lost and +the v1 panel stays focused. (Several are already in v1 by virtue of other +sections; marked here so the consideration is visible.) + +| Enhancement | Disposition | Reason | +|-------------------------------------+-------------+--------------------------------------------------------| +| Open / copy portal URL | v1 | already in the captive flow; trivial Open + Copy | +|-------------------------------------+-------------+--------------------------------------------------------| +| Forget network | v1 | it's the remove op, already specced | +|-------------------------------------+-------------+--------------------------------------------------------| +| Rescan now | v1 | already a Connections control | +|-------------------------------------+-------------+--------------------------------------------------------| +| Retry with hardware MAC | v1 | captive already has --hardware-mac; expose in Repair | +|-------------------------------------+-------------+--------------------------------------------------------| +| Pin speedtest server | v1 | already a config key | +|-------------------------------------+-------------+--------------------------------------------------------| +| Copy redacted doctor report | v1 | cheap, serves the observability/support goal | +|-------------------------------------+-------------+--------------------------------------------------------| +| Show last good network / result | vNext | needs small history persistence | +|-------------------------------------+-------------+--------------------------------------------------------| +| Watch mode for net doctor | vNext | a --watch loop; handy at a TTY, not v1-critical | +|-------------------------------------+-------------+--------------------------------------------------------| +| Actionable desktop notifications | vNext | dunst supports actions; extra wiring | +|-------------------------------------+-------------+--------------------------------------------------------| +| Keyboard connection picker (fuzzel) | vNext | the typed-prefix pattern; panel covers v1 | +|-------------------------------------+-------------+--------------------------------------------------------| +| QR-code share / import WiFi | rejected | low value for a personal 2-machine setup; phones do QR | +|-------------------------------------+-------------+--------------------------------------------------------| * Waybar wiring - Replace =custom/netspeed= with =custom/net= in the bar's module list (same slot). - Module def: =exec: waybar-net=, =return-type: json=, =interval: 2=, a =signal= - for on-demand refresh (next free signal after wtimer's 14), =on-click: <open - panel>=, =on-click-right: net reset=, =on-click-middle: net portal=. -- Remove the old =on-click: pypr toggle network= scratchpad once the panel - replaces it. + for on-demand refresh (next free signal after wtimer's 14), =on-click=, + =on-click-right=, =on-click-middle= per the phase-aware interactions (each + dispatches a detached job, never blocks). +- Remove the old =on-click: pypr toggle network= scratchpad only once the panel + replaces it (Phase 2); Phase 1 keeps it as the interim manager. * Testing plan (TDD) -- *Engine* — fake =nmcli= + fake =curl= on PATH; assert command sequences and - parsed/emitted JSON for status, list, up/down, add/edit/remove, probe, - diagnose, speedtest. Pure state/format functions tested directly. +- *Engine (normal)* — fake =nmcli= + =curl= + =speedtest-go= on PATH; assert + command sequences and parsed/emitted JSON for status, list, up/down, + add/edit/remove, probe, diagnose, repair, speedtest. Pure state/format + functions tested directly. JSON schemas locked by example. - *Portal parser* — already covered in =tests/captive= (Normal/Boundary/Error + the real SONIFI body). The engine's native probe reuses the same cases. +- *nmcli parsing* — escaped colon/backslash/newline in SSID, duplicate names, + hidden SSID, non-ASCII, wired-mid-session, multi-active (wifi+tether). +- *Failure + concurrency (the risky classes)* — slow/hung nmcli/curl/speedtest + (degraded state within budget), concurrent =net status= probe refresh + (single-flight), corrupt cache (recovered), stale cache after SSID change + (invalidated), permission denied / sudo declined, DNS-override cleanup failure + (=cleanup-unverified=), NM partial activation (rollback keeps prior link), + secret redaction, missing speedtest-go, no wifi hardware, rfkill soft/hard + block. +- *Doctor classification* — fixture-driven =net doctor= over fake nmcli/curl + asserting the right terminal classification + that =--fix= stops before + destructive repairs: auth failures (=needs-user-action=), upstream/AP failure + (=upstream-not-local=), VPN-routed failure (=deferred/vpn=), and the DNS classes + (hijack → portal, broken-but-1.1.1.1-works → offered override, egress-blocked → + upstream). Assert the failure-mode coverage table's "detects / repairs / terminal + action" holds for each row. - *Indicator* — drive =net status --json= through =waybar-net=, assert the JSON - the bar renders for each state (online / captive / no-internet / wired / - disconnected), interface override via env like =WAYBAR_NETSPEED_IFACE=. -- *Panel* — pocketbook-style: test the backing logic (list ordering, op - dispatch, store read/write, gpg round-trip with a test key), not the GTK - widgets. -- *GPG store* — round-trip encrypt/decrypt with a throwaway test key; sync - on add/edit/remove; import rebuilds NM ops (asserted against fake nmcli). + per state (online / captive / no-internet / degraded / wired / disconnected / + rfkill), iface override via env. +- *Panel* — pocketbook-style: backing logic (list ordering, op dispatch, + state-machine transitions), not GTK widgets. +- *NM secrets / no-leak* — add/edit writes the secret into NM via nmcli (asserted + against fake nmcli, never to a tool-owned file); assert no PSK/EAP appears in any + =--json=, log line, or error (there is no credential store to round-trip). +- *Live checklist (gated out of the suite)* — a "Manual testing and validation" + task per phase for the real-network states (captive at a hotel, no-internet, + switch under load, reset, speedtest) that can't be faked. + +** Harness + coverage gate +The concrete contract, matching the repo's existing convention (not pytest — the +dotfiles suites are =unittest=, run by =make test= as =python3 -m unittest= over +=tests/*/test_*.py=; 33 suites today): +- *Framework*: =unittest=. Each suite is =tests/<name>/test_<name>.py= + (=tests/net/=, =tests/waybar-net/=), collected by the existing =make test= loop + — no new runner, no pytest dependency. +- *Fakes on a temp PATH*: =fake-nmcli=, =fake-curl=, =fake-speedtest-go=, + =fake-rfkill=, =fake-resolvectl= live as executable stubs in =tests/<name>/= + (the =tests/layout-navigate/fake-hyprctl= pattern). A fixture file encodes the + command→canned-output map and the stub appends each invocation to a log the test + asserts against. Subprocess timeouts are simulated by a stub that sleeps past the + budget; =net status= must still return the degraded state. +- *Waybar wrappers end-to-end*: =waybar-net= is run as a subprocess with the fake + PATH and the env overrides (iface, cache path), asserting the emitted JSON — same + as =tests/waybar-netspeed=. +- *Coverage*: coverage.py is absent system-wide (and not importable), so coverage + runs in a throwaway venv (=python3 -m venv=, =pip install coverage=, =coverage + run -m unittest=, =coverage report=) — the method the wtimer suite used (95%). + Target: *branch* coverage over =net/= and the wrapper, ≥ 90% on the pure + classifier/parser modules. -* Files touched (planned) +** Coverage as a gap-finder, not a number (per phase) +Line coverage alone misses the branches that matter here, so each phase ends with +a *coverage-gap pass*, not just a percentage: +- After the first green run, read the branch report and map every uncovered branch + to either a new test or a consciously-excluded live-only behavior (with a comment + or a Manual-testing entry naming it). +- *Branch coverage is required* for the pure logic: the doctor classifier (every + outcome — fixable / needs-user-action / upstream-not-local / deferred-vpn), the + cleanup-unverified path, the redaction paths, the degraded hot-path fallback, the + timeout branches, and the portal/nmcli parsers. +- A phase isn't "done" until its coverage-gap pass is recorded — uncovered logic is + either tested or explicitly excused, never silently uncovered. +* Files touched (planned, all in =~/.dotfiles=) + +- =net/= package (src-layout, like pocketbook) — engine + panel. - =hyprland/.local/bin/waybar-net= — the indicator (replaces =waybar-netspeed=). -- =hyprland/.local/bin/net= — engine CLI entry (or a package console-script). -- =net/= package (src-layout, like pocketbook) — engine + panel, in the dotfiles - repo (or in-tree as pocketbook is). -- =hyprland/.config/waybar/config= — swap =custom/netspeed= → =custom/net=. -- =hyprland/.config/waybar/style.css= — captive / no-internet state classes. +- =hyprland/.local/bin/net= — engine CLI entry (console-script shim). +- =hyprland/.config/waybar/config= — swap =custom/netspeed= → =custom/net=; + remove =custom/airplane=. +- =hyprland/.config/waybar/style.css= — captive / no-internet / degraded / + rfkill classes; remove airplane classes. - =tests/net/=, =tests/waybar-net/= — suites. -- =~/.config/net/= — config + connection store (machine-local; not stowed - content beyond a seed config). -- =captive= — minor refactor so the engine can reuse its probe cleanly. +- =captive= — refactor: extract probe + reset into functions callable + non-interactively (a =--json= probe mode) so the engine reuses them. +- =~/.config/net/config= — seed config (probe TTL, speedtest server, redaction + flags). No secrets; not a credential store. +- dotfiles =Makefile= — add the console-recovery targets (=online=, =net-doctor=, + =net-status=, =net-diagnose=, =net-portal=, =net-reset=, =net-bounce=). +- *Deletions once net ships* (the airplane module is absorbed): + =hyprland/.local/bin/waybar-airplane=, =hyprland/.local/bin/airplane-mode=, + =tests/waybar-airplane/=, =tests/airplane-mode/=, and the =custom/airplane= + module + its css. +- archsetup Hyprland step — add =gtk4-layer-shell=, =python-gobject=, + =speedtest-go-bin= to the install lists (the only archsetup change; no =gpg= + added, secrets stay in NM's store). -* Resolved decisions (this session, Craig's calls) +* Resolved decisions (Craig's calls + this response) 1. Panel UI tech → GTK4 + gtk4-layer-shell, shared pocketbook scaffold (one panel shell, reused by the desktop-settings sibling). 2. Engine language → Python =net= package; shells out to =captive= for the portal-force flow, native cheap probe for the bar path. 3. Connectivity probe → split cadence (fast link poll every 2s + slow cached - internet/captive probe, TTL ~45s). + internet/captive probe, TTL ~45s) with single-flight + atomic cache. 4. No keyboard-modifier clicks (waybar can't qualify them) — the panel hosts the - rich actions; bar clicks are left=panel, right=reset, middle=portal. -5. GPG store supplements NM; NM stays the source of truth. + rich actions; bar clicks dispatch detached jobs (phase-aware). +5. No separate credential store (Craig's call, cj). Secrets live in NM's own + =system-connections= (root =0600=, inline), touched via nmcli. No GPG, no + gpg-agent, no =~/.config/net/connections=. Supersedes the earlier GPG-store + design. 6. =custom/netspeed= absorbed into =custom/net=; throughput moves to the tooltip. -7. Speed-test backend → librespeed-cli. +7. Speed-test backend → =speedtest-go= (already installed), not a new + librespeed-cli dependency; librespeed-cli is the self-hosted fallback. +8. Code lives in the dotfiles repo; archsetup only installs deps. +9. v1 add/edit scope = open + WPA-PSK; enterprise/802.1X is activate-only, + add/edit is vNext (settled by Craig 2026-06-29 — no enterprise networks in his + history, so the form would be unused UI). +10. =net doctor= is in v1 (Craig's call, cj) — a one-shot diagnose+fix mode, + reachable from a TTY via =make online= / =make net-doctor=. (The earlier + "defer the doctor/bundle command" decision is reversed.) +11. Diagnose (read-only) and Repair (mutating, confirmed) are separated in the + panel and the CLI; Repair is tiered lightest-first (rfkill → reset → bounce). +12. =custom/net= absorbs the airplane module (Craig's call, cj); the standalone + airplane module + scripts + tests are deleted once net ships. +13. Repair includes a full-stack bounce and an rfkill-unblock (Craig's calls, + cj) — the latter recovers the framework-laptop post-power-loss soft-block. +14. VPN / WireGuard is a planned Phase 5 (Craig's call, cj), not a permanent + exclusion. -* Phasing +* Implementation phases -- *Phase 1 — Indicator (task #C).* =net status= + =net probe= (native cheap - probe, reusing captive's logic) + =waybar-net= + the split-cadence cache + CSS - states. Ships the no-internet/captive state on the bar. Smallest, highest - value, fully testable without the GTK panel. +- *Phase 1 — Indicator + console recovery (task #C).* =net status= + =net probe= + (native cheap probe, reusing captive's logic) + the =captive= probe refactor + + =waybar-net= + the split-cadence cache (single-flight, atomic, stale classes) + + CSS states (incl. rfkill) + performance budget. Plus the CLI-only recovery path: + =net repair= tiers (rfkill / reset / bounce), =net doctor [--fix]=, and the + Makefile targets (=make online= etc.) — all testable without the GTK panel. + Absorbs the airplane state and removes the standalone airplane module. Interim + left-click keeps the existing scratchpad until the panel lands. + - *Acceptance*: fresh-login waybar smoke test shows correct state on + online/captive/no-internet/wired/rfkill; =net status= stays within budget + under a fake slow nmcli (degraded state); =net doctor --fix= recovers a + soft-blocked radio from a TTY; the live captive checklist passes at a real + portal; the airplane state works and the old airplane module is gone; + reverting = swap =custom/netspeed= + =custom/airplane= back. - *Phase 2 — Panel shell + connection management (task #B core).* GTK4 - layer-shell scaffold + =net list/up/down/add/edit/remove/rescan= + MRU list. + layer-shell scaffold + =net list/up/down/add/edit/remove/rescan= + MRU list + + mutation safety/rollback + panel state machines. + - *Acceptance*: switch wifi↔wifi and ethernet↔wifi without stranding; a failed + switch leaves the prior link up; add/edit open + WPA-PSK writes the secret to + NM; remove confirms; panel states render for loading/rescan/activation. - *Phase 3 — Diagnostics + speed test in the panel.* Wire =net diagnose= / - =net portal= / =net reset= / =net speedtest= into the panel; portal Open - button. -- *Phase 4 — GPG credential store.* Opt-in encryption, cadence config, NM sync, - import. + =net repair= / =net doctor= / =net portal= / =net speedtest= into the Diagnose + vs Repair sections; the "Get me online" button; portal Open button; speedtest + progress + cancel. + - *Acceptance*: diagnose runs read-only; each repair tier confirms + verifies + cleanup (DNS override reverts, shown); speedtest result parses from + speedtest-go and a fixture-driven failure shows the right message. +- *Phase 4 — Docs + rollout.* In-app help (=net --help= / per-command help, the + panel help affordance), README/user-guide (commands, panel, config, + troubleshooting, the make targets, rollback), and the manual dep step on ratio. + - *Acceptance*: =net --help= and each subcommand's help are complete; the + user-guide covers every command + the recovery targets; ratio rollout + documented. +- *Phase 5 — VPN / WireGuard (future).* Fold the existing archsetup wireguard + tooling into the same panel + CLI (=net vpn ...=). Out of the v1 milestone; + specced separately when picked up. * Open items / risks - gtk4-layer-shell dropdown anchoring under a waybar module needs the same - positioning work pocketbook solved; reuse it. -- librespeed-cli availability + a default server choice (public list vs a pinned - server) to confirm before Phase 3. + positioning work pocketbook solved; reuse it. (Phase 2.) +- The =captive= refactor must keep the standalone CLI behavior identical while + exposing a non-interactive =--json= probe; covered by the existing + =tests/captive= suite plus new probe-mode tests. (Phase 1.) +- speedtest-go server selection variance (nearest-server flor) — pin a server in + config if results are noisy. (Phase 3.) - The background-probe kick from =net status= must be truly non-blocking (spawn + - detach) so a slow/hanging probe never stalls the bar. -- NM↔GPG-store conflict policy on edit needs a concrete rule (NM wins for the - live connection) — confirm during Phase 4. + detach); enforced by the single-flight lock and the performance benchmark test. * Rollback Each phase is independent. The indicator (Phase 1) is a drop-in replacement for -=custom/netspeed=; reverting is swapping the module back in the config. The panel -and store are additive — not wiring their clicks / not enabling encryption leaves -the bar working as before. +=custom/netspeed= (and =custom/airplane=); reverting is swapping those modules +back in the config and restoring their scripts. The panel is additive — not +wiring its clicks leaves the bar working as before. No credential store to roll +back (secrets stay in NM throughout). + +* Review findings [31/31] + +** DONE Define the structured diagnostics contract :blocking: +The spec says the engine "emits JSON" and that diagnostics "reuse =captive= +verbatim", but the current =~/.dotfiles/common/.local/bin/captive= flow is a +human-readable bash script that mixes diagnostics, sudo prompts, DNS mutation, +browser launch, and terminal prose. A GTK panel cannot reliably turn that into +clear state, progress, cancellation, or useful error messages. Define the +machine contract before implementation: every diagnostic step should have a +stable id, status (=pending/running/pass/warn/fail/skipped=), redacted evidence, +elapsed time, safety outcome, and next action. Keep =captive= as the interactive +CLI, but either refactor reusable probe/reset functions behind =net diagnose +--json= or make =captive= expose a non-interactive JSON mode. This blocks the +panel and logging work because otherwise the implementer must invent the +boundary. + +Disposition: accept — added the "Diagnostics contract" section (per-step id / +status / evidence / elapsed / safety / next_action) and the =captive= =--json= +probe-mode refactor under Architecture + Files touched. + +** DONE Specify user-facing failure messages and recovery actions :blocking: +The spec names failure states like =no-internet=, =captive=, failed probe, +failed reset, missing DNS, and missing speed-test backend, but it does not define +the messages the user sees or what each message tells them to do next. For this +feature, "error" is not enough: a user needs to know whether WiFi is associated, +whether DHCP succeeded, whether DNS is hijacked/broken, whether HTTP is +intercepted, whether sudo was declined, whether a command timed out, and whether +the system was left unchanged or partially changed. Add a message table for the +indicator, panel, and CLI with: failure class, visible text, evidence included, +redaction rule, and next action. This is blocking because UX quality here is the +product, not an implementation detail. + +Disposition: accept — added the "Failure states, messages, recovery" section +covering each class, the visible message, the "what changed" residue note, and +the next action across indicator/panel/CLI. + +** DONE Define the debug log and redacted support bundle :blocking: +There is no observability section. When this fails in a hotel or cafe, an agent +needs enough evidence to diagnose it without rerunning destructive actions. Add +log location, rotation/retention, JSONL event schema, command argv logging, +exit-code/stderr capture, elapsed time, selected iface, NM active connection +UUID, probe URL class, HTTP code, redirect host, DNS servers, and cache +read/write events. Also define a =net doctor --json= or =net debug-bundle= +command that emits redacted status, recent log events, dependency versions, and +a reproduction command. Redact SSID if configured, MAC addresses, portal query +tokens, PSKs, EAP identities/passwords, IPs when requested, and all GPG/NM +secrets. This blocks implementation readiness because post-failure diagnosis is +currently left to ad hoc terminal spelunking. + +Disposition: modify — accepted the JSONL event log, the schema, and the redaction +rules in full (new "Observability" section). Deferred the dedicated =net +debug-bundle= / =net doctor= command to vNext: for a single-user tool =net +diagnose --json= (the snapshot) plus the event log (the history) cover +post-failure diagnosis; a bundle command is gold-plating for v1. Recorded under +Out + Resolved decision 10. + +** DONE Pin the nmcli parsing and timeout contract :blocking: +The spec lists nmcli operations but not the exact fields, output modes, escaping +rules, ID semantics, or timeouts. This is risky because SSIDs and connection +names can contain spaces, colons, duplicates, hidden names, and non-ASCII; the +current =waybar-netspeed= already had an SSID parsing bug. The nmcli manual +documents =--terse=, =--get-values=, =--escape=, =--wait=, ID/UUID/path +selection, =passwd-file=, and built-in connectivity states +(=none/portal/limited/full/unknown=) at +https://man.archlinux.org/man/nmcli.1.en. The spec should require UUIDs for +saved-profile operations, explicit =--wait= budgets, parser tests for escaped +colons/backslashes/newlines/duplicate names/hidden SSIDs, and a decision on when +to use or ignore =nmcli networking connectivity [check]=. This is blocking +because the command wrapper is the core reliability boundary. + +Disposition: accept — added the "nmcli contract" section: terse + =--escape= + +=--get-values=, UUID-keyed ops, explicit =--wait= budgets, NM connectivity as a +cheap hint (our probe authoritative), and the parser test matrix. + +** DONE Define cache concurrency, atomicity, and stale-state behavior :blocking: +=net status= may spawn =net probe= whenever the cache is stale, but the spec +does not define locking, process coalescing, atomic writes, crash cleanup, or +what happens when the probe hangs. With a 2s Waybar interval, a bad network could +start overlapping probes, corrupt the runtime cache, or keep showing stale +"online" while the link is gone. Add a single-flight lock under +=$XDG_RUNTIME_DIR/waybar=, atomic write+rename for cache updates, max probe +runtime, stale age classes (fresh/stale/expired/unknown), cache invalidation on +iface/SSID/connection UUID change, and tests for concurrent =net status= calls. +This blocks the fast-path design because it is the main performance and +correctness risk. + +Disposition: accept — added "Concurrency, atomicity, staleness" under the +Connectivity model: flock single-flight, temp+rename atomic write, ≤6s probe +timeout, fresh/stale/expired/unknown classes, iface/SSID/UUID invalidation, stale +lock reclaim, plus concurrency tests in the test plan. + +** DONE Bound hot-path performance with measured budgets :blocking: +The spec says the cheap poll should be sub-100ms, but the proposed fast path +still may call multiple =nmcli= commands every two seconds, read sysfs, parse +throughput, and maybe spawn a background probe. The existing =waybar-netspeed= +had a deliberate sleep for throughput sampling; replacing it must define how +throughput is sampled without sleeping in the bar path. Add a per-command budget +for =waybar-net= and =net status=, a maximum number of subprocesses on the hot +path, a timeout for every subprocess, benchmark tests with fake slow =nmcli=, +and a rule that the indicator emits a degraded JSON state rather than blocking. +This is blocking because Waybar custom modules can visibly freeze or lag when +their exec path stalls. + +Disposition: accept — added the "Performance budgets" section: <100ms typical / +<250ms worst, throughput sampled across the poll interval (no in-process sleep), +one nmcli call max on the hot path, timeouts on every subprocess, the degraded +state, and a fake-slow-nmcli benchmark test. + +** DONE Make click actions non-blocking and visible :blocking: +Waybar right-click runs =net reset= and middle-click runs =net portal= directly. +Those operations can require sudo, open browsers, mutate DNS, delete/recreate NM +profiles, or hang on network commands, but Waybar click handlers provide no +panel, terminal, progress, or cancellation surface by default. Define whether +right/middle click instead opens the panel focused on the action, dispatches a +background job with notifications, or is removed from v1. If kept, specify +single-flight behavior, how sudo/polkit prompts surface, how success/failure is +reported, and how the user can inspect logs. This blocks UX readiness because +the fastest remediation path is currently the easiest place to hide failure. + +Disposition: modify — accepted the concern; made the interactions phase-aware and +non-blocking. Every click dispatches a detached, single-flight background job and +reports via =notify=; sudo surfaces through polkit/the normal prompt; failures go +to the notify + the event log. In Phase 1 (no panel) left-click runs probe + +notify and keeps the scratchpad; from Phase 2 left-click opens the panel focused +on the action. Recorded in the Indicator "Interactions" subsection. + +** DONE Specify connection mutation safety and rollback :blocking: +The spec says row click switches connections and remove gets a confirm, but it +does not define what happens when a switch partially succeeds, disconnects the +current working link, needs a password, loses the default route, or triggers +auto-activation. The nmcli manual warns that =connection down= does not prevent +future auto-activation and may internally block a profile until user action. +Define preflight, the exact NM command sequence, whether the old active +connection is kept until the new one proves usable, when rollback is attempted, +how long activation waits, and what the panel says when rollback fails. This is +blocking because the module can strand the user offline. + +Disposition: accept — added "Mutation safety + rollback" under Connection +management: keep the prior link up until the target activates (=--wait 30=), no +teardown on failure, password-required surfaced not stranded, =net down= reports +post-op active state + the auto-reactivation caveat, and the pinned NM command +sequence is tested against fake nmcli. + +** DONE Define the credential-store security model :blocking: +The GPG store is described as optional and default-unencrypted, but the spec does +not define file modes, schema, secret-source rules, import/export prompts, +recipient verification, stale secret handling, or what is logged. It also says +NM remains source of truth while the user-owned store contains PSK/EAP secrets, +which creates two truth sources for sensitive data. Add a precise schema, +=0600= file creation with parent-dir permissions, encrypted-recipient checks, +plaintext warning text, explicit opt-in flow, redaction requirements, behavior +when NM has a secret not in the store, behavior when the store has a secret NM +rejects, and tests for no secret leakage in JSON/logs/errors. This blocks Phase +4 and the full spec because otherwise the implementer must make security +decisions mid-code. + +Disposition: accept — rewrote "Credential storage" with the versioned schema, +=0600= file / =0700= dir, recipient verification on opt-in, the plaintext +warning, secret-source rule (entered/exported, never harvested from root store), +the two-source reconciliation policy (NM wins live, store wins for what NM +lacks, stale-secret flagging), and the no-leak tests. + +** DONE Define EAP, enterprise WiFi, and unsupported connection behavior :blocking: +The store says "PSK/EAP" and connection management says add/edit, but there is +no v1 contract for WPA-Enterprise fields, certificates, identity vs anonymous +identity, hidden networks, static IP, proxy settings, metered flags, MAC +randomization, or 802.1X prompt behavior. Either scope v1 to open/WPA-PSK plus +existing saved-profile activation, or define the minimum EAP form and the +unsupported-state messages. This blocks add/edit/import because enterprise WiFi +is too sensitive to hand-wave. + +Disposition: modify (scope) — scoped v1 to open + WPA-PSK add/edit, with +*activation* of any existing saved profile (including enterprise). Enterprise / +802.1X add/edit, static-IP, proxy, metered, and MAC-randomization editing are +vNext, shown as "edit via nmtui/nmcli". Recorded in Scope/Out, Connection +management, and Resolved decision 9. + +** DONE Split read-only diagnostics from mutating remediation :blocking: +The panel's diagnostics section includes probe, bounce/reset, gateway ping, and +DNS override test in one area, while =captive= currently performs resets and +temporary DNS changes as part of its flow. Users need to know which buttons are +read-only and which mutate NM profiles, MAC mode, DNS, or browser state. Add +separate "Diagnose" and "Repair" actions, confirmations for destructive or +privacy-changing operations, explicit cleanup verification for DNS override, and +a terminal state when cleanup is unverified. This blocks readiness because +network repair must not surprise the user or leave hidden residue. + +Disposition: accept — split the panel into a read-only Diagnose section and a +confirmed, mutating Repair section (and split the CLI into =net diagnose= vs =net +repair=). Added =cleanup_verified= + a terminal =cleanup-unverified= state to the +diagnostics contract. + +** DONE Define panel state, cancellation, and permissions UX :blocking: +The panel sections list buttons and a streaming output area, but not loading +states, disabled states, empty states, keyboard/focus behavior, cancellation, or +permission-denied handling. Add panel state machines for connection list loading, +rescan in progress, activation in progress, diagnostics running, speedtest +running, and no NetworkManager/no WiFi/no permissions/no GPG key/no +librespeed-cli. Each long operation should be cancellable where possible or +clearly non-cancellable with an elapsed-time display. This blocks the GTK work +because without it the implementer must invent the user flow. + +Disposition: modify — accepted the state-machine requirement (added "Panel state, +cancellation, permissions"), but scoped the state set to what can actually occur +on the two-machine fleet: dropped "no NetworkManager" as a modeled state (NM is +always present; a missing nmcli is a single hard-error exit) and kept +no-wifi-hardware, missing speedtest-go, no-GPG-key, plus the in-progress states +with elapsed-time + cancellation where the op allows. + +** DONE Verify speed-test dependency, server choice, and failure contract :blocking: +The spec chooses =librespeed-cli= and notes availability/default-server research +as an open risk, but Phase 3 still depends on parsing its JSON and showing +progress. I checked the upstream project page +(https://github.com/librespeed/speedtest-cli) and the AUR URL named by search is +not sufficient as a verified package/install contract in this spec. Add the +exact package name/source to install, command version expected, JSON shape, +server-selection policy, timeout, cancellation behavior, offline/rate-limited +messages, and tests with fixture JSON and fixture stderr. This blocks Phase 3 +because speed-test failure modes are otherwise undefined. + +Disposition: modify — verified live and changed the backend: =speedtest-go= (AUR +=speedtest-go-bin=, 1.x) is already installed on velox and supports =--json=, +=--server=, =--no-download/--no-upload=, so v1 needs no new dependency. +librespeed-cli (AUR =librespeed-cli= / =-bin=) is the documented self-hosted +fallback. Added the "Speed test" section with server policy, timeout, +cancellation, the failure-message mapping, and fixture-JSON/stderr tests. + +** DONE Define dependency installation and repo boundaries :blocking: +The files touched section alternates between archsetup paths and the external +dotfiles repo, while pocketbook has been folded into this repo and its previous +archsetup provisioning was intentionally removed. The spec should state where +the =net= package actually lives, which repository owns the scripts/tests, +whether =gtk4-layer-shell=, =python-gobject=, =librespeed-cli=, =gpg=, =nmcli=, +=curl=, and =resolvectl= are installed by archsetup or assumed present, and the +Makefile targets for test/lint/install. This blocks implementation because the +current path plan can produce code that is not installed on a fresh machine. + +Disposition: accept — added the "Repository + dependencies" section: all code in +=~/.dotfiles= (=net/= package in-tree like pocketbook, scripts in the hyprland +tier, tests under =tests/=), archsetup owns only the dep install +(=gtk4-layer-shell=, =python-gobject=, =speedtest-go-bin=; nmcli/curl/resolvectl +already present), Makefile =make test= collects the package suite, and a +daily-drivers note for ratio. Rewrote Files touched to match. + +** DONE Expand the test plan for failure, concurrency, and live verification :blocking: +The testing plan covers normal parsing and fake command sequences, but it misses +the riskiest behaviors: slow/hung =nmcli=/=curl=/=librespeed=, concurrent +=net status= cache refresh, corrupt cache, stale cache after SSID change, +permission denied, sudo declined, DNS override cleanup failure, NM partial +activation, duplicate connection names, secret redaction, missing optional +dependencies, no WiFi hardware, wired+tether+WiFi ambiguity, portal redirect +tokens, and Waybar click handlers. Add unit/fixture tests for each class plus a +manual/live checklist gated out of the normal suite. This is blocking because +the current plan would leave the exact "things that can go wrong here" mostly +untested. + +Disposition: accept — rewrote the Testing plan with the "Failure + concurrency" +class (slow/hung commands, single-flight, corrupt/stale cache, perm-denied, +cleanup-failure, partial activation, redaction, missing deps, no-wifi, +multi-active) and a per-phase live checklist gated out of the suite. + +** DONE Define status JSON schemas and compatibility rules +The spec says all subcommands take =--json= but does not define schemas. Add +versioned JSON examples for =status=, =probe=, =list=, =diagnose=, =speedtest=, +and error envelopes, including nullable fields and unknown/degraded states. This +is non-blocking for product direction but should be fixed before code so tests +can lock the CLI contract. + +Disposition: accept — added the "JSON schemas" section with versioned (=v:1=) +envelopes for status / probe / list / diagnose / speedtest and a shared error +envelope, including the degraded/unknown states. + +** DONE Rename or alias the phasing section for workflow compatibility +The spec has a usable =Phasing= section, but the spec-review workflow expects an +=Implementation phases= section that can be lifted into =todo.org=. Rename it or +add an alias heading during response. This is non-blocking because the existing +phase decomposition is understandable, but aligning the heading prevents future +workflow friction. + +Disposition: accept — renamed =Phasing= → =Implementation phases= and added +per-phase acceptance criteria. + +** DONE Add documentation and rollout acceptance checks +Rollback is described, but docs and rollout are thin. Add README/user-guide +updates for commands, panel behavior, config file, GPG opt-in, troubleshooting, +and rollback; add acceptance checks for each phase, including a fresh-login +Waybar smoke test and restoring =custom/netspeed=. This is non-blocking but +important for handing the feature to a future session without re-discovery. + +Disposition: accept — added per-phase acceptance criteria under Implementation +phases (incl. the fresh-login waybar smoke test and the =custom/netspeed= +restore), a Phase 4 "Docs + rollout", and (answering Craig's cj follow-up) a +dedicated "Help + documentation" section with the three help layers (CLI help, +panel help affordance, user guide). + +** DONE Add a failure-mode coverage table :blocking: +The spec now names many individual network failures, but it still does not carry +one compact coverage matrix that says, for each common failure mode, whether +=net diagnose= detects it, whether =net doctor --fix= can repair it, and what +terminal user action remains when it cannot. Add a table covering at least: +rfkill soft block, rfkill hard block, no WiFi hardware, associated/no DHCP, +gateway unreachable, captive DNS hijack, broken DNS where 1.1.1.1 works, HTTP +portal, HTTP interception without a parseable portal URL, upstream/AP outage, +wrong WPA password or missing secret, enterprise auth/cert failure, duplicate +SSID/connection-name ambiguity, hidden SSID, multiple active links, wedged +NetworkManager, slow/hung command, stale/corrupt cache, DNS cleanup failure, +missing speedtest backend, and VPN/routing interference. This blocks because +Craig asked for confidence that the diagnostics and doctor cover the real field +failures, and prose scattered across sections is too easy to misread. + +Disposition: accept — added the "Failure-mode coverage" section: a 22-row table +(every mode the finding named) with detect / doctor-fix / terminal-action +columns, conformed to the org-table standard (rules under every row, ≤120). + +** DONE Pin DNS repair semantics in doctor :blocking: +The spec diagnoses DNS hijack, broken hotel DNS, and the temporary 1.1.1.1 +override test, but =net doctor --fix= does not say whether it merely recommends +the override, applies a temporary override during recovery, or leaves DNS alone +after diagnosis. Define the exact behavior for each DNS class: captive hijack +should open the portal, broken DNS where 1.1.1.1 works should either offer an +explicit temporary repair with cleanup verification or recommend the command, +and port-53/egress blocking should stop as upstream/not locally fixable. This is +blocking because DNS is one of the most common "connected but unusable" failures +and the current doctor contract is ambiguous. + +Disposition: accept — added "DNS handling in doctor (explicit per class)" under +the new Doctor section: hijack → open portal (no DNS mutation); broken-but-1.1.1.1 +→ explicit temporary override with cleanup verification under =--fix=, recommend +otherwise; egress-blocked → terminal =upstream-not-local=. + +** DONE Make auth failures terminal user-action states :blocking: +Wrong WPA password, missing NM secret, locked keyring/polkit denial, enterprise +802.1X certificate/identity failure, and portal login-required are not fixed by +resetting or bouncing NetworkManager. The doctor sequence should classify these +as =needs-user-action= terminal states, stop before looping through destructive +repairs, and tell the user the exact next action (enter password, edit profile in +=nmtui=/=nmcli=, accept portal terms, provide cert/identity, or retry with +admin auth). This blocks because repeated reset/bounce against auth failures is +slow, noisy, and can make the network state worse without helping. + +Disposition: accept — added the =needs-user-action= terminal outcome to the +Doctor section: wrong password / missing secret / keyring-or-polkit denial / +802.1X cert-or-identity failure / portal-login-required all stop the doctor before +any destructive repair and name the exact next step. + +** DONE Define upstream/AP/provider failure terminal states :blocking: +Some failures are not client-repairable: AP has no uplink, hotel gateway is +down, DHCP server is broken, gateway drops traffic, ISP outage, or captive +portal backend is failing. The spec should define how =diagnose= proves "local +link is up but upstream is broken" and how =doctor --fix= stops after local +repairs are exhausted with a clear message like "local repairs tried; likely +upstream/AP/provider" plus the evidence. This blocks because users need to know +when to stop poking the laptop and switch networks or contact the venue. + +Disposition: accept — added the =upstream-not-local= terminal outcome: diagnose +proves link-up + IP + gateway-reachable but no route out and no captive redirect; +=doctor --fix= stops after local repairs with "local repairs tried; likely +upstream/AP/provider" + evidence → switch network / contact venue. + +** DONE Decide how VPN and policy routing affect v1 diagnosis +VPN/WireGuard management is Phase 5, but active VPNs, policy routes, DNS +overrides, and firewall killswitches can break apparent internet access in v1. +The current spec does not say whether v1 detects active VPN/policy routing and +classifies "network is fine, VPN route/DNS is broken" separately from WiFi +failure. Add either a v1 diagnostic check for active VPN/default-route/DNS +ownership with a "deferred repair" outcome, or explicitly state that VPN-routed +failures are out of scope and may be misclassified. This is blocking if Craig +expects the module to diagnose normal daily-driver network failures while VPN +tooling remains separate. + +Disposition: accept (chose the detect-and-classify option) — v1 detects an active +VPN / non-NM default route / non-NM DNS owner and classifies =deferred/vpn= ("link +is fine; internet is VPN-routed"), distinct from a WiFi failure. v1 does not +repair it (VPN management is Phase 5); it names the VPN as the likely owner and +stops. Added to the Doctor section + the coverage table + a doctor-classification +test. + +** DONE Remove stale GPG-store references from the resolved spec +The spec now decides "no separate credential store; secrets live in +NetworkManager", but the Testing plan still mentions =gpg round-trip= and =GPG +store= tests, and the panel-state list still mentions a no-GPG-key state. Remove +those stale references and replace them with NM-secret/no-secret-leak tests. +This is non-blocking for product behavior but blocking for implementation +clarity: otherwise tests will be written for a credential store that no longer +exists. + +Disposition: accept — replaced the Testing-plan =gpg round-trip= / =GPG store= +bullets with an "NM secrets / no-leak" test (add/edit writes the secret via nmcli; +assert no PSK/EAP in any JSON/log/error; no store to round-trip) and dropped the +=no-GPG-key= panel state. Residue from the cj-comment pass that dropped the store. + +** DONE Reconcile status, goal, and task text before implementation :blocking: +The spec status says "Implementation-ready with caveats" and "Phase 1 ready to +build", but the body still has an unresolved enterprise add/edit VERIFY, the +Goal still says "optional GPG-encrypted secret store", and the unified task title +still names "GPG-stored secrets" even though the accepted design removed the +store. Before implementation, make the top-level status, goal, scope, task +mapping, and resolved decisions agree with the current design. This blocks +readiness because a developer starting from the top of the file would still build +or plan around abandoned GPG-store behavior. + +Disposition: accept — fixed the Goal ("secrets stay in NM's own store"), the +=[#B]= task-mapping line (notes the "GPG-stored secrets" framing is superseded by +decision 5), the enterprise VERIFY (now resolved → Status updated), and corrected +the stale =pytest= mentions to =unittest= (the repo's actual harness). Top-of-file +status/goal/scope/decisions now agree with the design. + +** DONE Resolve enterprise add/edit scope or make the caveat explicit :blocking: +The spec still says "One open question for Craig: pull enterprise add/edit into +v1?" and points to a VERIFY in =todo.org=. That is a real product-scope decision: +if enterprise add/edit is in v1, panel forms, nmcli command sequences, tests, +error messages, and docs change materially; if it is out, the UI must consistently +show activate-only with "edit in nmtui/nmcli". Decide it in the spec before +implementation, or downgrade the status to =Ready with caveats= with this exact +accepted caveat. As written, the spec cannot be plain =Ready=. + +Disposition: accept — Craig decided (2026-06-29): enterprise add/edit is vNext, +activate-only in v1. Settled in the Status line, the Scope/Out bullet, decision 9, +and the VERIFY (now DONE in todo.org). The UI shows activate-only with "edit in +nmtui/nmcli" consistently. Evidence: 24 saved profiles, 0 enterprise. + +** DONE Define the concrete test harness and coverage gate :blocking: +The spec says TDD, fake binaries on PATH, and benchmark tests, but it does not +define the actual harness contract: pytest vs unittest for the =net= package, +where fake =nmcli=/=curl=/=speedtest-go=/=rfkill=/=resolvectl= live, how test +fixtures encode command histories, how subprocess timeouts are simulated, how +Waybar scripts are executed end-to-end, and how coverage is run. Add the exact +Makefile targets (=test=, =test-unit= or package-local =pytest=), pytest config, +coverage command (e.g. branch coverage over =net/= and =waybar-net= wrappers), +minimum threshold, and the rule for reading the coverage report to add missing +tests before declaring a phase done. This blocks readiness because "what is the +test harness?" is still answerable only by analogy to older suites. + +Disposition: accept — added the "Harness + coverage gate" section. Corrected the +premise: the repo is =unittest= (=make test= → =python3 -m unittest=, 33 suites), +not pytest. Pinned the fake-binary stub convention (=tests/<name>/fake-*= on a +temp PATH), the fixture command→output map, timeout simulation, the end-to-end +=waybar-net= subprocess run, and coverage via a throwaway venv (coverage.py is +absent system-wide) with a ≥90% branch target on the pure modules. + +** DONE Use coverage to find missing behavior, not just report a percentage :blocking: +The spec does not say how coverage findings affect implementation. For this +feature, line coverage alone can miss the important holes: doctor classification +branches, cleanup-unverified paths, redaction paths, degraded hot-path fallbacks, +timeout branches, and auth/upstream/VPN terminal states. Define coverage review +criteria per phase: branch coverage for pure classifiers and parsers, named +untested branches allowed only with comments or manual-check entries, and a +required "coverage gap pass" after the first green test run that maps uncovered +logic back to tests or consciously excluded live-only behavior. This blocks +readiness because the current test plan is broad but does not force the suite to +expose missing edge tests. + +Disposition: accept — added the "Coverage as a gap-finder, not a number (per +phase)" subsection: branch coverage required for the doctor classifier (every +outcome), cleanup-unverified, redaction, degraded-fallback, timeout, and the +parsers; a mandatory coverage-gap pass after the first green run mapping each +uncovered branch to a test or a named live-only exclusion; a phase isn't done +until that pass is recorded. + +** DONE Convert error classes into exact user-facing strings and evidence fields :blocking: +The failure table and doctor outcomes classify errors well, but many messages +are still templates or descriptions rather than final text. Add exact strings +for indicator tooltip, notification, CLI stderr, JSON =error.message=, and panel +banner/step text for every failure-mode row, including cases doctor cannot fix: +wrong password, missing secret, enterprise cert failure, upstream/AP/provider +failure, VPN-routed failure, hard rfkill block, DNS cleanup failure, speedtest +missing, and HTTP interception without parseable URL. For each string, specify +the redacted evidence included and the next action. This blocks UX readiness +because "useful error" is only testable once the actual text and evidence are +defined. + +Disposition: accept — rewrote the Failure states section: each row now carries the +exact final string (with =<placeholder>= evidence), the evidence field, and the +next action, plus a per-surface rendering rule (indicator tooltip / notify / +CLI+JSON error.message+detail+code / panel banner all render the one canonical +string). Added the missing doctor-unfixable rows: hard rfkill, wrong password / +missing secret, enterprise cert failure, upstream/AP/provider, VPN-routed, HTTP +interception without a parseable URL, and DNS cleanup-unverified. + +** DONE Add an enhancement disposition table +The spec captures several good enhancements (doctor, Makefile recovery, rfkill, +airplane absorption, VPN phase), but it does not show that low-cost adjacent +enhancements were considered and accepted/deferred/rejected. Add a small radar +table for likely affordances: copy redacted doctor report, open/copy portal URL, +retry with hardware MAC, forget network, rescan now, pin speedtest server, show +last good network/result, watch mode for =net doctor=, desktop notification +actions, QR-code/share WiFi import/export, and keyboard picker. Mark each +=v1=, =vNext=, or =rejected= with a one-line reason. This is non-blocking, but it +prevents accidental loss of cheap UX wins and keeps the v1 panel focused. + +Disposition: accept — added the "Enhancement radar" table dispositioning all the +named affordances: open/copy portal URL, forget network, rescan, hardware-MAC +retry, pin speedtest server, copy redacted doctor report = v1; last-good +network/result, doctor watch mode, actionable notifications, keyboard picker = +vNext; QR-share = rejected (low value for a 2-machine personal setup). + +** DONE Tighten the panel UX flow before Phase 2 +The panel has sections and state machines, but not a concrete interaction flow: +default focused section, row content, primary/secondary buttons, disabled-state +rules, confirmation wording for reset/bounce/DNS override, how "Get me online" +reports each escalation, what stays visible after the panel closes, and keyboard +navigation. Add a short UX flow spec or wire-level outline before Phase 2. This +is non-blocking for Phase 1, but it blocks Phase 2 implementation because a GTK +panel can easily become noisy or surprising if these defaults are invented while +coding. + +Disposition: accept — added the "Panel UX flow (settle before Phase 2)" +subsection: default focus (Connections, or Diagnose when opened from a captive +state), row content, one primary button per section, disabled-state rules, exact +confirmation wording for reset/bounce/DNS-override/remove, the live "Get me +online" escalation reporting, what survives panel close, and keyboard nav. + +* Review and iteration history + +** 2026-06-29 Mon @ 17:00:39 -0400 — Codex — reviewer + +- *What changed or was recommended:* Rubric: =Not ready=. Applied the + spec-review workflow and added blocking findings for diagnostics structure, + user-facing errors, observability, nmcli contracts, cache concurrency, + hot-path performance, Waybar click actions, mutation rollback, credential + security, unsupported WiFi types, panel states, speed-test dependency + verification, install boundaries, and test coverage. +- *Why:* The spec has the right broad shape, but this feature combines a status + indicator, NetworkManager mutation, sudo-backed remediation, captive-portal + probing, secret storage, and GTK UI. Too many failure modes were still left for + implementers to invent during coding. +- *Artifacts:* Findings recorded in =Review findings=. Local code read included + =~/.dotfiles/hyprland/.local/bin/waybar-netspeed=, + =~/.dotfiles/common/.local/bin/captive=, + =~/.dotfiles/tests/waybar-netspeed/test_waybar_netspeed.py=, + =~/.dotfiles/tests/captive/test_captive.py=, pocketbook's GTK layer-shell + scaffold, current Waybar config, and the linked =todo.org= tasks. External + checks included the Arch nmcli man page and upstream LibreSpeed CLI page. + +** 2026-06-29 Mon @ 17:35:00 -0400 — Claude Code (archsetup) — responder + +- *What changed:* Processed all 18 findings — 13 accepted, 5 modified, 0 + rejected. Accepts added new sections: Repository + dependencies, Performance + budgets, nmcli contract, JSON schemas, Diagnostics contract (+ Diagnose vs + Repair), Failure states/messages/recovery, Observability, cache + concurrency/atomicity/staleness, mutation safety + rollback, the rewritten + credential security model, the expanded test plan, JSON schemas, the renamed + Implementation phases with per-phase acceptance criteria, and a Status line. + Modifies: deferred the =net debug-bundle= command (log + =diagnose --json= + cover it), made click actions phase-aware + non-blocking, scoped enterprise + WiFi to activate-only (add/edit vNext), trimmed the panel state set to states + that occur on the two-machine fleet, and switched the speed-test backend to the + already-installed =speedtest-go= (verified live) instead of a new + librespeed-cli dependency. +- *Why:* Codex's review was strong and mostly right; the modifies are scope calls + for a single-user, two-machine tool (avoid gold-plating, use installed deps, + don't model impossible states) and one phasing reconciliation (the panel is + Phase 2, so Phase-1 clicks can't open it). +- *Artifacts:* All 18 findings completed =DONE= in place with dispositions. Live + check: =pacman -Qq | grep speed= (speedtest-go-bin installed; =speedtest-go + --json= confirmed). Status now "review incorporated (Codex, 2026-06-29)", + implementation-ready with the named Phase-2/3 caveats; Phase 1 ready to build. + +** 2026-06-29 Mon @ 17:37:58 -0400 — Claude Code (archsetup) — responder (cj comments) + +- *What changed:* Folded in Craig's 10 cj comments on the spec. Design changes: + (1) dropped the separate credential store entirely — secrets stay in NM's own + =system-connections= (root =0600=, inline), touched via nmcli; no GPG, no + gpg-agent (rewrote the secrets section, decision 5, dropped Phase 4 + the gpg + dep). (2) Added =net doctor [--fix]= + Makefile console-recovery targets + (=make online= etc.) as a first-class TTY path; reversed the earlier + defer-the-doctor call (decision 10). (3) Added a full-stack =bounce= repair and + an =rfkill= unblock repair + indicator state — the rfkill one recovers the + framework-laptop post-power-loss soft-block Craig hit. (4) =custom/net= absorbs + the airplane module; the standalone airplane scripts/tests/module are deleted on + ship. (5) Moved VPN/WireGuard from "out" to a planned Phase 5. (6) Added a + "Help + documentation" section (CLI help / panel help / user guide). Answered + the enterprise-defer rationale and the captive-auto-login explanation inline. +- *Why:* Craig's comments simplify (NM-only secrets, fewer deps) and harden the + recovery story (doctor + make targets + rfkill/bounce reachable from a dead-GUI + console — his stated need). Enterprise add/edit in v1 is the one open call, + raised as a VERIFY. +- *Artifacts:* All 10 cj blocks removed. Live checks: airplane scripts/tests + present (confirmed deletion targets); =/etc/NetworkManager/system-connections/= + is root =0600= with inline secrets (confirms the NM-only secret model); + =rfkill= present. VERIFY filed in =todo.org= for the enterprise-scope decision. + +** 2026-06-29 Mon @ 17:49:24 -0400 — Codex — reviewer + +- *What changed or was recommended:* Added a second feedback batch from the + common network failure-mode review: require a diagnose/doctor coverage table, + pin DNS repair semantics, make auth failures terminal user-action states, + define upstream/AP/provider terminal states, decide how active VPN/policy + routing affects v1 diagnosis, and remove stale GPG-store references. +- *Why:* The design now has a solid recovery shape, but the field question is + sharper: for each common failure, does =diagnose= catch it, does =doctor --fix= + fix it, or does the user need to switch networks/provide credentials/contact + the venue? That mapping needs to live in the spec, not in chat. +- *Artifacts:* Findings appended under =Review findings= after reviewing the + current spec's Diagnostics, Failure states, Doctor, Testing plan, and + responder history. + +** 2026-06-29 Mon @ 18:06:22 -0400 — Codex — reviewer + +- *What changed or was recommended:* Added a readiness pass focused on + development-time questions, missed enhancements, UX cleanliness, error wording, + and test/coverage mechanics. New findings require reconciling stale status/goal + text, resolving or explicitly caveating enterprise add/edit, defining the + concrete harness and coverage gate, making coverage actionable, turning failure + classes into exact user-facing strings, adding an enhancement disposition + table, and tightening the Phase-2 panel UX flow. +- *Why:* The network failure model is now strong, but implementation would still + force developers to invent test infrastructure, coverage expectations, final + error copy, and several UX details. The spec should decide those up front or + explicitly mark them as caveats. +- *Artifacts:* Findings appended under =Review findings= after reading the + updated spec, current archsetup Makefile test targets, pocketbook's pyproject, + and the linked =todo.org= enterprise VERIFY. + +** 2026-06-29 Mon @ 17:55:08 -0400 — Claude Code (archsetup) — responder (2nd review) + +- *What changed:* Processed all 6 second-batch findings — 6 accepted, 0 modified, + 0 rejected. Added a "Doctor: escalation, classification, terminal states" + section with four outcomes (=fixable= escalating lightest-first; =needs-user- + action=, =upstream-not-local=, =deferred/vpn= terminal) and explicit per-class + DNS handling. Added a "Failure-mode coverage" table (22 rows: detect / doctor-fix + / terminal action, org-table-standard conformant, ≤120). Cleaned the last + GPG-store residue (Testing plan → NM-secret/no-leak test; dropped the no-GPG-key + panel state). Added a doctor-classification test class. +- *Why:* The findings sharpen the doctor from "escalate repairs" to "classify, + then stop at the right terminal state" — the safety property that keeps + =doctor --fix= from looping destructive repairs against a wrong password or an + upstream outage. The VPN finding got the detect-and-classify option (not repair; + that's Phase 5) so v1 doesn't misread a VPN-routed failure as WiFi down. The + coverage table is the single artifact Craig asked for to trust the field + coverage. +- *Artifacts:* All 6 findings =DONE= in place; cookie =[24/24]=. Table conformed + via =wrap-org-table.el= (max line 120). No new findings introduced; no scope + expansion past what the findings asked. Spec remains implementation-ready, the + enterprise add/edit VERIFY still the one open decision. + +** 2026-06-29 Mon @ 18:24:51 -0400 — Claude Code (archsetup) — responder (3rd review) + +- *What changed:* Processed the third review (7 findings). The enterprise-scope + finding closed first on Craig's call (vNext, activate-only). The other 6, all + accepted: (1) reconciled the top-of-file text — fixed the Goal's GPG-store + wording, the =[#B]= task-mapping line, the resolved enterprise VERIFY, and + corrected the stale =pytest= mentions to =unittest= (the repo's real harness). + (2) Added a "Harness + coverage gate" section (unittest, fake-binary stubs on a + temp PATH, venv coverage, ≥90% branch on pure modules). (3) Added a per-phase + "coverage as a gap-finder" pass. (4) Rewrote the Failure states section to exact + final strings + evidence fields + a per-surface rendering rule, and added the + missing doctor-unfixable rows. (5) Added the "Enhancement radar" table + (v1/vNext/rejected). (6) Added the "Panel UX flow" subsection. +- *Why:* The findings close the gap between "design decided" and "a developer can + start": the harness/coverage contract, the exact UX strings, and the panel flow + are the things otherwise invented mid-code. The =pytest=→=unittest= correction + was a real defect — the spec contradicted the repo's actual test convention. +- *Artifacts:* All 31 findings =DONE=; cookie =[31/31]=. Both new tables conformed + via =wrap-org-table.el= (coverage 120, radar 110). Harness verified against the + live repo (33 unittest suites, =make test=, coverage.py absent → venv). Status + raised to "Ready for Phase 1; Ready-with-caveats overall" — no open decisions + remain. |
