-rwxr-xr-x  archsetup                                              |   3
-rw-r--r--  docs/design/2026-06-29-zfs-pre-snapshot-installer.org  |  87
-rw-r--r--  todo.org                                               | 119
3 files changed, 182 insertions, 27 deletions
diff --git a/archsetup b/archsetup
index 7531821..7c98147 100755
--- a/archsetup
+++ b/archsetup
@@ -1866,6 +1866,9 @@ hyprland() {
action="Hyprland Utilities" && display "subtitle" "$action"
aur_install pyprland # scratchpads, magnify, expose (fixes special workspace issues)
pacman_install waybar # status bar
+ pacman_install gtk4-layer-shell # custom/net connection panel (GTK4 layer-shell)
+ pacman_install python-gobject # PyGObject for the net panel
+ aur_install speedtest-go-bin # net panel speed test backend
pacman_install fuzzel # app launcher (native Wayland, pinentry support)
pacman_install awww # wallpaper daemon (swww successor; provides swww)
aur_install waypaper # wallpaper GUI (awww backend)
diff --git a/docs/design/2026-06-29-zfs-pre-snapshot-installer.org b/docs/design/2026-06-29-zfs-pre-snapshot-installer.org
new file mode 100644
index 0000000..413bfa5
--- /dev/null
+++ b/docs/design/2026-06-29-zfs-pre-snapshot-installer.org
@@ -0,0 +1,87 @@
+#+TITLE: ZFS pre-pacman snapshot installer step (durable retention)
+#+DATE: 2026-06-29
+#+SOURCE: handoff from the home project, 2026-06-29
+
+* Problem
+
+A pacman =PreTransaction= hook snapshots =zroot/ROOT/default@pre-pacman_<ts>=
+before every transaction, but nothing prunes them. Sanoid doesn't manage them
+(they aren't =autosnap_= names), so they accumulated to 53 on velox between
+April and the 2026-06-29 health check. Unbounded, they fill the pool over time.
+
+* What's actually on velox vs. archsetup
+
+The live =/usr/local/bin/zfs-pre-snapshot= is *not* authored by archsetup —
+=git grep= for its content (=MIN_INTERVAL=, the pre-pacman =LOCKFILE= logic)
+finds nothing tracked. The =PreTransaction= hooks in the archsetup monolith
+(~lines 910, 1907, 1942) are the live-update guard, a different hook. The
+script appears hand-placed on velox.
+
+The 2026-01-17 security doc line "ZFS pre-pacman snapshots (already in
+install-archzfs)" is therefore *out of date* — archsetup does not install this.
+Incorporating the fix is a NET-NEW installer step, not a patch to an existing
+one. Correct that stale doc line as part of the work.
+
+velox was patched live (pruned to 10, script replaced with the self-pruning
+version below); live backup at =/usr/local/bin/zfs-pre-snapshot.bak-2026-06-29=.
+
+* Proposed installer step
+
+In the archzfs / ZFS-on-root install path, gated to ZFS-root installs (velox is
+the only ZFS daily driver; ratio is btrfs), install:
+
+1. =/etc/pacman.d/hooks/zfs-snapshot.hook= — the =PreTransaction= hook that
+ runs the script. *Not included in the handoff* — source it from velox
+ (=/etc/pacman.d/hooks/zfs-snapshot.hook=) or write it.
+2. =/usr/local/bin/zfs-pre-snapshot= — the =KEEP=10= self-pruning version
+ below.
+
+Tests live in archsetup, so this wants an archsetup session and a ZFS-root VM
+test (=make test FS_PROFILE=zfs=), not a cross-project edit from home.
+
+* The script (KEEP=10 self-pruning version)
+
+#+begin_src bash
+#!/bin/bash
+POOL="zroot"
+DATASET="$POOL/ROOT/default"
+LOCKFILE="/tmp/.zfs-pre-snapshot.lock"
+MIN_INTERVAL=60
+KEEP=10 # how many pre-pacman snapshots to retain (rollback safety for recent transactions)
+
+# Skip if a snapshot was created within the last 60 seconds
+if [ -f "$LOCKFILE" ]; then
+    last=$(stat -c %Y "$LOCKFILE" 2>/dev/null || echo 0)
+    now=$(date +%s)
+    if (( now - last < MIN_INTERVAL )); then
+        exit 0
+    fi
+fi
+
+TIMESTAMP=$(date +%Y-%m-%d_%H-%M-%S)
+SNAPSHOT_NAME="pre-pacman_$TIMESTAMP"
+
+if zfs snapshot "$DATASET@$SNAPSHOT_NAME"; then
+    echo "Created snapshot: $DATASET@$SNAPSHOT_NAME"
+    touch "$LOCKFILE"
+
+    # Retention: keep only the most recent $KEEP pre-pacman snapshots, destroy older ones.
+    # Sanoid does not manage these (they aren't autosnap_), so prune them here at creation time.
+    zfs list -H -o name -t snapshot -s creation "$DATASET" 2>/dev/null \
+        | grep '@pre-pacman_' \
+        | head -n -"$KEEP" \
+        | while read -r old; do
+            zfs destroy "$old" && echo "Pruned old snapshot: $old"
+        done
+else
+    echo "Warning: Failed to create snapshot" >&2
+fi
+#+end_src
+
+* Open items before implementation
+
+- Source or write =/etc/pacman.d/hooks/zfs-snapshot.hook= (the trigger).
+- Decide the exact insertion point in the ZFS-root install path.
+- Add a ZFS-root VM test asserting the hook + script land and the script
+ self-prunes past =KEEP=.
+- Correct the stale 2026-01-17 security-doc line.
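
The self-prune assertion listed in the open items can be checked without a pool by isolating the filter stage of the retention pipeline. The following is a minimal sketch, not part of the committed script: `prune_candidates` and `fake_list` are hypothetical names, the snapshot names are made-up fixtures in the script's timestamp format, and GNU `head` is assumed because negative line counts are a GNU extension.

```shell
#!/bin/bash
# Hypothetical helper (not in the design doc): given snapshot names on stdin,
# oldest first, print the pre-pacman snapshots the script would destroy.
# Assumes GNU head, where a negative count means "all but the last N lines".
prune_candidates() {
    local keep="$1"
    grep '@pre-pacman_' | head -n -"$keep"
}

# Fixture standing in for `zfs list -H -o name -t snapshot -s creation`:
# 12 pre-pacman snapshots plus one Sanoid autosnap that must never be selected.
fake_list() {
    local i
    for i in $(seq -w 1 12); do
        echo "zroot/ROOT/default@pre-pacman_2026-06-${i}_12-00-00"
    done
    echo "zroot/ROOT/default@autosnap_2026-06-29_daily"
}

# With KEEP=10 this prints only the two oldest pre-pacman names.
fake_list | prune_candidates 10
```

Feeding the fixture through the helper with a keep count of 10 selects the two oldest pre-pacman entries and leaves the autosnap entry alone. A keep count at or above the number of pre-pacman snapshots selects nothing, which is the case a freshly installed system would hit.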
diff --git a/todo.org b/todo.org
index b72e9d4..98c6ed3 100644
--- a/todo.org
+++ b/todo.org
@@ -21,6 +21,26 @@ The vocabulary is open — topic tags are coined as needed — so these are conv
- *Effort / autonomy*: =:quick:= a spare-moment fix (minutes, not a sitting); =:solo:= Claude can carry it end to end — there's a build path, a test path, and no upfront decision needed (a leftover manual spot-check doesn't disqualify it).
- *Topic / area* (open): the subsystem a task touches — e.g. =:hyprland:= =:waybar:= =:mpd:= =:music:= =:network:= =:tooling:= =:llm:= =:eask:= =:pocketbook:= =:cmail:=. Coin a new one when it aids filtering.
* Archsetup Open Work
+** TODO [#B] ZFS pre-pacman snapshot installer step (ZFS-root) :feature:zfs:
+Add a ZFS-root-gated installer step that installs the pre-pacman snapshot pacman hook plus a self-pruning =/usr/local/bin/zfs-pre-snapshot= (KEEP=10). The script is hand-placed on velox, not authored by archsetup, so a reinstall loses it; snapshots accumulated unbounded (53 since April) because nothing prunes them and Sanoid ignores non-autosnap_ names. Gate to ZFS-root (velox; ratio is btrfs). Also correct the stale 2026-01-17 security-doc line claiming it's "already in install-archzfs". Needs the hook file (source from velox) and a ZFS-root VM test.
+
+Design notes and the KEEP=10 script: [[file:docs/design/2026-06-29-zfs-pre-snapshot-installer.org]]. Origin: home handoff 2026-06-29.
+
+** TODO [#B] Consistent red=off across waybar toggle modules :waybar:
+Extend the red=off convention (just added to the touchpad/mouse indicator) to the other toggles — sound volume, microphone mute, and caffeine — so a disabled / muted / off state reads red across the board. Skip the "cross"/slash; the color alone carries it. Origin: roam inbox capture.
+
+** TODO [#B] Microphone-mute keybind :feature:waybar:quick:
+A keyboard shortcut to toggle the mic mute. The pulseaudio#mic module shows the state but there's no hotkey to flip it. Wire a hyprland bind to a mic-mute toggle. Origin: roam inbox capture.
+
+** TODO [#B] File-manager swallow pattern :feature:hyprland:
+When the file manager launches another app, it should hide to a special workspace (the "swallow" pattern) and return when that process ends, rather than vanishing. Today it disappears with no signal of whether it's coming back, so the user can't tell success from failure — they should quit explicitly instead. Origin: roam inbox capture.
+
+** TODO [#C] Keybind hints in waybar module tooltips :waybar:
+Every module's hover tooltip should list its keyboard shortcut(s), for discoverability. Audit the modules and add the bindings to each tooltip. Origin: roam inbox capture.
+
+** TODO [#C] Smooth waybar expansion animation :waybar:
+The cluster expansion jumps instead of animating, and a few systray icons pop in one-by-one afterward, which reads as glitchy. Animate the expansion smoothly if waybar allows it — width transitions are limited, so feasibility is uncertain (hence [#C]). Origin: roam inbox capture.
+
** TODO [#B] Scrolling/Carousel layout: frame fit + wrap-around :hyprland:
:PROPERTIES:
:LAST_REVIEWED: 2026-06-13
@@ -129,26 +149,55 @@ Deferred to Phase 2/3: archsetup deps (gtk4-layer-shell/python-gobject Phase 2,
speedtest-go-bin Phase 3 — not added before the code that needs them).
Verify (manual, live): see Manual testing and validation.
-*** TODO Phase 2 — panel shell + connection management :network:
-Deliverable: GTK4 + gtk4-layer-shell panel (pocketbook scaffold); =net list/up/
-down/add/edit/remove/rescan= (open + WPA-PSK; enterprise activate-only); MRU list
-with live signal; mutation safety + rollback (keep prior link until target
-activates, no stranding); panel state machines; the panel UX flow (default focus,
-primary buttons, disabled rules, confirmation wording, keyboard nav).
-Tests: fake-nmcli command-sequence assertions (UUID-keyed, escaped parsing:
-colon/backslash/newline/duplicate/hidden/non-ASCII); rollback keeps prior link on
-failed switch; NM-secret write + no-secret-leak; panel state-machine transitions.
-Verify (manual, live): see Manual testing and validation.
-
-*** TODO Phase 3 — diagnostics + speed test in the panel :network:
-Deliverable: wire =net diagnose= / =net repair= / =net doctor= / =net portal= /
-=net speedtest= into the Diagnose (read-only) vs Repair (mutating, confirmed)
-sections; "Get me online" with live escalation reporting; portal Open button;
-speedtest (=speedtest-go --json=) progress + cancel; failure-mode → exact-string
-rendering across surfaces.
-Tests: diagnose read-only; each repair tier confirms + verifies cleanup (DNS
-override reverts → cleanup_verified, else cleanup-unverified); speedtest parse from
-fixture JSON + fixture stderr failure messages.
+*** 2026-06-29 Mon @ 22:19:25 -0400 Phase 2 shipped — panel shell + connection management
+Shipped to dotfiles (commits =4e7740f=..=24bcac5=, pushed). Engine: =net list= (saved
+MRU + in-range wifi scan, infrastructure types filtered), =net up/down= (UUID-keyed,
+mutation safety — keep prior link until target activates, classify wrong-password vs
+generic, report auto-reactivation), =net add/edit/remove/rescan= (open + WPA-PSK;
+enterprise activate-only; secret to NM's store, never our JSON/log — tested).
+
+Panel: a GTK-free PanelModel (selection, four state machines, the UX-flow enable
+rules, terminal states) + a GTK4 gtk4-layer-shell window (=net panel=) anchored
+top-right under the bar — Connections section with MRU list, active marked, signal
+glyph, row-click select, Connect/Add/Forget/Rescan, confirm-on-forget, worker-thread
+engine calls via GLib.idle_add. GTK imported lazily so the CLI/tests stay GTK-free.
+
+Bar interactions (settled with Craig over live iteration): left = =net-panel= toggle,
+middle = =net portal=, right = =net-fix= (notify the doctor result when one-way; open
+a terminal only when the outcome is fixable — the sudo/interactive case). Airplane on
+Super+Shift+A. archsetup adds =gtk4-layer-shell= + =python-gobject= (this commit);
+already on velox.
+
+Tests: 204 in tests/net (merge ordering/dedup, up/down mutation safety, no-secret-leak
+on add/edit, panel model + state machines, gui row-format helpers). Full dotfiles suite
+green (32 suites). Live-verified on velox: panel opens/toggles, list shows real 24
+profiles, right-click notification delivers (Craig confirmed). Phase 3 (diagnose/repair/
+speedtest IN the panel) is next; the engine for it already exists from Phase 1.
+
+*** 2026-06-29 Mon @ 22:43:40 -0400 Phase 3 shipped — diagnostics + speed test in the panel
+Shipped to dotfiles (=91277cf=..=691abcb=) + archsetup (=48052d6=, speedtest-go-bin),
+pushed. Engine: =net speedtest= (parses speedtest-go --json → ping from latency ns,
+down/up from per-server byte rates; missing-backend / offline / malformed → error
+envelope per the failure table). Panel grew a section switcher with four pages:
+- Connections (Phase 2).
+- Diagnose: =net diagnose= on a worker thread, each step a row (✓/✗/… glyph + title +
+ redacted evidence), read-only; Open-portal button when captive.
+- Repair: "Get me online" (=net doctor --fix=) + tiers (rfkill/reset/bounce/dns-test)
+ + force portal. Confirmations in-panel with the spec's exact wording; the privileged
+ tiers run via =net-popup= terminal (where the sudo prompt + step output, incl.
+ cleanup-verified, show) — a panel has no tty, and pkexec would mean a prompt per op.
+- Speed test: in-process =net speedtest= (no privilege → inline result: ↓/↑ Mbps + ping
+ + server), Run/Cancel (Cancel pkills the child), error envelope shown.
+
+213 net tests; pure helpers (step_indicator, format_speedtest) unit-tested. Full
+dotfiles suite green (32 suites). One unverified assumption: speedtest-go's dl/ul unit
+(taken as bytes/s; =BYTES_PER_SEC= flips it) — needs one real run vs a reference. The
+in-panel repair streaming (vs terminal) is a named future polish once the GUI-privilege
+story settles.
+
+The waybar network module ([#B] parent) is now COMPLETE through Phase 3. Phase 4
+(in-app help + user guide) and Phase 5 (VPN/WireGuard) remain as future work; the core
+feature (indicator + recovery + panel + diagnostics + speed test) is done.
Verify (manual, live): see Manual testing and validation.
*** TODO Phase 4 — docs + rollout :network:
@@ -787,7 +836,10 @@ What we're verifying: the physical keychord opens a floating Dirvish popup; open
- Expected: GUI nautilus opens (the binding nautilus moved to)
*** Network module Phase 1 — indicator states on the live bar
What we're verifying: =custom/net= shows the right state for each real network condition. The engine logic is unit-tested; this is the live-bar + visual check (states can't be faked on the running bar). Phase 2-3 tests get added under this task as those phases land.
-- Reload waybar to pick up =custom/net= (Super+B, or =pkill waybar; waybar &=).
+- Reload waybar to pick up =custom/net=. Super+B does NOT reload a running bar — it only toggles visibility (SIGUSR1), and the bar reads a generated runtime config, so a stale copy keeps the old module. The correct reload regenerates the runtime config then restarts:
+#+begin_src sh :results output
+waybar-active-config && killall waybar && waybar-toggle
+#+end_src
- On a normal connected network, read the module.
- Expected: wifi glyph + signal + SSID; tooltip shows IPv4, gateway, throughput, and a recent "online" probe result.
- Join the hotel/captive network (or any portal network).
@@ -805,12 +857,25 @@ rfkill list wifi # confirm Soft blocked: yes
make -C ~/.dotfiles online # or: net doctor --fix
#+end_src
- Expected: doctor reports the rfkill block, runs =rfkill unblock wifi= + =nmcli radio wifi on=, reconnects, and ends "online" — all from the TTY.
-*** Network module Phase 1 — airplane state absorbed, display-only (option 1)
-What we're verifying: =custom/net= shows the airplane state, the toggle stays the airplane-mode low-power script (now on the net module's right-click), and only the redundant *display* pieces were removed. Craig's call: net absorbs the display, not the low-power orchestration.
-- Right-click =custom/net= (it now runs =airplane-mode=).
-- Expected: airplane engages — wifi drops, brightness dims, CPU/services to low-power — and =custom/net= shows the airplane glyph in gold. Right-click again restores everything.
-- Check the bar has no separate =custom/airplane= module, and =waybar-airplane= / =waybar-netspeed= are gone from =~/.local/bin= (dangling symlinks removed).
-- Expected: no duplicate airplane indicator; =airplane-mode= itself is still present (=ls ~/.local/bin/airplane-mode= → exists), since the low-power toggle is not a network concern.
+*** Network module — bar clicks + airplane keybind (FINAL scheme)
+What we're verifying: the custom/net clicks and the airplane keybind. Clicks (settled with Craig over live use 2026-06-29): left = =net-panel= toggle (the GTK panel), middle = =net portal= (floating terminal), right = =net-fix= (notify the doctor result when one-way; open a terminal only when fixable). Airplane = Super+Shift+A.
+- Left-click =custom/net=.
+- Expected: the GTK connection panel toggles open (left-click again, or Esc, closes it).
+- Right-click =custom/net= while online.
+- Expected: a desktop notification "Network / Online" (success), no terminal. When a repair is needed it instead opens a terminal running =net doctor --fix=. (Craig confirmed the notification delivers, 2026-06-29.)
+- Middle-click =custom/net= on a captive network.
+- Expected: =net portal= runs in the floating terminal — reset + opens the portal page.
+- Press Super+Shift+A.
+- Expected: airplane engages (wifi off, dim, low-power); =custom/net= shows the airplane glyph in gold. Super+Shift+A again restores everything.
+- Check =airplane-mode= is still present (=ls ~/.local/bin/airplane-mode=), and =waybar-airplane= / =waybar-netspeed= / =custom/airplane= are gone.
+*** Network module Phase 3 — panel Diagnose / Repair / Speed test tabs
+What we're verifying: the four-tab panel works end to end. Left-click =custom/net= to open it.
+- Diagnose tab → "Run diagnose".
+- Expected: a list of steps (link, DHCP, gateway, DNS config, DNS resolution, internet) each with a ✓/✗/… glyph + evidence; on a captive network an "Open portal" button appears.
+- Repair tab → click Reset (or Bounce, or DNS override test).
+- Expected: a confirmation dialog with the exact wording (Reset names the network + new-MAC warning; Bounce "links drop briefly"; DNS test "reverts automatically"). Proceed opens a floating terminal that runs the repair (sudo prompt there) and shows the step output incl. cleanup-verified for the DNS test.
+- Speed test tab → "Run speed test" (uses ~30s + data — do it on real wifi, not the metered hotspot).
+- Expected: ↓/↑ Mbps + ping + server shown inline. CONFIRM THE NUMBERS are sane vs a reference (fast.com) — this verifies the byte-rate→Mbps unit. If off by ~8x, the =BYTES_PER_SEC= constant in =net/src/net/speedtest.py= flips.
** DOING [#B] Prepare for GitHub open-source release
:PROPERTIES: