From 4f0baa9cfb690a46c2b692b9b8c6de67bb5bc793 Mon Sep 17 00:00:00 2001
From: Craig Jennings
Date: Thu, 11 Jun 2026 12:59:23 -0500
Subject: chore(todo): close the VM-warning investigation — all five resolved
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 todo.org | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/todo.org b/todo.org
index cc4508c..c90c9bb 100644
--- a/todo.org
+++ b/todo.org
@@ -572,13 +572,12 @@ Root cause was in =retry_install=: =last_exit_code=$?= ran AFTER =if eval ...; t
 *** 2026-05-19 Tue @ 01:25:26 -0500 Verified the b9907c7 emacs-stow fix end-to-end
 =make test= 21:44 → 22:29 (42 min), =test-results/20260518-214516/=. 52/0/5, =ArchSetup Exit Code: 0=. The third-branch path fired correctly — install log =archsetup-2026-05-18-21-45-46.log:14358-14365= shows =From https://git.cjennings.net/dotemacs= → =[new branch] main -> origin/main= → =Reset branch 'main'= → =branch 'main' set up to track 'origin/main'=. No exit-128, no =fatal: not a git repository=. Error Summary down to 7 (was 13 on 2026-05-16); the emacs entry is gone. AUR exit-0 logging triggered for 2 packages this run (mkinitcpio-firmware, tidaler) vs 6 on 2026-05-16 — same bug class, fewer triggers, still tracked under =[#B] AUR exit-0 logged as error=. Issue Attribution: 1 ARCHSETUP entry (Proton VPN Daemon failed — known VM-no-VPN-config artifact). Cleanup ran clean via the normal path.
 
-** TODO [#C] Investigate the 2026-05-11 VM-test warnings
+** DONE [#C] Investigate the 2026-05-11 VM-test warnings
+CLOSED: [2026-06-11 Thu]
 :PROPERTIES:
-:LAST_REVIEWED: 2026-06-10
+:LAST_REVIEWED: 2026-06-11
 :END:
-The 18:36 =make test= run passed (52/0/5) but raised 5 validation warnings. Each is investigated below with a recommendation. Most look like headless-VM / QEMU-slirp false positives the test harness should skip rather than archsetup bugs — but a couple have a real archsetup angle worth checking. Source: =test-results/20260511-183643/test.log= (WARN lines) and =scripts/testing/lib/validation.sh=.
-
-2026-06-10: four of five resolved as harness skips (=ced91c4=, verified in the 19:06 run — warnings 5 → 2, and one of the two is the portal case whose refined skip lands next run). Only the lingering investigation below remains open.
+All five resolved. Four were environment-impossible checks converted to uncounted skips (=ced91c4= + the portal refinement =19015c7=) — socket, portal, mDNS-on-slirp, docker-pre-reboot — and all four skips verified firing in the 2026-06-11 12:56 run (52/0, 1 warning). The fifth (lingering) turned out to be a harness quoting bug, not a logind issue — fixed in =5b51900=, dated entry below. The next clean run should report zero warnings. The 18:36 =make test= run that filed this passed 52/0/5; the sub-entries below carry each investigation.
 
 *** 2026-06-10 Wed @ 19:07:54 -0500 Hyprland-socket warning converted to a skip
 Shipped in =ced91c4=: the check now passes when the socket exists, skips (uncounted) when no Hyprland process is running — the headless-VM state — and warns only in the genuinely odd case of a running compositor with no socket. Verified live: the skip fired in the 2026-06-10 19:06 run.
@@ -589,9 +588,8 @@ Shipped in =ced91c4= + a follow-up refinement: the first condition (portal proce
 *** 2026-06-10 Wed @ 19:07:54 -0500 mDNS-ping warning converted to a slirp-aware skip
 Shipped in =ced91c4=: when the VM is on QEMU slirp (a =10.0.2.x= address), the =.local= ping is skipped — multicast genuinely can't pass there — and the =is-enabled= check stands alone. On real networking the full ping test still runs and still warns on failure. Verified live: the skip fired in the 2026-06-10 19:06 run.
 
-*** TODO [#C] Warning: User lingering not enabled (syncthing may not autostart)
-=validation.sh:661= runs =loginctl show-user -p Linger= and warns if it isn't =yes=. archsetup *does* call =loginctl enable-linger "$username"= (=archsetup:1438= and =:1741=), and the install log shows =...enabling user-services lingering for cjennings @ 18:40:28= ran with no error — yet the check still says lingering is off. Likely the =logind=-unhappy-in-the-VM issue (the log-diff shows =logind: Failed to start session scope … Permission denied=) — =loginctl enable-linger= may return 0 but not actually create =/var/lib/systemd/linger/=, or the =show-user= query itself may be wrong while logind is degraded. Possibly a real concern *if* the same happens on bare metal; almost certainly a VM artifact otherwise.
-Recommendation: investigate with =make test-keep= — after a run, check =ls -l /var/lib/systemd/linger/cjennings= and =loginctl show-user cjennings -p Linger= on the VM. If the file exists but =loginctl= disagrees → logind/dbus health issue (cross-ref the logind errors in =[#B]=; the fix may be to ensure =systemd-logind= is healthy before =enable-linger=). If the file doesn't exist → =enable-linger= is silently no-op'ing in the VM; consider a fallback (=install -d /var/lib/systemd/linger && touch /var/lib/systemd/linger/$username=) or running it later in the install. Either way the =enable-linger= call in =archsetup= is wired correctly.
+*** 2026-06-11 Thu @ 12:58:19 -0500 Lingering warning was a harness quoting bug — fixed, hypothesis disproven
+make test-keep forensics on the kept VM: the linger file existed (created mid-install), =loginctl show-user cjennings -p Linger= said yes, logind active with zero errors — lingering was correctly enabled all along, so the logind-degraded hypothesis was wrong and archsetup's =enable-linger= calls were always fine. The actual bug was in the check itself (=validation.sh=): it captured =ls path && echo yes=, so a present file produced "path\nyes", which never string-equals "yes" — the check warned on every run regardless of state. Fixed in =5b51900= with =test -e=; the corrected expression verified returning "yes" against the live VM. With this, all five 2026-05-11 warnings are resolved and a clean run should report zero.
 
 *** 2026-06-10 Wed @ 19:07:54 -0500 Docker warning converted to a pre-reboot skip
 Shipped in =ced91c4=: =docker info= success still passes; enabled-but-inactive (the deliberate enable-not-now install state, validated pre-reboot) now skips; active-but-unresponsive still warns — that's the real failure case. Verified live: the skip fired in the 2026-06-10 19:06 run. The enable vs enable-now question for archsetup itself was left as-is (the daemon's weight makes enable-on-boot defensible).
@@ -746,17 +744,19 @@ CLOSED: [2026-06-10 Wed]
 :END:
 Done live on velox 2026-06-10. Hardware re-verified first (i915 graphics, ath9k wifi), then removed the meta + 12 subpackages (the task's 9 plus liquidio/mellanox/nfp/qlogic from the finer 2026 split), keeping intel + atheros + whence. The meta needed =-Rdd= — mkinitcpio-firmware declares a dep on it; the dangling dep is cosmetic. Initramfs rebuilt clean (warnings only for absent hardware), wifi stayed connected. Codified in archsetup commit =adb39f2= as a DMI-gated Framework-Intel block. Full confidence needs the next reboot — see Manual testing below.
 
-** TODO [#B] Identify and replace packages no longer in repos
+** DONE [#B] Identify and replace packages no longer in repos
+CLOSED: [2026-06-11 Thu]
 :PROPERTIES:
-:LAST_REVIEWED: 2026-05-21
+:LAST_REVIEWED: 2026-06-11
 :END:
-Systematic check for availability issues
+Shipped 2026-06-11 as =1f89523=: =scripts/audit-packages.sh= (unit-tested) makes the check repeatable, and its first run over 420 packages found four casualties, all fixed in the same commit — libva-mesa-driver (folded into mesa), nvidia-dkms → nvidia-open-dkms, swww → awww (set-theme's stale swww call fixed in dotfiles =4ea35a1=), libappindicator-gtk3 → libayatana-appindicator. Re-run anytime: =scripts/audit-packages.sh=.
 
-** TODO [#B] Verify package origin for all packages
+** DONE [#B] Verify package origin for all packages
+CLOSED: [2026-06-11 Thu]
 :PROPERTIES:
-:LAST_REVIEWED: 2026-05-21
+:LAST_REVIEWED: 2026-06-11
 :END:
-Ensure packages are installed from correct source (official repos vs AUR) - prevent installing from wrong place
+Covered by the same auditor (=1f89523=): it flags movers in both directions. Current state: zero official packages wrongly routed through aur_install-only territory; 15 aur_install entries have graduated to official repos (duf, flameshot, gist, inxi, nsxiv, nvm, papirus-icon-theme, ptyxis, qt5ct, qt6ct, ttf-lato, ueberzug, warpinator, xcolor, xdg-desktop-portal-hyprland). Left as-is deliberately — yay resolves repo packages fine — but switching them to pacman_install is a clean :quick: cleanup whenever wanted; the auditor lists them on every run.
 
 ** DONE [#B] Automate script usage tracking :solo:
 CLOSED: [2026-06-10 Wed]
-- 
cgit v1.2.3
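
Note on the lingering entry above: the quoting bug it describes can be reproduced with a minimal sketch like the one below. This is not the code from =validation.sh= (the patch does not show it); the variable names and the temp file standing in for the linger file are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the linger file; the real check looks at a
# path under /var/lib/systemd/linger/ on the test VM.
linger_file="$(mktemp)"

# Buggy shape: ls prints the path, then echo appends "yes", so the
# captured value is "<path>\nyes" and never the bare string "yes".
buggy="$(ls "$linger_file" && echo yes)"

# Fixed shape: test -e prints nothing, so only "yes" is captured.
fixed="$(test -e "$linger_file" && echo yes)"

[ "$buggy" = "yes" ] && buggy_result="match" || buggy_result="no match"
[ "$fixed" = "yes" ] && fixed_result="match" || fixed_result="no match"

printf 'buggy: %s\nfixed: %s\n' "$buggy_result" "$fixed_result"
rm -f "$linger_file"
```

With the file present, the buggy capture reports "no match" and the fixed one reports "match", which is consistent with the entry's observation that the old check warned on every run regardless of state.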