path: root/scripts/testing/lib
Commit message | Author | Age | Files | Lines
* fix(test): bump default VM RAM to 8 GiB to stop AUR-build OOM kills | Craig Jennings | 19 hours | 1 | -1/+3
  The zfs green run OOM-killed cc1plus three times during AUR C++ builds: makepkg runs -j$VM_CPUS (4), and parallel compiles at ~700 MB each overran the 4 GiB default. The install still passed (yay retries), but the kills showed up as attributed issues. 8 GiB gives the four jobs headroom. Overridable via VM_RAM as before.
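The overridable default described above can be sketched as follows. This is a minimal illustration, not the harness's code: the helper name `resolve_vm_ram` is hypothetical, and the assumption that VM_RAM is expressed in MiB (so 8 GiB is 8192) is mine.

```shell
#!/bin/sh
# Hypothetical helper: resolve the VM memory size, defaulting to 8 GiB.
# Assumption: VM_RAM is given in MiB; the real harness may use another unit.
resolve_vm_ram() {
    ram="${VM_RAM:-8192}"
    # Reject anything that is not a plain non-negative integer.
    case "$ram" in
        ''|*[!0-9]*)
            echo "VM_RAM must be an integer number of MiB" >&2
            return 1
            ;;
    esac
    printf '%s\n' "$ram"
}
```

With VM_RAM unset the helper prints the default; `VM_RAM=4096` would restore the old 4 GiB ceiling for a quick comparison run.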
* fix(test): give each filesystem profile its own OVMF NVRAM file | Craig Jennings | 24 hours | 1 | -1/+5
  init_vm_paths suffixed the disk image per profile but shared one OVMF_VARS.fd across btrfs and zfs. NVRAM holds the UEFI boot entries and lives outside the qcow2, so a disk-snapshot revert can't restore it. A zfs run's ZFSBootMenu entries clobbered the btrfs GRUB entry, and with no removable ESP fallback the btrfs base then booted to "no bootable device" and timed out before archsetup ran.

  NVRAM now carries the same per-profile suffix as the disk image, so the two profiles keep separate boot state. Validated by a full green zfs run (ArchSetup exit 0, Testinfra 96 passed / 0 failed).
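A sketch of the per-profile path derivation, under stated assumptions: the image names (archsetup-base.qcow2, archsetup-base-zfs.qcow2) come from this log, while the function name `init_vm_paths_sketch`, the `VM_DIR` variable, and the exact NVRAM file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: derive disk and NVRAM paths from FS_PROFILE.
# VM_DIR and the OVMF_VARS*.fd names are assumptions for illustration.
init_vm_paths_sketch() {
    profile="${FS_PROFILE:-btrfs}"
    vm_dir="${VM_DIR:-/tmp/archsetup-vms}"
    case "$profile" in
        btrfs) suffix="" ;;       # legacy unsuffixed names stay valid
        zfs)   suffix="-zfs" ;;
        *)
            echo "unknown FS_PROFILE: $profile (expected btrfs or zfs)" >&2
            return 1
            ;;
    esac
    DISK_PATH="$vm_dir/archsetup-base${suffix}.qcow2"
    # Same suffix on NVRAM, so each profile keeps its own UEFI boot entries.
    NVRAM_PATH="$vm_dir/OVMF_VARS${suffix}.fd"
}
```

Deriving both paths from one suffix means a new profile cannot get a private disk while silently sharing boot state with another.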
* refactor(testing): delete the dead validation.sh shell sweep | Craig Jennings | 3 days | 1 | -842/+0
  Both runners now validate through run_testinfra_validation, so the shell sweep validation.sh ran is dead. Delete run_all_validations, validate_all_services, run_full_validation, the ~35 validate_* checks, and validation_pass/fail/warn/skip (called only by those checks).

  Keep the live helpers the runners and testinfra.sh still use: ssh_cmd, attribute_issue, capture_pre/post_install_state, analyze_log_diff, categorize_errors, generate_issue_report, and the VALIDATION_* counters plus issue arrays. The file drops from 1156 lines to 314. Closes the P5 follow-up from the Testinfra cutover.
* test(archsetup): migrate bare-metal runner to key auth + Testinfra | Craig Jennings | 3 days | 1 | -1/+3
  run-test-baremetal.sh SSHed to the target as root by password throughout, which archsetup's sshd hardening (PermitRootLogin prohibit-password) kills mid-install, the same break the VM runner already fixed. It also still called the validation.sh shell sweep (run_all_validations, validate_all_services, validate_zfs_services), the last caller keeping those functions alive.

  It now mirrors the VM runner. After the first SSH, and after any genesis rollback so the key survives it, inject_root_key authorizes a throwaway root key, and every later ssh_cmd plus the raw scp transfers and log-copies thread SSH_KEY_OPT to survive the hardening. The shell sweep is replaced with run_testinfra_validation, now the authoritative validator on both runners.

  A --port option, threaded through every SSH and scp, lets the runner target a test VM on 2222 instead of only real hardware on 22. inject_root_key now authorizes root@$VM_IP instead of root@localhost, so one helper serves both runners (the VM runner sets VM_IP=localhost).

  Validated against the ZFS VM (--validate-only, localhost:2222): connectivity, the ZFS check, key authorization, and the Testinfra sweep all connect and run over the key-based ssh-config. A green bare-metal install still needs real ZFS hardware.
* test(archsetup): add FS_PROFILE selector for ZFS VM coverage | Craig Jennings | 3 days | 1 | -1/+16
  The VM harness only built one btrfs base image, so every ZFS-conditional check in the Testinfra suite skipped and the ZFS install path went untested in automation. I added an FS_PROFILE selector (btrfs default, zfs) so `make test FS_PROFILE=zfs` can target a ZFS root.

  init_vm_paths derives the image name from FS_PROFILE and validates it. btrfs keeps the legacy unsuffixed archsetup-base.qcow2 so existing images and invocations are untouched. The zfs profile gets archsetup-base-zfs.qcow2. create-base-vm.sh picks archsetup-test.conf vs the new archsetup-test-zfs.conf (FILESYSTEM=zfs, NO_ENCRYPT=yes for an unattended install), and the Makefile resolves the matching image for its base-VM check.

  The archsetup run config stays shared. archsetup reads no filesystem key. It detects ZFS from the live root via is_zfs_root, so the ZFS branch fires on its own once the base image is ZFS. The design doc is reconciled to that: no separate archsetup-vm-zfs.conf, and the non-ZFS profile is btrfs, not ext4.

  Building the ZFS base image and running the ZFS sweep green is next.
* test(archsetup): make Testinfra the authoritative validator (P3 cutover) | Craig Jennings | 4 days | 1 | -24/+57
  run-test.sh no longer runs the shell run_all_validations sweep; the Testinfra pytest sweep now drives the run's pass/fail. run_testinfra_validation returns pytest's exit code (and treats "could not run" as a failure, not a silent pass), surfaces the pass/skip/fail counts through the shared VALIDATION_* counters, and parses the attribution file so generate_issue_report still buckets failures into archsetup / base_install / unknown.

  The shell-sweep functions stay in validation.sh for now because run-test-baremetal.sh still calls them; removing them (after migrating the bare-metal runner) is filed as a follow-up.
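The "could not run is a failure" rule can be illustrated with a small wrapper. This is a sketch under assumptions, not the real run_testinfra_validation: the function name `run_validator_sketch` is hypothetical, and it takes any command so the behavior can be seen without pytest installed.

```shell
#!/bin/sh
# Sketch: propagate the validator's exit code, and fail loudly when the
# validator cannot be started at all (instead of silently passing).
run_validator_sketch() {
    runner="$1"
    shift
    if ! command -v "$runner" >/dev/null 2>&1; then
        echo "validator not runnable: $runner" >&2
        return 1
    fi
    "$runner" "$@"
    return $?
}
```

In the harness the runner would be pytest, which exits 0 only when every collected test passes, so returning its code unchanged lets the sweep decide the run's pass/fail.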
* fix(testing): authorize a root key so make test survives sshd hardening | Craig Jennings | 4 days | 3 | -19/+56
  The VM test SSHes into the guest as root with a password for the whole run. archsetup hardens sshd to PermitRootLogin prohibit-password and reloads it partway through the install, so every SSH after that step failed with "Permission denied" and the run aborted before any validation. make test had been silently broken since the hardening landed.

  inject_root_key authorizes a throwaway root key right after the first SSH (before archsetup runs) and the ssh/scp helpers now add -i <key> via SSH_KEY_OPT. prohibit-password still allows root key auth, so the harness survives the very hardening it validates. Password stays as the fallback, so the change is additive.
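How an optional key flag can stay additive is sketched below. SSH_KEY_OPT is named in the commit; the helper name `set_ssh_key_opt` and the `SSH_KEY` variable are hypothetical, and the real helpers may build the option differently.

```shell
#!/bin/sh
# Sketch: set SSH_KEY_OPT to "-i <key>" only when a readable key exists,
# so callers fall back to password auth when no key has been injected.
set_ssh_key_opt() {
    SSH_KEY_OPT=""
    if [ -n "${SSH_KEY:-}" ] && [ -r "$SSH_KEY" ]; then
        SSH_KEY_OPT="-i $SSH_KEY"
    fi
}

# Intended use (left unquoted on purpose so an empty value expands to
# nothing; this breaks on key paths containing spaces):
#   ssh $SSH_KEY_OPT -p 2222 root@localhost true
```

An empty SSH_KEY_OPT leaves the ssh command line exactly as it was before the change, which is what keeps the password path working.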
* test(archsetup): scaffold Testinfra post-install validation (P1) | Craig Jennings | 4 days | 1 | -0/+81
  Stand up the Testinfra/pytest harness alongside the existing shell sweep so the two can be compared for parity before pytest takes over. Adds scripts/testing/tests/ (conftest with failure attribution markers, a report hook, and a target_user fixture, plus three parity checks: user, ufw, dotfiles) and scripts/testing/lib/testinfra.sh, which injects a throwaway SSH key into the VM and runs pytest over SSH. The sweep is advisory here (RUN_TESTINFRA toggle, non-fatal) and does not yet affect pass/fail. Pulls python-pytest and python-pytest-testinfra into make deps.

  Verified on the host: py_compile clean, pytest --collect-only green, bash -n and shellcheck clean. The sweep running against a real VM is verified by the next make test run.
* chore: open-source release-prep (udev flag, SPDX headers, boolean style) | Craig Jennings | 4 days | 4 | -0/+4
  Three release cleanups, all behavior-preserving for my machines:
  - Gated the Logitech BRIO udev rule behind INSTALL_DEVICE_UDEV_RULES (default yes, opt-out), so anyone without that hardware can turn the device-specific rule off. Added the config read, validation, and a conf.example entry.
  - Added a GPL-3.0-or-later SPDX-License-Identifier header after the shebang of all 24 shell scripts in the repo.
  - Standardized boolean conditionals on the explicit [ "$var" = "true" ] form, replacing the bare `if $var` idiom. The STEPS function-dispatch is left alone, since it runs a function name rather than testing a boolean.
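The commit does not state why the explicit form was chosen. One commonly cited difference, shown here as an illustration with a hypothetical `flag` variable, is how the two idioms treat an empty value: the bare form executes the variable's contents as a command, and an empty expansion counts as success.

```shell
#!/bin/sh
# Illustration only: compare the two boolean idioms on an empty value.
flag=""

# Bare idiom: $flag expands to no command at all, which exits 0,
# so the "true" branch is taken even though flag is not "true".
if $flag; then bare="true-branch"; else bare="false-branch"; fi

# Explicit idiom: a plain string comparison, nothing is executed.
if [ "$flag" = "true" ]; then explicit="true-branch"; else explicit="false-branch"; fi

echo "bare=$bare explicit=$explicit"
```

The same property explains why the STEPS dispatch is a different case: there, running the variable's contents as a command is the point.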
* fix(testing): lingering check could never pass; ls output broke the capture | Craig Jennings | 2026-06-11 | 1 | -2/+5
  The check captured 'ls path && echo yes', so a present linger file produced 'path\nyes', which never string-equals yes. Every run warned regardless of actual state. Forensics on a kept VM showed lingering correctly enabled all along (file present mid-install, loginctl Linger=yes, logind healthy): the original VM-artifact hypothesis was wrong, and archsetup's enable-linger calls were always fine. test -e captures cleanly; verified returning 'yes' against the live VM.
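The capture bug can be reproduced locally. This is a sketch: the harness runs the equivalent commands over SSH, and the temporary file here is only a stand-in for the real linger file.

```shell
#!/bin/sh
# Local reproduction of the capture bug; function names are hypothetical.
check_with_ls()   { ls "$1" 2>/dev/null && echo yes; }   # prints the path, then "yes"
check_with_test() { test -e "$1" && echo yes; }          # prints only "yes"

linger_file="$(mktemp)"   # stand-in for the real linger file
broken="$(check_with_ls "$linger_file")"
fixed="$(check_with_test "$linger_file")"
rm -f "$linger_file"

if [ "$broken" = "yes" ]; then echo "ls form matched"; else echo "ls form never matches"; fi
if [ "$fixed" = "yes" ]; then echo "test -e form matched"; fi
```

The `ls` capture holds two lines (the path and `yes`), so the string comparison fails even though the file exists; `test -e` writes nothing of its own, leaving only `yes`.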
* fix(testing): key the portal-query skip on the compositor, close warning tasks | Craig Jennings | 2026-06-10 | 1 | -3/+5
  The 19:06 verification run showed the portal skip not firing: a socket-activated xdg-desktop-portal process exists even headless, so the process check was the wrong precondition. The skip now keys on a running Hyprland, same as the socket check. That run confirmed the other three skips live (warnings dropped from 5 to 2); the remaining counted warnings are this portal case and the lingering question, which stays open.
* fix(testing): skip environment-impossible checks instead of warning | Craig Jennings | 2026-06-10 | 1 | -8/+35
  Four warnings fired on every headless VM run, training the reader to ignore the warning count: the Hyprland socket and portal queries (no graphical login), the mDNS ping (slirp passes no multicast), and docker-not-responding (enabled but deliberately not started pre-reboot). Each now detects its precondition and logs a skip that counts nowhere; the warn paths stay for the cases that are real (compositor running without a socket, portal running but unqueryable, mDNS failing on real networking, docker active but dead). The lingering warning stays; it needs its own investigation.
* fix(testing): expect minimal/ tree for the .zshrc symlink on DESKTOP_ENV=none | Craig Jennings | 2026-06-10 | 1 | -2/+5
  The dotfiles validation hardcoded .dotfiles/common/.zshrc, but a none install stows the standalone minimal/ tree, so the first none-run ever to reach validation failed on a correct symlink. The expected path now follows DESKTOP_ENV from the VM conf.
* fix(testing): expect ~/.dotfiles symlink target in dotfiles validation | Craig Jennings | 2026-05-22 | 1 | -2/+4
* fix(testing): cleanup traps, arg validation, and two real bugs | Craig Jennings | 2026-05-17 | 3 | -6/+29
  Two real bugs and a sweep of hygiene across the harness. `make test` passed cleanly on this branch with the same 52/0/5 profile as the 2026-05-11 run, so the wiring is verified end-to-end.

  Real bugs:
  - `lib/vm-utils.sh` `snapshot_exists` was running `qemu-img snapshot -l | grep -q "$snapshot_name"`, which matches the name as a substring anywhere in the output, including inside dates or filenames in other fields. Replaced with an awk field extraction on the TAG column plus `grep -Fxq` for a whole-line literal match.
  - `run-test-baremetal.sh` was setting `VALIDATION_PASSED=true|false` after validation, but `validation.sh` already uses `VALIDATION_PASSED` as a pass counter. The test report then referenced `$VALIDATION_PASSED_COUNT`, which is defined nowhere. Renamed the boolean to `TEST_PASSED` (matching run-test.sh's pattern) and report the actual counter.

  Cleanup traps and arg validation:
  - `run-test.sh` now installs a top-level EXIT trap that, on abort, kills QEMU and restores the clean-install snapshot. A `CLEANUP_DONE=1` sentinel keeps the existing normal-path cleanup from double-firing. This is the recurring pain from 2026-05-11 where two failed runs left orphaned QEMU processes and dirty base disks behind.
  - `create-base-vm.sh` and `debug-vm.sh` got the same kind of trap, plus `debug-vm.sh` now rejects non-`.qcow2` paths up front instead of letting QEMU fail later.
  - `run-test.sh`, `run-test-baremetal.sh`, and `cleanup-tests.sh` now validate that options with required values actually receive one (`${var:?msg}` for `--script`/`--snapshot`/`--host`/`--password`, numeric check for `--keep`).
  - `run-test-baremetal.sh` traps the temp git bundle for cleanup if the script aborts before its explicit `rm`. The ZFS rollback loop now uses `while IFS= read -r ds` and quotes `$ds` inside the ssh_cmd so dataset names with whitespace wouldn't break it.

  Smaller hygiene:
  - `vm-utils.sh` `check_ovmf` also checks `OVMF_VARS_TEMPLATE`; `start_qemu` validates disk and ISO paths before building the QEMU command; numeric tests quoted.
  - `cleanup-tests.sh` find expression for temp disks wrapped in `\( ... -o ... \)`, all `while read` loops use `IFS= read -r`, orphaned QEMU cleanup tries SIGTERM with a 2s sleep before SIGKILL.
  - `create-base-vm.sh` moved the "Copy an archangel-*.iso" info line before its `fatal` instead of after (unreachable), and added the serial-log path to the final summary.
  - `lib/logging.sh` `stop_timer` no longer produces `$((end - ))` when the named timer was never started.
  - `lib/network-diagnostics.sh` `read` → `IFS= read -r`.
  - `setup-testing-env.sh` now installs all missing pacman packages in one transaction instead of one-at-a-time (avoids half-installed state if package N fails). KVM check also verifies the user has read+write on `/dev/kvm` and prints the `gpasswd -a $(id -un) kvm` fix if not.

  A few items from the review I deliberately skipped: replacing the codebase-wide unquoted `$SSH_OPTS` string with an array (cosmetic, would need to be done everywhere at once), `set -e` adds where the existing fall-through-on-failure is intentional, and a `--force` gate on `create-base-vm.sh` (would break the expected workflow).
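The `snapshot_exists` fix (TAG-column extraction plus a whole-line literal match) might look like the sketch below. It reads the listing from stdin so it can be tried without a disk image. The function name `snapshot_in_listing` is hypothetical, and the layout assumption (two header lines, TAG as the second whitespace-separated field, no spaces in tags) is mine rather than the commit's.

```shell
#!/bin/sh
# Sketch: succeed only if a snapshot tag equals the name exactly.
# Assumes `qemu-img snapshot -l` prints two header lines and puts the
# TAG in the second column; tags containing spaces are not handled.
snapshot_in_listing() {
    name="$1"
    # -F: literal string, -x: whole line must match, -q: quiet.
    awk 'NR > 2 { print $2 }' | grep -Fxq -- "$name"
}

# Intended use against a real image:
#   qemu-img snapshot -l "$disk" | snapshot_in_listing "clean-install"
```

With the old substring grep, a lookup for `clean` would have succeeded against a `clean-install` tag, and a date-like name could match the DATE column; restricting the match to the TAG field avoids both.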
* chore(scripts): drop dead and superseded scripts | Craig Jennings | 2026-05-16 | 1 | -21/+0
  Audit pass: each of these had no references anywhere in the repo (excluding self-references and review notes).
  - wip-bootcandy.sh: "wip" prefix, non-executable. Comments mention a boot animation but the script only installs ly and disables getty@tty2.
  - protonmail-bridge.sh: `pacman_install protonmail-bridge` (the package landed in extra) plus cmail-setup-finish.sh now cover this.
  - wireguard-proton.sh: hardcoded USGA tunnel and a relative `../assets/wireguard-config/*.conf` path that depends on the caller's pwd.
  - create-archiso-zfs.sh: one-off ISO build snippet, non-executable.
  - scripts/testing/lib/finalize-base-vm.sh: libvirt-era leftover. The test stack moved to direct QEMU and nothing sources or calls it.
* fix(testing): drop stale plugin checks, count failed validations | Craig Jennings | 2026-05-11 | 1 | -51/+0
  validate_hyprland_plugins and validate_hyprpm_hook checked for the hyprland-plugins-setup script and the hyprpm pacman hook, both removed in 4a3056a (Hyprland 0.54 brings the layouts into core). I deleted the two functions and their calls in validate_window_manager.

  I also disabled errexit in run-test.sh from the validation phase onward, so one failed check is counted in VALIDATION_FAILED instead of aborting the run before the report or VM cleanup. About 16 validations across the file do a bare `return 1` after `validation_fail`; any of them firing under the previous behavior would have killed the harness mid-run.
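The effect of dropping errexit for the validation phase can be sketched as follows. The counters use the names from the log, but the two checks and the counting loop are hypothetical stand-ins; in the real harness the checks update the counters themselves.

```shell
#!/bin/sh
# Sketch: earlier phases abort on error, the validation phase counts failures.
set -e                          # install phase: stop at the first error

VALIDATION_PASSED=0
VALIDATION_FAILED=0
validate_ok()  { return 0; }    # hypothetical passing check
validate_bad() { return 1; }    # hypothetical check ending in a bare `return 1`

set +e                          # validation phase: keep going after a failure
for check in validate_ok validate_bad validate_ok; do
    "$check"                    # under `set -e` a failure here would exit the script
    rc=$?
    if [ "$rc" -eq 0 ]; then
        VALIDATION_PASSED=$((VALIDATION_PASSED + 1))
    else
        VALIDATION_FAILED=$((VALIDATION_FAILED + 1))
    fi
done

echo "passed=$VALIDATION_PASSED failed=$VALIDATION_FAILED"
```

Because the loop survives the failing check, the script still reaches its final line, which is where the report and VM cleanup would run.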
* feat(archsetup): add rustup, log-cleanup cron, update configs | Craig Jennings | 2026-02-27 | 1 | -0/+10
  Add rustup toolchain manager to developer_workstation (before AUR packages that need rust to compile). Add log-cleanup cron job with test validation. Update ISO glob for archangel naming. Add dunst icon theme, hyprlock animations, waybar log filtering.
* chore(archsetup): add texlive-latexextra, update test scripts | Craig Jennings | 2026-02-12 | 1 | -1/+1
  Add texlive-latexextra for pdflatex resume builds (enumitem package). Update test VM password and Arch mirror URL. Process inbox items.
* feat(hyprland): install plugins on first login via setup script | Craig Jennings | 2026-02-01 | 1 | -19/+14
  hyprpm requires a running Hyprland to determine the version for plugin compilation. Move plugin installation from archsetup to a first-login script (hyprland-plugins-setup) that runs via exec-once. The script checks whether plugins are already installed and skips if so. Update validation to check for setup script presence instead of enabled plugins.
* fix(hyprland): auto-rebuild plugins and preserve stash master position | Craig Jennings | 2026-01-31 | 1 | -0/+56
  - Add pacman hook to rebuild hyprpm plugins after Hyprland updates
  - Change startup to hyprpm update -n (rebuilds if needed)
  - Fix stash-restore to preserve master window using batch commands
  - Add validation tests for plugins and hyprpm hook
* test(validation): add Settings portal dark mode check | Craig Jennings | 2026-01-30 | 1 | -0/+33
  Validates that portals.conf uses gtk backend for Settings portal and that the portal returns color-scheme=1 (prefer-dark) for libadwaita apps like Nautilus.
* feat(testing): rewrite test infrastructure from libvirt to direct QEMU | Craig Jennings | 2026-01-27 | 3 | -221/+285
  Replace the never-fully-operational libvirt-based VM test infrastructure with direct QEMU management and archangel ISO for fully automated, unattended base VM creation.

  Key changes:
  - vm-utils.sh: complete rewrite. QEMU process mgmt via PID file, monitor socket for graceful shutdown, qemu-img snapshots, SSH port forwarding (localhost:2222)
  - create-base-vm.sh: boots archangel ISO, SSHs in, runs unattended install via config file, verifies, creates clean-install snapshot
  - run-test.sh: snapshot revert, git bundle transfer, detached archsetup execution with setsid, polling, validation, and report generation
  - debug-vm.sh: CoW overlay disk, GTK display, auto-cleanup on close
  - setup-testing-env.sh: reduced deps to qemu-full/sshpass/edk2-ovmf/socat
  - cleanup-tests.sh: PID-based process management, orphan detection
  - validation.sh: port-based SSH (backward compatible), fuzzel/foot for Hyprland, corrected package list paths
  - network-diagnostics.sh: getent/curl instead of nslookup/ping (SLIRP)

  New files:
  - archsetup-test.conf: archangel config for base VM (btrfs, no encrypt)
  - archsetup-vm.conf: archsetup config for unattended test execution
  - assets/archangel.conf.example: reference archangel config

  Deleted:
  - finalize-base-vm.sh: merged into create-base-vm.sh
  - archinstall-config.json: replaced by archangel .conf format

  Tested: full end-to-end run, 51 validations passed, 0 failures.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(dotfiles): rename system/ to common/ and remove unused configs | Craig Jennings | 2026-01-26 | 1 | -1/+1
  Rename dotfiles/system to dotfiles/common for clarity; the new name indicates shared dotfiles used across all desktop environments (DWM, Hyprland).

  Removed config directories for uninstalled applications:
  - ghostty (using different terminal)
  - lf (using ranger instead)
  - mopidy (using mpd instead)
  - nitrogen (X11-only, obsolete for Wayland)
  - pychess (not installed)
  - JetBrains (not installed via archsetup)
  - youtube-dl (using yt-dlp with different config location)

  Kept audacious config for potential future use. Updated all references in archsetup, CLAUDE.md, todo.org, and validation.sh.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(testing): remove obsolete --skip-slow-packages option | Craig Jennings | 2026-01-24 | 5 | -0/+1633
  This flag was removed from archsetup but remained in test scripts.