| -rw-r--r-- | docs/NOTES.org | 259 |
1 files changed, 259 insertions, 0 deletions
diff --git a/docs/NOTES.org b/docs/NOTES.org
new file mode 100644
index 0000000..854e598
--- /dev/null
+++ b/docs/NOTES.org
@@ -0,0 +1,259 @@
+#+TITLE: ArchSetup Development Notes
+#+DATE: 2025-11-10
+
+* 2025-12-01: Major Feature Session - Pre-flight, State Tracking, Integration Test
+
+** Work Completed
+
+*** Rofi Configuration Standardization
+- Created self-contained rofi configuration in dotfiles/system/.config/rofi/
+- Designed theme to match dunst notification colors (bg: #383c4af0, fg: #cdd1dc, border: #2d303c)
+- Simplified sxhkd keybindings (removed inline flags, moved to config.rasi)
+- Removed phantom PATH entry for non-existent ~/.config/rofi/scripts/
+
+*** Nitrogen to Feh Migration
+- Replaced nitrogen with feh (nitrogen removed from Arch repos)
+- Updated all references: .xinitrc, monitor script, lf/lfrc, ranger/rc.conf
+- Added feh to archsetup package list
+
+*** Pre-flight Checks Implementation
+- Added preflight_checks() function validating:
+  - Disk space (20GB minimum)
+  - Network connectivity (ping archlinux.org)
+  - pacman availability
+  - Arch Linux detection (/etc/arch-release)
+
+*** State Tracking for Resume Capability
+- State stored in /var/lib/archsetup/state/ as timestamped marker files
+- run_step() wrapper tracks completion of 12 major installation phases
+- Added CLI flags: --status (show progress), --fresh (clear state), --help
+
+*** Integration Test Fixes
+- Fixed duplicate package entries (multimarkdown, proselint were listed as both AUR and pacman)
+- Diagnosed and fixed git server 504 errors (fcgiwrap overload after 2 months)
+- Blocked aggressive crawler IP via UFW
+
+** Test Results
+- Test 20251201-084055: *PASSED* (60m 34s)
+- 3 non-critical errors: mkinitcpio-firmware (intermittent), multimarkdown/proselint (fixed)
+
+** Git Commits (8 total)
+- 590aa02: feat(rofi): standardize rofi configuration with dunst-matched theme
+- 0601d39: feat(wallpaper): replace nitrogen with feh
+- 84a52a8: docs(TODO): mark completed tasks from today's session
+- bcacfa0: docs(TODO): mark -debug packages task as verified complete
+- 03145a1: docs(TODO): mark root check and git pull fixes as verified complete
+- 75b0a17: feat(archsetup): add pre-flight checks before installation
+- 50423fd: feat(archsetup): implement state tracking for resume capability
+- 777e113: fix(archsetup): remove duplicate multimarkdown and proselint entries
+
+** Context for Next Session
+- mkinitcpio-firmware occasionally fails (network/AUR timing issue) - kept as-is
+- All [#A] priority TODOs now complete
+- Consider testing resume capability (--fresh then interrupt mid-run)
+
+* 2025-11-21: GPG/Emacs Authentication Fix
+
+** Problem Diagnosed
+- mu4e in Emacs failing with "gpg: problem with the agent: End of file"
+- Root cause: Emacs daemon started via systemd user service (emacs.service)
+- systemd-started Emacs had no DISPLAY or GPG_TTY in environment
+- pinentry-dmenu couldn't connect to X, causing silent failures
+- gpg-agent settings were fine (400-day TTL, pinentry-dmenu, ssh-support)
+
+** Diagnostic Commands Used
+#+begin_src bash
+gpgconf --list-dirs agent-socket  # Socket present at /run/user/1000/gnupg/S.gpg-agent
+gpg-connect-agent 'getinfo pid' /bye  # Agent running (pid 1759, --supervised)
+cat /proc/1728/environ | tr '\0' '\n' | grep -E 'GPG|TTY|DISPLAY'  # Empty - no X env
+#+end_src
+
+** Previous Fix (Did Not Work)
+- Disabled emacs.service: =systemctl --user disable emacs.service=
+- Issue persisted after reboot - still got "End of file" error in mu4e
+
+** Understanding "End of File" Error
+- Error comes from IPC level between gpg-agent and pinentry
+- pinentry-dmenu spawns, tries to connect to X11, fails (no DISPLAY)
+- pinentry crashes/exits without sending response
+- gpg-agent reads EOF from the pipe and reports the error
+- The error message is the literal syscall result, not a helpful description
+
+** Actual Fix Applied
+- Added =~/.local/bin/reset-auth= call to .xinitrc after DISPLAY is exported
+- This restarts gpg-agent with correct DISPLAY environment
+- Runs before any apps (signal-desktop, protonmail-bridge, etc.) start
+- Location: .xinitrc lines 29-31
+
+** Key Files Referenced
+- ~/.gnupg/gpg-agent.conf - TTL and pinentry settings
+- ~/.xinitrc - now includes reset-auth call after DISPLAY export
+- ~/.local/bin/reset-auth - restarts gpg-agent with current environment
+
+** Context for Next Session
+- Verify GPG auth works correctly after next reboot
+- If still failing, investigate pinentry-dmenu X11 connection
+
+* 2025-11-18: Console Font Configuration
+
+** Work Completed
+- Improved console font readability for laptop display
+- Tested multiple console fonts: solar24x32, latarcyrheb-sun32, terminus variants
+- Selected ter-132n (Terminus 32px normal) as optimal for readability
+- Updated /etc/vconsole.conf to FONT=ter-132n on current system
+- Added terminus-font package to archsetup (archsetup:561)
+- Changed console font configuration from lat0-16 to ter-132n (archsetup:965)
+
+** Console Font History
+- Original (pre-Nov 16): No FONT setting (kernel default, very small)
+- Commit 26a20f3: Added FONT=lat0-16 (moderate size, 16px)
+- Uncommitted change: FONT=lat4a-19 (HiDPI, too small for laptop)
+- Current configuration: FONT=ter-132n (large, 32px, excellent readability)
+
+* 2025-11-16: Session Workflows Setup
+
+** Work Completed
+- Adopted session-start workflow from claude-templates
+- Executed session start routine: synced templates, checked inbox (empty), reviewed project context
+- Updated session-wrap-up workflow in both archsetup and claude-templates projects
+  - Changed archiving from time-based (7 days/2 weeks) to session-count-based (keep last 5 sessions)
+  - Updated both docs/templates/docs/workflows/session-wrap-up.org and ~/projects/claude-templates/docs/workflows/session-wrap-up.org
+- Verified all ~/.profile.d/ files are properly symlinked to ~/code/archsetup/dotfiles/system/.profile.d/
+
+** Context for Next Session
+- Outstanding TODOs from 2025-11-13 session still pending:
+  - Run verification test to confirm yay-debug is no longer installed
+  - Investigate 56 AUR package retry errors
+  - Consider if errors need to be addressed or are acceptable failures
+
+* 2025-11-13: Critical Bug Fixes & Test Infrastructure
+
+** Test Results Summary
+- Test 20251113-232631: *PASSED* (exit code 0) in 17m 33s
+- Test 20251113-190717: Exit code 1 (ran old version without fixes) in 40m 54s
+- All validations passed: user creation, dotfiles stowed, yay installed, DWM built
+- 56 non-critical errors (mostly AUR package retry failures)
+
+** Bug Fixes Implemented
+*** Root Permission Check (archsetup:21-27)
+- Added EUID check at script start
+- Fails fast with clear error message if not run as root
+- Prevents confusing errors later in execution
+
+*** Disable Debug Packages
+- Added --nodebug flag to all yay calls (archsetup:167-171)
+- Added --nodebug to makepkg for yay build (archsetup:375)
+- Prevents installation of unnecessary -debug packages (saves ~500MB+)
+- *Note*: Still need to verify yay-debug is prevented in next test
+
+*** Safe Git Operations
+- Replaced dangerous `git pull --force` with safe rm + fresh clone
+- Applied to git_install function (archsetup:154-160)
+- Applied to yay installer (archsetup:366-372)
+- Prevents accidental data loss in local repositories
+
+*** Completion Marker for Tests
+- Added unique marker: `=== ARCHSETUP_EXECUTION_COMPLETE ===` (archsetup:1044)
+- Test script now detects completion reliably (run-test.sh:239)
+- Fixed false negative test failures
+
+*** Test Script Process Detection
+- Fixed pgrep infinite loop bug (run-test.sh:216)
+- Changed from `pgrep -f 'bash archsetup'` to `ps aux | grep '[b]ash archsetup'`
+- Prevents test script from matching its own SSH commands
+
+** Package Changes
+*** Removed Packages
+- anki: Build hangs 98+ minutes (removed from archsetup:924)
+- adwaita-color-schemes: CMake build issues (removed comment from archsetup:704)
+
+*** Added Packages
+- geoclue: Geolocation service with correct systemd service (archsetup:456-458)
+  - Service name: geoclue.service (not geoclue-agent@user.service)
+
+** Test Infrastructure Added
+- Comprehensive VM-based testing framework in scripts/testing/
+- Network diagnostics and pre-flight checks
+- Snapshot-based testing for reproducible runs
+- Support for --skip-slow-packages flag for faster testing
+
+** Git Commits
+- 2e10a88: fix(archsetup): implement critical bug fixes and test improvements
+- 0148a1d: fix(archsetup): prevent yay-debug package installation during yay build
+
+** Next Session TODO
+- [ ] Run verification test to confirm yay-debug is no longer installed
+- [ ] Investigate 56 AUR package retry errors
+- [ ] Consider if errors need to be addressed or are acceptable failures
+
+* 2025-11-10: Successful Test Run & Package Fixes
+
+** Test Results
+- Completed successful full test run in VM
+- Total runtime: 58 minutes, 4 seconds
+- Errors encountered: 0
+- Test results location: test-results/20251110-154908/
+
+** Package Issues Resolved
+*** anki (DISABLED - line 924)
+- Problem: Hangs for 98+ minutes during cargo build, missing .gitconfig
+- Solution: Temporarily disabled with comment
+- TODO: Investigate and re-enable or replace
+
+*** tageditor (DISABLED - line 933)
+- Problem: Hangs indefinitely building qt5-webengine dependency
+- Solution: Temporarily disabled with comment
+- TODO: Investigate alternatives or fix dependency issue
+
+*** Other packages removed/fixed:
+- gtk-engine-murrine: Removed (no longer needed)
+- vagrant: Moved to AUR (was in official repos, now AUR-only)
+- Blue light filter: Removed entirely from installation
+
+** Test Infrastructure Improvements
+*** Git Clone Simulation
+- Test now properly simulates: git clone --depth 1
+- Uses git bundle to transfer repository to VM with full git metadata
+- Location: scripts/testing/run-test.sh lines 158-180
+
+*** Dotfiles Git Restore
+- Confirmed working: stow --adopt followed by git restore
+- Reverted conditional git restore back to unconditional (line 320-322)
+- This workflow requires proper git repository structure (provided by git clone)
+
+** Known Issues
+*** Test Script Timeout Detection
+- Bug: Test script doesn't properly detect when archsetup completes
+- Symptom: Reports "timeout after 90 minutes" even when archsetup finishes at 58 minutes
+- Impact: Test marked as failed despite successful completion
+- All validations still pass (user creation, dotfiles, yay, DWM)
+- TODO: Fix timeout detection logic in run-test.sh
+
+** Remaining High Priority Tasks (see TODO.org)
+- [#A] Replace nitrogen with feh for wallpaper management
+- [#A] Fix or permanently disable adwaita-color-schemes
+
+** VM Configuration
+- VM Name: archsetup-base
+- Clean snapshot: clean-install
+- IP: 192.168.122.21 (typically)
+- Location: vm-images/archsetup-base.qcow2
+
+** Important Commands
+#+begin_src bash
+# Run test
+./scripts/testing/run-test.sh
+
+# Run test and keep VM up for debugging
+./scripts/testing/run-test.sh --keep
+
+# Manual snapshot management
+virsh --connect qemu:///system snapshot-list archsetup-base
+virsh --connect qemu:///system snapshot-revert archsetup-base clean-install
+
+# Check VM status
+virsh --connect qemu:///system list --all
+#+end_src
+
+** Git Commits Made
+- 18c6bc3: fix(archsetup): disable problematic slow packages
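
The 2025-12-01 entry describes a run_step() wrapper backed by marker files under /var/lib/archsetup/state/. The sketch below illustrates that idea under stated assumptions: one marker file per step, named after the step and containing its completion timestamp. The function bodies, the show_status and clear_state helper names, and the temp-directory default are hypothetical and are not taken from the archsetup script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of marker-file state tracking; not the archsetup code.
# STATE_DIR defaults to a temp directory so the sketch runs without root;
# the notes place the real state under /var/lib/archsetup/state/.
STATE_DIR="${STATE_DIR:-$(mktemp -d)}"

# run_step NAME COMMAND [ARGS...]
# Skips the step if its marker exists; otherwise runs the command and,
# only on success, writes a marker holding the completion timestamp.
run_step() {
    local name="$1"
    shift
    local marker="$STATE_DIR/$name"
    if [ -e "$marker" ]; then
        echo "skip: $name (completed $(cat "$marker"))"
        return 0
    fi
    if "$@"; then
        mkdir -p "$STATE_DIR"
        date '+%Y-%m-%d %H:%M:%S' > "$marker"
        echo "done: $name"
    else
        local status=$?
        echo "fail: $name (exit $status)" >&2
        return "$status"
    fi
}

# Roughly what a --status flag could print: one line per completed step.
show_status() {
    local marker
    for marker in "$STATE_DIR"/*; do
        [ -e "$marker" ] || continue
        echo "$(basename "$marker"): $(cat "$marker")"
    done
}

# Roughly what a --fresh flag could do: forget all completed steps.
# ${STATE_DIR:?} aborts instead of expanding to "/*" if the variable is empty.
clear_state() {
    rm -f "${STATE_DIR:?}"/*
}
```

Because a marker is written only after the command succeeds, a step that fails or is interrupted leaves no marker and runs again on the next invocation, while completed steps are skipped.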

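
The 2025-11-13 entry replaces `pgrep -f 'bash archsetup'` with `ps aux | grep '[b]ash archsetup'`. The bracket form helps because the regular expression `[b]ash archsetup` matches the text "bash archsetup", but a command line that carries the pattern itself contains the literal text "[b]ash archsetup", which the expression does not match. The sketch below simulates this with two made-up command lines rather than a real process table; count_matches is a hypothetical helper used only for illustration.

```shell
#!/usr/bin/env bash
# count_matches PATTERN
# Builds a two-line fake process list: the real installer, plus a searcher
# whose command line contains PATTERN verbatim. Prints how many lines match.
count_matches() {
    pattern="$1"
    printf '%s\n' "bash archsetup" "grep $pattern" | grep -c -- "$pattern"
}

# Plain pattern: the searcher's own command line also matches.
count_matches 'bash archsetup'      # prints 2

# Bracketed pattern: only the real installer line matches.
count_matches '[b]ash archsetup'    # prints 1
```

The same reasoning would apply to any wrapper, such as an SSH invocation, whose command line embeds the search pattern.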