#+TITLE: Claude Code Notes - archzfs
#+AUTHOR: Craig Jennings & Claude
#+DATE: 2026-01-17

* About This File

This file contains context, reminders, pending decisions, and session history specific to the archzfs project.

**When to read this:**
- At the start of EVERY session (after reading protocols.org)
- When needing project context or history
- When checking reminders or pending decisions

**What's in this file:**
- Project-specific context and goals
- Available workflows for this project
- Active reminders
- Pending decisions
- Session history

**For protocols and conventions, see:** [[file:protocols.org][protocols.org]]

* Project-Specific Context

** Overview

Build system for creating a custom Arch Linux installation ISO with ZFS support. The goal is to have a bootable ISO that can install Arch Linux on ZFS root without needing to manually compile ZFS or deal with kernel version mismatches.

** Repository

- Remote: =cjennings@cjennings.net:git/archzfs.git=
- Branch: =main=
- docs/ is committed (not private)

** Key Components

- =build.sh= - Main build script (runs as root)
  - Downloads ZFS packages from archzfs GitHub releases
  - Creates custom archiso profile based on releng
  - Adds custom packages (nodejs, npm, jq, zsh, htop, ripgrep, etc.)
  - Copies custom installer scripts into ISO
  - Builds ISO with mkarchiso

- =custom/= - Custom scripts included in ISO
  - =install-archzfs= - Main installer script
  - =install-claude= - Claude Code installer
  - =archsetup-zfs= - ZFS-specific Arch setup
  - =zfs-setup= - Installs ZFS packages and loads module (generated by build.sh)

- =scripts/test-vm.sh= - QEMU VM for testing the ISO

** Current State

TESTING: install-archzfs script almost complete.

- ISO builds successfully (4.8G) with linux-lts + zfs-dkms
- ZFS module loads correctly in live environment
- install-archzfs runs through partitioning, pool creation, base install, system config
- Last fix: added freetype2 for grub-mkfont (needs rebuild to test)

Next: Rebuild ISO, complete install test, boot installed system.

** Goals

Create a bootable Arch Linux installation ISO that:
1. Installs Arch on ZFS root with native encryption
2. Uses sane defaults for dataset layout
3. Configures automatic snapshots (sanoid)
4. Sets up replication to TrueNAS for backups
5. Includes Claude Code on live ISO for emergency troubleshooting

** Design Decisions

*** Kernel Strategy
- Use =linux-lts= + =zfs-dkms= from archzfs.com repo
- DKMS builds ZFS from source, guaranteeing kernel compatibility
- Slower build time but eliminates version mismatch issues entirely
- LTS kernel provides stability, DKMS provides flexibility

*** ZFS Pool Configuration
| Setting | Value | Rationale |
|---------+-------+-----------|
| Pool name | =zroot= | Standard convention |
| Encryption | AES-256-GCM, passphrase | Required at every boot |
| Compression | =zstd= (default) | Good balance of speed/ratio |
| Ashift | 12 (4K sectors) | Modern drives |
| Root reservation | 50GB | Guarantees space for root even if other datasets fill the pool |
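
As a rough sketch, the table above could translate into =zpool create= options along these lines. The function only prints the command so it can be reviewed first. The exact flags used by =install-archzfs= are not recorded in these notes; the =keylocation= and =mountpoint= values, the helper name =zpool_create_cmd=, and the =/dev/vda= device are assumptions for illustration.

```shell
# Print (not run) a zpool create command reflecting the table above.
# $1 = target disk or partition (hypothetical in this example).
zpool_create_cmd() {
    printf '%s ' zpool create \
        -o ashift=12 \
        -O encryption=aes-256-gcm \
        -O keyformat=passphrase \
        -O keylocation=prompt \
        -O compression=zstd \
        -O mountpoint=none \
        zroot
    printf '%s\n' "$1"
}

zpool_create_cmd /dev/vda
```

Printing instead of executing keeps the sketch safe to run anywhere; a real installer would also need to handle partitioning and passphrase entry.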

*** Dataset Layout
| Dataset | Mountpoint | Special Settings | Purpose |
|---------+------------+------------------+---------|
| zroot/ROOT/default | / | reservation=50G | Root filesystem |
| zroot/home | /home | | Home directories (archsetup creates user subdataset) |
| zroot/media | /media | compression=off | Pre-compressed media files |
| zroot/vms | /vms | recordsize=64K | VM disk images (qemu/libvirt + virtualbox) |
| zroot/var/log | /var/log | | System logs |
| zroot/var/cache | /var/cache | | Package cache |
| zroot/var/lib/pacman | /var/lib/pacman | | Package database |
| zroot/var/lib/docker | /var/lib/docker | | Docker storage |
| zroot/tmp | /tmp | auto-snapshot=false | Temp files |
| zroot/var/tmp | /var/tmp | auto-snapshot=false | Temp files |
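
A minimal sketch of how a few rows of this layout might be expressed as =zfs create= commands. The helper =dataset_cmd= is hypothetical and only prints each command; nested datasets such as =zroot/var/log= would additionally need their parent created first, which is not shown.

```shell
# Print (not run) a zfs create command for one dataset under zroot.
# $1 = dataset name relative to the pool, $2 = mountpoint,
# $3 = optional extra property in name=value form.
dataset_cmd() {
    name=$1
    mountpoint=$2
    extra=${3:-}
    if [ -n "$extra" ]; then
        echo "zfs create -o mountpoint=$mountpoint -o $extra zroot/$name"
    else
        echo "zfs create -o mountpoint=$mountpoint zroot/$name"
    fi
}

dataset_cmd home  /home
dataset_cmd media /media compression=off
dataset_cmd vms   /vms   recordsize=64K
```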

*** Snapshot Policy (Sanoid)
Less aggressive since TrueNAS handles long-term backups:

| Template | Hourly | Daily | Weekly | Monthly | Used For |
|----------+--------+-------+--------+---------+----------|
| production | 6 | 7 | 2 | 1 | root, home, var/log, pacman |
| backup | 0 | 3 | 2 | 1 | media, vms |
| none | 0 | 0 | 0 | 0 | tmp, cache |

Plus: Pacman hook creates snapshot before every transaction.

*** TrueNAS Replication
- Primary: =truenas.local= (local network)
- Fallback: =truenas= (Tailscale)
- Destination pool: =vault/[TBD]=
- Schedule: Nightly at 2:00 AM
- Datasets: ROOT/default, home, media, vms

*** Included Packages
- Base system + development tools
- =nodejs=, =npm=, =jq= (for Claude Code)
- =zsh=, =htop=, =ripgrep=, =eza=, =fd=, =fzf=
- =sanoid= (snapshot management)
- =dialog= (installer UI)

*** Installation UX
- All questions asked upfront, then unattended installation
- WiFi tested before installation begins (if provided)
- User can walk away during install and come back
- Summary + final confirmation before starting

*** User Account Strategy
- install-archzfs creates root account only (asks for root password)
- No user account created during install
- Just create =zroot/home= dataset (no user-specific subdataset)
- archsetup creates user account + home dataset post-reboot

*** GRUB HiDPI Support
- Generate 32px DejaVuSansMono font during install
- Set =GRUB_FONT= to use custom font
- Works well on HiDPI and regular displays
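
The =GRUB_FONT= step might look roughly like the sketch below. The helper =set_grub_font=, the font paths, and the output file name are assumptions; the installer's actual commands are not shown in these notes. The demo edits a scratch file rather than the real =/etc/default/grub=.

```shell
# Replace any existing GRUB_FONT line in a grub defaults file with a new one.
# $1 = path to the config file, $2 = path to the .pf2 font.
set_grub_font() {
    conf=$1
    font=$2
    tmp=$(mktemp)
    # grep -v exits non-zero when nothing is left to print; that is fine here.
    grep -v '^GRUB_FONT=' "$conf" > "$tmp" || true
    printf 'GRUB_FONT="%s"\n' "$font" >> "$tmp"
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Font generation itself would be something like (paths are assumptions):
#   grub-mkfont -s 32 -o /boot/grub/fonts/DejaVuSansMono32.pf2 \
#       /usr/share/fonts/TTF/DejaVuSansMono.ttf

# Demonstrate on a scratch file.
demo=$(mktemp)
printf 'GRUB_TIMEOUT=5\nGRUB_FONT="/old/font.pf2"\n' > "$demo"
set_grub_font "$demo" /boot/grub/fonts/DejaVuSansMono32.pf2
cat "$demo"
rm -f "$demo"
```

After changing =GRUB_FONT=, the GRUB configuration has to be regenerated (for example with =grub-mkconfig=) before the new font takes effect.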

*** WiFi Configuration
- Ask for SSID + password during install (optional)
- Test connection before installation starts
- Copy connection profile to installed system
- Auto-connects after reboot

*** Post-Install Workflow
1. install-archzfs: Minimal ZFS system + root account
2. Reboot, login as root
3. Run archsetup manually for full workstation setup

*** Testing/Debugging (VM)
- SSH access on live ISO: sshd enabled, known root password
- Serial console: =-serial mon:stdio= in QEMU for terminal copy/paste
- Port forwarding: 2222→22 (already configured)
- Allows easy copy/paste of error messages during testing
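
For reference, a QEMU invocation with the serial console and port forwarding described above could look like this sketch. It is not a copy of =scripts/test-vm.sh=; the helper name =vm_cmd=, the memory size, the use of KVM, and the ISO path are assumptions.

```shell
# Print a QEMU command line matching the testing setup described above.
# $1 = path to the ISO (the default is a hypothetical file name).
vm_cmd() {
    iso=${1:-out/archzfs.iso}
    echo "qemu-system-x86_64 -enable-kvm -m 4096" \
         "-cdrom $iso" \
         "-serial mon:stdio" \
         "-nic user,hostfwd=tcp::2222-:22"
}

vm_cmd
# With the VM running, the live ISO is reachable from the host with:
#   ssh -p 2222 root@localhost
```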

** Open Questions

- [ ] TrueNAS destination dataset path (vault/???)

* AVAILABLE WORKFLOWS

This section lists all documented workflows for this project. Update this section whenever a new workflow is created.

** create-workflow
File: [[file:workflows/create-workflow.org][docs/workflows/create-workflow.org]]

Meta-workflow for creating new workflows. Use this when identifying repetitive workflows that would benefit from documentation.

Workflow:
1. Q&A discovery (4 core questions)
2. Assess completeness
3. Name the workflow
4. Document it
5. Update NOTES.org
6. Validate by execution

Created: [Date when workflow was created]

** create-v2mom
File: [[file:workflows/create-v2mom.org][docs/workflows/create-v2mom.org]]

Workflow for creating a V2MOM (Vision, Values, Methods, Obstacles, Metrics) strategic framework for any project or goal.

Workflow:
1. Understand V2MOM framework
2. Create document structure
3. Define Vision (aspirational picture of success)
4. Define Values (2-4 principles with concrete definitions)
5. Define Methods (4-7 approaches ordered by priority)
6. Identify Obstacles (honest personal/technical challenges)
7. Define Metrics (measurable outcomes)
8. Review and refine
9. Commit and use immediately

Time: ~2-3 hours total
Applicable to: Any project (health, finance, software, personal infrastructure, etc.)

Created: 2025-11-05

** session-start
File: [[file:workflows/session-start.org][docs/workflows/session-start.org]]

Workflow for beginning a Claude Code session with proper context and priorities.

Triggered by: **Automatically at the start of EVERY session**

Workflow:
1. Add session start timestamp (check for interrupted sessions)
2. Sync with templates (exclude NOTES.org and previous-session-history.org)
3. Scan workflows directory for available workflows
4. Read key NOTES.org sections (NOT entire file)
5. Process inbox (mandatory)
6. Ask about priorities (urgent work vs what's-next workflow)

Ensures: Full context, current templates, processed inbox, clear session direction

Created: 2025-11-14

** session-wrap-up
File: [[file:workflows/session-wrap-up.org][docs/workflows/session-wrap-up.org]]

Workflow for ending a Claude Code session cleanly with proper documentation and version control.

Triggered by: "wrap it up," "that's a wrap," "let's call it a wrap," or similar phrases

Workflow:
1. Write session notes to NOTES.org Session History section
2. Archive all but the 5 most recent sessions to previous-session-history.org
3. Git commit and push all changes (NO Claude attribution)
4. Provide brief valediction with accomplishments and next steps

Ensures: Clean handoff between sessions, nothing lost, clear git history, proper documentation

Created: 2025-11-14

** [Add more workflows as they are created]

Format for new entries:
#+begin_example
,** workflow-name
File: [[file:workflows/workflow-name.org][docs/workflows/workflow-name.org]]

Brief description of what this workflow does.

Workflow:
1. Step 1
2. Step 2
3. Step 3

Created: YYYY-MM-DD
#+end_example

* PENDING DECISIONS

This section tracks decisions that need Craig's input before work can proceed.

**Instructions:**
- Add pending decisions as they arise during sessions
- Format: =** [Topic/Feature Name]=
- Include: What needs to be decided, options available, why it matters
- Remove decisions once resolved (document resolution in Session History)

**Example format:**
#+begin_example
,** Feature Name or Topic

Craig needs to decide on [specific question].

Options:
1. Option A - [brief description, pros/cons]
2. Option B - [brief description, pros/cons]

Why this matters: [impact on project]

Implementation is ready - just need Craig's preference.
#+end_example

** Current Pending Decisions

(None currently - will be added as they arise)

* Active Reminders

** Current Reminders

(None currently - will be added as needed)

** Instructions for This Section

When Craig says "remind me" about something:
1. Add it here with timestamp and description
2. If it's a TODO, also add to =/home/cjennings/sync/org/roam/inbox.org= scheduled for today
3. Check this section at start of every session
4. Remove reminders once addressed

Format:
- =[YYYY-MM-DD]= Description of what to remind Craig about

* Session History

This section contains notes from each session with Craig. Sessions are logged in reverse chronological order (most recent first).

**Note:** Only the 5 most recent sessions are kept here; older ones are archived in [[file:previous-session-history.org][Previous Session History]]

** Format for Session History Entries

Each entry should use this format:

- **Timestamp:** =*** YYYY-MM-DD Day @ HH:MM TZ= (get TZ with =date +%z=)
- **Time estimate:** How long the session took
- **Status:** COMPLETE / IN PROGRESS / PAUSED
- **What We Completed:** Bulleted list of accomplishments
- **Key Decisions:** Any important decisions made
- **Files Modified:** Links to changed files (use relative paths)
- **Next Steps:** What to do next session (if applicable)
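
The timestamp heading can be generated rather than typed by hand. A small sketch, where the helper name =session_heading= is hypothetical and =LC_ALL=C= is set so the day abbreviation is always in English:

```shell
# Print a session heading in the format described above,
# using the current local time and UTC offset.
session_heading() {
    LC_ALL=C date '+*** %Y-%m-%d %a @ %H:%M %z'
}

session_heading
```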

**Best practices:**
- Keep entries concise but informative
- Include enough context to resume work later
- Document important technical insights
- Note any new patterns or preferences discovered
- Link to files using org-mode =file:= links

** Session Entries

*** 2026-01-22 Thu @ 15:44 -0600

*Status:* COMPLETE

*What We Completed:*
- Diagnosed system freeze on ratio - same VPE power gating bug from earlier
- Researched VPE_IDLE_TIMEOUT patch status - NOT merged, AMD maintainer skeptical
- Discovered kernel 6.18.x has critical CWSR bugs for Strix Halo - do NOT upgrade
- Identified two workarounds: amdgpu.pg_mask=0 and amdgpu.cwsr_enable=0
- Verified neither workaround is currently applied on ratio
- Created detailed fix instructions in inbox/instructions.txt for live ISO application

*Key Decisions:*
- Stay on kernel 6.15.2 (pinned) - 6.18.x has worse bugs for Strix Halo
- Apply both pg_mask=0 and cwsr_enable=0 workarounds
- Will apply fixes from velox via live ISO (running mkinitcpio on the booted ratio system triggers the freeze)

*Key Technical Findings:*
- VPE_IDLE_TIMEOUT patch (1s→2s) submitted Aug 2025 but not merged
- Framework Community recommends 6.15.x-6.17.x for Strix Halo, avoid 6.18+
- Current ratio state: pg_mask=4294967295 (bad), cwsr_enable=1 (bad)

*Files Created:*
- [[file:../inbox/instructions.txt][inbox/instructions.txt]] - Detailed fix instructions for live ISO

*Next Steps:*
- Boot ratio from archzfs live ISO (from velox)
- Apply both workarounds per inbox/instructions.txt
- Rebuild initramfs from live ISO
- Verify fixes active after reboot
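
One way the final verification step could be scripted is to look for both parameters on the kernel command line. This is a sketch only: the helper names are hypothetical and the sample command line is made up for the demo. On the real machine the input would come from =/proc/cmdline=.

```shell
# Succeed if the kernel command line ($1) contains the exact parameter ($2).
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether each workaround parameter is present.
check_workarounds() {
    cmdline=$1
    for p in amdgpu.pg_mask=0 amdgpu.cwsr_enable=0; do
        if cmdline_has "$cmdline" "$p"; then
            echo "$p: present"
        else
            echo "$p: MISSING"
        fi
    done
}

# On a real system: check_workarounds "$(cat /proc/cmdline)"
check_workarounds "root=ZFS=zroot/ROOT/default amdgpu.pg_mask=0"
```

Padding both strings with spaces makes the match exact, so =amdgpu.pg_mask=00= would not be mistaken for the workaround.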

*** 2026-01-22 Thu @ 15:02 -0600

*Status:* COMPLETE

*What We Completed:*
- Diagnosed and fixed ratio (Framework Desktop, AMD Strix Halo) boot failures
- Root cause: missing linux-firmware 20260110 caused amdgpu to freeze at boot
- Installed linux-firmware 20260110-1, fixed ZFS mountpoints, fixed hostid mismatch
- Configured kernel 6.15.2 as default (pinned), created clean GRUB menu
- Created retrospective workflow for continuous improvement
- Added PRINCIPLES.org with behavioral lessons learned
- Documented full troubleshooting session

*Key Decisions:*
- linux-firmware version is critical for AMD Strix Halo (20260110+ required)
- ZFS rollback with separate /boot partition is dangerous - recommend ZFSBootMenu
- Established retrospective workflow for major problem-solving sessions
- Behavioral lessons go in PRINCIPLES.org, technical facts in session docs

*Files Modified:*
- [[file:../custom/install-archzfs][custom/install-archzfs]] - Fixed mkinitcpio configuration
- [[file:../todo.org][todo.org]] - Added ZFS rollback + /boot issue, ZFSBootMenu task
- [[file:PRINCIPLES.org][docs/PRINCIPLES.org]] - New file with behavioral lessons
- [[file:protocols.org][docs/protocols.org]] - Added PRINCIPLES.org to session startup
- [[file:retrospectives/2026-01-22-ratio-boot-fix.org][docs/retrospectives/]] - Retrospective for this session
- [[file:2026-01-22-ratio-boot-fix-session.org][docs/2026-01-22-ratio-boot-fix-session.org]] - Full technical session doc

*Commits Made:*
- c46191c: Fix ratio boot issues: firmware, mkinitcpio, and document ZFS rollback dangers
- 9100517: Update ratio session doc: kernel 6.15.2 now default with clean GRUB menu
- e5aedfa: Add retrospective workflow and PRINCIPLES.org for continuous improvement

*Next Steps:*
- Implement ZFSBootMenu on ratio to solve /boot rollback issue
- Consider adding ZFSBootMenu to install-archzfs as alternative to GRUB

*** 2026-01-18 Sun @ 16:30 -0600

*Status:* COMPLETE

*What We Completed:*
- Completed RESCUE-GUIDE.txt with all 8 sections fully documented
- Added final round of utility packages to ISO:
  - Disk tools: ncdu, tree
  - Hardware diagnostics: iotop
  - Network: speedtest-cli, mosh, aria2, tmate, sshuttle
  - Security: pass (password manager)
- Removed AUR-only packages that broke build: safecopy, ms-sys, dislocker, nwipe
- Successfully rebuilt ISO (5.1GB)
- Copied ISO to truenas.local:/mnt/vault/isos and ~/downloads/isos
- Wrote ISO to USB drives (/dev/sda 1TB, /dev/sdb 240GB)
- Ran all tests:
  - zfs-snap-prune unit tests: 22/22 PASSED
  - VM install test (single-disk): PASSED
  - VM install test (mirror): PASSED
  - VM install test (raidz1): PASSED
- Marked "Add common recovery tools" TODO as DONE

*Commits Made:*
- 36aa130: Add utility tools and rescue guide documentation
- 6f4fd68: Remove AUR-only packages from ISO build

*Files Modified:*
- [[file:../build.sh][build.sh]] - Added utility packages, removed AUR-only packages
- [[file:../custom/RESCUE-GUIDE.txt][custom/RESCUE-GUIDE.txt]] - Completed all 8 sections
- [[file:../TODO.org][TODO.org]] - Marked recovery tools task as DONE
- [[file:session-context.org][docs/session-context.org]] - Updated session state

*Key Technical Notes:*
- AUR packages cannot be included in mkarchiso builds without custom AUR handling
- Documented AUR tools (safecopy, ms-sys, dislocker, nwipe) in RESCUE-GUIDE with install instructions
- ISO now doubles as a comprehensive rescue/recovery disk

*Next Steps:*
- Test booting from physical USB drive on real hardware
- Consider CI/CD pipeline for automated ISO builds
- Consider adding ISO to GRUB boot menu for on-disk recovery

*** 2026-01-17 Sat @ 17:10 -0600

*Status:* IN PROGRESS

*What We Completed:*
- Fixed ZFS kernel module mismatch by switching to linux-lts + zfs-dkms
- Fixed bootloader to use linux-lts kernel (was defaulting to regular linux)
- Fixed broadcom-wl dependency (switched to broadcom-wl-dkms)
- Updated mkinitcpio preset for linux-lts with archiso config
- Fixed install-archzfs bugs:
  - =[[ ]] && error= pattern causing early exit with =set -e= (changed to if/then)
  - 50G reservation on 50G disk (now dynamic: 20% of pool, capped 5-20G)
  - sanoid not in official repos (moved to archsetup)
  - grub-mkfont needs freetype2 package (added to pacstrap)
- Removed sanoid/syncoid from install-archzfs (archsetup will handle)
- Created inbox item for archsetup with full sanoid/syncoid config
- ISO now 4.8G (was 5.4G) - only linux-lts kernel
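
The dynamic reservation rule mentioned above (20% of the pool, kept between 5G and 20G) can be sketched as follows. This is one reading of that note, not the installer's actual code; the function name and the use of whole GiB are assumptions.

```shell
# Suggest a root reservation in GiB for a pool of the given size in GiB:
# 20% of the pool, but never below 5 or above 20.
reservation_gib() {
    pool_gib=$1
    res=$(( pool_gib * 20 / 100 ))
    if [ "$res" -lt 5 ]; then
        res=5
    elif [ "$res" -gt 20 ]; then
        res=20
    fi
    echo "$res"
}

reservation_gib 50    # prints 10
reservation_gib 500   # prints 20
reservation_gib 10    # prints 5
```

The result would then be applied with something like =zfs set reservation=10G zroot/ROOT/default=.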

*Key Technical Insights:*
- =broadcom-wl= depends on =linux= kernel specifically - use =broadcom-wl-dkms= instead
- archiso releng profile has linux.preset in airootfs that needs renaming to linux-lts.preset
- With =set -e=, =[[ test ]] && command= has exit status 1 when the test is false; when that list is the last command in a function, the function returns 1 and =set -e= aborts the script
- =grub-mkfont= requires =freetype2= package (not installed by default with grub)
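
The =set -e= pitfall is easy to reproduce in isolation. The sketch below uses =[ ]= instead of =[[ ]]= so it runs under plain =sh=; the exit-status behavior is the same. Each variant runs in a child shell so that an abort there does not stop the demo itself. The function and variable names are made up for the example.

```shell
# Variant A: the && list is the last command in the function. When the test
# is false the function returns 1, and errexit aborts at the call site.
script_short='
set -e
warn_if_empty() { [ -z "$1" ] && echo "warning: empty value"; }
warn_if_empty "some value"
echo "reached end"
'

# Variant B: the same check as if/then, which returns 0 when the test is false.
script_if='
set -e
warn_if_empty() { if [ -z "$1" ]; then echo "warning: empty value"; fi; }
warn_if_empty "some value"
echo "reached end"
'

# Run a snippet in a child shell so its errexit cannot abort this script.
run_variant() {
    if sh -c "$1" >/dev/null 2>&1; then
        echo "completed"
    else
        echo "aborted"
    fi
}

run_variant "$script_short"   # prints "aborted"
run_variant "$script_if"      # prints "completed"
```

Rewriting such checks as if/then, as was done in =install-archzfs=, avoids depending on where the statement happens to sit in a function.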

*Files Modified:*
- [[file:../build.sh][build.sh]] - major updates for linux-lts, bootloader configs, mkinitcpio preset
- [[file:../custom/install-archzfs][custom/install-archzfs]] - multiple bug fixes, removed sanoid/syncoid
- [[file:~/code/archsetup/inbox/zfs-sanoid-detection.txt][archsetup inbox]] - sanoid/syncoid config for archsetup to implement

*Current State:*
- ISO builds successfully with linux-lts + zfs-dkms
- ZFS module loads correctly in live environment
- install-archzfs runs through most steps
- Last error: grub-mkfont missing freetype2 (now fixed, needs rebuild/test)

*Next Steps:*
- Rebuild ISO with freetype2 fix
- Complete full install-archzfs test in VM
- Test booting the installed system
- Git commit all changes

*** 2026-01-17 Sat @ 13:16 -0600

*Status:* COMPLETE (continued above)

*What We Completed:*
- Initialized git repository
- Created .gitignore (excludes work/, out/, profile/, zfs-packages/)
- Initial commit with all build scripts
- Added docs/ to git (decided to track publicly)
- Built fresh ISO (archzfs-claude-2026.01.17-x86_64.iso, 4.9G)
- Tested ISO in QEMU VM
- Documented project goals and design decisions in NOTES.org

*Key Decisions Made:*
- Use linux-lts + zfs-dkms from archzfs.com (DKMS ensures kernel compatibility)
- Less aggressive snapshot policy (TrueNAS handles long-term backups)
- All install questions upfront, then unattended installation
- Root account only (archsetup creates user post-reboot)
- 32px GRUB font for HiDPI displays
- WiFi config tested before install starts

*Files Modified:*
- [[file:../.gitignore][.gitignore]] - created
- [[file:../build.sh][build.sh]] - major rewrite
- [[file:../custom/install-archzfs][custom/install-archzfs]] - complete rewrite
- [[file:../scripts/test-vm.sh][scripts/test-vm.sh]] - added serial console