| author | Craig Jennings <c@cjennings.net> | 2026-05-29 21:11:06 -0500 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2026-05-29 21:11:06 -0500 |
| commit | 39970b462c8198220f33ef7323725982723d2233 (patch) | |
| tree | 391c10e2be3207dbea2c1f4e01ca8e2944e18df8 | |
| parent | 06b2c0716b51eb73298f569752dd1d81947d9961 (diff) | |
chore(todo): file local-llm and uv install tasks; process inbox
Filed two new [#B] parent tasks. The local offline LLM runtime task carries design-decision and implementation children for resolving the open design questions alongside implementation work. The uv install task matches the existing eask/signal-cli tooling-codification shape — load-bearing for other projects, manually installed today, codify so fresh installs pick it up. Four cross-project handoffs moved to outbox.
5 files changed, 232 insertions, 0 deletions
diff --git a/assets/outbox/2026-05-28-from-rulesets-local-llm-install.org b/assets/outbox/2026-05-28-from-rulesets-local-llm-install.org
new file mode 100644
index 0000000..c3cbdaa
--- /dev/null
+++ b/assets/outbox/2026-05-28-from-rulesets-local-llm-install.org
@@ -0,0 +1,88 @@
+#+TITLE: Install local offline LLM runtime and model cache
+#+DATE: 2026-05-28
+#+SOURCE_PROJECT: rulesets
+#+REQUEST_TYPE: install-feature
+#+STARTUP: showall
+
+* Request
+
+Please add local offline LLM support to =archsetup='s normal install process so
+machines can run a local coding agent when there is no network.
+
+This came from the =rulesets= generic-agent-runtime design pass. =rulesets=
+should become runtime-neutral, but it needs =archsetup= to provision the local
+model runtime and prefetch model files while network is available.
+
+* Hardware-specific recommendations
+
+** High-end Strix Halo machine
+
+Detected with =inxi=:
+
+- AMD Ryzen AI Max+ 395
+- 128 GiB RAM
+- Radeon 8060S / Strix Halo unified memory
+
+Install:
+
+- Default offline coding model:
+  =Qwen3-Coder-30B-A3B-Instruct-GGUF=, prefer =Q6_K= on this machine.
+- Compatibility quant:
+  =Qwen3-Coder-30B-A3B-Instruct-GGUF Q4_K_M=.
+- Larger general/long-context fallback:
+  =Qwen3-Next-80B-A3B-Instruct-GGUF Q4_K_M=.
+
+** velox
+
+Detected with =ssh velox inxi -C -G -m -S --filter=:
+
+- Intel Core i7-1370P
+- 64 GiB RAM
+- Intel Iris Xe integrated graphics
+
+Install:
+
+- Strongest practical offline coding default:
+  =Qwen3-Coder-30B-A3B-Instruct-GGUF Q4_K_M=.
+- Add an 8B fallback model for quick edits and low-latency triage.
+
+Expect =velox= to be CPU/low-end-iGPU bound. The 30B model fits, but latency
+will be the limiting factor.
+
+* Runtime stack
+
+Recommended packages/components:
+
+- =llama.cpp= with CPU and Vulkan support where practical.
+- Optional =ollama= as a simple model manager/API for workflows that prefer it.
+- A shared local model cache, e.g. =~/.local/share/llm/models= or
+  =/srv/models/llm=.
+- OpenAI-compatible local endpoints:
+  - coding model on =127.0.0.1:8081=
+  - larger/general model on =127.0.0.1:8082= when installed
+  - leave =127.0.0.1:11434= for =ollama= if used
+
+* Install behavior
+
+- Install runtime packages during normal setup.
+- Prefetch model files when network is available.
+- Make model download idempotent: skip if exact file already exists.
+- Do not make the install fail hard if model download is unavailable; surface a
+  clear follow-up saying local offline LLM support is incomplete.
+- Add a smoke test command that starts the local endpoint and asks a short prompt.
+
+* Why this belongs in archsetup
+
+=rulesets= can provide the runtime manifests, launcher behavior, and project
+instructions, but it should not own machine provisioning. =archsetup= already
+owns package installation and per-host setup, so it is the right place to install
+=llama.cpp=/=ollama= and maintain the machine-local model inventory.
+
+* Sources checked
+
+- Qwen3-Coder 30B GGUF quant listings show Q4_K_M around 18.6 GB and Q6_K around
+  25.1 GB.
+- Qwen3-Next 80B GGUF model card shows Q4_K_M around 48.4 GB and native 262K
+  context.
+- =llama.cpp= supports CPU and GPU backends including Vulkan/HIP/ROCm; keep the
+  backend configurable per host.
diff --git a/assets/outbox/2026-05-29-1111-from-health-todo-b-install-python-genanki-system.org b/assets/outbox/2026-05-29-1111-from-health-todo-b-install-python-genanki-system.org
new file mode 100644
index 0000000..3cf53ec
--- /dev/null
+++ b/assets/outbox/2026-05-29-1111-from-health-todo-b-install-python-genanki-system.org
@@ -0,0 +1,29 @@
+#+TITLE: * TODO [#B] Install ~python-genanki~ system-wide :install:py
+#+SOURCE: from health
+#+DATE: 2026-05-29 11:11:48 -0500
+
+* TODO [#B] Install ~python-genanki~ system-wide :install:python:
+
+Add ~genanki~ to the Python package set so projects can generate Anki ~.apkg~ decks from org-drill files without per-project venvs.
+
+** Why
+A converter script in the health project (~health-drill-to-anki.py~) emits an Anki package from an org-drill source file — same workflow as ~/sync/org/drill/~ but pushed into Anki for mobile review. The script is ~100 lines and runs anywhere Python has ~genanki~ available. The pattern is likely to spread: deepsat already has a drill file, more will follow.
+
+Right now Arch's PEP 668 enforcement blocks ~pip install --user genanki~, so the script needs a throwaway venv to run. Solving once at the system level removes that friction across every project.
+
+** Install options (pick one)
+- *Arch official repo or AUR.* ~pacman -S python-genanki~ if it's in extra/community, or ~yay -S python-genanki~ from AUR. Cleanest. Auto-updates with the rest of the system.
+- *pipx.* ~pipx install genanki~ — but genanki is a library, not a CLI app, so pipx is a stretch. Skip.
+- *System-wide pip with ~--break-system-packages~.* Works but circumvents PEP 668. Last resort.
+
+Recommendation: try pacman/AUR first, fall back to a managed venv at a known path (e.g. ~/opt/python-tools/~) that scripts can shebang into.
+
+** Verification
+After install:
+#+begin_src bash
+python3 -c "import genanki; print(genanki.__version__)"
+#+end_src
+Should print a version like ~0.13.1~ without traceback.
+
+** Cross-reference
+- A companion message went to rulesets' inbox proposing the export script as a template script under ~claude-templates/.ai/scripts/~. That decision depends on this install landing first — without ~genanki~ available system-wide, the template script can't run out of the box.
diff --git a/assets/outbox/2026-05-29-1114-from-health-cancelled-install-python-genanki-system.org b/assets/outbox/2026-05-29-1114-from-health-cancelled-install-python-genanki-system.org
new file mode 100644
index 0000000..877eb18
--- /dev/null
+++ b/assets/outbox/2026-05-29-1114-from-health-cancelled-install-python-genanki-system.org
@@ -0,0 +1,23 @@
+#+TITLE: * CANCELLED Install ~python-genanki~ system-wide — supersede
+#+SOURCE: from health
+#+DATE: 2026-05-29 11:14:51 -0500
+
+* CANCELLED Install ~python-genanki~ system-wide — superseded :install:python:
+
+The prior message in this inbox (also dated 2026-05-29) asked for a system-wide ~python-genanki~ install to remove pip/venv friction. That request is no longer needed.
+
+** Why
+The converter script in the health project moved to ~uv~ + PEP 723 inline script metadata. Shebang is ~#!/usr/bin/env -S uv run --script~ and the deps are declared in the script header. ~uv~ handles resolve / install / cache transparently on first run (~5ms cached). No system package needed for ~genanki~ specifically.
+
+** Implication for archsetup
+- *No genanki install required.* The earlier TODO can be deleted or marked CANCELLED.
+- *~uv~ is now a load-bearing tool for this pattern.* Make sure ~uv~ stays in the Arch base setup. Currently installed via ~/usr/bin/uv~ (version 0.11.15). If it's already part of the standard package set, nothing to do. If it's only installed ad-hoc on this machine, worth adding to the install playbook.
+
+** Verification
+#+begin_src bash
+which uv && uv --version
+#+end_src
+Should print a path and version like ~uv 0.11.15~ without error.
+
+** Cross-reference
+A follow-up went to rulesets' inbox with the updated implementation pattern and a renewed recommendation to formalize the converter as a template script.
diff --git a/assets/outbox/2026-05-29-1127-from-health-todo-a-add-uv-to-the-install-playbook.org b/assets/outbox/2026-05-29-1127-from-health-todo-a-add-uv-to-the-install-playbook.org
new file mode 100644
index 0000000..9828f0b
--- /dev/null
+++ b/assets/outbox/2026-05-29-1127-from-health-todo-a-add-uv-to-the-install-playbook.org
@@ -0,0 +1,40 @@
+#+TITLE: * TODO [#A] Add ~uv~ to the install playbook — load-bearing
+#+SOURCE: from health
+#+DATE: 2026-05-29 11:27:58 -0500
+
+* TODO [#A] Add ~uv~ to the install playbook — load-bearing for project tooling :install:python:
+
+Add ~uv~ (Astral's Python package + script runner) to the archsetup install steps so every future machine setup picks it up automatically. Currently installed on this machine at ~/usr/bin/uv~ (~0.11.15~), but it's not yet part of the standard set — if a fresh install skipped it, project scripts that depend on it would fail silently or break at first run.
+
+** Why this matters now
+Several projects are moving Python tooling onto ~uv~ with PEP 723 inline script metadata. The pattern is: a script declares its deps in a ~# /// script~ header and shebangs ~#!/usr/bin/env -S uv run --script~, so it self-runs without venvs or pip. First example landed in the health project (~health-drill-to-anki.py~), and the rulesets follow-up recommends promoting that pattern as a template script under ~claude-templates/.ai/scripts/~ — meaning ~uv~ would become a dependency for any project pulling templates.
+
+Without ~uv~ in the install playbook, fresh machines would hit ~env: uv: No such file or directory~ on any script using this pattern. The PEP 668 detour (system pip blocked, manual venvs everywhere) was exactly what ~uv~ eliminates.
+
+** Install
+Arch official: ~pacman -S uv~. If extra/community doesn't carry it, AUR has it as ~uv-bin~ or build-from-source. Astral's official installer is also an option but adds a non-pacman path to maintain.
+
+Pacman is cleanest — auto-updates with the rest of the system, lives under ~/usr/bin/uv~.
+
+** Verification (post-install)
+#+begin_src bash
+which uv && uv --version
+#+end_src
+Should print ~/usr/bin/uv~ and a version string. To verify the PEP 723 pattern works end-to-end:
+#+begin_src bash
+cat > /tmp/uv-test.py <<'EOF'
+#!/usr/bin/env -S uv run --script
+# /// script
+# requires-python = ">=3.11"
+# dependencies = ["requests"]
+# ///
+import requests
+print("ok")
+EOF
+chmod +x /tmp/uv-test.py && /tmp/uv-test.py
+#+end_src
+First run resolves and caches ~requests~; subsequent runs are instant.
+
+** Related
+- Earlier message in this inbox CANCELLED the ~python-genanki~ install request — that one is no longer needed because ~uv~ + PEP 723 covers it.
+- The rulesets inbox carries the broader template-script proposal.
@@ -96,6 +96,58 @@ A custom waybar module providing three time-keeping functions, surfaced in the b
 Implementation notes (to flesh out when picked up): waybar =custom= module(s) with =exec= polling or a persistent =exec= script emitting JSON; click actions to start/pause/reset; a small state file under =~/.local/state= or =~/.local/var=. Lives in the hyprland tier (=dotfiles/hyprland/.config/waybar/= + a backing script in =hyprland/.local/bin/=). TDD the backing script per testing.md.
+** TODO [#B] Local offline LLM runtime + per-host model cache :tooling:llm:
+:PROPERTIES:
+:LAST_REVIEWED: 2026-05-29
+:END:
+Add a local-LLM provisioning track so machines can run an offline coding agent when there's no network. Install =llama.cpp= (CPU + Vulkan where practical) and prefetch per-host model files while network is available; expose OpenAI-compat local endpoints (=127.0.0.1:8081= coding, =:8082= general; =:11434= reserved for =ollama= if used). Per the rulesets generic-agent-runtime design pass — rulesets becomes runtime-neutral and owns the runtime manifests + project instructions; archsetup owns machine provisioning + the per-machine model inventory. Source: handoff from rulesets 2026-05-28 ([[file:assets/outbox/2026-05-28-from-rulesets-local-llm-install.org][outbox copy]]).
+
+Per-host model targets (from the handoff):
+- *ratio* (Strix Halo, 128 GiB) — Qwen3-Coder-30B Q6_K (default) + Q4_K_M (compat) + Qwen3-Next-80B Q4_K_M (long-context fallback).
+- *velox* (i7-1370P, 64 GiB iGPU) — Qwen3-Coder-30B Q4_K_M + an 8B fallback for low-latency triage.
+
+Install behavior: prefetch idempotent (skip if file exists, match size/hash); download failure must NOT fail the install — surface a clear "local LLM support incomplete" follow-up instead. Ship a smoke-test command (boot endpoint + short prompt).
+
+Decisions to resolve before code:
+*** TODO Decide model cache location: per-user vs system-wide
+Handoff lists both =~/.local/share/llm/models= (per-user) and =/srv/models/llm= (system-wide). Per-user matches the existing archsetup user-config style and avoids root ownership of large model files. System-wide matches the "machine-local model inventory" phrasing and shares cache across users on multi-user boxes (not the case here — single user per machine). Pick one as the default; the other stays available via =LLM_MODEL_CACHE=.
+*** TODO Decide whether =ollama= ships by default or is opt-in
+Handoff calls =ollama= "optional". Likely shape: =llama.cpp= is the only mandatory runtime; =ollama= behind =INSTALL_OLLAMA= (default no) for users who prefer its model-manager API. Confirm.
+*** TODO Define config keys for the LLM block in =archsetup.conf.example=
+Likely: =INSTALL_LOCAL_LLM= (default yes), =LLM_RUNTIME= (=llama.cpp= / =ollama=), =LLM_MODEL_CACHE= (path), =LLM_MODELS= (space-separated, or empty → per-host autodetect). Lock names + defaults before writing install code.
+*** TODO Decide per-host model selection: auto-detect by =uname -n= vs explicit =LLM_MODELS=
+Auto-detect against a known-host table (ratio → Q6_K + 80B, velox → Q4_K_M + 8B) is simple for current machines but brittle for any new host (silently picks no models). Explicit =LLM_MODELS= per machine in =archsetup.conf= is more verbose but never surprises. Pick the default; the other stays available.
+*** TODO Decide network-down behavior for model prefetch
+Three shapes: (a) emit =error_warn= and write =/var/lib/archsetup/state/llm-models-pending= for inspection; (b) install a one-shot systemd unit that retries on next boot with network; (c) just log and forget — user re-runs the prefetch helper manually when network returns.
+
+Implementation work (gated on the decisions above):
+*** TODO Install =llama.cpp= with CPU + Vulkan backend where supported
+Add to the appropriate install section in =archsetup= (=llama.cpp= / =llama.cpp-vulkan= in AUR). Decide CPU-only vs Vulkan per host from the hardware detection already used for GPU drivers.
+*** TODO Install =ollama= behind config flag (if Decision 2 = opt-in)
+Add =ollama= package install gated on =INSTALL_OLLAMA=yes=.
+*** TODO Configure shared model cache + OpenAI-compat local endpoints
+Create =$LLM_MODEL_CACHE= with the right ownership; configure llama.cpp (and ollama if installed) to serve =127.0.0.1:8081= (coding) and =:8082= (general). Likely systemd user units; decide launcher pattern when implementing.
+*** TODO Prefetch per-host models (idempotent, non-fatal on network failure)
+Download the per-host model set (from Decision 4) into the cache; skip files that exist with matching size/hash. On failure, fall back per Decision 5. Models from HuggingFace GGUF mirrors (URLs locked at implementation time).
+*** TODO Ship a local-LLM smoke-test command
+Boot the configured endpoint and send a short prompt; surface success/failure + timing. Useful as both a post-install check and a triage tool when something later breaks. Likely =scripts/llm-smoke-test.sh=; runs at end of install if =INSTALL_LOCAL_LLM=yes=.
+
+Acceptance: fresh VM install of the ratio profile reaches an endpoint on =:8081= that answers a smoke prompt; velox profile gets Q4_K_M + 8B and answers a prompt within reasonable laptop latency; network-down install completes successfully with the pending-models warning surfaced.
+
+** TODO [#B] Add =uv= to the install playbook :tooling:python:
+:PROPERTIES:
+:LAST_REVIEWED: 2026-05-29
+:END:
+Add =uv= (Astral's Python package + script runner) to archsetup so fresh machines pick it up automatically. Currently installed by hand on ratio + velox (=/usr/bin/uv= 0.11.15), not in the standard set — a fresh install would skip it, and project scripts using PEP 723 inline-script metadata (=#!/usr/bin/env -S uv run --script= shebangs) would fail with =env: uv: No such file or directory=. Source: handoff from health 2026-05-29 ([[file:assets/outbox/2026-05-29-1127-from-health-todo-a-add-uv-to-the-install-playbook.org][outbox copy]]).
+
+Health requested [#A] (load-bearing for the PEP 723 pattern they're promoting + the rulesets template-script proposal). Demoted to [#B] for archsetup: no current install is broken (uv is pre-installed everywhere it's needed), and the shape matches the existing [#B] tooling-codification tasks (eask, signal-cli) — load-bearing for other projects, manually installed today, codify so fresh installs pick it up.
+
+- *Install via pacman* — =uv= is in extra (=pacman -S uv=). Cleanest path; auto-updates with the rest of the system. AUR =uv-bin= and Astral's official installer are alternatives but add a non-pacman path to maintain.
+- *Placement* — alongside the existing language-tooling block in =archsetup= (near =rustup=, =nvm=, or the Python set). Decide the exact section at implementation time.
+- *Verification* — post-install =which uv && uv --version=; PEP 723 end-to-end check per the health handoff (=/tmp/uv-test.py= shebang script with inline =requests= dep).
+
+Related: the new [#B] LLM task above may grow scripts that benefit from PEP 723 (e.g. =scripts/llm-smoke-test.sh= if Python-based). =uv= landing here removes that friction.
+
 ** DOING [#A] Separate dotfiles from archsetup
 SCHEDULED: <2026-05-21 Thu>
 :PROPERTIES:
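
The prefetch behavior described in the local-LLM task (skip files that already exist with a matching hash, and never fail the install when a download does not succeed) could be sketched roughly as below. This is only an illustration under stated assumptions: the function name `prefetch_model` and the `LLM_PENDING_FILE` variable are hypothetical and do not exist in archsetup, the pending-file path is borrowed from option (a) of the still-open network-down decision, and the sketch checks a SHA-256 hash only rather than size and hash.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; none of these names exist in archsetup today.

# Assumed location for the list of models still waiting to be downloaded
# (option (a) in the TODO). Overridable so the sketch can run unprivileged.
: "${LLM_PENDING_FILE:=/var/lib/archsetup/state/llm-models-pending}"

# prefetch_model URL DEST SHA256
# Always returns 0 so that a failed download cannot abort the install.
prefetch_model() {
    local url=$1 dest=$2 want=$3 part

    # Idempotency: skip when the exact file is already in the cache.
    if [ -f "$dest" ] && [ "$(sha256sum "$dest" | cut -d' ' -f1)" = "$want" ]; then
        echo "skip: $dest"
        return 0
    fi

    mkdir -p "$(dirname "$dest")"
    part="$dest.part"

    # Download under a temporary name, verify the hash, then move into place.
    if curl --fail --silent --show-error --location --output "$part" "$url" \
        && [ "$(sha256sum "$part" | cut -d' ' -f1)" = "$want" ]; then
        mv "$part" "$dest"
        echo "fetched: $dest"
        return 0
    fi

    # Non-fatal failure: clean up, record the pending model, warn, carry on.
    rm -f "$part"
    mkdir -p "$(dirname "$LLM_PENDING_FILE")"
    echo "$url $dest $want" >> "$LLM_PENDING_FILE"
    echo "warning: local offline LLM support is incomplete: $dest not fetched" >&2
    return 0
}
```

A caller would presumably invoke `prefetch_model "$url" "$LLM_MODEL_CACHE/<file>.gguf" "$sha256"` once per model in the per-host set, with the URLs and hashes fixed at implementation time as the task notes. Downloading to a `.part` name first means an interrupted transfer is never mistaken for a complete model on the next run.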
