diff options
| -rw-r--r-- | docs/design/2026-07-03-audio-panel-spec.org | 160 | ||||
| -rw-r--r-- | todo.org | 57 |
2 files changed, 215 insertions, 2 deletions
diff --git a/docs/design/2026-07-03-audio-panel-spec.org b/docs/design/2026-07-03-audio-panel-spec.org new file mode 100644 index 0000000..16a087d --- /dev/null +++ b/docs/design/2026-07-03-audio-panel-spec.org @@ -0,0 +1,160 @@ +#+TITLE: Audio Panel — the pulsemixer console +#+AUTHOR: Craig Jennings +#+DATE: 2026-07-03 +#+TODO: TODO | DONE +#+TODO: DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED + +* IMPLEMENTED Status +:PROPERTIES: +:ID: 71f556c6-ee02-47cc-a3be-68c8289380f3 +:END: +- [2026-07-03 Fri] IMPLEMENTED — built in a no-approvals speedrun in the + dotfiles repo (branch panel-bugfixing): engine (pactl), presenter, GTK + panel, PTT arming, bar indicator, and the bar/keybind wiring, across four + commits 65e5bb0..9601420. 102 unit tests + a passing AT-SPI smoke on velox. + All five Decisions below resolved. Live-eyeball validation (visual polish, + PTT-in-a-meeting, fader feel, the master-mute hardware key) is the one open + follow-up, tracked as a manual-testing task in todo.org. +- [2026-07-03 Fri] DRAFT — stub from the todo.org task "Audio panel spec" + (roam ask 2026-07-02) plus the 2026-07-03 waybar/sound design discussion. + Written to iterate alongside the prototype + (=working/sound-panel/sound-panel-prototype.html=). Spine is present; the + Decisions and Design detail get filled in as we go. + +* Metadata + +| Field | Value | +|--------+---------------------------------------------------| +| Status | implemented | +|--------+---------------------------------------------------| +| Owner | Craig Jennings | +|--------+---------------------------------------------------| +| Repo | dotfiles | +|--------+---------------------------------------------------| +| Kin | net panel + bt panel (architecture + aesthetic | +| | donors), desktop-settings panel (sibling) | +|--------+---------------------------------------------------| + +* Problem + +Audio control today is the pyprland audio scratchpad (Super+A) — a floating +pulsemixer TUI — plus scattered bar affordances: =pulseaudio= (volume, click +to mute sink), =pulseaudio#mic= (mic glyph + mic-toggle), Super+M audio-cycle +ring, Super+Shift+A mic-toggle. There's no single glanceable surface that +shows every sink and source, lets you set the default output/input, and +carries the meeting-grade mic controls Craig wants (a clean muted mode and a +hold-to-talk mode). The net + bluetooth panels set the pattern for exactly +this shape; audio is the third instrument in the family. + +* Goals + +1. One panel, opened from the bar's sound glyph, exposing the full pulsemixer + surface: every sink and source, per-device volume, per-device mute, and + switching the default output and input. +2. Replace the pyprland audio scratchpad (Super+A) as the primary audio UI. +3. Mic modes for meetings: *live*, *muted*, and *push-to-talk* (mic stays + muted except while Space is held, releasing re-mutes). +4. A *master quick-mute* — one action mutes all output — reachable from the + faceplate and a keybind. +5. Instrument-console aesthetic and architecture consistent with net + bt: + same faceplate, lamps, engraved sections, console keys, needle gauges, + verify-everything contract. +6. The bar glyph reflects live state: speaker + three arcs normally, a + speaker-with-✕ when muted (Craig's called glyphs). + +* Design sketch + +Prototype: =working/sound-panel/sound-panel-prototype.html= (the reference for +layout + idioms below). + +** Surface (from the prototype) + +- *Faceplate* — status lamp, sound glyph, state word (PLAYBACK / MUTED), a + MUTED badge, the SND·01 unit label, and the *master quick-mute switch* + (same switch idiom as net wifi / bt power), plus the close ✕. +- *OUTPUTS section* — one row per sink. Row body click = set default (gold + DEF tag). A machined fader sets that sink's volume; the trailing glyph + mutes just that device. Active/default row is emphasized (cream name, gold + lamp/glyph). +- *INPUTS section* — one row per source, same idioms. +- *Mic mode* — three console keys: LIVE / MUTED / PUSH·TALK. Push-to-talk + keeps the mic muted (red IN needle) and un-mutes only while Space is held. +- *Twin VU needles* — output level + input level, the sound analog of net + throughput and bt battery gauges. Needle goes red when its side is muted. + +** Architecture — clone the net/bt panel stack + +- GTK4 + gtk4-layer-shell, Blueprint =.blp= → committed =.ui= (=make ui=, + dev-only build dep). +- Humble-object split: a GTK-free PanelModel presenter (unit-tested like the + net/bt PanelModels) + thin composite-widget pages. Backing actions in a + GTK-free =audio.py= that shells to the audio control layer (pactl / + wpctl / pulsemixer — pick below), TDD'd with fake binaries. +- One gated AT-SPI smoke (=run-panel-smoke.sh= pattern). +- Shared instrument-console palette CSS asset (the one net/bt/settings all + load) — do not duplicate the palette block. +- Code lives in dotfiles =audio/= sibling to =net/= (src-layout, tests in + =tests/audio/=). + +* Decisions (Craig) + +** DONE Audio control backend — pactl vs wpctl vs pulsemixer +CLOSED: [2026-07-03 Fri] +Resolved: =pactl= (the engine module is =pactl.py=). Both ratio and velox run +PipeWire with the pipewire-pulse compat layer and no PulseAudio daemon, so +pactl and wpctl hit the same graph — but =pactl -f json= gives structured, +name-addressable output where wpctl offers only a volatile-id tree. Reads go +through =pactl -f json list sinks|sources= + =get-default-*=; writes target +devices by stable name behind an argv-charset guard. + +** DONE Push-to-talk mechanism under Wayland (feasibility — phase 1) +CLOSED: [2026-07-03 Fri] +Resolved: route (a), Hyprland dynamic binds. The phase-1 spike confirmed all +three primitives on velox (Hyprland 0.55.4): =hyprctl keyword bind/unbind= +adds and removes a bind live, =bindr= fires on release, and =pactl +set-source-mute @DEFAULT_SOURCE@ 0|1= toggles the mic cleanly. =ptt.py= arms a +press bind (un-mute) + a bindr (re-mute) on entering PTT mode and unbinds on +leaving, so the talk key isn't grabbed globally otherwise. No evdev needed. +Documented behavior: while PTT is armed, the talk key is the talk key. + +** DONE Quick-mute keybind + scope +CLOSED: [2026-07-03 Fri] +Resolved: the XF86AudioMute hardware key (Super+Shift+M turned out to be taken +by the monocle-layout bind, so the spec's assumption was wrong). The mute key +now runs =audio quick-mute=, which mutes every output (master), not just the +default sink — identical on a single-sink machine, correct on a multi-sink +one. Also reachable from the faceplate master switch and the panel. Scope: +master mute of all sinks, with verify-after-apply per sink. + +** DONE Bar glyph click map +CLOSED: [2026-07-03 Fri] +Resolved with the low-regret wiring: kept the existing =pulseaudio= waybar +module (left-click mute, scroll volume — no regression) and repointed its +right-click from the retired pulsemixer scratchpad to =audio-panel=. So: left += mute, right = open panel, scroll = volume. A fuller =custom/audio= indicator +(state-following speaker glyph + its own click map) is built and tested +(=indicator.py= + =waybar-audio=) but stays unwired until the new bar glyph +gets a live eyeball — the swap is a one-line waybar edit when Craig's ready. + +** DONE Fate of the existing audio affordances +CLOSED: [2026-07-03 Fri] +Resolved: Super+A repurposed from =pypr toggle audio= (the pulsemixer +scratchpad) to =audio-panel= — the panel is the primary audio UI now, so the +scratchpad is retired. Its definition still sits in the machine-local +=pyprland.toml= (not stowed) and can be deleted by hand. Kept: =pulseaudio= + +=pulseaudio#mic= waybar modules (glance + scroll + the mic-mute glance), +Super+M cycle, Super+Shift+A + XF86AudioMicMute mic-toggle. Changed: +XF86AudioMute → master quick-mute (see the quick-mute decision above). + +* Implementation phases + +1. Push-to-talk feasibility spike (decision above) — the one unknown; settle + the mechanism before committing the mic-mode design. +2. =audio.py= backings (list/get/set/mute/default for sinks + sources) — + pure engine, TDD with a fake audio backend. +3. PanelModel presenter (rows, default tracking, mic modes, master mute, + verify-after-apply) — unit-tested, no GTK. +4. Blueprint UI + sound bar glyph (normal / muted / ptt states) + open/close + wiring; shared palette css; AT-SPI smoke. +5. Bar-affordance consolidation per the decision above; retire the Super+A + scratchpad; keybinds. @@ -32,11 +32,14 @@ Start the network panel a bit wider — keep the right edge fixed (it's right-an ** TODO [#B] Net panel doctor results can't display :bug:waybar:network: The doctor diagnostic output is unreadable — the results well is too constrained to show the multi-line result. It should open a results box tall enough for several lines with a copy-results button; closing it via an X in the box's upper-right collapses the space back to what it occupied before. Raised from roam capture 2026-07-03. -** TODO [#B] Audio panel spec :feature:waybar:audio:solo: +** DONE [#B] Audio panel spec :feature:waybar:audio:solo: +CLOSED: [2026-07-03 Fri] :PROPERTIES: :LAST_REVIEWED: 2026-07-02 :END: -Work Craig's ask (roam inbox, 2026-07-02) into a spec, net/bt-panel kin: an audio panel replacing the pypr audio scratchpad (Super+A) with the same functionality — change the default/active output (speaker) and input (mic), volume control for both. The one new capability: a push-to-talk mic mode for meetings — mic stays muted except while the space bar is held, releasing re-mutes. (Hold-to-talk under Wayland needs a global key grab — likely a hyprland bind pair on press/release or an evdev listener; feasibility research belongs in the spec.) Related current bindings: Super+M audio-cycle ring, Super+Shift+A mic-toggle. +Went past the spec to a full build in a no-approvals speedrun. Spec is now IMPLEMENTED ([[file:docs/design/2026-07-03-audio-panel-spec.org]], all 5 Decisions resolved). The panel shipped in the dotfiles repo (branch panel-bugfixing, commits 65e5bb0..9601420): pactl engine, GTK-free presenter, GTK instrument-console panel (OUTPUTS/INPUTS device rows with faders + per-device mute, LIVE/MUTED/PUSH·TALK mic keys, twin VU gauges, master quick-mute), Hyprland-bind push-to-talk, bar indicator, and the bar/keybind wiring (Super+A → panel, XF86AudioMute → master quick-mute). 102 unit tests + a passing AT-SPI smoke on velox. Live-eyeball validation filed under Manual testing and validation. Apply steps + follow-ups handed to the dotfiles project inbox. + +Original ask (roam inbox, 2026-07-02): net/bt-panel kin — change default output/input, volume for both, push-to-talk mic mode for meetings, master quick-mute, bar sound-glyph state. Related bindings: Super+M audio-cycle ring, Super+Shift+A mic-toggle. Prototype: =working/sound-panel/sound-panel-prototype.html=. ** TODO [#B] Scrolling/Carousel layout: frame fit + wrap-around :hyprland: :PROPERTIES: @@ -253,6 +256,11 @@ Boot the configured endpoint and send a short prompt; surface success/failure + Acceptance: fresh VM install of the ratio profile reaches an endpoint on =:8081= that answers a smoke prompt; velox profile gets Q4_K_M + 8B and answers a prompt within reasonable laptop latency; network-down install completes successfully with the pending-models warning surfaced. +** TODO [#C] Voice dictation / speech-to-text input :feature:tooling: +Push-to-talk dictation that types transcribed speech into the focused Wayland window — usable at any text field, including the Claude Code terminal prompt and Emacs buffers. Claude Code has no built-in voice input; dictation has to happen at the OS level and inject text. Raised 2026-07-03. + +Tool choice is the open decision (needs Craig): =nerd-dictation= (Vosk, lighter, lower accuracy) vs a =whisper.cpp=-based daemon (heavier, higher accuracy, optional GPU). Wayland typing backend is =wtype= or =ydotool=. Scope once chosen: install + model download, a push-to-talk keybind (Hyprland), and an autostart entry; fold into archsetup so it lands on both daily drivers. Consider an Emacs-native path (=whisper.el=) as a complement for in-buffer dictation. + ** TODO [#B] Review post-archsetup laptop setup steps (velox 2026-04-10) :PROPERTIES: :LAST_REVIEWED: 2026-06-09 @@ -595,6 +603,51 @@ Parse yay errors and provide specific, actionable fixes instead of generic error Enhance existing indicators to show what's happening in real-time ** TODO Manual testing and validation +*** Audio panel: apply the new shims + configs (precondition for the tests below) +What we're verifying: the new =audio=/=audio-panel=/=waybar-audio= bin shims and the hyprland.conf + waybar config edits are live. New files need a restow (a plain =git pull= doesn't symlink them). Quit Hyprland or run from a TTY — restowing live Hyprland writes a stub hyprland.conf. +#+begin_src sh :results output +cd ~/.dotfiles && git pull +cd ~/.dotfiles && make restow hyprland +#+end_src +- Reload waybar: mod+B (or =killall -SIGUSR2 waybar=). +- Reload the Hyprland config: mod+Shift+R (or =hyprctl reload=). +Expected: =which audio audio-panel waybar-audio= resolves all three to ~/.local/bin symlinks; no stow conflicts reported. + +*** Audio panel: opens and reads the live graph +What we're verifying: Super+A opens the instrument-console panel (not the old pulsemixer scratchpad) and it shows the real devices. +- Press Super+A. +Expected: the audio panel opens top-right. OUTPUTS lists every sink, INPUTS every mic (no .monitor entries), the default device in each is emphasized (cream name, gold glyph), each row shows its volume percent, and the twin OUT·PLAY / IN·MIC needles deflect. Esc closes it. + +*** Audio panel: set default, volume fader, per-device mute +What we're verifying: the three row interactions drive the real graph and the panel re-reads to confirm. +- With two or more outputs present, click a non-default output row's body. +Expected: that device becomes default (gold DEF emphasis moves to it) and playback follows. +- Drag a device's fader. +Expected: the volume percent tracks the fader and the actual device volume changes (no lag, no jump to the border). +- Click a device's trailing mute glyph. +Expected: that one device mutes/unmutes; the glyph and caption reflect it; other devices unaffected. + +*** Audio panel: master quick-mute (faceplate switch + mute key) +What we're verifying: the master switch and the XF86AudioMute key both mute every output at once (not just the default sink). +- Flip the faceplate master switch. +Expected: every output mutes, the state word reads MUTED, the badge shows, the faceplate glyph goes to the slashed speaker. Flip back to restore. +- With two outputs audible, press the hardware mute key (XF86AudioMute). +Expected: both outputs mute (master), not just the default. Press again to unmute all. + +*** Audio panel: mic modes + push-to-talk (the meeting case) +What we're verifying: LIVE/MUTED work, and PUSH·TALK holds the mic muted except while the talk key is held — the one genuinely new capability, and the one only a live test can confirm. +- Click LIVE, then MUTED, watching the default mic row. +Expected: LIVE unmutes the mic (needle lifts), MUTED mutes it (needle red); the active mode key shows a gold lamp. +- Click PUSH·TALK. In a call app (or =audio status= in a loop), watch the mic while you hold Space, speak, then release. +Expected: mic is muted at rest, un-mutes for exactly as long as Space is held, re-mutes on release. Switching back to LIVE or MUTED disarms the hold (Space types normally again). + +*** Audio panel: visual polish eyeball + bar entry point +What we're verifying: the panel looks right in the dupre instrument-console language, and the bar opens it. +- Look over the whole panel: faceplate spacing, engraved section labels, fader styling, VU needle centering, the muted-vs-audible glyph/color states. +Expected: it reads as a sibling of the net and bt panels; nothing overflows the gold border; the faders and gauges look machined, not stock GTK. +- Right-click the waybar sound module. +Expected: the panel opens (left-click still mutes, scroll still changes volume). + *** Timer dialog: Escape cancels at every step What we're verifying: Escape in the real fuzzel dialogs aborts the whole flow (the fix rides fuzzel's abort exit code; the unit tests fake it, this is the live confirmation). - Left-click the timer module on the bar, pick "timer", then hit Escape at the Duration prompt. |
