#+TITLE: Waybar Network Module — Design Spec
#+AUTHOR: Craig Jennings & Claude
#+DATE: 2026-06-29
* Status
*Phases 1-3 SHIPPED* (2026-06-29 → 2026-06-30, dotfiles). The core module is live:
the =net= engine (=status/probe/list/up/down/add/edit/remove/rescan/diagnose/repair/
doctor/portal/speedtest=), the =waybar-net= indicator (split-cadence cache, redacted
event log, display-only airplane absorption per decision 12), and the GTK4
layer-shell panel (Connections / Diagnose / Repair / Speed test) with the settled bar
clicks (left = panel, middle = =net portal=, right = =net-fix=; airplane on
Super+Shift+A). 230+ net tests; full dotfiles suite green. Live-verified on velox.
Built on top since the original spec:
- *Captive-portal login engine* (2026-06-30, dotfiles =a7d7559=) — =net portal= now
runs a native =portal-login= repair tier (drop DoT → recover the portal URL from
the redirect → open a throwaway browser profile → auto-restore DoT once online),
replacing the old shell-out to =captive= for the force-portal flow. =net portal
--restore= is the manual fallback.
- *Portal UX fixes from live testing* (2026-06-30, dotfiles =eef6b0b=) — removed a
polkit-gated =resolvectl flush-caches= that popped an auth dialog (the DoT-drop
restart already clears the cache); added an already-online short-circuit so a
forced run on a working connection opens nothing; suppressed Chrome's first-run
wizard; moved =net portal= off the terminal into the panel status line; hardened
the portal-URL extractor against Firefox's detection page.
- *Panel auto-hide + Close button* (2026-06-30, dotfiles =450b7f0=) — the panel
closes on focus-out (popup behavior, suppressed while a child dialog holds focus)
and carries a Close button bottom-right.
*V2 redesign in flight* (designed 2026-06-30, not yet built — see todo.org "Network
panel redesign — no terminals, verify-everything, full failure coverage"). It
reverses two earlier choices and widens coverage:
- *No terminals anywhere.* =net-popup= is removed; every action and result renders
in the panel. This depends on a passwordless privileged path — a root-owned helper
plus a narrow NOPASSWD sudoers rule, archsetup-installed — because an in-panel
worker thread has no tty to prompt for a password. Reverses decision 11's
"privileged tiers run in a terminal".
- *New navigation* — top tabs Connections | Diagnostics | Performance. Diagnostics
merges Diagnose + Repair (sub-row Diagnose | Get Me Online | Advanced; a shared
area below shows diagnose items and streams repair progress; Advanced reveals the
individual repair buttons, renamed with tooltips). Speed test lives under
Performance.
- *Verify every action* (each mutating op confirms its effect before reporting
success) and *detect + respond to every failure mode* — the full ~44-mode catalog,
edge cases included, lives in the redesign task and supersedes the table below.
Phase 4 (docs / rollout) and Phase 5 (VPN) remain. Review incorporated (Codex,
2026-06-30): four review rounds + Craig's cj comments are all dispositioned
([40/40], no open findings) — the fourth round reshaped the V2 panel UX (single nav
target, saved-vs-available groups, join-from-row, the auth matrix, progressive
loading, a findable diagnostics report, and the Waybar visual contract; see "V2 panel
UX"). Phases 1-3's manual live checks are under todo.org "Manual testing and
validation".
* Goal
One waybar network component that does the whole job: shows connection state
(including the missing "associated but no internet / captive portal" state),
manages connections from a dropdown (nmcli-backed; secrets stay in
NetworkManager's own store, no separate credential file), and runs the network
diagnostics and remediation off the same place
(captive-portal detection + forcing, bounce/reset, gateway/DNS checks, speed
test).
It unifies three todo tasks that are really one feature:
- =[#C]= "archsetup Waybar Wi-Fi module should show no-internet state" — the
indicator state plus the 2026-06-22 roam expansion (bounce, diagnostics, speed
test off the component).
- =[#B]= "Network-manager dropdown, nmcli-backed" — the management dropdown. (The
todo task's original "GPG-stored secrets" framing is superseded: secrets stay in
NM's own store, decision 5.)
- The network diagnostics already shipped in =captive= (the hotel/captive-portal
tool, formerly =login-page=) become this module's diagnostics engine rather
than a standalone CLI.
* Scope
** In
- *Indicator* — wifi/ethernet icon + signal + SSID, plus an internet sub-state:
online / captive / no-internet / connecting / disconnected / airplane.
- *Absorbs the airplane module* — the airplane state + toggle move into
=custom/net= (airplane is a network concern). Once this ships, the standalone
=custom/airplane= module, the =waybar-airplane= + =airplane-mode= scripts, their
=tests/=, and the css are deleted (listed under Files touched). The
desktop-settings panel (sibling =[#B]=) no longer needs an airplane row.
- *Interface-correct* — targets the wifi (or chosen) device, not the
default-route interface, so an active USB tether or wired link can't mask
wifi state. (Same lesson =captive= fixed; the current =custom/netspeed= keys
off the default route and has the bug.)
- *Connection management (panel)* — list saved connections most-recently-used
first, live signal for in-range wifi, click to switch; add / edit / remove for
open + WPA-PSK; activate any existing saved profile (including enterprise ones
NM already stores); ethernet↔wifi and wifi↔wifi switching even when a link
appears mid-session.
- *Diagnostics (panel)* — read-only Diagnose (captive probe 204-vs-portal with
the extracted portal URL, gateway ping, DNS config) separated from mutating
Repair. Repair has tiers, lightest first: rfkill-unblock, per-connection reset
(fresh MAC), full-stack bounce (=nmcli networking off/on=, then restart
NetworkManager if that fails), and the temporary 1.1.1.1 override test. Each
Repair action confirms and verifies cleanup.
- *Speed test (panel)* — down/up/ping with a progress indicator and last-result
shown, via the already-installed =speedtest-go --json=.
- *Connection secrets* — none of our own. Settings and passwords live where NM
already keeps them: =/etc/NetworkManager/system-connections/*.nmconnection=
(root-only =0600=, the PSK/EAP secret stored inline). We read/write them through
nmcli, which handles the privilege. No separate file, no GPG, no gpg-agent — one
fewer dependency, and NM's store is already the secure-at-rest source of truth.
- *Persistence* — connectivity probe result cached in the runtime dir so the
bar reads it cheaply between probes.
- *Observability* — a redacted JSONL event log so a post-failure session can
diagnose without re-running destructive actions.
** Out (v1, note for later)
- No replacement of NetworkManager's connection engine. NM stays the thing that
connects; we drive it via nmcli.
- No add/edit *form* for WPA-Enterprise / 802.1X in v1. The reason is effort vs
payoff: 802.1X has many interdependent fields (CA cert, client cert, identity,
anonymous identity, phase-2 auth) where a wrong entry silently fails auth, so a
trustworthy form is a lot of UI for connections Craig rarely adds (open +
WPA-PSK covers home, hotels, and phone hotspots). v1 still *activates* existing
saved enterprise profiles and points editing at =nmtui=/=nmcli=. Settled
(Craig, 2026-06-29): enterprise add/edit is vNext — 24 saved profiles on velox,
0 enterprise, so the form would be unused UI; if one ever appears nmtui adds it
once and the module activates it thereafter.
- No per-connection captive-portal *auto-login* in v1. (That would mean storing a
portal's login form answers — room number, surname, a checkbox — and replaying
them automatically when a known portal is detected, so the page never appears.
Out for v1 because every portal's form differs and it means storing per-venue
answers; v1 just opens the portal for you.)
- No graphing/history of speed-test results beyond the last run.
- No static-IP / proxy / metered / MAC-randomization editing in v1 (activate
existing, edit elsewhere).
- No VPN / WireGuard management in v1, but it's a planned later phase (Phase 5),
not a permanent exclusion — it folds the existing archsetup wireguard tooling
into the same panel/CLI.
- The desktop-settings dropdown (sibling =[#B]=) is a separate module, but it
shares the GTK4 layer-shell panel shell built here.
* Architecture
Three layers. Keep the bar cheap, the panel rich, the logic in one tested place.
1. *Engine* — a =net= Python package (src-layout, unittest), exposing a CLI. Wraps
every nmcli op and owns the diagnostics. Emits JSON. This is the testable
core (fake =nmcli= / =curl= / =speedtest-go= on PATH, like the existing
=waybar-netspeed= and =waybar-sysmon= test harnesses). Precedent: pocketbook is
Python in the dotfiles repo; =wtimer= is Python for the same testability
reason.
2. *Indicator* — a thin =waybar-net= script that calls =net status --json= and
renders icon + signal + state + tooltip. Replaces =custom/netspeed=
(throughput folds into the tooltip).
3. *Panel* — a GTK4 + gtk4-layer-shell app (mirrors pocketbook's structure)
that imports the engine. Hosts connection management, diagnostics, and the
speed test.
How the existing pieces map in:
- =captive= (bash, shipped) — its cheap portal-detection logic is mirrored natively
in the engine for the fast status path so the bar never blocks on a subprocess,
and it still exposes a =--probe-json= mode the engine reuses. *As built (2026-06-30):
the force-portal flow is now native too* — =repair.py='s =portal-login= tier does
the DoT drop, portal-URL recovery, clean-browser launch, and auto-restore in
Python, so =net portal= no longer shells out to =captive= for it. =captive= stays a
usable standalone CLI.
- =waybar-netspeed= (sh, shipped) — retired; its throughput sampling moves into
the engine's status output and renders in the indicator tooltip only.
- =nmcli= — the connection backend for every op.
Language note: the engine is Python; the indicator is a thin Python or sh
wrapper over =net status --json=. The bar path must stay fast (see Performance
budgets), so the indicator does no network I/O itself — it reads link state and
the cached connectivity result.
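A minimal sketch of the indicator's render step, assuming Waybar's JSON return type for custom modules (=text=, =tooltip=, =class=). The placeholder labels, the =net-<state>= class naming (extrapolated from =net-degraded=), and the =render= helper are illustrative assumptions, not the shipped =waybar-net=.

```python
import json

# Placeholder labels; the real glyphs are not given in this section.
GLYPHS = {
    "online": "NET",
    "captive": "PORTAL",
    "no-internet": "NO-NET",
    "connecting": "...",
    "disconnected": "OFF",
    "airplane": "AIR",
    "wired": "ETH",
    "degraded": "?",
}


def render(status):
    """Turn a `net status --json` payload into one Waybar JSON line."""
    state = status.get("state") or "degraded"
    if state not in GLYPHS:
        state = "degraded"  # unknown states fall back to the neutral glyph
    tooltip = [f"{status.get('ssid') or 'no network'} ({state})"]
    if status.get("signal") is not None:
        tooltip.append(f"signal {status['signal']}%")
    return json.dumps({
        "text": GLYPHS[state],
        "tooltip": "\n".join(tooltip),
        "class": f"net-{state}",
    })
```

The function only formats what the engine already produced, which is what keeps network I/O out of the bar path.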
Privileged-path model (v2, planned): repairs that need root (rfkill unblock, nmcli
modify/up, networking off/on, =systemctl restart NetworkManager/systemd-resolved=,
resolvectl dns/revert, the DoT toggle) go through a single root-owned helper
installed by archsetup, with a narrow NOPASSWD sudoers rule scoped to that helper
only (never a blanket =mv=/=systemctl= rule). =repair.py= calls =sudo <helper>
<verb>=. This is what lets every action run in-panel with no terminal: a GTK worker
thread has no tty, so without a passwordless path it can't prompt. It also fixes a
latent bug in the shipped portal flow — the detached DoT-restore watcher runs with
no tty and silently fails to restore encrypted DNS when sudo creds aren't cached.
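One way the helper could keep the sudoers rule narrow is a fixed verb table, sketched below. The verb names and the =resolve= function are hypothetical; the commands come from the list above, and the helper as planned may differ. Nothing from the caller is interpolated into the command line.

```python
# Hypothetical verb table: each verb maps to one fixed argv.
VERBS = {
    "rfkill-unblock": ["rfkill", "unblock", "wifi"],
    "networking-off": ["nmcli", "networking", "off"],
    "networking-on": ["nmcli", "networking", "on"],
    "restart-nm": ["systemctl", "restart", "NetworkManager"],
    "restart-resolved": ["systemctl", "restart", "systemd-resolved"],
}


def resolve(verb):
    """Return the argv for an allowlisted verb, or raise ValueError."""
    try:
        return list(VERBS[verb])  # copy so callers cannot mutate the table
    except KeyError:
        raise ValueError(f"unknown verb: {verb!r}") from None


# A root-owned entry point would then run something like:
#   subprocess.run(resolve(sys.argv[1]), timeout=60)
# while repair.py calls `sudo <helper> <verb>`.
```

Because the table is closed, a request such as =systemctl stop sshd= has no verb to ride on and is rejected before anything executes.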
* Repository + dependencies
- *Code lives in the dotfiles repo* (=~/.dotfiles=), not archsetup. The =net=
package sits in-tree like pocketbook (src-layout, unittest, Makefile target);
=waybar-net= and the =net= CLI entry live in the hyprland tier
(=hyprland/.local/bin/=). Tests under =tests/net/= and =tests/waybar-net/=.
archsetup owns only the *dependency install*, not the code.
- *archsetup installs the deps* in its Hyprland step: =gtk4-layer-shell=,
=python-gobject=, plus =nmcli=/=curl=/=resolvectl=/=rfkill= (already present via
NetworkManager/curl/systemd/util-linux). Speed test uses =speedtest-go= (AUR
=speedtest-go-bin=, already installed on velox); archsetup adds it to the AUR
list. librespeed-cli is the documented fallback if a self-hosted LibreSpeed
server is ever wanted. No =gpg= dependency (secrets live in NM's own store).
- *Daily-drivers*: a stowed-script + AUR-dep feature, so ratio needs the same
=git pull= + stow + the archsetup-added deps. Note the manual dep step in the
rollout.
** Makefile targets (console recovery is a first-class path)
=net doctor= and the diagnostics are reachable from a bare TTY when waybar and
the GUI are down — that's the case where you most need them. The dotfiles
Makefile carries targets that wrap the =net= CLI so "get back online" is one make
command from the console:
- =make online= — =net doctor --fix= (diagnose, then apply the lightest repair:
rfkill-unblock → reset → bounce → open portal). The headline recovery target.
- =make net-doctor= — =net doctor= (read-only diagnose + recommendation).
- =make net-status= / =make net-diagnose= / =make net-portal= / =make net-reset=
/ =make net-bounce= — the individual ops.
- =make test= — already runs =tests/*=; the =net= package's unittest suites are
collected the same way.
These intentionally need only nmcli/curl/rfkill (no GUI, no waybar, no Python
GTK), so they work from a TTY on a broken graphical session.
* Connectivity model — split cadence
The indicator polls every ~2s, but a real internet/captive probe every 2s wastes
battery and can re-trigger a captive portal. So split it:
- *Fast path (every poll, cheap, no network)* — interface, type, SSID, signal,
IPv4 presence, throughput sample. From nmcli / sysfs only. No network I/O.
- *Slow path (cached, TTL ~45s)* — the actual internet/captive probe (the 204
check + meta-refresh portal extraction). Result cached at
=$XDG_RUNTIME_DIR/waybar/net-connectivity.json= with a timestamp.
The indicator reads the cache each poll. When the cache is older than the TTL,
=net status= kicks =net probe= in the background (spawn + detach, never awaited)
and renders the last cached sub-state meanwhile. A user-triggered
diagnose/reconnect refreshes the cache immediately. This keeps the bar
responsive and the portal un-poked.
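The read-then-kick behavior can be sketched as follows. This assumes the cache is a dict carrying a =ts= epoch timestamp and a =result= field; the helper names are illustrative, not the engine's actual functions.

```python
import subprocess
import time

TTL_S = 45.0  # slow-path TTL from the spec (~45s)


def probe_due(cache, now, ttl=TTL_S):
    """True when there is no cached verdict or it is older than the TTL."""
    if not cache or "ts" not in cache:
        return True
    return now - cache["ts"] > ttl


def kick_probe(argv=("net", "probe")):
    """Spawn the probe detached and return at once; it is never awaited."""
    try:
        subprocess.Popen(
            list(argv),
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            start_new_session=True,  # detach from the bar's process group
        )
    except OSError:
        pass  # a missing binary must not break the bar path


def connectivity_for_status(cache, now=None):
    """Return the last cached sub-state, kicking a refresh if it is due."""
    now = time.time() if now is None else now
    if probe_due(cache, now):
        kick_probe()
    return (cache or {}).get("result", "unknown")
```

The caller renders whatever =connectivity_for_status= returns, so a slow or failing probe costs the bar nothing beyond the spawn.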
** Concurrency, atomicity, staleness
- *Single-flight* — =net probe= takes a lock file at
=$XDG_RUNTIME_DIR/waybar/net-probe.lock= (flock, non-blocking). A second probe
while one runs is a no-op, so a flapping 2s poll can't pile up overlapping
probes.
- *Atomic writes* — the cache is written to a temp file + =os.replace= (atomic
rename), so a reader never sees a half-written cache. Same pattern as =wtimer=.
- *Max probe runtime* — the probe has a hard timeout (≤ 6s total: curl
=--max-time 5= + slack). On timeout it writes an =unknown= result, never hangs.
- *Stale classes* the indicator distinguishes: fresh (< TTL), stale (TTL..3×TTL,
shown with a subdued/aging hint), expired (> 3×TTL → treat as unknown),
unknown (no cache / probe failed). The bar never shows a confident "online"
past the expired threshold.
- *Invalidation* — the cache records the iface + SSID + active-connection UUID it
was taken under; a change in any of them invalidates it immediately (a
reconnect must not show the old network's verdict).
- *Crash cleanup* — a stale lock older than the max runtime is ignored/reclaimed.
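A sketch of the three mechanics above, using only the standard library: temp file plus =os.replace= for the atomic write, a non-blocking =flock= for single-flight, and the age classes. Function names and the 45s default are assumptions taken from the numbers in this section.

```python
import fcntl
import json
import os
import tempfile


def write_cache_atomic(path, payload):
    """Write JSON to a temp file in the same directory, then rename over path."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".net-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(payload, fh)
        os.replace(tmp, path)  # atomic on the same filesystem
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise


def try_lock(lock_path):
    """Return a locked file object, or None if another probe holds the lock."""
    fh = open(lock_path, "w")
    try:
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fh.close()
        return None
    return fh  # keep it open for the probe's lifetime; closing releases the lock


def classify_age(age_s, ttl=45.0):
    """Map cache age to fresh / stale / expired / unknown."""
    if age_s is None:
        return "unknown"
    if age_s < ttl:
        return "fresh"
    if age_s <= 3 * ttl:
        return "stale"
    return "expired"
```

One property worth noting: the kernel drops a =flock= when the holding process exits, so a crashed probe does not leave the lock held even though the lock file itself remains on disk.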
* Performance budgets (hot path)
The bar exec path (=waybar-net= → =net status=) must stay responsive:
- *Budget*: =net status= returns in < 100ms typical, < 250ms worst case.
- *No sleeping in the bar path.* Throughput is sampled from two reads of
=/sys/class/net/<iface>/statistics/{rx,tx}_bytes= across the *waybar poll
interval itself* (delta since the last cached sample + timestamp), not via an
in-process =sleep= like the old =waybar-netspeed=. The cache holds the prior
counters.
- *Subprocess cap*: at most one =nmcli= invocation on the hot path (a single
=nmcli -t -f ...= multi-field query), plus sysfs reads. Never a per-field
nmcli call.
- *Every subprocess has a timeout* (=nmcli --wait 2=, =subprocess timeout=). On
timeout or error the indicator emits a degraded JSON state (class
=net-degraded=, a neutral glyph) rather than blocking or crashing waybar.
- *Benchmark test*: a fake slow =nmcli= asserts =net status= still returns within
budget by falling back to the degraded state.
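The sleep-free throughput sample and the bounded subprocess call can be sketched as below. The sample dict shape (=rx=, =tx=, =ts=) and the helper names are assumptions, as is treating =rx_bps= / =tx_bps= as bytes per second.

```python
import subprocess


def throughput_bps(prev, cur):
    """Rates from two counter samples taken one poll apart.

    Each sample is a dict with rx, tx (byte counters) and ts (seconds).
    """
    dt = cur["ts"] - prev["ts"]
    if dt <= 0:
        return {"rx_bps": None, "tx_bps": None}
    # Counters can restart when an interface is re-created; clamp to zero.
    rx = max(cur["rx"] - prev["rx"], 0)
    tx = max(cur["tx"] - prev["tx"], 0)
    return {"rx_bps": rx / dt, "tx_bps": tx / dt}


def bounded_query(argv, timeout_s=2.0):
    """Run one query with a hard timeout; None signals the degraded state."""
    try:
        done = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s, check=True
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        return None
    return done.stdout
```

The previous sample lives in the cache, so each poll does one read and one subtraction. A =None= from =bounded_query= is the caller's cue to emit the =net-degraded= state instead of waiting.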
* Engine — =net= CLI surface
Every subcommand whose output a machine reads takes =--json=. The CLI sits on
pure formatting/state functions, with I/O (nmcli, curl, files) kept at the edges.
On failure, every subcommand exits non-zero with a JSON error envelope (see JSON
schemas).
- =net status [--json] [--iface IF]= — fast link state + cached connectivity
sub-state + throughput. The indicator's source. Never does network I/O.
- =net probe [--iface IF]= — run the connectivity/captive probe now, update the
cache (single-flight, atomic), print online | captive (+ portal URL) |
no-internet | unknown. Mirrors =captive='s cheap detection natively.
- =net list [--json]= — saved connections, MRU order, active flag, plus in-range
wifi with signal.
- =net up <uuid>= / =net down [--iface IF]= — switch / disconnect. Operates on
UUID, not name (see nmcli contract).
- =net add= / =net edit <uuid>= / =net remove <uuid>= — manage connections
(open + WPA-PSK) through nmcli; the secret lands in NM's own
=.nmconnection=. Enterprise profiles are activate-only.
- =net rescan [--iface IF]= — wifi rescan.
- =net diagnose [--json]= — read-only report: gateway ping, DNS config, captive
probe. The structured contract below. Doubles as the post-failure snapshot.
- =net repair <action> [--json]= — mutating remediation, lightest first:
=rfkill= (unblock + radio on), =reset= (fresh MAC), =bounce= (full-stack:
=nmcli networking off/on=, escalating to =systemctl restart NetworkManager=),
=dns-test= (temporary 1.1.1.1 override, auto-reverted). Each confirms via the
caller and verifies cleanup.
- =net doctor [--json] [--fix]= — one-shot "get me online" mode for the console:
runs the full diagnose, then applies the lightest repair that fits (unblock
rfkill, reset, bounce, open portal) — read-only without =--fix=, acting with
it. The TTY recovery path when waybar/the GUI is down (see the Makefile
targets).
- =net portal [--restore]= — the native captive-login flow (=repair.py= =portal-login=
tier): short-circuits if already online, else drops DoT to plain DNS, recovers the
portal URL from the redirect, opens it in a throwaway browser profile, and spawns a
detached watcher that restores DoT once online. =--restore= forces the restore now.
- =net speedtest [--json]= — =speedtest-go --json= run; down/up/ping.
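The surface above maps naturally onto =argparse= subcommands. The sketch below covers a subset (add/edit/remove/rescan omitted for brevity); =build_parser= and the =command= helper are illustrative names, not the shipped entry point.

```python
import argparse

REPAIR_ACTIONS = ("rfkill", "reset", "bounce", "dns-test")


def build_parser():
    parser = argparse.ArgumentParser(prog="net")
    sub = parser.add_subparsers(dest="cmd", required=True)

    def command(name, json_flag=False, iface=False):
        # Small helper so shared flags are declared the same way everywhere.
        p = sub.add_parser(name)
        if json_flag:
            p.add_argument("--json", action="store_true")
        if iface:
            p.add_argument("--iface", metavar="IF")
        return p

    command("status", json_flag=True, iface=True)
    command("probe", iface=True)
    command("list", json_flag=True)
    command("up").add_argument("uuid")
    command("down", iface=True)
    command("diagnose", json_flag=True)
    command("repair", json_flag=True).add_argument("action", choices=REPAIR_ACTIONS)
    command("doctor", json_flag=True).add_argument("--fix", action="store_true")
    command("portal").add_argument("--restore", action="store_true")
    command("speedtest", json_flag=True)
    return parser
```

For example, =build_parser().parse_args(["doctor"])= leaves =fix= false, which matches the read-only default described for =net doctor=.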
* nmcli contract
The command wrapper is the reliability boundary; SSIDs and connection names
contain spaces, colons, duplicates, hidden names, and non-ASCII. Rules:
- *Terse, field-selected output*: =nmcli -t -f <fields> --escape yes ...= and
=nmcli -g <fields> ...= (get-values) for single-value reads. Parse with the
documented escaping (=\:= and =\\=); never naive =cut -d:=.
- *UUID is the handle.* Every saved-profile op (=up=, =down=, =modify=, =delete=)
uses the connection UUID, never the display name — names duplicate and contain
separators. =net list= surfaces UUIDs; the panel maps row → UUID.
- *Wait budgets*: activation/deactivation use =nmcli --wait <n>= with an explicit
budget (hot-path reads =--wait 2=; activation =--wait 30=). No unbounded waits.
- *Connectivity*: NM's own =nmcli networking connectivity= can return
=none/portal/limited/full/unknown=. Use it as a *cheap hint* on the fast path
when present, but the authoritative captive verdict is still our own probe
(NM's portal detection is coarser and config-dependent).
- *Parser tests* (fake nmcli fixtures): escaped colons and backslashes in SSIDs,
embedded newlines, duplicate connection names, hidden SSID (empty name),
non-ASCII SSID, the wired-appears-mid-session case, and the multi-active case
(wifi + tether both up).
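A minimal sketch of the escaping-aware split that replaces a naive =cut -d:=, assuming one record per line of =nmcli -t= output. =split_terse= is a hypothetical helper name; embedded newlines need record-level handling that this sketch does not attempt.

```python
def split_terse(line):
    # Split one terse-mode line on unescaped colons, undoing the
    # backslash escapes nmcli applies to ':' and '\' inside values.
    fields = []
    current = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            current.append(line[i + 1])  # keep the escaped character literally
            i += 2
            continue
        if ch == ":":
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields
```

A hidden SSID shows up as an empty field, which the split preserves rather than collapsing, so field positions stay stable.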
* JSON schemas
Versioned (="v": 1=) envelopes so tests lock the contract. Sketches (fields
nullable unless noted):
- =status=: ={v, iface, type: wifi|ethernet|none, ssid, signal, ipv4,
gateway, throughput: {rx_bps, tx_bps}, connectivity: online|captive|no-internet|unknown,
connectivity_age_s, connectivity_class: fresh|stale|expired|unknown, state:
online|captive|no-internet|connecting|disconnected|airplane|wired|degraded}=.
- =probe=: ={v, result: online|captive|no-internet|unknown, portal_url, http_code,
redirect_host, elapsed_ms, ts}=.
- =list=: ={v, connections: [{uuid, name, type, active, last_used, signal,
in_range, security}]}=.
- =diagnose=: ={v, steps: [<diagnostic step, see contract>], overall:
ok|warn|fail}=.
- =speedtest=: ={v, down_mbps, up_mbps, ping_ms, server, elapsed_ms, ts}=.
- error envelope (any command): ={v, error: {code, message, detail, partial:
bool}}= with a non-zero exit.
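To make the envelopes concrete, here is a sketch of two builders that follow the field lists above. The function names and the validation are assumptions; only the keys come from the schema sketches.

```python
import json

SCHEMA_VERSION = 1
PROBE_RESULTS = ("online", "captive", "no-internet", "unknown")


def error_envelope(code, message, detail=None, partial=False):
    """Error envelope any command can emit alongside a non-zero exit."""
    return {
        "v": SCHEMA_VERSION,
        "error": {
            "code": code,
            "message": message,
            "detail": detail,
            "partial": bool(partial),
        },
    }


def probe_envelope(result, portal_url=None, http_code=None,
                   redirect_host=None, elapsed_ms=None, ts=None):
    """Probe payload; every field except v and result may be null."""
    if result not in PROBE_RESULTS:
        raise ValueError(f"bad probe result: {result!r}")
    return {
        "v": SCHEMA_VERSION,
        "result": result,
        "portal_url": portal_url,
        "http_code": http_code,
        "redirect_host": redirect_host,
        "elapsed_ms": elapsed_ms,
        "ts": ts,
    }
```

Building every payload through functions like these gives tests a single place to lock the key set and the version number.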
* Diagnostics contract
=net diagnose --json= returns an ordered list of steps. Each step is the unit the
panel renders and the log records:
- =id= — stable identifier (e.g. =link=, =dhcp=, =gateway-ping=, =dns-config=,
=dns-resolve=, =http-probe=, =portal=).
- =status= — =pending | running | pass | warn | fail | skipped=.
- =title= — short human label.
- =evidence= — redacted detail (the value seen), per the redaction rules.
- =elapsed_ms=.
- =safety= — =read-only= or =mutating= (diagnose steps are all read-only).
- =next_action= — what the user/agent should do on warn/fail (e.g. "open portal",
"reset connection", "switch network").
Repair actions (=net repair=) carry the same shape but =safety: mutating=, plus a
=cleanup_verified: bool= field (e.g. the DNS override was reverted) and a
terminal =cleanup-unverified= status when revert can't be confirmed.
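The step shape can be sketched as a dataclass. The field names follow the list above; the =overall= rollup (worst step wins) is an assumption, since this section does not say how =overall: ok|warn|fail= is derived.

```python
from dataclasses import asdict, dataclass
from typing import Optional

STATUSES = ("pending", "running", "pass", "warn", "fail", "skipped")


@dataclass
class Step:
    id: str
    title: str
    status: str = "pending"
    evidence: Optional[str] = None      # already redacted by the caller
    elapsed_ms: Optional[int] = None
    safety: str = "read-only"           # diagnose steps are all read-only
    next_action: Optional[str] = None

    def __post_init__(self):
        if self.status not in STATUSES:
            raise ValueError(f"bad status: {self.status!r}")


def overall(steps):
    """Assumed rollup: any fail gives fail, else any warn gives warn, else ok."""
    statuses = {step.status for step in steps}
    if "fail" in statuses:
        return "fail"
    if "warn" in statuses:
        return "warn"
    return "ok"
```

=asdict(step)= yields the JSON-ready dict, so the panel and the event log can consume the same object.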
** Diagnose vs Repair (read-only vs mutating)
The panel separates them visually and behaviorally:
- *Diagnose* — probe, gateway ping, DNS config read, captive check. No state
change, no sudo, runnable freely.
- *Repair* — reset (fresh MAC, deletes+recreates the NM profile), DNS override
test (mutates resolver, auto-reverts), portal force. Each needs an explicit
confirm, shows that it's privacy/state-changing, and verifies cleanup. A
Repair whose cleanup can't be verified ends in a visible =cleanup-unverified=
state, never a silent success.
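The "never a silent success" rule can be sketched as a wrapper around any temporary mutation, such as the DNS override test. =run_repair= and its three callables are hypothetical; the confirm prompt and the privileged call are left to the caller.

```python
def run_repair(apply, revert, verify_reverted):
    """Apply a temporary mutation, revert it, and report whether cleanup held.

    apply / revert perform the change and its undo; verify_reverted()
    returns True only when the original state is confirmed back in place.
    """
    result = {"safety": "mutating", "status": "pass", "cleanup_verified": False}
    try:
        apply()
    except Exception as exc:
        result["status"] = "fail"
        result["evidence"] = str(exc)
    # Always attempt the revert, even if apply failed part-way through.
    try:
        revert()
        result["cleanup_verified"] = bool(verify_reverted())
    except Exception:
        result["cleanup_verified"] = False
    if not result["cleanup_verified"]:
        result["status"] = "cleanup-unverified"  # overrides pass and fail alike
    return result
```

Letting =cleanup-unverified= override every other outcome is a design choice in this sketch: a resolver left in an unknown state is treated as more urgent than the result of the test itself.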
* Failure states, messages, recovery
Each row below gives the *exact, final* user-facing string (not a template) with
=<placeholders>= for redacted evidence, plus the evidence field included and the
next action. The string is canonical: every surface renders the same text, so
there's one source of truth.
Per-surface rendering of the canonical string:
- *Indicator* — the matching glyph + CSS class; the string is the tooltip
(untruncated).
- *Notification* (=notify=) — title = the failure label, body = the string.
- *CLI* — the string on stderr; =--json= puts it in =error.message= with the
evidence in =error.detail= and a stable =error.code=.
- *Panel* — the string as the section banner, with the diagnostic step's evidence
shown beneath.
Evidence is always redacted per the redaction rules (SSID/host shown; PSK/EAP/
portal tokens never).
- *associated, no DHCP* — "Connected to <SSID>, no IP (DHCP failed)" →
evidence: SSID, iface → reset / reconnect.
- *no-internet* — "On <SSID>, no internet (gateway reachable, no route out)" →
diagnose / switch network.
- *captive* — "Captive portal at <host> — login required" → Open portal.
- *DNS hijack* — "DNS is being redirected (portal)" → Open portal.
- *DNS broken* — "DNS not resolving (hotel DNS down); 1.1.1.1 works" → use
override / report.
- *HTTP intercepted* — "Traffic is being intercepted before it leaves" → Open
portal.
- *sudo declined* — "Reset needs admin; it was declined — nothing changed" →
retry with auth.
- *command timed out* — "<op> timed out; the system was left unchanged" → retry.
- *partial mutation* — "<op> partially applied: <what>; rolled back to <state>"
→ review.
- *missing speedtest-go* — "speedtest-go not installed" → install hint.
- *no wifi hardware* (desktop) — wifi rows hidden; ethernet-only view.
- *wifi rfkill-blocked* — "WiFi is blocked (rfkill)" → unblock. The indicator
detects a soft-blocked radio (=rfkill list= shows the radio off though hardware
is present) and shows this as distinct from disconnected. =net repair rfkill= (and
=net doctor --fix= as its first step) runs =rfkill unblock wifi= + =nmcli radio
wifi on= and reconnects. This is the Framework laptop case: an out-of-power
shutdown sometimes leaves wifi soft-blocked at next boot, and the module
recovers it (the indicator shows the rfkill state; =net repair rfkill= or
=net doctor --fix= is the one-step fix). A *hard* block (physical switch) can't
be cleared in software and is reported with its own message (next row).
- *wifi rfkill hard-blocked* — "WiFi is blocked by the hardware switch" →
evidence: rfkill hard state → flip the physical switch.
- *wrong password / missing secret* — "Saved password for <SSID> was rejected" →
evidence: SSID, NM auth-failure reason → re-enter the password.
- *enterprise auth/cert failure* — "Enterprise login failed for <SSID> (802.1X)"
→ evidence: SSID, EAP failure reason → edit the profile in nmtui/nmcli.
- *upstream / AP / provider* — "On <SSID>, link is fine but the network has no
uplink" → evidence: gateway reachable, no route out, not a portal → switch
network or contact the venue.
- *VPN-routed* — "Connected; internet is routed through a VPN (<dev>)" →
evidence: default route on a tun/wg device or non-NM DNS owner → check the VPN,
not WiFi.
- *HTTP interception, no parseable portal URL* — "A portal is intercepting
traffic but didn't give a login link" → evidence: HTTP code, redirect host →
opens neverssl + the gateway page to log in manually.
- *DNS override cleanup unverified* — "Couldn't confirm DNS was restored after the
test" → evidence: iface, attempted revert → revert DNS manually
(=resolvectl revert <iface>=).
Each message names whether the system was left unchanged, partially changed (with
what), or fully changed, so the user knows the residue.
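The soft-block versus hard-block distinction in the rfkill rows above could be derived from =rfkill list= output along these lines. This is a sketch that assumes the classic text layout (a =<n>: <name>: <type>= header followed by indented =Soft blocked:= / =Hard blocked:= lines); the function name and the returned state strings are illustrative.

```python
import re


def wifi_rfkill_state(rfkill_list_output: str) -> str:
    """Return 'hard-blocked', 'soft-blocked', 'unblocked', or 'no-wifi'.

    Assumes the classic `rfkill list` layout: a '<n>: <name>: <type>' header
    followed by indented 'Soft blocked:' / 'Hard blocked:' lines.
    """
    in_wlan = False
    seen_wlan = False
    soft = hard = False
    for line in rfkill_list_output.splitlines():
        header = re.match(r"^\d+:\s*[^:]+:\s*(.+)$", line)
        if header:
            in_wlan = header.group(1).strip() == "Wireless LAN"
            seen_wlan = seen_wlan or in_wlan
            continue
        if not in_wlan:
            continue
        key, _, value = line.strip().partition(":")
        blocked = value.strip() == "yes"
        if key == "Soft blocked":
            soft = soft or blocked
        elif key == "Hard blocked":
            hard = hard or blocked
    if not seen_wlan:
        return "no-wifi"
    if hard:
        return "hard-blocked"  # not recoverable in software
    return "soft-blocked" if soft else "unblocked"
```

Checking the hard block first matters: a radio can be both soft- and hard-blocked, and only the physical switch clears the latter.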
* Doctor: escalation, classification, terminal states
=net doctor= diagnoses, classifies the failure, then (with =--fix=) applies the
*lightest* repair that fits and re-checks — it never loops destructive repairs
against a failure they can't fix. Each failure resolves to one of four outcomes,
and the doctor stops at any terminal one:
- =fixable= — a local repair should help. Escalate lightest-first: rfkill-unblock
→ reset (fresh MAC) → bounce (full stack) → portal, re-probing after each, and
stop as soon as the probe returns online.
- =needs-user-action= (terminal) — no reset/bounce will help; doctor stops and
names the exact next step. Covers: wrong WPA password / missing NM secret
(enter the password), locked keyring or polkit denial (retry with auth),
enterprise 802.1X cert/identity failure (edit the profile in =nmtui=/=nmcli=),
captive portal login-required (open the portal + accept terms). Doctor must not
delete/recreate the profile against these — that loses the saved password and
makes things worse.
- =upstream-not-local= (terminal) — the local link is up but the problem is past
it: AP has no uplink, gateway down/dropping traffic, DHCP server broken, ISP
outage, portal backend failing. =diagnose= proves it (link up + IP + gateway
reachable, but no route out and not a captive redirect), and =doctor --fix=
stops after local repairs are exhausted with "local repairs tried; likely
upstream/AP/provider" + the evidence. Next action: switch networks or contact
the venue.
- =deferred/vpn= (terminal for v1) — an active VPN / policy route / non-NM
resolver owns the default route or DNS, so "no internet" may be the VPN's fault,
not WiFi's. v1 *detects* this (default route on a =tun/wg= device, or DNS owned
by something other than the NM link) and classifies it separately — "link is
fine; internet is VPN-routed" — rather than misclassifying it as a WiFi failure.
v1 does not repair it (VPN management is Phase 5); it names the VPN as the likely
owner and stops.
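The escalation rule above (lightest repair first, re-probe after each, stop at online or at a terminal outcome) can be sketched as a small loop. The callables are injected so the ladder is testable without a network; =run_doctor=, its parameters, and the returned dict shape are hypothetical, not the shipped engine's API.

```python
from typing import Callable, Iterable

TERMINAL = {"needs-user-action", "upstream-not-local", "deferred/vpn"}


def run_doctor(
    classify: Callable[[], str],
    probe_online: Callable[[], bool],
    repairs: Iterable[tuple[str, Callable[[], None]]],
) -> dict:
    """Escalate lightest-first; stop at online or at a terminal outcome.

    `classify` returns one of the four outcome names; `repairs` is the ordered
    ladder (e.g. rfkill-unblock, reset, bounce, portal).
    """
    tried: list[str] = []
    if probe_online():
        return {"outcome": "online", "tried": tried}
    for name, repair in repairs:
        outcome = classify()
        if outcome in TERMINAL:
            # Never run a destructive repair against a failure it can't fix.
            return {"outcome": outcome, "tried": tried}
        repair()
        tried.append(name)
        if probe_online():
            return {"outcome": "online", "tried": tried}
    # Ladder exhausted without restoring internet: report it as upstream.
    return {"outcome": "upstream-not-local", "tried": tried}
```

Re-classifying before every rung is what keeps a wrong-password failure from ever reaching the profile-deleting reset.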
** DNS handling in doctor (explicit per class)
- *Captive DNS hijack* — open the portal (the hijack clears on login). No DNS
mutation.
- *Broken resolver, 1.1.1.1 works* — the shipped =dns-test= repair is *diagnostic*:
it sets 1.1.1.1, confirms the venue resolver is the culprit, then auto-reverts
(=cleanup_verified=). Because it reverts, =doctor --fix= does not currently leave
you online in this case — it falls through to =upstream-not-local=, which
misreports a locally-fixable problem. *V2 fix (planned):* on a dns-test *pass*
(public DNS works), set a PERSISTENT resolver override and verify online, with an
offered revert — and classify it as its own outcome rather than upstream.
- *Port-53 / egress blocked* (even 1.1.1.1 fails) — terminal =upstream-not-local=;
doctor stops, since it's not locally fixable.
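The three DNS classes above reduce to a small decision over three observations. A minimal sketch, with hypothetical class and action names (only =dns-test= and =upstream-not-local= appear in the text):

```python
def classify_dns(venue_resolves: bool, hijacked: bool, public_resolves: bool) -> tuple[str, str]:
    """Map the three DNS observations to (class, doctor action)."""
    if hijacked:
        return ("captive-dns-hijack", "open-portal")     # no DNS mutation
    if venue_resolves:
        return ("dns-ok", "none")
    if public_resolves:
        return ("broken-resolver", "dns-test")           # 1.1.1.1 works
    return ("egress-blocked", "upstream-not-local")      # even 1.1.1.1 fails
```

The hijack check comes first because a hijacking resolver does "resolve" names, just to the portal's address.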
* Failure-mode coverage
*V2 note (2026-06-30):* the authoritative, exhaustive catalog (~44 modes across 10
connectivity layers, edge cases included, each tagged fix-and-verify or report-text)
now lives in the redesign task (todo.org "Network panel redesign"). The table below is
the v1 baseline; two rows reflect intent the shipped code doesn't yet match, and the
v2 catalog closes them: =gateway unreachable= claims a bounce that doctor never
actually reaches (a no-route failure goes straight to =upstream-not-local=), and
=broken DNS, 1.1.1.1 works= auto-reverts so the user is left offline and misreported
as upstream (the v2 persistent-override fix closes this).
For each common field failure: does =net diagnose= detect it, can =net doctor
--fix= repair it, and what terminal user action remains when it can't. (The
=needs-user-action= / =upstream-not-local= / =deferred/vpn= outcomes are defined
above.)
| Failure mode | diagnose detects | doctor --fix | terminal user action |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| rfkill soft block | yes | yes (unblock) | none |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| rfkill hard block | yes | no | flip the physical switch |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| no wifi hardware | yes | n/a | use ethernet |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| associated, no DHCP | yes | yes (reset/bounce) | none, else switch network |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| gateway unreachable | yes | yes (bounce) | switch network if it |
| | | | persists |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| captive DNS hijack | yes | opens portal | log in at the portal |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| broken DNS, 1.1.1.1 works | yes | yes (temp override, | report the venue's DNS |
| | | auto-reverted) | |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| HTTP captive portal | yes | opens portal | log in at the portal |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| HTTP interception, no | yes | opens neverssl + gateway | log in manually |
| parseable URL | | | |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| upstream / AP outage | yes (link up, no route out) | no (stops after local) | switch network / contact |
| | | | venue |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| wrong WPA password / | yes | no | enter the password |
| missing secret | | | |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| enterprise auth / cert | yes | no | edit the profile in |
| failure | | | nmtui/nmcli |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| duplicate SSID / | yes (UUID-keyed) | yes (activate by UUID) | none |
| connection-name | | | |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| hidden SSID | yes | yes (connect by name) | enter SSID + password |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| multiple active links | yes | n/a | pick the interface |
| (wifi+tether) | | | |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| wedged NetworkManager | yes | yes (bounce → restart NM) | none, else reboot |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| slow / hung command | yes (degraded) | retries within budget | retry |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| stale / corrupt cache | yes | self-heals (atomic + | none |
| | | invalidation) | |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| DNS cleanup failure | yes | flags cleanup-unverified | revert DNS manually |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| missing speedtest backend | yes | n/a | install speedtest-go |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| VPN / policy-routing | yes (route/DNS ownership) | no (deferred to Phase 5) | check the VPN |
| interference | | | |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
* Observability — logging + redaction
- *Event log*: JSONL at =$XDG_STATE_HOME/net/events.jsonl= (fallback
=~/.local/state/net/events.jsonl=), size-rotated (e.g. 1 MB × 3). Every
mutating op and probe appends an event: =ts, op, argv (redacted), exit_code,
stderr_tail, elapsed_ms, iface, nm_uuid, probe_url_class, http_code,
redirect_host, cache_event=.
- *Redaction (always on)*: PSKs, EAP identities/passwords, NM secrets, and
portal query tokens are never logged. MAC addresses, full IPs, and SSID are
redacted when configured (=redact_mac=, =redact_ip=, =redact_ssid= in config).
- *Post-failure diagnosis*: =net doctor --json= is the snapshot + recommendation
(diagnose plus the suggested repair), =net diagnose --json= the raw report, and
the event log the history. =net doctor= is the console-recoverable entry point
(reachable as =make online= / =make net-doctor=).
- *Secret-leak tests*: assert no PSK/EAP/portal-token ever appears in any JSON
output, log line, or error message.
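A minimal sketch of an always-redacted event append, assuming secrets reach =argv= either as the value following a known key (as in =nmcli ... password <psk>=) or as a URL query token. The key list, the token regex, and the function names are illustrative assumptions; only the log path and the event field names come from the bullets above.

```python
import json
import os
import re
import time
from pathlib import Path
from typing import Optional

# Argument names whose following value is a secret (illustrative list).
SECRET_KEYS = {"password", "psk", "wifi-sec.psk", "802-1x.password", "802-1x.identity"}
TOKEN_RE = re.compile(r"([?&](?:token|session|key)=)[^&\s]+", re.IGNORECASE)


def redact_argv(argv: list[str]) -> list[str]:
    """Replace the value after any secret-bearing key and scrub URL tokens."""
    out, hide_next = [], False
    for arg in argv:
        if hide_next:
            out.append("<redacted>")
            hide_next = False
            continue
        hide_next = arg.lower() in SECRET_KEYS
        out.append(TOKEN_RE.sub(r"\1<redacted>", arg))
    return out


def event_log_path() -> Path:
    """$XDG_STATE_HOME/net/events.jsonl, falling back to ~/.local/state."""
    base = os.environ.get("XDG_STATE_HOME") or str(Path.home() / ".local" / "state")
    return Path(base) / "net" / "events.jsonl"


def append_event(op: str, argv: list[str], exit_code: int, elapsed_ms: int,
                 path: Optional[Path] = None) -> dict:
    """Append one JSONL event; argv is redacted before it is serialized."""
    event = {"ts": time.time(), "op": op, "argv": redact_argv(argv),
             "exit_code": exit_code, "elapsed_ms": elapsed_ms}
    path = path or event_log_path()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
    return event
```

Redacting inside =append_event= rather than at each call site means no caller can forget it. Size rotation is left out of the sketch.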
** Automatic diagnostic verbose-capture (V2)
A distinct layer from the event log above: that log records what =net= did;
this captures what the *underlying stack* did at debug verbosity during a run, so a
failed diagnosis leaves real ground-truth instead of relying on memory. Two triggers,
one mechanism:
- *Automatic — on a failing diagnose.* When =net diagnose= ends =overall: fail=, the
next escalation (or =Get Me Online=) runs inside a verbose-capture session.
- *Manual — a debug on/off toggle in the panel's Advanced section.* "Debug on"
elevates and leaves it elevated (with a visible "debug capturing" indicator) so the
user can reproduce an intermittent problem over time; "Debug off" restores and
writes the bundle. Useful when the failure doesn't reproduce inside one diagnose.
Mechanism (shared):
1. *Snapshot* the current log levels (=nmcli general logging=, resolved's level,
wpa_supplicant's).
2. *Elevate* the relevant components to debug at runtime, no restarts, scoped to the
domains that matter (NM: =WIFI,DHCP,DNS,CORE=; resolved; wpa_supplicant).
3. *Run* the diagnostics / repair.
4. *Capture the window*: =journalctl= for NetworkManager + systemd-resolved +
wpa_supplicant since the run started, a =dmesg= tail (driver / firmware / rfkill),
and any =curl -v= probe output.
5. *Restore* every level to its snapshot.
6. *Write a redacted support bundle* to =$XDG_STATE_HOME/net/bundles/<ts>/= and
surface it in the panel.
Hard requirements:
- *Restore is guaranteed and idempotent.* A =try/finally= restores even on error,
and a crash-recovery guard detects "a prior run left NM/resolved/wpa_supplicant
elevated" on the next run and puts it back — the same shape as the DoT-restore
watcher. A crash must never strand the stack at debug verbosity.
- *Redaction before anything leaves.* Raw wpa_supplicant and NM debug logs carry the
PSK and EAP credentials in cleartext. The captured journal is scrubbed before the
bundle is written, shown, or shared; the secret-leak test asserts no passphrase or
EAP secret survives into a bundle.
- *Privilege via the V2 sudo-helper.* The log-level toggles need root, so they become
verbs on the passwordless helper (decision 16) — no extra prompt.
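The "guaranteed and idempotent restore" requirement can be sketched as a context manager plus a crash guard. The snapshot is persisted to a marker file before elevating, so a later run can undo an elevation that a crash left behind. The marker file, the =snapshot= / =set_levels= callables, and both function names are hypothetical stand-ins for the real log-level plumbing.

```python
import json
from contextlib import contextmanager
from pathlib import Path
from typing import Callable


@contextmanager
def verbose_capture(
    marker: Path,
    snapshot: Callable[[], dict],
    set_levels: Callable[[dict], None],
    debug_levels: dict,
):
    """Elevate log levels for the duration of the block, then always restore.

    The snapshot is written to `marker` before elevating, so a later run can
    restore it if this process dies before the `finally` executes.
    """
    recover_stranded(marker, set_levels)   # heal a prior crashed run first
    saved = snapshot()
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text(json.dumps(saved))
    try:
        set_levels(debug_levels)
        yield saved
    finally:
        set_levels(saved)                  # restore even on error
        marker.unlink(missing_ok=True)


def recover_stranded(marker: Path, set_levels: Callable[[dict], None]) -> bool:
    """Idempotent crash guard: restore levels left elevated by a prior run."""
    if not marker.exists():
        return False
    set_levels(json.loads(marker.read_text()))
    marker.unlink(missing_ok=True)
    return True
```

Running the guard at the top of every capture (and on any =net= start) is what makes a second call harmless: with no marker present it does nothing.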
Bonus — this closes a real detection gap, not just observability: the spec notes live
auth-failure detection is a v1 limit (it leans on a one-shot NM state-120 snapshot).
wpa_supplicant at debug during the run is exactly how a wrong-password or EAP failure
is caught properly, so the capture feeds back into classification.
* Indicator (task #C — Phase 1, the fast win)
** States (internet sub-state on top of link state)
- online — associated and the probe returned 204. Normal icon.
- captive — associated, probe hit a portal. Distinct glyph + warning CSS class;
tooltip names the portal host; left-click opens diagnostics with the portal
ready to open (Phase 2+; see interactions for the Phase-1 interim).
- no-internet — associated, probe failed (no portal, no 204). Distinct glyph +
warning class.
- degraded — =net status= couldn't read link state within budget (slow/failed
nmcli). Neutral glyph, =net-degraded= class. Never blocks the bar.
- rfkill-blocked — the radio is soft-blocked (=rfkill=), distinct from
disconnected. Distinct glyph; the fix is =net repair rfkill= / =net doctor=.
- connecting / disconnected / airplane / wired — as today, plus wired shown
correctly even when it appears after session start. (airplane is now this
module's state, absorbed from the retired airplane module.)
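A sketch of layering the internet sub-state on top of the link state and emitting the JSON line a waybar custom module expects. The input vocabularies (=link= and =probe= values), the fallback when no probe result exists yet, and the function names are assumptions; the =net-<state>= class pattern follows the =net-degraded= class named above.

```python
import json
from typing import Optional

# Link states that are shown as-is, regardless of any probe result.
PASS_THROUGH = {"degraded", "rfkill-blocked", "airplane", "connecting", "disconnected"}


def indicator_state(link: str, probe: Optional[str]) -> str:
    """Layer the internet sub-state on top of the link state."""
    if link in PASS_THROUGH:
        return link
    if probe == "captive":
        return "captive"
    if probe == "fail":
        return "no-internet"
    if probe == "online":
        return "wired" if link == "wired" else "online"
    # No probe result yet (assumption): fall back to a link-only state.
    return "wired" if link == "wired" else "connecting"


def waybar_payload(state: str, text: str, tooltip: str) -> str:
    """One line of JSON for a waybar custom module with return-type json."""
    return json.dumps({"text": text, "tooltip": tooltip, "class": f"net-{state}"})
```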
** Glyphs
Nerd-font codepoints, final values verified live before merge (same discipline
as wtimer). Reuse the signal-strength ramp already in =waybar-netspeed=; add a
captive / no-internet / degraded overlay glyph.
** Tooltip
SSID + signal + IPv4 + gateway + the throughput readout (absorbed from
netspeed) + the last probe result and its age (stale/expired hinted).
** Interactions
No keyboard-modifier clicks: waybar can't qualify clicks by modifier, so the rich
actions live in the panel, not ctrl/super-click.
Clicks never block the bar: each dispatches a detached background job, single-flight
per action. *As built (settled live with Craig, 2026-06-29):*
- *left* — =net-panel= toggle (pkill-or-launch the GTK panel).
- *middle* — =net portal= (the captive-login flow).
- *right* — =net-fix= (=net doctor= with =--notify=: reports the result when the
outcome is one-way, opens a terminal only when it's fixable; the v2 redesign moves
even that into the panel).
- airplane toggle moved off the bar to Super+Shift+A.
* Panel (tasks #B + #C diagnostics — Phases 2-3)
GTK4 + gtk4-layer-shell, pocketbook scaffold (src-layout package, unittest,
Makefile, gtk4-layer-shell anchored dropdown under the bar). One panel shell,
reused by the future desktop-settings panel.
Sections as built (Phases 1-3, a four-page stack switcher):
1. *Connections* — list, MRU-first, active marked, live signal bars for in-range
wifi; row click switches; buttons for add / edit / remove; a rescan control.
2. *Diagnose* (read-only) — Probe (204/captive, shows portal URL + Open), Gateway
ping, DNS config. Streaming step output (the diagnostics contract).
3. *Repair* (mutating, confirmed) — tiered lightest-first: Unblock rfkill, Reset
(fresh MAC), Bounce (full stack), DNS override test, Force portal. A "Get me
online" button runs =net doctor --fix= (the auto-escalating sequence).
4. *Speed test* — Run button, progress, down/up/ping result + last-run line.
As built, the panel also auto-hides on focus-out (popup behavior, suppressed while a
child dialog holds focus) and carries a Close button bottom-right (2026-06-30).
*V2 nav (planned):* three top tabs — Connections | Diagnostics | Performance.
Diagnostics merges the Diagnose and Repair pages into one: a sub-row
=Diagnose= | =Get Me Online= | =Advanced= over a shared area that shows diagnose
items and streams repair progress in-panel (no terminal). =Advanced= reveals the
individual repair tiers (renamed, with tooltips) plus a *Debug capture on/off*
toggle (the manual side of the verbose-capture feature; a failing diagnose triggers
it automatically). Speed test moves under Performance.
** V2 panel UX — the target design
The shipped four-page stack (Connections / Diagnose / Repair / Speed test) is
*history*, not active design. V2 is the sole current target: one panel opened from
the bar, three top tabs — Connections | Diagnostics | Performance — and the page
model below is the contract for what gets built and what gets deleted, not just for
labels.
*** Connections — saved vs available, join-from-row
Three labelled groups, never one merged list:
- *Saved* — saved NM profiles, MRU-first, rendered instantly without a scan.
- *Available now* — scan-backed in-range SSIDs with signal + security; may carry a
loading/stale hint; unsaved networks appear here.
- *Wired* — ethernet when a wired device is present.
=net list= already yields this (=connections.py= lists saved MRU-first, merges live
signal/security for in-range saved profiles, then appends unsaved in-range SSIDs with
=uuid: nil=); the panel groups and labels it. *Rescan refreshes only the
availability/signal layer* — it never gates or reloads the Saved list.
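The grouping step the panel applies to =net list= output might look like the sketch below. The entry keys (=type=, =uuid=) are assumptions about the JSON shape, with =uuid= of =None= mirroring the =uuid: nil= rows described above. Routing only unsaved SSIDs into "Available now" is one reading of the group definitions.

```python
def group_connections(entries: list[dict]) -> dict[str, list[dict]]:
    """Split `net list` entries into the three labelled groups, order preserved.

    Assumed entry keys: `type` ("wifi" or "ethernet") and `uuid` (None for an
    unsaved in-range SSID).
    """
    groups: dict[str, list[dict]] = {"Saved": [], "Available now": [], "Wired": []}
    for entry in entries:
        if entry.get("type") == "ethernet":
            groups["Wired"].append(entry)
        elif entry.get("uuid") is None:
            groups["Available now"].append(entry)
        else:
            groups["Saved"].append(entry)
    return groups
```

Since input order is preserved, the MRU-first ordering of saved profiles carries straight through to the Saved group.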
*Progressive loading:* render the Saved group immediately on open, then overlay
availability, signal, and the unsaved Available-now rows when the scan returns. Show a
small scan-in-progress state (elapsed + last-scan age). A slow or bad radio scan must
not make the whole panel feel stuck — this is the direct answer to "why does it take
so long to see my connections?"
*Join-from-row (no Add page):* selecting an unsaved Available-now row *is* the join
flow — SSID and security come prefilled from the scan, never retyped. Open networks
connect (confirm only if needed); WPA/WPA2/WPA3-Personal ask only for the password.
The standalone Add button + modal are deleted for visible networks. A hidden/manual
SSID join lives behind an Advanced "Join hidden network" affordance.
*** Supported authentication classes (the join matrix)
From the scanned NM =SECURITY= value, V2 handles:
- *Inline-supported* — open, open-with-captive-portal, WPA/WPA2/WPA3-Personal
(PSK/SAE), and WPA2/WPA3 transition mode. The row shows the security label so the
user knows why a password is or isn't asked.
- *Activate-only* — 802.1X / enterprise: connect if already saved, else "edit in
nmtui/nmcli" (no add form in v1/V2, per decision 9).
- *Hidden / manual* — behind the Advanced "Join hidden network" affordance.
- *Rare / unsupported* — WEP, OWE/enhanced-open, MAC-registration, voucher, or
proxy-required: a clear in-panel explanation ("not supported here yet") plus a
non-terminal next step, never a hand-off to a terminal tool.
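The join matrix above could be driven by a classifier over the scanned =SECURITY= string. This sketch assumes nmcli-style values such as "WPA2", "WPA2 WPA3", "WPA2 802.1X", "WEP", "OWE", or an empty / "--" value for an open network. The returned class names are hypothetical, and portal, MAC-registration, or voucher networks can't be told apart from the scan alone, so they are not modeled here.

```python
def join_class(security: str) -> str:
    """Classify a scanned SECURITY string into a join class.

    Assumes nmcli-style values such as "WPA2", "WPA2 WPA3", "WPA2 802.1X",
    "WEP", "OWE", or ""/"--" for an open network.
    """
    tokens = set(security.replace("--", "").upper().split())
    if not tokens:
        return "inline"          # open (a captive portal is handled after join)
    if "802.1X" in tokens:
        return "activate-only"   # enterprise: connect only if already saved
    if tokens & {"WEP", "OWE"}:
        return "unsupported"     # explained in-panel, no terminal hand-off
    if tokens <= {"WPA", "WPA1", "WPA2", "WPA3", "SAE"}:
        return "inline"          # Personal (PSK/SAE), incl. transition mode
    return "unsupported"         # anything unrecognized gets the explanation
```

Defaulting unknown values to "unsupported" keeps a future security type from being offered a password prompt it can't satisfy.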
*** Diagnostics owns the diagnostic story
Diagnostics holds the read-only checks, the repair stream, Get Me Online, debug
capture (Advanced), and the doctor report. A *lightweight* latency/throughput probe
runs inline as a Diagnose evidence row when internet is available (skipped offline, on
a metered/hotspot warning, or with no backend), and its result is stored in the doctor
report. The *full* speed test stays under Performance (decision 19) — which is also
the home for future throughput history, so Performance earns its tab rather than being
a lone button.
*** Forget confirmation — future tense + verified
The destructive copy is future tense and names the scope: "This will remove the saved
NetworkManager profile and its stored password from this machine." After the op,
verify the UUID is gone, refresh the Saved list, and report "Forgot <SSID>" or "Could
not forget <SSID>; nothing changed / partial <evidence>" — the verify-every-action
decision applied to a destructive op.
*** Findable diagnostics report
Every diagnose, repair, and speed/performance run ends with a "Copy report" / "Open
report" action in Diagnostics. The report carries the step statuses + elapsed, the
final classification, the last speed/latency result when available, scan age,
route/interface owner, the redacted event-log tail, and the bundle path when verbose
capture ran. It states explicitly whether any repair mutated state and whether
cleanup/verification passed. "Logs exist somewhere" isn't enough when the network is
already down — the report is the one artifact the user copies to hand over.
*** Visual contract — a Waybar-attached popup
The panel reads as part of the bar, not a separate app. Match the live Waybar theme:
the dark rounded capsule (=border-radius: 1rem=), the golden border, compact monospace
text, and the =custom/net= state colors. Avoid square corners next to rounded UI, keep
cards out of cards, and use compact icon+label controls with tooltips for the advanced
repairs. Reuse any existing archsetup-owned GTK/panel conventions. (Non-blocking for
engine work; blocks final V2 UX acceptance.)
** Panel state, cancellation, permissions
State machines for: connection-list loading, rescan-in-progress,
activation-in-progress, diagnose-running, repair-running, speedtest-running. Plus
the real terminal states on this two-machine fleet: no-wifi-hardware (desktop →
ethernet-only view) and missing speedtest-go. (No GPG-key state — there's no
credential store; secrets live in NM.) ("No NetworkManager" is not a modeled
state: NM is always present on these machines; if nmcli is somehow absent the
panel shows a single hard error and exits.) Long operations show elapsed time
and are cancellable where the
underlying op allows (rescan, speedtest, probe); clearly non-cancellable ones
(an in-flight activation) show elapsed + a disabled control. Permission-denied
(sudo/polkit declined) is a first-class outcome with the "nothing changed"
message, never a silent failure.
Interaction-pattern catalog (=~/code/rulesets/patterns/=) principles that apply:
- transient-state-buttons — all the network levers in one place, reachable by
one chord (the bar click), state visible.
- default-most-common-friction-proportional — connections MRU-ordered so the
common pick is first; destructive ops (remove) and privacy-changing ones
(reset, override) get a confirm, switching does not.
- one-prompt-picker-typed-prefix — if the connection picker ever goes
keyboard-driven, kind (wifi/eth/saved/in-range) + name in one typed picker.
** Panel UX flow (settle before Phase 2)
The concrete interaction defaults, so the GTK build isn't inventing them:
- *Default focus*: the Connections section, current connection's row selected. If
the indicator opened the panel because of a captive/no-internet state, focus
Diagnose instead with the relevant action highlighted.
- *Row content*: glyph (signal bars / wired / active check) + name + a secondary
line (security type, "active"/last-used). The active row is visually pinned at
the top of its group.
- *Buttons*: one *primary* per section (Connections: Connect to the selected row;
Diagnose: Run diagnose; Repair: "Get me online"; Speed test: Run). Secondary
actions (add / edit / remove / rescan; individual repair tiers) are smaller and
grouped.
- *Disabled rules*: Connect disabled on the already-active row; Repair tiers
disabled while one runs; Speed test disabled while running; add/edit disabled
for enterprise (with the "edit in nmtui/nmcli" hint).
- *Confirmations* (exact wording): Reset → "Reset <SSID>? This drops the
connection and reconnects with a new MAC."; Bounce → "Restart networking? All
links drop briefly."; DNS override → "Temporarily set DNS to 1.1.1.1 for the
test? It reverts automatically."; Remove → "Forget <SSID>? The saved password is
deleted."
- *"Get me online" reporting*: shows each escalation step live (Unblock rfkill →
Reset → Bounce → Portal) with per-step pass/fail and stops at the first that
restores internet or at a terminal state, naming the next action.
- *After close*: the bar reflects the new state right away (a refresh signal is
sent, with the next poll as the fallback); a running speedtest/diagnose keeps
running and notifies on finish (panel close doesn't cancel it).
- *Keyboard*: Esc closes; Tab moves between sections; arrows move rows; Enter
fires the section primary; the connection list is type-to-filter.
* Connection management (nmcli)
- Every op via nmcli per the nmcli contract above (terse, escaped, UUID-keyed,
bounded =--wait=).
- MRU ordering from NM's =connection.timestamp= (last activated), descending.
- Ethernet appears in the list whenever a wired device is present, selectable at
any time; switching just brings the chosen connection up.
- *Mutation safety + rollback*: switching keeps the current connection up until
the new one activates successfully (=nmcli --wait 30=); on failure it does not
tear down the working link, surfaces the failure, and leaves the prior
connection active. =net down= notes that NM may auto-reactivate a profile and
reports the post-op active connection so the user isn't surprised. A switch that
needs a password it doesn't have prompts (or fails with "password required"),
never silently strands. The exact NM command sequence (preflight active-state
read → activate target → verify default route → on failure, confirm prior
still up) is pinned in the engine and tested against fake nmcli.
- *Add/edit scope*: open + WPA-PSK only in v1. Existing saved profiles of any
type (including enterprise) can be *activated*; editing an enterprise profile
shows "edit via nmtui/nmcli" rather than a broken partial form.
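As an illustration of the "terse, escaped" parsing and the MRU ordering above, the sketch below splits terse nmcli rows on unescaped colons and sorts by last-activated timestamp. It assumes input from something like =nmcli -t -f NAME,UUID,TYPE,TIMESTAMP connection show=, where a literal colon or backslash in a value arrives backslash-escaped; the function names are hypothetical.

```python
def split_terse(line: str) -> list[str]:
    """Split one terse nmcli line on unescaped ':' and drop the escapes."""
    fields, cur, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            cur.append(line[i + 1])   # keep the escaped character literally
            i += 2
            continue
        if ch == ":":
            fields.append("".join(cur))
            cur = []
        else:
            cur.append(ch)
        i += 1
    fields.append("".join(cur))
    return fields


def mru_sorted(terse_output: str) -> list[dict]:
    """Parse NAME,UUID,TYPE,TIMESTAMP rows and sort last-activated first."""
    rows = []
    for line in terse_output.splitlines():
        if not line:
            continue
        name, uuid, ctype, ts = split_terse(line)
        rows.append({"name": name, "uuid": uuid, "type": ctype, "timestamp": int(ts or 0)})
    return sorted(rows, key=lambda r: r["timestamp"], reverse=True)
```

A naive =line.split(":")= would break on an SSID such as "Cafe: Blue", which is why the escape handling comes first. Python's sort is stable, so never-activated profiles (timestamp 0) keep their input order at the end.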
* Connection secrets (no separate store)
Per Craig's call: don't build a parallel credential store. Settings and secrets
live where NetworkManager already keeps them, so there's one source of truth and
no extra dependency (no GPG, no gpg-agent, no =~/.config/net/connections=).
- *Where secrets live*: =/etc/NetworkManager/system-connections/<name>.nmconnection=,
root-owned =0600=, with the PSK/EAP secret stored inline (the default
=secret-flags=0= "owned by NM"). That's already secure-at-rest (root-only) and
is what =nmcli= reads/writes.
- *How we touch them*: every add/edit/remove goes through =nmcli= (=connection add
/ modify / delete=), which writes the =.nmconnection= with the right ownership
and perms. We never read or write =system-connections= files directly (root) and
never copy a secret out of them.
- *No export / import / sync* — there's nothing to sync. A new machine gets its
connections the way it always has (the user joins, or restores NM profiles),
not from a tool-specific vault.
- *config file*: =~/.config/net/config= still exists, but only for non-secret
preferences (speedtest server, redaction flags, probe TTL). It holds no
credentials.
- *No secret leakage*: PSK/EAP secrets never appear in =net='s =--json= output, the event
log, or error text (tested) — even though NM is the store, our surfaces must not
echo a secret =nmcli= happens to return.
* Speed test
- Backend: *=speedtest-go=* (=--json=, =--server=, =--no-download/--no-upload=),
already installed on velox (AUR =speedtest-go-bin=). No new dependency for v1.
librespeed-cli is the documented fallback for a self-hosted LibreSpeed server.
- =net speedtest --json= parses speedtest-go's JSON into the =speedtest= schema.
- *Server policy*: auto-select nearest by default; allow a pinned server id in
=~/.config/net/config=.
- *Timeout + cancellation*: a hard run timeout (e.g. 60s); the panel run is
cancellable (kills the child). Offline / rate-limited / no-server errors map to
the failure-message table.
- *Tests*: fixture JSON (success) and fixture stderr (offline, no server,
malformed output) drive =net speedtest= parsing without touching the network.
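The hard run timeout and the missing-backend case can be sketched with the standard library alone. The status strings and the =run_bounded= name are hypothetical; the sketch is backend-agnostic and assumes nothing about speedtest-go's output format.

```python
import subprocess


def run_bounded(argv: list[str], timeout_s: float = 60.0) -> dict:
    """Run a backend with a hard timeout; map failures to coarse statuses."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except FileNotFoundError:
        # The binary is not on PATH (the "missing speedtest-go" case).
        return {"status": "missing-backend", "stdout": "", "stderr": ""}
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return {"status": "timed-out", "stdout": "", "stderr": ""}
    status = "ok" if proc.returncode == 0 else "failed"
    return {"status": status, "stdout": proc.stdout, "stderr": proc.stderr}
```

A panel-driven run that must be cancellable mid-flight would hold a =subprocess.Popen= handle instead, so the Cancel control can terminate the child directly.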
* Help + documentation
In-app help has three layers, each reachable in the situation it's needed:
- *CLI help (works from a dead-GUI TTY)*: =net --help= lists the subcommands in
one screen; =net <cmd> --help= documents each (flags, what it mutates, the
console-recovery targets). The Makefile targets are self-describing (=make help=
lists =online= / =net-doctor= / etc. with one-line descriptions). This is the
layer that matters most when you're at a console with no network.
- *Panel help (in the GUI)*: a small =?= affordance in the panel header opens an
inline help pane — what each section does, which Repair actions mutate state,
what the indicator glyphs/colors mean. Per-control tooltips on the less-obvious
buttons (rfkill, bounce, DNS override). No external help browser.
- *User guide (the durable doc)*: a README / docs page covering every command,
the indicator states + glyphs, the panel sections, the config file keys, the
recovery make targets, troubleshooting (the failure-message table), and
rollback. Written so a future session — or Craig six months out — can operate
and recover the module from the doc alone.
The failure-message table above is the single source of truth for the
troubleshooting text; the guide and the panel help both render from it rather
than restating it.
* Enhancement radar
Low-cost adjacent affordances, each dispositioned so cheap wins aren't lost and
the v1 panel stays focused. (Several are already in v1 by virtue of other
sections; marked here so the consideration is visible.)
| Enhancement | Disposition | Reason |
|-------------------------------------+-------------+--------------------------------------------------------|
| Open / copy portal URL | v1 | already in the captive flow; trivial Open + Copy |
|-------------------------------------+-------------+--------------------------------------------------------|
| Forget network | v1 | it's the remove op, already specced |
|-------------------------------------+-------------+--------------------------------------------------------|
| Rescan now | v1 | already a Connections control |
|-------------------------------------+-------------+--------------------------------------------------------|
| Retry with hardware MAC | v1 | captive already has --hardware-mac; expose in Repair |
|-------------------------------------+-------------+--------------------------------------------------------|
| Pin speedtest server | v1 | already a config key |
|-------------------------------------+-------------+--------------------------------------------------------|
| Copy redacted doctor report | v1 | cheap, serves the observability/support goal |
|-------------------------------------+-------------+--------------------------------------------------------|
| Show last good network / result | vNext | needs small history persistence |
|-------------------------------------+-------------+--------------------------------------------------------|
| Watch mode for net doctor | vNext | a --watch loop; handy at a TTY, not v1-critical |
|-------------------------------------+-------------+--------------------------------------------------------|
| Actionable desktop notifications | vNext | dunst supports actions; extra wiring |
|-------------------------------------+-------------+--------------------------------------------------------|
| Keyboard connection picker (fuzzel) | vNext | the typed-prefix pattern; panel covers v1 |
|-------------------------------------+-------------+--------------------------------------------------------|
| QR-code share / import WiFi | rejected | low value for a personal 2-machine setup; phones do QR |
|-------------------------------------+-------------+--------------------------------------------------------|
* Waybar wiring
- Replace =custom/netspeed= with =custom/net= in the bar's module list (same
slot).
- Module def: =exec: waybar-net=, =return-type: json=, =interval: 2=, a =signal=
for on-demand refresh (next free signal after wtimer's 14), =on-click=,
=on-click-right=, =on-click-middle= per the phase-aware interactions (each
dispatches a detached job, never blocks).
- Remove the old =on-click: pypr toggle network= scratchpad only once the panel
replaces it (Phase 2); Phase 1 keeps it as the interim manager.
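As a rough illustration of the JSON that =waybar-net= would hand to Waybar, the sketch below assumes the standard custom-module keys (=text=, =tooltip=, =class=). The function name =render=, the label map, and the fallback to =degraded= are illustrative assumptions; the labels are placeholders, not the real glyphs.

```python
import json

# States named in the testing plan; labels are placeholders, not the real glyphs.
LABELS = {
    "online": "net",
    "captive": "portal",
    "no-internet": "no-inet",
    "degraded": "net?",
    "wired": "eth",
    "disconnected": "off",
    "rfkill": "rfkill",
}


def render(state, detail=""):
    """Return one line of Waybar custom-module JSON for a connectivity state."""
    if state not in LABELS:
        state = "degraded"  # assumption: unknown states fall back to degraded
    payload = {"text": LABELS[state], "tooltip": detail or state, "class": state}
    return json.dumps(payload)


if __name__ == "__main__":
    print(render("captive", "CafeWifi: sign-in required"))
```

The =class= value is what the CSS state rules in =style.css= would key on, so keeping it equal to the state name keeps the bar and the stylesheet in step.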
* Testing plan (TDD)
- *Engine (normal)* — fake =nmcli= + =curl= + =speedtest-go= on PATH; assert
command sequences and parsed/emitted JSON for status, list, up/down,
add/edit/remove, probe, diagnose, repair, speedtest. Pure state/format
functions tested directly. JSON schemas locked by example.
- *Portal parser* — already covered in =tests/captive= (Normal/Boundary/Error +
the real SONIFI body). The engine's native probe reuses the same cases.
- *nmcli parsing* — escaped colon/backslash/newline in SSID, duplicate names,
hidden SSID, non-ASCII, wired-mid-session, multi-active (wifi+tether).
- *Failure + concurrency (the risky classes)* — slow/hung nmcli/curl/speedtest
(degraded state within budget), concurrent =net status= probe refresh
(single-flight), corrupt cache (recovered), stale cache after SSID change
(invalidated), permission denied / sudo declined, DNS-override cleanup failure
(=cleanup-unverified=), NM partial activation (rollback keeps prior link),
secret redaction, missing speedtest-go, no wifi hardware, rfkill soft/hard
block.
- *Doctor classification* — fixture-driven =net doctor= over fake nmcli/curl
asserting the right terminal classification + that =--fix= stops before
destructive repairs: auth failures (=needs-user-action=), upstream/AP failure
(=upstream-not-local=), VPN-routed failure (=deferred/vpn=), and the DNS classes
(hijack → portal, broken-but-1.1.1.1-works → offered override, egress-blocked →
upstream). Assert the failure-mode coverage table's "detects / repairs / terminal
action" holds for each row.
- *Indicator* — drive =net status --json= through =waybar-net=, assert the JSON
per state (online / captive / no-internet / degraded / wired / disconnected /
rfkill), iface override via env.
- *Panel* — pocketbook-style: backing logic (list ordering, op dispatch,
state-machine transitions), not GTK widgets.
- *NM secrets / no-leak* — add/edit writes the secret into NM via nmcli (asserted
against fake nmcli, never to a tool-owned file); assert no PSK/EAP appears in any
=--json=, log line, or error (there is no credential store to round-trip).
- *Live checklist (gated out of the suite)* — a "Manual testing and validation"
task per phase for the real-network states (captive at a hotel, no-internet,
switch under load, reset, speedtest) that can't be faked.
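The DNS classes in the doctor-classification bullet lend themselves to a small pure function that fixtures can exercise directly. The sketch below is one possible shape under stated assumptions: the three boolean inputs and the labels =ok=, =portal=, and =offer-override= are hypothetical, and only =upstream-not-local= is a classification named in this spec.

```python
def classify_dns(system_resolver_ok, hijacked, public_resolver_ok):
    """Map DNS observations to a doctor outcome (labels partly hypothetical)."""
    if hijacked:
        return "portal"              # hijack -> treat as captive portal
    if system_resolver_ok:
        return "ok"
    if public_resolver_ok:
        return "offer-override"      # broken locally, 1.1.1.1 works -> offer override
    return "upstream-not-local"      # egress blocked -> nothing to fix locally


# Table-driven cases, in the style of a fixture-driven unittest.
CASES = [
    ((True, False, True), "ok"),
    ((False, True, True), "portal"),
    ((False, False, True), "offer-override"),
    ((False, False, False), "upstream-not-local"),
]

if __name__ == "__main__":
    for args, expected in CASES:
        assert classify_dns(*args) == expected, (args, expected)
    print("all DNS classification cases pass")
```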
** Harness + coverage gate
The concrete contract, matching the repo's existing convention (not pytest — the
dotfiles suites are =unittest=, run by =make test= as =python3 -m unittest= over
=tests/*/test_*.py=; 33 suites today):
- *Framework*: =unittest=. Each suite is =tests/<name>/test_<name>.py=
(=tests/net/=, =tests/waybar-net/=), collected by the existing =make test= loop
— no new runner, no pytest dependency.
- *Fakes on a temp PATH*: =fake-nmcli=, =fake-curl=, =fake-speedtest-go=,
=fake-rfkill=, =fake-resolvectl= live as executable stubs in =tests/<name>/=
(the =tests/layout-navigate/fake-hyprctl= pattern). A fixture file encodes the
command→canned-output map and the stub appends each invocation to a log the test
asserts against. Subprocess timeouts are simulated by a stub that sleeps past the
budget; =net status= must still return the degraded state.
- *Waybar wrappers end-to-end*: =waybar-net= is run as a subprocess with the fake
PATH and the env overrides (iface, cache path), asserting the emitted JSON — same
as =tests/waybar-netspeed=.
- *Coverage*: coverage.py is absent system-wide (and not importable), so coverage
runs in a throwaway venv (=python3 -m venv=, =pip install coverage=, =coverage
run -m unittest=, =coverage report=) — the method the wtimer suite used (95%).
Target: *branch* coverage over =net/= and the wrapper, ≥ 90% on the pure
classifier/parser modules.
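The fake-on-PATH pattern above can be sketched as follows. This is a minimal illustration, not the repo's actual stubs: the helper name =run_with_fake=, the =FAKE_LOG= / =FAKE_OUT= environment variables, and the single canned-output file (instead of a full command→output fixture map) are all assumptions. It needs a POSIX =sh=.

```python
import os
import stat
import subprocess
import tempfile
from pathlib import Path

# Hypothetical stub: append the argv to a log, then print canned output.
STUB = """#!/bin/sh
printf '%s\\n' "$*" >> "$FAKE_LOG"
cat "$FAKE_OUT"
"""


def run_with_fake(name, canned, argv):
    """Run argv with a fake `name` first on PATH; return (stdout, logged calls)."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        stub = tmp / name
        stub.write_text(STUB)
        stub.chmod(stub.stat().st_mode | stat.S_IXUSR)
        out = tmp / "canned.txt"
        out.write_text(canned)
        log = tmp / "calls.log"
        env = dict(
            os.environ,
            PATH=f"{tmp}{os.pathsep}{os.environ.get('PATH', '')}",
            FAKE_LOG=str(log),
            FAKE_OUT=str(out),
        )
        done = subprocess.run(argv, env=env, capture_output=True, text=True, timeout=5)
        return done.stdout, log.read_text().splitlines()


if __name__ == "__main__":
    stdout, calls = run_with_fake("nmcli", "wlan0:connected\n", ["nmcli", "-t", "device"])
    print(stdout, calls)
```

A test then asserts on both halves: the parsed output proves the engine read the canned data, and the call log proves it issued the expected command sequence.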
** Coverage as a gap-finder, not a number (per phase)
Line coverage alone misses the branches that matter here, so each phase ends with
a *coverage-gap pass*, not just a percentage:
- After the first green run, read the branch report and map every uncovered branch
to either a new test or a consciously-excluded live-only behavior (with a comment
or a Manual-testing entry naming it).
- *Branch coverage is required* for the pure logic: the doctor classifier (every
outcome — fixable / needs-user-action / upstream-not-local / deferred-vpn), the
cleanup-unverified path, the redaction paths, the degraded hot-path fallback, the
timeout branches, and the portal/nmcli parsers.
- A phase isn't "done" until its coverage-gap pass is recorded — uncovered logic is
either tested or explicitly excused, never silently uncovered.
* Files touched (planned, all in =~/.dotfiles=)
- =net/= package (src-layout, like pocketbook) — engine + panel.
- =hyprland/.local/bin/waybar-net= — the indicator (replaces =waybar-netspeed=).
- =hyprland/.local/bin/net= — engine CLI entry (console-script shim).
- =hyprland/.config/waybar/config= — swap =custom/netspeed= → =custom/net=;
remove =custom/airplane=.
- =hyprland/.config/waybar/style.css= — captive / no-internet / degraded /
rfkill classes; remove airplane classes.
- =tests/net/=, =tests/waybar-net/= — suites.
- =captive= — refactor: extract probe + reset into functions callable
non-interactively (a =--json= probe mode) so the engine reuses them.
- =~/.config/net/config= — seed config (probe TTL, speedtest server, redaction
flags). No secrets; not a credential store.
- dotfiles =Makefile= — add the console-recovery targets (=online=, =net-doctor=,
=net-status=, =net-diagnose=, =net-portal=, =net-reset=, =net-bounce=).
- *Deletions once net ships* (the airplane module is absorbed):
=hyprland/.local/bin/waybar-airplane=, =hyprland/.local/bin/airplane-mode=,
=tests/waybar-airplane/=, =tests/airplane-mode/=, and the =custom/airplane=
module + its css.
- archsetup Hyprland step — add =gtk4-layer-shell=, =python-gobject=,
=speedtest-go-bin= to the install lists (the only archsetup change; no =gpg=
added, secrets stay in NM's store).
* Resolved decisions (Craig's calls + this response)
1. Panel UI tech → GTK4 + gtk4-layer-shell, shared pocketbook scaffold (one
panel shell, reused by the desktop-settings sibling).
2. Engine language → Python =net= package; shells out to =captive= for the
portal-force flow, native cheap probe for the bar path.
3. Connectivity probe → split cadence (fast link poll every 2s + slow cached
internet/captive probe, TTL ~45s) with single-flight + atomic cache.
4. No keyboard-modifier clicks (waybar can't qualify them) — the panel hosts the
rich actions; bar clicks dispatch detached jobs (phase-aware).
5. No separate credential store (Craig's call, cj). Secrets live in NM's own
=system-connections= (root =0600=, inline), touched via nmcli. No GPG, no
gpg-agent, no =~/.config/net/connections=. Supersedes the earlier GPG-store
design.
6. =custom/netspeed= absorbed into =custom/net=; throughput moves to the tooltip.
7. Speed-test backend → =speedtest-go= (already installed), not a new
librespeed-cli dependency; librespeed-cli is the self-hosted fallback.
8. Code lives in the dotfiles repo; archsetup only installs deps.
9. v1 add/edit scope = open + WPA-PSK; enterprise/802.1X is activate-only,
add/edit is vNext (settled by Craig 2026-06-29 — no enterprise networks in his
history, so the form would be unused UI).
10. =net doctor= is in v1 (Craig's call, cj) — a one-shot diagnose+fix mode,
reachable from a TTY via =make online= / =make net-doctor=. (The earlier
"defer the doctor/bundle command" decision is reversed.)
11. Diagnose (read-only) and Repair (mutating, confirmed) are separated in the
panel and the CLI; Repair is tiered lightest-first (rfkill → reset → bounce).
12. =custom/net= absorbs the airplane module (Craig's call, cj). *As built
(2026-06-29, option 1): display-only.* net shows the airplane state (reads
the airplane-mode state file); the =airplane-mode= low-power toggle is kept
(radios + CPU + brightness + services is not a network concern) and moved to
=custom/net='s right-click + signal 15. Only the redundant display pieces —
=waybar-airplane=, =custom/airplane=, and the retired =waybar-netspeed= —
plus their tests/css were deleted. The earlier "delete airplane-mode" framing
is superseded.
13. Repair includes a full-stack bounce and an rfkill-unblock (Craig's calls,
cj) — the latter recovers the framework-laptop post-power-loss soft-block.
14. VPN / WireGuard is a planned Phase 5 (Craig's call, cj), not a permanent
exclusion.
V2 redesign decisions (Craig, 2026-06-30):
15. *No terminals anywhere in the module* — =net-popup= is removed; every action and
result renders in the panel. Reverses the part of decision 11 that ran privileged
repairs in a terminal "so sudo/polkit can prompt".
16. *Passwordless privileged path* — a root-owned helper + a narrow NOPASSWD sudoers
rule scoped to it, archsetup-installed, run as =sudo <helper> <verb>=. This gates
decision 15 (a worker thread can't prompt). Absorbs the earlier DoT-toggle
follow-up and fixes the detached-restore-watcher bug.
17. *Verify every action* — each mutating op (repair, connect, forget, add, DNS
override) re-checks its effect and surfaces pass/fail in the panel.
18. *Detect + respond to every failure mode, edges included* — the full ~44-mode
catalog (todo.org redesign task) is the contract; auto-fix where safe, else report
the exact in-panel text. Includes IPv6-only awareness and multi-homing, which need
diagnose to stop being IPv4-only and single-iface.
19. *Navigation* — top tabs Connections | Diagnostics | Performance; Diagnostics
merges Diagnose + Repair (Diagnose | Get Me Online | Advanced over a shared
streaming area); Speed test under Performance.
20. *Automatic diagnostic verbose-capture* (Craig, 2026-06-30) — on a failing
diagnose, elevate the underlying stack (NM / resolved / wpa_supplicant) to debug,
capture the journal + dmesg window, restore (guaranteed + crash-guarded), and
write a redacted bundle. Also adds a manual Debug on/off toggle in Advanced. The
restore must be bulletproof, secrets are scrubbed before the bundle is written, and
log-level toggles go through the V2 helper. See Observability.
* Implementation phases
*Phases 1-3 are SHIPPED* (2026-06-29 → 2026-06-30, dotfiles); their acceptance
criteria passed and the work is live on velox. Phase 4 (docs/rollout) and Phase 5
(VPN) remain. The V2 redesign phases at the end are designed, not yet built.
- *Phase 1 — Indicator + console recovery (task #C).* =net status= + =net probe=
(native cheap probe, reusing captive's logic) + the =captive= probe refactor +
=waybar-net= + the split-cadence cache (single-flight, atomic, stale classes) +
CSS states (incl. rfkill) + performance budget. Plus the CLI-only recovery path:
=net repair= tiers (rfkill / reset / bounce), =net doctor [--fix]=, and the
Makefile targets (=make online= etc.) — all testable without the GTK panel.
Absorbs the airplane state and removes the standalone airplane module. Interim
left-click keeps the existing scratchpad until the panel lands.
- *Acceptance*: fresh-login waybar smoke test shows correct state on
online/captive/no-internet/wired/rfkill; =net status= stays within budget
under a fake slow nmcli (degraded state); =net doctor --fix= recovers a
soft-blocked radio from a TTY; the live captive checklist passes at a real
portal; the airplane state works and the old airplane module is gone;
reverting = swap =custom/netspeed= + =custom/airplane= back.
- *Phase 2 — Panel shell + connection management (task #B core).* GTK4
layer-shell scaffold + =net list/up/down/add/edit/remove/rescan= + MRU list +
mutation safety/rollback + panel state machines.
- *Acceptance*: switch wifi↔wifi and ethernet↔wifi without stranding; a failed
switch leaves the prior link up; add/edit open + WPA-PSK writes the secret to
NM; remove confirms; panel states render for loading/rescan/activation.
- *Phase 3 — Diagnostics + speed test in the panel.* Wire =net diagnose= /
=net repair= / =net doctor= / =net portal= / =net speedtest= into the Diagnose
vs Repair sections; the "Get me online" button; portal Open button; speedtest
progress + cancel.
- *Acceptance*: diagnose runs read-only; each repair tier confirms + verifies
cleanup (DNS override reverts, shown); speedtest result parses from
speedtest-go and a fixture-driven failure shows the right message.
- *Phase 4 — Docs + rollout.* In-app help (=net --help= / per-command help, the
panel help affordance), README/user-guide (commands, panel, config,
troubleshooting, the make targets, rollback), and the manual dep step on ratio.
- *Acceptance*: =net --help= and each subcommand's help are complete; the
user-guide covers every command + the recovery targets; ratio rollout
documented.
- *Phase 5 — VPN / WireGuard (future).* Fold the existing archsetup wireguard
tooling into the same panel + CLI (=net vpn ...=). Out of the v1 milestone;
specced separately when picked up.
V2 redesign phases (designed 2026-06-30, dependency order):
- *V2.1 — Sudo helper + NOPASSWD sudoers (gates everything).* Root-owned helper
dispatching net's fixed privileged verbs, archsetup-installed, narrow sudoers.
Also fixes the detached DoT-restore-watcher bug.
- *Acceptance*: every repair runs passwordless in-panel on a non-NOPASSWD machine;
the sudoers rule is scoped to the helper only.
- *V2.2 — Merged Diagnostics panel + nav restructure.* Connections | Diagnostics |
Performance; the Diagnostics sub-row + shared streaming area; Advanced reveal +
tooltips; delete =net-popup=.
- *Acceptance*: no terminal opens for any action; repair progress streams in the
panel; Speed test lives under Performance.
- *V2.3 — IPv6-aware and multi-homing-aware diagnose.* Stop treating no-IPv4 as a
failure when online over IPv6; identify which interface owns the default route.
- *V2.4 — Close every detect/correct gap in the catalog, with post-action
verification.* Work the redesign-task catalog mode by mode.
- *V2.5 — Automatic diagnostic verbose-capture.* Snapshot/elevate/capture/restore
around a failing diagnose + the Advanced Debug on/off toggle; guaranteed +
crash-guarded restore; redacted support bundle; helper log-level verbs.
- *Acceptance*: a failing diagnose leaves a redacted bundle (NM/resolved/
wpa_supplicant journal + dmesg) and restores every log level; a crash mid-capture
is detected and restored on the next run; the secret-leak test finds no PSK/EAP in
a bundle; the toggle elevates and restores on demand.
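The guaranteed, crash-guarded restore in V2.5 could be structured roughly as below. This is a sketch under assumptions: =get_level= and =set_level= stand in for whatever the privileged helper exposes, the marker-file approach and the names =elevated= / =recover= are hypothetical, and the journal capture itself is omitted.

```python
import contextlib
import json
import os


@contextlib.contextmanager
def elevated(marker, get_level, set_level, units):
    """Raise each unit to debug, restoring prior levels even if the body fails."""
    prior = {u: get_level(u) for u in units}
    with open(marker, "w") as f:
        json.dump(prior, f)          # crash guard: written before any change
    try:
        for u in units:
            set_level(u, "debug")
        yield prior
    finally:
        for u, level in prior.items():
            set_level(u, level)
        os.unlink(marker)            # marker gone means the restore completed


def recover(marker, set_level):
    """On the next run, finish a restore that a crash interrupted."""
    if not os.path.exists(marker):
        return False
    with open(marker) as f:
        prior = json.load(f)
    for u, level in prior.items():
        set_level(u, level)
    os.unlink(marker)
    return True
```

The marker is written before the first level change and removed only after the last restore, so a leftover marker always means some unit may still be at debug.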
* Open items / risks
- gtk4-layer-shell dropdown anchoring under a waybar module needs the same
positioning work pocketbook solved; reuse it. (Phase 2.)
- The =captive= refactor must keep the standalone CLI behavior identical while
exposing a non-interactive =--json= probe; covered by the existing
=tests/captive= suite plus new probe-mode tests. (Phase 1.)
- speedtest-go server selection variance (the nearest-server pick can change
between runs): pin a server in config if results are noisy. (Phase 3.)
- The background-probe kick from =net status= must be truly non-blocking (spawn +
detach); enforced by the single-flight lock and the performance benchmark test.
* Rollback
Each phase is independent. The indicator (Phase 1) is a drop-in replacement for
=custom/netspeed= (and =custom/airplane=); reverting is swapping those modules
back in the config and restoring their scripts. The panel is additive — not
wiring its clicks leaves the bar working as before. No credential store to roll
back (secrets stay in NM throughout).
* Review findings [40/40]
** DONE Define the structured diagnostics contract :blocking:
The spec says the engine "emits JSON" and that diagnostics "reuse =captive=
verbatim", but the current =~/.dotfiles/common/.local/bin/captive= flow is a
human-readable bash script that mixes diagnostics, sudo prompts, DNS mutation,
browser launch, and terminal prose. A GTK panel cannot reliably turn that into
clear state, progress, cancellation, or useful error messages. Define the
machine contract before implementation: every diagnostic step should have a
stable id, status (=pending/running/pass/warn/fail/skipped=), redacted evidence,
elapsed time, safety outcome, and next action. Keep =captive= as the interactive
CLI, but either refactor reusable probe/reset functions behind =net diagnose
--json= or make =captive= expose a non-interactive JSON mode. This blocks the
panel and logging work because otherwise the implementer must invent the
boundary.
Disposition: accept — added the "Diagnostics contract" section (per-step id /
status / evidence / elapsed / safety / next_action) and the =captive= =--json=
probe-mode refactor under Architecture + Files touched.
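One way to picture the per-step record is a small dataclass that serializes to JSON. The field names follow the list in this disposition; the class name =Step=, the =elapsed_ms= unit, and the example =safety= value are assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

STATUSES = {"pending", "running", "pass", "warn", "fail", "skipped"}


@dataclass
class Step:
    id: str                                  # stable step id
    status: str = "pending"
    evidence: dict = field(default_factory=dict)   # already-redacted evidence
    elapsed_ms: int = 0                      # assumption: milliseconds
    safety: str = "read-only"                # assumption: example safety outcome
    next_action: Optional[str] = None

    def __post_init__(self):
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")

    def to_json(self):
        return json.dumps(asdict(self), sort_keys=True)


if __name__ == "__main__":
    step = Step("dns-resolve", "fail", {"servers": 2}, 812,
                next_action="try the DNS override test")
    print(step.to_json())
```

Rejecting unknown statuses at construction keeps the panel from ever having to render a state the contract does not name.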
** DONE Specify user-facing failure messages and recovery actions :blocking:
The spec names failure states like =no-internet=, =captive=, failed probe,
failed reset, missing DNS, and missing speed-test backend, but it does not define
the messages the user sees or what each message tells them to do next. For this
feature, "error" is not enough: a user needs to know whether WiFi is associated,
whether DHCP succeeded, whether DNS is hijacked/broken, whether HTTP is
intercepted, whether sudo was declined, whether a command timed out, and whether
the system was left unchanged or partially changed. Add a message table for the
indicator, panel, and CLI with: failure class, visible text, evidence included,
redaction rule, and next action. This is blocking because UX quality here is the
product, not an implementation detail.
Disposition: accept — added the "Failure states, messages, recovery" section
covering each class, the visible message, the "what changed" residue note, and
the next action across indicator/panel/CLI.
** DONE Define the debug log and redacted support bundle :blocking:
There is no observability section. When this fails in a hotel or cafe, an agent
needs enough evidence to diagnose it without rerunning destructive actions. Add
log location, rotation/retention, JSONL event schema, command argv logging,
exit-code/stderr capture, elapsed time, selected iface, NM active connection
UUID, probe URL class, HTTP code, redirect host, DNS servers, and cache
read/write events. Also define a =net doctor --json= or =net debug-bundle=
command that emits redacted status, recent log events, dependency versions, and
a reproduction command. Redact SSID if configured, MAC addresses, portal query
tokens, PSKs, EAP identities/passwords, IPs when requested, and all GPG/NM
secrets. This blocks implementation readiness because post-failure diagnosis is
currently left to ad hoc terminal spelunking.
Disposition: modify — accepted the JSONL event log, the schema, and the redaction
rules in full (new "Observability" section). Deferred the dedicated =net
debug-bundle= / =net doctor= command to vNext: for a single-user tool =net
diagnose --json= (the snapshot) plus the event log (the history) cover
post-failure diagnosis; a bundle command is gold-plating for v1. Recorded under
Out + Resolved decision 10.
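A redaction pass over each event before it is written as a JSONL line might look like the sketch below. The key set, the =[redacted]= and =[mac]= placeholders, and the function names are assumptions; the real rules (SSID, portal tokens, IPs on request) are broader than what is shown.

```python
import json
import re

MAC_RE = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")
SECRET_KEYS = {"psk", "password", "identity"}  # hypothetical key names


def redact(value, key=""):
    """Recursively scrub secret-keyed values and MAC addresses from an event."""
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v, key) for v in value]
    if key.lower() in SECRET_KEYS:
        return "[redacted]"
    if isinstance(value, str):
        return MAC_RE.sub("[mac]", value)
    return value


def event_line(event):
    """Serialize one redacted event as a single JSONL line."""
    return json.dumps(redact(event), sort_keys=True)


if __name__ == "__main__":
    print(event_line({"op": "connect", "psk": "hunter2",
                      "bssid": "AA:BB:CC:DD:EE:FF", "exit": 0}))
```

Redacting at the single serialization point, rather than at each call site, is what makes a "no PSK/EAP in any log line" test meaningful.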
** DONE Pin the nmcli parsing and timeout contract :blocking:
The spec lists nmcli operations but not the exact fields, output modes, escaping
rules, ID semantics, or timeouts. This is risky because SSIDs and connection
names can contain spaces, colons, duplicates, hidden names, and non-ASCII; the
current =waybar-netspeed= already had an SSID parsing bug. The nmcli manual
documents =--terse=, =--get-values=, =--escape=, =--wait=, ID/UUID/path
selection, =passwd-file=, and built-in connectivity states
(=none/portal/limited/full/unknown=) at
https://man.archlinux.org/man/nmcli.1.en. The spec should require UUIDs for
saved-profile operations, explicit =--wait= budgets, parser tests for escaped
colons/backslashes/newlines/duplicate names/hidden SSIDs, and a decision on when
to use or ignore =nmcli networking connectivity [check]=. This is blocking
because the command wrapper is the core reliability boundary.
Disposition: accept — added the "nmcli contract" section: terse + =--escape= +
=--get-values=, UUID-keyed ops, explicit =--wait= budgets, NM connectivity as a
cheap hint (our probe authoritative), and the parser test matrix.
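In terse mode nmcli escapes =:= and =\= inside values with a backslash, so a naive =split(":")= breaks on SSIDs that contain a colon. The sketch below shows one way to split a terse line on unescaped colons only; the function name is illustrative and it does not attempt the newline case from the test matrix.

```python
def split_terse(line):
    """Split one `nmcli -t` line on unescaped ':' and unescape the fields."""
    fields, cur, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            cur.append(line[i + 1])   # keep the escaped character literally
            i += 2
            continue
        if ch == ":":
            fields.append("".join(cur))
            cur = []
        else:
            cur.append(ch)
        i += 1
    fields.append("".join(cur))
    return fields


if __name__ == "__main__":
    print(split_terse(r"yes:Cafe\: Guest:80"))
```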
** DONE Define cache concurrency, atomicity, and stale-state behavior :blocking:
=net status= may spawn =net probe= whenever the cache is stale, but the spec
does not define locking, process coalescing, atomic writes, crash cleanup, or
what happens when the probe hangs. With a 2s Waybar interval, a bad network could
start overlapping probes, corrupt the runtime cache, or keep showing stale
"online" while the link is gone. Add a single-flight lock under
=$XDG_RUNTIME_DIR/waybar=, atomic write+rename for cache updates, max probe
runtime, stale age classes (fresh/stale/expired/unknown), cache invalidation on
iface/SSID/connection UUID change, and tests for concurrent =net status= calls.
This blocks the fast-path design because it is the main performance and
correctness risk.
Disposition: accept — added "Concurrency, atomicity, staleness" under the
Connectivity model: flock single-flight, temp+rename atomic write, ≤6s probe
timeout, fresh/stale/expired/unknown classes, iface/SSID/UUID invalidation, stale
lock reclaim, plus concurrency tests in the test plan.
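The three mechanisms named here (flock single-flight, temp+rename atomic write, age classes) can be sketched together. This is an illustration under assumptions: the function names are hypothetical, the 45s TTL comes from the split-cadence decision, and the 180s expiry threshold is invented for the example.

```python
import contextlib
import fcntl
import json
import os
import tempfile


def write_atomic(path, data):
    """Write JSON to a temp file in the same dir, then rename over the target."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".", prefix=".cache-")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)         # atomic on POSIX within one filesystem
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp)
        raise


def try_single_flight(lock_path):
    """Return a held lock fd, or None if another probe already holds the lock."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd                          # closing the fd releases the lock


def age_class(written_at, now, ttl=45.0, expired_after=180.0):
    """Classify a cache entry; expired_after is an assumed threshold."""
    if written_at is None:
        return "unknown"
    age = now - written_at
    if age <= ttl:
        return "fresh"
    return "stale" if age <= expired_after else "expired"
```

Because the kernel drops a flock when its holder exits, a crashed probe cannot leave the lock permanently held, which covers the common stale-lock case without extra bookkeeping.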
** DONE Bound hot-path performance with measured budgets :blocking:
The spec says the cheap poll should be sub-100ms, but the proposed fast path
still may call multiple =nmcli= commands every two seconds, read sysfs, parse
throughput, and maybe spawn a background probe. The existing =waybar-netspeed=
had a deliberate sleep for throughput sampling; replacing it must define how
throughput is sampled without sleeping in the bar path. Add a per-command budget
for =waybar-net= and =net status=, a maximum number of subprocesses on the hot
path, a timeout for every subprocess, benchmark tests with fake slow =nmcli=,
and a rule that the indicator emits a degraded JSON state rather than blocking.
This is blocking because Waybar custom modules can visibly freeze or lag when
their exec path stalls.
Disposition: accept — added the "Performance budgets" section: <100ms typical /
<250ms worst, throughput sampled across the poll interval (no in-process sleep),
one nmcli call max on the hot path, timeouts on every subprocess, the degraded
state, and a fake-slow-nmcli benchmark test.
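Sampling throughput across the poll interval means each invocation reads the byte counter once and compares it with the sample the previous invocation saved, so nothing sleeps in the bar path. The sketch below assumes the Linux sysfs counter path and a caller that persists the previous =(bytes, timestamp)= pair; the function names are illustrative.

```python
from pathlib import Path


def read_rx_bytes(iface):
    """Read the cumulative received-bytes counter for an interface from sysfs."""
    return int(Path(f"/sys/class/net/{iface}/statistics/rx_bytes").read_text())


def throughput_bps(prev, cur):
    """Bytes/second between two (bytes, timestamp) samples, or None if unusable."""
    if prev is None:
        return None                    # first poll: nothing to compare against
    (b0, t0), (b1, t1) = prev, cur
    dt = t1 - t0
    if dt <= 0 or b1 < b0:
        return None                    # clock went backwards or counter reset
    return (b1 - b0) / dt


if __name__ == "__main__":
    # Two samples two seconds apart, as consecutive Waybar polls would produce.
    print(throughput_bps((1000, 10.0), (5000, 12.0)))
```

Returning =None= for the first poll and for counter resets lets the indicator show a blank rate rather than a wildly wrong one.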
** DONE Make click actions non-blocking and visible :blocking:
Waybar right-click runs =net reset= and middle-click runs =net portal= directly.
Those operations can require sudo, open browsers, mutate DNS, delete/recreate NM
profiles, or hang on network commands, but Waybar click handlers provide no
panel, terminal, progress, or cancellation surface by default. Define whether
right/middle click instead opens the panel focused on the action, dispatches a
background job with notifications, or is removed from v1. If kept, specify
single-flight behavior, how sudo/polkit prompts surface, how success/failure is
reported, and how the user can inspect logs. This blocks UX readiness because
the fastest remediation path is currently the easiest place to hide failure.
Disposition: modify — accepted the concern; made the interactions phase-aware and
non-blocking. Every click dispatches a detached, single-flight background job and
reports via =notify=; sudo surfaces through polkit/the normal prompt; failures go
to the notification and the event log. In Phase 1 (no panel) left-click runs probe +
notify and keeps the scratchpad; from Phase 2 left-click opens the panel focused
on the action. Recorded in the Indicator "Interactions" subsection.
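The detached dispatch described here can be approximated with the standard library alone. A minimal sketch, assuming a POSIX system; the helper name =dispatch_detached= is hypothetical, and the single-flight guard and the =notify= reporting are left out.

```python
import subprocess
import sys


def dispatch_detached(argv):
    """Start argv in its own session and return immediately without waiting."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,   # child leads a new session, detached from the caller
    )


if __name__ == "__main__":
    # Stand-in for a slow repair job; the click handler returns right away.
    job = dispatch_detached([sys.executable, "-c", "import time; time.sleep(1)"])
    print("dispatched pid", job.pid)
```

Because the handler never calls =wait()=, a hung network command can stall only the background job, not the bar.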
** DONE Specify connection mutation safety and rollback :blocking:
The spec says row click switches connections and remove gets a confirm, but it
does not define what happens when a switch partially succeeds, disconnects the
current working link, needs a password, loses the default route, or triggers
auto-activation. The nmcli manual warns that =connection down= does not prevent
future auto-activation and may internally block a profile until user action.
Define preflight, the exact NM command sequence, whether the old active
connection is kept until the new one proves usable, when rollback is attempted,
how long activation waits, and what the panel says when rollback fails. This is
blocking because the module can strand the user offline.
Disposition: accept — added "Mutation safety + rollback" under Connection
management: keep the prior link up until the target activates (=--wait 30=), no
teardown on failure, password-required surfaced not stranded, =net down= reports
post-op active state + the auto-reactivation caveat, and the pinned NM command
sequence is tested against fake nmcli.
** DONE Define the credential-store security model :blocking:
The GPG store is described as optional and default-unencrypted, but the spec does
not define file modes, schema, secret-source rules, import/export prompts,
recipient verification, stale secret handling, or what is logged. It also says
NM remains source of truth while the user-owned store contains PSK/EAP secrets,
which creates two truth sources for sensitive data. Add a precise schema,
=0600= file creation with parent-dir permissions, encrypted-recipient checks,
plaintext warning text, explicit opt-in flow, redaction requirements, behavior
when NM has a secret not in the store, behavior when the store has a secret NM
rejects, and tests for no secret leakage in JSON/logs/errors. This blocks Phase
4 and the full spec because otherwise the implementer must make security
decisions mid-code.
Disposition: accept — rewrote "Credential storage" with the versioned schema,
=0600= file / =0700= dir, recipient verification on opt-in, the plaintext
warning, secret-source rule (entered/exported, never harvested from root store),
the two-source reconciliation policy (NM wins live, store wins for what NM
lacks, stale-secret flagging), and the no-leak tests.
** DONE Define EAP, enterprise WiFi, and unsupported connection behavior :blocking:
The store says "PSK/EAP" and connection management says add/edit, but there is
no v1 contract for WPA-Enterprise fields, certificates, identity vs anonymous
identity, hidden networks, static IP, proxy settings, metered flags, MAC
randomization, or 802.1X prompt behavior. Either scope v1 to open/WPA-PSK plus
existing saved-profile activation, or define the minimum EAP form and the
unsupported-state messages. This blocks add/edit/import because enterprise WiFi
is too sensitive to hand-wave.
Disposition: modify (scope) — scoped v1 to open + WPA-PSK add/edit, with
*activation* of any existing saved profile (including enterprise). Enterprise /
802.1X add/edit, static-IP, proxy, metered, and MAC-randomization editing are
vNext, shown as "edit via nmtui/nmcli". Recorded in Scope/Out, Connection
management, and Resolved decision 9.
** DONE Split read-only diagnostics from mutating remediation :blocking:
The panel's diagnostics section includes probe, bounce/reset, gateway ping, and
DNS override test in one area, while =captive= currently performs resets and
temporary DNS changes as part of its flow. Users need to know which buttons are
read-only and which mutate NM profiles, MAC mode, DNS, or browser state. Add
separate "Diagnose" and "Repair" actions, confirmations for destructive or
privacy-changing operations, explicit cleanup verification for DNS override, and
a terminal state when cleanup is unverified. This blocks readiness because
network repair must not surprise the user or leave hidden residue.
Disposition: accept — split the panel into a read-only Diagnose section and a
confirmed, mutating Repair section (and split the CLI into =net diagnose= vs =net
repair=). Added =cleanup_verified= + a terminal =cleanup-unverified= state to the
diagnostics contract.
** DONE Define panel state, cancellation, and permissions UX :blocking:
The panel sections list buttons and a streaming output area, but not loading
states, disabled states, empty states, keyboard/focus behavior, cancellation, or
permission-denied handling. Add panel state machines for connection list loading,
rescan in progress, activation in progress, diagnostics running, speedtest
running, and no NetworkManager/no WiFi/no permissions/no GPG key/no
librespeed-cli. Each long operation should be cancellable where possible or
clearly non-cancellable with an elapsed-time display. This blocks the GTK work
because without it the implementer must invent the user flow.
Disposition: modify — accepted the state-machine requirement (added "Panel state,
cancellation, permissions"), but scoped the state set to what can actually occur
on the two-machine fleet: dropped "no NetworkManager" as a modeled state (NM is
always present; a missing nmcli is a single hard-error exit) and kept
no-wifi-hardware, missing speedtest-go, no-GPG-key, plus the in-progress states
with elapsed-time + cancellation where the op allows.
** DONE Verify speed-test dependency, server choice, and failure contract :blocking:
The spec chooses =librespeed-cli= and notes availability/default-server research
as an open risk, but Phase 3 still depends on parsing its JSON and showing
progress. I checked the upstream project page
(https://github.com/librespeed/speedtest-cli) and the AUR URL named by search is
not sufficient as a verified package/install contract in this spec. Add the
exact package name/source to install, command version expected, JSON shape,
server-selection policy, timeout, cancellation behavior, offline/rate-limited
messages, and tests with fixture JSON and fixture stderr. This blocks Phase 3
because speed-test failure modes are otherwise undefined.
Disposition: modify — verified live and changed the backend: =speedtest-go= (AUR
=speedtest-go-bin=, 1.x) is already installed on velox and supports =--json=,
=--server=, =--no-download/--no-upload=, so v1 needs no new dependency.
librespeed-cli (AUR =librespeed-cli= / =-bin=) is the documented self-hosted
fallback. Added the "Speed test" section with server policy, timeout,
cancellation, the failure-message mapping, and fixture-JSON/stderr tests.
** DONE Define dependency installation and repo boundaries :blocking:
The files touched section alternates between archsetup paths and the external
dotfiles repo, while pocketbook has been folded into this repo and its previous
archsetup provisioning was intentionally removed. The spec should state where
the =net= package actually lives, which repository owns the scripts/tests,
whether =gtk4-layer-shell=, =python-gobject=, =librespeed-cli=, =gpg=, =nmcli=,
=curl=, and =resolvectl= are installed by archsetup or assumed present, and the
Makefile targets for test/lint/install. This blocks implementation because the
current path plan can produce code that is not installed on a fresh machine.
Disposition: accept — added the "Repository + dependencies" section: all code in
=~/.dotfiles= (=net/= package in-tree like pocketbook, scripts in the hyprland
tier, tests under =tests/=), archsetup owns only the dep install
(=gtk4-layer-shell=, =python-gobject=, =speedtest-go-bin=; nmcli/curl/resolvectl
already present), Makefile =make test= collects the package suite, and a
daily-drivers note for ratio. Rewrote Files touched to match.
** DONE Expand the test plan for failure, concurrency, and live verification :blocking:
The testing plan covers normal parsing and fake command sequences, but it misses
the riskiest behaviors: slow/hung =nmcli=/=curl=/=librespeed=, concurrent
=net status= cache refresh, corrupt cache, stale cache after SSID change,
permission denied, sudo declined, DNS override cleanup failure, NM partial
activation, duplicate connection names, secret redaction, missing optional
dependencies, no WiFi hardware, wired+tether+WiFi ambiguity, portal redirect
tokens, and Waybar click handlers. Add unit/fixture tests for each class plus a
manual/live checklist gated out of the normal suite. This is blocking because
the current plan would leave the exact "things that can go wrong here" mostly
untested.
Disposition: accept — rewrote the Testing plan with the "Failure + concurrency"
class (slow/hung commands, single-flight, corrupt/stale cache, perm-denied,
cleanup-failure, partial activation, redaction, missing deps, no-wifi,
multi-active) and a per-phase live checklist gated out of the suite.
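The single-flight and corrupt/stale cache cases can be sketched as below. This is an illustration under assumptions: the function names, the =.lock= sidecar file, and the =ssid= / =state= field names are hypothetical, and =fcntl.flock= limits it to POSIX systems.

```python
import fcntl
import json
import os
import tempfile


def read_cache(path, current_ssid):
    """Return the cached dict, or None when it is missing, corrupt, or stale."""
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):
        return None  # a missing or corrupt cache is a miss, never a crash
    if not isinstance(data, dict) or data.get("ssid") != current_ssid:
        return None  # stale after an SSID change
    return data


def write_cache(path, data):
    """Write atomically: temp file in the same directory, then os.replace."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)  # the replace did not happen, so the temp file still exists
        raise


def refresh_single_flight(path, current_ssid, probe):
    """Run probe() under a non-blocking lock so only one refresh is in flight."""
    with open(path + ".lock", "w") as lock:
        try:
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Another refresh holds the lock: serve whatever is cached.
            return read_cache(path, current_ssid)
        data = probe()
        write_cache(path, data)
        return data
```

Readers never see a half-written file because =os.replace= swaps the whole file in one step, and a second concurrent =net status= falls back to the cached value instead of starting another probe.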
** DONE Define status JSON schemas and compatibility rules
The spec says all subcommands take =--json= but does not define schemas. Add
versioned JSON examples for =status=, =probe=, =list=, =diagnose=, =speedtest=,
and error envelopes, including nullable fields and unknown/degraded states. This
is non-blocking for product direction but should be fixed before code so tests
can lock the CLI contract.
Disposition: accept — added the "JSON schemas" section with versioned (=v:1=)
envelopes for status / probe / list / diagnose / speedtest and a shared error
envelope, including the degraded/unknown states.
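One possible shape for the versioned envelopes, sketched in Python. Only =v: 1= and the =error.message= / =error.detail= / =error.code= fields come from the spec; the =kind=, =ok=, and =data= keys and the status field names are assumptions for illustration.

```python
import json

SCHEMA_VERSION = 1  # the spec pins v:1


def ok_envelope(kind, data):
    """Wrap a successful result; `kind` names the subcommand (status, probe, ...)."""
    return {"v": SCHEMA_VERSION, "kind": kind, "ok": True, "data": data}


def error_envelope(kind, code, message, detail=None):
    """Shared error envelope used by every subcommand."""
    return {
        "v": SCHEMA_VERSION,
        "kind": kind,
        "ok": False,
        "error": {"code": code, "message": message, "detail": detail},
    }


def is_supported(doc):
    """Consumers refuse documents whose schema version they do not know."""
    return isinstance(doc, dict) and doc.get("v") == SCHEMA_VERSION


# Degraded example: nullable fields stay present and serialize as JSON null.
degraded_status = ok_envelope("status", {"state": "unknown", "ssid": None, "signal": None})
```

Keeping nullable fields present rather than omitted lets tests lock the key set per subcommand and treat a missing key as a contract break.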
** DONE Rename or alias the phasing section for workflow compatibility
The spec has a usable =Phasing= section, but the spec-review workflow expects an
=Implementation phases= section that can be lifted into =todo.org=. Rename it or
add an alias heading during response. This is non-blocking because the existing
phase decomposition is understandable, but aligning the heading prevents future
workflow friction.
Disposition: accept — renamed =Phasing= → =Implementation phases= and added
per-phase acceptance criteria.
** DONE Add documentation and rollout acceptance checks
Rollback is described, but docs and rollout are thin. Add README/user-guide
updates for commands, panel behavior, config file, GPG opt-in, troubleshooting,
and rollback; add acceptance checks for each phase, including a fresh-login
Waybar smoke test and restoring =custom/netspeed=. This is non-blocking but
important for handing the feature to a future session without re-discovery.
Disposition: accept — added per-phase acceptance criteria under Implementation
phases (incl. the fresh-login waybar smoke test and the =custom/netspeed=
restore), a Phase 4 "Docs + rollout", and (answering Craig's cj follow-up) a
dedicated "Help + documentation" section with the three help layers (CLI help,
panel help affordance, user guide).
** DONE Add a failure-mode coverage table :blocking:
The spec now names many individual network failures, but it still does not carry
one compact coverage matrix that says, for each common failure mode, whether
=net diagnose= detects it, whether =net doctor --fix= can repair it, and what
terminal user action remains when it cannot. Add a table covering at least:
rfkill soft block, rfkill hard block, no WiFi hardware, associated/no DHCP,
gateway unreachable, captive DNS hijack, broken DNS where 1.1.1.1 works, HTTP
portal, HTTP interception without a parseable portal URL, upstream/AP outage,
wrong WPA password or missing secret, enterprise auth/cert failure, duplicate
SSID/connection-name ambiguity, hidden SSID, multiple active links, wedged
NetworkManager, slow/hung command, stale/corrupt cache, DNS cleanup failure,
missing speedtest backend, and VPN/routing interference. This blocks because
Craig asked for confidence that the diagnostics and doctor cover the real field
failures, and prose scattered across sections is too easy to misread.
Disposition: accept — added the "Failure-mode coverage" section: a 22-row table
(every mode the finding named) with detect / doctor-fix / terminal-action
columns, conformed to the org-table standard (rules under every row, ≤120).
** DONE Pin DNS repair semantics in doctor :blocking:
The spec diagnoses DNS hijack, broken hotel DNS, and the temporary 1.1.1.1
override test, but =net doctor --fix= does not say whether it merely recommends
the override, applies a temporary override during recovery, or leaves DNS alone
after diagnosis. Define the exact behavior for each DNS class: captive hijack
should open the portal, broken DNS where 1.1.1.1 works should either offer an
explicit temporary repair with cleanup verification or recommend the command,
and port-53/egress blocking should stop as upstream/not locally fixable. This is
blocking because DNS is one of the most common "connected but unusable" failures
and the current doctor contract is ambiguous.
Disposition: accept — added "DNS handling in doctor (explicit per class)" under
the new Doctor section: hijack → open portal (no DNS mutation); broken DNS where
1.1.1.1 works → explicit temporary override with cleanup verification under
=--fix=, recommend otherwise; egress-blocked → terminal =upstream-not-local=.
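The per-class DNS rule can be expressed as a small pure function, which makes every branch unit-testable. This is a sketch: the class labels, the action names, and the returned keys are hypothetical, and only the three behaviors themselves come from the disposition above.

```python
def dns_doctor_step(dns_class, fix):
    """Map a diagnosed DNS class to the doctor's next step.

    `fix` mirrors whether the doctor was started with --fix.
    """
    if dns_class == "captive-hijack":
        # Hijack means a portal is in the way: open it, never touch DNS.
        return {"action": "open-portal", "mutates_dns": False,
                "verify_cleanup": False, "outcome": None}
    if dns_class == "broken-public-works":
        if fix:
            # Temporary override only, and cleanup must be verified afterwards.
            return {"action": "temporary-override", "mutates_dns": True,
                    "verify_cleanup": True, "outcome": None}
        return {"action": "recommend-override", "mutates_dns": False,
                "verify_cleanup": False, "outcome": None}
    if dns_class == "egress-blocked":
        # Nothing local can repair this, so the doctor stops here.
        return {"action": "stop", "mutates_dns": False,
                "verify_cleanup": False, "outcome": "upstream-not-local"}
    raise ValueError(f"unknown DNS class: {dns_class}")
```

Only one branch ever mutates DNS, and that branch is the only one that demands cleanup verification, so a test can assert the two flags always move together.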
** DONE Make auth failures terminal user-action states :blocking:
Wrong WPA password, missing NM secret, locked keyring/polkit denial, enterprise
802.1X certificate/identity failure, and portal login-required are not fixed by
resetting or bouncing NetworkManager. The doctor sequence should classify these
as =needs-user-action= terminal states, stop before looping through destructive
repairs, and tell the user the exact next action (enter password, edit profile in
=nmtui=/=nmcli=, accept portal terms, provide cert/identity, or retry with
admin auth). This blocks because repeated reset/bounce against auth failures is
slow, noisy, and can make the network state worse without helping.
Disposition: accept — added the =needs-user-action= terminal outcome to the
Doctor section: wrong password / missing secret / keyring-or-polkit denial /
802.1X cert-or-identity failure / portal-login-required all stop the doctor before
any destructive repair and name the exact next step.
** DONE Define upstream/AP/provider failure terminal states :blocking:
Some failures are not client-repairable: AP has no uplink, hotel gateway is
down, DHCP server is broken, gateway drops traffic, ISP outage, or captive
portal backend is failing. The spec should define how =diagnose= proves "local
link is up but upstream is broken" and how =doctor --fix= stops after local
repairs are exhausted with a clear message like "local repairs tried; likely
upstream/AP/provider" plus the evidence. This blocks because users need to know
when to stop poking the laptop and switch networks or contact the venue.
Disposition: accept — added the =upstream-not-local= terminal outcome: diagnose
proves link-up + IP + gateway-reachable but no route out and no captive redirect;
=doctor --fix= stops after local repairs with "local repairs tried; likely
upstream/AP/provider" + evidence → switch network / contact venue.
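The evidence rule for =upstream-not-local= reduces to a predicate over five observations. The sketch below assumes a hypothetical =Evidence= record and verdict keys; the message text and the next action are the ones quoted in this finding.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Evidence:
    link_up: bool
    has_ip: bool
    gateway_reachable: bool
    route_out: bool
    captive_redirect: bool


def is_upstream_not_local(ev):
    """Local link proven healthy, yet no route out and no portal in the way."""
    local_ok = ev.link_up and ev.has_ip and ev.gateway_reachable
    return local_ok and not ev.route_out and not ev.captive_redirect


def doctor_verdict(ev, repairs_tried):
    """Return the terminal verdict with its evidence, or None if not applicable."""
    if not is_upstream_not_local(ev):
        return None
    return {
        "outcome": "upstream-not-local",
        "message": "local repairs tried; likely upstream/AP/provider",
        "evidence": asdict(ev),
        "repairs_tried": list(repairs_tried),
        "next_action": "switch network / contact venue",
    }
```

A captive redirect or an unreachable gateway both make the predicate false, which keeps portal and local-link failures out of this terminal state.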
** DONE Decide how VPN and policy routing affect v1 diagnosis
VPN/WireGuard management is Phase 5, but active VPNs, policy routes, DNS
overrides, and firewall killswitches can break apparent internet access in v1.
The current spec does not say whether v1 detects active VPN/policy routing and
classifies "network is fine, VPN route/DNS is broken" separately from WiFi
failure. Add either a v1 diagnostic check for active VPN/default-route/DNS
ownership with a "deferred repair" outcome, or explicitly state that VPN-routed
failures are out of scope and may be misclassified. This is blocking if Craig
expects the module to diagnose normal daily-driver network failures while VPN
tooling remains separate.
Disposition: accept (chose the detect-and-classify option) — v1 detects an active
VPN / non-NM default route / non-NM DNS owner and classifies =deferred/vpn= ("link
is fine; internet is VPN-routed"), distinct from a WiFi failure. v1 does not
repair it (VPN management is Phase 5); it names the VPN as the likely owner and
stops. Added to the Doctor section + the coverage table + a doctor-classification
test.
** DONE Remove stale GPG-store references from the resolved spec
The spec now decides "no separate credential store; secrets live in
NetworkManager", but the Testing plan still mentions =gpg round-trip= and =GPG
store= tests, and the panel-state list still mentions a no-GPG-key state. Remove
those stale references and replace them with NM-secret/no-secret-leak tests.
This is non-blocking for product behavior but blocking for implementation
clarity: otherwise tests will be written for a credential store that no longer
exists.
Disposition: accept — replaced the Testing-plan =gpg round-trip= / =GPG store=
bullets with an "NM secrets / no-leak" test (add/edit writes the secret via nmcli;
assert no PSK/EAP in any JSON/log/error; no store to round-trip) and dropped the
=no-GPG-key= panel state. Residue from the cj-comment pass that dropped the store.
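A no-leak test needs two pieces: a redactor applied before anything is serialized, and a check that scans every output surface for the secret. The sketch below is one way to write both. The key list and helper names are hypothetical, and a real suite would wrap the assertions in a =unittest.TestCase=.

```python
import json

REDACTED = "<redacted>"
# Hypothetical key names; the real list depends on which fields the tool emits.
SECRET_KEYS = frozenset({"psk", "password", "private-key-password"})


def redact(value):
    """Recursively replace the values of secret-bearing keys."""
    if isinstance(value, dict):
        return {k: REDACTED if k in SECRET_KEYS else redact(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


def leaks(secret, *surfaces):
    """Return the surfaces (JSON text, log line, error text) containing the secret."""
    return [s for s in surfaces if secret in s]


profile = {"ssid": "cafe", "security": {"psk": "hunter2-long"},
           "history": [{"password": "hunter2-long"}]}
json_out = json.dumps(redact(profile))
```

The test then asserts =leaks("hunter2-long", json_out, log_text, error_text) == []= for every surface a command can write to.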
** DONE Reconcile status, goal, and task text before implementation :blocking:
The spec status says "Implementation-ready with caveats" and "Phase 1 ready to
build", but the body still has an unresolved enterprise add/edit VERIFY, the
Goal still says "optional GPG-encrypted secret store", and the unified task title
still names "GPG-stored secrets" even though the accepted design removed the
store. Before implementation, make the top-level status, goal, scope, task
mapping, and resolved decisions agree with the current design. This blocks
readiness because a developer starting from the top of the file would still build
or plan around abandoned GPG-store behavior.
Disposition: accept — fixed the Goal ("secrets stay in NM's own store"), the
=[#B]= task-mapping line (notes the "GPG-stored secrets" framing is superseded by
decision 5), the enterprise VERIFY (now resolved → Status updated), and corrected
the stale =pytest= mentions to =unittest= (the repo's actual harness). Top-of-file
status/goal/scope/decisions now agree with the design.
** DONE Resolve enterprise add/edit scope or make the caveat explicit :blocking:
The spec still says "One open question for Craig: pull enterprise add/edit into
v1?" and points to a VERIFY in =todo.org=. That is a real product-scope decision:
if enterprise add/edit is in v1, panel forms, nmcli command sequences, tests,
error messages, and docs change materially; if it is out, the UI must consistently
show activate-only with "edit in nmtui/nmcli". Decide it in the spec before
implementation, or downgrade the status to =Ready with caveats= with this exact
accepted caveat. As written, the spec cannot be plain =Ready=.
Disposition: accept — Craig decided (2026-06-29): enterprise add/edit is vNext,
activate-only in v1. Settled in the Status line, the Scope/Out bullet, decision 9,
and the VERIFY (now DONE in todo.org). The UI shows activate-only with "edit in
nmtui/nmcli" consistently. Evidence: 24 saved profiles, 0 enterprise.
** DONE Define the concrete test harness and coverage gate :blocking:
The spec says TDD, fake binaries on PATH, and benchmark tests, but it does not
define the actual harness contract: pytest vs unittest for the =net= package,
where fake =nmcli=/=curl=/=speedtest-go=/=rfkill=/=resolvectl= live, how test
fixtures encode command histories, how subprocess timeouts are simulated, how
Waybar scripts are executed end-to-end, and how coverage is run. Add the exact
Makefile targets (=test=, =test-unit= or package-local =pytest=), pytest config,
coverage command (e.g. branch coverage over =net/= and =waybar-net= wrappers),
minimum threshold, and the rule for reading the coverage report to add missing
tests before declaring a phase done. This blocks readiness because "what is the
test harness?" is still answerable only by analogy to older suites.
Disposition: accept — added the "Harness + coverage gate" section. Corrected the
premise: the repo is =unittest= (=make test= → =python3 -m unittest=, 33 suites),
not pytest. Pinned the fake-binary stub convention (=tests/<name>/fake-*= on a
temp PATH), the fixture command→output map, timeout simulation, the end-to-end
=waybar-net= subprocess run, and coverage via a throwaway venv (coverage.py is
absent system-wide) with a ≥90% branch target on the pure modules.
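The fake-binary convention can be sketched as a stub written into a temp directory that is placed first on PATH. The helper names and the =.out= sidecar file are assumptions, the stub needs a POSIX =/bin/sh=, and timeout simulation is left out of this sketch.

```python
import os
import subprocess
import tempfile


def make_fake_binary(directory, name, stdout="", exit_code=0):
    """Write an executable sh stub that prints fixture output and exits."""
    path = os.path.join(directory, name)
    with open(path + ".out", "w", encoding="utf-8") as fh:
        fh.write(stdout)  # fixture output lives next to the stub
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(f'#!/bin/sh\ncat "$0.out"\nexit {exit_code}\n')
    os.chmod(path, 0o755)
    return path


def run_with_fakes(argv, fake_dir, timeout_s=5):
    """Run argv with fake_dir first on PATH so stubs shadow the real tools."""
    env = dict(os.environ, PATH=fake_dir + os.pathsep + os.environ.get("PATH", ""))
    return subprocess.run(argv, env=env, capture_output=True, text=True, timeout=timeout_s)


if __name__ == "__main__":
    fake_dir = tempfile.mkdtemp()
    make_fake_binary(fake_dir, "nmcli", stdout="Home:1111\nCafe:2222\n")
    result = run_with_fakes(["nmcli", "-t", "-f", "NAME,UUID", "connection", "show"], fake_dir)
    print(result.stdout, end="")
```

This stub ignores its arguments. A fixture that maps each command line to its own output, as the disposition describes, would extend the stub to dispatch on its arguments.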
** DONE Use coverage to find missing behavior, not just report a percentage :blocking:
The spec does not say how coverage findings affect implementation. For this
feature, line coverage alone can miss the important holes: doctor classification
branches, cleanup-unverified paths, redaction paths, degraded hot-path fallbacks,
timeout branches, and auth/upstream/VPN terminal states. Define coverage review
criteria per phase: branch coverage for pure classifiers and parsers, named
untested branches allowed only with comments or manual-check entries, and a
required "coverage gap pass" after the first green test run that maps uncovered
logic back to tests or consciously excluded live-only behavior. This blocks
readiness because the current test plan is broad but does not force the suite to
expose missing edge tests.
Disposition: accept — added the "Coverage as a gap-finder, not a number (per
phase)" subsection: branch coverage required for the doctor classifier (every
outcome), cleanup-unverified, redaction, degraded-fallback, timeout, and the
parsers; a mandatory coverage-gap pass after the first green run mapping each
uncovered branch to a test or a named live-only exclusion; a phase isn't done
until that pass is recorded.
** DONE Convert error classes into exact user-facing strings and evidence fields :blocking:
The failure table and doctor outcomes classify errors well, but many messages
are still templates or descriptions rather than final text. Add exact strings
for indicator tooltip, notification, CLI stderr, JSON =error.message=, and panel
banner/step text for every failure-mode row, including cases doctor cannot fix:
wrong password, missing secret, enterprise cert failure, upstream/AP/provider
failure, VPN-routed failure, hard rfkill block, DNS cleanup failure, speedtest
missing, and HTTP interception without parseable URL. For each string, specify
the redacted evidence included and the next action. This blocks UX readiness
because "useful error" is only testable once the actual text and evidence are
defined.
Disposition: accept — rewrote the Failure states section: each row now carries the
exact final string (with =<placeholder>= evidence), the evidence field, and the
next action, plus a per-surface rendering rule (indicator tooltip / notify /
CLI+JSON error.message+detail+code / panel banner all render the one canonical
string). Added the missing doctor-unfixable rows: hard rfkill, wrong password /
missing secret, enterprise cert failure, upstream/AP/provider, VPN-routed, HTTP
interception without a parseable URL, and DNS cleanup-unverified.
** DONE Add an enhancement disposition table
The spec captures several good enhancements (doctor, Makefile recovery, rfkill,
airplane absorption, VPN phase), but it does not show that low-cost adjacent
enhancements were considered and accepted/deferred/rejected. Add a small radar
table for likely affordances: copy redacted doctor report, open/copy portal URL,
retry with hardware MAC, forget network, rescan now, pin speedtest server, show
last good network/result, watch mode for =net doctor=, desktop notification
actions, QR-code/share WiFi import/export, and keyboard picker. Mark each
=v1=, =vNext=, or =rejected= with a one-line reason. This is non-blocking, but it
prevents accidental loss of cheap UX wins and keeps the v1 panel focused.
Disposition: accept — added the "Enhancement radar" table dispositioning all the
named affordances: open/copy portal URL, forget network, rescan, hardware-MAC
retry, pin speedtest server, copy redacted doctor report = v1; last-good
network/result, doctor watch mode, actionable notifications, keyboard picker =
vNext; QR-share = rejected (low value for a 2-machine personal setup).
** DONE Tighten the panel UX flow before Phase 2
The panel has sections and state machines, but not a concrete interaction flow:
default focused section, row content, primary/secondary buttons, disabled-state
rules, confirmation wording for reset/bounce/DNS override, how "Get me online"
reports each escalation, what stays visible after the panel closes, and keyboard
navigation. Add a short UX flow spec or wire-level outline before Phase 2. This
is non-blocking for Phase 1, but it blocks Phase 2 implementation because a GTK
panel can easily become noisy or surprising if these defaults are invented while
coding.
Disposition: accept — added the "Panel UX flow (settle before Phase 2)"
subsection: default focus (Connections, or Diagnose when opened from a captive
state), row content, one primary button per section, disabled-state rules, exact
confirmation wording for reset/bounce/DNS-override/remove, the live "Get me
online" escalation reporting, what survives panel close, and keyboard nav.
** DONE Reconcile the panel navigation source of truth :blocking:
Disposition: accept — folded into "V2 panel UX". V2 (Connections | Diagnostics |
Performance) is the sole current target; the shipped four-page stack is marked history,
not active design.
The spec now names at least three navigation shapes: the shipped four-page stack
(Connections / Diagnose / Repair / Speed test), the V2 three-tab plan
(Connections / Diagnostics / Performance), and the redesign task's Diagnostics
sub-row (Diagnose / Get Me Online / Advanced). That leaves an implementer free
to keep extra pages and buttons even though Craig is explicitly asking for the
opposite. Make V2 the sole current target: one panel opened from the bar, top
tabs =Connections | Diagnostics | Performance=, with Diagnostics owning the
read-only checks, repair stream, debug capture, doctor report, and related
diagnostic evidence. Mark the old four-page stack as shipped history only, not
active design. This blocks the redesign because the page model determines what
code is deleted, not just labels.
** DONE Fold speed tests into the diagnostic story :blocking:
Disposition: modify — Craig pre-decided Speed test lives under Performance (decision
19), and Performance carries future throughput history, which meets this finding's own
"keep the tab only if it carries ongoing throughput" condition. Accepted the rest:
Diagnostics runs a lightweight inline latency/throughput probe as a Diagnose evidence
row (with skip conditions for offline / metered / no-backend), and the full speed
result is stored in the doctor report. Folded into "V2 panel UX → Diagnostics owns the
diagnostic story".
Speed test is currently isolated under =Performance=, while the Goal and user
mental model treat speed, latency, and packet loss as part of "diagnostics."
That split risks another top-level button/page whose only job is a diagnostic
measurement. Keep the top-level =Performance= tab only if it carries ongoing
throughput/history later; for V2, specify that Diagnostics can run a lightweight
performance check from the same Diagnose/Get Me Online flow when internet is
available, and that the full speed test is presented as a diagnostic evidence
row or secondary action rather than a separate repair-adjacent workflow. Define
when it is skipped (offline, metered/hotspot warning, missing backend) and how
the result is stored in the doctor report. This is blocking because otherwise
the implementation preserves avoidable navigation and misses a useful failure
signal.
** DONE Define saved-list vs available-scan semantics :blocking:
Disposition: accept — folded into "V2 panel UX → Connections". Saved / Available now /
Wired groups; Rescan refreshes only the availability/signal layer, never the Saved
list.
=net list= merges saved profiles with in-range scanned networks, while the panel
copy calls the page "Connections" and the control "Rescan." It is not clear to a
user whether they are looking at saved connections, currently available
networks, or both. The current implementation confirms the ambiguity:
=connections.py= lists saved profiles MRU-first, merges live signal/security for
saved profiles that are in range, then appends unsaved in-range SSIDs with
=uuid: nil=. Rename and specify the groups: e.g. =Saved= (instant, does not
require scan), =Available now= (scan-backed, may still be loading/stale), and
=Wired=. =Rescan= should refresh only the availability/signal layer, not gate the
saved profile list. This blocks readiness because it affects loading behavior,
button enablement, and whether unsaved rows can be selected.
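The proposed grouping can be sketched as a pure function over the saved list and an optional scan result. The function name, the row keys, and the =scan_pending= flag are hypothetical, and duplicate SSIDs in the scan are not deduplicated here.

```python
def group_connections(saved, scan, wired=()):
    """Group rows into Saved / Available now / Wired.

    saved: profile dicts with "ssid" and "uuid", already in MRU order.
    scan:  access-point dicts with "ssid", "signal", "security",
           or None while the scan has not finished yet.
    """
    in_range = {ap["ssid"]: ap for ap in (scan or [])}
    saved_rows = []
    for profile in saved:
        ap = in_range.get(profile["ssid"])
        saved_rows.append({
            **profile,
            "in_range": ap is not None,
            "signal": ap["signal"] if ap else None,
            "security": ap["security"] if ap else profile.get("security"),
        })
    saved_ssids = {p["ssid"] for p in saved}
    # Unsaved in-range networks carry no profile, hence uuid None.
    available = [{**ap, "uuid": None} for ap in (scan or []) if ap["ssid"] not in saved_ssids]
    return {
        "saved": saved_rows,
        "available_now": available,
        "wired": list(wired),
        "scan_pending": scan is None,
    }
```

Calling it first with =scan=None= yields the Saved group immediately, and calling it again when the scan returns changes only the scan-backed fields.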
** DONE Replace the Add page with join-from-row behavior :blocking:
Disposition: accept — folded into "V2 panel UX → Connections". Selecting an unsaved
Available-now row is the join flow (SSID/security prefilled); the standalone Add modal
is deleted for visible networks; hidden/manual join lives behind Advanced.
The current Add dialog asks for an SSID as free text even though a scan usually
already found the SSID and security type. That is redundant UI and a common
network-manager mistake: it turns "join this visible network" into "copy a name
from the list and type it again." V2 should remove the standalone Add button and
modal for normal visible networks. Selecting an unsaved available row should
become the join flow: the SSID/security are prefilled from the row, open
networks connect with a confirmation only if needed, WPA/WPA2/WPA3-Personal ask
only for the password, and hidden/manual SSID is tucked behind an Advanced
"Join hidden network" affordance. Keep edit/create for enterprise profiles out
of v1/V2 unless explicitly added later. This blocks the redesign because it
changes the primary connection workflow and deletes a whole page/control.
** DONE Pin the supported authentication types in the join flow :blocking:
Disposition: accept — folded into "V2 panel UX → Supported authentication classes".
The spec says "open + WPA-PSK" and "enterprise activate-only," but cafe/hotel
networks also commonly appear as open captive portals, WPA/WPA2/WPA3-Personal
(PSK/SAE), and sometimes transition-mode networks; less commonly they use
enterprise/802.1X, WEP, OWE/enhanced-open, MAC registration, voucher portals, or
proxy-required networks. Define the V2 join matrix from the scanned NM
=SECURITY= value: supported inline (open, captive/open, WPA/WPA2/WPA3 Personal),
activate-only if already saved (802.1X/enterprise), hidden-manual behind
Advanced, and unsupported/rare types with a clear in-panel explanation plus a
non-terminal next step. If an auth type is common enough to support, support it
in the panel; if it is too rare for V2, say "not supported here yet" and keep
the user in the same UI rather than sending them to a terminal tool. Also define
what security label appears in the row so the user knows why a password is or is
not requested. This blocks because the Add/Join deletion above cannot be
implemented safely without knowing which auth classes the simplified flow covers.
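A join matrix keyed on the scanned =SECURITY= value might look like the sketch below. The token set (=WPA1=, =WPA2=, =WPA3=, =802.1X=, =WEP=, =OWE=, and =--= or empty for open) is an assumption about what nmcli prints and should be checked against fixtures. Treating unsaved enterprise and OWE networks as unsupported is one reading of this finding, not a settled decision.

```python
PERSONAL_TOKENS = {"WPA1", "WPA2", "WPA3"}  # assumed nmcli SECURITY tokens


def join_class(security, saved):
    """Classify a scanned network into a join-flow class.

    Returns "open", "personal", "activate-only", or "unsupported".
    """
    tokens = set(security.replace("--", "").split())
    if "802.1X" in tokens:
        # Enterprise: usable only through an already saved profile.
        return "activate-only" if saved else "unsupported"
    if not tokens:
        return "open"  # includes open networks that later show a captive portal
    if tokens <= PERSONAL_TOKENS:
        return "personal"  # ask only for the password
    return "unsupported"  # WEP, OWE, and anything unrecognized
```

The row's security label can be derived from the same tokens, so a user can see why a password is or is not requested.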
** DONE Fix destructive confirmation tense and verification
Disposition: accept — folded into "V2 panel UX → Forget confirmation".
The Forget confirmation says "The saved password is deleted" before the user has
clicked Forget. That reads as if the destructive action already happened. Change
the copy to future tense and name the scope, e.g. "This will remove the saved
NetworkManager profile and its stored password from this machine." After the
operation, verify the UUID is gone, refresh the Saved list, and report either
"Forgot <SSID>" or "Could not forget <SSID>; nothing changed / partial state
<evidence>." This is non-blocking because the existing confirm prevents an
accidental click, but the wording is misleading and the V2 "verify every action"
decision should cover it.
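The verify-after-forget rule can be sketched with the two NetworkManager operations injected as callables, so it is testable without a live system. =forget_and_verify= and both callables are hypothetical names, and the evidence wording after "partial state" is illustrative.

```python
def forget_and_verify(ssid, uuid, delete, list_uuids):
    """Delete a saved profile, then confirm the UUID is really gone.

    delete(uuid) -> bool     reports whether the delete command succeeded.
    list_uuids() -> set[str] returns the UUIDs of all saved profiles.
    """
    if uuid not in list_uuids():
        return f"Could not forget {ssid}; nothing changed (profile not found)"
    deleted = delete(uuid)
    if uuid not in list_uuids():
        return f"Forgot {ssid}"
    if not deleted:
        return f"Could not forget {ssid}; nothing changed"
    # The command claimed success but the profile is still listed.
    return f"Could not forget {ssid}; partial state (profile {uuid} still present)"
```

In the real tool =delete= would presumably wrap something like =nmcli connection delete uuid <UUID>= and =list_uuids= a =nmcli -t -f UUID connection show= call. The Saved list is refreshed from the same post-delete listing.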
** DONE Make connection loading progressive and observable :blocking:
Disposition: accept — folded into "V2 panel UX → Connections (progressive loading)".
Opening the panel currently says "Loading connections..." while =net list=
collects both saved profiles and the WiFi scan. Saved profiles do not require a
network scan, so a slow scan should not delay the saved list. Split loading into
two phases: render saved NM profiles immediately, then overlay availability,
signal, and unsaved in-range rows when the scan completes. Show a small
scan-in-progress state with elapsed time and stale-last-scan age, and make
Rescan update only the scan-backed fields. This blocks because it is the direct
answer to "why does it take so long to see the list of connections?" and keeps a
bad radio scan from making the whole panel feel broken.
** DONE Define the visual contract with Waybar and existing Archsetup UI
Disposition: accept — folded into "V2 panel UX → Visual contract".
The panel is a layer-shell popup anchored under Waybar, but the spec does not
state the visual contract. The live Waybar theme uses a dark rounded capsule
(=border-radius: 1rem=), golden border, compact monospace text, and state colors
for =custom/net=; the GTK panel currently has a generic title, stack switcher,
default GTK controls, and square-ish/default widget corners. Add a short style
section: panel should read as a Waybar-attached popup, not a separate app; match
Waybar's palette, border/radius, spacing density, and state colors; avoid square
corners where surrounding UI is rounded; keep cards out of cards; use compact
icon+label controls with tooltips for advanced repairs. Also cite any existing
Archsetup-owned GTK/panel conventions that should be reused. This is
non-blocking for engine work but should block final V2 UX acceptance.
** DONE Add a diagnostics report affordance that users can actually find
Disposition: accept — folded into "V2 panel UX → Findable diagnostics report".
The observability design has a JSONL event log, =diagnose --json=, automatic
verbose capture, and redacted bundles, but the panel flow does not yet define
the user affordance that turns those into an inspectable diagnosis. Add a
Diagnostics-side "Copy report" / "Open report" action after every diagnose,
repair, and speed/performance run. The report should include the current step
statuses, elapsed time, final classification, last speed/latency result when
available, scan age, route/interface owner, relevant redacted event-log tail,
and bundle path when verbose capture ran. It must explicitly say whether any
repair mutated state and whether cleanup/verification passed. This is blocking
for observability because "logs exist somewhere" is not enough when the network
is already failing.
* Review and iteration history
** 2026-06-29 Mon @ 17:00:39 -0400 — Codex — reviewer
- *What changed or was recommended:* Rubric: =Not ready=. Applied the
spec-review workflow and added blocking findings for diagnostics structure,
user-facing errors, observability, nmcli contracts, cache concurrency,
hot-path performance, Waybar click actions, mutation rollback, credential
security, unsupported WiFi types, panel states, speed-test dependency
verification, install boundaries, and test coverage.
- *Why:* The spec has the right broad shape, but this feature combines a status
indicator, NetworkManager mutation, sudo-backed remediation, captive-portal
probing, secret storage, and GTK UI. Too many failure modes were still left for
implementers to invent during coding.
- *Artifacts:* Findings recorded in =Review findings=. Local code read included
=~/.dotfiles/hyprland/.local/bin/waybar-netspeed=,
=~/.dotfiles/common/.local/bin/captive=,
=~/.dotfiles/tests/waybar-netspeed/test_waybar_netspeed.py=,
=~/.dotfiles/tests/captive/test_captive.py=, pocketbook's GTK layer-shell
scaffold, current Waybar config, and the linked =todo.org= tasks. External
checks included the Arch nmcli man page and upstream LibreSpeed CLI page.
** 2026-06-29 Mon @ 17:35:00 -0400 — Claude Code (archsetup) — responder
- *What changed:* Processed all 18 findings — 13 accepted, 5 modified, 0
rejected. Accepts added new sections: Repository + dependencies, Performance
budgets, nmcli contract, JSON schemas, Diagnostics contract (+ Diagnose vs
Repair), Failure states/messages/recovery, Observability, cache
concurrency/atomicity/staleness, mutation safety + rollback, the rewritten
credential security model, the expanded test plan, the renamed Implementation
phases with per-phase acceptance criteria, and a Status line.
Modifies: deferred the =net debug-bundle= command (log + =diagnose --json=
cover it), made click actions phase-aware + non-blocking, scoped enterprise
WiFi to activate-only (add/edit vNext), trimmed the panel state set to states
that occur on the two-machine fleet, and switched the speed-test backend to the
already-installed =speedtest-go= (verified live) instead of a new
librespeed-cli dependency.
- *Why:* Codex's review was strong and mostly right; the modifies are scope calls
for a single-user, two-machine tool (avoid gold-plating, use installed deps,
don't model impossible states) and one phasing reconciliation (the panel is
Phase 2, so Phase-1 clicks can't open it).
- *Artifacts:* All 18 findings completed =DONE= in place with dispositions. Live
check: =pacman -Qq | grep speed= (speedtest-go-bin installed; =speedtest-go
--json= confirmed). Status now "review incorporated (Codex, 2026-06-29)",
implementation-ready with the named Phase-2/3 caveats; Phase 1 ready to build.
** 2026-06-29 Mon @ 17:37:58 -0400 — Claude Code (archsetup) — responder (cj comments)
- *What changed:* Folded in Craig's 10 cj comments on the spec. Design changes:
(1) dropped the separate credential store entirely — secrets stay in NM's own
=system-connections= (root =0600=, inline), touched via nmcli; no GPG, no
gpg-agent (rewrote the secrets section, decision 5, dropped Phase 4 + the gpg
dep). (2) Added =net doctor [--fix]= + Makefile console-recovery targets
(=make online= etc.) as a first-class TTY path; reversed the earlier
defer-the-doctor call (decision 10). (3) Added a full-stack =bounce= repair and
an =rfkill= unblock repair + indicator state — the rfkill one recovers the
framework-laptop post-power-loss soft-block Craig hit. (4) =custom/net= absorbs
the airplane module; the standalone airplane scripts/tests/module are deleted on
ship. (5) Moved VPN/WireGuard from "out" to a planned Phase 5. (6) Added a
"Help + documentation" section (CLI help / panel help / user guide). Answered
the enterprise-defer rationale and the captive-auto-login explanation inline.
- *Why:* Craig's comments simplify (NM-only secrets, fewer deps) and harden the
recovery story (doctor + make targets + rfkill/bounce reachable from a dead-GUI
console — his stated need). Enterprise add/edit in v1 is the one open call,
raised as a VERIFY.
- *Artifacts:* All 10 cj blocks removed. Live checks: airplane scripts/tests
present (confirmed deletion targets); =/etc/NetworkManager/system-connections/=
is root =0600= with inline secrets (confirms the NM-only secret model);
=rfkill= present. VERIFY filed in =todo.org= for the enterprise-scope decision.
** 2026-06-29 Mon @ 17:49:24 -0400 — Codex — reviewer
- *What changed or was recommended:* Added a second feedback batch from the
common network failure-mode review: require a diagnose/doctor coverage table,
pin DNS repair semantics, make auth failures terminal user-action states,
define upstream/AP/provider terminal states, decide how active VPN/policy
routing affects v1 diagnosis, and remove stale GPG-store references.
- *Why:* The design now has a solid recovery shape, but the field question is
sharper: for each common failure, does =diagnose= catch it, does =doctor --fix=
fix it, or does the user need to switch networks/provide credentials/contact
the venue? That mapping needs to live in the spec, not in chat.
- *Artifacts:* Findings appended under =Review findings= after reviewing the
current spec's Diagnostics, Failure states, Doctor, Testing plan, and
responder history.
** 2026-06-29 Mon @ 18:06:22 -0400 — Codex — reviewer
- *What changed or was recommended:* Added a readiness pass focused on
development-time questions, missed enhancements, UX cleanliness, error wording,
and test/coverage mechanics. New findings require reconciling stale status/goal
text, resolving or explicitly caveating enterprise add/edit, defining the
concrete harness and coverage gate, making coverage actionable, turning failure
classes into exact user-facing strings, adding an enhancement disposition
table, and tightening the Phase-2 panel UX flow.
- *Why:* The network failure model is now strong, but implementation would still
force developers to invent test infrastructure, coverage expectations, final
error copy, and several UX details. The spec should decide those up front or
explicitly mark them as caveats.
- *Artifacts:* Findings appended under =Review findings= after reading the
updated spec, current archsetup Makefile test targets, pocketbook's pyproject,
and the linked =todo.org= enterprise VERIFY.
** 2026-06-29 Mon @ 17:55:08 -0400 — Claude Code (archsetup) — responder (2nd review)
- *What changed:* Processed all 6 second-batch findings — 6 accepted, 0 modified,
0 rejected. Added a "Doctor: escalation, classification, terminal states"
section with four outcomes (=fixable=, escalating lightest-first;
=needs-user-action=, =upstream-not-local=, =deferred/vpn= terminal) and explicit
per-class DNS handling. Added a "Failure-mode coverage" table (22 rows: detect /
doctor-fix / terminal action, org-table-standard conformant, max line ≤120).
Cleaned the last
GPG-store residue (Testing plan → NM-secret/no-leak test; dropped the no-GPG-key
panel state). Added a doctor-classification test class.
- *Why:* The findings sharpen the doctor from "escalate repairs" to "classify,
then stop at the right terminal state" — the safety property that keeps
=doctor --fix= from looping destructive repairs against a wrong password or an
upstream outage. The VPN finding got the detect-and-classify option (not repair;
that's Phase 5) so v1 doesn't misread a VPN-routed failure as WiFi down. The
coverage table is the single artifact Craig asked for to trust the field
coverage.
- *Artifacts:* All 6 findings =DONE= in place; cookie =[24/24]=. Table conformed
via =wrap-org-table.el= (max line 120). No new findings introduced; no scope
expansion past what the findings asked. Spec remains implementation-ready, with the
enterprise add/edit VERIFY still the one open decision.
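The classify-then-stop behaviour described in this entry can be sketched as
follows. This is a minimal illustration, not the shipped engine: the =Outcome=
enum, the =Repair= dataclass, and =run_doctor= are hypothetical names, and only
the four outcome labels come from the spec. Checking connectivity after each
repair is an assumption of the sketch.

#+begin_src python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Callable, Sequence


class Outcome(Enum):
    """The four doctor outcomes named in the spec."""
    FIXABLE = "fixable"
    NEEDS_USER_ACTION = "needs-user-action"
    UPSTREAM_NOT_LOCAL = "upstream-not-local"
    DEFERRED_VPN = "deferred/vpn"


@dataclass
class Repair:
    name: str                  # hypothetical label, e.g. "bounce"
    apply: Callable[[], None]  # performs the repair


def run_doctor(
    classify: Callable[[], Outcome],
    repairs: Sequence[Repair],  # ordered lightest-first by the caller
    is_online: Callable[[], bool],
) -> tuple[Outcome, list[str], bool]:
    """Classify first; escalate repairs only for a fixable failure."""
    outcome = classify()
    tried: list[str] = []
    if outcome is not Outcome.FIXABLE:
        # Terminal state: run no repairs against a wrong password,
        # an upstream outage, or a VPN-routed failure.
        return outcome, tried, False
    for repair in repairs:
        repair.apply()
        tried.append(repair.name)
        if is_online():  # stop escalating as soon as a repair works
            return outcome, tried, True
    return outcome, tried, False  # every tier tried, still offline
#+end_src

With a classifier that returns =Outcome.NEEDS_USER_ACTION=, =run_doctor= returns
without calling any repair, which is the property that keeps =doctor --fix= from
looping.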
** 2026-06-29 Mon @ 18:24:51 -0400 — Claude Code (archsetup) — responder (3rd review)
- *What changed:* Processed the third review (7 findings). The enterprise-scope
finding closed first on Craig's call (vNext, activate-only). The other 6 were all
accepted. (1) Reconciled the top-of-file text: fixed the Goal's GPG-store
wording, the =[#B]= task-mapping line, and the resolved enterprise VERIFY, and
corrected the stale =pytest= mentions to =unittest= (the repo's real harness).
(2) Added a "Harness + coverage gate" section (unittest, fake-binary stubs on a
temp PATH, venv coverage, ≥90% branch on pure modules). (3) Added a per-phase
"coverage as a gap-finder" pass. (4) Rewrote the Failure states section to exact
final strings + evidence fields + a per-surface rendering rule, and added the
missing doctor-unfixable rows. (5) Added the "Enhancement radar" table
(v1/vNext/rejected). (6) Added the "Panel UX flow" subsection.
- *Why:* The findings close the gap between "design decided" and "a developer can
start": the harness/coverage contract, the exact UX strings, and the panel flow
are the things otherwise invented mid-code. The =pytest=→=unittest= correction
was a real defect — the spec contradicted the repo's actual test convention.
- *Artifacts:* All 31 findings =DONE=; cookie =[31/31]=. Both new tables conformed
via =wrap-org-table.el= (coverage 120, radar 110). Harness verified against the
live repo (33 unittest suites, =make test=, coverage.py absent → venv). Status
raised to "Ready for Phase 1; Ready-with-caveats overall" — no open decisions
remain.
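The fake-binary approach named in the harness item can be sketched like this. It
is an assumed shape, not the repo's actual test code: =make_stub=,
=FakeNmcliTest=, and the stub's canned output are hypothetical, and the sketch
assumes a POSIX =/bin/sh= and an executable temp directory.

#+begin_src python
import os
import shutil
import subprocess
import tempfile
import unittest
from pathlib import Path


def make_stub(bin_dir: Path, name: str, stdout: str, exit_code: int = 0) -> Path:
    """Write an executable shell script that prints canned output.

    `stdout` must not contain single quotes (kept simple for the sketch).
    """
    stub = bin_dir / name
    stub.write_text(f"#!/bin/sh\nprintf '%s\\n' '{stdout}'\nexit {exit_code}\n")
    stub.chmod(0o755)
    return stub


class FakeNmcliTest(unittest.TestCase):
    def setUp(self):
        tmp = tempfile.TemporaryDirectory()
        self.addCleanup(tmp.cleanup)
        make_stub(Path(tmp.name), "nmcli", "connected")
        # The temp dir goes first so the stub shadows any real nmcli.
        path = tmp.name + os.pathsep + os.environ.get("PATH", "")
        self.env = dict(os.environ, PATH=path)

    def test_stub_shadows_real_binary(self):
        exe = shutil.which("nmcli", path=self.env["PATH"])
        result = subprocess.run(
            [exe, "general", "status"],
            env=self.env, capture_output=True, text=True, check=True,
        )
        self.assertEqual(result.stdout.strip(), "connected")
#+end_src

Saved as a test module, this runs under =python -m unittest=. Code under test
that resolves binaries through =PATH= sees the stub instead of the real tool, so
no test touches the live network stack.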
** 2026-06-30 Tue @ 16:32:07 -0400 — Claude Code (archsetup) — responder (build + V2 redesign)
- *What changed:* Brought the spec current with what shipped and what got decided
across 2026-06-29/30. Recorded Phases 1-3 as SHIPPED (engine, indicator, GTK4
panel, bar clicks). Added the native captive-login engine (=portal-login= repair
tier replacing the =captive= shell-out), the live-testing portal UX fixes (removed
the polkit-gated flush, already-online short-circuit, Chrome first-run suppression,
in-panel portal, extractor hardening), and the panel auto-hide + Close button.
Then folded in the V2 redesign Craig directed: no terminals anywhere, a passwordless
root-helper + NOPASSWD sudoers as the enabler, verify-every-action, the full
~44-mode failure catalog (now the authoritative coverage, in =todo.org=), and the
Connections | Diagnostics | Performance nav. Added decisions 15-19, V2 phases
2.1-2.4, and corrected the two failure-table rows the shipped code never matched
(gateway-bounce, venue-DNS auto-revert).
- *Why:* The spec had drifted behind the code (still said "Phase 1 shipped, terminal
for privileged ops") and behind Craig's redesign calls. The doc is the contract for
the V2 build, so it has to state the no-terminal reversal, the sudo-helper
dependency, and where coverage now lives.
- *Artifacts:* Updated Status, Architecture, Engine CLI surface, Doctor DNS handling,
Failure-mode coverage, Interactions, Panel, Resolved decisions, Implementation
phases. Shipped commits referenced: dotfiles =a7d7559=, =eef6b0b=, =450b7f0=. Full
catalog + redesign children in =todo.org= "Network panel redesign".
** 2026-06-30 Tue @ 17:29:29 -0400 — Codex — reviewer
- *What changed or was recommended:* Rubric: =Not ready= for the V2 redesign until
the new blocking findings are dispositioned. Added a fourth review batch focused on
diagnostics/observability and the simple UX questions Craig raised: reconcile the
active panel nav, fold performance/speed testing into the diagnostics story, make
saved-vs-available list semantics explicit, replace the Add page with join-from-row,
pin supported WiFi authentication classes, fix Forget confirmation tense, make
connection loading progressive, define the Waybar/Archsetup visual contract, and add
a findable redacted diagnostics report affordance.
- *Why:* The spec has strong engine and failure-mode coverage, but the current doc and
shipped implementation still replicate common network-manager UX mistakes: ambiguous
mixed lists, scan-gated saved data, redundant SSID entry, misleading destructive
copy, diagnostics split across too many pages, and observability that exists in logs
more than in the user's workflow.
- *Artifacts:* Findings appended under =Review findings= after reading the
spec-review workflow, the target spec, todo.org's redesign task, and current
dotfiles implementation files =net/src/net/connections.py=,
=net/src/net/gui.py=, =net/src/net/panel.py=, and Waybar =style.css=.
** 2026-06-30 Tue @ 17:36:40 -0400 — Claude Code (archsetup) — responder (4th review)
- *What changed:* Dispositioned all nine fourth-review findings (8 accept, 1 modify,
0 reject) and wove them into a new "V2 panel UX — the target design" section.
Accepts: V2's three-tab nav is the sole target (the four-page stack is history);
Connections splits into Saved / Available now / Wired with Rescan touching only the
scan layer; selecting an unsaved row is the join flow (the Add modal is deleted);
a join matrix pins which NM SECURITY classes are inline-supported / activate-only /
hidden / unsupported; loading renders Saved first and overlays the scan; the Forget
copy goes future-tense + verified; every run ends with a Copy/Open redacted report;
and a Waybar visual contract is pinned (rounded capsule, golden border, state colors).
Modify:
the speed-test finding kept Craig's decision-19 placement (full speed test under
Performance, which carries future history) while accepting a lightweight inline
latency probe as Diagnose evidence stored in the doctor report. Cookie =[40/40]=.
- *Why:* Codex read the live implementation and caught the places where the module's UX
still replicated common network-manager mistakes — mixed lists, scan-gated saved
data, redundant SSID entry, misleading destructive copy, diagnostics scattered
across pages, observability that lived in logs more than the workflow.
- *Artifacts:* Findings 32-40 completed in place with dispositions; the modify reason
is recorded on the speed-test finding. New "V2 panel UX" section under Panel. =todo.org= redesign
task updated to point the V2 build at the dispositioned design.