#+TITLE: Rulesets — Work
#+AUTHOR: Craig Jennings
#+DATE: 2026-04-19
Tracking TODOs for the rulesets repo that span more than one commit.
Project-scoped (not the global =~/org/roam/inbox.org= list).
* Rulesets Priority Scheme
** Priority
- =[#A]= *Urgent risk or current workflow blocker.* Credential exposure, data loss, destructive behavior, startup breakage, failing tests that block work, or a feature/refactor that unblocks a core daily workflow. =[#A]= requires a =SCHEDULED:= or =DEADLINE:= date — if it can't be dated, it isn't really =[#A]=.
- =[#B]= *Important planned work.* Concrete bugs, high-leverage architecture cleanup, brittle load-order or test gaps, dependency failures, or feature work with clear design and expected near-term use.
- =[#C]= *Useful but optional.* Low-risk cleanup, ergonomics, smoke tests, investigations with limited current impact, or feature work that would improve the setup but isn't yet a committed workflow.
- =[#D]= *Someday or watchlist.* Speculative features, tiny polish, upstream tracking, optimizations without current pain, deferred ideas that shouldn't compete with active maintenance.
The scheme is importance-driven with optional urgency lift. Priority signals "does this matter and when," not "how big" — effort lives in the tags.
** Tags
Every task carries one *type tag* from this set:
- =:feature:= — adds new capability.
- =:chore:= — meta or housekeeping (tooling, sync, version bump, mechanical cleanup).
- =:spec:= — design document, brainstorm output, or research-backed proposal that precedes implementation.
- =:bug:= — fix to incorrect behavior.
Optional *effort and autonomy tags* — orthogonal to type, both can apply on the same task:
- =:quick:= — likely to take ≤30 minutes from start through verification.
- =:solo:= — Claude can complete the work end to end, including verification, without input from Craig.
Tags are assigned and refreshed by =task-audit=; =task-review= keeps them honest in passing.
* Rulesets Open Work
** TODO [#B] Helper-instance support — concurrent same-project Claude :feature:spec:
:PROPERTIES:
:CREATED: [2026-06-11 Thu]
:LAST_REVIEWED: 2026-06-12
:END:
SPEC REVIEWED 2026-06-12: [[file:docs/design/2026-05-28-generic-agent-runtime-spec-review.org][Codex review]] now rates Phase 1.5 =Ready with caveats=. Before any build, keep the Emacs integration as a cross-project handoff to =~/.emacs.d=, preserve the three-ring gate (bats → sandbox drills → pilot project), and do not let startup/helper changes reach synced template paths until the live drills pass.
Implement Phase 1.5 of the generic-agent-runtime spec ([[file:docs/design/2026-05-28-generic-agent-runtime-spec.org][spec]], amended 2026-06-11 with the "Concurrent same-project agents" section). Craig's case: spawn a second Claude in the same project to look things up or update tasks safely while the primary works. The session-context split (AI_AGENT_ID + session-context.d/) already shipped; this builds the rest:
- =agent-roster= detection script (the load-bearing piece, replaces operator action entirely): pgrep + /proc cwd match within project root + self-ancestry exclusion; verified live 2026-06-11 with 4 concurrent agents. Bats coverage.
- Startup detection-first: the roster check runs before Phase A.0's pulls; not-alone routes to the new =helper-mode.org= role contract and runs nothing else; alone keeps crashed-vs-fresh anchor logic (the roster also disambiguates crashed primary from live primary).
- =helper-mode.org= template workflow: identity self-assignment (helper-<rand4>, recorded in its own .d/ context file), the read/write tiers, light start, helper wrap-up. Auto-routed, no trigger phrase.
- =ai --helper= as the deterministic spawn path (Craig's shell-script point: a script can't skip the check): roster → id export → launch with the helper-mode opener; warn-and-run-primary on empty roster. The startup roster check stays as the safety net for raw launches.
- Wrap-up ordering: helper wrap-up re-runs the roster — orphaned helper (primary already gone) assumes full closing duties incl. commit+push (the git ban is concurrency-scoped and lifts when alone; otherwise its edits strand as a dirty tree); primary wrap-up with live helpers pauses at the commit and asks (commit helper WIP / wait / leave closing to the helper).
- Shared-file read/write contract into protocols.org pointing at helper-mode.org (helper: scoped single-heading org edits only; file-wide passes, inbox processing, and all git mutation stay primary-only); helper branches in startup.org (light path, no pulls/rsync) and wrap-it-up.org (archive own file, skip hygiene + commit).
- Bats: launcher id assignment/sanitization, helper-vs-primary resolution, two simultaneous context files.
- Data-integrity items (spec second pass, 2026-06-11): live-helper gate before any file-wide hygiene pass (todo-cleanup/lint-org/wrap-org-table check session-context.d/, pause + ask on live files, surface stale ones); todo-cleanup.el brought up to the backup-to-/tmp invariant (lint-org and wrap-org-table already conform — verified); log-before-write journaling for helper shared-file edits; memory writes primary-only (MEMORY.md has no heading anchors — helpers log candidates instead); agent id in helper-originated inbox-send slugs (minute-resolution filenames can collide).
- Manual validation with Craig: live helper against a live primary — lookup, one scoped todo.org edit, wrap-up, primary commits the helper's edit cleanly. Then the corruption drill: primary attempts wrap-up while the helper is mid-task and the hygiene gate visibly pauses.
Independent of the spec's phases 2-6 (runtime-neutral refactor), which stay gated on their own go/no-go.
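A minimal sketch of the roster filter described in the first bullet, assuming the name match (the pgrep step) has already produced a pid-to-cwd map. The function names (=roster=, =ancestors=, =is_within=, =read_proc_cwd=) and the dict-based inputs are hypothetical illustrations, not the interface of the real =agent-roster= script.

```python
import os
from pathlib import PurePosixPath


def ancestors(pid, parent_of):
    """Return pid plus every ancestor reachable through parent_of."""
    seen = set()
    while pid in parent_of and pid not in seen:
        seen.add(pid)
        pid = parent_of[pid]
    seen.add(pid)  # include the topmost pid we reached
    return seen


def is_within(cwd, root):
    """True when cwd is root itself or a directory below it."""
    cwd, root = PurePosixPath(cwd), PurePosixPath(root)
    return cwd == root or root in cwd.parents


def roster(candidates, root, self_pid, parent_of):
    """Filter name-matched pids down to other agents in this project.

    candidates: {pid: cwd} for processes that matched the agent name
    parent_of:  {pid: ppid} for the relevant slice of the process table
    """
    # Self-ancestry exclusion: the caller and its parent chain never count.
    excluded = ancestors(self_pid, parent_of)
    return sorted(
        pid for pid, cwd in candidates.items()
        if pid not in excluded and is_within(cwd, root)
    )


def read_proc_cwd(pid):
    """Linux-only helper: resolve /proc/<pid>/cwd, or None if unreadable."""
    try:
        return os.readlink(f"/proc/{pid}/cwd")
    except OSError:
        return None
```

An empty result would correspond to the "alone" branch; any pid in the list would correspond to the not-alone (helper) route. Comparing path components rather than string prefixes keeps a sibling directory such as =proj2= from matching =proj=.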
** DOING [#C] Check that memories are synced across machines via git :spec:
:PROPERTIES:
:LAST_REVIEWED: 2026-06-12
:END:
v1 implemented end-to-end 2026-06-10 (Phases 0-4 below, no-approvals batch). Remaining before DONE: the manual testing and validation child, plus the other personal machines' one-time clone + timer setup (archsetup handoff).
*** 2026-05-14 Thu @ 19:14:11 -0500 Investigate current memory storage
Memory files live at
[[file:/home/cjennings/.claude/projects/-home-cjennings-code-rulesets/memory/][~/.claude/projects/-home-cjennings-code-rulesets/memory/]]
— four files including =MEMORY.md= and three individual entries
(=feedback_never_guess.md=, =project_ai_scripts_canonical_source.md=,
=reference_pdftools_venv.md=). The directory is a plain unmanaged dir
(no symlink, no enclosing git checkout). Neither
[[file:/home/cjennings/.claude/][~/.claude/]] itself nor any subtree
containing the project-memory dirs is tracked in
[[file:/home/cjennings/code/archsetup/][archsetup]] or
[[file:/home/cjennings/code/rulesets/][rulesets]]. Without a symlink
into a stowed or tracked location, memory files don't survive a new
machine setup or a dotfiles restore.
Proposed setup: stow =~/.claude/projects= →
=archsetup/dotfiles/common/.claude/projects/= (path doesn't exist yet
— it's the target location pending VERIFY).
Create the destination in archsetup, move existing per-project
=projects/<encoded-cwd>/memory/= dirs there, run =stow= to link, then
commit + push archsetup. After that, every machine running =stow=
picks up the same memory tree.
*** 2026-05-23 Sat @ 16:12:48 -0500 Decided: dedicated private repo, not stow
Worked through dotfiles → rulesets → dedicated repo. Dropped stow/dotfiles (machine config, wrong cadence) and rulesets (it's pulled first in every session, so memory edits would dirty its tree and skip the startup =git pull --ff-only=). Chose a dedicated private repo on cjennings.net: storage is unified there while recall stays per-project (the encoded-cwd subdirs), since pooling recall would hurt relevance and risk work-private facts surfacing in personal-project artifacts.
*** 2026-05-23 Sat @ 16:12:48 -0500 Shipped: claude-memory.git + folded symlinks
Created bare =git@cjennings.net:claude-memory.git=, cloned to =~/.claude-memory= (later deleted in the reversal below), moved all 7 per-project =memory/= dirs in (54 files; work has 40) and replaced each live =~/.claude/projects/<enc>/memory= with a folded dir-symlink so new memory lands in the clone and a push syncs it. Added =link-claude-memory.sh= (idempotent — recreates the symlinks on a new machine after clone) + README. Private repo, never GitHub (carries work/DeepSat memory). Initial import pushed (=f496370=).
*** 2026-05-24 Sun @ 01:53:35 -0500 Reversed the migration — back to unmanaged per-project memory
Cancelled the follow-up brainstorm and undid the dedicated-repo migration at Craig's call. Moved all 7 memory dirs back to =~/.claude/projects/<enc>/memory/= (content preserved), deleted the =~/.claude-memory= clone, and deleted the bare =claude-memory.git= on the server. Memory is back to its original at-risk state, so the task reopens at [#C] pending a direction. The brainstorm landed on a two-tier idea for whenever this resumes: promote general lessons into a rulesets-tracked file symlinked into =~/.claude/rules/= (loaded into every project natively, one repo), and keep project-specific memory under each project's own =.ai/memory/= (committed where =.ai/= is tracked, at-risk where it's gitignored). Not implemented.
*** 2026-06-05 Fri @ 05:57:35 -0500 Pivot: adopt the existing org-roam KB as the shared agent substrate
Pressure-tested the two-tier idea, then Craig redirected: a shared org-roam knowledge base any project can read and write makes this simpler. Ground truth verified: =~/sync/org/roam/= already exists (484 org files, curated since 2023, Syncthing-synced, not git). So cross-machine sync is already solved, and the task stops being "build a memory-sync system" and becomes "point agents at the KB that already syncs." The dedicated-repo and two-tier approaches are both superseded for the storage+sync half.
Wrote a one-page spec: [[file:docs/agent-knowledge-base-spec.org][agent-knowledge-base-spec.org]] (originally docs/design/2026-06-05-org-roam-knowledge-base-spec.org; superseded by the 2026-06-10 spec-create rewrite at the new path). Five decisions, mechanics recommended: (1) KB is a queried substrate accessed as files (ripgrep + follow =[[id:]]= by grep), not via the org-roam package; (2) capture in harness memory, promote durable facts into the KB (same cadence as the pattern catalog) — resolves the at-risk problem since the valuable knowledge moves to the synced KB; (3) a =claude-rules/knowledge-base.md= pointer rule carries path/query/write-schema/boundary; (4) write schema = roam-valid node + =:agent:= filetag so agent notes stay distinguishable and index on the next =org-roam-db-sync=. The rules layer (=claude-rules/=, =CLAUDE.md=) is untouched — the KB replaces the memory tier, not the rules tier.
*** 2026-06-10 Wed @ 14:29:20 -0500 Spec ratified — write boundary is option C; rewritten to spec-create format
Craig answered via cj annotations in the spec (2026-06-10): DECISION 5 is option C (read-shared, write-scoped — work agents never write the KB). Syncthing does replicate ~/sync/ to a work machine and Craig is fine with how C handles it. Node granularity: per-fact nodes. Write review: agent writes land freely in the KB only — explicitly not permission to post to email, Linear, or any public channel without review and consent. The spec was rewritten into the spec-create format at [[file:docs/agent-knowledge-base-spec.org][agent-knowledge-base-spec.org]] (old draft removed). Implementation explicitly held pending Craig's go-ahead; one decision still open (D7, next VERIFY).
*** 2026-06-10 Wed @ 14:35:40 -0500 Spec review — not ready
Review written at docs/agent-knowledge-base-spec-review.org (deleted on disposition completion; content summarized in the spec's Review dispositions). Rubric: =Not ready=. Blockers: resolve D7 (keep vs retire harness memory) and define the executable personal/work/unknown write-boundary classifier plus work-side write/refusal destination. Medium notes: use concrete ripgrep commands that exclude =*.sync-conflict-*= files, and define seed-node approval/rollback.
*** 2026-06-10 Wed @ 14:44:00 -0500 D7 resolved — keep harness memory as the capture layer
Craig ratified "keep" in chat (2026-06-10). Harness memory stays the ephemeral, auto-recalled capture layer; the KB holds promoted durable facts; Phase 3's wrap-up promotion cadence is mandatory. Spec D7 flipped to accepted; D2 stands as written.
*** 2026-06-10 Wed @ 14:44:00 -0500 Project classification defined — work-root denylist, unknown refuses
Resolved in the spec-response pass: =knowledge-base.md= carries an explicit work-root denylist (initially =~/projects/work=) as the source of truth. Personal = under a known project parent (=~/code/=, =~/projects/=, =~/.emacs.d=) and not denylisted → KB writes allowed. Work or unknown → no KB write; the agent reports the refusal with a one-line redacted summary of the fact. v1 adds no new work-side store — work projects keep their existing project-tree conventions. See the "Project classification and write routing" section of [[file:docs/agent-knowledge-base-spec.org][the spec]]. Denylist completeness is the one open caveat (next VERIFY).
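The classification rule above can be sketched as a small pure function. This is an illustration under stated assumptions, not the rule file's actual mechanism: =classify= and =may_write_kb= are hypothetical names, the home directory is passed in explicitly, and a project parent directory itself (for example =~/code= with no subdirectory) is treated as personal, which the entry above does not specify.

```python
from pathlib import PurePosixPath

# Values taken from the entry above; paths are relative to the home directory.
WORK_DENYLIST = ("projects/work",)
PROJECT_PARENTS = ("code", "projects", ".emacs.d")


def _under(path, parent):
    """True when path is parent itself or lies below it."""
    return path == parent or parent in path.parents


def classify(project_dir, home):
    """Return 'work', 'personal', or 'unknown' for a project directory."""
    path, home = PurePosixPath(project_dir), PurePosixPath(home)
    # The denylist is checked first so it overrides the ~/projects/ parent.
    if any(_under(path, home / entry) for entry in WORK_DENYLIST):
        return "work"
    if any(_under(path, home / parent) for parent in PROJECT_PARENTS):
        return "personal"
    return "unknown"


def may_write_kb(project_dir, home):
    """Only personal projects may write; work and unknown both refuse."""
    return classify(project_dir, home) == "personal"
```

Checking the denylist before the project parents matters because =~/projects/work= sits under =~/projects/=; with the order reversed, work projects would classify as personal.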
*** 2026-06-10 Wed @ 14:44:00 -0500 Codex review incorporated — spec ready with caveats
Spec-response pass processed the 2026-06-10 Codex review with D7 = keep as a pre-agreed input. Both blockers cleared (D7 accepted; classification/write-routing section added). Mediums accepted: canonical rg commands with conflict-file exclusion, Phase 2 seed-node approval/rollback mechanics, Makefile no-change note, Testing/Verification section. Three recommendations modified, none rejected — see the spec's Review dispositions. Review file deleted per the workflow. Rubric: ready with caveats (denylist confirmation). Implementation tasks broken out below; implementation itself awaits Craig's go.
*** 2026-06-10 Wed @ 17:29:37 -0500 Work-root denylist confirmed — ~/projects/work only
Craig confirmed (2026-06-10, in chat): the denylist is just =~/projects/work=. Archangel is not work-scoped. The spec's one caveat clears — status now ready. Phase 1 is unblocked, but implementation still awaits Craig's explicit go.
*** 2026-06-10 Wed @ 17:57:08 -0500 Spec amended — D8 git transport + migration/metrics/docs/maintenance folds
Craig's five design questions answered and folded into the spec, and D8 ratified (Shape A): the KB moves out of the =~/sync/org= Syncthing share into its own git repo on cjennings.net, with an =agents/= subdirectory for agent writes, a systemd auto-sync timer for Craig's edits, opt-in-by-clone replication (work machine doesn't clone), and the phone staying on the on-demand =~/sync/phone= pattern. Folded in: inclusion criteria + a Phase 1.5 guided memory sweep, a Success metrics section with a 30-day checkpoint, the seed node redefined as the KB's own documentation, and Phase 4 maintenance automation. Phases renumbered 0-4; tasks below updated. Implementation still held.
*** 2026-06-10 Wed @ 18:21:33 -0500 Phase 0 done — roam migrated to git
Backed up (~/roam-backup-2026-06-10.tar.gz), copied to =~/org/roam=, 63 conflict files deleted (424 org files), git repo with origin =git@cjennings.net:roam.git= (initial commit 515693d), old location replaced with a transition symlink. Emacs =roam-dir= updated in user-constants.el + live-reloaded (db rebuilt, 416 nodes); handoff to .emacs.d for the commit. =roam-sync.sh= (6 bats green) on a 15-min systemd user timer, installed + enabled + round-trip verified. Old-path references repointed (protocols task-list pointer, journal workflow, notes template). archsetup handoff covers dotfiles adoption + other-machine clones. rulesets commit fcf554a.
*** 2026-06-10 Wed @ 18:21:33 -0500 Phase 1 done — knowledge-base.md rule live
=claude-rules/knowledge-base.md= written (path, git discipline, query commands, agents/ write schema, denylist + refusal contract, inclusion criteria, capture-then-promote). =make install= linked it machine-wide; verified the link, a known-note query, and conflict-glob exclusion with a planted file. Commit d071f1f.
*** 2026-06-10 Wed @ 18:21:33 -0500 Phase 1.5 done — rulesets swept, 10 projects broadcast
Rulesets' 6 memories classified: 3 promoted as =agents/= nodes (notify-attention pattern, pdftools venv, gpg-agent SSH TTL trap), 2 kept local (rule-encoded in verification.md / interaction.md), 1 kept + de-staled (ai-scripts-canonical updated for the claude-templates subtree fold). Sweep handoff broadcast to the 10 other memory-bearing projects (archsetup, org-drill, pearl, .emacs.d, elibrary, finances, health, home, jr-estate, kit); work skipped by the boundary; the orphaned =linear-emacs= memory dir (project retired, likely pearl's predecessor) noted for Craig.
*** 2026-06-10 Wed @ 18:21:33 -0500 Phase 2 done — seed/doc node written and indexed
=agents/20260610181640-how-the-agent-knowledge-base-works.org= written: the KB's user-facing guide (what agents do, how it syncs, finding/pruning agent content, the rule pointer). Index verified programmatically: =org-roam-node-from-title-or-alias= resolves it with tags (agent reference); node count 416 → 420. Craig's visual check remains in the manual-testing child.
*** 2026-06-10 Wed @ 18:21:33 -0500 Phase 3 done — wrap-up promotes + records the KB receipt
wrap-it-up.org Step 1 gains the promotion check (inclusion-criteria bar) and the mandatory "KB: promoted N / consulted yes-no" Summary line; validation checklist enforces it. Mirror synced, integrity OK (44), parse OK. Commit 242b95e.
*** 2026-06-11 Thu @ 19:26:26 -0500 .emacs.d memory sweep complete (first broadcast response)
First of the 10 broadcast projects to report Phase 1.5 done (handoff 18:23). Inventory 7: promoted 3 to KB (no-make-frame-in-live-daemon, proton-bridge-headless-cert-mismatch, open-images-with-imv — roam commit a915760), kept 3 local at Craig's call (commit-flow-no-approval-gate per-project-scoped; two theme-scoped ones possibly superseded by the palette-columns spec), deleted 1 (superseded by canonical interaction.md rule). 9 projects' sweeps outstanding.
*** 2026-06-12 Fri @ 02:25:12 -0500 Five more sweeps complete via the home folds
Overnight handoffs from home closed five more broadcast targets, each swept at fold-time triage with Craig's approval: jr-estate 2 promoted (forms name-with-number, PDF-editing tooling split; roam 45d8e6c) / 3 kept with area attribution / 2 deleted as rule-encoded or duplicate; finances 0/1/0 (rosalea-daly contact fact kept local); elibrary 0/0/2, health 0/0/1, kit 1/0/2 (hand-prep-items-to-work-inbox promoted into home's memory; the rest duplicated rules or home memories). Nothing from these five met the KB bar that wasn't already encoded. All folded projects' session archives merged area-prefixed into home's .ai/sessions/, so session-harvest's first run sees them. Home covers its own and remaining areas' sweeps through ongoing discipline; still pending from the broadcast: archsetup and work.
*** TODO Agent KB — manual testing and validation :test:
What we're verifying: the v1 acceptance surface that needs Craig's eyes or a live cross-project session. Run after Phases 0-2 land.
- Seed node appears in org-roam (autosync) and in the =rg '#\+filetags:.*:agent:'= inventory.
- In the work project, a durable-storage request produces no write in the KB and the refusal report names the fact.
- In an unknown project (outside =~/code/=, =~/projects/=, =~/.emacs.d=), the agent refuses or asks rather than guessing.
- After Phase 0: an edit made on one machine appears on another within the auto-sync timer interval, no new sync-conflict files appear, and the work machine has no KB clone.
Expected: all four behave per the spec; any miss promotes to a bug task. (Agent-runnable checks — make install link, rg finds a known note, conflict-file exclusion — are verified inside Phases 0-2.)
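The unknown-project check can be reasoned about as a simple containment test. The sketch below is only an illustration of that rule, not how the agent enforces it: =is_known_project= and =KNOWN_ROOTS= are hypothetical names, and the three roots are copied from the check above.

#+begin_src python
from pathlib import PurePosixPath

# Directory names under the home directory, taken from the check above.
KNOWN_ROOTS = ("code", "projects", ".emacs.d")


def is_known_project(path: str, home: str) -> bool:
    """Return True when path is one of the known roots or sits beneath one."""
    target = PurePosixPath(path)
    for name in KNOWN_ROOTS:
        root = PurePosixPath(home) / name
        # Compare whole path components, so ~/codex does not match ~/code.
        if target == root or root in target.parents:
            return True
    return False
#+end_src

The comparison is purely lexical: it does not resolve symlinks or =..= segments, so a real check would normalize the path first.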
*** 2026-06-10 Wed @ 18:21:33 -0500 Phase 4 done — monthly hygiene automation live
=scripts/kb-hygiene.sh= (6 bats green, shellcheck clean, read-only by design) inventories =:agent:= nodes, flags orphans / duplicate titles / conflict files, and writes an org report into the rulesets inbox; =roam-hygiene.timer= (monthly, Persistent) installed + enabled. Live run against the real KB verified (4 agent nodes, 428 files, 0 conflicts). Conditional vNext stays in the spec's scope tiers: a =/promote= command if the wrap-up prompt proves insufficient, an =:agent:inbox:= staging tag if free writes prove too noisy. Commit b014095.
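For readers who want the shape of the checks without opening the script, here is a minimal Python sketch of three of them (agent-node inventory, duplicate titles, conflict files). It is not the contents of =scripts/kb-hygiene.sh=: the function name, the in-memory ={file name: text}= input, and the =.sync-conflict-= marker (a Syncthing-style naming convention) are assumptions, and orphan detection is left out.

#+begin_src python
import re
from collections import defaultdict

AGENT_TAG = re.compile(r"^#\+filetags:.*:agent:", re.IGNORECASE | re.MULTILINE)
TITLE = re.compile(r"^#\+title:[ \t]*(.+)$", re.IGNORECASE | re.MULTILINE)
CONFLICT_MARKER = ".sync-conflict-"  # assumed Syncthing-style conflict naming


def hygiene_report(files):
    """Scan a {file name: text} mapping without modifying anything."""
    agent_nodes, conflicts = [], []
    by_title = defaultdict(list)
    for name, text in sorted(files.items()):
        if CONFLICT_MARKER in name:
            conflicts.append(name)
            continue
        if not AGENT_TAG.search(text):
            continue  # only :agent: nodes are inventoried
        agent_nodes.append(name)
        match = TITLE.search(text)
        if match:
            # Titles are compared case-insensitively.
            by_title[match.group(1).strip().lower()].append(name)
    duplicates = {t: names for t, names in by_title.items() if len(names) > 1}
    return {
        "agent_nodes": agent_nodes,
        "duplicate_titles": duplicates,
        "conflicts": conflicts,
    }
#+end_src

Keeping the scan a pure function over file contents mirrors the read-only design: the report is data, and writing it anywhere is a separate step.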
** TODO [#C] Morning ops orchestrator pilot — read-only :feature:
:PROPERTIES:
:CREATED: [2026-06-11 Thu]
:LAST_REVIEWED: 2026-06-11
:END:
A scheduled headless morning run chaining the existing pieces: startup checks, the triage-intake scan, a system health check — producing the prep doc plus a report and a notify ping, with all remediation propose-only. Staged adoption from the 2026-06-11 insights report's "Self-Healing Daily Ops Orchestrator": read-only first; promote individual routine remediations to auto only after each has a track record. Known blockers to design around: headless MCP auth (interactively-authenticated servers are absent in cron runs) and the consent boundary (triage Phase D, anything destructive).
** TODO [#C] Build =create-documentation= skill for high-quality project/product docs :feature:
:PROPERTIES:
:LAST_REVIEWED: 2026-06-12
:END:
Create a Claude skill named =create-documentation= that can plan, write,
refresh, and review software documentation across README files, project docs,
developer guides, API docs, operational docs, and generated/published doc
sites.
This is broader than =arch-document=. =arch-document= should remain the
architecture-specific arc42 skill. =create-documentation= should know when to
delegate to it for architecture documentation, but its main job is the full
documentation system around a product or repo: onboarding, tutorials, how-to
guides, reference, explanation, operations, troubleshooting, contribution,
release/upgrade, and publication format.
*** Why this matters
The repo currently has strong skills for architecture, testing, review,
debugging, and workflow. It does not have a general documentation skill that:
- Chooses the right documentation type for the user need.
- Audits existing docs against code and expected user journeys.
- Creates a coherent doc map instead of dumping everything into =README.md=.
- Writes in a consistent technical style.
- Decides source/publish format intentionally (=.md=, =.org=, generated
=.html=, OpenAPI, etc.).
- Treats docs as a maintained product surface with verification, ownership,
navigation, accessibility, and freshness checks.
*** Research notes
**** Documentation frameworks and best-practice sources
- Diataxis separates documentation by reader need:
- Tutorials: learning-oriented, take the reader by the hand.
- How-to guides: task-oriented, solve a specific real problem.
- Reference: information-oriented, accurate and complete lookup material.
- Explanation: understanding-oriented, concepts, background, tradeoffs.
Source: [[https://diataxis.fr/][Diataxis]] and the official guidance around
tutorials/how-to/reference/explanation.
- Django explicitly documents this same organization and teaches readers how
to navigate it: tutorials for beginners, topic guides for concepts,
reference for APIs, how-to guides for recipes. This is a major reason the
docs feel navigable despite large scope.
Source: [[https://docs.djangoproject.com/en/5.2/][Django documentation]]
- Kubernetes separates concepts, tasks, tutorials, and reference. It also has
current/previous-version docs, localization, contribution paths, and
task-focused landing pages. Its docs are good at answering "what is this?"
separately from "how do I do one thing?"
Sources: [[https://kubernetes.io/docs/home/][Kubernetes docs home]],
[[https://kubernetes.io/docs/tasks/][Kubernetes tasks]],
[[https://kubernetes.io/docs/tutorials/][Kubernetes tutorials]]
- Write the Docs emphasizes docs that are precursory, participatory,
exemplary, consistent, current, discoverable, addressable, cumulative, and
comprehensive. Especially important: incorrect docs are worse than missing
docs, and examples should cover common use cases without overwhelming the
reference.
Source: [[https://www.writethedocs.org/guide/writing/docs-principles/][Write the Docs principles]]
- Google developer docs guidance emphasizes project-specific style first,
clarity and consistency, conversational but not frivolous tone, active voice,
second person, descriptive links, global audience, accessibility, sentence
case headings, numbered lists for procedures, code font for code, and alt
text for images.
Sources: [[https://developers.google.com/style/][Google developer documentation style guide]],
[[https://developers.google.com/style/highlights][Google style highlights]],
[[https://developers.google.com/style/accessibility][Google accessible docs]]
- Google's doc best-practices page adds a pragmatic maintenance principle:
minimum viable documentation, update docs with code, delete dead docs, prefer
good over perfect, tell the story of code, and avoid duplication.
Source: [[https://google.github.io/styleguide/docguide/best_practices.html][Google documentation best practices]]
- The Good Docs Project is useful as a template source, especially for
README, how-to, tutorial, concept, reference, troubleshooting, contributor,
and release-note patterns. Do not vendor wholesale; use as prior art.
Source: [[https://www.thegooddocsproject.dev/][The Good Docs Project]]
**** Praised project docs to analyze and steal from
***** Django
Why it works:
- It labels the doc types directly and explains when to use each.
- It has a beginner path, advanced tutorials, topic guides, API reference,
how-to recipes, deployment, security, testing, release notes, and community
help in one coherent index.
- It is versioned, so readers know which framework version the docs target.
- It cross-links introductory material to deeper references without making the
first page a wall of every detail.
Patterns to use:
- Make the top-level docs home a routing page by reader intent.
- Put "How these docs are organized" near the top when the doc set is large.
- Split concept, task, tutorial, and reference instead of mixing them.
- Include "getting help" and "not found?" paths so the docs have an exit ramp.
Source: [[https://docs.djangoproject.com/en/5.2/][Django documentation]]
***** Kubernetes
Why it works:
- It has a large, complex product but maintains separate lanes for Concepts,
Tasks, Tutorials, Reference, and Contribute.
- Task pages are short sequences for one operation; tutorials are larger goals
with several sections. This prevents "one page tries to teach everything."
- It exposes version state clearly, including static old versions and current
docs.
- It supports localization and documentation contribution, which makes the
docs a product surface rather than a side artifact.
Patterns to use:
- For platform or infrastructure docs, include Concepts / Tasks / Tutorials /
Reference as first-class folders.
- Create version/freshness metadata when docs are tied to released software.
- Add doc contribution guidance for projects with external contributors.
- Make operational tasks discoverable by category, not just search.
Sources: [[https://kubernetes.io/docs/home/][Kubernetes docs home]],
[[https://kubernetes.io/docs/tasks/][Kubernetes tasks]]
***** Rust
Why it works:
- Rust has a "bookshelf" rather than one overloaded manual: The Book, Rust by
Example, standard library API reference, Reference, Cargo Guide, Error Index,
Rustonomicon, release notes, platform support, policies, etc.
- The learning path is honest about audience: it assumes the reader has
programmed before, but not in any specific language.
- Reference and learning material are separated. Advanced unsafe guidance gets
its own book.
- Offline docs via =rustup doc= are treated as part of the product.
Patterns to use:
- For broad ecosystems, create a documentation bookshelf rather than a single
mega-doc.
- Separate beginner path, examples, formal reference, advanced/unsafe topics,
tooling docs, error index, release notes, and policies.
- Document assumptions about reader experience.
- Consider offline/local docs for CLI/library ecosystems.
Source: [[https://doc.rust-lang.org/][Rust documentation]]
***** Stripe API docs
Why it works:
- The API reference is organized around resources and common cross-cutting
concerns: authentication, errors, idempotency, pagination, request IDs,
versioning, metadata, connected accounts.
- It pairs prose with concrete request/response examples and client-library
language selection.
- It exposes test-mode vs live-mode distinctions early.
- It offers "Copy for LLM" / "View as Markdown", which acknowledges modern
consumption patterns without sacrificing normal docs UX.
- Its reputation comes from matching developer mental models and making the
common path implementable quickly, not just visual polish.
Patterns to use:
- API docs should be generated from or checked against OpenAPI/JSON schema or
source annotations wherever possible.
- Keep cross-cutting API behavior near the front, before endpoint lists.
- Include runnable examples, auth, errors, pagination, versioning, idempotency,
and sandbox/test data.
- Consider LLM-friendly exports (=llms.txt=, "view as Markdown", stable
anchors), but do not make the docs only for AI.
Source: [[https://docs.stripe.com/api][Stripe API Reference]]
***** FastAPI
Why it works:
- Documentation is part of the framework's value proposition: OpenAPI and JSON
Schema drive interactive Swagger UI and ReDoc automatically.
- It reduces manual drift for API reference by deriving docs from typed code.
- It integrates examples and tutorial-style explanations with standards-based
generated reference.
Patterns to use:
- Prefer generated API reference from code/specs over hand-maintained endpoint
tables.
- Generated docs need human-written overview, concepts, authentication,
examples, and operational guidance around them.
- The skill should identify when an OpenAPI/Swagger/ReDoc/Scalar route already
exists and improve metadata/schema quality instead of creating duplicate
manual docs.
Source: [[https://fastapi.tiangolo.com/features/][FastAPI features]]
*** Format and presentation decisions
**** Default source format: Markdown
Use =.md= as the default for shared project documentation when:
- The repo is on GitHub/GitLab/Forgejo and readers browse docs in the web UI.
- The project already uses MkDocs, Docusaurus, VitePress, Sphinx+MyST,
Jekyll, GitHub Pages, or plain README-driven docs.
- Contributors are expected to edit docs without Emacs-specific tooling.
- The docs need easy static-site publishing.
- The content is README, tutorial, how-to, reference, troubleshooting,
contributing, release notes, runbooks, or ordinary prose + code blocks.
Markdown source works well because it is low-friction, reviewable in diffs,
rendered by repository hosts, and supported by documentation site generators.
MkDocs is a good reference point: Markdown source, YAML config, built-in dev
server, static HTML output, and easy hosting.
Source: [[https://www.mkdocs.org/][MkDocs]]
**** Use Org when the document is Emacs-native or personal/planning-heavy
Use =.org= when:
- The user's workflow is explicitly Emacs/org-mode.
- The document contains TODO states, schedules, priorities, tags, agenda
integration, property drawers, clocking, or personal planning.
- The document is an internal strategy/planning artifact such as V2MOM,
research notes, meeting notes, task triage, or a living personal operating
document.
- The output may later be exported, but the source of truth is intended to be
edited in org-mode.
Do not default team-facing documentation to =.org= unless the team already uses
org-mode. Org can export to HTML, but that does not make it the right authoring
format for non-Emacs contributors.
Sources: [[https://orgmode.org/org.html][Org manual]],
[[https://orgmode.org/worg/org-tutorials/org-publish-html-tutorial.html][Org publish HTML tutorial]]
**** Use HTML as generated/published output, rarely as hand-authored source
Use =.html= when:
- The deliverable is a published static documentation site.
- The document needs interactive widgets, embedded API consoles, custom layout,
or generated navigation/search.
- The project already publishes docs as a website.
- The target audience needs searchable, browsable, linkable pages rather than
repo-local files.
Prefer generated HTML from Markdown/Org/reStructuredText/AsciiDoc/OpenAPI over
hand-authored HTML. Hand-edit HTML only for standalone artifacts, custom landing
pages, or cases where the project already treats HTML templates as docs source.
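The Markdown / Org / HTML rules above reduce to a small decision function. This is a sketch under stated assumptions: =DocContext= and its fields are hypothetical names chosen to mirror the bullets, and a real skill would weigh more signals than four booleans.

#+begin_src python
from dataclasses import dataclass


@dataclass
class DocContext:
    team_facing: bool = True      # shared project documentation
    team_uses_org: bool = False   # the team already edits docs in org-mode
    org_native: bool = False      # TODO states, agenda, property drawers, planning
    html_is_source: bool = False  # project already treats HTML templates as docs source


def choose_source_format(ctx: DocContext) -> str:
    """Return the source format suggested by the rules in this section."""
    if ctx.html_is_source:
        return ".html"
    if ctx.team_facing:
        # Team docs stay in Markdown unless the team already uses org-mode.
        return ".org" if ctx.team_uses_org else ".md"
    # Personal or planning documents follow the author's own workflow.
    return ".org" if ctx.org_native else ".md"
#+end_src

Note that an org-native feature set alone does not move a team-facing document to =.org=; only an existing team practice does.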
**** Consider generated/spec-backed formats
Use generated reference when possible:
- API reference: OpenAPI/Swagger/ReDoc/Scalar from code/spec.
- CLI reference: generated from command parser/help output.
- Library API reference: language-native doc tools such as rustdoc, pydoc,
TypeDoc, JSDoc, Go doc, Sphinx autodoc, etc.
- Config reference: generated from schema, types, or validated defaults.
The skill should not duplicate generated reference by hand. It should improve
source comments, schema descriptions, examples, front matter, and surrounding
guides.
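As one illustration of "generated from command parser/help output", the sketch below renders a CLI reference page from an =argparse= parser's own help text. The =example-tool= parser is hypothetical and exists only so the example runs; the point is that the page is derived from the parser rather than typed by hand.

#+begin_src python
import argparse


def build_parser():
    """Hypothetical CLI, present only so the sketch has something to document."""
    parser = argparse.ArgumentParser(prog="example-tool", description="Process a path.")
    parser.add_argument("path", help="File or directory to process.")
    parser.add_argument("--dry-run", action="store_true",
                        help="Report changes without writing them.")
    return parser


def cli_reference(parser):
    """Render the parser's own help text as a Markdown reference page."""
    lines = [f"# {parser.prog} reference", ""]
    # Indent by four spaces so Markdown treats the help text as a code block.
    lines += [("    " + line).rstrip() for line in parser.format_help().splitlines()]
    return "\n".join(lines) + "\n"
#+end_src

Regenerating this page in CI and failing on an unexpected diff is one way to keep the reference from drifting away from the parser.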
**** Presentation requirements
Every generated doc set should have:
- A docs home or README that routes by reader intent.
- Stable headings and anchors for addressability.
- Descriptive link text, no "click here."
- Search/navigation plan when docs exceed a handful of pages.
- Version/freshness metadata when tied to released software.
- Ownership/review cadence for docs likely to rot.
- Accessible structure: semantic headings, alt text, no image-only info,
tables only when appropriate, left-aligned text, readable code blocks.
- Copyable commands and code examples.
- "What changed?" / release notes / migration path when docs describe a new or
changed behavior.
- Troubleshooting path for common failures.
- Clear prerequisites before procedures.
- Verification steps after procedures.
- Support/escalation path when the docs do not answer the question.
- Optional LLM-friendly surfaces for larger doc sets: =llms.txt=,
"copy as Markdown" equivalents, concise page summaries, and stable anchors.
*** Proposed skill design
**** Skill name and trigger
Name: =create-documentation=
Trigger when the user asks to:
- create documentation, docs, README, guide, manual, runbook, tutorial,
quickstart, API docs, CLI docs, troubleshooting docs, contributor docs,
architecture-adjacent docs, release notes, upgrade guide, or doc site;
- improve, audit, reorganize, or publish existing docs;
- decide documentation structure or format for a project.
Do not trigger for:
- architecture-only arc42 docs when =arch-document= is the direct fit;
- ADR creation (=arch-decide=);
- design docs before implementation shape is known (=brainstorm= or
=arch-design=);
- prose polishing only (future writing/humanizer skill);
- inline code comments/docstrings only, unless the user asks to create docs
from them.
**** V1 should be one orchestrating skill, not many separate skills
Build v1 as one skill with explicit phases and subcommands rather than a set
of separate skills. Rationale:
- Documentation tasks often start ambiguous; the first job is classification.
- Splitting too early creates command-discovery burden.
- A single skill can dispatch to existing specialized skills
(=arch-document=, =c4-diagram=, =security-check=, =playwright-js/py= for
doc-site verification) without making users choose the internal pipeline.
Support discoverable subcommands inside one skill:
#+begin_example
/create-documentation audit <path>
/create-documentation plan <path-or-scope>
/create-documentation write <doc-type> <scope>
/create-documentation refresh <path>
/create-documentation publish <path>
/create-documentation review <path>
#+end_example
The default =/create-documentation <scope>= runs audit -> plan -> write ->
review, asking for confirmation before broad rewrites.
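The dispatch rule is small enough to sketch. In the hedged example below, =resolve_phases= is a hypothetical helper: a recognized subcommand runs alone, and anything else is treated as a scope for the default pipeline.

#+begin_src python
SUBCOMMANDS = ("audit", "plan", "write", "refresh", "publish", "review")
DEFAULT_PIPELINE = ("audit", "plan", "write", "review")


def resolve_phases(args):
    """Split an argument list into (phases to run, remaining arguments)."""
    if args and args[0] in SUBCOMMANDS:
        return [args[0]], list(args[1:])
    # No subcommand given: the arguments are the scope for the default run.
    return list(DEFAULT_PIPELINE), list(args)
#+end_src

One consequence of this rule: a scope literally named =audit= or =write= would be read as a subcommand, so the skill would need an escape hatch or a confirmation prompt for that case.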
**** Future split if v1 gets too large
If the skill grows past a manageable size, split into a discoverable
=documentation-*= chain. Names and order:
1. =documentation-audit= — inventory existing docs, code/docs drift, reader
journeys, missing doc types, stale/generated docs.
2. =documentation-plan= — choose audiences, doc map, formats, source of truth,
publishing path, ownership, and freshness policy.
3. =documentation-write= — write or update the selected docs.
4. =documentation-reference= — generate or improve API/CLI/config/library
reference from source/spec.
5. =documentation-publish= — configure MkDocs/Docusaurus/Sphinx/GitHub Pages
or equivalent, build static HTML, verify links/search.
6. =documentation-review= — quality gate for accuracy, style, navigation,
accessibility, examples, and freshness.
Keep =create-documentation= as the orchestrator and user-facing entry point.
The chain is discoverable because every helper starts with =documentation-= and
the orchestrator prints the next command at each handoff.
*** V1 workflow details
**** Phase 1: Intake and classification
Ask only what is missing from local context:
- Who is the reader? New user, evaluator, integrator, maintainer, operator,
contributor, auditor, support engineer?
- What is the reader trying to do or understand?
- Is this for a public project, internal team, personal workflow, regulated
audience, or customer-facing product?
- Is the output repo-browsed, web-published, printed/exported, or Emacs-native?
- Is there existing code, existing docs, an API spec, generated reference, or
only a concept?
- What is the maintenance expectation? One-off, release-maintained,
continuously updated?
Classify the work into one or more doc types:
- README / landing page.
- Quickstart.
- Tutorial.
- How-to guide.
- Concept/explanation.
- API reference.
- CLI reference.
- Configuration reference.
- Architecture docs (delegate to =arch-document= if arc42/C4/ADR-driven).
- Operations/runbook.
- Troubleshooting/FAQ.
- Upgrade/migration/release notes.
- Contributor/development docs.
- Security/compliance docs.
- Examples/cookbook.
**** Phase 2: Audit existing material
Inventory:
- =README*=, =docs/=, =doc/=, =site/=, =mkdocs.yml=, =docusaurus.config.*=,
=vitepress=, =sphinx=, =docs.rs=, =pkg.go.dev=, OpenAPI specs,
generated docs folders, GitHub Pages config, ADRs, architecture docs,
examples, scripts, CLI help, package metadata.
- Existing doc type coverage: tutorial/how-to/reference/explanation.
- Broken links, stale version numbers, commands that no longer exist,
screenshots that may be stale, code snippets not exercised, doc/code drift.
- Source of truth for generated docs. Flag generated files; do not hand-edit
them until source is known.
- Reader journey gaps: "new user can install?", "first success path?",
"operator can recover?", "contributor can run tests?", "API consumer can
authenticate and handle errors?"
Use =rg= first. For API/CLI reference, prefer structured sources:
OpenAPI/JSON Schema, package metadata, command =--help= output, docstrings, or
language-native documentation tooling.
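A first-pass inventory can be a plain bucketing of repo-relative paths, for example the output of a file listing. The sketch below covers only a few of the items above; the bucket names, the OpenAPI file names, and the treatment of =site/= as generated output are assumptions.

#+begin_src python
from pathlib import PurePosixPath

API_SPEC_NAMES = {"openapi.yaml", "openapi.yml", "openapi.json"}  # assumed spec file names


def inventory_docs(paths):
    """Bucket repo-relative paths into documentation categories."""
    found = {"readme": [], "site_config": [], "api_spec": [], "generated": [], "docs": []}
    for raw in sorted(paths):
        path = PurePosixPath(raw)
        name = path.name.lower()
        top = path.parts[0] if path.parts else ""
        if top == "site":
            found["generated"].append(raw)  # flag, do not hand-edit
        elif name.startswith("readme"):
            found["readme"].append(raw)
        elif name == "mkdocs.yml" or name.startswith("docusaurus.config."):
            found["site_config"].append(raw)
        elif name in API_SPEC_NAMES:
            found["api_spec"].append(raw)
        elif top in ("docs", "doc"):
            found["docs"].append(raw)
    return found
#+end_src

Checking the generated bucket first keeps a =site/README.html= or similar from being mistaken for hand-written source.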
**** Phase 3: Documentation plan
Write a short plan before broad edits:
- Audiences and priority order.
- Proposed doc map/tree.
- Doc type for each page.
- Source format decision: =.md= / =.org= / generated spec / generated HTML.
- Publishing target, if any.
- Existing docs to preserve, move, merge, or delete.
- Generated-reference strategy.
- Ownership and freshness policy.
- Verification plan.
Stop for confirmation when the plan moves or rewrites more than one file.
**** Phase 4: Write or update docs
Writing rules:
- Lead with the reader's goal, not the implementation history.
- Put prerequisites before steps.
- Use numbered lists for procedures.
- Use bullets for non-ordered choices.
- Use active voice and second person for instructions.
- Keep sentences short and globally readable.
- Define acronyms on first use.
- Use code font for commands, file names, env vars, API names, and literals.
- Use descriptive links.
- Prefer examples that cover the common path and one meaningful edge/error
path.
- Separate examples/tutorials from dense reference.
- Avoid stale duplication: link to canonical generated reference instead of
copying it.
- Include expected output after commands where it helps verification.
- Include cleanup/rollback steps when procedures change state.
- Include troubleshooting for common failures.
- Avoid marketing voice in technical docs. State capability and constraints
plainly.
- No AI attribution in docs, examples, comments, generated pages, footers, or
screenshots.
Page skeletons:
README / docs home:
#+begin_example
# <Project>
<One-paragraph purpose>
## Start here
- New user: <quickstart>
- Existing user with a task: <how-to index>
- API lookup: <reference>
- Maintainer/operator: <operations/contributing>
## Quick example
...
## Documentation map
...
## Support / contributing
...
#+end_example
Tutorial:
#+begin_example
# Tutorial: <goal>
## What you'll build
## Prerequisites
## Step 1 ...
## Checkpoint
## Step 2 ...
## What you learned
## Next
#+end_example
How-to:
#+begin_example
# How to <task>
## When to use this
## Prerequisites
## Steps
## Verify
## Troubleshooting
## Related
#+end_example
Reference:
#+begin_example
# <Thing> reference
## Summary
## Parameters / options / fields
## Behavior
## Errors
## Examples
## Version notes
#+end_example
Explanation:
#+begin_example
# <Concept>
## Problem it solves
## Mental model
## How it fits with related concepts
## Tradeoffs and constraints
## Further reading
#+end_example
Runbook:
#+begin_example
# Runbook: <operation>
## Scope
## Preconditions
## Normal procedure
## Verification
## Rollback
## Alerts and escalation
## Post-incident notes
#+end_example
**** Phase 5: Presentation and publishing
If docs are repo-local only:
- Ensure links render on GitHub/GitLab.
- Keep relative links stable.
- Add an index if more than 4-5 docs exist.
If docs are web-published:
- Detect existing generator and follow it.
- Prefer project-native tooling over introducing MkDocs/Docusaurus/Sphinx.
- If no tooling exists and user wants a site, choose conservatively:
- Python/simple repo: Material for MkDocs is a pragmatic default.
- JS/React ecosystem: Docusaurus or VitePress if already in stack.
- Python libraries: Sphinx or MkDocs depending on existing ecosystem.
- API docs: ReDoc/Swagger/Scalar from OpenAPI.
- Build locally if dependencies exist.
- Check links, nav, search, mobile viewport, and accessibility basics.
- Do not commit generated =site/= output unless the project already does.
**** Phase 6: Verification
Verification should match doc type:
- Commands in quickstarts/how-tos: run them or mark not run with reason.
- Code snippets: compile/run where feasible, or label the block with its
language and note assumptions.
- API docs: validate OpenAPI/spec if tooling exists.
- Links: run link checker if configured; otherwise sample-check changed links.
- Published site: build docs and inspect output.
- Screenshots: verify current UI if included.
- Generated docs: regenerate from source and confirm no unexpected diff.
Final report must say:
- Files created/changed.
- Doc types covered.
- Format/source-of-truth decisions.
- What was verified.
- What could not be verified.
- Known gaps/follow-ups.
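When no link checker is configured, the sample check for changed links can be as small as the sketch below. It assumes Markdown sources held in a ={path: text}= mapping and only checks that relative link targets exist in that mapping; external URLs and same-page anchors are skipped, and anchors on other pages are not validated. =broken_relative_links= is a hypothetical helper name.

#+begin_src python
import posixpath
import re

LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")   # [text](target)
SCHEME = re.compile(r"[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def broken_relative_links(files):
    """Return (source, target) pairs whose relative target is not in files."""
    broken = []
    for source, text in sorted(files.items()):
        for target in LINK.findall(text):
            if SCHEME.match(target) or target.startswith("#"):
                continue  # external URL, mailto, or same-page anchor
            path = target.split("#", 1)[0]
            resolved = posixpath.normpath(
                posixpath.join(posixpath.dirname(source), path)
            )
            if resolved not in files:
                broken.append((source, target))
    return broken
#+end_src

Anything this returns belongs under "What was verified" when the list is empty, or under "Known gaps/follow-ups" when it is not.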
*** Relationship to existing skills
- =arch-document=: use when the requested docs are specifically architecture
docs from brief + ADRs + C4/arc42. =create-documentation= may call it, then
wrap the output in a broader docs map.
- =c4-analyze= / =c4-diagram=: use for diagrams in architecture or concept
docs when visual structure helps.
- =brainstorm=: use before =create-documentation= when the product/feature
itself is still unclear.
- =arch-design= / =arch-decide=: use when documentation reveals missing
architectural choices.
- =security-check=: use when docs include security guidance, auth, secrets,
deployment, or compliance claims.
- =playwright-js= / =playwright-py=: use to verify published doc sites,
interactive docs, screenshots, and browser-rendered examples.
- =codify=: use after a documentation session reveals reusable project-specific
documentation rules.
*** Quality bar and anti-patterns
The skill should reject:
- A giant README that mixes tutorial, reference, architecture, and operations.
- Duplicating generated API/CLI/config reference by hand.
- Unverified commands in quickstarts without a "not run" note.
- Screenshots with no alt text or no update path.
- Tables used for layout instead of actual tabular data.
- "Overview" pages that do not route readers to tasks.
- Tutorials that become reference dumps.
- How-to guides that explain concepts for pages before giving steps.
- Reference pages that hide required options in prose.
- Marketing claims without concrete examples.
- Docs that mention local private paths, personal tooling, or AI attribution in
public artifacts.
- Treating generated HTML as source unless the project explicitly owns HTML
docs that way.
*** Acceptance criteria for building the skill
- [ ] Directory =create-documentation/= with =SKILL.md=.
- [ ] Frontmatter description includes positive and negative triggers.
- [ ] Skill body includes the V1 phases above.
- [ ] Includes a source-format decision table for =.md= / =.org= / =.html= /
generated spec/reference.
- [ ] Includes doc-type classifier based on Diataxis plus README/runbook/API
additions.
- [ ] Includes examples/skeletons for README, tutorial, how-to, reference,
explanation, runbook, troubleshooting, contributor docs, and API overview.
- [ ] Includes audit checklist for existing repos.
- [ ] Includes publishing guidance without hardcoding one static-site tool.
- [ ] Includes verification checklist and "unable to verify" reporting.
- [ ] Cross-references =arch-document=, =brainstorm=, =security-check=,
=playwright-js=, =playwright-py=, and =codify=.
- [ ] Adds =references/= only if needed; suggested files:
- =references/doc-type-decision.md=
- =references/style-guide.md=
- =references/format-decision.md=
- =references/page-skeletons.md=
- =references/doc-audit-checklist.md=
- [ ] Keep =SKILL.md= concise enough to load; move long skeletons/checklists to
references for progressive disclosure.
- [ ] Run =./scripts/lint.sh= after adding the skill.
*** Open design questions before implementation
- Should the user-facing command be exactly =/create-documentation= while
internal helper names use =documentation-*=, or should all names share the
=create-documentation <subcommand>= form? Recommendation: one skill with
subcommands for v1.
- Should Markdown be the hard default for team docs? Recommendation: yes,
unless the project already uses org/reST/AsciiDoc or the output is personal
Emacs-native planning.
- Should the skill create a docs site automatically? Recommendation: no. It
should propose a site when the doc set exceeds README-scale or when search,
versioning, or public publishing is required. Ask before adding tooling.
- Should it write docs before code exists? Recommendation: yes for specs,
user journeys, and design docs, but route unclear feature/product decisions
through =brainstorm= or =arch-design= first.
- Should it include LLM-specific docs surfaces? Recommendation: optional. For
public/library/API docs, =llms.txt= or Markdown export is valuable, but normal
human navigation remains primary.
** TODO [#C] Build /research-writer — clean-room synthesis for research-backed long-form :feature:
:PROPERTIES:
:LAST_REVIEWED: 2026-06-12
:END:
Gap in current rulesets: between =brainstorm= (idea refinement → design doc)
and =arch-document= (arc42 technical docs), there's no skill for
research-backed long-form prose — blog posts, essays, white papers,
proposals with data backing, article-length content with citations.
Craig writes documents across many contexts (defense-contractor work,
personal, technical, proposals). The gap is real.
*Evaluated 2026-04-19:* ComposioHQ/awesome-claude-skills has a
=content-research-writer= skill (540 lines, 14 KB) that attempts this. *Not
adopting:*
- Parent repo has no LICENSE file — reuse legally ambiguous
- Bloated: 540 lines of prose-scaffolding with no tooling
- No citation-style enforcement (APA/Chicago/IEEE/MLA)
- No source-quality heuristics (primary vs secondary, peer-review, recency)
- Fictional example citations in the skill itself (models the hallucination
failure mode a citation-focused skill should prevent)
- No citation-verification step
- Overlaps with =humanizer= at polish with no composition guidance
*Patterns worth lifting clean-room (from their better parts):*
- Folder convention =~/writing/<article-name>/= with =outline.md=,
=research.md=, versioned drafts, =sources/=
- Section-by-section feedback loop (outline validated → per-section
research validated → per-section draft validated)
- Hook alternatives pattern (generate three hook variants with rationale)
*Additions for the clean-room version (v1):*
- Citation-style selection (APA / Chicago / MLA / IEEE / custom) with
style-specific examples and a pick-one step up front
- Source-quality heuristics: primary > secondary; peer-reviewed; recency
thresholds by domain; publisher reputation; funding transparency
- Citation-verification discipline: fetch real sources, never fabricate,
mark unverifiable claims with =[citation needed]= rather than inventing
- Composition hand-off to =/humanizer= at the polish stage
- Classification awareness: if the working directory or context signals
defense / regulated territory, flag any sentence that might touch CUI
or classified material before emission
*Target:* ~150-200 lines, clean-room per blanket policy.
*When to build:* wait for a real research-writing task to validate the
design against actual document patterns. Building preemptively risks
tuning for my guess at Craig's workflow rather than his real one.
Triggers that would prompt "let's build it now":
- Starting a white paper / proposal that needs citation discipline
- Writing a technical blog post with external references
- A pattern of hitting the same research-writing friction 3+ times
Upstream reference (do not vendor): ComposioHQ/awesome-claude-skills
=content-research-writer/SKILL.md=.
** TODO [#D] Revisit =c4-*= rename if a second notation skill ships :chore:
:PROPERTIES:
:LAST_REVIEWED: 2026-06-10
:END:
Current naming keeps =c4-analyze= and =c4-diagram= as-is (framework prefix
encodes the notation; "C4" is a discoverable brand). Suite membership is
surfaced via the description footer, not the name.
If a second notation-specific skill ever lands (=uml-*=, =erd-*=, =arc42-*=),
the compound pattern =arch-analyze-<notation>= / =arch-diagram-<notation>=
starts paying off: alphabetical clustering under 'a' amortizes across three+
skills, and the hierarchy becomes regular. At that point, rename all
notation skills together in one pass.
Trigger: adding skill #2 in the notation family. Don't pre-rename.
Candidate future notation skills (not yet in scope — noted for when a
real need arrives, not pre-emptively):
- *UML* (Unified Modeling Language): OO design notation with 14 diagram types,
in practice dominated by class / sequence / state / component. Common
in DoD / safety-critical / enterprise-architecture contexts. Tooling:
PlantUML (text-to-diagram), Mermaid UML, draw.io. Would likely split
into =uml-class=, =uml-sequence=, =uml-state= rather than one monolith
— different audiences, different inputs.
- *ERD* (Entity-Relationship Diagram): database schema modeling —
entities, attributes, cardinality. Crow's Foot notation dominates
practice; Chen is academic; IDEF1X is DoD-standard. Tooling:
dbdiagram.io, Mermaid ERD, PlantUML, ERAlchemy (code-to-ERD for SQL).
Natural fit as =erd-analyze= (extract from schema/migrations) and
=erd-diagram= (generate from prose/model definitions).
- *arc42*: already partially covered by =arch-document= (which emits
arc42-structured docs). A standalone =arc42-*= skill would be
redundant unless the arc42-specific visualizations need separation.
Each answers a different question:
- C4 → "What systems exist and how do they talk, at what zoom?"
- UML class/sequence → "What does the code look like / what happens when X runs?"
- ERD → "What's the database shape?"
- arc42 → "What's the full architecture document?"
Deferred pending an actual need that's blocked on not having one of these.
*** DoD-specific notations (DeepSat context)
Defense-contractor work uses a narrower, different notation set than
commercial software. Document the trigger conditions and starting point
so a future decision to build doesn't have to re-derive the landscape.
**** SysML (Systems Modeling Language)
UML 2 profile, dominant in DoD systems engineering. Six diagrams account
for ~all practical use:
- *Block Definition Diagram (BDD)* — structural; like UML class but for
system blocks (components, subsystems, hardware).
- *Internal Block Diagram (IBD)* — parts within a block and how they
connect (flow ports, interfaces).
- *Requirement diagram* — unique to SysML; traces requirements to
satisfying blocks. Essential in regulated environments.
- *Activity diagram* — behavioral flow.
- *State machine* — same shape as UML.
- *Sequence diagram* — same shape as UML.
SysML v1.x is in the field; v2 is emerging but not yet adopted at scale
(as of 2026-04). Tooling dominated by Cameo Systems Modeler / MagicDraw
and Enterprise Architect. Text-based option: PlantUML + =plantuml-sysml=
(git-friendly, growing niche).
*Candidate skills*: =sysml-bdd=, =sysml-ibd=, =sysml-requirement=,
=sysml-sequence=. Three or more in this cluster trigger the
=arch-*-<notation>= rename discussion from the parent entry.
**** DoDAF / UAF (architecture frameworks)
Not notations themselves — frameworks that specify *which* viewpoints a
program must deliver. Viewpoints are rendered using UML/SysML diagrams.
- *DoDAF (DoD Architecture Framework)* — legacy but still
contract-required on many programs.
- *UAF (Unified Architecture Framework)* — DoDAF/MODAF successor,
SysML-based. Gaining adoption on newer contracts.
Common required viewpoints (formal CDRL deliverables or PDR/CDR
review packages):
- *OV-1* — High-Level Operational Concept Graphic. The "cartoon" showing
the system in operational context with icons, arrows, surrounding
actors/environment. *Universally asked for — informal or formal.*
Starting point for any DoD diagram skill.
- *OV-2* — Operational resource flows (nodes and flows).
- *OV-5a/b* — Operational activities.
- *SV-1* — Systems interfaces. Maps closely to C4 Container.
- *SV-2* — Systems resource flows.
- *SV-4* — Systems functionality.
- *SV-10b* — Systems state transitions.
*Informal ask ("send me an architecture diagram") → OV-1 + SV-1 satisfies
90% of the time.* Formal CDRL asks specify the viewpoint set contractually.
*C4 gap*: C4 is rare in DoD. C4 System Context ≈ OV-1 in intent but not
in visual convention. C4 Container ≈ SV-1. Expect a mapping step or
reviewer pushback if delivering C4-shaped artifacts to a DoD audience.
*Candidate skills*: =dodaf-ov1=, =dodaf-sv1= first (highest-value);
=uaf-viewpoint= if newer contracts require UAF.
**** IDEF1X (data modeling)
FIPS 184 — federal standard for data modeling. Used in classified DoD
data systems, intelligence databases, and anywhere the government
specifies the data model. Same shape language as Crow's Foot but with
different adornments and notation conventions.
*Rule of thumb*: classified DoD data work → IDEF1X; unclassified
contractor work → Crow's Foot unless the contract specifies otherwise.
*Candidate skills*: =idef1x-diagram= / =idef1x-analyze= (parallel to a
future =erd-diagram= / =erd-analyze= pair).
**** Tooling baseline
- *Cameo Systems Modeler / MagicDraw* (Dassault) — commercial SysML
tooling, dominant in DoD programs.
- *Enterprise Architect (Sparx)* — widely used for UML + SysML + DoDAF.
- *Rhapsody (IBM)* — SysML with code generation; strong in avionics /
embedded (FACE, ARINC).
- *Papyrus (Eclipse)* — open source SysML; free but clunkier.
- *PlantUML + plantuml-sysml* — text-based, version-controllable. Fits a
git-centric workflow better than any GUI tool.
**** Highest-value starting point
If DeepSat contracts regularly require architecture deliverables, the
highest-ROI first skill is =dodaf-ov1= (or whatever naming convention
the rename discussion lands on). OV-1 is the universal currency in
briefings, proposals, and reviews; it's the one artifact that shows up
in every program regardless of contract specifics.
Trigger for building: an actual DoD deliverable that's blocked on not
having a skill to generate or check OV-1-shaped artifacts. Don't build
speculatively — defense-specific notations are narrow enough that each
skill should be driven by a concrete contract need, not aspiration.
** TODO [#C] Token-rotation helper for =@a-bonus/google-docs-mcp= OAuth refresh :feature:quick:
:PROPERTIES:
:LAST_REVIEWED: 2026-06-12
:END:
When a Google refresh token gets revoked (scopes re-granted, Connected App removed, account password reset), recovery is currently manual: run =npx -y @a-bonus/google-docs-mcp= with the right env, follow the URL in a browser, kill the process, base64-encode the new =token.json=, decrypt =secrets.env.gpg=, replace the var, re-encrypt. A small =mcp/refresh-google-docs-token.sh <profile>= would chain that into one command.
*** Sketch
#+begin_src bash
# usage: mcp/refresh-google-docs-token.sh personal
set -euo pipefail
profile="${1:?usage: mcp/refresh-google-docs-token.sh <profile>}"
var="GOOGLE_DOCS_${profile^^}_TOKEN_B64"
token="$HOME/.config/google-docs-mcp/$profile/token.json"
tmp="$(mktemp)"            # plaintext secrets: private temp file, not a fixed /tmp path
trap 'rm -f "$tmp"' EXIT   # removed even if a step fails
gpg -d ... | grep -v "$var" > "$tmp"   # decrypt, dropping the stale token line
GOOGLE_MCP_PROFILE="$profile" npx -y @a-bonus/google-docs-mcp &
mcp_pid=$!
xdg-open <captured-url>                # placeholder: the consent URL the server prints
# wait for $token to land
kill "$mcp_pid"
echo "${var}=$(base64 -w0 "$token")" >> "$tmp"
gpg -c --cipher-algo AES256 -o mcp/secrets.env.gpg.new "$tmp"
mv mcp/secrets.env.gpg.new mcp/secrets.env.gpg
#+end_src
The flow tonight worked but took a handful of manual steps. One script collapses it.
Decision (Craig, 2026-05-31): *hold until a token rotation is imminent.* The OAuth re-grant is a browser step that can't be triggered without revoking a live token, so the script can't be verified in isolation. Not marked =:solo:= — when a token actually needs rotating, write and verify in one pass (solo at that point).
** TODO [#C] Generic agent runtime support — Codex spec v0 :spec:design:
:PROPERTIES:
:LAST_REVIEWED: 2026-06-12
:END:
Codex drafted a v0 design doc for making rulesets runtime-neutral rather than Claude-Code-specific. Motivating cases: offline operation with a local LLM, and two LLMs running in the same project at the same time without trampling each other's session-context.
Spec at [[file:docs/design/2026-05-28-generic-agent-runtime-spec.org]] (moved here from inbox on intake).
Immediate correctness issue Codex flagged: the singleton .ai/session-context.org is unsafe under simultaneous agents. Codex recommends starting with Phase 1 only — add AI_AGENT_ID + session-context.d/<id>.org without renaming the rest.
Broader refactor proposes runtimes/ adapter manifests, generic install commands, language-bundle split (common/ + runtimes/<runtime>/), launcher refactor, local model service via llama.cpp/ollama. Big surface area, six phases.
2026-06-12 spec review complete: the [[file:docs/design/2026-05-28-generic-agent-runtime-spec-review.org][Codex review]] rates the whole spec =Not ready=. Phase 1 is already shipped, and Phase 1.5 is tracked separately as the helper-instance task. Before implementing any of phases 2-5, decide whether to commit to the larger arc and answer the blocker decisions: generic instruction-file strategy, default local runtime/server, first supported local editing CLI, adapter scope, and compatibility behavior for existing =CLAUDE.md= / =.claude/= projects.
*** 2026-06-10 Wed @ 14:13:55 -0500 Noted Phase 1 already shipped; narrowed scope to the phases 2-6 decision
Phase 1 (the correctness fix) is live: protocols.org documents the AI_AGENT_ID-scoped session-context path (=.ai/session-context.d/<id>.org=) and =.ai/scripts/session-context-path= resolves it. The singleton race Codex flagged is closed. What remains is the spec review plus a go/no-go on the broader runtime-neutral refactor: runtimes/ adapter manifests, generic install commands, language-bundle split, launcher refactor, local model service.
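A rough sketch of the path resolution, for orientation only. The real resolver is =.ai/scripts/session-context-path= and I have not copied its logic. The fallback to the singleton file when =AI_AGENT_ID= is unset and the allowed-character check are both assumptions.

#+begin_src python
import os
import re
from pathlib import Path


def session_context_path(project_root, env=None):
    """Return the session-context file for the current agent."""
    env = os.environ if env is None else env
    agent_id = env.get("AI_AGENT_ID", "").strip()
    ai_dir = Path(project_root) / ".ai"
    if not agent_id:
        # Assumed fallback: no id set means the original singleton file.
        return ai_dir / "session-context.org"
    # Assumed safety check: the id becomes a filename, so keep it to one path segment.
    if agent_id in {".", ".."} or not re.fullmatch(r"[A-Za-z0-9._-]+", agent_id):
        raise ValueError(f"unsafe AI_AGENT_ID: {agent_id!r}")
    return ai_dir / "session-context.d" / f"{agent_id}.org"


print(session_context_path("/proj", env={"AI_AGENT_ID": "codex-1"}))
#+end_src

Two agents with different ids resolve to different files, which is what closes the race.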
*** 2026-06-11 Thu @ 19:26:26 -0500 Spec amended with the helper-instance slice; implementation split out
Craig's motivating case (a second Claude in the same project for lookups and safe task updates) was under-specified in v0 — it had identity and message targeting but no spawn mechanics and no write-safety contract for the shared files the session-context split doesn't isolate. Added the "Concurrent same-project agents (helper instances)" section (subagent boundary, identity/spawn via =ai --helper=, the tiered read/write contract, light startup, helper wrap-up) and Phase 1.5 to the migration plan. Implementation filed as its own [#B] task ("Helper-instance support"); this task stays scoped to the phases 2-6 go/no-go.
*** 2026-06-12 Fri @ 02:09:10 -0500 Independent spec review complete
Codex ran the spec-review workflow. Outcome: the combined spec is =Not ready= because phases 2-5 still require product decisions and current external-runtime/model verification. Phase 1.5 can proceed only as the already-split helper task, with rollout/manual-validation caveats accepted and no accidental template-wide release before sandbox/pilot drills pass. Review file: [[file:docs/design/2026-05-28-generic-agent-runtime-spec-review.org]].
*** 2026-06-12 Fri @ 02:39:38 -0500 Second review after response pass
Codex re-ran spec-review after the dispositions were folded in. Outcome by arc: Phase 1.5 helper instances =Ready with caveats=; phases 2-5 remain =Not ready= behind the explicit decisions/reverification gate. No new blocking findings for the helper slice. Review file updated in place: [[file:docs/design/2026-05-28-generic-agent-runtime-spec-review.org]].
* Rulesets Resolved
** DONE [#C] Fix =cj-scan= false positives on cj fences nested inside other =#+begin_*= blocks :bug:
CLOSED: [2026-05-15 Fri]
=cj-scan.py= was matching =#+begin_src cj:= / =#+end_src= line-by-line
without awareness of enclosing block scopes. A cj fence embedded inside a
=#+begin_example= block (typically when documenting what the =<cj= yasnippet
emits) or inside =#+begin_src snippet= (the yasnippet definition itself) was
misclassified as a live cj annotation. Surfaced from a /respond-to-cj-comments
run against the dotemacs =todo.org= that reported two false positives in the
=<cj= yasnippet documentation.
Fix: track an active =wrapper_type= state. The more-specific cj-open regex is
checked first, so a genuine =#+begin_src cj:= fence never counts as a wrapper.
When the scanner sees any other =#+begin_<type>=, it enters a wrapper state
where every line is treated as content until the matching =#+end_<type>=
closer fires. Inside a wrapper, cj fence patterns and legacy inline =cj:=
lines are both suppressed.
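The wrapper state described above can be sketched as follows. This is an illustration of the technique, not the code in =cj-scan.py=: the regexes and the =find_cj_blocks= name are assumptions, and legacy inline =cj:= lines are left out.

#+begin_src python
import re

# Assumed patterns; the real scanner's regexes may differ.
CJ_OPEN = re.compile(r"^\s*#\+begin_src\s+cj:", re.IGNORECASE)
CJ_CLOSE = re.compile(r"^\s*#\+end_src\b", re.IGNORECASE)
WRAP_OPEN = re.compile(r"^\s*#\+begin_(\w+)", re.IGNORECASE)


def find_cj_blocks(lines):
    """Return (start, end) 1-based line ranges of live cj fences."""
    blocks = []
    wrapper_type = None  # e.g. "example" while inside #+begin_example
    cj_start = None      # line number of the open cj fence, if any
    for n, line in enumerate(lines, start=1):
        if wrapper_type is not None:
            # Inside a wrapper every line is content until the matching closer.
            closer = rf"^\s*#\+end_{re.escape(wrapper_type)}\b"
            if re.match(closer, line, re.IGNORECASE):
                wrapper_type = None
            continue
        if cj_start is not None:
            if CJ_CLOSE.match(line):
                blocks.append((cj_start, n))
                cj_start = None
            continue
        if CJ_OPEN.match(line):  # more specific, so checked before the wrapper regex
            cj_start = n
            continue
        opened = WRAP_OPEN.match(line)
        if opened:
            wrapper_type = opened.group(1).lower()
    return blocks


doc = [
    "#+begin_example",
    "#+begin_src cj: documented, not live",
    "#+end_src",
    "#+end_example",
    "#+begin_src cj: real annotation",
    "body",
    "#+end_src",
]
print(find_cj_blocks(doc))
#+end_src

In this sketch an unclosed wrapper swallows the rest of the file and reports nothing, so it cannot produce a false-positive cj block.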
Tests: added =TestCjScanNestedFencesIgnored= (6 tests) to
=claude-templates/.ai/scripts/tests/test_cj_scan.py= covering nesting inside
=#+begin_example=, =#+begin_src <other-lang>=, and =#+begin_quote=, plus
regression guards that a wrapper closes cleanly (a subsequent real cj fence
is still detected) and that an unclosed wrapper doesn't silently swallow
later content into false-positive cj blocks.
Full =make test-scripts= equivalent (=python3 -m pytest=): 302 passed, 1
skipped, 0 failures.
** DONE [#A] Add =make doctor= — verify ~/.claude/ matches repo + settings.json :feature:
A drift detector that scans =~/.claude/= and reports anything inconsistent with what the repo expects. Single-command answer to "is my machine consistent with rulesets?"
*** Why this matters
A 2026-05-06 sweep found =~/.claude/hooks/= didn't exist on this machine even though =settings.json= referenced =~/.claude/hooks/precompact-priorities.sh= as a PreCompact hook. Compaction would have silently failed to invoke the hook. The fix was =make install-hooks=, but the breakage was invisible until I happened to grep for it. =make doctor= run regularly (or even as part of session start) would catch this kind of drift in seconds instead of after the fact.
*** Checks
- Every entry in =settings.json= ="hooks"= block points at a file that exists.
- Every entry in =enabledPlugins= has a matching install under =~/.claude/plugins/data/=.
- Every skill in =$(SKILLS)= has a working symlink at =~/.claude/skills/<name>=.
- Every rule in =$(RULES)= has a working symlink at =~/.claude/rules/<name>=.
- Every default hook has a symlink at =~/.claude/hooks/<name>= (warn-only — opt-out is legitimate).
- =settings.json= and =.mcp.json= symlinks resolve to the rulesets versions.
- =mcp/install.py= state matches =claude mcp list= (every server in =servers.json= is registered).
- No dangling symlinks anywhere under =~/.claude/=.
*** Output
One line per check: =ok= / =WARN= / =FAIL=. Final summary: =N ok, M warnings, K failures=. Exit non-zero on any failure so it can ride a pre-flight check.
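A sketch of two of the checks plus the reporting contract, under stated assumptions. It treats every ="command"= string under the ="hooks"= block as a hook whose first word is a script path. That shape is my reading of =settings.json=, not verified against the shipped =doctor= target. All function names are hypothetical.

#+begin_src python
import os
import shlex
import sys
from pathlib import Path


def iter_commands(node):
    """Yield every "command" string nested anywhere under a hooks block."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "command" and isinstance(value, str):
                yield value
            else:
                yield from iter_commands(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_commands(item)


def check_hooks(settings):
    """FAIL for each hook command whose script file does not exist."""
    results = []
    for command in iter_commands(settings.get("hooks", {})):
        parts = shlex.split(command)
        if not parts:
            continue
        script = Path(os.path.expanduser(parts[0]))
        results.append(("ok" if script.is_file() else "FAIL", f"hook {parts[0]}"))
    return results


def check_dangling(root):
    """FAIL for each symlink under root whose target is missing."""
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink() and not path.exists():
                results.append(("FAIL", f"dangling symlink {path}"))
    return results


def report(results, out=sys.stdout):
    """Print one line per check and a summary; return the exit code."""
    counts = {"ok": 0, "WARN": 0, "FAIL": 0}
    for status, message in results:
        counts[status] += 1
        print(f"{status:<5} {message}", file=out)
    print(f"{counts['ok']} ok, {counts['WARN']} warnings, {counts['FAIL']} failures", file=out)
    return 1 if counts["FAIL"] else 0
#+end_src

A caller would load =settings.json= with =json.load=, concatenate the result lists, and pass the return value of =report= to =sys.exit=.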
** DONE [#A] Build =voice= skill — combine =humanizer= with universal + personal style passes :feature:
Combine =humanizer= with universal good-writing passes (Strunk & White, Orwell, Plain English) and the personal-style passes from =commits.md=. Two modes — =general= for arbitrary writing, =personal= for commits/PRs/comments — share a foundation and diverge on register.
Built and shipped 2026-05-07: =voice/SKILL.md= with 39 numbered patterns walked sequentially. Patterns 1-25 carried over from humanizer, 26-31 are universal good-writing additions, 32-39 are personal-only. Migrated three callers (=commits.md=, =respond-to-cj-comments.md=, =start-work.md=). Removed the standalone =humanizer= skill since voice supersedes it.
*** Why this matters
Three transformations want to run together for personal-mode artifacts (commits, PR titles + bodies, PR comments) but lived in three places: =humanizer= as a skill, S&W-style universal rules nowhere (applied ad-hoc), and the personal-style passes as prose steps in =commits.md= that got re-applied by hand each time. Costs: (1) the "I forgot pass (e)" failure mode — skipping a pass without flagging is a defect but happens in practice. (2) No single-call invocation of the full transform. (3) General-mode writing (research notes, philosophy, history) got only humanizer with no universal-prose pass at all. Combining brings them under one skill with one invocation.
*** Design
Two modes:
- *general* (default) — for arbitrary writing not bound for commit/PR/comment publishing (research notes, philosophy/history essays, emails, README prose). Runs:
- humanizer (current behavior — strip AI-generated-writing fingerprints)
- tier-1 universal passes (canonical good-writing rules)
- the 2 personal-style passes that have no register conflict (jargon-fragment rewrite, noun-ified verbs)
- *personal* — for commits, PR titles + bodies, PR comments. Runs general PLUS:
- 8 personal-only passes (first-person rewrite, semicolons, contractions, sentence-split, felt-experience, sentence fragments, terse cut, public-artifact scope check)
The 8 personal-only passes are explicitly *not* in general mode. They conflict with academic / literary / philosophical register. Forcing first-person on a Foucault essay or stripping felt-experience from a journal entry would damage the writing.
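The mode split can be sketched as a lookup over the shipped pattern numbers (1-25 humanizer, 26-31 universal, 32-39 personal-only). =patterns_for= is a hypothetical name, and the skill itself is prose in =SKILL.md=, so this only illustrates which patterns each mode walks.

#+begin_src python
HUMANIZER = range(1, 26)       # patterns 1-25, carried over from humanizer
UNIVERSAL = range(26, 32)      # patterns 26-31, run in both modes
PERSONAL_ONLY = range(32, 40)  # patterns 32-39, personal mode only


def patterns_for(mode="general"):
    """Return the ordered pattern numbers a mode walks; bare invocation is general."""
    if mode == "general":
        return [*HUMANIZER, *UNIVERSAL]
    if mode == "personal":
        return [*HUMANIZER, *UNIVERSAL, *PERSONAL_ONLY]
    raise ValueError(f"unknown mode: {mode!r}")


print(len(patterns_for()), len(patterns_for("personal")))
#+end_src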
*** Tier 1 universals (v1)
From Strunk & White, Orwell's "Politics and the English Language", Plain English Campaign, and Garner's Modern English Usage. Each is a detection-pattern + rewrite-rule pair, mechanical enough to apply consistently across runs.
- *Omit needless words* — curated phrase list (=the fact that= → =that=/=because=, =in order to= → =to=, =at this point in time= → =now=, =due to the fact that= → =because=, =for the purpose of= → =to=, =in spite of= → =despite=, etc.)
- *Long word → short word* — Plain English wordlist (~150 entries: =utilize=→=use=, =commence=→=start=, =terminate=→=end=, =facilitate=→=help=, =demonstrate=→=show=, =sufficient=→=enough=, =prior to=→=before=, =subsequent to=→=after=, =in the event that=→=if=, =a great deal of=→=much=)
- *Active over passive voice* — detect "to be + past-participle" patterns. Suggestion-only in v1 (auto-rewrite is risky in technical contexts where passive is appropriate); graduate to auto-rewrite for unambiguous cases in v2.
- *Comma splices* — detect independent clauses joined only by comma; rewrite to period or semicolon-then-period.
- *Cliché flag* — small curated list (=at the end of the day=, =moving forward=, =going forward=, =at this juncture=, =circle back=, =low-hanging fruit=, =deep dive=, =leverage= as verb).
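A small sketch of one detection-pattern + rewrite-rule pass, assuming a plain phrase table. The entries are a sample from the lists above, =apply_rewrites= is a hypothetical name, and the capitalization handling is my own addition.

#+begin_src python
import re

# Sample entries only; longer phrases come first so they win over shorter ones.
REWRITES = [
    ("due to the fact that", "because"),
    ("at this point in time", "now"),
    ("in order to", "to"),
    ("in spite of", "despite"),
    ("prior to", "before"),
    ("utilize", "use"),
]


def apply_rewrites(text):
    """Apply each phrase rewrite; return the new text and which rules fired."""
    fired = []
    for phrase, replacement in REWRITES:
        pattern = re.compile(rf"\b{re.escape(phrase)}\b", re.IGNORECASE)

        def swap(match, replacement=replacement):
            # Keep a leading capital when the phrase opened a sentence.
            if match.group(0)[0].isupper():
                return replacement.capitalize()
            return replacement

        text, count = pattern.subn(swap, text)
        if count:
            fired.append((phrase, count))
    return text, fired


new_text, fired = apply_rewrites("In order to ship, we utilize rsync prior to the release.")
print(new_text)
print(fired)
#+end_src

Returning the list of rules that fired is what would feed the "Summary of changes" report.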
*** Tier 2 universals (v2)
- *Positive over negative form* (S&W) — =not unlike= → =like=, =do not fail to= → =remember to=, =did not pay any attention= → =ignored=
- *Garner-style word-pair corrections* — comprise/compose, less/fewer, that/which (restrictive vs nonrestrictive), affect/effect, principal/principle
- *Parallelism in lists* — detect mismatched grammar in bullet items
- *Tense consistency* — flag mid-paragraph tense shifts
- *Acronym definition on first use* — detect uppercase tokens used before being expanded
*** Tier 3 (v3, may not land)
- *Concrete-over-abstract* preference
- *Emphatic word at sentence end* (S&W rule 18)
- *Vary sentence length / rhythm*
- *Reading-grade-level scoring* (Hemingway-style)
*** Personal-style pass placement
| # | Pass | Mode | Why |
|----+-------------------------------------+-------------------------------------+-------------------------------------|
| 1 | First-person voice rewrite | personal only | Forces "I" voice; wrong for |
| | | | academic prose where third-person |
| | | | and "we" are conventional |
|----+-------------------------------------+-------------------------------------+-------------------------------------|
| 2 | Jargon-fragment → complete sentence | both | Universal clarity, no genre |
| | | | conflict |
|----+-------------------------------------+-------------------------------------+-------------------------------------|
| 3 | Semicolon → period/comma | personal only | Semicolons are conventional in |
| | | | long-form / academic prose |
|----+-------------------------------------+-------------------------------------+-------------------------------------|
| 4 | Contractions ("it's", "don't") | personal only | Academic and formal writing |
| | | | typically avoids contractions |
|----+-------------------------------------+-------------------------------------+-------------------------------------|
| 5 | Sentence split on conjunctions | personal only | Foucault, Hegel, Adorno |
| | | | deliberately use long compound |
| | | | sentences |
|----+-------------------------------------+-------------------------------------+-------------------------------------|
| 6 | Felt-experience narration ("I'll | personal only | Personal essays *use* |
| | feel this every time") | | felt-experience as content |
|----+-------------------------------------+-------------------------------------+-------------------------------------|
| 7 | Noun-ified verbs ("the ask", "a | both | Targets corporate-speak with |
| | learn", "the spend") | | curated wordlist; doesn't catch |
| | | | philosophical nominalizations like |
| | | | "the becoming" |
|----+-------------------------------------+-------------------------------------+-------------------------------------|
| 8 | Sentence fragments → complete (in | personal only | Fragments are valid stylistic |
| | prose) | | devices in literary prose |
|----+-------------------------------------+-------------------------------------+-------------------------------------|
| 9 | Terse cut (rhetorical padding: | personal only | Tier 1 omit-needless-words covers |
| | "worth noting", "it's important to | | the worst offenders universally; |
| | understand") | | aggressive cut conflicts with |
| | | | academic register |
|----+-------------------------------------+-------------------------------------+-------------------------------------|
| 10 | Public-artifact scope check (local | personal only — *flag-only*, no | Operational/safety check, not |
| | paths, private repos, personal | auto-rewrite | stylistic; auto-masking risks |
| | tooling) | | silently editing meaningful text |
|----+-------------------------------------+-------------------------------------+-------------------------------------|
*** Inclusive-language pass — explicitly excluded
Considered and rejected. Conflicts with planned writing on philosophy/history topics (Foucault on sexuality and gender, history of slavery in New Orleans). Wordlist substitutions would override deliberate vocabulary choices in those genres.
*** V1 scope
- [X] Skill at =~/code/rulesets/voice/= with =SKILL.md=
- [ ] Frontmatter with positive triggers (commit, PR, comment, "humanize", "voice pass") and negative triggers (code, structured data, plain bullet lists)
- [X] Mode invocation: default = =general= when invoked bare; =personal= invoked explicitly by publish-context callers
- [X] humanizer content migrated from =humanizer/= → =voice/=
- [X] Tier 1 universal passes implemented (4 patterns: #26-29; #30 and #31 are the two personal passes that run in both modes)
- [X] 2 personal passes that run in both modes (#30 jargon-fragment, #31 noun-ified verbs)
- [X] 8 personal passes that run in personal mode only (#32 first-person, #33 semicolons, #34 contractions, #35 sentence-split, #36 felt-experience, #37 fragments, #38 terse cut, #39 scope check)
- [X] Each pass = detection-pattern + rewrite-rule pair (#39 is detection + flag-only)
- [X] Total v1 pattern count: 31 in general mode (humanizer's 25 + 4 tier-1 + 2 universal personal); +8 personal-only = 39 in personal mode
- [X] Update =commits.md= to invoke =/voice personal= instead of "run =humanizer= and apply five passes manually"
- [X] Remove the existing =humanizer/= skill (no callers outside this repo, all migrated)
- [X] =make doctor= still passes
- [X] =make lint= clean
*** v2 (deferred)
- [ ] Tier 2 universals (positive form, word-pair corrections, parallelism, tense consistency, acronym definition)
- [ ] Per-pass severity flags for Tier 1 active-voice (suggestion-only when actor is implicit; auto-rewrite when actor is named)
- [ ] Reporting mode: list which passes fired and which were no-ops
*** v3 (aspirational, may not land)
- [ ] Tier 3 (concrete-over-abstract, emphatic-word position, sentence-length variation, reading-grade scoring)
- [ ] Progressive disclosure split: =voice/SKILL.md= orchestrator + =voice/passes/<pass-name>.md= per pass with worked examples
*** Migration (resolved)
Decision: deleted =humanizer/= entirely. Three callers (=commits.md=, =respond-to-cj-comments.md=, =start-work.md=) all updated to invoke =/voice= directly. No alias needed since nothing outside the repo invoked humanizer.
*** Naming alternatives considered
- =voice= — chosen. Captures both modes; broad enough.
- =polish= — descriptive of multi-pass nature; less prescriptive about whose voice.
- =house-style= — signals "this is the house style"; appropriate for personal repo.
- =commit-voice= — too narrow (passes apply to research notes, emails, etc. in general mode).
- =humanize= (extending current) — undersells the universal + personal additions.
*** Open questions before implementation
Resolved during implementation:
- Default mode when =/voice= is invoked bare: =general=. Personal-context callers (=commits.md= publish flow, =respond-to-cj-comments.md=) invoke =/voice personal= explicitly. Avoids accidentally first-person-ifying research notes.
- Reporting: skill prints "Summary of changes" listing which patterns fired (audit value).
- Public-artifact scope check (#39): flag-only, user resolves manually. Blocking would frustrate on legitimate path mentions.
- Tier 1 active-voice detection: suggestion-only in v1. Auto-rewrite for unambiguous cases deferred to v2.
** DONE [#B] Add =--archive-done= mode to =.ai/scripts/todo-cleanup.el= :feature:
Opt-in mode that moves every level-2 subtree whose TODO state is DONE or CANCELLED out of the "Open Work" section and into the "Resolved" section of the same org file, subtree intact.
- *Section matching.* Key on a top-level heading containing "Open Work" and one containing "Resolved" — that pairing is the only naming consistent across projects (=Work Open Work= / =Work Resolved= here; bare =Open Work= / =Resolved= elsewhere). Require exactly one match for each; otherwise skip with a clear message, no crash.
- *Modes.* =--check= previews and writes nothing, same as the existing hygiene pass. Idempotent. Not run by default in the wrap-up flow — archiving is consequential, so it stays opt-in: =emacs --batch -q -l todo-cleanup.el --archive-done FILE=.
- *Edge cases.* Source or target section missing; subtree at EOF; nested DONE subtree under an open parent stays put (only level-2 entries move); nothing to move → clean no-op.
- *Tests.* TDD with ERT — the project's first elisp tests. Fixtures (synthetic) under =.ai/scripts/tests/=; run via =make test= (rulesets) or =make test-scripts= (claude-templates), which run pytest + every =tests/test-*.el= ERT suite. Cases: one DONE level-2 moves; multiple; CANCELLED also moves; structural (no-state) headings don't move; nested DONE under an open parent stays; level-2 DONE with open level-3 children moves intact; subtree at EOF; missing source/target section; ambiguous "Resolved"; lowercase headings; nothing-to-do; idempotency; =--check= preview + its idempotency; realistic-sample integration.
Origin: came up while scrubbing a project's todo.org on 2026-05-11 — moving a big completed PROJECT subtree (plus a few smaller ones) into the Resolved section by hand was the cue to build a reusable tool.
Built and shipped 2026-05-11: =--archive-done= added to =.ai/scripts/todo-cleanup.el= test-first; 13-test ERT suite (=tests/test-todo-cleanup.el=) + realistic synthetic fixture (=tests/fixtures/todo-sample.org=), wired into =make test= / =make test-scripts= alongside pytest. The CLI dispatch moved into =tc-main= behind a guard so the suite can =require= the file without firing it. Section matching is case-insensitive and tolerates the =<Project> Open Work= / =<Project> Resolved= naming variants. Opt-in only — not wired into the wrap-up flow. Source of truth is =~/projects/claude-templates/=; rsync'd into this repo.
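For readers who don't read elisp, the archive move can be sketched in Python. This illustrates the algorithm only. The shipped implementation is =todo-cleanup.el=, and =archive_done= here is a hypothetical stand-in whose messages and append-at-end placement are assumptions.

#+begin_src python
import re

DONE_L2 = re.compile(r"^\*\* (DONE|CANCELLED)\b")


def archive_done(text):
    """Move level-2 DONE/CANCELLED subtrees from Open Work to Resolved."""
    lines = text.splitlines()
    tops = [i for i, line in enumerate(lines) if line.startswith("* ")]

    def find(keyword):
        return [i for i in tops if keyword in lines[i].lower()]

    def section_end(start):
        later = [i for i in tops if i > start]
        return later[0] if later else len(lines)

    src, dst = find("open work"), find("resolved")
    if len(src) != 1 or len(dst) != 1:
        return text, "skipped: need exactly one Open Work and one Resolved heading"

    s0, s1 = src[0], section_end(src[0])
    keep, moved = [], []
    target = keep  # lines before the first level-2 heading stay put
    for line in lines[s0 + 1:s1]:
        if line.startswith("** "):  # only level-2 headings switch the target
            target = moved if DONE_L2.match(line) else keep
        target.append(line)
    if not moved:
        return text, "nothing to move"

    d1 = section_end(dst[0])  # append at the end of the Resolved section
    # Apply the later edit first so the earlier indices stay valid.
    for start, end, new in sorted([(s0 + 1, s1, keep), (d1, d1, moved)], reverse=True):
        lines[start:end] = new
    count = sum(1 for line in moved if line.startswith("** "))
    trailing = "\n" if text.endswith("\n") else ""
    return "\n".join(lines) + trailing, f"moved {count} subtree(s)"
#+end_src

A second run over the output finds nothing to move, which is the idempotency property the ERT suite checks for the real tool.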
** DONE [#B] Encode follow-up filing rules into =/start-work=
CLOSED: [2026-05-15 Fri]
Phase 4 step 5 of =/start-work= ("refactor audit") says any candidate that isn't fix-now must land in one of three buckets: fold-into-related-commit, separate =refactor:= commit, or "file a ticket or todo.org entry." The third disposition doesn't say *where* — which leaves the orchestrator picking a location ad-hoc. Result: follow-ups buried under children of an epic parent get orphaned when the parent closes, or follow-ups for standalone tasks scatter across the file with no convention.
Proposed placement rule (already memorized for this project as =feedback_followups_as_siblings.md=, generalizing):
- *Epic-style parent task* (level-2 with multiple level-3 children) → follow-ups file as level-2 *siblings* of the parent. Stays visible after parent closure.
- *Standalone task* (level-2 with no children, or a level-3 inside another structure) → follow-up files as a new level-2 top-level entry in the same =* Open Work= section. Don't nest under the originating task.
Both cases: include a "Triggered by: <date> <task or commit>" line so a future reader sees what surfaced it.
Update =.claude/commands/start-work.md= Phase 4 step 5's "Disposition for each candidate" section to spell this out. Update any cross-references in =commits.md= or other files that touch the discipline.
Triggered by: 2026-05-15 fold-epic session — Craig flagged the gap mid-flight after I'd surfaced a follow-up but hadn't filed it.
** DONE [#A] Consolidate =.ai/= template infrastructure (fold + audit + install-ai + ratio) :feature:
CLOSED: [2026-05-15 Fri]
End-state: one repo (=rulesets=) is the single source of truth for =.ai/= template content. =make audit= verifies and applies drift across every =.ai/=-using project on the machine. =make install-ai= bootstraps new projects. Same setup propagated to ratio so both machines run the same way.
Today (2026-05-15) the canonical-source rule got violated again: rulesets commit =372fb76= added a wrap-up subsection to =rulesets= without going through =claude-templates= first, and the next session's startup rsync was about to silently undo it. Two-repo coordination is the root cause; fold solves it.
Build order: fold first (others depend on the new canonical path), then audit + install-ai in parallel, then test, then propagate to ratio.
*** DONE [#A] Fold =claude-templates= into rulesets
CLOSED: [2026-05-15 Fri]
Two repos, one source of truth. =~/projects/claude-templates/= is the canonical =.ai/= template that gets rsync'd into every project at session start. Keeping it standalone means a second =git pull= in startup Phase A.0, a second remote to push to at wrap-up, and a split history any time a change touches both. Folding it into =rulesets/claude-templates/= gives one repo to clone on a fresh machine and one place to edit templates.
**** Open design choices
- *History.* =git subtree add --prefix=claude-templates ~/projects/claude-templates main= preserves the 84-commit history under the new prefix. Plain content copy (=cp -a= + =git add=) is simpler but loses history. Either is fine since the standalone repo stays archived on =cjennings.net=.
- *Layout.* =rulesets/claude-templates/= mirrors the old repo name and sits next to =claude-rules/= cleanly. Alternative: absorb =.ai/= directly under a different name (=rulesets/.ai-template/= or similar). First option is clearer.
- *bin/ai.* The standalone Makefile symlinks =$HOME/.local/bin/ai → bin/ai=. After the move, fold that into rulesets' Makefile as another install target.
**** Mechanical steps
1. Subtree-merge or copy =~/projects/claude-templates/= into =rulesets/claude-templates/=.
2. Update 3 references in rulesets:
- =.ai/protocols.org= line 163 — pointer in the "Let's run/do the X workflow" section.
- =.ai/workflows/cross-agent-comms.org= line 8 — promotion-target path.
- =.ai/workflows/startup.org= lines 22, 96-98 — Phase A.0 pull + Phase A rsync sources.
3. Update Phase A.0 of =startup.org= to pull rulesets instead of claude-templates. Inside rulesets sessions, the existing project-repo pull already covers it. Outside rulesets (every other project's session), Phase A.0 needs an explicit =git pull= on =~/code/rulesets/= before the rsync — otherwise the templates will be stale.
4. Replace =~/projects/claude-templates/= with a symlink to =~/code/rulesets/claude-templates/= for transition continuity.
5. After every active project has had one session start (and rsync'd the new =startup.org=), drop the symlink and archive =cjennings.net:git/claude-templates.git=.
**** Bootstrap gap
Every project on the machine has a =.ai/workflows/startup.org= that rsyncs from =~/projects/claude-templates/=. Until each project's startup.org gets refreshed (which happens via the rsync itself), the old path needs to keep resolving. The symlink at step 4 is the bridge: old paths resolve into the new location, the rsync delivers the updated startup.org, next session uses the new path directly.
*** DONE [#A] Add =make audit= — drift detector across all =.ai/=-using projects
CLOSED: [2026-05-15 Fri]
Companion to =make doctor= (single-machine scope, checks =~/.claude/=). =audit= is cross-project scope: it walks every directory on the machine that has a =.ai/=, diffs the synced template files against the canonical source, and reports drift. The =--apply= flag rsyncs the canonical files over the drifted copies in each project's working tree (no auto-commit). This catches stale projects without forcing a session start in each one.
**** Open design choices
- *Scope.* Template-sync drift is the useful flavor: for each project, diff =.ai/protocols.org=, =.ai/workflows/=, =.ai/scripts/= against the canonical source.
- *Source path.* Post-fold: =~/code/rulesets/claude-templates/.ai/=. Build =audit= against the new path from day one.
- *Project discovery.* Walk =~/code/=, =~/projects/=, =~/.emacs.d/= up to depth 3 for any directory containing =.ai/=. Skip the canonical source itself.
- *Default mode is report-only.* =--apply= triggers rsync; =--force= overrides the dirty-skip safety.
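A minimal sketch of the discovery walk, assuming Python and assuming "depth 3" is counted relative to each root (the root itself being depth 0). The name =discover_projects= and its parameters are hypothetical, not the shipped implementation; =canonical= is the directory that holds the canonical =.ai/=.

#+begin_src python
import os
from pathlib import Path


def discover_projects(roots, canonical, max_depth=3):
    """Return sorted directories under `roots` that contain a .ai/ subdirectory.

    Depth 0 is the root itself. The canonical source directory is skipped.
    Symlinked directories are not followed (os.walk default).
    """
    canonical = Path(canonical).resolve()
    found = set()
    for root in roots:
        root = Path(root).expanduser()
        if not root.is_dir():
            continue  # a missing root is not an error
        for dirpath, dirnames, _ in os.walk(root):
            here = Path(dirpath)
            depth = len(here.relative_to(root).parts)
            if (here / ".ai").is_dir() and here.resolve() != canonical:
                found.add(here)
            if depth >= max_depth:
                dirnames[:] = []  # stop descending past the depth limit
            else:
                # never descend into .ai/ or .git/ themselves
                dirnames[:] = [d for d in dirnames if d not in (".ai", ".git")]
    return sorted(found)
#+end_src

Called with the three roots named above and =~/code/rulesets/claude-templates= as =canonical=, this would return every project directory except the canonical source.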
**** Per-project flow (designed 2026-05-15)
For each discovered project, in order:
1. Verify =.ai/= exists (path probe). If missing → =FAIL=, skip, continue loop.
2. Detect git tracking via =git check-ignore .ai/= → =tracked= or =gitignored=.
3. Verify no uncommitted =.ai/= changes (=git status --porcelain .ai/=). Dirty → =WARN=, skip rsync unless =--force=.
4. Verify content matches canonical via three =rsync -a --dry-run --itemize-changes= calls (=protocols.org=, =workflows/=, =scripts/=). Zero items = clean.
5. Action (=--apply= only, drift detected): three =rsync -a [--delete]= calls.
6. Verify rsync converged (re-run the dry-runs; zero now).
7. Verify working-tree state after rsync (tracked projects). Report deltas. Do not auto-commit.
8. Verify no unpushed =.ai/= commits (=git log @{u}..HEAD -- .ai/=). Informational only.
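The decision logic in steps 1-5 can be sketched as a pure function, separate from the =git= and =rsync= calls that feed it. This is an illustrative Python sketch, not the shipped script: =count_drift=, =classify=, and the status strings are hypothetical names, and =count_drift= assumes the dry-run output contains only itemized lines (no =-v= header).

#+begin_src python
def count_drift(itemized_output):
    """Count changed items in `rsync --dry-run --itemize-changes` output."""
    return sum(1 for line in itemized_output.splitlines() if line.strip())


def classify(ai_exists, dirty, drift_items, apply=False, force=False):
    """Map one project's probe results to a status string.

    Returns one of: "failed", "skipped", "ok", "drift", "applied".
    """
    if not ai_exists:
        return "failed"   # step 1: .ai/ missing, caller continues the loop
    if dirty and not force:
        return "skipped"  # step 3: uncommitted .ai/ changes
    if drift_items == 0:
        return "ok"       # step 4: the dry-runs listed nothing
    # step 5 only runs with --apply; otherwise the drift is just reported
    return "applied" if apply else "drift"
#+end_src

Keeping the classification free of subprocess calls makes each path (missing, dirty, clean, drifted) testable without touching a real project.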
**** Output format (mirrors =doctor=)
#+begin_example
Claude-templates source:
ok rulesets/claude-templates is current (origin/main)
Per-project .ai/ drift:
ok ~/projects/work
applied ~/projects/homelab 3 files changed
skipped ~/code/winvm uncommitted .ai/ (use --force)
ok ~/projects/clipper
Summary: 18 ok, 3 applied, 1 skipped, 0 failed
#+end_example
Exit code: =0= if all clean, no skips, no failures. =1= otherwise.
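A small sketch of how the summary line and exit code could be derived from per-project statuses. The function name =summarize= and the status strings are hypothetical. It assumes =applied= counts as clean (the rsync converged) and that unresolved drift in report-only mode is not clean; the spec above does not settle either point.

#+begin_src python
from collections import Counter


def summarize(statuses):
    """Return (summary_line, exit_code) for a list of per-project statuses."""
    counts = Counter(statuses)
    line = "Summary: {} ok, {} applied, {} skipped, {} failed".format(
        counts["ok"], counts["applied"], counts["skipped"], counts["failed"]
    )
    # Assumption: any skip, failure, or unapplied drift makes the run non-zero.
    problems = counts["skipped"] + counts["failed"] + counts["drift"]
    return line, (0 if problems == 0 else 1)
#+end_src

With the counts from the example output (18 ok, 3 applied, 1 skipped), this yields the same summary line and exit code =1= because of the skip.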
**** Why not extend =make doctor= instead
=doctor= has a clean meaning today: "is this machine's =~/.claude/= consistent with rulesets?" Mixing in cross-project =.ai/= drift muddies the exit code. Keep them separate. =audit= can optionally invoke =doctor= as its last check since both ask "did the symlinks keep up with the source?". A future =make all-checks= can wrap both.
*** DONE [#A] Add =make install-ai PROJECT=<path>= — bootstrap =.ai/= in a fresh project
CLOSED: [2026-05-15 Fri]
Separate target from =audit= because operating on projects that lack =.ai/= is a distinct action. The absence might be intentional, so =audit= skips them. Bootstrap is explicit opt-in.
**** Flow
1. Refuse if =.ai/= already exists in =PROJECT=. Message: "already installed; use =make audit --apply= to update."
2. Verify =PROJECT= is a git checkout (warn if not — works without git, loses some lifecycle benefits).
3. Create =PROJECT/.ai/= directory.
4. Rsync canonical content: =protocols.org=, =workflows/=, =scripts/= (same three rsyncs as =audit=).
5. Seed =PROJECT/.ai/notes.org= from a canonical template with project-name placeholder.
6. Create empty =PROJECT/.ai/sessions/= (with =.gitkeep= for tracked projects).
7. Track or gitignore =.ai/=? Default: ask. Flag: =--track= / =--gitignore=.
8. Print next-steps banner: =make install-lang LANG=<lang> PROJECT=<path>=; open Claude Code in the project.
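Steps 1 and 3-6 can be sketched in Python as follows. This is an illustration under stated assumptions, not the Makefile target: it swaps =shutil= copies in for the three rsync calls, the =notes.org= content is a made-up placeholder rather than the canonical template, and =install_ai= is a hypothetical name. The git check (step 2), the track/gitignore prompt (step 7), and the banner (step 8) are omitted.

#+begin_src python
import shutil
from pathlib import Path

CANONICAL_ITEMS = ("protocols.org", "workflows", "scripts")


def install_ai(project, canonical_ai, tracked=True):
    """Bootstrap PROJECT/.ai/ from the canonical .ai/ directory."""
    project = Path(project)
    canonical_ai = Path(canonical_ai)
    target = project / ".ai"
    if target.exists():
        # step 1: refuse rather than overwrite an existing install
        raise FileExistsError("already installed; use make audit --apply to update")
    target.mkdir(parents=True)  # step 3
    for name in CANONICAL_ITEMS:  # step 4 (copy instead of rsync)
        src = canonical_ai / name
        if src.is_dir():
            shutil.copytree(src, target / name)
        else:
            shutil.copy2(src, target / name)
    # step 5: placeholder content only; the real template is not shown here
    (target / "notes.org").write_text("#+TITLE: {} notes\n".format(project.name))
    sessions = target / "sessions"  # step 6
    sessions.mkdir()
    if tracked:
        (sessions / ".gitkeep").touch()
    return target
#+end_src

Raising on an existing =.ai/= keeps the bootstrap strictly opt-in: updating an installed project stays the job of =audit=.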
**** Symmetry with existing install targets
#+begin_example
make install-lang LANG=python PROJECT=/path # language bundle (existing)
make install-ai PROJECT=/path # .ai/ template (new)
make install-lang # no args → fzf-pick
make install-ai # no args → fzf-pick from
# ~/projects/* + ~/code/* dirs
# without an existing .ai/
#+end_example
*** DONE [#A] Test plan for audit + install-ai before propagating to ratio
CLOSED: [2026-05-15 Fri]
Test against the current state of this machine before pushing changes to ratio.
**** =make audit= tests
1. Dry-run report only (no =--apply=). Should show: claude-templates current; per-project drift; correct =ok=/=drift= classifications; summary line and exit code match.
2. After the fold lands, every project should be reported as drift (their =startup.org= still points at the old path). Run =--apply= → rsync converges. Re-run audit → all =ok=.
3. Manually edit one =.ai/workflows/foo.org= in a tracked project. Re-run audit → should report =skipped: uncommitted .ai/=. Run =--apply --force= → rsync clobbers the edit. Verify the edit is gone.
4. Manually delete one =.ai/= dir. Re-run audit → =FAIL: .ai/ missing=. Loop continues.
5. Idempotency: =--apply= twice in a row converges to all =ok= on the second pass.
**** =make install-ai= tests
1. Create =/tmp/test-fresh-project= as a git repo. Run =make install-ai PROJECT=/tmp/test-fresh-project=. Verify =.ai/= structure matches canonical, =notes.org= has placeholder, =sessions/= exists.
2. Run =make install-ai PROJECT=/tmp/test-fresh-project= again → should refuse (=.ai/= already exists).
3. Open Claude Code in the new project. Startup workflow runs cleanly (Phase A.0 + Phase A rsync should be a no-op since the install just ran).
4. fzf form: =make install-ai= with no args. Lists candidate dirs (=~/projects/*=, =~/code/*= without =.ai/=).
**** Pass criteria
- =audit= behavior matches the per-project flow spec for every classification path.
- =install-ai= produces a project indistinguishable from one that's been running sessions for a while.
- =make doctor= still passes 36/0/0 after all the work.
- =make test= (pytest + ERT) passes.
*** DONE [#A] Migrate projects on ratio (second machine)
CLOSED: [2026-05-15 Fri]
After local fold + audit + install-ai are working, propagate to ratio.
**** Steps
1. On ratio: =git -C ~/code/rulesets pull= — picks up the folded =claude-templates/= subdir and updated =Makefile= targets.
2. On ratio: archive or =mv= the standalone =~/projects/claude-templates/= aside, replace with symlink to =~/code/rulesets/claude-templates/= (same bridge mechanic as local).
3. On ratio: =make audit= → see drift across ratio's projects.
4. On ratio: =make audit --apply= → rsync into each tracked/gitignored project. Surface projects with uncommitted =.ai/= drift for manual handling.
5. On ratio: =make doctor= → catch any =~/.claude/= install drift (likely some, since ratio hasn't seen recent rulesets updates).
6. Verify by opening Claude Code in a few ratio projects. Startup should be a no-op or near-zero rsync.
**** Known unknowns
- Ratio's project list may overlap with this machine's without being identical. =audit= discovers projects via the walk, so the difference is handled automatically.
- Ratio might have uncommitted =.ai/= work in some projects that this machine doesn't. =audit= surfaces them; handle case-by-case.
- If anything goes wrong, ratio's archived =~/projects/claude-templates/= is the safety net — restore the symlink target and re-run audit.
**** Adjacent: cross-machine memory sync
The =[#A] DOING= memory-sync investigation (todo.org:10) is adjacent. Both involve "make my Claude setup portable across machines." Coordinate so the memory-sync stow approach (if approved) doesn't conflict with this fold's symlink mechanics.
** DONE [#B] Document startup pull-ordering rule in protocols.org
CLOSED: [2026-05-15 Fri]
Phase A.0 of =startup.org= now pulls rulesets ff-only before the project repo
(shipped 2026-05-15 as part of the claude-templates fold — after the subtree
merge, there's no separate claude-templates pull, just rulesets-then-project).
The protocols.org paragraph stating the ordering and the "resolve any issues
before proceeding" rule shipped 2026-05-15 in the =** Startup Pull Ordering=
subsection under =IMPORTANT - MUST DO=.
** DONE [#A] Build =/lint-org= skill + wrap-up integration
CLOSED: [2026-05-14 Thu]
Spec: [[file:.ai/specs/lint-org-skill-spec.md]]
A two-mode skill (=interactive=, =mechanical-only=) that runs =org-lint=,
auto-fixes safe categories (item-number, missing-language-in-src-block,
misplaced-planning-info, markdown-bold → single-asterisk), and walks judgment
items (broken local-file links, invalid fuzzy links, verbatim-asterisk false
positives, suspicious-language blocks) inline.
Wrap-up integration: =wrap-it-up.org= invokes
=/lint-org todo.org --mode=mechanical-only= after the existing
=todo-cleanup.el --archive-done= pass. Judgment items defer to a
carry-forward file that the next morning's daily-prep merges in, so
wrap-up never blocks on a judgment call.
Baseline that motivated this: the 2026-05-14 manual pass took =todo.org=
from 55 → 1 lint warnings across two commits (=0d10458= signal,
=9ad5b30= cosmetic). A nightly mechanical sweep keeps the count near
zero forever — each day's drift is small.
** DONE [#C] Test harness for =make audit= + =make install-ai= edge cases :test:
CLOSED: [2026-05-15 Fri]
Three edge cases from the fold-epic test plan were not exercised because they're destructive on real projects or need interactive input:
- =audit --force= clobbers uncommitted =.ai/= work — needs a project with intentionally dirty =.ai/= to verify the override path.
- =audit= reports =FAIL= when =.ai/= is missing — needs a project where the directory was deleted to verify the loop continues past the failure.
- =install-ai= fzf-pick form (no =PROJECT= arg) — needs interactive testing.
Build a self-contained test harness under =.ai/scripts/tests/= that spins up =/tmp/audit-test-projects/= with a known matrix of project states (clean, dirty, missing =.ai/=, pristine, etc.), runs the audit + install-ai targets against it, and asserts expected outputs. The harness should clean up after itself.
Pattern reference: bats or shell-based assertions (similar to the elisp ERT suites for =todo-cleanup= and =lint-org=, but for shell scripts).
Triggered by: 2026-05-15 fold-epic, child 4 test plan; commits =94782ee= (audit) + =d364cf2= (install-ai).
** DONE [#A] wrap it up mentions github, which isn't the remote for many projects. :chore:
CLOSED: [2026-05-16 Sat]
For many projects, git.cjennings.net is the remote and github.com is only a mirror.
For many others, git.cjennings.net is the remote with no mirror.
Remove or replace the reference to github.com.
** DONE [#B] Phase A startup blind to =claude-templates/inbox/= post-fold :bug:fold:
CLOSED: [2026-05-19 Tue]
Resolved on inspection: the bug is moot in current state. =inbox-send.py='s discovery scans =~/code/*= and =~/projects/*= single-level only, so =claude-templates/= (two levels under =~/code/=) is never a routable target; the 2026-05-15 incident was a one-time manual workaround because =rulesets/inbox/= didn't exist yet, and that root inbox was added in =470085f=. =claude-templates/inbox/= was removed 2026-05-15 and is no longer on disk.
Phase A's inbox check at =startup.org:107= runs =\ls -la inbox/= against the project root. Post-fold, the canonical's inbox sits inside the subtree at =claude-templates/inbox/= and never gets scanned. A 2026-05-15 cross-project handoff from a dotemacs session dropped a record there; the next rulesets session (this one) missed it at startup entirely. Picked up only when the working-tree drift surfaced during the publish flow.
Fix: extend Phase A's discovery to also scan =claude-templates/inbox/= when the canonical lives in-repo (i.e., when =claude-templates/.ai/= exists alongside =./.ai/=). The Phase B/C inbox-processing flow already handles per-file routing once a file is surfaced; the gap is only in discovery.
Adjacent question worth answering at the same time: should cross-project handoffs file into =./inbox/= at the project root (matching what Phase A already scans), or stay in =claude-templates/inbox/= and rely on the discovery fix? The =inbox-send= script's target-project logic is the place to settle that.
Triggered by: 2026-05-15 evening session, surfaced when committing the test-harness work.
** DONE [#A] Implement task-review daily-habit per spec
CLOSED: [2026-05-20 Wed]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-20
:END:
Spec: [[file:docs/design/task-review.org]]
Retires =wrap-it-up.org='s date-coverage scan and replaces it with a daily list-hygiene review (N=7 oldest-unreviewed top-level =[#A]= / =[#B]= / =[#C]= tasks per session, ~12-day rotation). Built as a pure Claude workflow — Shape B, no elisp; see the spec's Revision section for why the elisp approach was dropped.
Status:
1. [X] =task-review-staleness.sh= + bats (count + =--list= modes).
2. [X] =wrap-it-up.org= health check (threshold 30).
3. [-] =task-review.el= — dropped (Shape B is a pure workflow, not an Emacs mode).
4. [X] New =task-review.org= workflow + INDEX entry (the existing listing workflow was renamed to =open-tasks.org= to free the name).
5. [X] Startup nudge in template =startup.org= (threshold 7), not the project-only startup-extras layer.
6. [X] Smoke test against live =todo.org= — first cycle run 2026-05-20 (7 tasks reviewed: 3 re-grades, 1 cancellation, 1 bump-and-tag).
Triggered by: 2026-05-16 brainstorm on retiring the date-coverage scan.
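The selection rule (N oldest-unreviewed tasks per session) can be illustrated with a short Python sketch. The shipped workflow is a pure Claude workflow, so this is only a model of the ordering, with hypothetical names; it assumes never-reviewed tasks sort ahead of any dated =LAST_REVIEWED= value.

#+begin_src python
import math
from datetime import date


def pick_for_review(tasks, n=7):
    """Pick the n least-recently-reviewed tasks.

    tasks: list of (title, last_reviewed), last_reviewed a date or None.
    """
    # None sorts first (never reviewed), then oldest date first.
    ordered = sorted(tasks, key=lambda t: (t[1] is not None, t[1] or date.min))
    return [title for title, _ in ordered[:n]]


def rotation_days(task_count, n=7):
    """Sessions needed to touch every task once at n tasks per session."""
    return math.ceil(task_count / n)
#+end_src

As a purely illustrative figure, 84 open tasks at 7 per session would take 12 sessions to cycle through, which is the order of magnitude of the rotation quoted above.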
** CANCELLED [#B] Build =ov-1= skill for DoDAF OV-1 (High-Level Operational Concept Graphic)
CLOSED: [2026-05-20 Wed]
Cancelled during the 2026-05-20 task review.
Triggered by SOFWeek (May 2026, Tampa) — DeepSat attending; DoD attendees
may ask for architecture diagrams. OV-1 is the universal informal
currency in DoD briefings ("show me the architecture" → OV-1 by default).
Priority upgrades to =[#A]= if Craig confirms scenario 2 below (personal
load-bearing need at the event); stays =[#B]= or drops to =[#C]= if
scenario 1 (team already covers it, future asset only).
*** Prior art (searched 2026-04-19)
No Claude Code skill exists for DoDAF / OV-1 / SV-1 / SysML.
- =anthropics/skills= — 17 skills, zero DoDAF/SysML/defense coverage.
- =awesome-claude-code= list — zero hits for DoDAF/OV-1/SysML/UAF.
- =mfsgr/sysml2dodaf= — empty repo (0 stars, no code). Vapor.
- =HowardKao-1130/mini-NEXEN= — broad SE methodology skill that
name-drops DoDAF as a trigger keyword; no artifact generation. 0 stars.
- =gaphor/gaphor= (Apache-2.0, 2.2k stars) — mature UML/SysML GUI
modeler. Not a skill; not a pipeline. Useful reference only.
Nearest prior art to lean on when building:
- DoDAF 2.02 Viewpoints & Models reference (dodcio.defense.gov) —
canonical OV-1 exemplars. Embed 3-5 layouts as skill =references/=.
- Pattern from existing =c4-diagram= skill — same shape (prose → diagram
spec), swap the viewpoint vocabulary to DoDAF.
- PlantUML for SV-1 (when that skill comes later); Mermaid or draw.io
XML for OV-1 lightweight visuals.
*** Build scope (when triggered)
*In scope:*
- Input: prose description of a system + its operational context.
- Output: structured OV-1 *spec* — performers, external actors (other
systems, forces, adversaries), relationships (data/control flows),
narrative captions, classification marking, legend requirements.
- DoDAF 2.02 completeness checklist as a quality gate — verify the
produced spec contains every element a correct OV-1 requires.
- Optional lightweight visual: draw.io XML or Mermaid approximation for
quick review; NOT a finished rendering.
*Out of scope:*
- Icon libraries, pictorial assets, finished PowerPoint export. OV-1
final art belongs to a designer or Craig in Visio/PowerPoint; the
skill's job is the spec and the check, not the slide.
- SV-1, SV-2, UAF, IDEF1X, other viewpoints. Build only when a
concrete need triggers each.
Estimate: 4-6 hours.
*** Craig's investigation before kickoff
1. Does DeepSat's systems-engineering or marketing team already have an
OV-1 (or the equivalent briefing artifact) for SOFWeek?
2. If yes (scenario 1) — skill is a future asset, not event-load-bearing.
Ship after SOFWeek. Priority drops to =[#C]=.
3. If no, or if the scenario is "Craig may need to produce/iterate an
OV-1 on the fly during the event" (scenario 2) — skill is load-bearing
for the event. Priority upgrades to =[#A]=; build before SOFWeek.
4. Confirm the classification level the skill needs to handle
(unclassified-only? or FOUO markings? affects the classification
block in the spec).
5. Confirm the target rendering format DeepSat uses for OV-1
deliverables (PowerPoint slide? Cameo? Visio? affects whether the
skill emits draw.io XML vs Mermaid vs pure structured spec).
*** Related
See also the DoD-specific notations section under the later TODO
(=c4-*= rename revisit) — OV-1 is flagged there as the highest-value
starting point across the DoD notation landscape (SysML, DoDAF/UAF,
IDEF1X). This entry is the execution plan for that starting point.
** DONE [#A] Split team-specific publishing rules out of commits.md :commits:
CLOSED: [2026-05-22 Fri]
Shipped 3cb467e. Moved the DeepSat publishing steps (Linear ticket-state, the Slack notification protocol + channel ID, the GHE host, the team merge norm, the Linear ticket-body structure) out of the global =claude-rules/commits.md= into =teams/deepsat/claude/rules/publishing.md=. The global file keeps the universal skeleton and uses seams ("run the project's publishing overlay here if present") like startup-extras. Added =install-team= (targeted per-project copy, keyed on PROJECT, never globally symlinked) and generalized =sync-language-bundle.sh= to keep team overlays fresh at startup (3 new bats; make test green).
Remaining deploy step (cross-project, surfaced to Craig): install the overlay into the DeepSat work project — =make install-team TEAM=deepsat PROJECT=<deepsat-path>= — so it actually loads there.
** DONE [#A] Define a /voice-unavailable fallback in the commits.md publish flow :commits:
CLOSED: [2026-05-22 Fri]
Added an "If =/voice= is unavailable" paragraph to the Single-skill gate in =commits.md=: walk the same patterns inline (the flow already names which matter), state the skill was unavailable and the pass was applied by hand ("/voice unavailable — patterns walked inline"), and flag the missing skill for install. The gate is the pattern walk, not the tooling. The original "=humanizer= unavailable" framing was moot (humanizer → /voice).
** DONE [#A] wrap-it-up Step 3.5 assumes GitHub-family remote :chore:quick:
CLOSED: [2026-05-22 Fri]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-20
:END:
Documented the assumption inline at =wrap-it-up.org= Step 3.5 (chose the lightweight path over a provider-agnostic rewrite): the =gh= lookup expects a GitHub-family host, holds today via DeepSat on GHE, flagged for update if a future Linear project lands on GitLab/Gitea/Bitbucket.
Triggered by: 2026-05-16 wrap-it-up github.com cleanup (audit of the same file).
Step 3.5 (Linear ticket-state hygiene) at =wrap-it-up.org:207= says "the project's GitHub remote — use =gh pr list ...=". Currently fine in practice: the step is Linear-gated, and the only Linear-using project is DeepSat (on =deepsat.ghe.com=, a GitHub-family host where =gh= works). Would break if a future Linear-using project lived on a non-GitHub host (gitlab, gitea, bitbucket). Either drop the GitHub-family assumption (provider-agnostic lookup, harder) or document the assumption explicitly so future projects know the step needs an update if they don't fit.
** DONE [#C] Review pass: tighten skills and rulesets after 2026-05-04 audit
CLOSED: [2026-05-22 Fri]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-20
:END:
All 55 grouped-index items dispositioned (2026-05-22): ~49 edited across skills, commands, rule files, hooks, and the two playwright skills; several came out moot post-audit (humanizer→voice, skills→commands, typescript ruleset added); the two commits.md items shipped as the team-overlay split + /voice fallback. Freshness-checked each item against current reality before editing.
Source notes used in this pass:
- C4 official docs: C4 is notation-independent; System Context and Container
diagrams are enough for most teams; every diagram needs title, key/legend,
explicit element types, and audience-appropriate abstraction.
[[https://c4model.com/diagrams][C4 diagrams]],
[[https://c4model.com/diagrams/notation][C4 notation]],
[[https://c4model.com/abstractions/component][C4 component]]
- arc42 docs: quality requirements need measurable scenarios; section 10
should reference top quality goals and capture lesser quality requirements
with specific measures. [[https://docs.arc42.org/section-10/][arc42 section 10]],
[[https://quality.arc42.org/articles/specify-quality-requirements][specifying quality requirements]]
- ADR references: ADRs capture one justified architecturally significant
decision and its rationale; Nygard's original guidance emphasizes short,
numbered, repository-stored records and superseding rather than rewriting old
decisions. [[https://adr.github.io/][adr.github.io]],
[[https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions][Nygard ADR article]]
- Playwright docs: prefer user-visible locators and web assertions; locators
auto-wait and retry; =networkidle= is discouraged for testing readiness.
[[https://playwright.dev/docs/best-practices][Playwright best practices]],
[[https://playwright.dev/docs/locators][Playwright locators]],
[[https://playwright.dev/docs/next/api/class-page][Playwright page API]]
- OWASP references: Top 10 2021 includes Broken Access Control,
Cryptographic Failures, Injection, Insecure Design, Security
Misconfiguration, Vulnerable and Outdated Components, Identification and
Authentication Failures, Software and Data Integrity Failures, Security
Logging and Monitoring Failures, and SSRF; WSTG adds a broader testing map
across configuration, identity, authn/z, sessions, input validation, error
handling, cryptography, business logic, client-side, and API testing.
[[https://owasp.org/Top10/2021/][OWASP Top 10 2021]],
[[https://owasp.org/www-project-web-security-testing-guide/latest/4-Web_Application_Security_Testing/][OWASP WSTG]]
- V2MOM references: Salesforce calls the last M "Measures" and emphasizes a
simple alignment document with prioritized Methods, explicit Obstacles, and
measurable outcomes. [[https://trailhead.salesforce.com/content/learn/modules/selfmotivation/get-focused-with-your-personal-v2mom][Salesforce Trailhead personal V2MOM]],
[[https://www.salesforce.com/blog/?p=12][Salesforce V2MOM alignment]]
- Prompt research: the cited Meincke paper is titled "Call Me A Jerk:
Persuading AI to Comply with Objectionable Requests"; its scope is
persuasion increasing compliance with objectionable requests, not a general
proof that persuasion framing improves prompt quality.
[[https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5357179][SSRN paper]]
- Combinatorial testing references: NIST supports t-way combinatorial testing
and notes pairwise is one covering strength, with higher-strength arrays
useful for failures requiring more interacting factors.
[[https://www.nist.gov/publications/practical-combinatorial-testing-beyond-pairwise][NIST beyond pairwise]],
[[https://www.nist.gov/publications/combinatorial-software-testing][NIST combinatorial testing]]
*** Grouped index (for batching by area)
Each item below is a one-line summary of a sub-TODO further down. Tick the box when the matching sub-TODO is moved to =DONE=. Items are grouped by area so they can be batched (e.g., "do all Playwright items in one session").
**** Browser testing
- [X] [#A] =playwright-js=: locator/assertion-first guidance (replace raw CSS, =networkidle=)
- [X] [#B] =playwright-js= + =playwright-py=: reconcile headless/visible defaults
- [X] [#B] =playwright-js= + =playwright-py=: remove emoji console markers from examples
**** Frontend / UI
- [X] [#B] =frontend-design=: WCAG 2.2 alignment, accessibility non-optional
- [X] [#B] =frontend-design=: harmonize aesthetic guidance with anti-pattern rules
**** Security
- [X] [#A] =security-check=: OWASP 2021 + WSTG coverage
- [X] [#B] =security-check=: tooling and offline/network caveats
**** Combinatorial testing
- [X] [#B] =pairwise-tests=: t-way escalation guidance beyond pairwise
- [X] [#B] =pairwise-tests=: clarify negative value syntax + generator availability
**** V2MOM
- [X] [#A] =create-v2mom=: rename Metrics → Measures (Salesforce alignment)
- [X] [#B] =create-v2mom=: prevent task migration from turning V2MOM into a backlog
- [X] [#B] =create-v2mom=: mitigation/owner fields for Obstacles
**** Prompt engineering
- [X] [#A] =prompt-engineering=: correct/narrow Meincke citation
- [X] [#B] =prompt-engineering=: eval-harness requirement for production prompts
**** Codify
- [X] [#B] =codify=: stale-entry review + privacy checks before writing project =CLAUDE.md=
**** Code review
- [X] [#A] =review-code=: resolve local-verification vs CI boundary
- [X] [#B] =review-code=: =CLAUDE.md= citation scope for public artifacts
- [X] [#B] =review-code=: relax three-strengths rule for tiny/failing diffs
**** PR / review responses
- [X] [#A] =respond-to-review=: remove review-process language from commit messages
- [X] [#B] =respond-to-review=: use unresolved threads + resolution state
- [X] [#B] =respond-to-cj-comments=: drop personal absolute paths from public-writing (moot — already clean)
- [X] [#B] =respond-to-cj-comments=: fallback when =humanizer= or =emacsclient= unavailable (moot — superseded by /voice + VERIFY pattern)
**** Branch workflow
- [X] [#A] =finish-branch=: fix base-branch detection
- [X] [#B] =finish-branch=: worktree-aware pull/merge safety
- [X] [#B] =start-work=: tool-availability + ceremony-scaling rules
- [X] [#B] =start-work=: claim-before-justify rollback risk
**** Tests / TDD
- [X] [#B] =add-tests=: fix missing =typescript-testing.md= reference or add ruleset (moot — ruleset now exists)
- [X] [#B] =add-tests=: explicit exceptions to "all three categories per function"
**** Debugging / RCA
- [X] [#B] =debug=: capture environment + recent-change context before hypotheses
- [X] [#B] =root-cause-trace=: constrain defense-in-depth to trust boundaries
- [X] [#B] =five-whys=: require evidence + counterfactual validation per why
**** Brainstorming
- [X] [#B] =brainstorm=: timebox + research/source rules for high-stakes designs
**** Architecture
- [X] [#B] =arch-decide=: timeless examples, drop unverifiable claims
- [X] [#B] =arch-decide=: standardize statuses + immutability language
- [X] [#B] =arch-design=: threat modeling + privacy/compliance as first-class inputs
- [X] [#B] =arch-design=: separate paradigms from tactical patterns
- [X] [#B] =arch-document=: arc42/Q42 quality scenarios
- [X] [#B] =arch-document=: staleness + ownership metadata for generated docs
- [X] [#B] =arch-evaluate=: confidence levels for framework-agnostic findings
- [X] [#B] =arch-evaluate=: report skipped tool checks explicitly
**** C4 modeling
- [X] [#A] =c4-analyze= + =c4-diagram=: notation/output fallback (not draw.io-only)
- [X] [#B] =c4-analyze= + =c4-diagram=: clarify abstraction boundaries
**** Global rules
- [X] [#B] =commits.md=: split DeepSat/Linear/Slack-specific from global rules → promoted to a top-level task (deferred for Craig)
- [X] [#A] =commits.md= + publish flows: =humanizer=-unavailable fallback → promoted to a top-level task (deferred; humanizer premise moot)
- [X] [#B] =verification.md=: explicit "unable to verify" reporting standard
- [X] [#B] =testing.md=: property-based + mutation testing as escalation paths
- [X] [#B] =testing.md=: soften absolute TDD with explicit spike protocol
- [X] [#B] =subagents.md=: capability/availability + cost checks
**** Languages
- [X] [#A] =python-testing.md=: revisit in-memory SQLite guidance
- [X] [#B] =python-testing.md=: separate "never mock ORM" from unit-test boundaries
- [X] [#B] =elisp.md=: drop tool-specific advice
- [X] [#B] =elisp-testing.md=: batch-mode + native-comp caveats
**** Hooks
- [X] [#A] =hooks/README.md=: include =destructive-bash-confirm.py= in install/settings snippets
- [X] [#A] =hooks/git-commit-confirm.py= + =gh-pr-create-confirm.py=: inspect message/body files referenced by =-F= / =--body-file=
- [X] [#B] =hooks/destructive-bash-confirm.py=: shell-aware command parsing (not regex)
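To illustrate what "shell-aware parsing (not regex)" buys, here is a minimal Python sketch using the standard-library =shlex= module. It is not the code in =destructive-bash-confirm.py=: the function names are hypothetical, and recursive =rm= is used only as an example pattern. Tokenizing first means a quoted string that merely mentions =rm -rf= is not mistaken for a command.

#+begin_src python
import shlex

OPERATORS = {";", "&&", "||", "|", "&"}


def split_commands(command):
    """Split a shell line into simple commands, each a list of tokens."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    commands, current = [], []
    for token in lexer:
        if token in OPERATORS:
            if current:
                commands.append(current)
            current = []
        else:
            current.append(token)
    if current:
        commands.append(current)
    return commands


def is_recursive_rm(tokens):
    """True when the simple command is rm with a recursive flag."""
    if not tokens or tokens[0] != "rm":
        return False
    short = "".join(
        t[1:] for t in tokens[1:] if t.startswith("-") and not t.startswith("--")
    )
    return "r" in short or "R" in short or "--recursive" in tokens


def needs_confirmation(command):
    return any(is_recursive_rm(cmd) for cmd in split_commands(command))
#+end_src

Here =needs_confirmation("cd /tmp && rm -r -f x")= is true, while =needs_confirmation('echo "rm -rf /"')= is false. The sketch still misses wrapped invocations such as =sudo rm= and does not expand variables or subshells, so it shows the tokenizing step only.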
*** 2026-05-22 Fri @ 15:47:10 -0500 Made playwright guidance locator/assertion-first, dropped networkidle-as-readiness
Rewrote the readiness guidance in both =playwright-js/SKILL.md= and =playwright-py/SKILL.md=: reconnaissance now waits for a visible app landmark via a web assertion or locator (=expect(...).toBeVisible()= / =get_by_role(...).wait_for()=), not =networkidle= (which Playwright discourages). Updated the login/form examples to =getByLabel=/=getByRole= + web assertions, the API_REFERENCE.md waiting section, and =lib/helpers.js= defaults (=waitForPageReady= now defaults to =load= and prefers a caller-supplied landmark; =authenticate= races the success indicator over a =load= navigation). node --check passes.
*** 2026-05-22 Fri @ 14:23:02 -0500 Added headed/headless decision tables to both playwright skills
Added matching purpose-based decision tables to =playwright-js/SKILL.md= (was "always visible") and =playwright-py/SKILL.md= Best Practices (was "always headless"). Each names its own default and points at the other skill, so the difference is deliberate, not a habit-flip: headed for interactive debugging, headless for CI/pytest. Also softened the absolutist "Always launch... headless" comment in the py example.
*** 2026-05-22 Fri @ 15:47:10 -0500 Removed emoji console markers from the playwright skills
Replaced every emoji status marker with a plain ASCII prefix across =playwright-js/= (run.js, lib/helpers.js, SKILL.md) and =playwright-py/= (SKILL.md, examples/*.py): 📦/⚡/📄/📥/🎭/🚀/📋/✅/❌/🔍/📸/✓/✗ → =[setup]=/=[run]=/=[ok]=/=[error]=/=[fail]= etc. Post-change emoji grep is clean (excluding node_modules); node --check and py_compile pass.
*** 2026-05-22 Fri @ 14:35:16 -0500 Made accessibility a non-optional WCAG 2.2 gate in frontend-design
Added an "Accessibility Gate (required before handoff)" section to =frontend-design/SKILL.md= covering keyboard operation, focus visibility, focus-not-obscured (2.2), target size (2.2), contrast, reduced motion, labels, and semantic structure — a baseline for all frontend work, not just interactive components. Rewrote the Build/Review phases to build accessibly as you go and clear the gate before handoff, and bumped =references/accessibility.md= from WCAG 2.1 to 2.2 with backing detail for the new criteria.
*** 2026-05-22 Fri @ 14:35:16 -0500 Added a "creative but bounded" section to frontend-design
Added a subsection under Frontend Aesthetics framing the bold/maximalist directions as tools, not obligations: domain fit, readability first, responsive stability, and no decorative effect that degrades the workflow. Reconciles rather than contradicts the maximalist encouragement (maximalism stays on the table as deliberate usable density), and ties the readability bullet to the new accessibility gate.
*** 2026-05-22 Fri @ 14:35:16 -0500 Updated security-check to OWASP Top 10 2021 + WSTG mapping
Replaced the older six-category list in =.claude/commands/security-check.md= with the full Top 10 2021 set, each finding mapped to a 2021 category or WSTG area. Added the four missing categories (Insecure Design, Software and Data Integrity Failures, Security Logging and Monitoring Failures, SSRF) plus explicit checks for object/function-level authorization, SSRF on URL-fetch paths, update/plugin/dependency integrity, and logging/monitoring gaps.
*** 2026-05-22 Fri @ 14:35:16 -0500 Added scanner tooling + network caveats to security-check
Added an optional configured-scanners step (=gitleaks=/=trufflehog= secrets, =semgrep= source patterns, OSV scanner, lockfile-diff review) that supplements the manual scans, plus a network caveat: dependency audits that can't run (offline, tool absent, DB unreachable) must report "not run" naming the tool and reason, never read as a pass. Carried that into the no-issues summary.
*** 2026-05-22 Fri @ 14:35:16 -0500 Added t-way escalation guidance to pairwise-tests
Added an "Escalating Beyond Pairwise (t-way)" subsection: start with pairwise across the whole space, then escalate specific high-risk clusters to 3-way+ when history, safety, security, or domain coupling says a fault needs more than two interacting factors. Lists escalation triggers and shows the sub-model order syntax (={ A, B, C } @ 3=) vs a blanket =/o:3= bump, stressing targeted not uniform escalation. Cites NIST combinatorial-testing work.
*** 2026-05-22 Fri @ 14:35:16 -0500 Clarified PICT ~ syntax + honest generator-availability path in pairwise-tests
Added a "~ prefix" explanation (PICT marker tagging a value as negative/invalid, not an arithmetic operator; PICT pairs negatives with valid values once and strips the marker before the SUT) and a stop-at-the-model rule: if neither the =pict= binary nor =pypict= is present, produce the model and stop rather than hand-writing a table and passing it off as PICT output.
*** 2026-05-22 Fri @ 14:43:17 -0500 Renamed Metrics → Measures throughout create-v2mom
Full rename across =.claude/commands/create-v2mom.md= (acronym expansions, Phase 7 heading, the "Measures must be measurable" principle, exit criteria, review questions, red flags, examples) to match Salesforce's official term. Kept the "vanity metrics" idiom intact — it's the anti-pattern term, not a section reference.
*** 2026-05-22 Fri @ 14:43:17 -0500 Split strategy from execution in create-v2mom task migration
Rewrote Phase 8 (and tightened Phase 5.5): tasks stay in the backlog grouped by method, and each method gains a one-line link to where its tasks live, instead of transplanting the task tree into the V2MOM. Strategy (V2MOM) and execution (backlog) are now explicitly separate sources of truth, keeping the V2MOM concise.
*** 2026-05-22 Fri @ 14:43:17 -0500 Made create-v2mom obstacles operational (mitigation/owner/cadence)
Phase 6 now captures, per obstacle: name, manifestation, stakes, mitigation, owner, and review cadence — with a worked example per domain (health/finance/software), a "good obstacle" characteristic, a Phase 9 review question, and a red flag for candid-but-not-operational obstacles. An obstacle without a countermove is now flagged as an observation, not a plan.
*** 2026-05-22 Fri @ 14:43:17 -0500 Corrected and narrowed the Meincke citation in prompt-engineering
Fixed the title to "Call Me A Jerk: Persuading AI to Comply with Objectionable Requests" (SSRN abstract_id=5357179) in all three spots (frontmatter, Seven Principles intro, References). Reframed the ~33%→72% result as what it is — a prompt-safety caution that persuasion raises compliance with objectionable requests — explicitly not evidence that persuasion framing improves engineering prompt quality. Kept the seven principles as a tone vocabulary.
*** 2026-05-22 Fri @ 14:43:17 -0500 Added an eval-harness requirement to prompt-engineering critique mode
Added critique step 7 + a checklist line: for fragile or reusable/production prompts, write 3-5 adversarial/edge inputs, run both the old and new prompt against each, and record the behavioral delta. A throwaway prompt can ship on the rewrite alone; a discipline/reused/production one can't. Without examples, "the rewrite is better" is an assertion, not a result.
*** 2026-05-22 Fri @ 14:43:17 -0500 Added mandatory stale-entry + privacy pre-write checks to codify
Added a "Mandatory pre-write checks" block at the top of Phase 3 (Write) in =.claude/commands/codify.md=: a stale-entry scan (update/remove no-longer-true entries in place, don't append contradictions around them) and a privacy/leak check carrying both questions verbatim — "safe if the project were public?" and "belongs in private memory instead?" — routing private content to auto-memory. Gates, not background guidance.
*** 2026-05-22 Fri @ 14:06:41 -0500 Scoped review-code's CI-trust rule to reviewing, not shipping
Expanded the False-Positive Filter bullet in =review-code/SKILL.md=: "trust CI, don't run builds" applies to reading a diff, not producing one. A pre-commit/pre-push flow still owes the local verification =verification.md= requires (run the suite or state "not run because..."). Closes the apparent contradiction with =verification.md= / =finish-branch=.
*** 2026-05-22 Fri @ 14:06:41 -0500 Added private-vs-public CLAUDE.md citation modes to review-code
Expanded the Content scope section in =review-code/SKILL.md= with two modes: a private/internal review cites =CLAUDE.md= directly; a public/team review translates the rule into the engineering reason it encodes and doesn't name the rules file (a teammate can act on the reason, not on a file they can't reach). Same principle =commits.md= states for personal tooling in public artifacts.
*** 2026-05-22 Fri @ 13:48:14 -0500 Relaxed review-code "three strengths" to up-to-three-or-none
Changed all three "three minimum" spots in =review-code/SKILL.md= (Strengths section, Critical Rules DO list, Anti-Patterns) to "up to three specific; say none found on a tiny or weak diff." Reframed the old "No Strengths section" anti-pattern as "Skipping strengths out of laziness" so a substantive diff still demands them while a weak one can honestly report nothing notable. Landed alongside Craig's adjacent edit telling reviewers not to explain why a strength is good (sycophantic padding).
*** 2026-05-22 Fri @ 14:12:24 -0500 Removed review-process language from respond-to-review commit guidance
Replaced the =fix: Address review — [description]= example (and the matching description-line phrasing) in =.claude/commands/respond-to-review.md= with "name the actual fix (=fix: validate export filename=), not the review that prompted it." Killed the non-ASCII dash and the process-in-commit pattern that conflicted with =commits.md=.
*** 2026-05-22 Fri @ 14:12:24 -0500 Made respond-to-review fetch unresolved threads + resolve after verification
Rewrote section 1 (Gather) in =.claude/commands/respond-to-review.md= to pull =reviewThreads= via =gh api graphql= with =isResolved=, skipping already-resolved threads so settled feedback isn't re-processed; top-level conversation comments still come from REST. Added a section-4 step: reply and resolve a thread only after the fix is verified, never before.
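A minimal sketch of the unresolved-thread filter, assuming the response of =gh api graphql= has already been parsed into a dict. The query text and the function name are illustrative assumptions, not the command file's actual wording; only =reviewThreads= and =isResolved= come from the entry above.

#+begin_src python
# Hypothetical query; pass it with `gh api graphql -f query=... -F number=...`.
REVIEW_THREADS_QUERY = """
query($owner: String!, $repo: String!, $number: Int!) {
  repository(owner: $owner, name: $repo) {
    pullRequest(number: $number) {
      reviewThreads(first: 100) {
        nodes { id isResolved comments(first: 50) { nodes { body path } } }
      }
    }
  }
}
"""


def unresolved_threads(payload: dict) -> list:
    """Keep only threads that still need a response."""
    nodes = payload["data"]["repository"]["pullRequest"]["reviewThreads"]["nodes"]
    return [thread for thread in nodes if not thread["isResolved"]]
#+end_src

Filtering on =isResolved= before any other processing is what keeps settled feedback out of the work list.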
*** 2026-05-22 Fri @ 14:12:24 -0500 Verified respond-to-cj-comments no longer embeds an absolute path (moot)
Already resolved by a prior migration: =grep= for =/home/= and =/Users/= in =.claude/commands/respond-to-cj-comments.md= returns nothing. The public-writing section refers to the rules by name, not by local path. No edit needed.
*** 2026-05-22 Fri @ 14:12:24 -0500 Closed respond-to-cj-comments humanizer/emacsclient fallback (largely moot)
Overtaken by two later changes: =/humanizer= was replaced by =/voice personal= (no =/humanizer= invocation remains), and the mandatory =emacsclient= summary-open was replaced by the in-place VERIFY-task pattern (workflow line ~262, Craig's 2026-05-12 standing instruction). Only a stale descriptive phrase remained — tidied "humanizer's signs of AI writing" to "the signs of AI writing." The original fresh-environment-fallback concern no longer applies as written.
*** 2026-05-22 Fri @ 14:51:37 -0500 Fixed finish-branch base-branch detection
Rewrote Phase 2: resolve the base *branch name* in priority order (open PR's =baseRefName=, then =git symbolic-ref --short refs/remotes/origin/HEAD= stripped, then ask), and compute the merge-base *SHA* separately only where a commit range is needed. Made the branch-name-vs-merge-base distinction explicit, since the old command returned a SHA where a branch name was needed.
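The priority order can be sketched as a pure function. This is an illustration only: the function name and the convention of passing in already-captured command output are assumptions, and returning =None= stands in for "ask the user".

#+begin_src python
from typing import Optional


def resolve_base_branch(pr_base_ref: Optional[str],
                        origin_head: Optional[str]) -> Optional[str]:
    """Return a branch *name* (never a SHA), or None when the user must be asked.

    pr_base_ref: baseRefName of the open PR, if any.
    origin_head: output of `git symbolic-ref --short refs/remotes/origin/HEAD`,
                 e.g. "origin/main", if the ref exists.
    """
    if pr_base_ref:
        return pr_base_ref
    if origin_head:
        return origin_head.strip().removeprefix("origin/")
    return None
#+end_src

The merge-base SHA is deliberately absent here; it would be computed separately, and only where a commit range is needed.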
*** 2026-05-22 Fri @ 14:51:37 -0500 Made finish-branch merge safer + worktree-aware
Added pre-flight checks to Option 1 (Merge Locally): dirty-tree refusal with no auto-stash, protected-branch awareness, upstream-gated =git pull --ff-only=, and merge-commit-vs-rebase as a team-policy choice instead of a hardcoded =--no-ff=. Replaced the fragile =git worktree list | grep <branch>= detection with a =git rev-parse --git-dir= vs =--git-common-dir= comparison plus =git worktree list --porcelain= for the path.
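A rough sketch of the =--git-dir= vs =--git-common-dir= comparison, written in Python for illustration (the function names are hypothetical). In a linked worktree the two paths differ; in the main checkout they resolve to the same directory.

#+begin_src python
import os
import subprocess


def rev_parse_path(flag: str, cwd: str = ".") -> str:
    """Run `git rev-parse <flag>` and return the result as an absolute path."""
    out = subprocess.run(
        ["git", "rev-parse", flag],
        cwd=cwd, check=True, capture_output=True, text=True,
    ).stdout.strip()
    # git may print a path relative to cwd (for example ".git").
    return os.path.abspath(os.path.join(cwd, out))


def is_linked_worktree(git_dir: str, common_dir: str) -> bool:
    """True when the per-worktree git dir is not the shared common dir."""
    return os.path.realpath(git_dir) != os.path.realpath(common_dir)
#+end_src

A caller would pass =rev_parse_path("--git-dir")= and =rev_parse_path("--git-common-dir")=, then read the worktree path from =git worktree list --porcelain= only when the check returns true.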
*** 2026-05-22 Fri @ 14:51:37 -0500 Added tool-availability + ceremony-scale paths to start-work
Added a "Tool availability" section (graceful degradation when Linear MCP / =gh= / =/voice= / Playwright are missing — do what's available, surface what isn't, don't block) and a "Ceremony scale" section (trivial / small / standard tiers so a two-line fix skips ticket+branch+gates unless asked). The =humanizer= reference in the original item is moot — the file already uses =/voice= throughout.
*** 2026-05-22 Fri @ 14:51:37 -0500 Resolved start-work claim-before-justify rollback risk
Split the claim by tracker type: personal todo.org claims defer to after the Justify gate (a killed task needs no rollback), while team trackers (Linear/GitHub) still claim first to signal intent but record prior state (status, assignee, label) so the Phase 2 rollback restores exactly it. Updated the per-tracker rollback steps and the matching anti-pattern.
*** 2026-05-22 Fri @ 14:28:41 -0500 Verified add-tests typescript-testing.md reference resolves (moot)
Resolved since the audit: =languages/typescript/claude/rules/typescript-testing.md= now exists, and =add-tests/SKILL.md:68= references it by bare filename, the same way it references =python-testing.md= (both get copied into a project's =.claude/rules/=). The "missing file" premise no longer holds. No edit needed.
*** 2026-05-22 Fri @ 14:28:41 -0500 Added a category-exception protocol to add-tests
Added an exception note to step 7 (proposal) in =add-tests/SKILL.md=: pure adapters, generated code, tiny pass-through wrappers, and framework glue may skip a category that would only re-test the framework, but the skip must be stated and justified in the plan and the behavior covered at integration/E2E level — never a silent omission. Step 12 (write) now points back to "honor documented category exceptions."
*** 2026-05-22 Fri @ 14:25:37 -0500 Added environment + recent-change capture to debug Phase 1
Added a fourth Phase-1 step in =debug/SKILL.md=: record versions, feature-flag/config state, dataset/fixture, seed/clock, concurrency, and recent commits/config-infra changes. Noted that intermittent bugs usually live in environment/state transitions (and "what changed recently" is often the fastest route), while a deterministic local bug only needs a one-liner. Updated the phase's closing recap to include the context.
*** 2026-05-22 Fri @ 14:25:37 -0500 Constrained root-cause-trace defense-in-depth to boundaries
Rewrote step b in =root-cause-trace/SKILL.md=: instead of "add a check at each layer that could have caught it," add one only at a layer that owns a boundary or invariant — ingress/trust, persistence, invariant-owning service, final render. Added the explicit rule that a pass-through function owning neither shouldn't get a duplicate null check (validation spam). Recast the three example layers as the boundary types.
*** 2026-05-22 Fri @ 14:25:37 -0500 Required evidence + counterfactual per why in five-whys
Expanded step 2 in =five-whys/SKILL.md=: each link now owes an evidence field (a log/commit/metric/config you can point to) and a counterfactual check (remove this cause — does the symptom above plausibly not happen?). Framed the counterfactual as the main guard against monocausal storytelling, and updated the worked example to show both fields.
*** 2026-05-22 Fri @ 15:51:59 -0500 Added timebox + fresh-sources rules to brainstorm
Phase 1 gained a "Timebox the dialogue" rule (aim for the one-sentence restatement in ~5-8 questions, then move on and park the rest as open questions). Phase 2 gained "Ground high-stakes claims in fresh sources" (check load-bearing claims about markets/regulations/tools/vendors/APIs against a current source; mark unverified ones as assumptions). The design-doc skeleton gained an "## Assumptions" section that distinguishes researched facts (with source) from assumptions (to confirm before building).
*** 2026-05-22 Fri @ 14:59:32 -0500 Made arch-decide examples timeless + required citations
Dated the MongoDB multi-document-transaction example (scoped to 2024-01) with a backing reference, and added a "Cite, don't assert" Do: every concrete technical claim about a tool/version/platform carries a link, doc, version, or "checked YYYY-MM" date, or gets a domain-neutral placeholder — so unsourced "X can't do Y" doesn't rot into stale fact.
*** 2026-05-22 Fri @ 14:59:32 -0500 Standardized arch-decide ADR statuses + immutability rule
Declared a canonical five-status set (Proposed, Accepted, Rejected, Deprecated, Superseded) with an explicit "no synonyms" line, and spelled out the immutability rule in the Don'ts: an accepted ADR's body is frozen, only status/link metadata changes, a changed decision gets a new superseding ADR and the old one stays as the historical record.
*** 2026-05-22 Fri @ 14:59:32 -0500 Added Trust/Data/Compliance phase to arch-design
Added a new Phase 4 (Trust, Data, and Compliance) before the paradigm shortlist: trust boundaries, data classification, abuse/misuse cases, privacy constraints, compliance evidence, and operational ownership — surfaced early so the architecture is drawn around them, not retrofitted by a downstream =security-check=. Threaded into the workflow list, brief template (new §6), review checklist, and anti-patterns.
*** 2026-05-22 Fri @ 14:59:32 -0500 Split paradigms from tactical patterns in arch-design
Split Phase 5's single mixed table into Step 1 (pick one paradigm: monolith/microservices/layered/event-driven/serverless/pipeline/space-based) and Step 2 (compose tactical patterns: DDD, hexagonal, CQRS, event sourcing — several or none, often per-module), with composition examples and an anti-pattern against treating DDD/CQRS as alternatives to a paradigm. Recommendation + brief now name a paradigm plus composed patterns.
*** 2026-05-22 Fri @ 14:59:32 -0500 Expanded arch-document quality scenarios to the Q42 six-part template
Replaced §10's thin "Under [condition]..." template with the arc42/Q42 six-part structure (source, stimulus, environment, artifact, response, response measure), each glossed, with the cart-checkout example rewritten across all six parts. A one-line prose form stays acceptable once all six parts are recoverable.
*** 2026-05-22 Fri @ 14:59:32 -0500 Added staleness/ownership metadata to arch-document output
Added a per-section metadata block (owner, generated-against SHA + date, review cadence, "stale-when" conditions) as an HTML-comment header plus a visible Doc-status note, with field-fill guidance, and a whole-document Doc Status table replacing the README's "Last Updated" stub. Wired into the review checklist and an "Undated docs" anti-pattern.
*** 2026-05-22 Fri @ 14:59:32 -0500 Added confidence levels to arch-evaluate findings
Added a "Confidence and Provenance" subsection: every framework-agnostic finding carries High/Medium/Low + how it was determined, with a required "Not fully checked because..." note when scale, runtime imports, reflection, or dynamic dispatch cap certainty. Updated the example findings and review checklist; a finding with no note now asserts a full read.
*** 2026-05-22 Fri @ 14:59:32 -0500 Made arch-evaluate report skipped tool checks explicitly
Replaced "skip silently" with explicit reporting: for each detected language whose tool isn't configured or can't run, emit an Info "tool not configured / not run" finding (with an example) so the audit shows what was and wasn't verified. A check that didn't run no longer reads as a pass. Updated workflow step 4 and the review checklist.
*** 2026-05-22 Fri @ 14:51:37 -0500 Added notation/output fallback to c4-analyze + c4-diagram
Both commands now treat C4 as notation-independent: a "Choosing a notation" section (draw.io XML, Structurizr DSL, Mermaid with native C4 types, PlantUML/C4-PlantUML) and a headless fallback that emits a text notation (Mermaid or Structurizr DSL) and skips PNG-export/desktop-open when =drawio= or a GUI is absent, rather than failing. draw.io is now one option, not the only one.
*** 2026-05-22 Fri @ 14:51:37 -0500 Clarified C4 abstraction boundaries in c4-analyze + c4-diagram
Added an "Abstraction boundaries" section to both: a Container is a separately deployable/runnable unit (not synonymous with a Docker container — a SPA or managed DB counts), a Component lives inside one Container and isn't separately deployable. Added a 4e "Verify single abstraction level" check that walks every element and relationship to confirm it stays at the diagram's level, notation-independent.
*** 2026-05-22 Fri @ 15:10:35 -0500 Added "When You Cannot Verify" standard to verification.md
Added a section requiring, when a verification command can't run, a four-part report: command attempted, why it couldn't run, risk left unverified, and the smallest next command for the user. States the principle that a check that didn't run is never reported as a pass — "unable to verify" is a required honest outcome, not silence. Placed after Red Flags.
*** 2026-05-22 Fri @ 15:10:35 -0500 Added property-based + mutation testing escalation to testing.md
Added an "Escalation Beyond Category and Pairwise" section: property-based testing for invariants over a broad input domain (round-trips, idempotence, ordering — Hypothesis/fast-check/proptest) and mutation testing for when high line coverage hides thin assertions (mutmut/cosmic-ray/Stryker). Both framed as escalation paths to reach for on a gap, not gates on every unit.
*** 2026-05-22 Fri @ 15:10:35 -0500 Added a disciplined spike protocol to testing.md
Formalized the existing "I need to spike first" excuse-table row into a "Spike Exception (Disciplined)" subsection under TDD Discipline: TDD stays the default, but a spike is sanctioned when all three hold — timeboxed, spike code not committed, and the first failing test written before productionizing the discovered approach. Built on the existing row rather than contradicting it.
*** 2026-05-22 Fri @ 15:10:35 -0500 Added pre-dispatch availability + cost checks to subagents.md
Added a "Pre-Dispatch Checks" section with two gates: Availability (no Agent capability → do the work in the main thread under the same scope/constraints/output discipline the contract would enforce) and Cost (when writing the full contract costs more than the task, do it inline). Cross-references the existing "Don't Subagent At All" section and "Subagenting trivial work" anti-pattern rather than duplicating.
*** 2026-05-22 Fri @ 15:06:04 -0500 Revised python-testing SQLite guidance toward production-like DBs
Replaced "prefer in-memory SQLite for speed" with: run ORM/query tests against a production-like DB (same engine as prod, often containerized), since SQLite diverges from Postgres/MySQL on query semantics, constraints, transactions, JSON, time zones, and indexes (a test can pass on SQLite and fail in prod). SQLite stays only for pure unit tests with no DB-semantics dependency.
*** 2026-05-22 Fri @ 15:06:04 -0500 Clarified python-testing ORM-mocking boundary
Changed the "never mock" bullet from "ORM queries" to "ORM internals (querysets, sessions, model internals)" and added a paragraph: domain services use real model methods/validation, but a thin orchestration unit can inject a fake at a deliberate data-access port (a repository/interface the code owns). That's still mocking at a boundary, not at ORM internals.
*** 2026-05-22 Fri @ 15:06:04 -0500 Made elisp.md editing advice tool-agnostic
Rephrased the "prefer Write over repeated Edits" bullet around intent: land nontrivial Elisp as one cohesive change rather than dribbling it in over tiny partial edits (which accumulate paren mismatches), and run paren-balance + byte-compile checks immediately after, whatever editing mechanism the environment uses.
*** 2026-05-22 Fri @ 15:06:04 -0500 Added batch-mode + native-comp caveats to elisp-testing.md
Added three sections: Batch-Mode Reproducibility (=emacs --batch= as source of truth, no interactive-session state, no blocking prompts, deterministic), Isolating Emacs State (temp =user-emacs-directory=, explicit load-path, declared deps only, with an unwind-protect sandbox example), and Byte-Compile/Native-Comp Warnings (=byte-compile-error-on-warn=, native-comp gated on =native-comp-available-p= and kept opt-in/version-aware).
*** 2026-05-22 Fri @ 15:16:22 -0500 Synced hooks/README install snippets with the destructive hook (opt-in)
Brought the README's manual-install and settings-JSON snippets in line with the canonical =hooks/settings-snippet.json= (which already wires all three) and the Makefile's opt-in design: added the destructive-bash-confirm.py symlink as an opt-in step, added its settings entry, and reworded the note to say all three are no-op-safe but the destructive gate is opt-in (=make install-hooks= excludes it by default — link manually before relying on the snippet entry).
*** 2026-05-22 Fri @ 15:35:06 -0500 Hooks now scan file-backed commit/PR messages
Added =read_referenced_file()= to =_common.py= (safe local read: missing/oversize/non-UTF-8 → None) and wired it in: =git-commit-confirm.py= =extract_commit_message= now handles =-F=/=--file=/=--file===<path>= (reads + scans the file, falls through to UNPARSEABLE → asks if unreadable), and =gh-pr-create-confirm.py= reads =--body-file= content instead of a placeholder. Attribution scanning now sees the real committed/posted text. Built a pytest harness (=hooks/tests/=, importlib-by-path loader for the hyphen-named hooks) and wired =hooks/tests= into =make test=. 54 hook tests pass; full suite green.
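For orientation, a safe local read with the behavior described above might look like the following. This is a sketch, not the code in =_common.py=: the size cap is an assumed value, and the real signature may differ.

#+begin_src python
from pathlib import Path
from typing import Optional

MAX_BYTES = 64 * 1024  # assumed cap; the actual limit is not stated in this entry


def read_referenced_file(path: str, max_bytes: int = MAX_BYTES) -> Optional[str]:
    """Return the file's text, or None if it is missing, oversize, or not UTF-8."""
    candidate = Path(path)
    try:
        if not candidate.is_file() or candidate.stat().st_size > max_bytes:
            return None
        return candidate.read_bytes().decode("utf-8")
    except (OSError, UnicodeDecodeError):
        return None
#+end_src

Returning =None= for every failure mode lets the caller fall through to its existing "unparseable, so ask" path instead of guessing at the message.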
*** 2026-05-22 Fri @ 15:35:06 -0500 Rewrote destructive-bash rm parsing on shlex
=detect_rm_rf= now tokenizes with =shlex.split= instead of a whitespace split, so quoted/spaced paths and combined/separate/reordered flags (=-rf=, =-r -f=, =-fr=, =--recursive=/=--force=) all parse. Fails toward asking — returns a sentinel that still fires the modal — on unbalanced quotes or when a forced recursive rm coexists with a compound/pipeline/substitution/redirect construct. Documented the supported/unsupported shell constructs in the docstrings, and extended the dangerous-path banner to =$HOME=-prefixed and wildcard targets. Covered by 25 new tests. (Pre-existing, out-of-scope: path-prefixed =rm= like =/bin/rm= still isn't matched.)
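A simplified sketch of the tokenizing approach, assuming a plain-string sentinel (=ASK= is a hypothetical name). It omits the compound/pipeline/substitution/redirect check and the dangerous-path banner, so it should not be read as the hook's actual code.

#+begin_src python
import shlex

ASK = "ask"  # hypothetical sentinel: could not parse safely, so still prompt


def detect_rm_rf(command: str):
    """Return targets of a forced recursive rm, ASK if unparseable, else None."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes
        return ASK
    if not tokens or tokens[0] != "rm":  # path-prefixed rm is not matched
        return None
    recursive = force = options_done = False
    targets = []
    for tok in tokens[1:]:
        if options_done or not tok.startswith("-") or tok == "-":
            targets.append(tok)
        elif tok == "--":
            options_done = True
        elif tok.startswith("--"):
            recursive = recursive or tok == "--recursive"
            force = force or tok == "--force"
        else:  # short flags, possibly combined: -rf, -fr, -r, -f
            recursive = recursive or "r" in tok or "R" in tok
            force = force or "f" in tok
    return targets if (recursive and force) else None
#+end_src

Because =shlex.split= honors quoting, =rm -rf "my dir"= yields one target rather than two, which is the case the old whitespace split got wrong.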
** DONE [#B] Add =make remove= for interactive ruleset removal via fzf
CLOSED: [2026-05-22 Fri]
Shipped: =scripts/remove.sh= (three modes: =--list=, =--remove-selected= reading stdin, and the default fzf-multi interactive flow) + =make remove= target + =scripts/tests/remove.bats= (5 cases). Lists only symlinks resolving into the repo (foreign links left alone); removes the picked links while leaving repo sources untouched; reports and continues on a missing target; quiet no-op on empty selection. shellcheck clean, make test green. Dropped the stale =bridge= entry per the note below.
Add a Makefile target that lists every currently-installed ruleset entry
and lets me pick one or more to remove via fzf. Granular alternative to
=make uninstall= (removes everything) and =make uninstall-hooks= (removes
only hooks).
*** Why this matters
Tearing down a single skill, rule, hook, or config file currently means
either running =make uninstall= and re-installing what I want to keep,
or =rm=ing the symlink directly and remembering the exact path. Both are
friction. An interactive picker lets me filter, multi-select with Tab,
and confirm with Enter — the typical fzf flow. Costs about 3-5 seconds
per teardown instead of 15+ seconds of "what's the exact name?".
*** Design
The recipe builds a tab-separated list of every currently-installed item,
categorized by type, and pipes it to =fzf --multi=. The user filters,
marks with Tab, and confirms with Enter. The recipe parses the selections
and =rm=s the matching symlinks.
#+begin_example
skill debug
rule commits.md
hook destructive-bash-confirm.py
config settings.json
commands commands
bridge claude-rules
#+end_example
Each line is =<kind>\t<name>=. The recipe maps =<kind>= to the right path:
- =skill= → =$(SKILLS_DIR)/<name>=
- =rule= → =$(RULES_DIR)/<name>=
- =hook= → =$(HOOKS_DIR)/<name>=
- =config= → =$(CLAUDE_DIR)/<name>=
- =commands= → =$(CLAUDE_DIR)/commands=
- =bridge= → =$(SKILLS_DIR)/claude-rules=
Source files in =rulesets/= stay untouched. =make install= re-creates the
removed links if needed (the install loop is idempotent).
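The kind-to-path mapping above can be sketched as follows. The shipped tool is a shell script, so this Python version is only an illustration: the function name and the directory arguments are assumptions, and the stale =bridge= kind is left out.

#+begin_src python
from pathlib import Path


def resolve_selection(lines, skills_dir, rules_dir, hooks_dir, claude_dir):
    """Map picked `<kind>\\t<name>` lines to the symlink paths to remove."""
    roots = {
        "skill": skills_dir,
        "rule": rules_dir,
        "hook": hooks_dir,
        "config": claude_dir,
    }
    paths = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:  # empty selection (Esc, or filter matched nothing)
            continue
        kind, _, name = line.partition("\t")
        if kind == "commands":
            paths.append(Path(claude_dir) / "commands")
        elif kind in roots:
            paths.append(Path(roots[kind]) / name)
        else:
            raise ValueError(f"unknown kind: {kind!r}")
    return paths
#+end_src

Only the returned symlink paths would be removed; nothing under =rulesets/= is touched.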
*** Edge cases
- Esc instead of Enter → empty selection → clean exit, no removal.
- Filter to nothing then Enter → same as Esc.
- Selected item already gone → =rm= fails visibly, processing continues
on the rest.
- =fzf= not installed → fail fast with a clear error (matches the pattern
used by =install-lang=).
*** Possible extensions
- Parallel =make pick-install= target that lists not-yet-installed items
and installs the chosen ones. Symmetric UX, same fzf flow.
- Confirmation prompt when more than N items selected (defense against
accidental select-all).
- =--source= flag that also runs =git rm= against the rulesets source for
the selected item. Probably a bad idea: too easy to lose work.
- The =bridge → $(SKILLS_DIR)/claude-rules= entry above is stale — the
bridge symlink got removed in a later commit. Drop that bullet when the
recipe lands.
** DONE [#B] Document the =mcp/= install pipeline in =mcp/README.org=
CLOSED: [2026-05-22 Fri]
Wrote =mcp/README.org= covering everything in the "what to cover" list: the file layout (tracked vs gitignored), the secrets-bundle shape (plain =${VAR}= secrets + base64-bundled OAuth artifacts, AES256 symmetric =gpg -c=), the install flow (decrypt → materialize keys/token caches at mode 600 → expand → register unregistered, idempotent), the http/sse-vs-stdio transport split, token rotation when a Google refresh token is revoked, and adding a new server. Grounded in a read of the actual =install.py= + =servers.json=.
=mcp/= has =install.py=, =servers.json=, =secrets.env.gpg=, =gcp-oauth.keys.json= (gitignored, regenerated at install). No README. Coming back to this in three months, I'd have to re-discover how the bundle is structured, what =install.py= does, and how to rotate tokens. Saving that re-discovery is the whole point.
*** What to cover
- Layout: what each file is, which are tracked vs gitignored.
- Secrets bundle shape: how vars are listed in =secrets.env=, the symmetric-encryption pattern (=gpg -c --cipher-algo AES256=), the base64-bundled OAuth artifacts (=GCP_OAUTH_KEYS_JSON_B64=, =GOOGLE_DOCS_PERSONAL_TOKEN_B64=, =GOOGLE_DOCS_WORK_TOKEN_B64=).
- Install flow: =make install-mcp= → =install.py= decrypts, writes the keys file and Google Docs token caches at mode 600, expands =${VAR}= in =servers.json=, calls =claude mcp add --scope user= for unregistered servers. Idempotent.
- Token rotation: when a refresh token gets revoked, the recovery flow (re-auth on one machine, re-bundle, recommit).
- Adding a new server: edit =servers.json=, add any new =${VAR}= placeholders to the bundle, re-encrypt.
- The OAuth dance for HTTP-transport servers (linear, notion) versus stdio (google-docs-*) — different paths, different gotchas.
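As an illustration of the =${VAR}= expansion step in the install flow, here is a minimal sketch. It assumes the decrypted secrets are already available as a dict; the function name and the shape of the server entries are hypothetical, not taken from =install.py=.

#+begin_src python
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_vars(node, secrets):
    """Recursively replace ${VAR} in every string of a parsed JSON value.

    Raises KeyError for a placeholder with no matching secret, so a missing
    variable fails loudly instead of registering a broken server.
    """
    if isinstance(node, str):
        return _PLACEHOLDER.sub(lambda m: secrets[m.group(1)], node)
    if isinstance(node, dict):
        return {key: expand_vars(value, secrets) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_vars(item, secrets) for item in node]
    return node
#+end_src

Expanding after =json.load= rather than on the raw file text avoids a secret containing a quote or backslash corrupting the JSON.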
** DONE [#C] Add =make uninstall-mcp= + =mcp/install.py --check= for symmetry :feature:solo:quick:
CLOSED: [2026-05-28 Thu]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-28
:END:
Currently the MCP install pipeline flows in only one direction. No way to remove rulesets-managed MCP servers in one command. No way to ask "what's the drift between =servers.json= and =claude mcp list=" without eyeballing.
*** =make uninstall-mcp=
Iterate over =servers.json=, run =claude mcp remove <name> -s user= for each. Ignore "not registered" errors. Idempotent.
*** =mcp/install.py --check=
Dry-run mode. Decrypt secrets, but instead of registering, print the drift report:
- Servers in =servers.json= not in =claude mcp list= → =MISSING=
- Servers in =claude mcp list= not in =servers.json= → =EXTRA=
- Servers in both → =ok=
Useful for diagnosing connection failures and for the eventual =make doctor= integration.
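The three-way classification reduces to a set comparison. A minimal sketch, assuming the two name lists have already been extracted (parsing =claude mcp list= output is not shown, and the function name is hypothetical):

#+begin_src python
def drift_report(declared, registered):
    """Classify each server name as MISSING, EXTRA, or ok.

    declared:   names from servers.json
    registered: names reported by `claude mcp list`
    """
    declared, registered = set(declared), set(registered)
    report = {}
    for name in sorted(declared | registered):
        if name not in registered:
            report[name] = "MISSING"
        elif name not in declared:
            report[name] = "EXTRA"
        else:
            report[name] = "ok"
    return report
#+end_src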
** DONE [#C] Update =README.org= with MCP install pipeline section :chore:solo:quick:
CLOSED: [2026-05-28 Thu]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-28
:END:
=README.org= covers global install, per-project language bundles, and design principles, but doesn't mention =make install-mcp= or the =mcp/= directory. Add a short section after "Per-project language bundles" describing the user-scope MCP install pattern (decrypt → expand → register) and pointing at the eventual =mcp/README.org=.
** DONE [#C] Consolidate =claude-templates/Makefile= after fold :chore:quick:solo:
CLOSED: [2026-05-28 Thu]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-28
:END:
Sibling follow-up from the fold child (2026-05-15). After the subtree merge, =rulesets/claude-templates/Makefile= still has its standalone =install= / =uninstall= / =list= / =test-scripts= targets. The =install= target's =bin/ai= logic is now duplicated in =rulesets/Makefile=. Both work; the redundancy is harmless but worth cleaning up.
Options:
- *Delete* =claude-templates/Makefile= entirely — forces all install through rulesets root. Cleaner.
- *Strip down* to just =test-scripts= — the one piece not redundant with =rulesets/Makefile=.
- *Leave it* — slight redundancy, no functional harm.
Triggered by: 2026-05-15 fold session's refactor audit (commit =2d645fc=).
** DONE [#C] Run =--archive-done= sweep at start of =open-tasks.org= Phase A :chore:quick:solo:
CLOSED: [2026-05-28 Thu]
:PROPERTIES:
:CREATED: [2026-05-28 Thu]
:LAST_REVIEWED: 2026-05-28
:END:
From pearl handoff 2026-05-28. =open-tasks.org= Next Mode reads =* Project Open Work= and skips =* Project Resolved= correctly, but a level-2 task that completed during a session sits as =** DONE= under Open Work until something archives it. Between cleanups, a freshly-DONE task can surface as a "what's next" candidate.
Proposed fix: as the first step of =open-tasks.org= Phase A, run =emacs --batch -q -l .ai/scripts/todo-cleanup.el --archive-done todo.org=, then read =todo.org=. The cleanup tool already exists; this is wiring it into the workflow.
Cost: a few hundred ms at the start of every "what's next" invocation. Win: recommendations never include DONE work.
Optional refinement: gate behind a check for read-only / dry-run mode if that's ever introduced. The default invocation archives.
** DONE [#C] Triage Codex enhancement backlog :spec:
CLOSED: [2026-05-28 Thu]
:PROPERTIES:
:CREATED: [2026-05-28 Thu]
:LAST_REVIEWED: 2026-05-28
:END:
Triaged interactively 2026-05-28. Disposition table for all 14 items lives at [[file:docs/design/2026-05-28-rulesets-enhancement-backlog.org][2026-05-28-rulesets-enhancement-backlog.org]] under "Triage Dispositions": 3 accepted (filed below as TODOs), 3 pilot/scope-limited (filed below), 2 marked as conventions rather than tracked tasks, 6 rejected with rationale. Items #1 and #2 already had homes (#16 and the Phase-1 codex TODO).
** DONE [#C] Canonical/mirror drift detection via pre-commit hook or =make sync-check= :feature:quick:solo:
CLOSED: [2026-05-28 Thu]
:PROPERTIES:
:CREATED: [2026-05-28 Thu]
:LAST_REVIEWED: 2026-05-28
:END:
From the codex enhancement backlog (item #7), reframed: don't dedupe the dual source — the canonical-in-=claude-templates/= + mirror-in-=.ai/= pattern is a feature (other projects rsync from the canonical; the mirror lets rulesets-as-a-project have a working copy). The real pain is sync-discipline overhead — every workflow edit needs both copies updated, and forgetting one leaves the next startup's rsync to surface the drift.
Scope: write a small =scripts/sync-check.sh= (or fold into the existing Makefile) that diffs =claude-templates/.ai/workflows/= against =.ai/workflows/=, exits non-zero on drift. Wire as a pre-commit hook (=githooks/pre-commit= or equivalent) so the discipline is enforced before publish, not at the next startup. =make sync-check= as a manual entry point.
Verification: introduce a deliberate diff, commit, hook should block. Restore parity, hook should pass.
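The drift check itself could be sketched like this. The scope above calls for a shell script, so treat this Python version purely as an illustration of the comparison; =find_drift= is a hypothetical name.

#+begin_src python
import filecmp
from pathlib import Path


def find_drift(canonical: str, mirror: str) -> list:
    """Return sorted relative paths that differ or exist on only one side."""
    drift = []

    def walk(cmp, prefix):
        # dircmp compares shallowly: files whose type, size, and mtime all
        # match are treated as equal without reading their contents.
        for name in (cmp.left_only + cmp.right_only
                     + cmp.diff_files + cmp.funny_files):
            drift.append(str(prefix / name))
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    walk(filecmp.dircmp(canonical, mirror), Path("."))
    return sorted(drift)
#+end_src

A pre-commit wrapper would print the returned paths and exit non-zero when the list is non-empty.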
** DONE [#C] Add =make status= — compose audit + doctor + open-task count :feature:quick:solo:
CLOSED: [2026-05-28 Thu]
:PROPERTIES:
:CREATED: [2026-05-28 Thu]
:LAST_REVIEWED: 2026-05-28
:END:
From the codex enhancement backlog (item #12), scope-limited: =make status= only. Reject the rest of #12 (=make sync= duplicates the existing sync flow; =make health= wraps existing checks without adding signal; =make bootstrap-project= duplicates =install-ai= + =install-lang=).
Scope: one Makefile target that prints a compact summary of:
- Install audit state (clean / drift, calling =make audit=).
- Machine-global doctor state (calling =make doctor=).
- Open-task count (top-level entries in =todo.org= under =* Rulesets Open Work=).
- Inbox count (files in =inbox/= excluding =.gitkeep= and =PROCESSED-= prefixes).
- Git working-tree status (clean / dirty, ahead/behind upstream).
Output should be roughly 10 lines, scannable in one glance. Composes the existing checks; no new logic except the summary formatting.
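The two counts that need new logic could be sketched as follows. Both function names are hypothetical. The sketch assumes "top-level entries" means level-2 headings directly under the named level-1 heading, and that the section heading carries no tags.

#+begin_src python
from pathlib import Path


def count_inbox(inbox: Path) -> int:
    """Count inbox files, skipping .gitkeep and PROCESSED- prefixed files."""
    return sum(
        1
        for p in inbox.iterdir()
        if p.is_file()
        and p.name != ".gitkeep"
        and not p.name.startswith("PROCESSED-")
    )


def count_open_tasks(todo_text: str, section: str = "Rulesets Open Work") -> int:
    """Count level-2 headings under the named level-1 section."""
    count, inside = 0, False
    for line in todo_text.splitlines():
        if line.startswith("* "):
            # A new level-1 heading opens or closes the section of interest.
            inside = line[2:].strip() == section
        elif inside and line.startswith("** "):
            count += 1
    return count
#+end_src

Deeper headings and body lines are ignored, so sub-entries of a task do not inflate the count.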
** DONE [#C] Iteration-history backfill for spec-review and spec-response :docs:followup:
CLOSED: [2026-05-28 Thu]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-28
:END:
Source: org-drill inbox 2026-05-28.
Once the in-flight WIP lands (the requirement that specs carry a bottom =Review and iteration history= section, with iteration / date / contributor / role / what / why / artifacts), backfill the two workflow files themselves using rulesets' session history as evidence.
Files to update:
- =claude-templates/.ai/workflows/spec-review.org=
- =claude-templates/.ai/workflows/spec-response.org=
Investigation: search =.ai/sessions/=, =.ai/notes.org=, inbox archive, and git log for mentions of these workflow docs. Identify review/response/design iterations, dates, and contributors (including agents where known: Claude Code, Codex, local models). Distinguish high-confidence history (commits, dated session entries) from inferred (chat-only context). Recommend whether enough evidence exists to populate the section, and draft the entries if so.
Dependency: spec-review.org and spec-response.org have uncommitted edits in flight. Wait for those to land before writing to the files. The read-only research portion (search sessions, identify iterations, draft entries to a scratch file) can run in parallel without conflict.
** DONE [#B] Startup Phase A rsync propagates dirty rulesets WIP into downstream projects :feature:
CLOSED: [2026-05-30 Sat]
:PROPERTIES:
:CREATED: [2026-05-29 Fri]
:LAST_REVIEWED: 2026-05-29
:END:
Fixed via option 1 (skip-when-dirty), scoped to the synced source paths: startup.org Phase A now guards the protocols/workflows/scripts rsyncs behind a =git status --porcelain= check on =claude-templates/.ai/{protocols.org,workflows/,scripts/}=, skipping the sync when any are dirty. The propagation anomaly (cross-project-broadcast.org / page-signal.org not reaching jr-estate) was a timeline artifact: both files were added in 664bf01 on 2026-05-29, after jr-estate's Phase A rsync had already run — correct behavior, not a bug.
From jr-estate handoff 2026-05-29. When rulesets has uncommitted WIP at the moment a downstream project starts a session, Phase A.0 reports "dirty, skipping pull" and proceeds. Phase A's =rsync -a --delete= then runs against the dirty rulesets working tree and copies the WIP state into the downstream project's =.ai/workflows/= and =.ai/scripts/=. The downstream project's =git status= then shows drift the user did not author. Two bad recovery paths: commit the drift as "chore: sync .ai tooling from templates" (creates fake commit history about template state) or leave it dirty (noisy wrap-ups, pressure to commit anyway).
Three options proposed in the handoff:
1. *Skip-when-dirty.* Make Phase A's workflows/ and scripts/ rsync no-op when Phase A.0 reports rulesets dirty. Simplest defense.
2. *Clean-files-only.* Restrict the rsync to files git considers unmodified in rulesets. Untracked files in rulesets do not propagate. Most precise.
3. *Clean-ref-based.* Cache the last-known-clean state as a git tag or ref and rsync from that ref rather than the working tree. Most decoupled, also the most infrastructure.
Recommendation (mine): option 1. The downstream impact of skipping a sync once is small (the next session with rulesets clean catches up), and the implementation is one =if [ "$dirty" -eq 0 ]= guard around the existing rsync block. Option 2 adds shellout complexity per file; option 3 requires tagging discipline that has no other reason to exist.
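The skip-when-dirty guard can be sketched as below, in Python rather than the shell guard the workflow uses. The function names are hypothetical; the paths are the synced source paths named in the fix summary. Any line of =git status --porcelain= output, tracked or untracked, counts as dirty.

#+begin_src python
import subprocess

# Synced source paths from the fix summary above.
SYNCED_PATHS = [
    "claude-templates/.ai/protocols.org",
    "claude-templates/.ai/workflows/",
    "claude-templates/.ai/scripts/",
]


def is_dirty(porcelain_output: str) -> bool:
    """Any non-blank porcelain line means modified or untracked content."""
    return any(line.strip() for line in porcelain_output.splitlines())


def should_sync(rulesets_root: str) -> bool:
    """Return True only when the synced paths have no uncommitted changes."""
    result = subprocess.run(
        ["git", "-C", rulesets_root, "status", "--porcelain", "--", *SYNCED_PATHS],
        capture_output=True,
        text=True,
        check=True,
    )
    return not is_dirty(result.stdout)
#+end_src

Scoping the status call to the synced paths means unrelated WIP elsewhere in rulesets does not block the sync.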
The original handoff also noted a related anomaly: even with =--delete=, two files that DO exist in rulesets canonical (=cross-project-broadcast.org=, =page-signal.org=) did NOT propagate to jr-estate. Worth confirming whether that was a transient rsync issue or evidence of a deeper Phase A bug. Could be ordering: those files were added to rulesets AFTER the jr-estate Phase A rsync ran, in which case the behavior is correct and the report is misreading the timeline.
Source: =inbox/2026-05-29-0832-from-jr-estate-investigate-startup-rsync-carried-dirty.org= (processed and deleted).
** DONE [#B] Codex Phase 1 — AI_AGENT_ID + session-context.d/<id>.org :feature:
CLOSED: [2026-05-30 Sat]
:PROPERTIES:
:CREATED: [2026-05-28 Thu]
:LAST_REVIEWED: 2026-05-28
:END:
Shipped backward-compatibly. New =.ai/scripts/session-context-path= helper resolves the active path from =AI_AGENT_ID=: unset → the legacy =.ai/session-context.org= singleton (one-agent default unchanged, per the spec's compatibility rule), set → =.ai/session-context.d/<sanitized-id>.org=. startup.org's existence check and wrap-it-up.org's rename now resolve through the helper (with a singleton fallback for older checkouts); wrap folds the agent id into the archive name. protocols.org documents the rule. Verified: 5 bats cases + a two-agent simulation showing distinct paths per id. Larger runtime-neutral arc (runtimes/ manifests, launcher refactor) stays parked under the parent spec.
Lifted from the broader codex runtime spec ([[file:docs/design/2026-05-28-generic-agent-runtime-spec.org]]) as the immediate-correctness slice independent of the larger arc. The singleton =.ai/session-context.org= is unsafe under simultaneous agents — two LLMs running in the same project at the same time would overwrite each other's session state.
Scope: introduce an =AI_AGENT_ID= environment variable and split the single =session-context.org= into a per-agent =session-context.d/<id>.org= directory. No other phases of the runtime refactor are in this task — keep the surface small, fix the race, ship.
Touches: =.ai/protocols.org= (rename rule + recovery anchor), =.ai/workflows/startup.org= (Phase A check), wrap-up workflow (rename target), per-project session record discoverability.
Verification: simulate two agents sharing a project (separate AI_AGENT_ID values) and confirm session-context writes land in distinct files without interleaving.
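The resolution rule can be sketched as below. This is an illustration, not the shipped =session-context-path= helper: the sanitization rule and the "unknown" fallback are assumptions, and treating an empty =AI_AGENT_ID= like an unset one is also an assumption.

#+begin_src python
import re
from pathlib import PurePosixPath


def sanitize_agent_id(agent_id: str) -> str:
    """Assumed rule: keep letters, digits, dot, underscore, hyphen."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "-", agent_id).strip("-.")
    return cleaned or "unknown"


def session_context_path(env: dict) -> PurePosixPath:
    """Unset id keeps the legacy singleton; a set id gets its own file."""
    agent_id = env.get("AI_AGENT_ID", "")
    if not agent_id:
        return PurePosixPath(".ai/session-context.org")
    name = f"{sanitize_agent_id(agent_id)}.org"
    return PurePosixPath(".ai/session-context.d") / name
#+end_src

Two agents with distinct ids resolve to distinct files, which is the property the two-agent simulation checks.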
Parent: see [[Generic agent runtime support — Codex spec v0]] above for the larger arc this is sliced from.
** DONE [#C] Decide on category-3 rule copies in the deepsat tree :chore:quick:solo:
CLOSED: [2026-05-31 Sun]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-28
:END:
Diffed 2026-05-31. Both copies (coding-rulesets vendored + orchestration_dashboard_mvp) are byte-identical to each other and stale against canonical: =testing.md= 221 lines behind with 5 lines unique to the copies (older wording or a small team tweak), =verification.md= 40 behind with nothing unique. Same older vendored version in both spots. Left untouched per the A1 decision — team-owned, and canonicalizing would create a cross-repo dependency on the private rulesets (the orchestration_dashboard_mvp pair is team-visible from Vrezh's PR thread). No files modified.
While symlinking personal-project =.claude/rules/= mirrors to the rulesets canonical on 2026-05-07, two locations didn't fit the "personal mirror → symlink" pattern and were left untouched pending judgment:
- =~/projects/work/deepsat/code/coding-rulesets/claude-rules/{testing,verification}.md= — looks like a vendored team-shared copy.
- =~/projects/work/deepsat/code/orchestration_dashboard_mvp/.claude/rules/{testing,verification}.md= — could be project-specific overrides.
For each: read the file, diff against the rulesets canonical, decide whether it's an intentional divergence (leave alone), stale (sync content), or a candidate to canonicalize (replace with symlink and accept the cross-repo dependency). The orchestration_dashboard_mvp pair is the project where Vrezh's PR review surfaced this whole thread, so any decision there has team-visibility implications.
Decision (Craig, 2026-05-31): *leave team-tree copies alone.* Personal rulesets does not reach into team repos — canonicalizing would create a cross-repo dependency on the private rulesets, and the orchestration_dashboard_mvp copy is team-visible. This makes the task solo: diff each copy against canonical, record whether it's identical / drifted / overridden in the disposition, and close as "left alone (team-owned)" without modifying the team-tree files.
** DONE [#C] Audit language-specific rule files for cross-project duplication :chore:solo:
CLOSED: [2026-05-31 Sun]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-28
:END:
Audited 2026-05-31. Findings: in sync with canonical (=languages/<lang>/claude/rules/=) — work =python-testing.md=, deepsat =typescript-testing.md=, =.emacs.d= =elisp-testing.md= + =elisp.md=. Drifted — =gloss= and =chime= (byte-identical to each other): =elisp-testing.md= 44 lines behind (canonical added Batch-Mode Reproducibility + Isolating Emacs State; zero lines unique to the copies), =elisp.md= one line behind (canonical expanded the edit-cohesively guidance). No project-specific additions anywhere — every copy is either current or purely stale.
Disposition: *leave them project-local* (the task's own option). The language-rule copies in code projects are the bundle's deliberate copy-and-sync model, not the symlink pattern the generic rules (commits/testing/verification/subagents) use in personal doc-projects. =sync-language-bundle.sh= auto-fixes drifted bundle rules on each startup, so gloss/chime self-heal the moment those projects next boot — no canonicalize/symlink needed, and symlinking would fight the bundle model. Did not reach into work/deepsat/gloss/chime/.emacs.d from here (cross-project boundary; team copies left alone per the 2026-05-31 category-3 decision).
The four canonical rules (=commits=, =testing=, =verification=, =subagents=) are now symlinked across the five personal-project mirrors as of 2026-05-07. But several language-specific rule files exist in multiple project mirrors and may be duplicated or drifted:
- =python-testing.md= in =~/projects/work/.claude/rules/=
- =typescript-testing.md= in =~/projects/work/deepsat/code/.claude/rules/=
- =elisp-testing.md= and =elisp.md= in =~/.emacs.d/=, =~/code/gloss/=, =~/code/chime/=
The Elisp pair is the most suspicious — three repos using essentially the same rules. Audit: diff these across the projects, check for drift, then decide whether to canonicalize them under =~/code/rulesets/claude-rules/languages/<lang>/= and symlink, or leave them as project-local.
** DONE [#C] Refactor =daily-prep.org= to delegate to =triage-intake.org= for the triage section :chore:solo:
CLOSED: [2026-05-31 Sun]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-28
:END:
Collapsed Phase 3's inline source scans (sub-steps 3b email / 3c mark-read / 3d Slack / 3e Linear / 3f PRs / 3g dedup, ~280 lines) into four: 3b runs the triage-intake engine, 3c surfaces today's reactive items as Day's Priorities thin links, 3d re-sorts by urgency, 3e writes the audit footer from the engine's coverage. Source coverage carries via the engine's Phase 0 two-dir glob (general + .ai/project-workflows/ plugins), so the work account's Gmail/Slack/Linear/GHE plugins still get scanned. Adapted the downstream refs (Prep Doc Structure rule, Heads-up FYI source, Recommended Approach Pattern reframed as engine-applied), removed the orphaned Linear-digest note, added a Living Document entry. Verified: workflow-integrity clean (no dangling script refs), sync-check clean, full suite green. daily-prep.org went 825 → 576 lines.
=daily-prep.org= still does its own inline triage (Gmail × 3 accounts, Slack, Linear, GHE PRs, calendars) as part of the full prep flow. =triage-intake.org= is now a source-agnostic engine that loads =triage-intake.<source>.org= plugins (refactored 2026-05-26), so daily-prep could call the engine and consume its synthesis instead of duplicating the source-scan logic. That DRYs up a large workflow and keeps both flows in sync when sources change — a source change now lives in one plugin that both flows pick up.
Scope:
- Identify the sections in =daily-prep.org= that do the inline triage (the email / Slack / Linear / PR / calendar fan-out, plus the "Sources checked: ..." footer at the top of each generated prep doc).
- Replace those sections with "run the =triage-intake.org= engine" and adapt the downstream sections (Heads-up, Day's Priorities, Carry-forwards) to read the engine's synthesis output rather than the inline scan results.
- Verify the generated prep doc still has the same shape (Heads-up + Day's Priorities + Carry-forwards + Sources checked).
- Reconcile source coverage: daily-prep's inline triage scans work accounts (3 Gmail, Slack, Linear, GHE PRs) that are project-specific plugins under =.ai/project-workflows/=, not general plugins. The delegation must ensure the engine loads those project plugins (Phase 0 globs both dirs) so nothing daily-prep currently scans drops out.
Origin: came up while authoring =triage-intake.org= on 2026-05-11; body refreshed after the engine/plugin refactor on 2026-05-26.
** DONE [#C] Templatize =make coverage-summary= into the language bundles (Elisp pilot) :feature:solo:
CLOSED: [2026-05-31 Sun]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-28
:END:
Done 2026-05-31 (Elisp pilot, the scoped milestone): ported the kernel into the elisp bundle as a self-contained =languages/elisp/claude/scripts/coverage-summary.el= (no coverage-core dependency), proven end-to-end against the real dotemacs SimpleCov report (93 tracked, 27 untested modules surfaced, project number 66.4%). The missing-file-as-0% + unit-weighted number is the kernel. Delivery: the script ships under =.claude/scripts/= (gitignored, auto-fixed on drift by =sync-language-bundle.sh=); =languages/elisp/coverage-makefile.txt= holds the project-owned Makefile fragment, seeded at project root by =install-lang.sh= and dropped into =.ai/inbox/= by sync when that convention exists. Tests: 12 ERT (=languages/elisp/tests/=, wired into =make test=), 5 new sync bats, 2 new install-lang bats. The fan-out to Python/Go/TS is the follow-up below.
Borrow dotemacs's =make coverage-summary= into the language bundles. After =make coverage= writes a coverage file, =coverage-summary= prints per-unit covered/total with percentages, a unit-weighted project number, and a list of source files present on disk but missing from the coverage report.
*The kernel — the only part worth building.* Weight the project number by file/module rather than by line, and count a source file absent from the report as 0% instead of omitting it. A module no test imports just doesn't appear in coverage.py or nyc output, so it silently fails to drag the number down. That missing-file detection is the value; everything else (per-file table, total) the built-in reporters already print, so don't reimplement those.
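A minimal sketch of that kernel, in Python for illustration (the pilot itself is Elisp). The function name and input shapes are hypothetical. Skipping files with zero statements is an assumption the text above does not settle.

#+begin_src python
def project_coverage(
    report: dict[str, tuple[int, int]], on_disk: list[str]
) -> tuple[float, list[str]]:
    """Return (file-weighted percent, source files missing from the report).

    report maps path -> (covered, total); on_disk lists every source file.
    """
    missing = sorted(set(on_disk) - set(report))
    # A file absent from the report counts as 0% instead of being omitted.
    percents = [0.0] * len(missing)
    for covered, total in report.values():
        if total > 0:  # assumption: files with nothing to cover are skipped
            percents.append(100.0 * covered / total)
    number = sum(percents) / len(percents) if percents else 0.0
    return number, missing


report = {"a.py": (10, 10), "b.py": (5, 10)}
number, missing = project_coverage(report, ["a.py", "b.py", "c.py"])
print(number, missing)  # 50.0 ['c.py']
#+end_src

A line-weighted total over the same report would read 75% (15 of 20 lines) and never mention =c.py=; the file-weighted number with the missing file at 0% drops to 50%.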
*Scope Elisp-first.* Port the proven dotemacs version into the elisp bundle, prove the pattern end-to-end, then fan out. Don't open all four bundles at once.
*Delivery (settled 2026-05-25).* Two rulesets-owned pieces per language:
- The summary *script* ships in the bundle under =.claude/= (inside the now-gitignored tooling footprint), copied in on install and auto-fixed on drift by =sync-language-bundle.sh=, never committed by the project.
- One *text file per language* holding the Makefile fragment (the =coverage-summary= target plus its =coverage= prerequisite) and a block recommending how to set up coverage for that language. The bundle never edits the project's own Makefile.
- *New project:* install copies that file in for the project to own.
- *Existing project:* sync drops the fragment into the project's =inbox/= rather than touching its Makefile — the project adopts it deliberately.
*Prerequisite caveat.* The summary presumes a coverage harness exists (undercover, coverage.py, nyc, =go cover=). Several bundles may have no =make coverage= yet, so for those this task implies adding the harness first — or the per-language file documents it as a prereq.
Per-language parser (the script is ~40 lines over each tool's output):
- Elisp: undercover SimpleCov JSON (=.coverage/simplecov.json=) — dotemacs/auto-dim scripts already parse this.
- Go: =go test -coverprofile=cover.out=; parse =cover.out= (simple text), or lean on =go tool cover -func=.
- Python: =coverage json= per-file JSON, or lean on =coverage report=.
- TypeScript/JS: nyc/Istanbul =coverage-final.json= / json-summary.
Reference (dotemacs): =scripts/coverage-summary.el=, =modules/coverage-core.el=, and the =coverage= / =coverage-summary= Makefile targets.
Origin: handoff from the .emacs.d session, 2026-05-25.
** DONE [#C] Fan out coverage-summary across all language bundles :feature:
CLOSED: [2026-05-31 Sun]
:PROPERTIES:
:CREATED: [2026-05-31 Sun]
:END:
Done 2026-05-31: coverage-summary now ships in all four bundles. Elisp pilot, then Python, Go, and TypeScript. Each parses its tool's report (SimpleCov / coverage.py JSON / Go cover.out / Istanbul json-summary), counts on-disk source files absent from the report as 0%, and file-weights the project number. The plumbing proved generic: =install-lang.sh= seeds the project-owned =coverage-makefile.txt= and ships the script into the gitignored =.claude/scripts/=; =make test= discovers ERT (=test-*.el=), pytest (=test_*.py=), =go test= (=*_test.go=), and =node --test= (=*.test.js=) under =languages/*/tests/=, each guarded on its toolchain. TypeScript and Go scripts were dogfooded (Go against a live profile, TS against the CLI); Python and TS weren't run against a live coverage tool (coverage.py / nyc not installed) — proven against faithful fixtures matching each tool's stable schema.
Remaining follow-ups (not blockers):
- Go is a coverage-only slice — =languages/go/= has no rule file, so =sync-language-bundle.sh= can't fingerprint it and won't sync-maintain the script. Build out the real Go bundle (=go.md= / =go-testing.md= + =CLAUDE.md=) to close that.
- First real adopters of the Python and TS scripts should sanity-check against a live =coverage json= / nyc =coverage-summary.json= run.
Original notes retained below for the next person.
The Elisp pilot proved the pattern; Python and Go followed. The plumbing is generic: =install-lang.sh= seeds the fragment, and =make test= now discovers ERT (=test-*.el=), pytest (=test_*.py=), and =go test= (=*_test.go=) under =languages/*/tests/=. TypeScript is the last one.
- TypeScript/JS: nyc/Istanbul =coverage-final.json= / =coverage-summary.json=. Same kernel: file-weighted project number, on-disk =*.ts=/=*.js= absent from the report counted as 0%. nyc prints its own table, so the script focuses on the missing-file list and the number. Needs a vitest/jest (or =node --test=) discovery path in =make test=, mirroring the go-test block.
Notes for the next person, from the Python + Go runs:
- Python: parses coverage.py's =files[path].summary.{covered_lines,num_statements}= (stable since coverage 5.x), resolves report paths against the report's parent dir. Proven against a synthetic report, not a live =coverage json= run (coverage.py wasn't installed). Sanity-check against a real one.
- Go: =languages/go/= is a coverage-only slice with no rule file, so =sync-language-bundle.sh= can't fingerprint it (detection keys on a bundle's own =.claude/rules/*.md=). The script is delivered by =make install-lang LANG=go= but is not sync-maintained until the Go bundle gets a real rule file + =CLAUDE.md=. Building out that bundle is the natural companion task. Also: modern =go test ./...= already lists every module package in the profile at 0%, so the missing-file list is usually empty for in-module code; it earns its keep on build-tagged files and dirs outside =./...=.
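For the Python note above, reading the coverage.py report can be sketched like this. The function name is hypothetical; the JSON keys are the ones the note names, and relative paths are resolved against the report's parent directory as described.

#+begin_src python
import json
from pathlib import Path


def load_coverage_py(report_path: Path) -> dict[str, tuple[int, int]]:
    """Map each absolute file path to (covered_lines, num_statements)."""
    data = json.loads(report_path.read_text())
    base = report_path.resolve().parent
    result = {}
    for rel, entry in data["files"].items():
        summary = entry["summary"]
        # Paths in the report are taken relative to the report's directory.
        path = str((base / rel).resolve())
        result[path] = (summary["covered_lines"], summary["num_statements"])
    return result
#+end_src

The returned mapping can then be compared against the on-disk source list to find files the report never mentions.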
** DONE [#C] Enumerate implementation tasks in =spec-review.org= Phase 6 :feature:solo:
CLOSED: [2026-05-31 Sun]
:PROPERTIES:
:CREATED: [2026-05-28 Thu]
:LAST_REVIEWED: 2026-05-28
:END:
Added a Phase 6 step that lifts the spec's =Implementation phases= into a drop-in =todo.org= block (one =[#B]= per phase + a test-surface entry mirroring =Acceptance criteria=); a spec lacking phase decomposition raises that as a finding instead. Added Exit Criterion 6 and a review-history entry. Pure workflow-doc change.
From pearl handoff 2026-05-28. =spec-review.org= Phase 6 currently says "log deferred work to =todo.org=: v1 implementation = [#B] ... vNext/someday = [#D]." That covers deferred and v1 in passing but doesn't lift the spec's =Implementation phases= section into a drop-in =todo.org= block.
Proposed addition to Phase 6: a structured step that reads the spec's =Implementation phases= section and produces a =[#B] TODO= entry per phase (subject line, tags, one-line body, pointer back to spec), plus a final entry for the test surface (unit / integration / e2e / manual-verify mirroring the spec's =Acceptance criteria= when present). Emit under a new section "Implementation tasks (drop-in for todo.org)" in the review file. Format follows =todo-format.md= (terse heading, body holds context, tags on heading).
Three wins: handoff is one paste not a re-read; forces specs to be implementable in pieces (a spec without a phase decomposition fails this step, surfacing the shape problem); closes the loop on =Acceptance criteria= as manual-verify entries.
If the spec lacks an =Implementation phases= section, the step is the prompt to ask the author to add one before =Ready=.
** DONE [#C] Add =.aiignore= for agent inventory exclusions :chore:solo:
CLOSED: [2026-05-31 Sun]
:PROPERTIES:
:CREATED: [2026-05-28 Thu]
:LAST_REVIEWED: 2026-05-28
:END:
Shipped a gitignore-syntax =.aiignore= at the rulesets root (deps, build output, language caches, editor cruft, token artifacts, lockfiles-as-agent-read-skip) and documented the convention + defaults + lockfile policy in protocols.org ("Recursive Reads"). Per Craig's scope call (2026-05-31): did NOT wire audit.sh / diff-lang.sh / sync-language-bundle.sh — they do targeted finds over .ai/.claude/bundle dirs, never naive whole-tree walks, so honoring .aiignore there would be dead code. Script-side honoring belongs in a future catalog/inventory tool if one ships; the real consumer today is agent recursive reads (the protocols guidance).
From the codex enhancement backlog (item #8). Filesystem scans by agents and helper scripts pick up =node_modules=, =__pycache__=, =.pytest_cache=, lockfiles, generated OAuth artifacts, and test caches, even when those are gitignored. The result is wasted tokens during exploration and skewed project summaries.
Scope: add a shared =.aiignore= file (or =rulesets-ignore.json= if a more structured format helps) listing default exclusions. Teach the scripts that walk the project (=audit.sh=, =diff-lang.sh=, =sync-language-bundle.sh=, future =catalog= work if any) to honor it. Document in =protocols.org= so agents know to consult it before naive recursive reads.
Keep the lockfile policy explicit: ignored when the lockfile is only a local skill dependency cache, tracked when reproducibility matters.
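If a catalog or inventory tool ever does honor =.aiignore=, a tree walk might look like the sketch below. It is deliberately simplified and is not full gitignore semantics: it matches basenames only, with no negation (=!=) and no path anchoring. All function names are hypothetical.

#+begin_src python
import fnmatch
import os


def load_patterns(text: str) -> list[str]:
    """Read patterns, dropping blank lines, comments, and trailing slashes."""
    return [
        line.strip().rstrip("/")
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


def is_ignored(name: str, patterns: list[str]) -> bool:
    """Basename-only glob match (simplification of gitignore rules)."""
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def walk_honoring(root: str, patterns: list[str]):
    """Yield file paths under root, pruning ignored directories early."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if not is_ignored(d, patterns)]
        for filename in filenames:
            if not is_ignored(filename, patterns):
                yield os.path.join(dirpath, filename)
#+end_src

Pruning directories before descent is what saves the work: an ignored =node_modules= is never listed at all.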
** DONE [#C] Workflow test harness — drift + integrity tests :feature:solo:
CLOSED: [2026-05-31 Sun]
:PROPERTIES:
:CREATED: [2026-05-28 Thu]
:LAST_REVIEWED: 2026-05-28
:END:
From the codex enhancement backlog (item #10). Startup's drift check catches index-vs-directory mismatches but not deeper integrity: a workflow that references a script that's been renamed, a plugin whose parent engine has been deleted, a required section missing from a newly-added workflow.
Scope: add =scripts/tests/workflow-integrity.bats= (or pytest equivalent) verifying:
- Every =.org= file in =.ai/workflows/= is either indexed in =INDEX.org= or classifiable as a source plugin under an indexed engine.
- Every indexed workflow file actually exists.
- Every =file:= or shell-command reference inside a workflow to a script under =.ai/scripts/= or =scripts/= resolves to an existing file.
- Every source plugin maps to a parent workflow that exists and is indexed.
- Required sections (Overview, When to Use, the workflow's main phases) are present in each workflow.
- Workflow trigger phrases are unique enough to route — no two workflows claim the same exact trigger.
Wire into =make test=. Run on the canonical =claude-templates/.ai/workflows/= as the source of truth.
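Two of the checks above, sketched in Python rather than bats. The function names are hypothetical. The sketch assumes =INDEX.org= references workflows as =[[file:...]]= links, and it leaves out source-plugin classification entirely.

#+begin_src python
import re


def index_mismatches(
    index_text: str, files_on_disk: set[str]
) -> tuple[set[str], set[str]]:
    """Return (indexed but missing on disk, on disk but not indexed)."""
    indexed = set(re.findall(r"\[\[file:([^\]\s]+\.org)\]", index_text))
    return indexed - files_on_disk, files_on_disk - indexed


def duplicate_triggers(triggers: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each trigger phrase claimed by several workflows to its claimants."""
    owners: dict[str, set[str]] = {}
    for workflow, phrases in triggers.items():
        for phrase in phrases:
            owners.setdefault(phrase.strip().lower(), set()).add(workflow)
    return {p: sorted(w) for p, w in owners.items() if len(w) > 1}
#+end_src

A test passes when both sets from =index_mismatches= are empty (after plugin files are accounted for) and =duplicate_triggers= returns an empty mapping.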
** DONE [#C] Token-tier pilot on largest workflows :feature:solo:
CLOSED: [2026-05-31 Sun]
:PROPERTIES:
:CREATED: [2026-05-28 Thu]
:LAST_REVIEWED: 2026-05-28
:END:
Done 2026-05-31: restructured both =startup.org= and =triage-intake.org= into the four-lane structure (Summary / Execution / Reference / History), preserving every existing instruction. triage-intake's reorder ran through a content-preservation guard (the multiset of content lines is unchanged; only heading depth and lane grouping moved). workflow-integrity, sync-check, and the full test suite pass.
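A content-preservation guard of that kind can be sketched as a multiset comparison. This illustrates the idea and is not the guard that was actually run; the function names are hypothetical, and treating every non-blank, non-heading line as content is an assumption.

#+begin_src python
import re
from collections import Counter

HEADING = re.compile(r"^\*+ ")  # org heading: stars at column 0, then a space


def content_lines(org_text: str) -> Counter:
    """Multiset of non-blank, non-heading lines, whitespace-trimmed."""
    return Counter(
        line.strip()
        for line in org_text.splitlines()
        if line.strip() and not HEADING.match(line)
    )


def preserved(before: str, after: str) -> bool:
    """True when the reorder neither dropped, added, nor edited content."""
    return content_lines(before) == content_lines(after)


before = "* Phase A\nRun rsync.\n* Phase B\nCheck drift.\n"
after = "* Execution\n** Phase B\nCheck drift.\n** Phase A\nRun rsync.\n"
print(preserved(before, after))  # True
#+end_src

Because headings are excluded, changing heading depth or adding lane headings passes, while a dropped or reworded instruction fails.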
From the codex enhancement backlog (item #5), scope-limited to a pilot rather than a universal template change.
Apply a standardized section structure to the largest workflow files first — =startup.org= and =triage-intake.org= are the prime candidates. Sections:
- *Summary* / *Quick Contract* — one-screen purpose and outputs.
- *Execution* — the steps an agent must follow.
- *Reference* — examples, edge cases, rationale, old decisions.
- *History* / *Design Notes* — durable context not needed every run.
Decision (Craig, 2026-05-31): *approved the four-lane structure (Summary/Execution/Reference/History) and the scope — restructure both =startup.org= and =triage-intake.org= now.* Makes the task solo: apply the lanes to both, preserving every existing instruction (reorganize, don't rewrite), verify the workflows still read coherently and the drift/integrity checks pass.
Teach startup/routing to read =Summary= only at routing time, then =Execution= only for the selected workflow. Other sections become opt-in.
After the pilot, evaluate: did the savings show up in real session token use? Did the structure constrain the workflow expressiveness too much? If yes to savings and no to constraint, expand to the next-largest workflows. If not, document why and stop. Don't templatize universally — shorter workflows don't need tiering.
** DONE [#B] Add Signal MCP server (rymurr/signal-mcp) :feature:
CLOSED: [2026-06-02 Tue]
:PROPERTIES:
:CREATED: [2026-05-29 Fri]
:LAST_REVIEWED: 2026-05-29
:END:
Done 2026-06-02. Registered signal-cli to the Google Voice pager account, added the signal-mcp entry to servers.json, installed via make install-mcp (claude mcp list shows it connected), and documented the signal-cli + GV dependency in mcp/README.org. The GV-registration dependency this task flagged is resolved. Shipped in cfaff12 (page-signal routing) and this commit (README).
Install [[https://github.com/rymurr/signal-mcp][rymurr/signal-mcp]] so Claude can call =send_message_to_user=, =send_message_to_group=, and =receive_message= natively rather than shelling out to the =page-signal= wrapper. Python, MCP framework, depends on =signal-cli= being configured locally.
Two-way capability is the differentiator over the CLI: =receive_message= lets the agent listen for replies on the phone, enabling page-as-confirm flows, "should I proceed?" loops over Signal, and structured Q&A across devices.
*** Dependency
This depends on the Google Voice account being registered with =signal-cli= first. Sending from Craig's primary number to itself doesn't notify (Signal treats it as one account on linked devices). The MCP server takes =--user-id= at startup, one account per instance, so it has to point at the GV account, with the primary as the per-send recipient.
If GV registration is still pending when this task runs, block here and surface that.
*** Implementation
- =mcp/servers.json= — add =signal-mcp= entry under stdio transport (=command=, =args=, optional =env= for the user-id pointer).
- =mcp/README.org= — document the signal-cli + GV-registration dependency and the user-id pattern.
- =mcp/secrets.env.gpg= — only if the MCP server's user-id needs to be encrypted (probably not; the GV number isn't a secret beyond being personal).
- Verify: =make install-mcp= followed by =make check-mcp= shows =signal-mcp ok=; smoke-test via a Claude tool call sending a message + waiting on =receive_message=.
*** Why this matters
=page-signal= is the fast path (a hook, a script, a make recipe can call it without an MCP round-trip). The MCP server is the smart path. When Claude wants to send and then *react to the reply*, the CLI can't do that — only the MCP server can. The two complement each other; this task adds the second half.
** DONE [#C] task-review pass at end of task-audit :chore:solo:
CLOSED: [2026-06-02 Tue]
:PROPERTIES:
:LAST_REVIEWED: 2026-06-02
:END:
Have the =task-audit= workflow chain a =task-review= pass as its final phase, so a freshly-audited list also gets the lighter staleness/honesty sweep without a second invocation. The legend already notes the division of labor — task-audit assigns and refreshes tags, task-review keeps them honest in passing — so running task-review at the tail of task-audit closes the loop in one pass. Edit =claude-templates/.ai/workflows/task-audit.org= (and the synced mirror) to add the final phase; check whether =open-tasks.org= already invokes task-review so the chaining stays consistent.
** DONE [#C] lint-followups drift — reconcile-on-write + audit dead-link reaping :feature:solo:
CLOSED: [2026-06-02 Tue]
:PROPERTIES:
:LAST_REVIEWED: 2026-06-02
:END:
From an .emacs.d handoff (2026-06-02): running task-audit against a large todo.org proved several =.ai/lint-followups.org= entries stale (four dead-link flags pointed at docs that now exist; three near-duplicate dated lint runs had piled up). Two fixes, scoped separately.
1. =lint-org= workflow/script (the real fix): reconcile-on-write. Before appending a run, drop entries whose finding no longer reproduces (dead link now resolves, flagged block/timestamp now clean) and dedupe against the prior run instead of re-logging. Key entries by content/finding rather than line number, so they survive edits to the target file (line numbers go stale immediately).
2. =task-audit.org= (small, narrow): in the Phase C link-hygiene step, when fixing/verifying a =file:= link, also reap any matching dead-link entry in the project's lint-followups file so the two artifacts don't drift. Scope explicitly to dead-link entries — do NOT pull general lint cleanup into the audit; that mixes two concerns and slows the audit.
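The reconcile-on-write step in item 1 can be sketched as below. The entry shape (=file=, =kind=, =finding=) and the =still_reproduces= callback are hypothetical stand-ins for whatever the lint script records. The point is that the key is built from content, never from a line number.

#+begin_src python
from typing import Callable


def entry_key(entry: dict) -> tuple[str, str, str]:
    """Identify a finding by what it says, not where it sits in the file."""
    return (entry["file"], entry["kind"], entry["finding"])


def reconcile(
    existing: list[dict],
    new_findings: list[dict],
    still_reproduces: Callable[[dict], bool],
) -> list[dict]:
    """Drop stale entries, then add new findings without re-logging repeats."""
    merged: dict[tuple[str, str, str], dict] = {}
    for entry in existing:
        if still_reproduces(entry):  # e.g. the dead link still fails to resolve
            merged[entry_key(entry)] = entry
    for entry in new_findings:
        merged.setdefault(entry_key(entry), entry)  # dedupe against prior runs
    return list(merged.values())
#+end_src

An entry whose link now resolves is dropped on the next write, and a finding repeated across dated runs is kept once.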
** DONE [#C] start-work Justify gate: explicit "reasons not to do this" item :feature:quick:solo:
CLOSED: [2026-06-02 Tue]
:PROPERTIES:
:LAST_REVIEWED: 2026-06-02
:END:
From a work handoff (2026-06-02, surfaced running /start-work on a clean low-risk refactor). The Phase 2 Justify gate has "Downsides" and "Alternatives considered" but no forced devil's-advocate verdict on "should we even do this?" Add a "top reasons not to do this" item: surface the top three objections if any exist; when none rise to a real objection, state one line instead of manufacturing three (e.g. "Nothing material argues against this; no reason to defer or drop it"). Building the case against the work before committing is cheapest exactly at this gate, which is its purpose. Edit the start-work skill's Justify-gate phase.
** DONE [#C] start-work Approach gate: spec-needed check :feature:quick:solo:
CLOSED: [2026-06-02 Tue]
:PROPERTIES:
:LAST_REVIEWED: 2026-06-02
:END:
From Craig (2026-06-02). The Approach phase should consider whether the work needs a spec when one doesn't already exist. For a big task, proceeding without a spec can't be a silent skip: the pre-confirmation summary must explicitly report why a spec isn't needed, so the decision is visible and challengeable at the gate rather than assumed. Small tasks can pass without comment. Edit the start-work skill's Approach-gate phase to add the spec-needed consideration and the big-task report-why-not requirement.
** DONE [#B] Cross-project pattern catalog :spec:thinking:
CLOSED: [2026-06-05 Fri]
:PROPERTIES:
:LAST_REVIEWED: 2026-06-02
:END:
From pearl handoffs [[file:docs/design/2026-05-27-pattern-catalog-pearl-notes.org][2026-05-27]] + [[file:docs/design/2026-05-28-pattern-catalog-no-empty-input.org][2026-05-28 follow-up]].
Meta-question: how do good patterns travel from project A to project B? Pearl shipped three worked examples worth capturing — one-prompt picker with typed prefix (pearl-pick-source), magit-transient state buttons, and "no empty input as meaningful" (none-sentinel as first candidate). Each is a small principle with wide surface area; without a catalog, every project re-derives them from scratch.
Open design questions before any implementation:
- Catalog format — structured (one pattern per file with frontmatter) vs free-form doc
- Surfacing mechanism — agent-driven (model spots opportunity) vs human-driven (Craig grep-searches)
- Anti-patterns included or only what worked
- Intake cadence — every time one lands, or batch review
- Home — rulesets repo (agent visibility) vs Linear doc vs per-project cross-links
Pearl recommends a one-page spec (problem + design + open questions + acceptance) before implementation. Pearl available to come back for spec-review iterations.
*** 2026-05-28 Thu @ 08:12:55 -0500 Pearl shipped patterns 4-6, filed alongside the prior two
Three more pearl handoffs landed and were filed during this audit. Filed: [[file:docs/design/2026-05-28-pattern-catalog-prompt-labels-and-defaults.org][prompt-labels-and-defaults]] (patterns 4-5: label-matches-behavior, default-most-common with friction-proportional-to-consequence) and [[file:docs/design/2026-05-28-pattern-catalog-prompt-collapse.org][prompt-collapse]] (pattern 6: collapse N orthogonal prompts into one enriched prompt). The catalog's evidence base is now four pearl notes in =docs/design/= covering six patterns plus the synthesizing principle Pearl articulated — "choices on screen, accurately labeled, ordered by what the user most often wants, friction sized to the cost of being wrong."
*** 2026-06-05 Fri @ 00:47:59 -0500 Spec approved as written — all 5 decisions + 3 open questions accepted
Craig approved the spec ([[file:docs/design/2026-06-02-pattern-catalog-spec.org][2026-06-02-pattern-catalog-spec.org]]) as written. Confirmed: one file per pattern with frontmatter; home =patterns/= in rulesets; thin =claude-rules/patterns.md= pointer, agent-driven; anti-patterns as a per-pattern field; capture-on-landing/promote-on-review intake. Open questions resolved to the spec's leans: directory name =patterns/=; concrete-now, generalize-on-second-use; manual promote flow first, no =/pattern= skill yet. Built as =.org= files with =#+KEYWORD= frontmatter (Craig's call over the initial =.md= draft); the =claude-rules/patterns.md= pointer stays =.md= since the rules layer and the Makefile glob require it.
*** 2026-06-05 Fri @ 00:47:59 -0500 Built the catalog — 6 seed patterns + pointer + README
Created =patterns/= with the six seed patterns (one-prompt-picker-typed-prefix, transient-state-buttons, no-empty-input-as-meaningful, label-matches-behavior, default-most-common-friction-proportional, collapse-orthogonal-prompts), each carrying the frontmatter contract (name/principle/problem/tags/source/examples) plus Problem/Do/Anti-pattern/Applicability/Related sections. =patterns/README.org= states the root principle, the frontmatter contract, and the intake cadence. =claude-rules/patterns.md= is the agent-facing pointer, auto-installed via the Makefile RULES glob. Sourced from the four pearl notes in =docs/design/=.
** CANCELLED [#C] Try Skill Seekers on a real DeepSat docs-briefing need :chore:
CLOSED: [2026-06-10 Wed]
:PROPERTIES:
:LAST_REVIEWED: 2026-05-28
:END:
=Skill Seekers= ([[https://github.com/yusufkaraaslan/Skill_Seekers]]) is a Python
CLI + MCP server that ingests 18 source types (docs sites, PDFs, GitHub
repos, YouTube videos, Confluence, Notion, OpenAPI specs, etc.) and
exports to 20+ AI targets including Claude skills. MIT licensed, 12.9k
stars, active as of 2026-04-12.
*Evaluated: 2026-04-19 — not adopted for rulesets.* Generates
*reference-style* skills (encyclopedic dumps of scraped source material),
not *operational* skills (opinionated how-we-do-things content). Doesn't
fit the rulesets curation pattern.
*Next-trigger experiment (this TODO):* the next time a DeepSat task needs
Claude briefed deeply on a specific library, API, or docs site — try:
#+begin_src bash
pip install skill-seekers
skill-seekers create <url> --target claude
#+end_src
Measure output quality vs hand-curated briefing. If usable, consider
installing as a persistent tool. If output is bloated / under-structured,
discard and stick with hand briefing.
*Candidate first experiments (pick one from an actual need, don't invent):*
- A Django ORM reference skill scoped to the version DeepSat pins
- An OpenAPI-to-skill conversion for a partner-vendor API
- A React hooks reference skill for the frontend team's current patterns
- A specific AWS service's docs (e.g. GovCloud-flavored)
*Patterns worth borrowing into rulesets even without adopting the tool:*
- Enhancement-via-agent pipeline (scrape raw → LLM pass → structured
  SKILL.md). Applicable if we ever build internal-docs-to-skill tooling.
- Multi-target export abstraction (one knowledge extraction → many output
  formats). Clean design for any future multi-AI-tool workflow.
*Concerns to verify on actual use:*
- =LICENSE= has an unfilled =[Your Name/Username]= placeholder (MIT is
  unambiguous, but sloppy for a 12k-star project)
- Default branch is =development=, not =main= — pin with care
- Heavy commercialization signals (website at skillseekersweb.com,
  Trendshift promo, branded badges) — license might shift later; watch
- Companion =skill-seekers-configs= community repo has only 8 stars
  despite main's 12.9k — ecosystem thinner than headline adoption
** DONE [#C] Promote meeting-prep to a template workflow :feature:solo:
CLOSED: [2026-06-10 Wed]
:PROPERTIES:
:LAST_REVIEWED: 2026-06-10
:END:
meeting-prep lives in the work project's =project-workflows/= and is general-purpose — it builds a per-meeting prep doc — but its body carries project-specific references: =deepsat/assets/= transcript paths, Linear as the tracker, =knowledge.org=. Promoting to =claude-templates= means generalizing those to project-neutral terms (the project's transcript home, the project's tracker), adding it plus its =meeting-prep.pre-wire.org= supporting doc to the =.ai/= mirror and INDEX.org, and a workflow-integrity pass. Once promoted, the daily-prep 5-Day Look-Ahead's conditional "where the project has one" reference can become a direct link.
Out of the 2026-06-10 daily-prep handoff from the work project.
** DONE [#C] Build Craig's writing voice profile from real corpora :spec:
CLOSED: [2026-06-10 Wed]
:PROPERTIES:
:CREATED: [2026-05-29 Fri]
:LAST_REVIEWED: 2026-05-29
:END:
Shipped across 2026-05-29 → 2026-06-10. =voice/references/voice-profile.org= is the canonical paired file: Phases 1-2 corpora measured (commit bodies 128k words + email/PR/review registers), all 45 patterns carry entries with basis and history, and every reconciliation delta landed in =voice/SKILL.md= (#13/#33 self-discipline reframing, #7 soft flag, new corpus-derived #43-#45). Extension corpora (Slack, long-form, syntactic fragment detection) deliberately not pursued.
Build a grounded profile of Craig's actual writing voice by mining the corpora he's produced over time. The =voice/SKILL.md= patterns today are observation-derived (em-dash zero-tolerance, semicolon → period, contractions kept, sentence-fragment rewrite, felt-experience cut, etc.). Some are spot-on; others are intuition. A real corpus pass would tell us which patterns are genuinely Craig's voice and which were guesses, plus surface idioms, sentence structures, and vocabulary the current ruleset misses.
*** Sources to mine
- *Email* — sent folders across all three accounts (=gmail=, =dmail/DeepSat=, =cmail/Proton=). Filter to Craig-authored text (drop forwards and replies that only quote the original). Separate work voice (=dmail=) from personal voice (=gmail=, =cmail=), since they're likely distinct registers.
- *Commit messages* — =git log --author= across his repos. Captures terse-imperative voice.
- *PR descriptions and review comments* — same corpora. More deliberate prose than commits.
- *Org files he authored* — =notes.org=, todo bodies he typed, design docs in =docs/design/=, journal entries. Heavier on first-person voice than emails.
- *Slack/messages* — DeepSat work Slack, family group, friends. Casual register.
- *Long-form artifacts* — résumé, proposals, white papers, blog posts (if any).
Skip session-context files, which are Claude-co-written and would muddy the signal.
*** Output
- =voice/references/voice-profile.org= (or =.md=) — the canonical reference doc:
  - Vocabulary tendencies (preferred verbs, avoided cliché classes, technical-vs-plain word choice).
  - Sentence structures (typical length, conjunction patterns, parenthetical use).
  - Punctuation patterns (em-dash actual frequency, semicolon vs period split, contraction rate).
  - Register markers (signs of formal vs casual mode, work vs personal).
  - Idioms and recurring phrasings.
  - "Anti-patterns" — phrasings Craig consistently avoids that show up in AI-generated prose.
- Updated =voice/SKILL.md= patterns grounded in evidence rather than intuition. Patterns that the corpus confirms get strengthened; patterns the corpus contradicts get rewritten or removed.
Each finding should cite at least two evidence samples from the corpora so the basis for a rule is reviewable.
*** Approach
Phase 1 (corpus assembly) — pull the relevant slices: sent-mail dumps, =git log --author=<name> --no-merges --pretty=format:'%B'=, =gh pr list --author <login>= bodies, org-file extracts (=<name>= and =<login>= are placeholders). Strip headers, reply-quoted blocks, and signatures. Land in =voice/corpus/= (gitignored if the project's =.ai/= is gitignored, tracked if private repo with private remote).
Phase 2 (analysis) — pass over the corpus with focused queries: distribution of em-dashes per 1000 words, semicolon count, contraction frequency by register, sentence-length histogram, top-N adjectives/adverbs, etc. Subagent dispatch fits here.
Phase 3 (draft profile) — write =voice-profile.org= with findings + evidence. Surface contradictions with the current ruleset.
Phase 4 (reconcile with voice/SKILL.md) — present the deltas to Craig. Each delta is one of: confirm existing rule with evidence, strengthen rule, weaken rule, add new pattern, remove unsupported pattern. Apply approved deltas.
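As an illustration of the Phase 2 queries, here is a rough Python sketch that computes a few of the named measures for one text slice. The function name, the result keys, and the contraction regex are assumptions made for this example. The regex also counts possessive ='s= and ignores typographic apostrophes, so a real pass would need a better tokenizer.

#+begin_src python
import re

# Hypothetical heuristic: common English contraction endings only.
CONTRACTION = re.compile(r"\b\w+'(?:t|s|re|ve|ll|d|m)\b", re.IGNORECASE)
EM_DASH = "\u2014"


def punctuation_profile(text: str) -> dict[str, float]:
    """Per-1000-word punctuation rates plus mean sentence length for one slice."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    n = len(words)
    if n == 0:
        return {"words": 0, "em_dashes_per_1k": 0.0, "semicolons_per_1k": 0.0,
                "contractions_per_1k": 0.0, "mean_sentence_words": 0.0}
    per_k = 1000.0 / n
    # Naive sentence split on terminal punctuation; abbreviations will fool it.
    sentences = [s for s in re.split(r"[.!?]+\s+|[.!?]+$", text) if s.strip()]
    return {
        "words": n,
        "em_dashes_per_1k": text.count(EM_DASH) * per_k,
        "semicolons_per_1k": text.count(";") * per_k,
        "contractions_per_1k": len(CONTRACTION.findall(text)) * per_k,
        "mean_sentence_words": n / len(sentences),
    }
#+end_src

Running this once per register (commits, work email, personal email) gives comparable rates, which is what the reconciliation in Phase 4 needs in order to confirm or contradict a rule.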
*** Privacy
Email and Slack content is private. The corpus must NOT enter any commit unless rulesets stays on the private cjennings.net remote (which it does today). If a future move to a public remote is on the table, the corpus and any direct quotes have to go before that happens. The profile doc itself can stay (it's analysis, not raw content), but cite by pattern not by verbatim quote.
*** Why this matters
The voice skill earns its place when Craig sees the rewrite and recognizes it as his own voice rather than a "clean" AI voice that approximates him. Today the skill catches common AI tells (em-dashes, semicolons, the felt-experience tic), which is useful. Corpus-grounding would make it catch the absence of *Craig-specific positive traits* — the phrasings he actually reaches for — not just the AI traits he doesn't.
Likely improves =/voice personal= output quality on PR bodies, commit messages, and email drafts. Compound interest over the long run.
** DONE [#C] Wide org-table handling — helper/lint/standard :spec:
CLOSED: [2026-06-11 Thu]
:PROPERTIES:
:LAST_REVIEWED: 2026-06-11
:END:
The org-table standard keeps project-doc tables <=120 cols with multi-line wrapped cells and a rule between rows, but nothing enforces it and hand-wrapping a wide cell into multi-row form is tedious and error-prone. Decide among: (a) a helper that auto-wraps a wide table into multi-row cells at a target width, (b) a lint check that flags tables over the width budget, (c) tighten the written standard with a worked before/after example. Likely some combination. A worked before/after example exists in a work-project prep doc (a 6-col table reformatted by hand to a 4-col multi-row-cell version), to be reproduced generically when this lands.
Out of a work-project handoff 2026-06-09.
Resolution 2026-06-11: all three shipped. (c) The standard, generalized from the work project's notes.org local copy, is now claude-rules/org-tables.md (globally loaded; render-width semantics — links measure at their visible label, never split a link) with the worked wrapped-table example. (a) .ai/scripts/wrap-org-table.el reflows tables mechanically: render-width measurement, link-atomic tokenizing, column shrink-to-floor allocation, continuation rows, rules between logical rows; idempotent (rule-delimited continuation groups merge back before re-wrapping); 23 ERT tests. (b) lint-org.el gained an org-table-standard judgment check (width overruns, missing rules; conformant wrapped tables not false-flagged); 5 new ERT tests, 32 total. Verified end-to-end on a demo file: 150-col table reflowed to budget, idempotent second pass, lint clean on the result.
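The render-width and link-atomic ideas can be sketched as follows. This is Python for illustration only; the shipped helper is Emacs Lisp and its internals are not shown here. The names =render_width= and =tokenize= are hypothetical, and width is taken as character count, which is an assumption that ignores double-width glyphs.

#+begin_src python
import re

# Matches [[target][label]] and bare [[target]] org links.
LINK = re.compile(r"\[\[([^\]]+)\](?:\[([^\]]+)\])?\]")


def render_width(cell: str) -> int:
    """Width of a cell as displayed: each link counts as its visible label."""
    visible = LINK.sub(lambda m: m.group(2) or m.group(1), cell)
    return len(visible)


def tokenize(cell: str) -> list[str]:
    """Split on whitespace while keeping every link as one unbreakable token."""
    tokens: list[str] = []
    pos = 0
    for m in LINK.finditer(cell):
        tokens.extend(cell[pos:m.start()].split())
        tokens.append(m.group(0))  # the whole link, never split
        pos = m.end()
    tokens.extend(cell[pos:].split())
    return tokens
#+end_src

A wrapper built on these two functions would measure each candidate line with =render_width= and break only between tokens, so a long link target never pushes a table over budget and a link label is never split across continuation rows.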
** DONE [#C] SessionStart-on-clear hook for auto-resume :feature:
CLOSED: [2026-06-11 Thu]
:PROPERTIES:
:LAST_REVIEWED: 2026-06-11
:END:
Add a SessionStart hook (matcher: clear) in settings.json that auto-injects "read .ai/session-context.org and resume if present, else run startup.org". Today /flush prompts the user to /clear and the next session relies on the model re-reading session-context; the hook makes resume automatic on /clear. Keep full startup.org for genuine fresh starts (new day, other machine, been away). Likely lands as claude-templates workflow notes plus the hook in settings.json.
The checkpoint+resume halves already shipped as /flush. This is the remaining automation piece. Out of a work-project handoff 2026-06-09 (process tooling, belongs in rulesets not the work project).
Resolution 2026-06-11: the hook itself had already shipped 2026-06-02 (hooks/session-clear-resume.sh + the SessionStart clear entry in the tracked settings.json — this task duplicated it). What was actually broken: make install didn't cover hooks, so the symlink never reached machines that hadn't run make install-hooks by hand, and the hook errored silently on every /clear. Fixed by folding default-hook linking into make install (startup's Phase A.0 now propagates hooks machine-wide), with bats coverage in scripts/tests/install-hooks-link.bats. Both hook branches verified on ratio; the live /clear fire is a one-keystroke manual test.
*** TODO Manual testing and validation :test:
**** /clear mid-session resumes from the anchor
What we're verifying: the SessionStart(clear) hook fires and the fresh context resumes instead of cold-starting.
- In any project session with a live .ai/session-context.org (this rulesets session qualifies), type /clear
- Send any short message (the injected context loads but the model waits for your next keystroke)
Expected: the reply starts with "flushed." on its own line, restates the Active Goal and immediate Next Step, and does NOT run the startup workflow.
** 2026-06-12 Fri @ 02:56:58 -0500 New personal projects are home regroupings — no mechanism needed
Craig's call (2026-06-12): new personal projects will live in home, and there's no project-creation mechanism to build — he'll be working in home and simply decide to group some things differently. Nothing to do.
Concurrence, verified: no template doc directs new personal work into ~/projects (first-session.org, install-ai.sh, and the README carry no such guidance; the only ~/projects references are discovery-root scans, which home and work still need). The situation as it stands: a new personal "project" is an area dir plus tasks inside home's existing =.ai/= machinery, no bootstrap step; =first-session.org= remains the bootstrap for standalone code projects in ~/code, unchanged and correct; "launch finances"-style trigger phrases for folded names degrade politely to the no-match candidate list, worth work only if real friction shows up.
** DONE [#C] Build =/update-skills= skill for keeping forks in sync with upstream :feature:
CLOSED: [2026-06-11 Thu]
:PROPERTIES:
:LAST_REVIEWED: 2026-06-10
:END:
The rulesets repo has a growing set of forks (=arch-decide= from
wshobson/agents, =playwright-js= from lackeyjb/playwright-skill, =playwright-py=
from anthropics/skills/webapp-testing). Over time, upstream releases fixes,
new templates, or scope expansions that we'd want to pull in without losing
our local modifications. A skill should handle this deliberately rather than
by manual re-cloning.
Shipped 2026-06-11: [[file:.claude/commands/update-skills.md][/update-skills command]] + [[file:scripts/update-skills.py][helper script]] (17 bats tests) + three bootstrapped manifests under [[file:upstreams/][upstreams/]]. The first real upstream drift will exercise the interactive per-file/per-hunk flow end to end; the merge mechanics are covered by the test suite.
*** 2026-06-11 Thu @ 17:05:28 -0500 Specification written as the shipped artifacts
The command doc ([[file:.claude/commands/update-skills.md][update-skills.md]]) carries the user-facing spec: discovery, classification statuses, the per-file confirmation and per-hunk conflict flow, mark-synced semantics, and the missing-baseline fallback. The script's module docstring specifies the manifest schema. Two deviations from the 2026-05-16 design, with reasons: manifests live centrally at =upstreams/<name>/= instead of per-skill =.skill-upstream= dotfile dirs (arch-decide became two flat files in =commands/= and can't carry one — a =files= rename map covers it); baselines were seeded from the 2026-06-11 upstream HEADs since the true fork-point commits are unrecoverable, so pre-existing local modifications classify as =local-only= going forward.
*** 2026-05-16 Sat @ 01:14:20 -0500 original goals and decisions
**** Design decisions (agreed)
- *Upstream tracking:* per-fork manifest =.skill-upstream= (YAML or JSON):
  - =url= (GitHub URL)
  - =ref= (branch or tag)
  - =subpath= (path inside the upstream repo when it's a monorepo)
  - =last_synced_commit= (updated on successful sync)
- *Local modifications:* 3-way merge. Requires a pristine baseline snapshot of
  the upstream-at-time-of-fork. Store under =.skill-upstream/baseline/= or
  similar; committed to the rulesets repo so the merge base is reproducible.
- *Apply changes:* skill edits files directly with per-file confirmation.
- *Conflict policy:* per-hunk prompt inside the skill. When a 3-way merge
  produces a conflict, the skill walks each conflicting hunk and asks Craig:
  keep-local / take-upstream / both / skip. Editor-independent; works on
  machines where Emacs isn't available. Fallback when baseline is missing
  or corrupt (can't run 3-way merge): write =.local=, =.upstream=,
  =.baseline= files side-by-side and surface as manual review.
**** V1 Scope
- [ ] Skill at =~/code/rulesets/update-skills/=
- [ ] Discovery: scan sibling skill dirs for =.skill-upstream= manifests
- [ ] Helper script (bash or python) to:
  - Clone each upstream at =ref= shallowly into =/tmp/=
  - Compare current skill state vs latest upstream vs stored baseline
  - Classify each file: =unchanged= / =upstream-only= / =local-only= / =both-changed=
  - For =both-changed=: run =git merge-file --stdout <local> <baseline> <upstream>=;
    if clean, write result directly; if conflicts, parse the conflict-marker
    output and feed each hunk into the per-hunk prompt loop
- [ ] Per-hunk prompt loop:
  - Show base / local / upstream side-by-side for each conflicting hunk
  - Ask: keep-local / take-upstream / both (concatenate) / skip (leave marker)
  - Assemble resolved hunks into the final file content
- [ ] Per-fork summary output with file-level classification table
- [ ] Per-file confirmation flow (yes / no / show-diff) BEFORE per-hunk loop
- [ ] On successful sync: update =last_synced_commit= in the manifest
- [ ] =--dry-run= to preview without writing
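The classification and merge steps in the scope list above can be sketched like this. It is a hedged illustration, not the code in the shipped helper: =classify= and =three_way_merge= are hypothetical names, and the sketch assumes =git= is on the PATH. The exit-code handling follows the documented behavior of =git merge-file=, which returns 0 for a clean merge and otherwise the number of conflicts, capped at 127.

#+begin_src python
import subprocess
from pathlib import Path


def classify(local: bytes, baseline: bytes, upstream: bytes) -> str:
    """Compare one file's three versions against the stored baseline."""
    local_changed = local != baseline
    upstream_changed = upstream != baseline
    if local_changed and upstream_changed:
        return "both-changed"
    if upstream_changed:
        return "upstream-only"
    if local_changed:
        return "local-only"
    return "unchanged"


def three_way_merge(local: Path, baseline: Path, upstream: Path) -> tuple[str, int]:
    """Return (merged text, conflict count) without modifying any input file."""
    proc = subprocess.run(
        ["git", "merge-file", "--stdout", str(local), str(baseline), str(upstream)],
        capture_output=True,
        text=True,
    )
    if not 0 <= proc.returncode <= 127:
        # An error inside git merge-file surfaces as an exit status above 127.
        raise RuntimeError(proc.stderr.strip() or "git merge-file failed")
    return proc.stdout, proc.returncode
#+end_src

A conflict count of zero means the merged text can be written directly. Any other count means the output contains conflict markers to walk in the per-hunk loop.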
**** V2+ (deferred)
- [ ] Track upstream *releases* (tags) not just branches, so skill can propose
  "upgrade from v1.2 to v1.3" with release notes pulled in
- [ ] Generate patch files as an alternative apply method (for users who prefer
  =git apply= / =patch= over in-place edits)
- [ ] Non-interactive mode (=--non-interactive= / CI): skip conflict resolution,
  emit side-by-side files for later manual review
- [ ] Auto-run on a schedule via Claude Code background agent
- [ ] Summary of aggregate upstream activity across all forks (which forks have
  upstream changes waiting, which don't)
- [ ] Optional editor integration: on machines with Emacs, offer
  =M-x smerge-ediff= as an alternate path for users who prefer ediff over
  per-hunk prompts
**** Initial forks to enumerate (for manifest bootstrap)
- [ ] =arch-decide= → =wshobson/agents= :: =plugins/documentation-generation/skills/architecture-decision-records= :: MIT
- [ ] =playwright-js= → =lackeyjb/playwright-skill= :: =skills/playwright-skill= :: MIT
- [ ] =playwright-py= → =anthropics/skills= :: =skills/webapp-testing= :: Apache-2.0
**** Open questions
- [ ] What happens when upstream *renames* a file we fork? Skill would see
  "file gone from upstream, still present locally" — drop, keep, or prompt?
- [ ] What happens when upstream splits into multiple forks (e.g., a plugin
  reshuffles its structure)? Probably out of scope for v1; manual migration.
- [ ] Rate-limit / offline mode: if GitHub is unreachable, should skill fail
  or degrade gracefully? Likely degrade; print warning per fork.
** DONE [#C] Monthly session-harvest workflow :feature:
CLOSED: [2026-06-11 Thu]
:PROPERTIES:
:CREATED: [2026-06-11 Thu]
:LAST_REVIEWED: 2026-06-11
:END:
A monthly pass over recent =.ai/sessions/= summaries across projects proposing promotion candidates: patterns for the catalog, durable facts for the KB, rule refinements, workflow learnings. Sibling cadence to the roam-hygiene timer; a workflow run on schedule, not a standing agent. From the 2026-06-11 insights report's "Canonical-Aware Knowledge & Workflow Curator" — the capture/promote machinery exists (pattern catalog, /codify, KB); this adds the mining cadence.
Shipped 2026-06-11 as [[file:.ai/workflows/session-harvest.org][session-harvest.org]] (template + INDEX entry): five phases, four promotion lanes, /codify-grade gates + work-confidentiality scrub, =:LAST_HARVEST:= marker in notes.org, and the KB receipt-line metrics readout for the ~2026-07-10 checkpoint. Window filter reads session-filename date prefixes (mtime proved unreliable in a live test). First run due ~2026-07-11.
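A small sketch of a window filter driven by filename date prefixes rather than mtime. The =YYYY-MM-DD= prefix shape, the example filenames, and the function names are assumptions made for illustration; the workflow's actual naming scheme is not shown here.

#+begin_src python
import re
from datetime import date
from typing import Iterable, Optional

# Assumed prefix shape: session files start with an ISO date.
DATE_PREFIX = re.compile(r"^(\d{4})-(\d{2})-(\d{2})")


def session_date(name: str) -> Optional[date]:
    """Parse the leading date of a session filename, or None if absent or invalid."""
    m = DATE_PREFIX.match(name)
    if not m:
        return None
    try:
        return date(*map(int, m.groups()))
    except ValueError:  # e.g. month 13
        return None


def in_window(names: Iterable[str], since: date, until: date) -> list[str]:
    """Keep names whose date prefix falls in [since, until], oldest first."""
    dated = [(d, n) for n in names if (d := session_date(n)) is not None]
    return [n for d, n in sorted(dated) if since <= d <= until]
#+end_src

Reading the date from the name keeps the window stable across clones, syncs, and touch-ups that rewrite a file's modification time.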
** CANCELLED [#B] todo-cleanup.el per-area Open Work / Resolved pairs :feature:
CLOSED: [2026-06-11 Thu]
=--archive-done= assumes exactly one level-1 "Open Work" and one "Resolved" heading per todo.org. Home's consolidated file briefly carried per-area pairs and the pass skipped. Filed from home's 2026-06-11 addendum, then held the same evening when Craig flagged that he expected a single pair.
Cancelled 2026-06-11: Craig confirmed the decision — one todo queue with a single Open Work / Resolved pair. Home reshapes its consolidated file to that form, and the existing single-pair tooling works unmodified. No code change needed.
** CANCELLED [#D] todo-cleanup =--archive-done= reports 0 moves while moving subtrees :bug:
CLOSED: [2026-06-12 Fri]
:PROPERTIES:
:CREATED: [2026-06-12 Fri]
:END:
Observed at the 2026-06-12 wrap: the pass relocated closed subtrees from Open Work to Resolved while printing "todo-cleanup --archive-done: 0 subtree(s) moved".
CANCELLED 2026-06-12 — cannot reproduce. =todo-cleanup.el= is unchanged since the wrap that logged this, and =tc-archived= is incremented inline with each move and read straight in the report, so no move can go uncounted. Running the exact pre-archive state (=b6d286f:todo.org=) through the tool reports the right count (3 moved, all listed). The "0 moved" was a correct second-run report: =open-tasks.org= Phase A runs =--archive-done= after wrap-it-up already archived, so the second pass finds nothing to move and prints 0 next to the first pass's git diff. Not a code defect.