Diffstat (limited to 'docs/design/gptel-network-tools.org')
| -rw-r--r-- | docs/design/gptel-network-tools.org | 407 |
1 files changed, 0 insertions, 407 deletions
diff --git a/docs/design/gptel-network-tools.org b/docs/design/gptel-network-tools.org
deleted file mode 100644
index aae2cc2a8..000000000
--- a/docs/design/gptel-network-tools.org
+++ /dev/null
@@ -1,407 +0,0 @@
-#+TITLE: Design: gptel network tools
-#+AUTHOR: Craig Jennings
-#+DATE: 2026-05-16
-#+OPTIONS: toc:nil num:nil
-
-* Status
-
-Draft. Brainstorm output captured from a =/brainstorm= session on
-2026-05-16. Sibling to
-=docs/design/gptel-git-tools-magit-backend.org= and the broader theme
-hierarchy under =** TODO [#B] GPTel Tool Work= in =todo.org=.
-
-The conventional vs tail-sample exploration covered three categories
-(network, text/data, build/code). Network was selected as the next
-build target; this doc captures the network slice in full. The other
-two categories are referenced briefly and live as theme stubs under
-=*** TODO [#B] Filesystem Related Tools= and
-=*** TODO [#B] Development Workflow Related Tools= in =todo.org=.
-
-* Problem
-
-The current =gptel-tools/= set covers filesystem CRUD, web fetch, and
-git status/log/diff. When the user asks the agent "why can't I reach
-X?" or "what's on my LAN right now?" the agent has no affordances --
-it can only suggest commands the user runs manually.
-
-Network diagnosis is a recurring task on this laptop (homelab, mixed
-wifi/wired, occasional VPN, NetworkManager-managed connections). The
-agent should be able to run read-only network probes directly, return
-structured findings, and synthesize an explanation. Anything that
-mutates network state (=nmcli connection up=, route changes) stays
-behind =:confirm t=.
-
-* Non-goals
-
-- Active offensive scanning, vulnerability probes, or exploitation
-  tooling. Out of scope at the wrapper boundary -- nmap's
-  =-A=/=-O=/aggressive modes are rejected, NSE is deferred.
-- Scanning networks the user doesn't own. Public targets are gated
-  behind an explicit =external=t= flag and =:confirm t=.
-- Real-time/streaming inspection (=iftop=, =nethogs=, =tcpdump
-  follow=). Snapshot tools only; streaming tools don't fit the
-  request/response shape of gptel tools.
-- Replacing Magit's git tooling, mu4e's mail handling, or any other
-  Emacs-native workflow. Network tooling is the gap.
-
-* Approaches considered
-
-The =/brainstorm= run generated six candidate themes across three
-categories. Three conventional (high-prior), three tail samples
-(genuinely different regions of the option space). Network was
-chosen as the first build target; the others are recorded for
-follow-up sessions.
-
-** Recommended: network triage bundle (conventional #1)
-
-Five tools covering discovery, diagnostics, and inspection:
-
-| Tool              | Purpose                                          |
-|-------------------+--------------------------------------------------|
-| =net_diagnose=    | "Why can't I reach X?" -- composite probe        |
-| =net_discover=    | "What's on this subnet?" -- LAN host discovery   |
-| =net_services=    | "What's listening on host X?" -- service detect  |
-| =network_status=  | "What's my current network state?" -- snapshot   |
-| =dns_lookup=      | Typed DNS query (A/AAAA/MX/NS/TXT/SRV/CAA)       |
-
-Detailed in =* Design= below.
-
-*** Pros
-
-- Hits the highest-leverage daily question (connectivity diagnosis)
-  with a single mental entry point (=net_diagnose=).
-- Atomic tools (=dns_lookup=, =network_status=) for cases the
-  composite is too coarse for.
-- All read-only at the network layer; =:confirm nil= for RFC1918,
-  =:confirm t= for public targets.
-- nmap's two genuinely-unique capabilities (subnet discovery, service
-  enumeration) get first-class wrappers.
-
-*** Cons
-
-- Five tools is heavy for one category. Some are thin wrappers around
-  a single command.
-- Composite =net_diagnose= hides which sub-check fired; debugging the
-  tool itself is harder than debugging atomic tools.
-- nmap is the one tool that *can* get the user in trouble. Target
-  gating must be airtight or it's the wrong tool to ship.
-
-** Rejected: code-quality fan-out (conventional #2)
-
-=shellcheck_run=, =format_check= (black/prettier/gofmt/rustfmt/elisp,
-returns unified diff), =lint_run= (eslint/ruff/golangci-lint),
-=dot_render=, =mermaid_render=.
-
-Folded into =*** TODO [#B] Development Workflow Related Tools= as
-per-language work rather than a standalone bundle. Most of the per-
-language wins land in the existing prog-*.el modules' format-on-save
-and LSP attachments; the agent benefits more from /reading/ those
-buffers than from re-running the formatters via tool calls.
-
-** Rejected: GitHub workspace (conventional #3)
-
-=gh_pr_view=, =gh_issue_search=, =gh_run_logs=, =gh_pr_diff=.
-
-Overlaps with the magit-backend track (=gptel-git-tools-magit-backend=)
-for several queries. Better treated as a follow-on once the magit
-backend lands -- some queries are local (magit) and some are remote
-(gh), and the seam is clearer after the local side is built.
-
-** Rejected: DNS-chain inspector (tail sample)
-
-=dns_chain= walks NS -> A/AAAA -> MX -> SPF -> DMARC -> DKIM for a
-domain and returns a structured assessment with red flags ("MX
-missing TLS-RPT", "SPF includes >10 lookups", "DMARC policy=none").
-
-Real value when it's useful but probably 5 calls/year for this
-laptop. =dns_lookup= covers 90% of the recurring need; the chain
-walker is parked for a possible follow-on.
-
-** Rejected: awk_eval / sed_eval with explanation (tail sample)
-
-Accept snippet + sample input, return both the transformed output and
-a plain-English explanation of what the snippet does.
-
-Doubles work the model already does internally -- the model is
-already good at generating and explaining awk/sed. Real win would
-only be the actual execution against actual data, which the eshell
-escape hatch in the Filesystem section already covers.
-
-** Adopted as project convention: plan/apply split (tail sample)
-
-=rsync_plan= / =rsync_apply= split: plan always runs =--dry-run= and
-returns the file list and byte counts that *would* transfer; apply is
-a separate tool registration with =:confirm t=. Same shape for
-=nmcli= (status read vs connection mutate) and any other mutating
-tool.
-
-Promoted to a documented convention rather than a single tool: any
-mutating wrapper in =gptel-tools/= should split into a preview and an
-apply. The preview is =:confirm nil= so the agent can plan
-autonomously; the apply is =:confirm t= and stops cleanly for human
-review. Applies to =rsync=, =nmcli connection up=, =ssh= mutations,
-and the pandoc/ffmpeg/imagemagick output-writing tools in the
-Filesystem section.
-
-* Design
-
-** Tool 1: =net_diagnose=
-
-Composite "why can't I reach X?" probe. Given a target (hostname or
-IP), runs a sequence of sub-checks and returns a structured result:
-
-1. =dig +short= on the name (skip if target is an IP literal).
-2. =ping -c 3 -W 2= against the resolved IP.
-3. =traceroute -n -w 2 -q 1 -m 20= to the IP.
-4. If a port is given: =curl --max-time 5 -o /dev/null -sw '%{http_code}\n'=
-   for ports 80/443, or =nc -zv -w 3= for arbitrary TCP ports.
-
-Output shape (alist or plist returned to the model):
-
-#+begin_src text
-  ((target . "example.com")
-   (resolved-to . "93.184.216.34")
-   (dns-time-ms . 12)
-   (ping . ((sent . 3) (received . 3) (avg-ms . 14.2)))
-   (traceroute . ((hops . 8) (last-hop . "93.184.216.34")))
-   (port-check . ((port . 443) (status . "200") (tls . "ok"))))
-#+end_src
-
-Caps: total runtime <30s. Each sub-check has its own timeout. If a
-sub-check fails (no ping reply, no route, no DNS), the field carries
-the failure mode rather than aborting the whole call -- the agent
-needs the partial picture to reason.
-
-=:confirm nil=. Read-only.
-
-** Tool 2: =net_discover=
-
-Wraps =nmap -sn <subnet>= for LAN host discovery. Two argv shapes:
-
-- =net_discover ()= -- defaults to the current LAN, derived from
-  =ip route get 1.1.1.1= and the matching interface's =/24=.
-- =net_discover :subnet "192.168.1.0/24"= -- explicit subnet.
-
-Guardrails:
-
-- Subnet must be RFC1918, link-local (169.254/16), CGNAT (100.64/10),
-  or loopback. Public subnets rejected at the validator.
-- Subnet mask must be /22 or smaller (no /16 or wider). At /22 that's
-  ~1024 hosts -- enough for any homelab. Default home network is /24.
-- =--host-timeout 30s --max-retries 1= to bound runtime.
-
-Output: list of =(ip mac hostname state)= tuples.
-
-=:confirm nil= for RFC1918 / link-local / CGNAT / loopback. Public
-subnets never reach this tool (validator rejects).
-
-** Tool 3: =net_services=
-
-Wraps =nmap -sV= for service/version detection on a single host.
-
-Argv:
-
-- =:host= -- required. RFC1918 / link-local / CGNAT / loopback by
-  default. Public hosts require =:external t= which flips
-  =:confirm t=.
-- =:ports= -- optional port spec. Default: top-100 (=--top-ports
-  100=). Custom lists allowed: ="22,80,443,5432,6379"= or
-  ="1-1024"=. Hard cap: 1024 ports total.
-- =:fast= -- if t, uses =--top-ports 20= for a quick check.
-
-Mode allowlist enforced at the wrapper: only =-sV= with optional
-=-p=. Reject =-A=, =-O=, =-T4=/=-T5=, =--script=, raw-packet flags.
-
-Output: list of =(port protocol state service version banner)=
-tuples, parsed from =-oG -= (greppable output).
-
-=:confirm nil= for RFC1918 / link-local / CGNAT / loopback.
-=:confirm t= for any target reachable only as a public IP/hostname.
-
-** Tool 4: =network_status=
-
-Snapshot of the local network state. Composite of:
-
-- =ip -br addr= -- interfaces and their addresses.
-- =ip route= -- routing table.
-- =nmcli -t -f NAME,TYPE,DEVICE,STATE connection show --active= --
-  active NetworkManager connections.
-- =ss -tulpn= (or =netstat -tulpn= fallback) -- listening sockets.
-- =resolvectl status= (or =/etc/resolv.conf= fallback) -- DNS
-  resolver state.
-
-Output: structured alist with sections for each.
-
-=:confirm nil=. Read-only.
-
-Note: this is also the candidate target for the plan/apply split if
-=nmcli connection up=/=down= ever lands as a tool -- =network_status=
-becomes the "plan" side and any mutation is a separate tool.
-
-** Tool 5: =dns_lookup=
-
-Typed DNS query. Argv:
-
-- =:name= -- required. The DNS name to query.
-- =:type= -- record type. Default =A=. Allowed: =A=, =AAAA=, =MX=,
-  =NS=, =TXT=, =SRV=, =CAA=, =CNAME=, =PTR=, =SOA=.
-- =:server= -- optional resolver. Default uses system resolver.
-  When set, must be RFC1918 or one of a small allowlist (=1.1.1.1=,
-  =8.8.8.8=, =9.9.9.9=) so the tool can't be used to probe arbitrary
-  hosts via DNS.
-
-Output: list of records with TTL. For =MX= and =SRV=, includes
-priority/weight/port. For =TXT=, the records are split into the
-quoted segments dig returns.
-
-=:confirm nil=. Read-only.
-
-** Shared helpers
-
-In =gptel-tools/network_tools.el= (single file, mirrors the
-magit-backend plan for git tools):
-
-- =cj/gptel-net--validate-target HOST &optional ALLOW-PUBLIC=
-  - Resolves HOST. Rejects unless resolved IP is RFC1918 /
-    link-local / CGNAT / loopback, unless ALLOW-PUBLIC is non-nil.
-  - Returns the resolved IP on success.
-
-- =cj/gptel-net--validate-subnet CIDR=
-  - Rejects non-private subnets and subnets wider than /22.
-  - Returns =(network mask)= on success.
-
-- =cj/gptel-net--current-lan=
-  - Derives the current /24 from =ip route get 1.1.1.1=.
-
-- =cj/gptel-net--run ARGS &key TIMEOUT=
-  - Wraps =process-file= with a uniform timeout, color/encoding
-    posture, and structured return =(exit-code stdout stderr)=.
-
-- =cj/gptel-net--parse-nmap-greppable STRING=
-  - Parses nmap =-oG -= output into structured tuples.
-
-- =cj/gptel-net--truncate TEXT MAX-BYTES=
-  - Same shape as the existing per-tool truncate helpers. Open
-    question whether this consolidates into =system-lib.el= alongside
-    the matching helpers in =web_fetch.el= and =update_text_file.el=.
-
-** Caps
-
-| Tool             | Default cap            | Hard cap               |
-|------------------+------------------------+------------------------|
-| =net_diagnose=   | <30s total runtime     | <30s total runtime     |
-| =net_discover=   | /24 default, /22 max   | /22                    |
-| =net_services=   | top-100 ports          | 1024 ports             |
-| =network_status= | uncapped (snapshot)    | uncapped               |
-| =dns_lookup=     | uncapped               | uncapped               |
-
-** =:confirm= posture
-
-| Tool             | RFC1918 target    | Public target           |
-|------------------+-------------------+-------------------------|
-| =net_diagnose=   | =:confirm nil=    | =:confirm t=            |
-| =net_discover=   | =:confirm nil=    | rejected at validator   |
-| =net_services=   | =:confirm nil=    | =:confirm t=            |
-| =network_status= | =:confirm nil=    | n/a (local snapshot)    |
-| =dns_lookup=     | =:confirm nil=    | =:confirm nil=          |
-
-=dns_lookup= stays =:confirm nil= for public names because DNS is
-read-only and innocuous. =net_diagnose= and =net_services= against
-public targets are gated because pinging/probing public hosts isn't
-*illegal* but it can trip rate-limits or get the user flagged on a
-managed network.
-
-** Tests
-
-Single file =tests/test-gptel-tools-network-tools.el=. Real subnets
-are not available in CI, so:
-
-- =net_discover= and =net_services= are stubbed via =cl-letf= on
-  =cj/gptel-net--run=, returning canned nmap output. Real nmap
-  invocation tested via one =:tags '(:integration)= test that runs
-  =nmap -sn 127.0.0.1/32= and asserts the parser handles the real
-  format.
-- =net_diagnose= sub-checks stubbed individually so each failure mode
-  can be exercised.
-- =network_status= sections stubbed per-command; one integration test
-  runs against the live system and asserts the structure parses.
-- =dns_lookup= stubbed against canned =dig= output; one integration
-  test against =localhost= via the system resolver.
-
-Rough count: ~12 shared-helper tests (validators, current-lan
-detector, parsers) + ~7 per tool x 5 tools = ~47 tests.
-
-** Risk surface
-
-| Risk                                                      | Mitigation                                                          |
-|-----------------------------------------------------------+---------------------------------------------------------------------|
-| nmap scan against an unintended target                    | Validator gates on resolved IP, not on the input string. Public     |
-|                                                           | targets require explicit =:external t= flag + =:confirm t=.         |
-| Scan triggers IDS/IPS on a corporate/managed network      | Default modes are non-aggressive (=-sn=, =-sV= only). No =-A=, no   |
-|                                                           | NSE, no high T-level. =:confirm t= for non-RFC1918 targets gives    |
-|                                                           | the user a manual checkpoint.                                       |
-| =net_diagnose= hangs on a slow target                     | Per-sub-check timeouts; total runtime cap; partial-failure return   |
-|                                                           | rather than abort.                                                  |
-| nmap not installed on the system                          | =:command= check at module load via =cj/executable-find-or-warn=    |
-|                                                           | (matching the prettier/pyright pattern documented in CLAUDE.md).    |
-| Network tools shell out via =process-file=                | argv-list invocation, no shell. =shell-quote-argument= unused       |
-|                                                           | because no shell is involved.                                       |
-| /tmp pollution or banner output writing to disk           | All output captured to buffer via =process-file=, never written.    |
-
-* Open questions
-
-1. *Default port set for =net_services=.* Top-100 (nmap default),
-   top-1000 (full default scan, slower), or a custom homelab-tuned
-   list (=22, 80, 443, 445, 3389, 5432, 6379, 8080, 8443, 9090, 9000,
-   631=)? My read: top-100 default + =:fast t= for top-20 + custom
-   override for the homelab list when needed.
-2. *NSE in v1 or deferred?* Skip entirely (clean v1) or ship a small
-   allowlist (=ssl-cert=, =http-title=, =ssh-hostkey=)? My read:
-   skip in v1. If a real use case shows up (TLS audit), add a single
-   =net_tls_audit= tool wrapping just =ssl-enum-ciphers=/=ssl-cert=
-   rather than a generic NSE escape hatch.
-3. *Consolidate the truncate helper.* Same open question as the
-   magit-backend doc: move =cj/gptel-net--truncate= and its siblings
-   into =system-lib.el= as =cj/gptel-tools--truncate-bytes=, or keep
-   per-module? My read: consolidate when there are three callers
-   (web_fetch, update_text_file, network_tools all qualify).
-4. *Composite vs atomic for =net_diagnose=.* Build it as one
-   composite, or break it into =ping_run=, =traceroute_run=,
-   =port_check= and let the agent compose? My read: composite is
-   better -- the agent reasons in "diagnose-this-target" terms more
-   often than in "just-ping-this". Atomic sub-tools can be added
-   later if the composite proves coarse-grained.
-5. *Promote plan/apply split to documented convention now?* Or wait
-   until a second tool exercises it (post-rsync)? My read: document
-   the convention in the Filesystem section body now, since pandoc /
-   ffmpeg / imagemagick all benefit, even before any of them ship.
-6. *nmcli mutation tools.* Out of scope for this doc but worth
-   flagging: =nmcli connection up <name>= / =nmcli connection down
-   <name>= / =nmcli device wifi connect <ssid>=. These would be the
-   first apply-side tools under the plan/apply convention, with
-   =network_status= as the plan side.
-
-* Effort estimate
-
-M (1-3 hours). Five tools + shared helpers + ~47 tests. Most of the
-time is test authoring (canned nmap output, dig output, ss output);
-production code is small because each tool is a thin =process-file=
-wrapper plus a parser.
-
-* Next steps
-
-- Resolve open questions #1 and #2 before any code lands (the
-  =net_services= shape can't be finalized without them).
-- Once approved, the work attaches to =*** TODO [#B] (Network bundle:
-  net_diagnose / net_discover / net_services / network_status /
-  dns_lookup)= -- a new theme under =*** TODO [#B] (Networking tools
-  category)= which itself becomes a new top-level under =** TODO [#B]
-  GPTel Tool Work= in =todo.org=, peer to the existing Filesystem
-  section.
-- Implementation follows =/start-work= flow: TDD, characterization
-  tests for the parsers first (canned nmap/dig/ss fixtures), then
-  the wrappers, then the registrations in
-  =cj/gptel-local-tool-features=.
-- After landing, revisit candidate #6 (plan/apply split) -- the
-  first apply-side tool (=nmcli connection up=, =rsync_apply=,
-  pandoc-output) exercises the convention end-to-end.
