Diffstat (limited to 'root-cause-trace')
-rw-r--r-- root-cause-trace/SKILL.md 11
1 files changed, 6 insertions, 5 deletions
diff --git a/root-cause-trace/SKILL.md b/root-cause-trace/SKILL.md
index fad4601..bb46f41 100644
--- a/root-cause-trace/SKILL.md
+++ b/root-cause-trace/SKILL.md
@@ -99,13 +99,14 @@ Two actions, in order:
**a. Fix at the trigger.** In the example: change the query to `INNER JOIN`, or explicitly handle customer-less orders (depending on intent).
-**b. Add defense-in-depth.** For each layer between the trigger and the symptom, ask: *could this layer have caught the bad value?* If yes, add the check.
+**b. Add defense-in-depth — at boundaries, not everywhere.** Don't armor every layer between the trigger and the symptom; that's validation spam that buries the real check and slows every call. Add a check only at a layer that *owns* a boundary or invariant:
-- Parser/validator layer: reject rows without `customer_id`
-- Service layer: throw if `order.customer` is nil instead of passing it downstream
-- Formatter layer: render "Unknown customer" rather than crashing
+- **Ingress / trust boundary** — where untrusted or external data first enters (parser/validator layer: reject rows without `customer_id`)
+- **Persistence boundary** — before writing to or reading from a store
+- **Invariant-owning layer** — the service that's supposed to guarantee a fact (throw if `order.customer` is nil rather than passing it downstream)
+- **Final render/output** — degrade gracefully (render "Unknown customer" rather than crashing)
-Each defense means the next time something similar happens, it surfaces earlier and with better context. The goal isn't any single check — it's that the bad value can't propagate silently.
+A pass-through function that neither owns the invariant nor crosses a boundary should *not* get a duplicate null check — let the boundary layers carry it. Each boundary defense means a similar bad value surfaces earlier and with better context; the goal isn't any single check, it's that the bad value can't propagate silently past the layer responsible for it.
## Adding Instrumentation
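
Illustration only, not part of the commit: a minimal Python sketch of the revised step b, assuming a hypothetical `Order` type and hypothetical function names (`parse_row`, `attach_metadata`, `load_order`, `render_customer`). It places one check at each layer that owns a boundary or invariant and deliberately leaves the pass-through function unchecked.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    """Hypothetical order record; customer=None models a customer-less order."""
    order_id: int
    customer: Optional[str]


def parse_row(row: dict) -> dict:
    # Ingress / trust boundary: reject rows without customer_id.
    if row.get("customer_id") is None:
        raise ValueError(f"row {row.get('order_id')!r} has no customer_id")
    return row


def attach_metadata(order: Order) -> Order:
    # Pass-through: owns no invariant and crosses no boundary,
    # so it gets no duplicate null check.
    return order


def load_order(order: Order) -> Order:
    # Invariant-owning layer: this service is the one that guarantees
    # order.customer is set, so it throws instead of passing None downstream.
    if order.customer is None:
        raise LookupError(f"order {order.order_id} has no customer")
    return attach_metadata(order)


def render_customer(order: Order) -> str:
    # Final render/output: degrade gracefully rather than crashing.
    return order.customer or "Unknown customer"
```

With this layout a missing customer is reported by `parse_row` or `load_order` with the order id attached, and `render_customer` remains a last-resort fallback rather than the first place the problem becomes visible.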