Testing
What we run, why, and how to invoke each suite. Numbers in claims should always be measured ([MEASURED]) — never project them.
1. Sensor unit tests (Rust)
cd backend && make sensor-test # cargo test --release
Covers detection logic in detect.rs and anomaly.rs. Synthetic frames, no privileges.
2. Sensor bench (Rust)
make sensor-bench # ./target/release/arpg-sensor bench 2000000
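Runs the hot‑path parse+inspect over 2 M synthetic frames. A sub‑µs/frame result on lab hardware sits more than three orders of magnitude below the <2 ms SLA (2 ms / 1 µs = 2000×). It is a lab figure on synthetic input, so quote it as measured and do not extrapolate it to other hardware.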
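The headroom figure is simple arithmetic, sketched here in Python. The helper name `headroom` and its parameters are hypothetical and are not part of the bench binary; feed it the wall‑clock time a bench run actually reports, not a projected one.

```python
import math

def headroom(total_seconds, frames=2_000_000, sla_ms=2.0):
    """Return (µs per frame, SLA / per-frame ratio, orders of magnitude).

    Hypothetical helper: total_seconds is the measured wall-clock time of one
    bench run over `frames` frames; sla_ms is the per-frame SLA from the text.
    """
    if total_seconds <= 0 or frames <= 0:
        raise ValueError("total_seconds and frames must be positive")
    per_frame_us = total_seconds * 1e6 / frames
    ratio = (sla_ms * 1000.0) / per_frame_us
    return per_frame_us, ratio, math.log10(ratio)
```

As a boundary case rather than a measurement: a run taking exactly 2 s for 2 M frames is 1 µs/frame, a ratio of 2000, or about 3.3 orders of magnitude. Any sub‑µs result only widens that margin.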
3. Sensor selftest
make sensor-selftest # synthetic Tier 1/2/3 cases + verdicts
Exercises every rule without touching the network. Useful as a first sanity check after a detector change.
4. Correlator + API (Go)
cd backend && go test ./... # everything in correlator/ and api/
Unit coverage here is thin today; the end‑to‑end runs below are the real check on correlator and API behavior.
5. Labeled‑dataset evaluation
make eval # backend/generator/eval.py against the dataset
Reads a labeled set (poison / GARP storm / mix / benign), runs the sensor in replay mode, computes precision / recall / FP. The release note records the measured numbers; update docs/OPERATIONS.md if they change materially.
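A minimal sketch of the scoring step, assuming each replayed sample reduces to a (true label, alerted) pair. The label strings and the `score` function are illustrative and may not match what eval.py actually uses. Since the text does not say whether "FP" means a count or a rate, the sketch returns both.

```python
from collections import Counter

# Hypothetical label names; the real dataset schema may differ.
ATTACK_LABELS = {"poison", "garp_storm", "mix"}

def score(records):
    """records: iterable of (true_label, alerted) pairs, one per replayed sample."""
    c = Counter()
    for label, alerted in records:
        is_attack = label in ATTACK_LABELS
        if is_attack and alerted:
            c["tp"] += 1      # attack, sensor alerted
        elif is_attack:
            c["fn"] += 1      # attack, sensor stayed silent
        elif alerted:
            c["fp"] += 1      # benign, sensor alerted
        else:
            c["tn"] += 1      # benign, sensor stayed silent
    tp, fn, fp, tn = c["tp"], c["fn"], c["fp"], c["tn"]
    return {
        "tp": tp, "fn": fn, "fp": fp, "tn": tn,
        # Guard each ratio against an empty denominator.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "fp_rate": fp / (fp + tn) if fp + tn else 0.0,
    }
```

Reporting the raw counts next to the ratios makes it easier to see when a small benign set is inflating or hiding the FP rate.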
6. Fuzz / adversarial harness
make fuzz # backend/generator/fuzz.py
Mutated ARP frames (oversized, truncated, weird opcodes, partial replies) — sensor must neither crash nor silently drop them.
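The four mutation classes can be sketched as follows. This is an illustration under assumptions, not the contents of fuzz.py: the function names, default addresses, and opcode choices are hypothetical. The frame layout is the standard 14‑byte Ethernet header plus the 28‑byte ARP payload for IPv4 over Ethernet. Feeding the frames to the sensor and checking that each one yields a verdict is left out.

```python
import random
import struct

def arp_frame(oper=2,
              sha=b"\x02\x00\x00\x00\x00\x01", spa=bytes([192, 168, 10, 1]),
              tha=b"\x02\x00\x00\x00\x00\x02", tpa=bytes([192, 168, 10, 50])):
    """Build a well-formed 42-byte Ethernet + ARP frame (default: a reply)."""
    eth = tha + sha + struct.pack("!H", 0x0806)           # dst, src, EtherType ARP
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, oper)    # htype, ptype, hlen, plen, oper
    return eth + arp + sha + spa + tha + tpa

def mutate(frame, rng):
    """Return (kind, mutated_frame) for one randomly chosen mutation class."""
    kind = rng.choice(["oversized", "truncated", "weird_opcode", "partial_reply"])
    if kind == "oversized":
        # Append 1..1500 random trailing bytes.
        extra = bytes(rng.getrandbits(8) for _ in range(rng.randint(1, 1500)))
        return kind, frame + extra
    if kind == "truncated":
        # Cut anywhere, including inside the Ethernet header.
        return kind, frame[: rng.randint(0, len(frame) - 1)]
    if kind == "weird_opcode":
        # The opcode sits at byte offset 20; use values other than 1 and 2.
        oper = rng.choice([0, 3, 0x00FF, 0xFFFF])
        return kind, frame[:20] + struct.pack("!H", oper) + frame[22:]
    # partial_reply: keep the fixed ARP header intact, cut inside the addresses.
    return kind, frame[: rng.randint(22, len(frame) - 1)]
```

Seeding the generator, for example `random.Random(0)`, keeps a failing case reproducible.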
7. Latency probe
make latency-probe # detect→mitigate p95
End‑to‑end timing on synthetic attacks. Reports p50 / p95 / p99 over the configured window. SLA target is <100 ms; lab measurement is around 5 ms.
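One way to derive the reported percentiles from raw detect→mitigate samples is the nearest‑rank method, sketched below. The probe may use a different percentile definition, and the function names are hypothetical. Treating p95 as the value compared against the 100 ms SLA is an assumption based on the target's comment.

```python
import math

def percentile(samples_ms, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(p * len(ordered) / 100))   # 1-based rank
    return ordered[rank - 1]

def summarize(samples_ms, sla_ms=100.0):
    """Report p50 / p95 / p99 and whether p95 is under the SLA (assumed gate)."""
    p50, p95, p99 = (percentile(samples_ms, p) for p in (50, 95, 99))
    return {"p50": p50, "p95": p95, "p99": p99, "within_sla": p95 < sla_ms}
```

Nearest‑rank always returns a latency that was actually observed, which fits the rule of quoting only measured numbers. Interpolating methods can return values that never occurred.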
8. Frontend type‑check
cd frontend && npm run build # tsc --noEmit + vite build
The build script runs tsc --noEmit first; a TS error fails the build. There is no component test suite today — the integration coverage lives in Playwright (below).
9. End‑to‑end (Playwright)
cd tests/e2e
npm install
PLAYWRIGHT_HOST_PLATFORM_OVERRIDE=ubuntu24.04-x64 npx playwright install chromium
node audit.mjs http://127.0.0.1:8080 # screenshots every page (sanity)
node writeflow.mjs http://127.0.0.1:8080 # binding add/delete + policy toggle
node rbac_ui.mjs http://127.0.0.1:8080 # viewer vs admin UI gates
Screenshots land in tests/e2e/shots/. The API must be running with demo accounts on port 8080.
10. Manual smoke
sudo python3 backend/generator/arp_attack.py -i ens33 --mode poison --spa 192.168.10.1
Watch the dashboard. Expect a CRITICAL incident within ~5 s, an audit row, and (in GUARDED/ENFORCE) a corrective ARP from the controller. Acknowledge the incident to close the loop.
11. What we don't test (honest debt)
- Multi‑VLAN sharding under load: designed, not stress‑tested.
- vMAC‑mimicry against a real HA segment: currently exercised only against synthetic HSRP/VRRP allow‑list entries.
- NAC / RADIUS CoA actuator: wired up against a stub; the L2 path on real hardware needs vendor‑specific integration.
Document these gaps when claiming results; don't paper over them.