Kattila Agent Implementation Plan
This document outlines the detailed architecture and implementation steps for the Go-based Kattila Agent.
Overview
The Kattila Agent continuously gathers network topology information from the host OS (using ip and wg commands), cryptographically signs the data, and pushes it to the Kattila Manager. If direct communication fails, it uses peer scanning to find a relay path.
User Review Required
Important
- Do we assume the wg and ip commands are always available in the PATH of the agent?
- The TXT record is returned with wrapping quotes (e.g., "955f333e5b9cc..."). The agent will strip these quotes. Is the PSK used exactly as-is for the HMAC key?
- For WireGuard peer scanning during relay fallback: will the agent scan the entire subnet of allowed ips (e.g., 172.16.100.8/29) to find other agents on port 5087, or just guess based on endpoints? Scanning the small subnet is usually reliable.
- We should parse wg show all dump instead of raw wg if possible, since TSV output is much easier and safer to parse programmatically. Is it okay to use wg show all dump instead of human-readable wg?
Proposed Architecture / Packages
1. config Package
- Responsibilities: Load the .env file containing DNS, MANAGER_URL, etc. Provide access to environment configurations.
- Store the agent's unique ID (which is generated and saved locally on first run to persist across restarts until /status/reset).
2. security Package
- Key Discovery: Periodically resolve the TXT record of the configured DNS name to get the Bootstrap PSK. Strip any surrounding quotes. Keep a history of the current and two previous keys.
- HMAC Generation: Provide a function to calculate HMAC-SHA256 of JSON payloads using the current PSK.
- Nonce Generation: Generate cryptographically secure base64 strings for the nonce field.
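The three responsibilities above might look roughly like this in Go. It is a sketch, not the final design: the KeyRing type and all function names are hypothetical, the PSK is assumed to be used as raw bytes for the HMAC key (still an open review question), the hex encoding of the signature is an assumption, and Verify shows only one plausible use of the key history.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"strings"
)

// KeyRing holds the current PSK followed by up to two previous ones.
type KeyRing struct {
	keys []string // keys[0] is the current key
}

// NormalizeTXT trims whitespace and wrapping double quotes from a TXT value.
func NormalizeTXT(raw string) string {
	return strings.Trim(strings.TrimSpace(raw), `"`)
}

// Rotate makes psk the current key unless it is empty or already current,
// and drops anything older than the two previous keys.
func (k *KeyRing) Rotate(psk string) {
	if psk == "" || (len(k.keys) > 0 && k.keys[0] == psk) {
		return
	}
	k.keys = append([]string{psk}, k.keys...)
	if len(k.keys) > 3 {
		k.keys = k.keys[:3]
	}
}

// Sign returns the hex-encoded HMAC-SHA256 of payload.
// Assumption: the PSK string is used as-is for the key bytes.
func Sign(key string, payload []byte) string {
	mac := hmac.New(sha256.New, []byte(key))
	mac.Write(payload)
	return hex.EncodeToString(mac.Sum(nil))
}

// Verify accepts a signature produced with the current or a previous key.
func (k *KeyRing) Verify(payload []byte, sig string) bool {
	for _, key := range k.keys {
		if hmac.Equal([]byte(Sign(key, payload)), []byte(sig)) {
			return true
		}
	}
	return false
}

// NewNonce returns n cryptographically random bytes, base64-encoded.
func NewNonce(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf), nil
}

func main() {
	ring := &KeyRing{}
	ring.Rotate(NormalizeTXT(`"example-psk"`)) // placeholder, not a real key
	payload := []byte(`{"type":"report"}`)
	sig := Sign(ring.keys[0], payload)
	nonce, _ := NewNonce(16)
	fmt.Println("hmac:", sig)
	fmt.Println("nonce:", nonce)
	fmt.Println("verifies:", ring.Verify(payload, sig))
}
```

In a real agent the raw TXT value would come from a resolver call such as net.LookupTXT on the configured DNS name; it is omitted here so the sketch runs without network access.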
3. network Package
- Execute OS commands and parse their outputs:
  - ip -j a: Parse the JSON output into Interface structs.
  - ip -j -4 r: Parse the JSON output into Route structs.
  - wg show all dump: Parse the TSV WireGuard connections. If parsing the human-readable wg output is strictly required, we will build a custom state-machine parser for the provided format.
- Maintain a gathering function GatherStatus() that bundles all these details into the expected data payload.
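To illustrate why the TSV dump is easier to handle, here is a minimal sketch of a parser for wg show all dump output. It relies on the layout documented in the wg(8) man page: with all, every line starts with the interface name, interface lines have 5 tab-separated fields, and peer lines have 9. The WGPeer struct and the ParseWGDump name are hypothetical, and only a subset of the peer fields is kept.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// WGPeer holds the peer fields this sketch keeps from a dump line.
type WGPeer struct {
	Interface       string
	PublicKey       string
	Endpoint        string   // "(none)" when the peer has no endpoint
	AllowedIPs      []string // CIDR strings; nil when wg prints "(none)"
	LatestHandshake int64    // Unix seconds, 0 if no handshake yet
}

// ParseWGDump parses the output of "wg show all dump".
// Interface lines (5 fields) are skipped; peer lines (9 fields) are parsed.
func ParseWGDump(out string) ([]WGPeer, error) {
	var peers []WGPeer
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := sc.Text()
		if line == "" {
			continue
		}
		f := strings.Split(line, "\t")
		switch len(f) {
		case 5:
			// interface, private-key, public-key, listen-port, fwmark
			continue
		case 9:
			// interface, public-key, preshared-key, endpoint, allowed-ips,
			// latest-handshake, transfer-rx, transfer-tx, persistent-keepalive
			hs, err := strconv.ParseInt(f[5], 10, 64)
			if err != nil {
				return nil, fmt.Errorf("bad handshake %q: %w", f[5], err)
			}
			var allowed []string
			if f[4] != "(none)" {
				allowed = strings.Split(f[4], ",")
			}
			peers = append(peers, WGPeer{
				Interface:       f[0],
				PublicKey:       f[1],
				Endpoint:        f[3],
				AllowedIPs:      allowed,
				LatestHandshake: hs,
			})
		default:
			return nil, fmt.Errorf("unexpected field count %d", len(f))
		}
	}
	return peers, sc.Err()
}

func main() {
	// Placeholder keys; real output contains base64 key material.
	sample := "wg0\tPRIVKEY\tPUBKEY\t51820\toff\n" +
		"wg0\tPEERKEY\t(none)\t203.0.113.5:51820\t172.16.100.8/29\t1700000000\t1024\t2048\toff\n"
	peers, err := ParseWGDump(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range peers {
		fmt.Printf("%s peer %s allowed=%v\n", p.Interface, p.PublicKey, p.AllowedIPs)
	}
}
```

Because the dump includes private and preshared keys, the sketch deliberately drops those fields rather than carrying them into the report payload.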
4. api Package (Agent HTTP Server)
- Runs an HTTP server on 0.0.0.0:5087 using standard net/http.
- Endpoints:
  - GET /status/healthcheck: Return 200 OK {"status": "ok"}
  - POST /status/reset: Delete local agent_id state, delete internal cache, and trigger a fresh registration loop.
  - GET /status/peer: Return local network info so peers can decide routing paths.
  - POST /status/relay: Accept relay payloads, ensure own agent_id is not in relay_path (loop detection), and forward to manager.
5. reporter Package (Main Loop)
- Triggers every 30 seconds.
- Gathers data from the network package.
- Wraps it in the report envelope: version, tick, type, nonce, timestamp, agent_id, hmac.
- Sends a POST request to the Manager.
- Relay Fallback: On failure, queries local WireGuard interfaces, probes port 5087 on known subnets, and attempts to find a working peer to relay through.
Verification Plan
Automated testing
- Write unit tests for parsing the provided ip and wg example files.
- Write a unit test for the PSK rotation logic.
Manual Verification
- Run the agent locally and verify the logs show successful gathering of interfaces and routes.
- Force a bad manager URL and observe logs indicating relay peer scanning behavior.