kattila.status/agent/# Kattila Agent Implementation Plan.md
2026-04-17 19:23:04 +03:00

Kattila Agent Implementation Plan

This document outlines the detailed architecture and implementation steps for the Go-based Kattila Agent.

Overview

The Kattila Agent continuously gathers network topology information from the host OS (using ip and wg commands), cryptographically signs the data, and pushes it to the Kattila Manager. If direct communication fails, it uses peer scanning to find a relay path.

User Review Required

Important

  • Can we assume that the wg and ip commands are always available in the agent's PATH?
  • The TXT record is returned with wrapping quotes (e.g., "955f333e5b9cc..."). The agent will strip these quotes. Is the PSK used exactly as-is for the HMAC key?
  • For WireGuard peer scanning during relay fallback: will the agent scan the entire subnet of each allowed-IPs entry (e.g., 172.16.100.8/29) to find other agents on port 5087, or only guess based on peer endpoints? Scanning is practical for small subnets such as a /29, which covers just eight addresses.
  • We should parse wg show all dump rather than the human-readable wg output if possible, since tab-separated output is much easier and safer to parse programmatically. Is it okay to use wg show all dump?
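To make the subnet-scan option above concrete, here is a minimal sketch of how a small allowed-IPs prefix could be expanded and probed. The function names, the /24 size limit, and the use of a plain TCP connect as the probe are assumptions for illustration, not decisions made in this plan.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
	"time"
)

// agentPort is the agent HTTP port named in this plan.
const agentPort = 5087

// minPrefixBits is a hypothetical safety limit: anything wider than a /24
// is rejected so a misconfigured allowed-IPs entry cannot trigger a huge scan.
const minPrefixBits = 24

// HostsInPrefix lists every IPv4 address covered by cidr. WireGuard is a
// routed layer-3 tunnel, so the first and last addresses are included too.
func HostsInPrefix(cidr string) ([]netip.Addr, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, err
	}
	p = p.Masked()
	if !p.Addr().Is4() || p.Bits() < minPrefixBits {
		return nil, fmt.Errorf("refusing to scan %s: need an IPv4 prefix of /%d or longer", p, minPrefixBits)
	}
	var hosts []netip.Addr
	for a := p.Addr(); p.Contains(a); a = a.Next() {
		hosts = append(hosts, a)
	}
	return hosts, nil
}

// ReachablePeers returns the hosts that accept a TCP connection on agentPort
// within timeout. Hosts are probed one at a time to keep the sketch simple.
func ReachablePeers(hosts []netip.Addr, timeout time.Duration) []netip.Addr {
	var up []netip.Addr
	for _, h := range hosts {
		conn, err := net.DialTimeout("tcp", netip.AddrPortFrom(h, agentPort).String(), timeout)
		if err != nil {
			continue
		}
		conn.Close()
		up = append(up, h)
	}
	return up
}
```

For the example prefix 172.16.100.8/29, HostsInPrefix yields the eight addresses 172.16.100.8 through 172.16.100.15. A real scan would also want to skip the agent's own address and follow up with GET /status/healthcheck to confirm that the listener is a Kattila agent.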

Proposed Architecture / Packages

1. config Package

  • Responsibilities: Load .env file containing DNS, MANAGER_URL, etc. Provide access to environment configurations.
  • Store the agent's unique ID, which is generated and saved locally on first run and persists across restarts until POST /status/reset is called.
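The ID persistence described above could look roughly like the following sketch. The file-based storage, the 128-bit random hex format, and the function names are assumptions; the plan only requires that the ID survives restarts and is discarded on reset.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"io/fs"
	"os"
	"strings"
)

// LoadOrCreateAgentID returns the ID stored at path. When the file is missing
// or empty it generates a new random ID, saves it, and returns it.
func LoadOrCreateAgentID(path string) (string, error) {
	data, err := os.ReadFile(path)
	if err == nil {
		if id := strings.TrimSpace(string(data)); id != "" {
			return id, nil
		}
	} else if !errors.Is(err, fs.ErrNotExist) {
		return "", err
	}
	buf := make([]byte, 16) // 128 random bits; the ID format is an assumption
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	id := hex.EncodeToString(buf)
	if err := os.WriteFile(path, []byte(id+"\n"), 0o600); err != nil {
		return "", err
	}
	return id, nil
}

// ResetAgentID removes the stored ID so that the next call to
// LoadOrCreateAgentID generates a fresh one. A missing file is not an error.
func ResetAgentID(path string) error {
	err := os.Remove(path)
	if errors.Is(err, fs.ErrNotExist) {
		return nil
	}
	return err
}
```

The POST /status/reset handler would call ResetAgentID followed by LoadOrCreateAgentID before re-registering.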

2. security Package

  • Key Discovery: Periodically resolve the TXT record of the configured DNS name to get the Bootstrap PSK. Strip any surrounding quotes. Keep a history of the current and two previous keys.
  • HMAC Generation: Provide a function to calculate HMAC-SHA256 of JSON payloads using the current PSK.
  • Nonce Generation: Generate cryptographically secure base64 strings for the nonce field.
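The three responsibilities above can be sketched as follows. This is an illustration under stated assumptions: the KeyRing type and function names are hypothetical, the PSK is used byte-for-byte as the HMAC key (one of the open questions in this plan), and the signature is hex-encoded, which the plan does not specify.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"strings"
)

// KeyRing holds the current PSK first, followed by up to two previous keys.
type KeyRing struct {
	keys []string
}

// Update records a freshly resolved TXT value. It strips wrapping quotes and
// does nothing when the value is empty or equal to the current key.
func (k *KeyRing) Update(txt string) {
	psk := strings.Trim(strings.TrimSpace(txt), `"`)
	if psk == "" || (len(k.keys) > 0 && k.keys[0] == psk) {
		return
	}
	k.keys = append([]string{psk}, k.keys...)
	if len(k.keys) > 3 {
		k.keys = k.keys[:3]
	}
}

// Current returns the newest key, or "" when none has been resolved yet.
func (k *KeyRing) Current() string {
	if len(k.keys) == 0 {
		return ""
	}
	return k.keys[0]
}

// Sign returns the hex-encoded HMAC-SHA256 of payload keyed with psk.
// Hex encoding is an assumption; the plan does not name an encoding.
func Sign(psk string, payload []byte) string {
	mac := hmac.New(sha256.New, []byte(psk))
	mac.Write(payload)
	return hex.EncodeToString(mac.Sum(nil))
}

// Verify reports whether sig matches payload under any key in the ring,
// comparing in constant time.
func (k *KeyRing) Verify(payload []byte, sig string) bool {
	for _, key := range k.keys {
		if hmac.Equal([]byte(Sign(key, payload)), []byte(sig)) {
			return true
		}
	}
	return false
}

// NewNonce returns n cryptographically secure random bytes, base64-encoded.
func NewNonce(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf), nil
}
```

Checking signatures against the previous keys as well as the current one lets a relaying agent accept payloads from peers that have not yet picked up a rotated PSK.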

3. network Package

  • Execute OS commands and parse their outputs:
    • ip -j a: Parse the JSON output into Interface structs.
    • ip -j -4 r: Parse the JSON output into Route structs.
    • wg show all dump: Parse the tab-separated WireGuard interface and peer data. If parsing the human-readable wg output is strictly required, we will build a custom state-machine parser for that format.
  • Maintain a gathering function GatherStatus() that bundles all these details into the expected data payload.
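As an illustration of the wg show all dump option, the sketch below parses the tab-separated output into peer records. With the all argument, each line starts with the interface name; interface lines then carry four more fields (private key, public key, listen port, fwmark) and peer lines carry eight (public key, preshared key, endpoint, allowed IPs, latest handshake, received bytes, sent bytes, persistent keepalive). The WGPeer struct and its selection of fields are assumptions about what the report needs.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// WGPeer is a hypothetical struct holding the peer fields the agent reports.
type WGPeer struct {
	Interface       string
	PublicKey       string
	Endpoint        string   // empty when wg prints "(none)"
	AllowedIPs      []string // nil when wg prints "(none)"
	LatestHandshake int64    // Unix seconds; 0 means no handshake yet
	RxBytes         int64
	TxBytes         int64
}

// ParseWGDump reads the output of `wg show all dump` and returns its peers.
func ParseWGDump(r io.Reader) ([]WGPeer, error) {
	var peers []WGPeer
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := sc.Text()
		if line == "" {
			continue
		}
		f := strings.Split(line, "\t")
		switch len(f) {
		case 5:
			// Interface line. Skipped here so that the private key it
			// contains never enters the report.
		case 9:
			p := WGPeer{Interface: f[0], PublicKey: f[1]}
			if f[3] != "(none)" {
				p.Endpoint = f[3]
			}
			if f[4] != "(none)" {
				p.AllowedIPs = strings.Split(f[4], ",")
			}
			nums := make([]int64, 3)
			for i := range nums {
				n, err := strconv.ParseInt(f[5+i], 10, 64)
				if err != nil {
					return nil, fmt.Errorf("wg dump: field %d: %w", 5+i, err)
				}
				nums[i] = n
			}
			p.LatestHandshake, p.RxBytes, p.TxBytes = nums[0], nums[1], nums[2]
			peers = append(peers, p)
		default:
			// The line itself is not echoed because it may hold key material.
			return nil, fmt.Errorf("wg dump: unexpected field count %d", len(f))
		}
	}
	return peers, sc.Err()
}

// ParseWGDumpString is a convenience wrapper for tests and small inputs.
func ParseWGDumpString(s string) ([]WGPeer, error) {
	return ParseWGDump(strings.NewReader(s))
}
```

In the agent, the reader would typically wrap the bytes returned by exec.Command("wg", "show", "all", "dump").Output(), which normally requires root or equivalent privileges.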

4. api Package (Agent HTTP Server)

  • Runs an HTTP server on 0.0.0.0:5087 using standard net/http.
  • Endpoints:
    • GET /status/healthcheck: Return 200 OK {"status": "ok"}
    • POST /status/reset: Delete local agent_id state, delete internal cache, and trigger a fresh registration loop.
    • GET /status/peer: Return local network info so peers can decide routing paths.
    • POST /status/relay: Accept relay payloads, ensure own agent_id is not in relay_path (loop detection), and forward to manager.
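A minimal sketch of the healthcheck and relay endpoints with loop detection might look like this. Only the relay_path field name comes from this plan; the payload field, the choice to append the agent's own ID before forwarding, the forward callback, and the response status codes are assumptions.

```go
package main

import (
	"encoding/json"
	"errors"
	"net/http"
)

// RelayEnvelope wraps a report being relayed. "relay_path" is named in the
// plan; the "payload" field name is an assumption.
type RelayEnvelope struct {
	RelayPath []string        `json:"relay_path"`
	Payload   json.RawMessage `json:"payload"`
}

// ErrRelayLoop signals that this agent already handled the payload.
var ErrRelayLoop = errors.New("agent_id already present in relay_path")

// AppendHop rejects the envelope if selfID is already in the path and
// otherwise returns a copy with selfID appended. The input is not modified.
func AppendHop(env RelayEnvelope, selfID string) (RelayEnvelope, error) {
	for _, id := range env.RelayPath {
		if id == selfID {
			return env, ErrRelayLoop
		}
	}
	path := make([]string, 0, len(env.RelayPath)+1)
	path = append(path, env.RelayPath...)
	env.RelayPath = append(path, selfID)
	return env, nil
}

// NewMux wires the two endpoints. forward is a hypothetical callback that
// sends the envelope on to the manager.
func NewMux(selfID string, forward func(RelayEnvelope) error) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/status/healthcheck", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{"status": "ok"}`))
	})
	mux.HandleFunc("/status/relay", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		var env RelayEnvelope
		body := http.MaxBytesReader(w, r.Body, 1<<20) // cap the body at 1 MiB
		if err := json.NewDecoder(body).Decode(&env); err != nil {
			http.Error(w, "invalid JSON", http.StatusBadRequest)
			return
		}
		next, err := AppendHop(env, selfID)
		if err != nil {
			http.Error(w, err.Error(), http.StatusLoopDetected)
			return
		}
		if err := forward(next); err != nil {
			http.Error(w, "forwarding failed", http.StatusBadGateway)
			return
		}
		w.WriteHeader(http.StatusAccepted)
	})
	return mux
}
```

The server would then be started with http.ListenAndServe("0.0.0.0:5087", NewMux(agentID, forward)). The /status/reset and /status/peer handlers are omitted to keep the sketch short.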

5. reporter Package (Main Loop)

  • Triggers every 30 seconds.
  • Gathers data from network package.
  • Wraps it in the report envelope: version, tick, type, nonce, timestamp, agent_id, hmac.
  • Sends POST request to Manager.
  • Relay Fallback: On failure, queries the local WireGuard interfaces, probes port 5087 on the known subnets, and attempts to find a working peer to relay through.
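The report envelope listed above could be modelled as in the sketch below. The envelope field names come from this plan, but the Go types, the "data" field name, the "report" type value, the version number, and the choice to compute the HMAC over the marshalled data alone are all assumptions to be settled against the Manager's API.

```go
package main

import (
	"encoding/json"
)

// Report mirrors the envelope fields named in the plan. Field types and the
// "data" key are assumptions.
type Report struct {
	Version   int             `json:"version"`
	Tick      uint64          `json:"tick"`
	Type      string          `json:"type"`
	Nonce     string          `json:"nonce"`
	Timestamp int64           `json:"timestamp"`
	AgentID   string          `json:"agent_id"`
	Data      json.RawMessage `json:"data"`
	HMAC      string          `json:"hmac"`
}

// BuildReport marshals data once, signs exactly those bytes with sign, and
// fills in the envelope. sign stands in for whatever the security package
// exposes; signing only the data bytes is an assumption.
func BuildReport(agentID, nonce string, tick uint64, unixTime int64, data any, sign func([]byte) string) (Report, error) {
	raw, err := json.Marshal(data)
	if err != nil {
		return Report{}, err
	}
	return Report{
		Version:   1,        // hypothetical protocol version
		Tick:      tick,     // incremented by the caller on every 30-second cycle
		Type:      "report", // hypothetical type value
		Nonce:     nonce,
		Timestamp: unixTime,
		AgentID:   agentID,
		Data:      raw,
		HMAC:      sign(raw),
	}, nil
}

// JSON encodes the envelope for the POST body. Data is embedded verbatim,
// so the signed bytes are the bytes that appear in the request.
func (r Report) JSON() ([]byte, error) {
	return json.Marshal(r)
}
```

Storing the data as json.RawMessage keeps the signed bytes identical to the transmitted bytes, which avoids signature mismatches caused by re-marshalling with a different key order or whitespace.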

Verification Plan

Automated testing

  • Write unit tests for parsing the provided ip and wg example files.
  • Write a unit test for the PSK rotation logic.

Manual Verification

  • Run the agent locally and verify the logs show successful gathering of interfaces and routes.
  • Force a bad manager URL and observe logs indicating relay peer scanning behavior.